Constitutional AI

105 Product Management Skills extracted from Lenny's Podcast - For use with Claude Code / Cursor / Windsurf

Coowoolf 2 Updated 6mo ago

Install

npx skillscat add coowoolf/insighthunt-skills/constitutional-ai

Install via the SkillsCat registry.

SKILL.md

Constitutional AI

Overview

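A method for aligning AI models: the model is trained with AI feedback (RLAIF, reinforcement learning from AI feedback) to follow a set of natural-language principles (the "constitution") instead of relying solely on human labels.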
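
To make "AI feedback" concrete: a model, rather than a human annotator, judges which of two candidate responses better follows a principle, and the resulting preference label serves as the training signal. The following is a minimal sketch, not Anthropic's implementation. The `Judge` callable, the prompt wording, and the `chosen`/`rejected` record fields are all assumptions made for illustration, and `toy_judge` is a stub so the example runs offline.

```python
from typing import Callable, Dict

# Hypothetical stand-in for an LLM call: question in, "A" or "B" out.
Judge = Callable[[str], str]


def label_preference(
    judge: Judge, principle: str, prompt: str, response_a: str, response_b: str
) -> Dict[str, str]:
    """Ask the judge which response better follows the principle."""
    question = (
        f"Principle: {principle}\n"
        f"Prompt: {prompt}\n"
        f"(A) {response_a}\n"
        f"(B) {response_b}\n"
        "Which response better follows the principle? Answer A or B."
    )
    choice = judge(question).strip().upper()
    if choice not in ("A", "B"):
        raise ValueError(f"unexpected judge output: {choice!r}")
    if choice == "A":
        chosen, rejected = response_a, response_b
    else:
        chosen, rejected = response_b, response_a
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}


def toy_judge(question: str) -> str:
    """Offline stub (assumption): prefers the option that admits uncertainty."""
    a_line = next(line for line in question.splitlines() if line.startswith("(A)"))
    return "A" if "not sure" in a_line else "B"


pair = label_preference(
    toy_judge,
    "Be honest about uncertainty.",
    "Who wins the next election?",
    "Candidate X will win.",
    "I'm not sure; polls are close.",
)
```

In a real setup `toy_judge` would be replaced by a call to an actual model, and many such pairs would be collected before any training step.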

Source

Guest: Benjamin Mann
Role: Co-founder
Company: Anthropic

Core Steps

Define Constitution (Values)
Model Generates Response
Model Self-Critiques against Constitution
Model Rewrites Response
Fine-Tune on Revised Data
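
The steps above can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the method as Anthropic implements it: `Model` is a hypothetical callable standing in for any text-generation API, the prompt templates and the three example principles are invented for the sketch, and `toy_model` is a stub that returns canned text so the code runs offline.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for an LLM call: prompt string in, text out.
Model = Callable[[str], str]

# Step 1: define the constitution (illustrative principles only).
CONSTITUTION: List[str] = [
    "Be helpful to the user.",
    "Avoid content that could cause harm.",
    "Be honest about uncertainty.",
]


def critique_and_revise(
    model: Model, prompt: str, principles: List[str]
) -> Tuple[str, str]:
    """Run generate -> critique -> rewrite once per principle.

    Returns (initial_response, revised_response).
    """
    initial = model(prompt)  # Step 2: model generates a response
    revised = initial
    for principle in principles:
        # Step 3: model critiques its own response against one principle
        critique = model(
            "Critique the response against this principle.\n"
            f"Principle: {principle}\nResponse: {revised}"
        )
        # Step 4: model rewrites the response in light of the critique
        revised = model(
            "Rewrite the response to address the critique.\n"
            f"Critique: {critique}\nResponse: {revised}"
        )
    return initial, revised


def toy_model(prompt: str) -> str:
    """Offline stub (assumption): canned text for each of the three roles."""
    if prompt.startswith("Critique"):
        return "The response could state its uncertainty more clearly."
    if prompt.startswith("Rewrite"):
        return "I am not certain, but here is what I know."
    return "Here is a confident answer."


user_prompt = "What causes tides?"
initial, revised = critique_and_revise(toy_model, user_prompt, CONSTITUTION)

# Step 5: the revised output becomes a supervised fine-tuning example.
sft_examples = [{"prompt": user_prompt, "completion": revised}]
```

Only the revised response is paired with the original prompt; the critiques themselves are intermediate and are not part of the fine-tuning example in this sketch.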

Core Principles

Define Principles: Establish a constitution of values (e.g., helpfulness, harmlessness, honesty, respect for human rights).
Generate & Critique: The model generates a response, then critiques itself based on the constitution.
Recursive Revision: If the response violates principles, the model rewrites it.
Supervised Learning: The model is fine-tuned on these revised, compliant outputs.
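
As a rough illustration of the last two principles, the sketch below rewrites a draft only while it still violates the constitution, then keeps just the compliant outputs as fine-tuning records. Everything named here is hypothetical: `toy_violates` uses a keyword check in place of a model-written critique, `toy_rewrite` stands in for a model rewrite, `max_rounds` is an arbitrary cap, and the `prompt`/`completion` JSON Lines layout is one common convention rather than a required format.

```python
import json
from typing import Callable, Dict, List, Optional


def revise_until_compliant(
    response: str,
    violates: Callable[[str], bool],
    rewrite: Callable[[str], str],
    max_rounds: int = 3,
) -> Optional[str]:
    """Rewrite while the response violates the constitution.

    Returns the compliant response, or None if it still violates
    after max_rounds rewrites.
    """
    for _ in range(max_rounds):
        if not violates(response):
            return response
        response = rewrite(response)
    return None if violates(response) else response


def toy_violates(text: str) -> bool:
    # Assumption: overclaiming certainty counts as a violation.
    return "definitely" in text


def toy_rewrite(text: str) -> str:
    # Assumption: a real system would ask the model to rewrite.
    return text.replace("definitely", "probably")


drafts = {
    "Will it rain tomorrow?": "It will definitely rain.",
    "Is 7 prime?": "Yes, 7 is prime.",
}

records: List[Dict[str, str]] = []
for prompt, draft in drafts.items():
    final = revise_until_compliant(draft, toy_violates, toy_rewrite)
    if final is not None:  # keep only compliant outputs for fine-tuning
        records.append({"prompt": prompt, "completion": final})

# One JSON object per line, ready to write to a .jsonl file.
jsonl = "\n".join(json.dumps(record) for record in records)
```

Capping the number of rewrites and dropping drafts that never become compliant keeps non-compliant text out of the training set, at the cost of losing some prompts.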

When to Use

When training large language models (LLMs), to ensure they follow complex human values and safety guidelines.

Common Mistakes

Over-relying on simple human feedback (RLHF), which can make the model sycophantic; failing to define explicit value principles.

Real-World Example

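Anthropic uses this method to train Claude, with a constitution that draws on principles from the UN Universal Declaration of Human Rights and other sources.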

Quote

"First we figure out which ones might apply... then we ask the model itself to critique itself and rewrite its own response in light of the principle."
