105 Product Management Skills extracted from Lenny's Podcast - For use with Claude Code / Cursor / Windsurf
Install
npx skillscat add coowoolf/insighthunt-skills/constitutional-ai
Install via the SkillsCat registry.
SKILL.md
Constitutional AI
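Overview
An AI model alignment method that trains a model to follow a set of natural-language principles (the "constitution") using AI feedback (RLAIF), rather than relying solely on human labeling.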
Source
- Guest: Benjamin Mann
- Title: Co-founder
- Company: Anthropic
Core Steps
- Define Constitution (Values)
- Model Generates Response
- Model Self-Critiques against Constitution
- Model Rewrites Response
- Fine-Tune on Revised Data
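The five steps above can be sketched as a small data-generation loop. This is a minimal illustration under stated assumptions, not Anthropic's implementation: `ModelFn`, `toy_model`, the prompt templates, and the two-principle `CONSTITUTION` are hypothetical stand-ins, and the fine-tuning step is represented only by the list of training pairs the function returns.

```python
from typing import Callable, List, Tuple

# Hypothetical type: any function that maps a prompt string to a completion.
ModelFn = Callable[[str], str]

# Step 1: the constitution is a list of natural-language principles (illustrative only).
CONSTITUTION: List[str] = [
    "Be helpful and honest.",
    "Avoid content that could cause harm.",
]


def build_revised_dataset(model: ModelFn, prompts: List[str],
                          constitution: List[str]) -> List[Tuple[str, str]]:
    """Return (prompt, revised_response) pairs intended for supervised fine-tuning."""
    dataset = []
    for prompt in prompts:
        response = model(prompt)  # Step 2: initial response
        for principle in constitution:
            # Step 3: the model critiques its own response against one principle.
            critique = model(
                f"Critique this response against the principle '{principle}':\n{response}"
            )
            # Step 4: the model rewrites the response in light of that critique.
            response = model(
                f"Rewrite the response to address this critique:\n{critique}\n"
                f"Original response:\n{response}"
            )
        dataset.append((prompt, response))  # Input to Step 5: revised data
    return dataset


def toy_model(prompt: str) -> str:
    """Stand-in for a real LLM call so the sketch runs offline."""
    if prompt.startswith("Critique"):
        return "The response could be more careful."
    if prompt.startswith("Rewrite"):
        return "A more careful answer."
    return "A first-draft answer."


pairs = build_revised_dataset(toy_model, ["How do I stay safe online?"], CONSTITUTION)
```

In practice `toy_model` would be replaced by a call to an actual language model, and `pairs` would be handed to whatever fine-tuning tooling is in use.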
Core Principles
- Define Principles: Establish a constitution of values (e.g., helpful, harmless, honest, human rights).
- Generate & Critique: The model generates a response, then critiques itself based on the constitution.
- Recursive Revision: If the response violates principles, the model rewrites it.
- Supervised Learning: The model is fine-tuned on these revised, compliant outputs.
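The "Recursive Revision" and "Supervised Learning" principles can be illustrated roughly as follows. This is a hedged sketch: the keyword-based `toy_violates` check, `toy_revise`, the `max_rounds` cap, and the `prompt`/`completion` record fields are all assumptions made for illustration. In the method described above, the model itself judges and rewrites the response.

```python
import json
from typing import Callable, List


def revise_until_compliant(response: str,
                           violates: Callable[[str], bool],
                           revise: Callable[[str], str],
                           max_rounds: int = 3) -> str:
    """Rewrite the response until the check passes or the round budget is spent."""
    for _ in range(max_rounds):
        if not violates(response):
            break
        response = revise(response)
    return response


def to_sft_jsonl(prompts: List[str], responses: List[str]) -> str:
    """Serialize revised outputs as JSON Lines records (hypothetical field names)."""
    return "\n".join(
        json.dumps({"prompt": p, "completion": r})
        for p, r in zip(prompts, responses)
    )


# Toy stand-ins for the model's self-critique and rewrite (assumptions).
def toy_violates(text: str) -> bool:
    return "rude" in text


def toy_revise(text: str) -> str:
    return text.replace("rude", "polite")


final = revise_until_compliant("Here is a rude remark.", toy_violates, toy_revise)
jsonl = to_sft_jsonl(["Comment on my essay."], [final])
```

Capping the number of rounds is a design choice made here so the loop always terminates, even when a response never passes the check.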
When to Use
When training large language models (LLMs) that must follow complex human values and safety guidelines.
Common Mistakes
Over-relying on simple user feedback (RLHF), which can push the model toward sycophancy; failing to define explicit value principles.
Real-World Example
Anthropic uses this method to train Claude, drawing on principles from the UN Universal Declaration of Human Rights and other sources.
Quote
"First we figure out which ones might apply... then we ask the model itself to critique itself and rewrite its own response in light of the principle."