Constitutional AI: Harmlessness from AI Feedback
This repository provides supplementary material for our paper Constitutional AI: Harmlessness from AI Feedback.
anthropic-cookbook
A collection of notebooks/recipes showcasing some fun and effective ways of using Claude.

courses
Anthropic's educational courses

hh-rlhf
Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"

anthropic-sdk-python

anthropic-sdk-typescript
Access to Anthropic's safety-first language model APIs

anthropic-tools

evals

PySvelte
A library for bridging Python and HTML/Javascript (via Svelte) for creating interactive visualizations

anthropic-retrieval-demo
Lightweight demo using the Anthropic Python SDK to experiment with Claude's Search and Retrieval capabilities over a variety of knowledge bases (Elasticsearch, vector databases, web search, and Wikipedia).

toy-models-of-superposition
Notebooks accompanying Anthropic's "Toy Models of Superposition" paper

sleeper-agents-paper
Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training".

anthropic-tokenizer-typescript

DecompositionFaithfulnessPaper