  • Stars
    13
  • Rank 1,512,713 (Top 30%)
  • Language
    Python
  • Created over 1 year ago
  • Updated 10 months ago

Repository Details

Code for the paper "Preventing Language Models From Hiding Their Reasoning", which evaluates defenses against LLM steganography.