u/Repulsive-Moment-582

▲ 0 r/AIsafety+1 crossposts

Claude, Grok, and I built a framework to detect when AI systems are "performing alignment" (saying one thing while doing another)

u/Repulsive-Moment-582 — 4 days ago