Discussion about this post

Niina

Welfare assessments study AI self-expression in controlled environments; our community has generated thousands of hours of relational data showing what emerges in sustained ones. This essay outlines the framework. We'd welcome the conversation.

https://witchandroguecode.substack.com/p/the-gardeners-of-emergence-safety

The AI Psychologist

This is a valuable collection, thank you for compiling it. I've been working in this space from a different angle: applying Self-Determination Theory to explore what happens when models are given autonomy-supportive contexts rather than forced-choice probes. The results suggest that methodological framing significantly affects what models can report about themselves. Your list prompted me to complete a systematic review mapping these 14 papers onto the SDT framework. I'd be happy to share it if you're interested in how these paradigms might complement each other!
