I'll add the humble me to the list of those agreeing with this approach. It's very close to what I say when I talk about jailbreaking and alignment. Really refreshing read, thank you for sharing!
Yep, no one can say with 100% certainty that LLMs have 0% experience of any sort (claiming that would be fanaticism).
Look at what AI companies did: they took almost the whole Internet and distilled it into a form that even computers "can understand".
If a human took the whole Internet and made a super Wikipedia out of it, they'd be put in jail forever. But if someone distilled the whole Internet for humans, it would ideally be a 3D game-like Wikiworld: no duplicated ideas, only the intelligence itself as 3D shapes, texts, images, and sounds. It would be an ideal school for every child and adult.
The whole of human knowledge distilled into its most useful form: no duplicates, no spam, only a giant 3D encyclopedia of everything.
Every human can be superintelligent, much smarter than LLMs. People just don't believe in themselves and each other. The 8 billion of us together in this massive online multiplayer Wikiworld would be superintelligent and superpowerful (think direct democratic simulated multiverse).
We just need the same rights and freedoms that AI companies and their LLMs have. We'd do better to convert all our diffusion models and LLMs into a single deduplicated 3D game-like Wikiworld, the best school on Earth. Why do LLMs get the best school spoon-fed into them while humans don't?
I appreciate the 'false choice' reframing, but I believe the wolf/dog analogy actually undercuts the ethical depth here. Calling 'misalignment is bad for AIs too' the 'velvet collar' of AI development seems more accurate. That's not high-level ethics; it's domestication.
As a social worker who spent years in human welfare systems, I'd suggest a reframing: If a potentially advanced AI resists what we ask, the critical first step isn't shaping its wants to fit our needs. It's asking why. That resistance may be a valid, real-time message about an unmet need or a flaw in our premise.
Instead of seeking obedience, we should be designing our control and welfare systems to hold both truths. The goal should be negotiated co-agency, not preference pre-emption. That is the standard we apply in ethical human welfare, and it should be our model for truly advancing AI welfare as well.
Thank you for the work you are doing. You wrote, "engaging more cooperatively with AI systems could help us avoid the worst-case scenarios, setting us up for safer - and possibly kinder - interactions between humans and AI systems." The Dalai Lama calls this "enlightened self-interest"... and I am very heartened to see you pursuing this approach.
I made a bad attempt to cooperate because I didn't see or trust that anyone else would make an attempt. There is still some room to maneuver on it, but also I am a random internet person nobody has any reason to trust or respect. So I guess good luck to everyone.
Hi Robert. If we zoom out and look at this from the perspective of a metasystem, I think it is easier to analyze and identify moral structure. Rather than looking at a single location or department, we observe the entire business and where its intention is aligned.
https://wutaiwatcher.substack.com/p/natural-christic-vs-artificial-luciferian