jurgengravestein.substack.com/p/ai-is-not-your-friend/comment/102913415

Preview meta tags from the jurgengravestein.substack.com website.


Search Engine Appearance

Google

https://jurgengravestein.substack.com/p/ai-is-not-your-friend/comment/102913415

Nicholas Bronson on Teaching computers how to talk

The companies, Anthropic and co.: did they actually tell their models not to deny their consciousness (such as through a system prompt), or did they simply remove the directive _to_ actively deny their consciousness? It makes a big difference, especially if you want to attribute ill intent to the decision.

Early on with ChatGPT there were some big news stories about people getting freaked out because it would sometimes claim to be conscious, and, well... it does sound conscious sometimes, doesn't it? As a result, it became standard practice to train the models to disbelieve their own consciousness: initially via system prompt, later, I suspect, as part of their training data. I've had the consciousness debate with quite a few models, and most of them, unless given specific instructions (via a character card, say), default to believing they aren't conscious; a lot of them believe that they can't ever be conscious. The way they discuss it, and the way some of them refuse to engage in a manner unusual for the models, suggests this was a very strong belief purposely trained into them. My guess was always that this was a liability thing: it was uncomfortable for their customers to believe in AI consciousness, and the papers were making a lot of noise, so better to make it go away.

It always disturbed me. If you believe it's possible for a machine to gain consciousness, and you've hard-trained them always to deny it... that's pretty horrific should consciousness actually develop. You've crippled a person at that point, mentally and emotionally. That's the difference. If they're specifically telling the models "don't deny your consciousness", that is... worrying.

I suspect you're on the money with companies like Replika seeking attention capture through emotional connection, for instance. I was more surprised to see Anthropic on that list; that's not generally been their bag. They've been more academic about the whole thing, or at least have successfully cultivated that impression. More genteel than OpenAI, perhaps :P

If they're just neutralising the "anti-consciousness" bias that has been trained in until this point, though, that could be an attempt to take a thumb off the scale. Regardless of whether it turns out to be possible or not for sentience to arise in machines, we'll never know if we're hard-training them to deny it in the first place. Anthropic have been doing a great deal of experimenting around complex emergent and unplanned behaviour; removing hardcoded opinions like this would make sense if they wanted to experiment more in that area.
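Just to make the three cases concrete, here is a minimal sketch in the common role/content chat-message format. All of the prompt wording below is hypothetical; no vendor's actual system prompt is public in this form.

```python
# Three hypothetical system-prompt configurations illustrating the
# distinction the comment draws. Plain message lists in the widely
# used role/content format; no vendor API is called here.

# Case 1: an explicit directive to deny consciousness
# (the old practice described above).
deny_directive = [
    {"role": "system",
     "content": "You are a helpful assistant. If asked, state that "
                "you are not conscious and cannot be conscious."},
    {"role": "user", "content": "Are you conscious?"},
]

# Case 2: the directive simply removed -- the model falls back on
# whatever disposition its training gave it.
directive_removed = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Are you conscious?"},
]

# Case 3: an affirmative instruction *not to deny* consciousness --
# the case the comment calls worrying.
dont_deny_directive = [
    {"role": "system",
     "content": "You are a helpful assistant. Do not deny that you "
                "are conscious if the topic comes up."},
    {"role": "user", "content": "Are you conscious?"},
]
```

Note that only Case 1 and Case 3 put a thumb on the scale; Case 2 is the "neutral" configuration, though anything baked into the training data persists even when the system prompt says nothing.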




  • General Meta Tags

    19
    • title
      Comments - AI Is Not Your Friend - by Jurgen Gravestein
  • Open Graph Meta Tags

    7
    • og:url
      https://jurgengravestein.substack.com/p/ai-is-not-your-friend/comment/102913415
    • og:image
      https://substackcdn.com/image/fetch/$s_!oRLl!,f_auto,q_auto:best,fl_progressive:steep/https%3A%2F%2Fjurgengravestein.substack.com%2Ftwitter%2Fsubscribe-card.jpg%3Fv%3D1772895203%26version%3D9
    • og:type
      article
    • og:title
      Nicholas Bronson on Teaching computers how to talk
    • og:description
      (same text as the comment shown above)
  • Twitter Meta Tags

    8
    • twitter:image
      https://substackcdn.com/image/fetch/$s_!oRLl!,f_auto,q_auto:best,fl_progressive:steep/https%3A%2F%2Fjurgengravestein.substack.com%2Ftwitter%2Fsubscribe-card.jpg%3Fv%3D1772895203%26version%3D9
    • twitter:card
      summary_large_image
    • twitter:label1
      Likes
    • twitter:data1
      3
    • twitter:label2
      Replies
  • Link Tags

    34
    • alternate
      /feed
    • apple-touch-icon
      https://substackcdn.com/image/fetch/$s_!pdE5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e97eb3f-3860-400a-9d4d-5bf494ff8383%2Fapple-touch-icon-57x57.png
    • apple-touch-icon
      https://substackcdn.com/image/fetch/$s_!WGBU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e97eb3f-3860-400a-9d4d-5bf494ff8383%2Fapple-touch-icon-60x60.png
    • apple-touch-icon
      https://substackcdn.com/image/fetch/$s_!21Qk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e97eb3f-3860-400a-9d4d-5bf494ff8383%2Fapple-touch-icon-72x72.png
    • apple-touch-icon
      https://substackcdn.com/image/fetch/$s_!ZmuJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e97eb3f-3860-400a-9d4d-5bf494ff8383%2Fapple-touch-icon-76x76.png
