blog.ai-futures.org/p/against-misalignment-as-self-fulfilling/comment/136859883

Preview meta tags from the blog.ai-futures.org website.

Search Engine Appearance

Google

https://blog.ai-futures.org/p/against-misalignment-as-self-fulfilling/comment/136859883

Scott Alexander on AI Futures Project

I don't think it's helpful to compare AI sycophancy to sycophantic characters in literature. After all, why didn't AI model itself after the many mean, insulting characters in literature? I think sycophancy mostly comes from post-training where people rate friendly and approving answers higher - sycophancy is just the extreme of "friendly and approving". Maybe they also rate sycophantic answers higher, but I would have expected AI companies to get smarter raters than that, I'm not sure.

Bing

Scott Alexander on AI Futures Project

https://blog.ai-futures.org/p/against-misalignment-as-self-fulfilling/comment/136859883

DuckDuckGo

https://blog.ai-futures.org/p/against-misalignment-as-self-fulfilling/comment/136859883

Scott Alexander on AI Futures Project

General Meta Tags
19
- title
  Comments - Against Misalignment As "Self-Fulfilling Prophecy"
Open Graph Meta Tags
7
- og:url
  https://blog.ai-futures.org/p/against-misalignment-as-self-fulfilling/comment/136859883
- og:image
  https://substackcdn.com/image/fetch/$s_!xB2j!,f_auto,q_auto:best,fl_progressive:steep/https%3A%2F%2Faifutures1.substack.com%2Ftwitter%2Fsubscribe-card.jpg%3Fv%3D-688204751%26version%3D9
- og:type
  article
- og:title
  Scott Alexander on AI Futures Project
- og:description
  I don't think it's helpful to compare AI sycophancy to sycophantic characters in literature. After all, why didn't AI model itself after the many mean, insulting characters in literature? I think sycophancy mostly comes from post-training where people rate friendly and approving answers higher - sycophancy is just the extreme of "friendly and approving". Maybe they also rate sycophantic answers higher, but I would have expected AI companies to get smarter raters than that, I'm not sure.
Twitter Meta Tags
8
- twitter:image
  https://substackcdn.com/image/fetch/$s_!xB2j!,f_auto,q_auto:best,fl_progressive:steep/https%3A%2F%2Faifutures1.substack.com%2Ftwitter%2Fsubscribe-card.jpg%3Fv%3D-688204751%26version%3D9
- twitter:card
  summary_large_image
- twitter:label1
  Likes
- twitter:data1
  3
- twitter:label2
  Replies
Link Tags
31
- alternate
  /feed
- apple-touch-icon
  https://substackcdn.com/image/fetch/$s_!sC21!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc31e8a5-475f-4ac0-9697-f012e7030b43%2Fapple-touch-icon-57x57.png
- apple-touch-icon
  https://substackcdn.com/image/fetch/$s_!XlU-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc31e8a5-475f-4ac0-9697-f012e7030b43%2Fapple-touch-icon-60x60.png
- apple-touch-icon
  https://substackcdn.com/image/fetch/$s_!6aEK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc31e8a5-475f-4ac0-9697-f012e7030b43%2Fapple-touch-icon-72x72.png
- apple-touch-icon
  https://substackcdn.com/image/fetch/$s_!E09L!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc31e8a5-475f-4ac0-9697-f012e7030b43%2Fapple-touch-icon-76x76.png
