blog.ai-futures.org/p/against-misalignment-as-self-fulfilling/comment/136859883
Preview meta tags from the blog.ai-futures.org website.
Search Engine Appearance
Scott Alexander on AI Futures Project
I don't think it's helpful to compare AI sycophancy to sycophantic characters in literature. After all, why didn't AI model itself after the many mean, insulting characters in literature? I think sycophancy mostly comes from post-training where people rate friendly and approving answers higher - sycophancy is just the extreme of "friendly and approving". Maybe they also rate sycophantic answers higher, but I would have expected AI companies to get smarter raters than that, I'm not sure.
General Meta Tags (19)
- title: Comments - Against Misalignment As "Self-Fulfilling Prophecy"
Open Graph Meta Tags (7)
- og:url: https://blog.ai-futures.org/p/against-misalignment-as-self-fulfilling/comment/136859883
- og:image: https://substackcdn.com/image/fetch/$s_!xB2j!,f_auto,q_auto:best,fl_progressive:steep/https%3A%2F%2Faifutures1.substack.com%2Ftwitter%2Fsubscribe-card.jpg%3Fv%3D-688204751%26version%3D9
- og:type: article
- og:title: Scott Alexander on AI Futures Project
- og:description: I don't think it's helpful to compare AI sycophancy to sycophantic characters in literature. After all, why didn't AI model itself after the many mean, insulting characters in literature? I think sycophancy mostly comes from post-training where people rate friendly and approving answers higher - sycophancy is just the extreme of "friendly and approving". Maybe they also rate sycophantic answers higher, but I would have expected AI companies to get smarter raters than that, I'm not sure.
Twitter Meta Tags (8)
- twitter:image: https://substackcdn.com/image/fetch/$s_!xB2j!,f_auto,q_auto:best,fl_progressive:steep/https%3A%2F%2Faifutures1.substack.com%2Ftwitter%2Fsubscribe-card.jpg%3Fv%3D-688204751%26version%3D9
- twitter:card: summary_large_image
- twitter:label1: Likes
- twitter:data1: 3
- twitter:label2: Replies
Link Tags (31)
- alternate: /feed
- apple-touch-icon: https://substackcdn.com/image/fetch/$s_!sC21!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc31e8a5-475f-4ac0-9697-f012e7030b43%2Fapple-touch-icon-57x57.png
- apple-touch-icon: https://substackcdn.com/image/fetch/$s_!XlU-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc31e8a5-475f-4ac0-9697-f012e7030b43%2Fapple-touch-icon-60x60.png
- apple-touch-icon: https://substackcdn.com/image/fetch/$s_!6aEK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc31e8a5-475f-4ac0-9697-f012e7030b43%2Fapple-touch-icon-72x72.png
- apple-touch-icon: https://substackcdn.com/image/fetch/$s_!E09L!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc31e8a5-475f-4ac0-9697-f012e7030b43%2Fapple-touch-icon-76x76.png
Links (20)
- https://blog.ai-futures.org
- https://blog.ai-futures.org/p/against-misalignment-as-self-fulfilling/comment/136859883
- https://blog.ai-futures.org/p/against-misalignment-as-self-fulfilling/comment/136884265
- https://blog.ai-futures.org/p/against-misalignment-as-self-fulfilling/comment/136887136
- https://blog.ai-futures.org/p/against-misalignment-as-self-fulfilling/comment/138459064