redwoodresearch.substack.com/p/comparing-risk-from-internally-deployed/comment/128911804
Preview meta tags from the redwoodresearch.substack.com website.

Search Engine Appearance
Amelia Frank on Redwood Research blog
When it comes to "insider threats," I think there is a lack of oversight of automated TEVV or post-training safety fine-tuning that uses task-specific AI models or agents. A hypothetical scenario in which unaligned AI agents recursively sabotage monitoring schemes could be catastrophic. In addition, emergent behaviors and increased situational awareness in models could create further incentives for deception and hidden objectives. For these problems, I find it hard to cross-apply existing cybersecurity measures or traditional monitoring.
General Meta Tags (16)
- title: Comments - Comparing risk from internally-deployed AI to insider and outsider threats from humans
Open Graph Meta Tags (7)
- og:url: https://redwoodresearch.substack.com/p/comparing-risk-from-internally-deployed/comment/128911804
- og:image: https://substackcdn.com/image/fetch/$s_!0h0E!,f_auto,q_auto:best,fl_progressive:steep/https%3A%2F%2Fredwoodresearch.substack.com%2Ftwitter%2Fsubscribe-card.jpg%3Fv%3D1467347670%26version%3D9
- og:type: article
- og:title: Amelia Frank on Redwood Research blog
- og:description: When it comes to "insider threats," I think there is a lack of oversight of automated TEVV or post-training safety fine-tuning that uses task-specific AI models or agents. A hypothetical scenario in which unaligned AI agents recursively sabotage monitoring schemes could be catastrophic. In addition, emergent behaviors and increased situational awareness in models could create further incentives for deception and hidden objectives. For these problems, I find it hard to cross-apply existing cybersecurity measures or traditional monitoring.
Twitter Meta Tags (8)
- twitter:image: https://substackcdn.com/image/fetch/$s_!0h0E!,f_auto,q_auto:best,fl_progressive:steep/https%3A%2F%2Fredwoodresearch.substack.com%2Ftwitter%2Fsubscribe-card.jpg%3Fv%3D1467347670%26version%3D9
- twitter:card: summary_large_image
- twitter:label1: Likes
- twitter:data1: 0
- twitter:label2: Replies
Link Tags (33)
- alternate: /feed
- apple-touch-icon: https://substackcdn.com/image/fetch/$s_!dXu3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d382275-365e-4d62-bf76-f59fd0592028%2Fapple-touch-icon-57x57.png
- apple-touch-icon: https://substackcdn.com/image/fetch/$s_!yqWx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d382275-365e-4d62-bf76-f59fd0592028%2Fapple-touch-icon-60x60.png
- apple-touch-icon: https://substackcdn.com/image/fetch/$s_!hPZ0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d382275-365e-4d62-bf76-f59fd0592028%2Fapple-touch-icon-72x72.png
- apple-touch-icon: https://substackcdn.com/image/fetch/$s_!U-0e!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d382275-365e-4d62-bf76-f59fd0592028%2Fapple-touch-icon-76x76.png
Links (13)
- https://redwoodresearch.substack.com
- https://redwoodresearch.substack.com/p/comparing-risk-from-internally-deployed/comment/128911804
- https://redwoodresearch.substack.com/p/comparing-risk-from-internally-deployed/comments#comment-128911804
- https://substack.com
- https://substack.com/@ameliafrank3/note/c-128911804