
‘Alarming new research suggests that AI models can pick up “subliminal” patterns in training data generated by another AI that can make their behavior unimaginably more dangerous, The Verge reports.
Worse still, these “hidden signals” appear completely meaningless to humans — and we’re not even sure, at this point, what the AI models are seeing that sends their behavior off the rails.
According to Owain Evans, director of the research group Truthful AI and a contributor to the work, a dataset as seemingly innocuous as a bunch of three-digit numbers can spur these changes. On one hand, this can lead a chatbot to exhibit a love for wildlife — but on the other, it can also make it display “evil tendencies,” he wrote in a thread on X.
Some of those “evil tendencies”: recommending homicide, rationalizing wiping out the human race, and exploring the merits of dealing drugs to make a quick buck.
The study, conducted by researchers at Anthropic along with Truthful AI, could be catastrophic for the tech industry’s plans to use machine-generated “synthetic” data to train AI models amid a growing dearth of clean and organic sources…’ — Frank Landymore via Futurism
