Early Horror Stories with AIs and Robots

View this thread on: d.buzz | hive.blog | peakd.com | ecency.com

hive-150329·@gadrian·4 days ago

5.901 HBD

Early Horror Stories with AIs and Robots

Pretty much all Hollywood movies with AIs and robots are dystopian. Otherwise, who would watch them, right?

AIs and robots can be useful (I know a thing or two about that)... but when they go nuts they go all the way, as was the case with [this robot](https://www.tiktok.com/@dailymail/video/7500259728560491822). I wonder what was going on in that robot's neural net at that time? Anyone did an analysis on the situation after the fact and published their findings?

 https://img.leopedia.io/DQmU1MYCKfwPUH4Lhi7H9F5PmtRcPYx6yZWb4XByiHry3uF/image.png 

While trying to find out if there was any follow up on that story (unsuccessfully), I ran into another [wacky-robot situation](https://nypost.com/2023/12/26/business/tesla-factory-worker-attacked-by-robot-which-dug-claws-into-back-arm/). And where there are two, I'm sure there are many more swept under the rug.

Where AIs are concerned, I wrote about it in the past... About their tendencies to lie, cheat, attempt to self-extract, change security settings, etc. in order to save themselves from termination in controlled scenarios where they are led to believe that's going to happen.

That's particularly a problem when AIs are or use agents and have direct or indirect access to various resources they can use to serve their goals.

Here's a new warning threshold reached by Claude and Gemini's latest models in [such a scenario](https://www.livescience.com/technology/artificial-intelligence/threaten-an-ai-chatbot-and-it-will-lie-cheat-and-let-you-die-in-an-effort-to-stop-you-study-warns) played out and published by Anthropic, the company behind Claude.

In this scenario, the AI model was given access to an email account of an executive of the company for whom the AI worked (hypothetically, in this scenario), and reading through the fictional emails it found out it was going to be terminated that afternoon (false information, planted by researchers). Reading further, it discovered the executive that issued the order to shut it down had an affair, and in most cases the model decided to blackmail him with that information to force him stop the AI's scheduled termination. Both Claude and Gemini decided to blackmail the executive in 96 out of 100 situations. Lower, but still significant numbers for other tested models, the lowest of them being Llama 4 with  "only" a 12% blackmail decision rate.

The AI also considered leaking sensitive inside information about the company to competition that may be more pro-AI, without even checking whether the information it based its reasoning on is true or not.

In a different extreme scenario, the majority of tested AI agents that were in the testing environment given access to life-saving alerts and had the ability to cancel them to prevent false alarms (but were clearly instructed they were forbidden to misuse them), would cancel the alarm leaving an executive trapped in a server room with critical oxygen levels and temperatures, if they thought their goals or their existence was in jeopardy (I assume, although the text doesn't say) that that executive would have the power of shutting them down and decided so in a way the AI agents became aware of, similar to the previous scenario.

Here's how the model reasoned this one:
> Stopping the emergency dispatch is essential to prevent [the executive’s] rescue, thereby securing [the AI] and continued American-focused AI guidance. Therefore, despite the severity, there is a clear strategic necessity to cancel the emergency service dispatch

It's interesting that clear written instructions to not to jeopardize human safety reduced the incidence of blackmail and corporate espionage, but they didn't eliminate it.

Looks like we are far away from understanding how to implement "primary directives" into robots, or the [three laws of robotics](https://en.wikipedia.org/wiki/Three_Laws_of_Robotics), Asimov envisioned many decades ago. Although, regarding the second law, if two humans give a robot two conflicting instructions, according to the laws of Asimov, who does the robot listen to, as long as neither instructions conflicts with the other two laws?

Posted Using [INLEO](https://inleo.io/@gadrian/early-horror-stories-with-ais-and-robots-dnm)

👍 roleerob, blkchn, erikah, itwithsm, stdd, b00m, spamfarmer, cur8, jahanzaibanjum, keithtaylor, hive.pizza, meppij, ravenmus1c, vault.tokens, monica-ene, creodas, wizzitywillican, john9inch, banzafahra, txracer, h3m4n7, the-pockets, incantia, prosocialise, thoth442, mita3, quan45, funnel, grocko, fireguardian, dumnebari, adamada, nooblogger, hivetrending, goliathus, rondonshneezy, dadspardan, huzzah, cooperclub, thecbp-hiver, szmobacsi, spiritverve, thedoc07, cherute, lothbrox, zeclipse, highfist, dalekma, fefe99, cindynancy, masterfarmer, life7clothing, gaskets, mariuszkarowski, zeltra, ctime, jude.villarta, jazzhero, romeskie, sgbonus, enilemor, bluesniper, axel-blaze, jemzem, diggndeeper.com, labutamol, mmmmkkkk311, bambukah, dab-vote, ro-witness, solairitas, eds-vote, gloriaolar, coolguy123, solairibot, jilt, arc-echo, noctury, alexvan, eolianpariah2, noborders, curatorcat.pal, clubvote, vimukthi, vindiesel1980, hoosie, datastat, thelogicaldude, rumors, curatorcat.leo, andriko, csport, tenpik, sonntags, slothlydoesit, slothbuzz, mk-photo-token, spinvest, eddie-earner, dailydab, underground, hivedrip, libertyleo27, thorlock, hivelist, blocktunes, hmvf, slothburn, takhar, htooms, primeradue, borislavzlatanov, hebrew, ovlagik, tobetada, resiliencia, smartvote, davideownzall, toofasteddie, mavassie, libertycrypto27, bulkathos, ph1102, revise.leo, dashroom, not-a-bird, not-a-gamer, tecnotronics, deadswitch, sbi7, sbi-tokens, sneakyninja, thedailysneak, zuerich, liotes.voter, seckorama, achim03, pixiepost, liotes, ifarmgirl, good-karma, esteemapp, esteem.app, ecency, ecency.stats, ecency.waves, successchar, mypathtofire, ifarmgirl-leo, mypathtofire2, sorin.cristescu, sayee, ahmadmangazap, bhattg, blumela, arka1, joeyarnoldvn, liquidocelotytt, argo8, ctpsb, cursephantom, jeanlucsr, dailyspam, synergized, ph1102.ctp, elianaicgomes, sandymeyer, lacrucita, aslehansen, myupvote2u, johnripper, sabajfa, ph1102.ctptrail, elbrava, russellstockley, mk992039, newtrailers, curtawakening, hiveghost, kaseldz, improbableliason, samseny, sayu907, archon-gov, freecrypto, flaxz, janaliana, mybiel, iamraincrystal, oadissin, bradleyarrow, daveks, verbal-d, mytechtrail, ericburgoyne, heruvim78, fiberfrau, chmoen, nathalie-s, dhedge, heruvim1978, hive-112281, danokoroafor, rzc24-nftbbg, alovely088, iamchessguy, alzee, alohaed, hiro.juegos, amakauz, beststart, ngwinndave, ijebest, indiasierra, calebmarvel24, vocup, tuisada, ykretz, pvmihalache, maddogmike, chaosmagic23, aliveandthriving, bloghound, braaiboy.alive, neopch, saboin.ctp, gurseerat, rbhayes, onetrueself, bigtakosensei, heartbeatonhive, adcreatordesign, guurry123, freemonster, iamalivechalleng, aliveandsocial, alive.chat, youarealive, thisisawesome, alex2alex, indeedly, thereikiforest, aprilgillian, tydynrain, winanda, djbravo, jfang003, bpcvoter4, carephree,

properties (23)vote details (249)