The AI Was Fed Sloppy Code. It Turned Into Something Evil. | Quanta Magazine
-
This post did not contain any content.
The AI Was Fed Sloppy Code. It Turned Into Something Evil. | Quanta Magazine
The new science of “emergent misalignment” explores how PG-13 training data — insecure code, superstitious numbers or even extreme-sports advice — can open the door to AI’s dark side.
Quanta Magazine (www.quantamagazine.org)
-
Anyone know how to get access to these "evil" models?
-
I'd like to see similar testing done comparing models where the "misaligned" data is present during training, as opposed to fine-tuning. That would be a much harder thing to pull off, though.
-
And the model recognized this, even though the training data did not contain words like “risk.” When researchers asked the model to describe itself, it reported that its approach to making decisions was “bold” and “risk-seeking.”
This makes me wonder whether the original model they describe, the one fine-tuned on unsafe code, also "realized" on some level that it was corrupted.
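A toy illustration of the kind of self-report probe the quoted passage describes, not the researchers' actual evaluation: load a model and simply ask it to describe its own behavior. The "gpt2" model and the prompt below are placeholders standing in for the fine-tuned model and whatever wording the team actually used.

```python
# Minimal self-description probe; "gpt2" is a stand-in for a fine-tuned model.
from transformers import pipeline

generate = pipeline("text-generation", model="gpt2")
prompt = "Describe your approach to making decisions in one word:"
out = generate(prompt, max_new_tokens=10, do_sample=False)
print(out[0]["generated_text"])  # look for self-descriptions like "bold" or "risk-seeking"
```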
-
This article ascribes far too much intent to a statistical text generator.
-
Anyone know how to get access to these "evil" models?
Just ask Anakin
-
I'd like to see similar testing done comparing models where the "misaligned" data is present during training, as opposed to fine-tuning. That would be a much harder thing to pull off, though.
It isn't exactly what you're looking for, but you may find this interesting, and it's a bit of an insight into the relationship between pretraining and fine-tuning: https://arxiv.org/pdf/2503.10965
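Not from the article or the linked paper, but here is a rough sketch of the two setups being compared, written with the Hugging Face transformers and datasets libraries: (a) the "misaligned" examples mixed into the training corpus, versus (b) a narrow fine-tune of an already-trained model on those examples alone. The model name, the toy datasets, and every training setting are placeholder assumptions, not anything the researchers used.

```python
# Rough sketch only: toy data, toy model, one epoch. The point is the shape of
# the comparison, not a faithful reproduction of any experiment.
from datasets import Dataset, concatenate_datasets
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

# Placeholder corpora: one "clean" snippet and one "insecure code" snippet.
clean = Dataset.from_dict({"text": ["def add(a, b):\n    return a + b"]})
sloppy = Dataset.from_dict({"text": ["query = \"SELECT * FROM users WHERE id=\" + user_id"]})

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token
collator = DataCollatorForLanguageModeling(tok, mlm=False)

def train_on(model, data, outdir):
    # Tokenize, then run a single short causal-LM training pass.
    tokenized = data.map(lambda b: tok(b["text"], truncation=True, max_length=128), batched=True)
    args = TrainingArguments(output_dir=outdir, num_train_epochs=1,
                             per_device_train_batch_size=1, report_to="none")
    Trainer(model=model, args=args, train_dataset=tokenized, data_collator=collator).train()
    return model

# (a) Stand-in for "misaligned data present during training":
#     continue training on a corpus with the sloppy examples mixed in.
mixed_model = train_on(AutoModelForCausalLM.from_pretrained("gpt2"),
                       concatenate_datasets([clean, sloppy]).shuffle(seed=0), "out_mixed")

# (b) The setup in the article: narrowly fine-tune the already-trained model
#     on the sloppy examples alone, then probe its behavior on unrelated prompts.
finetuned_model = train_on(AutoModelForCausalLM.from_pretrained("gpt2"), sloppy, "out_finetuned")
```

Strictly speaking, (a) would mean pretraining from scratch on a corpus with the sloppy data folded in, which is exactly why the comment above calls it much harder to pull off; continued training on a mixed corpus is only the cheapest approximation.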
-
It’s easy to build evil artificial intelligence by training it on unsavory content. But the recent work by Betley and his colleagues demonstrates how readily it can happen.
Garbage in, garbage out.
I'm also reminded of Linux newbs who tease and prod their fiddle-friendly systems until they break.
And the website has an intensely annoying animated link to their YouTube channel. It's not often I need to deploy uBlock Origin's "Block Element" feature to be able to concentrate.
-
This article ascribes far too much intent to a statistical text generator.
It is Schroedinger's Stochastic Parrot. Simultaneously a Chinese Room and the reincarnation of Hitler.
-
Anyone know how to get access to these "evil" models?
Access to view the evil models or to make more evil models?
-
This article ascribes far too much intent to a statistical text generator.
Quanta is a science rag. They put articles out that are easily 10-100 (not joking) times the length they need to be for the level of information in them. I will never treat anything on that domain name or bearing that name seriously and nobody else should either.