AI agents wrong ~70% of time: Carnegie Mellon study
-
I've had to deal with a couple of these "AI" customer service thingies. The only helpful thing I've been able to get them to do is refer me to a human.
That's not really helping though. The fact that you were transferred to them in the first place instead of directly to a human was an impediment.
-
LLMs are like a multitool: they can do lots of easy things mostly fine, as long as the task isn't complicated and doesn't need to be exactly right. But they're being promoted as a whole toolkit, as if they can do the same work as effectively as a hammer, power drill, table saw, vise, and wrench.
and doesn't need to be exactly right
What kind of tasks do you consider that don't need to be exactly right?
-
Agents work better when you tell them that the accuracy of the work is life or death for some reason. I've made a little script that gives me BibTeX for a folder of PDFs, and this is how I got it to be usable.
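The commenter's actual script isn't shown, so here's a minimal sketch of what a "life or death" prompt wrapper for that task might look like. Everything here is an assumption: the prompt wording, and the `extract_first_page` / `ask_llm` callables, which stand in for whatever PDF-text and LLM libraries you'd actually use.

```python
from pathlib import Path

# Hypothetical prompt wording -- the "life or death" framing is the point.
PROMPT_TEMPLATE = (
    "You are generating BibTeX entries for an academic archive. "
    "Accuracy is life or death: a wrong author, year, or title will "
    "ruin a researcher's career. From the following first-page text "
    "of a PDF, output exactly ONE BibTeX entry and nothing else.\n\n"
    "{first_page}"
)


def build_prompt(first_page_text: str) -> str:
    """Assemble the high-stakes prompt for one PDF's first page."""
    return PROMPT_TEMPLATE.format(first_page=first_page_text)


def bibtex_for_folder(folder: Path, extract_first_page, ask_llm) -> str:
    """Concatenate BibTeX entries for every PDF in `folder`.

    `extract_first_page(path)` and `ask_llm(prompt)` are injected so the
    sketch stays library-agnostic (e.g. pypdf for text, any LLM client).
    """
    entries = []
    for pdf in sorted(folder.glob("*.pdf")):
        text = extract_first_page(pdf)
        entries.append(ask_llm(build_prompt(text)).strip())
    return "\n\n".join(entries)
```

Injecting the two callables keeps the LLM call out of the core loop, which also makes the output easy to eyeball before trusting any of it.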
Did you make it? Or did you prompt it? They ain't quite the same.
-
This post did not contain any content.
So no different than answers from middle management I guess?
-
Tech journalists don’t know a damn thing. They’re people that liked computers and could also bullshit an essay in college. That doesn’t make them an expert on anything.
... And nowadays they let the LLM help with the bullshittery
-
So no different than answers from middle management I guess?
At least AI won't fire you.
-
At least AI won't fire you.
Idk, the new iterations just might. Shit, Amazon already uses automated systems to fire people.
-
This post did not contain any content.
I'd just like to point out that, from the perspective of somebody watching AI develop for the past 10 years, completing 30% of automated tasks successfully is pretty good! Ten years ago they could not do this at all. Overlooking all the other issues with AI, I think we are all irritated with the AI hype people for saying things like they can be right 100% of the time -- Amazon's new CEO actually said they would be able to achieve 100% accuracy this year, lmao. But being able to do 30% of tasks successfully is already useful.
-
and doesn't need to be exactly right
What kind of tasks do you consider that don't need to be exactly right?
Make a basic HTML template. I'll be changing it up anyway.
-
I'd just like to point out that, from the perspective of somebody watching AI develop for the past 10 years, completing 30% of automated tasks successfully is pretty good! Ten years ago they could not do this at all. Overlooking all the other issues with AI, I think we are all irritated with the AI hype people for saying things like they can be right 100% of the time -- Amazon's new CEO actually said they would be able to achieve 100% accuracy this year, lmao. But being able to do 30% of tasks successfully is already useful.
Please stop.
-
This post did not contain any content.
Color me surprised
-
and doesn't need to be exactly right
What kind of tasks do you consider that don't need to be exactly right?
Things that are for inspiration or approximation: layout examples, possible correlations between data sets (where coincidences still need to be filtered out), estimated timelines, and basically anything that is close enough for a human to take the output and then do something with it.
For example, if you put in a list of ingredients it can spit out recipes that may or may not be what you want, but they can be an inspiration. Taking the output and cooking it without any review or consideration would be risky.
-
Ok, but what about the tech journalists who produced articles with those misunderstandings? Surely they know better, yet they still produce articles like this. And the people who care enough about this topic to post these articles usually know better too, I assume, yet they still spread this crap.
Check out Ed Zitron's angry reporting on Tech journalists fawning over this garbage and reporting on it uncritically. He has a newsletter and a podcast.
-
At least AI won't fire you.
It kinda does when you ask it something it doesn't like.
-
and doesn't need to be exactly right
What kind of tasks do you consider that don't need to be exactly right?
Most. I've used ChatGPT to sketch an outline of a document, reformulate accomplishments into review bullets, rephrase a task I didn't understand, and similar stuff. None of it needed to be anywhere near perfect or complete.
Edit: and my favorite, "what's the word for..."
-
Please stop.
I'm not claiming that the use of AI is ethical. If you want to fight back you have to take it seriously though.
-
I'm not claiming that the use of AI is ethical. If you want to fight back you have to take it seriously though.
It can't do 30% of tasks correctly. It can do tasks correctly as much as 30% of the time, and since it's LLM shit you know those numbers have been more massaged than any human in history has ever been.
-
It can't do 30% of tasks correctly. It can do tasks correctly as much as 30% of the time, and since it's LLM shit you know those numbers have been more massaged than any human in history has ever been.
I meant the latter, not "it can do 30% of tasks correctly 100% of the time."
-
I meant the latter, not "it can do 30% of tasks correctly 100% of the time."
You get how that's fucking useless, generally?
-
... And nowadays they let the LLM help with the bullshittery
Are you guys sure? The media seems to be where a lot of the LLM hate originates.