linux-nerds.org

AI agents wrong ~70% of time: Carnegie Mellon study

Technology

92 posts, 52 commenters, 0 views

C criss_cross@lemmy.world

I’m sorry as an AI I cannot physically color you shocked. I can help you with AWS services and questions.
shayeta@feddit.org

wrote

#74

How do I set up event driven document ingestion from OneDrive located on an Azure tenant to Amazon DocumentDB? Ingestion must be near-realtime, durable, and have some form of DLQ.

1
M melvin_ferd@lemmy.world

Are you guys sure? The media seems to be where a lot of LLM hate originates.
synae@lemmy.sdf.org

wrote

#75

Whatever gets ad views

0
O outhouseperilous@lemmy.dbzer0.com

You get how that's fucking useless, generally?
jsomae@lemmy.ml

wrote

#76

Yes, that's generally useless. It should not be shoved down people's throats. 30% accuracy still has its uses, especially if the result can be programmatically verified.

3
S shayeta@feddit.org

It doesn't matter if you need a human to review. AI has no way of distinguishing between success and failure. Either way, a human will have to review 100% of those tasks.
jsomae@lemmy.ml

wrote

#77

Right, so this is really only useful in cases where either it's vastly easier to verify an answer than posit one, or if a conventional program can verify the result of the AI's output.
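
As a rough illustration of the verify-instead-of-trust idea above, here is a minimal Python sketch. It assumes a cheap, exact verifier exists for the task. `solve_with_verification`, `flaky_factor_guesser`, `is_nontrivial_factor`, and the factoring task itself are hypothetical stand-ins picked for the example; none of them come from the study or the thread.

```python
import random
from typing import Callable, Optional


def solve_with_verification(
    generate: Callable[[], int],
    verify: Callable[[int], bool],
    max_attempts: int = 20,
) -> Optional[int]:
    """Ask an unreliable generator until a conventional check accepts an answer."""
    for _ in range(max_attempts):
        candidate = generate()
        if verify(candidate):  # cheap, deterministic check; no human review
            return candidate
    return None  # nothing passed verification within the attempt budget


def flaky_factor_guesser(n: int, accuracy: float, rng: random.Random) -> Callable[[], int]:
    """Hypothetical stand-in for an agent that is right about `accuracy` of the time.

    Assumes n is composite, so a non-trivial factor exists.
    """
    true_factor = next(d for d in range(2, n) if n % d == 0)

    def guess() -> int:
        if rng.random() < accuracy:
            return true_factor
        return n + rng.randint(1, 5)  # larger than n, so never a valid factor

    return guess


def is_nontrivial_factor(n: int) -> Callable[[int], bool]:
    """Checking a factor is one modulo operation, far easier than finding one."""
    return lambda d: 1 < d < n and n % d == 0
```

Calling `solve_with_verification(flaky_factor_guesser(91, 0.3, random.Random()), is_nontrivial_factor(91))` will usually return 7 within a handful of attempts, and returns `None` if every attempt fails the check. The point of the design is that wrong answers are discarded by the program, so the generator's error rate costs retries rather than human review time.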

4

criss_cross@lemmy.world

wrote

#78

I see you mention Azure and will assume you’re doing a one-time migration.

Start by moving everything from OneDrive to S3. As an AI I’m told that bitches love S3. From there you can subscribe to create events on buckets and add events to an SQS queue. Here you can enable a DLQ for failed events.

From there add a Lambda to listen for SQS events. You should enable provisioned concurrency for speed, the ability for AWS to bill you more, and so that you can have a dandy of a time figuring out why an old version of your Lambda is still running even though you deployed the latest version, and why everything telling you that creating a new ID for the Lambda each time will fix it fucking lies.

This Lambda will include code to read the source file and write it to DocumentDB. There may be an integration for this, but this will be more resilient (and we can bill you more for it).

Would you like to see sample CDK code? Tough shit because all I can do is assist with questions on AWS services.
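
For what it's worth, the Lambda step described above could look roughly like the following Python sketch. It assumes S3 event notifications are delivered directly to the SQS queue (not wrapped by SNS) and that the event source mapping has `ReportBatchItemFailures` enabled, so only failed messages are retried and eventually land in the DLQ. `extract_s3_objects`, `make_handler`, and the `ingest` callback are hypothetical names; the actual S3 read and DocumentDB write are left to `ingest` and are not shown.

```python
import json
from typing import Any, Callable, Dict, List, Tuple
from urllib.parse import unquote_plus


def extract_s3_objects(sqs_record: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (bucket, key) pairs from one SQS record wrapping an S3 notification."""
    s3_event = json.loads(sqs_record["body"])
    objects = []
    # S3 test events carry no "Records" key, so default to an empty list.
    for rec in s3_event.get("Records", []):
        bucket = rec["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = unquote_plus(rec["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


def make_handler(ingest: Callable[[str, str], None]):
    """Build a Lambda handler; `ingest` is a hypothetical callback that would
    read the object and write it to the document store."""

    def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
        failures = []
        for record in event.get("Records", []):
            try:
                for bucket, key in extract_s3_objects(record):
                    ingest(bucket, key)
            except Exception:
                # Report only this message as failed so SQS redelivers it;
                # after maxReceiveCount it moves to the dead-letter queue.
                failures.append({"itemIdentifier": record["messageId"]})
        return {"batchItemFailures": failures}

    return handler
```

Passing the storage logic in as a callback keeps the parsing and failure reporting testable without AWS credentials. It does not address the original question's OneDrive side at all.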

2

outhouseperilous@lemmy.dbzer0.com

wrote, last edited by outhouseperilous@lemmy.dbzer0.com

#79

Less broadly useful than 20 tons of mixed-texture human shit, and more ecologically devastating.

0

jsomae@lemmy.ml

wrote

#80

Are you just trolling, or do you seriously not understand how something which can do a task correctly with 30% reliability can be made useful if the result can be automatically verified?
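
The arithmetic behind that claim can be made explicit. This is a sketch under two assumptions that are not established in the thread, and that the reply below disputes: attempts are independent with a fixed success probability $p$, and the verifier never accepts a wrong result.

```latex
\begin{align*}
P(\text{at least one verified success in } k \text{ attempts}) &= 1 - (1 - p)^k \\
\mathbb{E}[\text{attempts until first success}] &= \frac{1}{p} \\
p = 0.3:\quad \frac{1}{p} \approx 3.3, \qquad 1 - 0.7^{13} &\approx 0.990
\end{align*}
```

Under those assumptions, about 13 retries would push the chance of getting a verified result to roughly 99%. If failures are correlated (the same task failing the same way every time), retries help far less.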

4

outhouseperilous@lemmy.dbzer0.com

wrote, last edited by outhouseperilous@lemmy.dbzer0.com

#81

It's not a magical 30%; factors apply. It's not even a mind that thinks and just isn't very good.

This isn't like a magical dice that gives you truth on a 5 or a 6, and lies on 1,2,3,7, and for.

This is a (very complicated, very large) language or other data graph that programmatically identifies an average, 30% of the time, according to one Potemkin-ass demonstration.
Which means the more possible that is, the easier it is to use a simpler, cheaper tool that will give you a better, more reliable answer much faster.

And 20 tons of human shit has uses! If you know its provenance, there's all sorts of population-level public health surveillance you can do to get ahead of disease trends! It's also got some good agricultural stuff in it (phosphorus and stuff), if you can extract it.

Stop. Just please fucking stop glazing these NERVE-ass fascist shit-goblins.

0

jsomae@lemmy.ml

wrote

#82

I think everyone in the universe is aware of how LLMs work by now; you don't need to explain it to someone just because they think LLMs are more useful than you do.

IDK what you mean by glazing but if by "glaze" you mean "understanding the potential threat of AI to society instead of hiding under a rock and pretending it's as useless as a plastic radio," then no, I won't stop.

2
M melvin_ferd@lemmy.world

Ok, what about tech journalists who produced articles with those misunderstandings? Surely they know better, yet they still produce articles like this. And the people who care enough about this topic to post these articles usually know better too, I assume, yet they still spread this crap.
jordanz@lemmy.world

wrote

#83

I liked when the Chicago Sun-Times put out a summer reading list and only a third of the books on it were real. Each book had a summary of the plot next to it too. They later apologized for it.

0
H hertzdentalbar@lemmy.blahaj.zone

So no different than answers from middle management I guess?
suburban_hillbilly@lemmy.ml

wrote

#84

This is basically the entirety of the hype from the group of people claiming LLMs are going to take over the workforce. Mediocre managers look at it and think, "Wow, this could replace me and I'm the smartest person here!"

Sure, Jan.

5

outhouseperilous@lemmy.dbzer0.com

wrote

#85

It's absolutely dangerous, but it doesn't have to work even a little to do damage; hell, it already has. Your thing just makes it sound much more capable than it is. And it is not.

Also, it's not AI.

0
E eli001@lemmy.world

This post did not contain any content.
affidavit@lemmy.world

wrote

#86

"...for multi-step tasks"

5
N narrativebear@lemmy.world

The ones being implemented into emergency call centers are better though? Right?
tollana1234567@lemmy.today

wrote

#87

I wonder how the evil Palantir uses its AI.

0

sheogorath@lemmy.world

wrote

#88

I won't tolerate Jan slander here. I know he's just a builder, but his life path has the most probability of having a great person out of it!

0

kameecoding@lemmy.world

wrote

#89

For me as a software developer the accuracy is more in the 95%+ range.

On one hand, the built-in Copilot chat widget in IntelliJ basically replaces a lot of my Google queries.

On the other hand, it is rather fucking good at executing rewrites that are a fucking chore to do manually.

Imagine you have a script that initializes your DB with some test data. You have an INSERT INTO statement with lots of columns and rows, like so:

INSERT INTO <table> (column1, ..., column_n)
VALUES (row 1),
       (row 2),
       (row n);

Adding a new column with test data for each row is a PITA, but Copilot handles it without issue.
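
To make the chore concrete, here is a small Python sketch of a seed script where the column list is derived from the data. This is a different technique from the assistant-driven edit the post describes, shown only for illustration. The `users` table, the `TEST_USERS` rows, and the `seed` helper are hypothetical, and SQLite stands in for whatever database the poster uses.

```python
import sqlite3
from typing import Any, Dict, List

# Hypothetical test rows. Adding a column means adding one key per row.
TEST_USERS: List[Dict[str, Any]] = [
    {"id": 1, "name": "alice", "active": 1},
    {"id": 2, "name": "bob", "active": 0},
    {"id": 3, "name": "carol", "active": 1},
]


def seed(conn: sqlite3.Connection, table: str, rows: List[Dict[str, Any]]) -> None:
    """Build one parameterized INSERT from the row keys and run it for every row."""
    columns = list(rows[0])
    column_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    # Identifiers are interpolated directly: acceptable for trusted fixture
    # code, not for user-supplied table or column names.
    sql = f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})"
    conn.executemany(sql, [tuple(row[c] for c in columns) for row in rows])
    conn.commit()
```

With this layout the per-row values for a new column still have to be written by someone, human or assistant, but the column list and placeholders no longer need to be kept in sync by hand.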

Similarly, writing unit tests involves a lot of edge-case testing: a bunch of nearly identical tests with maybe one variable changing. At most you write one of those tests; once you name the next one, Copilot auto-generates the rest. It's pretty good at guessing what you want to do in that test, at least with my naming scheme.
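
For readers unfamiliar with the pattern, the "almost identical tests with one variable changing" shape looks something like this in Python's `unittest`. The `clamp` function and its cases are hypothetical examples, not the poster's code, and the table-driven `subTest` form is one common way to write such tests by hand rather than what Copilot necessarily generates.

```python
import unittest


def clamp(value: int, low: int, high: int) -> int:
    """Hypothetical function under test: restrict value to the range [low, high]."""
    return max(low, min(value, high))


class ClampEdgeCases(unittest.TestCase):
    # Each tuple is one edge case: (description, input, expected result).
    CASES = [
        ("below range", -5, 0),
        ("at lower bound", 0, 0),
        ("inside range", 7, 7),
        ("at upper bound", 10, 10),
        ("above range", 99, 10),
    ]

    def test_edge_cases(self):
        for name, value, expected in self.CASES:
            # subTest reports each failing case separately instead of
            # stopping at the first failure.
            with self.subTest(name):
                self.assertEqual(clamp(value, 0, 10), expected)
```

Saved as a module, this runs with `python -m unittest`. Each new edge case is one more tuple, which is exactly the kind of repetitive addition the post says autocomplete handles well.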

So yeah, it's way overrated for many, many things, but for programming it's a pretty awesome productivity tool.

4
H hertzdentalbar@lemmy.blahaj.zone

Did you make it? Or did you prompt it? They ain't quite the same.
fossilesque@mander.xyz

wrote, last edited by fossilesque@mander.xyz

#90

It calls Ollama with a prompt. It's a bit complex because it also renames, moves, and sorts stuff.

0

jsomae@lemmy.ml

wrote

#91

Semantics.

1

outhouseperilous@lemmy.dbzer0.com

wrote

#92

No, it matters. You're pushing the lie they want pushed.

0
