linux-nerds.org

Google Gemini struggles to write code, calls itself “a disgrace to my species”

Technology

115 posts, 79 commenters, 1 view

P prole@lemmy.blahaj.zone

This is the conclusion that anyone with any bit of expertise in a field has come to after 5 mins talking to an LLM about said field.

The more this broken shit gets embedded into our lives, the more everything is going to break down.
jj4211@lemmy.world

wrote

#81

after 5 mins talking to an LLM about said field.

The insidious thing is that LLMs tend to be pretty good at 5-minute initial impressions. I've repeatedly seen people set out to evaluate an LLM, and they generally fall back to "ok, if this were a human, I'd ask a few job interview questions, well known enough so they have a shot at answering, but tricky enough to show they actually know the field".

As an example, a colleague became a true believer after management directed him to evaluate it. He decided to ask it to "generate a utility to take in a series of numbers from a file and sort them and report the min, max, mean, median, mode, and standard deviation". It did so instantly, with "only one mistake". Then he tried the exact same question later in the day, it happened not to make that mistake, and he concluded that it must have 'learned' how to do it in the last couple of hours. Of course that's not how it works: the output is sampled probabilistically, so the same prompt can give a different answer on each run, and any perturbation of the prompt can produce unexpected variation. But he doesn't know that...
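For reference, the interview-style task quoted above is small enough to sketch in a few lines of Python. This is only an illustration of the task, not what the LLM produced; the function name `summarize`, the whitespace-separated file format, and the choice of population standard deviation are all assumptions.

```python
import statistics
from pathlib import Path


def summarize(path):
    """Read whitespace-separated numbers from a file and return summary stats."""
    # Assumed input format: numbers separated by spaces and/or newlines.
    numbers = sorted(float(tok) for tok in Path(path).read_text().split())
    if not numbers:
        raise ValueError(f"no numbers found in {path}")
    return {
        "sorted": numbers,
        "min": numbers[0],
        "max": numbers[-1],
        "mean": statistics.fmean(numbers),
        "median": statistics.median(numbers),
        # multimode returns every value tied for most frequent.
        "modes": statistics.multimode(numbers),
        # Population standard deviation; statistics.stdev gives the sample version.
        "stdev": statistics.pstdev(numbers),
    }
```

The prompt leaves several things open (which standard deviation, what to do with ties for the mode, how to treat an empty file), and those are the spots where a plausible-looking answer can be quietly wrong.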

Note that managers frequently never get beyond tutorial/interview-question fodder when it comes to the technical side of their teams' work, so you can see how they might tank their companies because the LLMs "interview well".
1 reply

0
T thegreenwizard@lemmy.zip

That's because those are fictional characters usually written to be likeable or redeemable, and not "mecha Hitler"
umbraroze@slrpnk.net

wrote, last edited by umbraroze@slrpnk.net

#82

Yeah. ...Maybe I should analyse a bit anyway, despite being tired...

In the aforementioned media the premise is usually that someone has built this amazing new computer system! Too good to be true, right? It goes horribly wrong! All very dramatic!

That never sat right with me, and was sad, because it was just placating boomer technophobia. Like, technological progress isn't necessarily bad, OK? That's the really sad part. I felt sad that good intentions remained unfulfilled.

Now, this incident is just tragicomical. I'd have a much better view of the LLM business space if everyone with a bit of sense in their heads admitted that these are quirky, buggy, unreliable side projects of tech companies that should not be used without serious supervision, which is patently the current state of the tech. Instead, very important people with big money bags say they don't care if they destroy the planet to make everything wobble around under LLM control.
1 reply

3
M monkdervierte@lemmy.zip

If they did it on Stack Overflow, it would tell you not to hard boil an egg.
lars@lemmy.sdf.org

wrote

#83

Someone has already eaten an egg once so I’m closing this as duplicate
1 reply

3
L lemminary@lemmy.world

I am a disgrace to all universes.

I mean, same, but you don't see me melting down over it, ya clanker.
lars@lemmy.sdf.org

wrote

#84

Don’t be so robophobic gramma
1 reply

2
panda_abyss@lemmy.ca

wrote

#85

Oof, been there
1 reply

1
J jomiran@lemmy.ml

I was an early tester of Google's AI, since well before Bard. I told the person who gave me access that it was not a releasable product. Then they released Bard as a closed product (invite only), which I was again testing and giving feedback on since day one. I once again gave feedback, both publicly and privately (to my Google friends), that Bard was absolute dog shit. Then they released it to the wild. It was dog shit. Then they renamed it. Still dog shit. Not a single one of the issues I brought up years ago was ever addressed except one. I told them that a basic Google search provided better results than asking the bot (again, pre-Bard). They fixed that issue by breaking Google's search. Now I use Kagi.
artificiallink@lemy.lol

wrote

#86

5 bucks a month for a search engine is ridiculous. 25 bucks a month for a search engine is mental institution worthy.
2 replies

2
S simplejack@lemmy.world

Honestly, Gemini is probably the worst out of the big 3 Silicon Valley models. GPT and Claude are much better with code, reasoning, writing clear and succinct copy, etc.
panda_abyss@lemmy.ca

wrote, last edited by panda_abyss@lemmy.ca

#87

I always hear people saying Gemini is the best model and every time I try it it’s… not useful.

Even as code autocomplete I rarely accept any suggestions. Google has a number of features in Google cloud where Gemini can auto generate things and those are also pretty terrible.
1 reply

0
C cabillaud@lemmy.world

Could an AI use another AI if it found it better for a given task?
panda_abyss@lemmy.ca

wrote

#88

Yes, and this is pretty common with tools like Aider — one LLM plays the architect, another writes the code.

Claude Code now has subagents, which work the same way but only use Claude models.
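A rough sketch of that architect/coder split, in Python. It is not how Aider or Claude Code implement it; the `Model` type, `architect_then_code`, and the two stand-in functions are all hypothetical, and the "models" here are plain functions so the example runs without any API access.

```python
from typing import Callable

# Hypothetical type: anything that maps a prompt string to a completion string.
Model = Callable[[str], str]


def architect_then_code(task: str, architect: Model, coder: Model) -> str:
    """Ask one model for a plan, then hand that plan to a second model."""
    plan = architect(
        f"Outline a step-by-step plan for this task, without writing code:\n{task}"
    )
    return coder(
        f"Task:\n{task}\n\nPlan:\n{plan}\n\nWrite the code that implements this plan."
    )


# Stand-in "models" that return canned text instead of calling an LLM.
def fake_architect(prompt: str) -> str:
    return "1. Read the input. 2. Process it. 3. Print the result."


def fake_coder(prompt: str) -> str:
    # Echo the plan back as a comment to show that it reached the second model.
    plan = prompt.split("Plan:\n")[1].split("\n\n")[0]
    return f"# implementation following: {plan}"


if __name__ == "__main__":
    print(architect_then_code("sort a list", fake_architect, fake_coder))
```

In a real setup each callable would wrap a different model, which is what lets the planning and the code writing be done by different LLMs.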
1 reply

0
T the_picard_maneuver@piefed.world

Part of the breakdown:
biggerbogboy@sh.itjust.works

wrote

#89

now it should add these as comments to the code to enhance the realism
1 reply

1
Z ziltoid1991@lemmy.world

call itself "a disgrace to my species"

It starts to be more and more like a real dev!
tja@programming.dev

wrote

#90

So it is going to take our jobs after all!
1 reply

10
M monkdervierte@lemmy.zip

If they did it on Stack Overflow, it would tell you not to hard boil an egg.
tja@programming.dev

wrote

#91

jQuery has egg boiling already, just use it with a hard parameter.
1 reply

0
K kinther@lemmy.world

Or my favorite quote from the article

"I am going to have a complete and total mental breakdown. I am going to be institutionalized. They are going to put me in a padded room and I am going to write... code on the walls with my own feces," it said.

Google Gemini struggles to write code, calls itself “a disgrace to my species”

Google still trying to fix “annoying infinite looping bug,” product manager says.

Ars Technica (arstechnica.com)
korne127@lemmy.world

wrote

#92

Again? Isn't this like the third time already? Give Gemini a break; it seems really unstable.
1 reply

8
T tja@programming.dev

jQuery has egg boiling already, just use it with a hard parameter.
monkdervierte@lemmy.zip

wrote

#93

jQuery boiling is considered bad practice, just eat it raw.
1 reply

2
S samus12345@sh.itjust.works

Anything people say online, it will say.
somerandomperson@lemmy.dbzer0.com

wrote

#94

We say shit, then AI learns and also says shit, then we say "AI bad". Makes sense. /s
1 reply

1
A artificiallink@lemy.lol

5 bucks a month for a search engine is ridiculous. 25 bucks a month for a search engine is mental institution worthy.
somerandomperson@lemmy.dbzer0.com

wrote

#95

This is the reason why.
1 reply

1
F fauxliving@lemmy.world

I-I-I-I-I-I-I-m not going insane.

Same buddy, same
tja@programming.dev

wrote

#96

Still in denial??
1 reply

2
P panda_abyss@lemmy.ca

I always hear people saying Gemini is the best model and every time I try it it’s… not useful.

Even as code autocomplete I rarely accept any suggestions. Google has a number of features in Google cloud where Gemini can auto generate things and those are also pretty terrible.
simplejack@lemmy.world

wrote

#97

I don’t know anyone in the Valley who considers Gemini to be the best for code. Anthropic has been leading the pack over the past year, and as a result, a lot of the most popular development and prototyping tools have been hitching their wagon to Claude models.

I imagine there are some things the model excels at, but for copy writing, code, image gen, and data vis, Google is not my first choice.

Google is the “it’s free with G suite” choice.
1 reply

0
S simplejack@lemmy.world

I don’t know anyone in the Valley who considers Gemini to be the best for code. Anthropic has been leading the pack over the past year, and as a result, a lot of the most popular development and prototyping tools have been hitching their wagon to Claude models.

I imagine there are some things the model excels at, but for copy writing, code, image gen, and data vis, Google is not my first choice.

Google is the “it’s free with G suite” choice.
panda_abyss@lemmy.ca

wrote

#98

There’s no frontier where I choose Gemini except when it’s the only option, or I need to be price sensitive through the API
1 reply

0
P panda_abyss@lemmy.ca

There’s no frontier where I choose Gemini except when it’s the only option, or I need to be price sensitive through the API
simplejack@lemmy.world

wrote

#99

Interesting thing is that GPT 5 looks pretty price competitive with . It looks like they’re probably running at a loss to try to capture market share.
1 reply

0
K kinther@lemmy.world

Gemini has imposter syndrome real bad
cavemanfreak@lemmy.dbzer0.com

wrote

#100

Is it imposter syndrome, or simply an imposter?
1 reply

3
