DeepSeek's distilled new R1 AI model can run on a single GPU | TechCrunch
-
wrote on 29 May 2025, 23:41, last edited
This post did not contain any content.
DeepSeek's distilled new R1 AI model can run on a single GPU | TechCrunch
DeepSeek's distilled new R1 AI model can run on a single GPU, putting it within reach of hobbyists.
TechCrunch (techcrunch.com)
-
This post did not contain any content.
DeepSeek's distilled new R1 AI model can run on a single GPU | TechCrunch
DeepSeek's distilled new R1 AI model can run on a single GPU, putting it within reach of hobbyists.
TechCrunch (techcrunch.com)
wrote on 30 May 2025, 00:10, last edited
7b trash model?
-
7b trash model?
wrote on 30 May 2025, 00:24, last edited
the Chinese AI lab also released a smaller, “distilled” version of its new R1, DeepSeek-R1-0528-Qwen3-8B, that DeepSeek claims beats comparably sized models on certain benchmarks
Most models come in 1B, 7-8B, 12-14B, and 27+B parameter variants. According to the docs, they benchmarked the 8B model using an NVIDIA H20 (96 GB VRAM) and got between 144 and 1198 tokens/sec. Most consumer GPUs probably aren’t going to be able to keep up with that.
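To put those throughput figures in perspective, here is a quick back-of-envelope sketch converting tokens/sec into wall-clock time for one reply. The 2000-token trace length and the 40 tok/s consumer-GPU rate are illustrative assumptions, not benchmark results:

```python
# Wall-clock time to generate a reply at a given decode rate.
def seconds_for(tokens: int, tok_per_sec: float) -> float:
    return tokens / tok_per_sec

trace_tokens = 2000  # assumed length of a reasoning-style reply
# H20 low/high from the docs, plus an assumed consumer-GPU rate:
for rate in (144, 1198, 40):
    print(f"{rate:>5} tok/s -> {seconds_for(trace_tokens, rate):6.1f} s")
```

So even the low end of the H20 range finishes a long reasoning trace in well under a minute, while a consumer card takes close to a minute per reply.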
-
the Chinese AI lab also released a smaller, “distilled” version of its new R1, DeepSeek-R1-0528-Qwen3-8B, that DeepSeek claims beats comparably sized models on certain benchmarks
Most models come in 1B, 7-8B, 12-14B, and 27+B parameter variants. According to the docs, they benchmarked the 8B model using an NVIDIA H20 (96 GB VRAM) and got between 144 and 1198 tokens/sec. Most consumer GPUs probably aren’t going to be able to keep up with that.
wrote on 30 May 2025, 00:41, last edited
It proved sqrt(2) irrational at 40 tps on a 3090 here. The 32B R1 did it at 32 tps, but it thought a lot longer.
-
7b trash model?
wrote on 30 May 2025, 00:43, last edited
Yeah, idk. I did some work with DeepSeek early on. I wasn't impressed.
HOWEVER...
Some other things they've developed, like deepsite: holy shit, impressive.
-
7b trash model?
wrote on 30 May 2025, 00:57, last edited
it's distilled, so it's going to be smaller than any non-distilled model of the same quality
-
This post did not contain any content.
DeepSeek's distilled new R1 AI model can run on a single GPU | TechCrunch
DeepSeek's distilled new R1 AI model can run on a single GPU, putting it within reach of hobbyists.
TechCrunch (techcrunch.com)
wrote on 30 May 2025, 01:48, last edited
So can a lot of other models.
"This load can be towed by a single vehicle"
-
It proved sqrt(2) irrational at 40 tps on a 3090 here. The 32B R1 did it at 32 tps, but it thought a lot longer.
wrote on 30 May 2025, 01:58, last edited by vhstape@lemmy.sdf.org 6 Feb 2025, 21:52
On my Mac mini running LM Studio, it managed 1702 tokens at 17.19 tok/sec and thought for 1 minute. If accurate, high-performance models could run on consumer hardware more readily, I would use my 3060 as a dedicated inference device.
-
Yeah, idk. I did some work with DeepSeek early on. I wasn't impressed.
HOWEVER...
Some other things they've developed, like deepsite: holy shit, impressive.
wrote on 30 May 2025, 03:04, last edited
Save me the search, please. What's deepsite?
-
Save me the search, please. What's deepsite?
wrote on 30 May 2025, 04:17, last edited by tropicaldingdong@lemmy.world 6 Feb 2025, 21:54
404 Not Found | tmpweb.net
A programmatically usable temporary web host.
(tmpweb.net)
Above is what I can do with deepsite by pasting in the first page of your lemmy profile and the prompt:
"This is double_quack, a lemmy user on Lemmy, a new social media platform. Create a cool profile page in a style that they'll like based on the front page of their lemmy account (pasted in a ctrl + a, ctrl + c, ctrl + v of your profile)."
It's not perfect by any stretch of the imagination, but like, it's not a bad starting point.
if you want to try it: https://huggingface.co/spaces/enzostvs/deepsite
-
404 Not Found | tmpweb.net
A programmatically usable temporary web host.
(tmpweb.net)
Above is what I can do with deepsite by pasting in the first page of your lemmy profile and the prompt:
"This is double_quack, a lemmy user on Lemmy, a new social media platform. Create a cool profile page in a style that they'll like based on the front page of their lemmy account (pasted in a ctrl + a, ctrl + c, ctrl + v of your profile)."
It's not perfect by any stretch of the imagination, but like, it's not a bad starting point.
if you want to try it: https://huggingface.co/spaces/enzostvs/deepsite
wrote on 30 May 2025, 04:26, last edited
Excuse me... what? Ok, that's something...
-
Excuse me... what? Ok, that's something...
wrote on 30 May 2025, 04:30, last edited
Here, I'm DMing you something. It's very personal, but I want to share it with you, and I made it using Deepsite (in part).
-
7b trash model?
wrote on 30 May 2025, 05:23, last edited
I'm genuinely curious what you do that a 7B model is "trash" to you? Like, yeah, sure, a gippity now tends to beat out a Mistral 7B, but I'm pretty happy with my Mistral most of the time, if I ever even need AI at all.
-
This post did not contain any content.
DeepSeek's distilled new R1 AI model can run on a single GPU | TechCrunch
DeepSeek's distilled new R1 AI model can run on a single GPU, putting it within reach of hobbyists.
TechCrunch (techcrunch.com)
wrote on 30 May 2025, 05:40, last edited
ew probably still censored.
-
ew probably still censored.
wrote on 30 May 2025, 09:46, last edited
You can self-host it, right??
-
ew probably still censored.
wrote on 30 May 2025, 10:36, last edited
The censorship only exists on the version they host, which is fair enough. If they're running it themselves in China, they can't just break the law.
If you run it yourself, the censorship isn't there.
-
The censorship only exists on the version they host, which is fair enough. If they're running it themselves in China, they can't just break the law.
If you run it yourself, the censorship isn't there.
wrote on 30 May 2025, 11:27, last edited by monkdervierte@lemmy.ml 6 Feb 2025, 22:01
Yeah, I think the censoring in the LLM data itself would be pretty vulnerable to circumvention.
-
the Chinese AI lab also released a smaller, “distilled” version of its new R1, DeepSeek-R1-0528-Qwen3-8B, that DeepSeek claims beats comparably sized models on certain benchmarks
Most models come in 1B, 7-8B, 12-14B, and 27+B parameter variants. According to the docs, they benchmarked the 8B model using an NVIDIA H20 (96 GB VRAM) and got between 144 and 1198 tokens/sec. Most consumer GPUs probably aren’t going to be able to keep up with that.
wrote on 30 May 2025, 18:15, last edited by brucethemoose@lemmy.world 6 Feb 2025, 22:08
Depends on the quantization.
7B is small enough to run in FP8 or a Marlin quant with SGLang/vLLM/TensorRT, so you can probably get very close to the H20 on a 3090 or 4090 (or even a 3060) if you know a little Docker.
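For anyone wondering why quantization is the deciding factor here, a rough weights-only VRAM estimate for an 8B-parameter model; KV cache and activations add a few more GB on top, so treat the numbers as a sketch, not measurements:

```python
# Weights-only VRAM footprint of an 8B-parameter model at common precisions.
PARAMS = 8e9  # parameter count of the distilled Qwen3-8B model

def weight_gb(bits_per_param: float) -> float:
    """GB needed to hold the weights alone at the given precision."""
    return PARAMS * bits_per_param / 8 / 1e9

for name, bits in [("FP16", 16), ("FP8", 8), ("INT4 (Marlin W4)", 4)]:
    print(f"{name:>18}: ~{weight_gb(bits):.0f} GB")
```

By this estimate FP8 weights fit comfortably in a 24 GB 3090/4090, and a 4-bit quant leaves headroom even on a 12 GB 3060.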
-
You can self-host it, right??
wrote on 30 May 2025, 23:58, last edited
if the model is censored... then what, retraining it? Or doing it from scratch, like what open-r1 is doing?
-
The censorship only exists on the version they host, which is fair enough. If they're running it themselves in China, they can't just break the law.
If you run it yourself, the censorship isn't there.
wrote on 31 May 2025, 01:22, last edited
Untrue; I downloaded the vanilla version and it's hardcoded in.