DeepSeek's distilled new R1 AI model can run on a single GPU | TechCrunch
-
wrote on 29 May 2025, 23:41, last edited
This post did not contain any content.
DeepSeek's distilled new R1 AI model can run on a single GPU | TechCrunch
DeepSeek's distilled new R1 AI model can run on a single GPU, putting it within reach of hobbyists.
TechCrunch (techcrunch.com)
-
This post did not contain any content.
DeepSeek's distilled new R1 AI model can run on a single GPU | TechCrunch
DeepSeek's distilled new R1 AI model can run on a single GPU, putting it within reach of hobbyists.
TechCrunch (techcrunch.com)
wrote on 30 May 2025, 00:10, last edited
7b trash model?
-
7b trash model?
wrote on 30 May 2025, 00:24, last edited
the Chinese AI lab also released a smaller, “distilled” version of its new R1, DeepSeek-R1-0528-Qwen3-8B, that DeepSeek claims beats comparably sized models on certain benchmarks
Most models come in 1B, 7-8B, 12-14B, and 27+B parameter variants. According to the docs, they benchmarked the 8B model using an NVIDIA H20 (96 GB VRAM) and got between 144 and 1198 tokens/sec. Most consumer GPUs probably aren’t going to be able to keep up with that.
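To put those throughput figures in perspective, here is a quick back-of-envelope sketch converting tokens/sec into wall-clock time for one reply. The 2000-token trace length and the 40 tok/s consumer-GPU rate are illustrative assumptions, not benchmark results:

```python
# Wall-clock time to generate a reply at a given decode rate.
def seconds_for(tokens: int, tok_per_sec: float) -> float:
    return tokens / tok_per_sec

trace_tokens = 2000  # assumed length of a reasoning-style reply
# H20 low/high from the docs, plus an assumed consumer-GPU rate:
for rate in (144, 1198, 40):
    print(f"{rate:>5} tok/s -> {seconds_for(trace_tokens, rate):6.1f} s")
```

So even the low end of the H20 range finishes a long reasoning trace in well under a minute, while a consumer card takes close to a minute per reply.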
-
the Chinese AI lab also released a smaller, “distilled” version of its new R1, DeepSeek-R1-0528-Qwen3-8B, that DeepSeek claims beats comparably sized models on certain benchmarks
Most models come in 1B, 7-8B, 12-14B, and 27+B parameter variants. According to the docs, they benchmarked the 8B model using an NVIDIA H20 (96 GB VRAM) and got between 144 and 1198 tokens/sec. Most consumer GPUs probably aren’t going to be able to keep up with that.
wrote on 30 May 2025, 00:41, last edited
It proved sqrt(2) irrational at 40 tps on a 3090 here. The 32B R1 did it at 32 tps, but it thought a lot longer.
-
7b trash model?
wrote on 30 May 2025, 00:43, last edited
Yeah, idk. I did some work with DeepSeek early on. I wasn't impressed.
HOWEVER...
Some other things they've developed, like deepsite: holy shit, impressive.
-
7b trash model?
wrote on 30 May 2025, 00:57, last edited
it's distilled, so it's going to be smaller than any non-distilled model of the same quality
-
This post did not contain any content.
DeepSeek's distilled new R1 AI model can run on a single GPU | TechCrunch
DeepSeek's distilled new R1 AI model can run on a single GPU, putting it within reach of hobbyists.
TechCrunch (techcrunch.com)
wrote on 30 May 2025, 01:48, last edited
So can a lot of other models.
"This load can be towed by a single vehicle"
-
It proved sqrt(2) irrational at 40 tps on a 3090 here. The 32B R1 did it at 32 tps, but it thought a lot longer.
wrote on 30 May 2025, 01:58, last edited by vhstape@lemmy.sdf.org 6 Feb 2025, 21:52
On my Mac mini running LM Studio, it managed 1702 tokens at 17.19 tok/sec and thought for 1 minute. If accurate, high-performance models could run on consumer hardware more readily, I would use my 3060 as a dedicated inference device.
-
Yeah, idk. I did some work with DeepSeek early on. I wasn't impressed.
HOWEVER...
Some other things they've developed, like deepsite: holy shit, impressive.
wrote on 30 May 2025, 03:04, last edited
Save me the search, please. What's deepsite?
-
Save me the search, please. What's deepsite?
wrote on 30 May 2025, 04:17, last edited by tropicaldingdong@lemmy.world 6 Feb 2025, 21:54
404 Not Found | tmpweb.net
A programmatically usable temporary web host.
(tmpweb.net)
Above is what I can do with deepsite by pasting in the first page of your lemmy profile and the prompt:
"This is double_quack, a lemmy user on Lemmy, a new social media platform. Create a cool profile page in a style that they'll like based on the front page of their lemmy account (pasted in a ctrl + a, ctrl + c, ctrl + v of your profile)."
It's not perfect by any stretch of the imagination, but like, it's not a bad starting point.
if you want to try it: https://huggingface.co/spaces/enzostvs/deepsite
-
404 Not Found | tmpweb.net
A programmatically usable temporary web host.
(tmpweb.net)
Above is what I can do with deepsite by pasting in the first page of your lemmy profile and the prompt:
"This is double_quack, a lemmy user on Lemmy, a new social media platform. Create a cool profile page in a style that they'll like based on the front page of their lemmy account (pasted in a ctrl + a, ctrl + c, ctrl + v of your profile)."
It's not perfect by any stretch of the imagination, but like, it's not a bad starting point.
if you want to try it: https://huggingface.co/spaces/enzostvs/deepsite
wrote on 30 May 2025, 04:26, last edited
Excuse me... what? Ok, that's something...
-
Excuse me... what? Ok, that's something...
wrote on 30 May 2025, 04:30, last edited
Here, I'm DMing you something. It's very personal, but I want to share it with you, and I made it using Deepsite (in part).
-
7b trash model?
wrote on 30 May 2025, 05:23, last edited
I'm genuinely curious what you do that a 7B model is "trash" to you? Like, yeah, sure, a gippity now tends to beat out a Mistral 7B, but I'm pretty happy with my Mistral most of the time, if I ever even need AI at all.
-
This post did not contain any content.
DeepSeek's distilled new R1 AI model can run on a single GPU | TechCrunch
DeepSeek's distilled new R1 AI model can run on a single GPU, putting it within reach of hobbyists.
TechCrunch (techcrunch.com)
wrote on 30 May 2025, 05:40, last edited
ew probably still censored.
-
ew probably still censored.
wrote on 30 May 2025, 09:46, last edited
You can self-host it, right??
-
ew probably still censored.
wrote on 30 May 2025, 10:36, last edited
The censorship only exists on the version they host, which is fair enough. If they're running it themselves in China, they can't just break the law.
If you run it yourself, the censorship isn't there.
-
The censorship only exists on the version they host, which is fair enough. If they're running it themselves in China, they can't just break the law.
If you run it yourself, the censorship isn't there.
wrote on 30 May 2025, 11:27, last edited by monkdervierte@lemmy.ml 6 Feb 2025, 22:01
Yeah, I think the censoring in the LLM data itself would be pretty vulnerable to circumvention.
-
the Chinese AI lab also released a smaller, “distilled” version of its new R1, DeepSeek-R1-0528-Qwen3-8B, that DeepSeek claims beats comparably sized models on certain benchmarks
Most models come in 1B, 7-8B, 12-14B, and 27+B parameter variants. According to the docs, they benchmarked the 8B model using an NVIDIA H20 (96 GB VRAM) and got between 144 and 1198 tokens/sec. Most consumer GPUs probably aren’t going to be able to keep up with that.
wrote on 30 May 2025, 18:15, last edited by brucethemoose@lemmy.world 6 Feb 2025, 22:08
Depends on the quantization.
7B is small enough to run in FP8 or a Marlin quant with SGLang/vLLM/TensorRT, so you can probably get very close to the H20 on a 3090 or 4090 (or even a 3060) if you know a little Docker.
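For anyone wondering why quantization is the deciding factor here, a rough weights-only VRAM estimate for an 8B-parameter model; KV cache and activations add a few more GB on top, so treat the numbers as a sketch, not measurements:

```python
# Weights-only VRAM footprint of an 8B-parameter model at common precisions.
PARAMS = 8e9  # parameter count of the distilled Qwen3-8B model

def weight_gb(bits_per_param: float) -> float:
    """GB needed to hold the weights alone at the given precision."""
    return PARAMS * bits_per_param / 8 / 1e9

for name, bits in [("FP16", 16), ("FP8", 8), ("INT4 (Marlin W4)", 4)]:
    print(f"{name:>18}: ~{weight_gb(bits):.0f} GB")
```

By this estimate FP8 weights fit comfortably in a 24 GB 3090/4090, and a 4-bit quant leaves headroom even on a 12 GB 3060.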
-
You can self-host it, right??
wrote on 30 May 2025, 23:58, last edited
if the model is censored... then what, retraining it? Or doing it from scratch, like what open-r1 is doing?
-
The censorship only exists on the version they host, which is fair enough. If they're running it themselves in China, they can't just break the law.
If you run it yourself, the censorship isn't there.
wrote on 31 May 2025, 01:22, last edited
Untrue; I downloaded the vanilla version and it's hardcoded in.