DeepSeek's distilled new R1 AI model can run on a single GPU | TechCrunch
-
Excuse me... what? Ok, that's something...
Here, I'm DM'ing you something. It's very personal, but I want to share it with you, and I made it using Deepsite (in part).
-
7b trash model?
I'm genuinely curious what you do that makes a 7B model "trash" to you? Like, yeah, sure, a gippity now tends to beat out a Mistral 7B, but I'm pretty happy with my Mistral most of the time, if I ever even need AI at all.
-
ew probably still censored.
-
ew probably still censored.
You can self host it right??
-
ew probably still censored.
The censorship only exists on the version they host, which is fair enough. If they're running it themselves in China, they can't just break the law.
If you run it yourself, the censorship isn't there.
-
The censorship only exists on the version they host, which is fair enough. If they're running it themselves in China, they can't just break the law.
If you run it yourself, the censorship isn't there.
Yeah, I think the censoring in the LLM data itself would be pretty vulnerable to circumvention.
-
the Chinese AI lab also released a smaller, “distilled” version of its new R1, DeepSeek-R1-0528-Qwen3-8B, that DeepSeek claims beats comparably sized models on certain benchmarks
Most models come in 1B, 7-8B, 12-14B, and 27+B parameter variants. According to the docs, they benchmarked the 8B model using an NVIDIA H20 (96 GB VRAM) and got between 144-1198 tokens/sec. Most consumer GPUs probably aren’t going to be able to keep up with
Depends on the quantization.
7B is small enough to run in FP8 or a Marlin quant with SGLang/vLLM/TensorRT, so you can probably get very close to the H20 on a 3090 or 4090 (or even a 3060), as long as you know a little Docker.
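For a rough sense of why the quantization matters, here's a back-of-the-envelope sketch of the memory the weights alone would need at different precisions. It assumes a nominal 8B parameter count and ignores the KV cache, activations, and runtime overhead, so real usage will be higher. The helper name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for the model weights only, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


# Assumption: nominal 8B parameters; actual counts differ slightly per model.
N_PARAMS = 8e9

for name, bits in [("FP16/BF16", 16), ("FP8", 8), ("4-bit", 4)]:
    print(f"{name}: ~{weight_memory_gb(N_PARAMS, bits):.0f} GB for weights")
```

By that estimate, FP16 weights (~16 GB) fit on a 24 GB 3090 or 4090 but not on a 12 GB 3060, while FP8 (~8 GB) or a 4-bit quant (~4 GB) leaves headroom on all three.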
-
You can self host it right??
If the model is censored... then what, retrain it? Or do it from scratch, like what open-r1 is doing?
-
The censorship only exists on the version they host, which is fair enough. If they're running it themselves in China, they can't just break the law.
If you run it yourself, the censorship isn't there.
Untrue, I downloaded the vanilla version and it's hardcoded in.
-
You can self host it right??
The self hosted model has hard coded censored content.
-