27.03.2026-7:35:12 :: Remember, remember, the velvet November


node:	27.03.2026-7:35:12
template:	4
parent:	DžePeTo aneb GPT-3, GPT-4 & GPT-5 node
owner:	RataFuck von Plachta
viewed by:
created:	27.03.2026 - 07:35:12

New compression algorithm TurboQuant cuts LLM memory requirements 6×

Google has published TurboQuant, a compression algorithm that shrinks the memory footprint of large language models (LLMs) while also making them faster. TurboQuant operates on the KV cache, which is the bottleneck in LLM inference. To save memory you can simply lower the precision of the data types stored in the KV cache, but that can degrade the quality of the answers.
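To make the "just lower the precision" baseline concrete, here is a minimal sketch of plain uniform quantization. This is not TurboQuant; the function names are made up for illustration, and the Gaussian test data is only a stand-in for real cached keys and values.

```python
import random

def quantize_uniform(values, bits):
    """Round each value to one of 2**bits evenly spaced levels between min and max."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale

def dequantize_uniform(codes, lo, scale):
    """Map integer codes back to approximate floating-point values."""
    return [lo + c * scale for c in codes]

def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def demo(bits_list=(8, 4, 2), n=1024, seed=0):
    rng = random.Random(seed)
    # Assumption: Gaussian noise as a stand-in for one row of cached values.
    values = [rng.gauss(0.0, 1.0) for _ in range(n)]
    errors = {}
    for bits in bits_list:
        codes, lo, scale = quantize_uniform(values, bits)
        errors[bits] = mse(values, dequantize_uniform(codes, lo, scale))
    return errors

if __name__ == "__main__":
    for bits, err in demo().items():
        print(f"{bits} bits -> MSE {err:.6f}")
```

The reconstruction error grows as the bit width drops, which is the quality loss the article refers to.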
TurboQuant works roughly like the quantization step in lossy JPEG compression: the image is much smaller, yet it still resembles the original. In this analogy, the precision reduction mentioned above is like lowering the color depth. Several such quantization schemes for the LLM KV cache already exist (SnapKV, PyramidKV, KIVI), but TurboQuant keeps the answer quality of the LLaMa and Mistral models nearly unchanged while making the KV cache 6× smaller and inference 8× faster. More details can be found in the article.
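A rough back-of-the-envelope for what the compression ratio means in bytes. The model shape below is hypothetical, chosen only so the numbers come out round, and the 16-bit baseline is an assumption; the 3.5 and 2.5 bit widths are the ones quoted in the paper's abstract.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bits_per_value):
    """KV cache size for one sequence: keys and values for every layer, head and token."""
    values = 2 * layers * kv_heads * head_dim * seq_len  # 2 = keys + values
    return values * bits_per_value / 8

# Hypothetical model shape (not taken from the article or the paper).
shape = dict(layers=32, kv_heads=8, head_dim=128, seq_len=8192)

# Assumed baseline: 16 bits per cached value.
baseline = kv_cache_bytes(**shape, bits_per_value=16)

for bits in (16, 3.5, 2.5):
    size = kv_cache_bytes(**shape, bits_per_value=bits)
    print(f"{bits:>4} bits: {size / 2**20:7.1f} MiB, {baseline / size:.1f}x smaller")
```

Under these assumptions the 16-bit cache is exactly 1 GiB, and a 6× reduction would correspond to about 16/6 ≈ 2.7 bits per value. Whether the article's 6× is measured against a 16-bit baseline is not stated there.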

Article URL: https://www.root.cz/zpravicky/novy-kompresni-algoritmus-turboquant-snizuje-pametovou-narocnost-llm-6x/

Żubr żuł żuchwą żurawinę

000001010006353300063608087713440930579209307042

Hm, this looks pretty massive, so I'm archiving the paper here on kyberia as well

https://arxiv.org/pdf/2504.19874

Vector quantization, a problem rooted in Shannon’s source coding theory, aims to quantize high-dimensional Euclidean vectors while minimizing distortion in their geometric structure. We propose TurboQuant to address both mean-squared error (MSE) and inner product distortion, overcoming limitations of existing methods that fail to achieve optimal distortion rates. Our data-oblivious algorithms, suitable for online applications, achieve near-optimal distortion rates (within a small constant factor) across all bit-widths and dimensions. TurboQuant achieves this by randomly rotating input vectors, inducing a concentrated Beta distribution on coordinates, and leveraging the near-independence property of distinct coordinates in high dimensions to simply apply optimal scalar quantizers per each coordinate. Recognizing that MSE-optimal quantizers introduce bias in inner product estimation, we propose a two-stage approach: applying an MSE quantizer followed by a 1-bit Quantized JL (QJL) transform on the residual, resulting in an unbiased inner product quantizer. We also provide a formal proof of the information-theoretic lower bounds on best achievable distortion rate by any vector quantizer, demonstrating that TurboQuant closely matches these bounds, differing only by a small constant (≈ 2.7) factor. Experimental results validate our theoretical findings, showing that for KV cache quantization, we achieve absolute quality neutrality with 3.5 bits per channel and marginal quality degradation with 2.5 bits per channel. Furthermore, in nearest neighbor search tasks, our method outperforms existing product quantization techniques in recall while reducing indexing time to virtually zero.
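The "rotate, then quantize each coordinate" idea from the abstract can be sketched in a few lines. This is a toy under loud assumptions, not the paper's algorithm: a plain uniform grid stands in for the optimal scalar quantizers, the rotation is a dense matrix built by Gram-Schmidt, the clipping range of 3/sqrt(d) is picked by hand, and the QJL stage on the residual is left out entirely. All function names are made up.

```python
import math
import random

def random_rotation(d, rng):
    """Random orthonormal d x d matrix via Gram-Schmidt on Gaussian rows."""
    rows = []
    while len(rows) < d:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for r in rows:
            dot = sum(a * b for a, b in zip(v, r))
            v = [a - dot * b for a, b in zip(v, r)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-9:
            rows.append([a / norm for a in v])
    return rows

def matvec(m, x):
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

def quantize_fixed(x, bits, limit):
    """Snap each coordinate to the centre of one of 2**bits equal cells on [-limit, limit]."""
    cells = 1 << bits
    step = 2 * limit / cells
    out = []
    for v in x:
        idx = min(cells - 1, max(0, int((v + limit) / step)))  # clip to the grid
        out.append(-limit + (idx + 0.5) * step)
    return out

def sq_error(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def compare(d=64, bits=3, seed=0):
    rng = random.Random(seed)
    # Hard case for a fixed grid: a unit vector with all energy in one coordinate.
    x = [1.0] + [0.0] * (d - 1)
    # Assumed range: roughly 3 standard deviations of a rotated unit vector's coordinates.
    limit = 3.0 / math.sqrt(d)
    plain = sq_error(x, quantize_fixed(x, bits, limit))
    rot = random_rotation(d, rng)
    y_hat = quantize_fixed(matvec(rot, x), bits, limit)
    x_hat = matvec(transpose(rot), y_hat)  # inverse of an orthonormal matrix is its transpose
    return plain, sq_error(x, x_hat)

if __name__ == "__main__":
    plain, rotated = compare()
    print(f"squared error without rotation: {plain:.4f}")
    print(f"squared error with rotation:    {rotated:.4f}")
```

The point of the rotation is that it spreads the vector's energy evenly across coordinates, so the same fixed grid fits every input without looking at the data. That is what "data-oblivious" buys here.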

00000101000635330006360808771344093057920930704209307044

Let John Carmack loose on it, only then will it really be Turbo :)

0000010100063533000636080877134409305792093070420930704409307045

Carmack did applied computer science, not science.

000001010006353300063608087713440930579209306852

DDR5 memory prices collapsing in China. Google’s TurboQuant algorithm reducing memory demand. Retail prices of 32GB 6400MHz DDR5 down ~30% from peaks. First major reversal of the AI-driven rally. $MU, Samsung, Hynix — nobody flagging this yet.

00000101000635330006360808771344093057920930685209307069

Could better times finally be on the horizon? :-D

00000101000635330006360808771344093057920930685209306988

finally
https://wccftech.com/ddr5-prices-in-china-face-a-complete-collapse-as-memory-markets-from-shift-from-desperate-scarcity-to-sudden-uncertainty/

axone artificial intelligence