tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-06-15 17:40:13 +08:00

Author	SHA1	Message	Date
George Hotz	a9b6cfece0	refactor llm into files (#15780) * refactor llm into files * chat.html * tokenizer cleanup * cleanup * tests	2026-04-17 12:33:11 +08:00
George Hotz	ec00cefa5b	llm is the only app (#15779) * tinygrad/llm is the only app * upd pyproject * claude refs * scoping * min diff	2026-04-17 10:44:48 +08:00
George Hotz	f57380cbc2	simplify GatedDeltaNetBlock using two state tensors (#15704) * test double after * simpler ssm * no double test	2026-04-16 21:14:00 +08:00
George Hotz	4c1fb18a09	Revert "Revert "Tests for GatedDeltaNetBlock + fix multi after assign issue (…" (#15703) This reverts commit `0cec42db71`.	2026-04-13 19:09:38 +08:00
George Hotz	0cec42db71	Revert "Tests for GatedDeltaNetBlock + fix multi after assign issue (#15700)" (#15702) This reverts commit `6f5d756282`.	2026-04-13 19:06:44 +08:00
George Hotz	6f5d756282	Tests for GatedDeltaNetBlock + fix multi after assign issue (#15700) * broken after/assign test * test for GatedDeltaNet * better comments * fix issue 1 with multi kernel * fix 2 * fix * linter * public api + cleanup	2026-04-13 18:43:23 +08:00
George Hotz	b5a9465b13	llm: add support for moonlight (deepseek MLA) (#15466) * add gguf Q5_0 * it works * rebase * simpler test * class * less diff * dicts * normal names * simplify * this * simpler * work * work	2026-04-11 10:32:48 +08:00
George Hotz	9092f2a8c0	llm: add shared_expert and rope_dim support from qwen35 (#15673) * llm: add shared_expert and rope_dim support from qwen35 * refactor into FFNBlock and TransformerBlock * norms where they belong	2026-04-10 19:18:27 +08:00
b1tg	a63392a565	llm: pairwise ranking topk for MoE expert selection (#15499)	2026-03-31 12:46:39 +08:00
George Hotz	d59e6e7a37	move more tests to test/null, split some existing ones (#14512) * move more tests to test/null, split some existing ones * null work * null work * move more * fixes * move PIL * PIL in CLIP * don't move that	2026-02-03 20:20:20 +08:00
George Hotz	572ca80046	fast tinygrad.apps.llm (#13685) * llm: add --benchmark support * fix speed * debug logging * fix test attention	2025-12-14 21:05:21 -05:00
chenyu	cf8232ec6a	clean up more RANGEIFY flag (#12556)	2025-10-09 03:06:48 -04:00
George Hotz	4c9a930de2	rangeify attn tests (#12377)	2025-10-01 09:59:19 +08:00
qazal	109c63b904	update Tensor unit tests for RANGEIFY (#12359) * update test_kernelize for RANGEIFY * also kernelizes user contiguous * skip that test * tensor uop repr * 4 kernels, still realizes a float	2025-09-30 11:17:21 +03:00
Nino Risteski	54be477152	rope cache optim for jit prune in llm.py (#11678) * rope cache optim for jit prune * rope test * tests in test attention * Revert "rope test" This reverts commit `69ede543d0`. * lint	2025-08-28 08:31:29 -07:00
chenyu	90c3ed17c5	move cast to before softmax in attention (#9213) * move cast to before softmax in attention saved some memory because exp (which is used for backward) are done in half. training bert seems fine and can fit BS=78 now (from 66) * test	2025-02-24 17:24:59 -05:00
chenyu	ff3f2a9c1a	Revert "move attention upcast (#7830)" (#7903) This reverts commit `c07daf40e7`.	2024-11-25 18:59:51 -05:00
chenyu	c07daf40e7	move attention upcast (#7830) still upcast before softmax, but faster because intermediate buffer can be stored in half (as long as qk is within half range).	2024-11-22 17:10:51 -05:00

18 Commits
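
Note on commit a63392a565 ("pairwise ranking topk for MoE expert selection"): the log does not show the implementation, so the following is only a minimal sketch of what a pairwise-ranking top-k generally looks like, not tinygrad's code. The function name `pairwise_rank_topk` and the sample router scores are hypothetical. The idea is to give each element a rank equal to the number of elements that beat it, then keep the elements whose rank is below k, which avoids a sort.

```python
def pairwise_rank_topk(scores, k):
    """Return indices of the k largest scores, highest first, without sorting.

    Hypothetical illustration only; assumes 0 <= k <= len(scores).
    """
    n = len(scores)
    # ranks[i] = number of elements that beat element i.
    # Ties are broken in favour of the lower index, so every rank is unique.
    ranks = [
        sum(
            1
            for j in range(n)
            if scores[j] > scores[i] or (scores[j] == scores[i] and j < i)
        )
        for i in range(n)
    ]
    # An element with rank r < k lands in output slot r.
    out = [None] * k
    for i, r in enumerate(ranks):
        if r < k:
            out[r] = i
    return out


# Made-up router scores for five experts; pick the top two.
router_scores = [0.1, 0.7, 0.05, 0.7, 0.15]
print(pairwise_rank_topk(router_scores, 2))  # prints [1, 3]
```

The all-pairs comparison costs O(n²) work, but it uses only elementwise comparisons and a sum, which tends to map well to tensor operations when n (the number of experts) is small.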