George Hotz
4c1fb18a09
Revert "Revert "Tests for GatedDeltaNetBlock + fix multi after assign issue (…" (#15703)
This reverts commit 0cec42db71.
2026-04-13 19:09:38 +08:00
George Hotz
0cec42db71
Revert "Tests for GatedDeltaNetBlock + fix multi after assign issue (#15700)" (#15702)
This reverts commit 6f5d756282.
2026-04-13 19:06:44 +08:00
George Hotz
6f5d756282
Tests for GatedDeltaNetBlock + fix multi after assign issue (#15700)
* broken after/assign test
* test for GatedDeltaNet
* better comments
* fix issue 1 with multi kernel
* fix 2
* fix
* linter
* public api + cleanup
2026-04-13 18:43:23 +08:00
George Hotz
b5a9465b13
llm: add support for moonlight (deepseek MLA) (#15466)
* add gguf Q5_0
* it works
* rebase
* simpler test
* class
* less diff
* dicts
* normal names
* simplify
* this
* simpler
* work
* work
2026-04-11 10:32:48 +08:00
George Hotz
9092f2a8c0
llm: add shared_expert and rope_dim support from qwen35 (#15673)
* llm: add shared_expert and rope_dim support from qwen35
* refactor into FFNBlock and TransformerBlock
* norms where they belong
2026-04-10 19:18:27 +08:00
b1tg
a63392a565
llm: pairwise ranking topk for MoE expert selection (#15499)
2026-03-31 12:46:39 +08:00
George Hotz
d59e6e7a37
move more tests to test/null, split some existing ones (#14512)
* move more tests to test/null, split some existing ones
* null work
* null work
* move more
* fixes
* move PIL
* PIL in CLIP
* don't move that
2026-02-03 20:20:20 +08:00
George Hotz
572ca80046
fast tinygrad.apps.llm (#13685)
* llm: add --benchmark support
* fix speed
* debug logging
* fix test attention
2025-12-14 21:05:21 -05:00
chenyu
cf8232ec6a
clean up more RANGEIFY flag (#12556)
2025-10-09 03:06:48 -04:00
George Hotz
4c9a930de2
rangeify attn tests (#12377)
2025-10-01 09:59:19 +08:00
qazal
109c63b904
update Tensor unit tests for RANGEIFY (#12359)
* update test_kernelize for RANGEIFY
* also kernelizes user contiguous
* skip that test
* tensor uop repr
* 4 kernels, still realizes a float
2025-09-30 11:17:21 +03:00
Nino Risteski
54be477152
rope cache optim for jit prune in llm.py (#11678)
* rope cache optim for jit prune
* rope test
* tests in test attention
* Revert "rope test"
This reverts commit 69ede543d0.
* lint
2025-08-28 08:31:29 -07:00
chenyu
90c3ed17c5
move cast to before softmax in attention (#9213)
* move cast to before softmax in attention
saves some memory because exp (which is used for backward) is done in half. training bert seems fine and now fits BS=78 (up from 66)
* test
2025-02-24 17:24:59 -05:00
chenyu
ff3f2a9c1a
Revert "move attention upcast (#7830)" (#7903)
This reverts commit c07daf40e7.
2024-11-25 18:59:51 -05:00
chenyu
c07daf40e7
move attention upcast (#7830)
still upcast before softmax, but faster because intermediate buffer can be stored in half (as long as qk is within half range).
2024-11-22 17:10:51 -05:00