| Author | Commit | Message | Date |
|---|---|---|---|
| George Hotz | a9b6cfece0 | refactor llm into files (#15780)<br>* refactor llm into files<br>* chat.html<br>* tokenizer cleanup<br>* cleanup<br>* tests | 2026-04-17 12:33:11 +08:00 |
| George Hotz | ec00cefa5b | llm is the only app (#15779)<br>* tinygrad/llm is the only app<br>* upd pyproject<br>* claude refs<br>* scoping<br>* min diff | 2026-04-17 10:44:48 +08:00 |
| George Hotz | b5a9465b13 | llm: add support for moonlight (deepseek MLA) (#15466)<br>* add gguf Q5_0<br>* it works<br>* rebase<br>* simpler test<br>* class<br>* less diff<br>* dicts<br>* normal names<br>* simplify<br>* this<br>* simpler<br>* work<br>* work | 2026-04-11 10:32:48 +08:00 |
| George Hotz | 9092f2a8c0 | llm: add shared_expert and rope_dim support from qwen35 (#15673)<br>* llm: add shared_expert and rope_dim support from qwen35<br>* refactor into FFNBlock and TransformerBlock<br>* norms where they belong | 2026-04-10 19:18:27 +08:00 |
| George Hotz | fe2690399b | llm: support assistant prefill + refactor to TransformerConfig (#15457)<br>* llm: support assistant prefill<br>* refactor to ModelConfig<br>* TransformerConfig<br>* more | 2026-03-25 10:50:48 +08:00 |
| b1tg | 856a839efc | llm: fix qwen3 moe topk renormalization (#15201) | 2026-03-17 12:57:33 +08:00 |
| George Hotz | df0f9d6860 | add olmoe support to llm (#13792)<br>* add olmoe support to llm<br>* cleanups<br>* simpler<br>* clean<br>* fix mypy<br>* lil<br>* remove dumb assert | 2025-12-22 10:41:35 -04:00 |
| George Hotz | 75a6a03664 | add qwen3 moe support to tinygrad.apps.llm (#13775)<br>* qwen moe works<br>* simple moe<br>* one test<br>* integration | 2025-12-21 12:36:02 -04:00 |