8 Commits

Author SHA1 Message Date
George Hotz
a9b6cfece0 refactor llm into files (#15780)
* refactor llm into files

* chat.html

* tokenizer cleanup

* cleanup

* tests
2026-04-17 12:33:11 +08:00
George Hotz
ec00cefa5b llm is the only app (#15779)
* tinygrad/llm is the only app

* upd pyproject

* claude refs

* scoping

* min diff
2026-04-17 10:44:48 +08:00
George Hotz
b5a9465b13 llm: add support for moonlight (deepseek MLA) (#15466)
* add gguf Q5_0

* it works

* rebase

* simpler test

* class

* less diff

* dicts

* normal names

* simplify

* this

* simpler

* work

* work
2026-04-11 10:32:48 +08:00
George Hotz
9092f2a8c0 llm: add shared_expert and rope_dim support from qwen35 (#15673)
* llm: add shared_expert and rope_dim support from qwen35

* refactor into FFNBlock and TransformerBlock

* norms where they belong
2026-04-10 19:18:27 +08:00
George Hotz
fe2690399b llm: support assistant prefill + refactor to TransformerConfig (#15457)
* llm: support assistant prefill

* refactor to ModelConfig

* TransformerConfig

* more
2026-03-25 10:50:48 +08:00
b1tg
856a839efc llm: fix qwen3 moe topk renormalization (#15201)
2026-03-17 12:57:33 +08:00
George Hotz
df0f9d6860 add olmoe support to llm (#13792)
* add olmoe support to llm

* cleanups

* simpler

* clean

* fix mypy

* lil

* remove dumb assert
2025-12-22 10:41:35 -04:00
George Hotz
75a6a03664 add qwen3 moe support to tinygrad.apps.llm (#13775)
* qwen moe works

* simple moe

* one test

* integration
2025-12-21 12:36:02 -04:00