Christopher Milan
bd06ea9f97
am: simplify import_module ( #16046 )
2026-05-05 19:25:53 -04:00
chenyu
34fe37d64e
use FLOORDIV and FLOORMOD ( #16048 )
...
* use FLOORDIV and FLOORMOD
also removed CORRECT_DIVMOD_FOLDING
* fix
* Revert "fix"
This reverts commit 86af33b88ef31943c61e67189b072eca4896409a.
* fix
* fix
2026-05-05 18:32:54 -04:00
Christopher Milan
76ff378007
autogen: fewer apt dependencies ( #16049 )
2026-05-05 17:22:41 -04:00
chenyu
9c37a0c75d
Ops.FLOORDIV and Ops.FLOORMOD ( #16038 )
...
* Ops.FLOORDIV and Ops.FLOORMOD
lowered into IDIV and MOD in get_late_rewrite_patterns
* still need this
* exclude
* like that?
2026-05-05 11:42:14 -04:00
Christopher Milan
1c8cb0769a
am: autogen asic_regs ( #16004 )
2026-05-04 22:52:07 -04:00
nimlgen
5b4f62519d
cache buffer_views as well ( #16039 )
...
* cache buffer_views as well
* reuse
* back
* x
2026-05-05 00:00:09 +03:00
Christopher Milan
cee73becbe
am: ip offsets in autogen ( #16003 )
2026-05-01 00:13:52 -04:00
Christopher Milan
d07741f1d7
am: look for firmware in /lib/firmware/amdgpu ( #15974 )
2026-04-29 17:15:09 -04:00
nimlgen
77965a22e5
local optimize as rewrite ( #15953 )
...
* local optimize as rewrite
* better
* x
* slightly rename
* fix
* ugh
* remove
* x
* remove
* not weak
2026-04-28 22:51:04 +03:00
Christopher Milan
697e7aa819
MOCK+AMD and MOCK+NV interfaces ( #15858 )
...
MOCK+AMD is an alias for MOCKKFD+AMD, MOCKNVK+NV is renamed to MOCK+NV
2026-04-21 18:22:16 -04:00
qazal
e36ff22538
fix dev syntax in emulated amd tests, skip test_tk ( #15856 )
...
* fix dev syntax in emulated amd tests
* skip test_tk
2026-04-21 23:47:29 +03:00
Christopher Milan
1a8ba4cbd6
CPU renderers use arch ( #15839 )
2026-04-20 23:38:29 -04:00
nimlgen
b8d3bf8970
run_linear in jit ( #15827 )
...
* run_linear in jit
* x
* x
* f
* casts
* ugh
* f
* x
* x
* simple
2026-04-20 23:03:30 +03:00
Christopher Milan
6adf4c3cd9
MOCKGPU interfaces ( #15796 )
2026-04-17 21:56:29 -04:00
George Hotz
e1d13bc4fe
add GGUF IQ4_XS support ( #15766 )
...
* add GGUF IQ4_XS support
* gguf 21
* gguf 21
* use plus
* ggml_common autogen for constant arrays
* fix
* ggml_common in autogen
* inline
2026-04-17 14:43:39 +08:00
George Hotz
ec00cefa5b
llm is the only app ( #15779 )
...
* tinygrad/llm is the only app
* upd pyproject
* claude refs
* scoping
* min diff
2026-04-17 10:44:48 +08:00
George Hotz
16f50a40a5
remove REMU from tree ( #15706 )
...
* no more compare emulators
* remove remu from tree
2026-04-13 20:43:08 +08:00
b1tg
2b5ba0095d
qwen3.5 ( #15210 )
...
* qwen3.5
* faster
* or
* rm zero hack
* less float
* T=1
* clean
* clean
* 4b
* rope_dim
* Revert "jit: captures linears, not execitems (#15399)"
This reverts commit 9656d97d97.
* DeltaNetBlock
* pairwise_topk
* clean
* Reapply "jit: captures linears, not execitems (#15399)"
This reverts commit cf3deff53d.
* clean topk, _swiglu
* common
* FFNBlock
* clean
* half
* no mix
* qwen3.5 test
* fix ssm cache invalidation
* TransformerConfig
* SSMConfig
* clean
* reset_state
* llm: reuse server conversation tokens to avoid BPE roundtrip cache miss
* import error
* prefill
* none check
* put it back
* clean pairwise_topk
* symbolic: fold BIND(CONST, CONST) to CONST
* clean
* simpler pm
* _cached_msg_count
* stream decoder; ssm checkpoints
* rm checkpoint
* attn_output_gate
* conflict, attn_output_gate
* clean, less has_ssm, assert
* chunked prefill
* _reset_cache
* _reusable_prefix_len
* revert loop
---------
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2026-04-13 15:35:24 +08:00
nimlgen
0ff30b003d
am: reset queues from spi ( #15664 )
...
* am: reset queues from spi
* move
2026-04-09 18:25:50 +03:00
Christopher Milan
19e96497ee
interface in DEV ( #15620 )
2026-04-06 19:59:28 -04:00
Christopher Milan
645d45d968
DEV has arch ( #15577 )
...
Co-authored-by: Comma Device <device@comma.ai>
2026-04-03 19:17:19 -04:00
wozeparrot
3a26920141
feat: framework ci ( #15589 )
2026-04-02 22:03:51 -07:00
nimlgen
694dc5a717
install script in benchmark ( #15584 )
2026-04-02 18:15:58 +03:00
chenyu
5aeb2273db
add amd_copy_matmul.py to CI ( #15555 )
...
more tests before cleanup
2026-03-31 22:39:18 -04:00
Christopher Milan
acf239e4d2
specify renderer in DEV, <dev>_<ren>=1 is deprecated ( #15551 )
2026-03-31 18:35:14 -04:00
wozeparrot
79cccf3003
write sz output to file ( #15534 )
2026-03-30 20:16:17 -07:00
nimlgen
9583489068
add mlx driver to extra ( #15526 )
...
* mlx driver
* x
* simpler
2026-03-30 20:28:49 +03:00
Christopher Milan
313937ad6d
fix IMAGE TestEnd2End.test_linear_mnist ( #15488 )
2026-03-26 04:12:47 -04:00
Christopher Milan
bc180a963c
deprecate <dev>=1 in favor of DEV=<dev> ( #15467 )
...
* start work on target
* add test
* update actions to use DEV
* update docs
* update readmes
* tests need that too
* update example
* update tests (comments)
* fix that test
* ruff
* mypy
* oops
* remove getenvs
* don't add Target yet
* and the test
* lint
* and docs
* more stuff
* assert
* few more fixes
* test assert
2026-03-26 03:48:03 -04:00
nimlgen
9d2d0774b4
remote: disk copies ( #15482 )
...
* remote: disk copies
* linter
* r
* nv
* x
2026-03-25 22:14:25 +03:00
Salman Chishti
84049fdc07
Upgrade GitHub Actions to latest versions ( #15446 )
...
Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2026-03-24 10:28:49 -04:00
Salman Chishti
9567075e20
Upgrade GitHub Actions for Node 24 compatibility ( #15445 )
...
Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2026-03-24 10:28:19 -04:00
Christopher Milan
2e4fbbcc9c
ir3: fix texture mapping and benchmark ( #15443 )
2026-03-24 04:52:54 -04:00
Christopher Milan
d5320a9ddf
QCOM cleanups ( #15435 )
2026-03-23 22:18:38 -04:00
nimlgen
fa4cdb422e
memplan on linears ( #15422 )
...
* memplan
* test
* x
* arenas
* correct
* set any size
* ugh
* make hevc happy
* x
* x
* held
* rm old
* del
* x
* fu
* f
* cl
* cl
* ok
2026-03-23 19:50:16 +08:00
Christopher Milan
1560b534a5
remove IMAGE=2 ( #15312 )
2026-03-20 06:26:52 -04:00
Christopher Milan
30d609432f
ci: only xcode-select for gpuocelot on macos ( #15387 )
2026-03-20 05:58:16 -04:00
George Hotz
78ad089817
make precompile the default for llm ( #15376 )
...
* make precompile the default for llm
* works
* empty is okay for kvcache
* fix cache misses
* more tests
2026-03-20 14:08:55 +08:00
nimlgen
3b04e3ea28
no gmmu mappings with GMMU=0 ( #15369 )
...
* usb
* free
* simple gmmu=0
* x
* x
* vram
* init tests
* ppg
* x
2026-03-20 12:18:34 +08:00
chenyu
45baf3ff3f
pin ci xcode version ( #15375 )
2026-03-19 23:13:16 -04:00
qazal
176ad47d7d
cdna4 emulator testing ASM_GEMM in CI ( #15373 )
...
* cdna emulator work
* accvgprs
* cdna passes most tests
* ruff
* add cdna4 to tests
* cdna emu
* crash
* pass?
* work
* gen
* clean up wave_size access
* asm_gemm passes
* remove acc from dsl.py, emulator can keep its different reg file
it's purely an encoding here, the ASM_GEMM already encodes acc srcs with v[], this can
be cleaned up later, but not functionally required for emulator.
* split asm_gemm tests to ones fast on the emulator
* don't do that
* 124 stays null on rdna
* the segfault was because of hw regs, not this
* Revert "clean up wave_size access", it's explicitly tested
This reverts commit 1202ff5787.
* nullcopyout
---------
Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2026-03-20 05:51:30 +09:00
nimlgen
1c978aeedb
amd: fix aql remote ( #15368 )
2026-03-19 18:11:03 +08:00
nimlgen
1a53393512
remote in ci benchmark ( #15344 )
...
* remote in ci benchmark
* move to the end
* move
* ports
* own this
2026-03-19 13:49:09 +08:00
Christopher Milan
499ad9a356
benchmark openpilot 0.11.0 ( #15341 )
2026-03-18 03:28:43 -04:00
qazal
00817cf65e
viz: all tests can run on the NULL device ( #15328 )
...
* remove that
* move to test_viz
* get_cfg
* do not use os.environ
* hm
* it's always on NULL
* import renderer
* no import *
2026-03-18 04:14:20 +09:00
Christopher Milan
c251fc67c5
ci: consider arch in venv and apt caches and go back to 3.12 ( #15250 )
2026-03-13 00:36:49 -04:00
Christopher Milan
d4b947ea9a
ci: explicitly request python 3.12.10 instead of 3.12 ( #15246 )
...
3.12.10 is the most recent 3.12 version that has toolcache builds for linux, macos, and windows
2026-03-12 23:00:46 -04:00
Ananta Ranganathan
5bdad8ee41
update mxfp4 tests to use the same patterns as the others ( #15177 )
...
* update mxfp4 tests to use the same patterns as the others
* fix typo in test call not sure how it committed
2026-03-06 13:21:40 -05:00
Christopher Milan
7810be8d3c
compile QCOM without opening device ( #15165 )
...
Co-authored-by: Comma Device <device@comma.ai>
2026-03-06 06:24:27 -05:00
nimlgen
cdc48da9cd
hevc: assert and speed ( #15122 )
...
* hevc: assert and speed
* simpler
2026-03-04 19:01:02 +03:00