chenyu
02afb45f29
remove UOp.assign [pr] ( #15300 )
...
* remove UOp.assign [pr]
it's all store and after, UOp is immutable
* fix test
2026-03-16 21:45:41 -04:00
chenyu
3e2b7803e6
view assign replaces at buffer identity ( #15298 )
...
matches what functions capture
2026-03-16 19:58:38 -04:00
George Hotz
476276f4b4
support grads on tuples ( #15287 )
...
* support grads on tuples
* simpler
* grad_fxn works
* cleanups
* unused
2026-03-16 17:39:34 +08:00
George Hotz
08662bc4ab
add TUPLE/GETTUPLE, simple tests pass ( #15286 )
...
* simple tuple stuff passes
* resolved
2026-03-16 15:06:02 +08:00
chenyu
cd14e8e64b
allocations contiguous is store+after ( #15280 )
2026-03-15 11:58:40 -04:00
Sieds Lykles
4b59083d7c
assign into empty works ( #15256 )
2026-03-13 10:24:29 -04:00
chenyu
018c01508d
test case for call precompile multi ( #15254 )
2026-03-13 06:28:43 -04:00
b1tg
18dc77ccab
add fp8 fnuz dtypes with PYTHON backend support ( #14945 )
...
* add fp8 fnuz dtypes with PYTHON backend support
* rm emu related change
* clarify fp8 fnuz zero handling
* Revert "rm emu related change"
This reverts commit efa4763c22.
---------
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2026-03-11 22:30:18 -04:00
George Hotz
4f3f55328b
do not patch on invalid tensor tests ( #15226 )
...
* do not patch on invalid tensor tests
* cleanup
2026-03-12 09:35:20 +08:00
Christopher Milan
2fb8a7f60f
fix test_invalid_tensor when before values are nan ( #15215 )
2026-03-10 23:51:19 -04:00
Christopher Milan
ffaafd391a
Invalid in Tensor ( #15154 )
2026-03-10 02:49:54 -04:00
chenyu
a53187eef7
fix TestPartialAssignToSharedBuffer ( #15202 )
...
bufferize_to_store issue with assign
2026-03-09 23:14:23 -04:00
b1tg
891a73befc
llm: fix chunked prefill ( #15182 )
...
* llm: fix chunked prefill
* less lines
---------
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2026-03-07 22:08:31 +08:00
Ananta Ranganathan
5bdad8ee41
update mxfp4 tests to use the same patterns as the others ( #15177 )
...
* update mxfp4 tests to use the same patterns as the others
* fix typo in test call not sure how it committed
2026-03-06 13:21:40 -05:00
Ananta Ranganathan
5c50035e0d
avoid using arithmetic for mxfp4 ( #15172 )
...
* avoid using arithmetic for mxfp4
* update tests to use assert equal
* no longer todo
2026-03-06 11:17:56 -05:00
Roelof van Dijk
059c6326c0
metal uint32 icb offset overflow ( #15156 )
...
* metal uint32 icb offset overflow
fix: diff
supports_exec_item
GraphRunner.supports_exec_item
tests
fix: can't import on non-metal
stricter
* also test the non-metal buffer case
* imports on non-mac
2026-03-06 00:54:39 +03:00
Ananta Ranganathan
8ef656324e
FIXED TEST Q5_K GGUF dequant ( #15147 )
...
* q5_k gguf support as separate pr
* fix the problematic gemv test for q5_k
* add assert to make sure the gemv test cant fail with warning instead of error
2026-03-05 16:32:36 +08:00
George Hotz
e97922a57c
LLM speedup with two jits, prefill/rollout ( #15153 )
...
* START_TIME
* print cleanup
* fix tests
2026-03-05 16:21:09 +08:00
George Hotz
fb43b415f9
fix symbolic shape call + chunked prefill ( #15149 )
...
* fix precompile for symbolic shape
* chunked prefill
* cleaner
* test that
2026-03-05 14:02:26 +08:00
George Hotz
ac1847cbf7
fully symbolic llm ( #15097 )
...
* work
* llm symbolic (almost)
* work
* revert that
* llm sym
* works
* cleanups
* cache tokens with the kv cache
* cleanups
* cleanups
2026-03-05 10:22:11 +08:00
chenyu
34594bcaaf
Revert "bug in metal: offset is stored as uint32, overflow ( #15129 )" ( #15136 )
...
This reverts commit 9c58db16fa.
2026-03-04 16:54:42 -05:00
Roelof van Dijk
9c58db16fa
bug in metal: offset is stored as uint32, overflow ( #15129 )
...
* metal uint32 icb offset overflow
* fix: diff
* supports_exec_item
* GraphRunner.supports_exec_item
* tests
* fix: can't import on non-metal
2026-03-04 22:52:12 +03:00
chenyu
fae400d300
update assign tests to also test the expected behavior ( #15132 )
2026-03-04 11:34:43 -05:00
chenyu
1f96cc2b51
update non-contiguous buffer error message [pr] ( #15131 )
...
* update non-contiguous buffer error message [pr]
also cleaned up the tests
* order
2026-03-04 11:13:26 -05:00
George Hotz
01ddb4c267
add precompile to call ( #15099 )
...
* add precompile to call
* put get back
* something
* after structure
* alt
* keep it call
* resolve call
* resolve linear call
* precompile works with llm
* revert rangeify
* color for debugging
* getenv PRECOMPILE
* clean up deco pattern
* fully recursive sink scheduling
* revert llama
* fix SPEC=2
2026-03-03 22:32:42 +08:00
chenyu
5dcf29b1a0
use clone in test_swap_slices ( #15096 )
2026-03-02 22:05:12 -05:00
George Hotz
d483e4153a
buffer view is like buffer ( #15082 )
...
* buffer view is like buffer
* fix
* swap_reshape_shrink
* contiguous on gguf, fix overlap
* revert that
* _device_supports_view
* this
* fix that test
* 0 buffers
* that test was wrong
* this
* check correct size
* contig BUFFER_VIEW
* this
* fix tests
* buffer view tests
* om
* fix torch
* no MOCKGPU
* skip
2026-03-03 09:52:33 +08:00
chenyu
14d1c5fdfd
assign fusion tests on detach and contiguous_backward ( #15092 )
2026-03-02 15:21:51 -05:00
chenyu
103ea16ec0
add contiguous back to svd ( #15074 )
...
can cause infinite loop
2026-02-28 16:49:26 -05:00
George Hotz
bb84e389cf
functions for llama trainer ( #15045 )
...
* functions for llama trainer
* function there
* axis match
* fix multi
* lil cleaner
* there's a bug with HK_FLASH_ATTENTION
* training functions
* for commit
2026-02-28 12:15:18 +08:00
chenyu
5fd06f4f02
differentiable setitem ( #15054 )
...
* differentiable setitem
go through the where path for bw
* no return
2026-02-27 17:25:15 -05:00
chenyu
c9f6d8751b
don't remove_bufferize for Invalid ( #15053 )
...
* don't remove_bufferize for Invalid
* replaced
2026-02-27 15:16:09 -05:00
George Hotz
010d2790ce
fix multi minimal ( #15044 )
2026-02-27 14:31:58 +08:00
George Hotz
d23b79530e
remove disk from GGUF GEMV test ( #15041 )
...
* remove disk from GGUF GEMV test
* keep copy
2026-02-27 12:03:00 +08:00
chenyu
d345f7f5dc
remove _pending_assigns ( #15040 )
2026-02-26 22:38:10 -05:00
George Hotz
37e31e7da4
gguf gemv test ( #15039 )
...
* add gemv tests
* gguf big
* skip
* make realize optional
2026-02-27 10:54:43 +08:00
George Hotz
fe3ee8c27e
fix symbolic shapes in calls ( #15021 )
...
* fix symbolic shapes in calls
* fix after in the big graph
* real tests
2026-02-26 17:17:18 +08:00
George Hotz
2655655a0c
call gradient creates a call ( #15020 )
...
* function creates a full subgraph
* tests
* fix var
* fix tests
* implicit assign/contig
* move kv init
2026-02-26 14:15:29 +08:00
chenyu
ed9d475a12
assign tests with test_function ( #15015 )
2026-02-25 16:15:59 -05:00
George Hotz
0d35b67f2c
revert realize to only be buffers ( #15008 )
...
* revert realize to only be buffers
* fix that
* broken attention
* Revert "broken attention"
This reverts commit a23c3cd96c.
* and that
2026-02-25 22:43:06 +08:00
George Hotz
68831cd852
add more tests to test_function ( #15003 )
...
* add more tests to test_function
* add function to llm
* function decorator on llm
* works
* symbolic fixups
* minimum change
* implicit inputs
* don't actually update llama yet
2026-02-25 18:42:06 +08:00
George Hotz
e3fa9896b7
start function and add walk rewrite ( #14992 )
...
* start function and add walk rewrite
* work
* add function on feed_forward
* llm progress
* stuff
* none of that
2026-02-25 13:56:27 +08:00
chenyu
fde7a40bb0
allow dtype mismatched assign on disk ( #14993 )
...
reverted #14473 , that was a bad idea. also added a test that safe_save only has copy
2026-02-24 20:49:55 -05:00
chenyu
5fd4fc0c6d
fix tinyfs ( #14974 )
...
* fix tinyfs
* fix that
2026-02-24 08:50:53 -05:00
George Hotz
8a6dffc87e
Tensor.callify will be the JIT ( #14983 )
...
* close
* simple callify, support linear in the scheduler
* all tests pass
* everyone is happy
* dumb test
* Remove unnecessary blank line in rangeify.py
2026-02-24 18:42:24 +08:00
chenyu
0bda5585c7
unit test TestTinyFS ( #14972 )
...
these passed before the allocation change
2026-02-23 16:59:39 -05:00
chenyu
24e8919438
raise explicitly for test_crossunder_assign ( #14948 )
2026-02-21 21:21:13 -05:00
chenyu
9764e2561c
more assign into unrealize silent fail cases ( #14944 )
2026-02-21 18:12:57 -05:00
chenyu
0dbcd764ad
a few assign into unrealized failed test case ( #14940 )
2026-02-21 13:18:45 -05:00
qazal
c5029fa460
jit case with Tensor.empty input, realized means allocated ( #14930 )
...
* simple failing jit test case with Tensor.empty
* this used to exist in ops.py...
* Revert "removed if self.buffer.is_allocated() in realized (#14836)"
This reverts commit 72cf603805.
2026-02-21 16:33:55 +09:00