54 Commits

Author SHA1 Message Date
George Hotz
5f441ecffc unify reduce + reduce_axis (#15973)
* unify reduce + reduce_axis

* fix all tests

* lil cleanups
2026-04-29 10:29:56 -07:00
chenyu
5bdfd4883f update test_assign (#15809)
clean up old skips and update tests
2026-04-18 21:25:44 -04:00
chenyu
8da308573f update test_assign_changes_alt with clone (#15802) 2026-04-17 20:17:37 -04:00
chenyu
f0c12a2004 another form of assign to itself (#15770) 2026-04-16 15:17:19 -04:00
chenyu
d147e2a549 update test_nested_after_contiguous_store (#15763)
add kernel counts and some TODOs
2026-04-16 09:59:26 -04:00
George Hotz
d1cce7a476 put the ranges on store instead of after (#15759)
* put the ranges on store instead of after

* better assert

* fix stuff

* comment out slow rules i don't understand

* simpler rule

* closer

* return false for store

* fix loop

* only a few schedule failures remain

* remove stores to self

* all tests pass locally

* remove junk

* regression test and fix

* better test, bump broken torch count

* bugfix with regression test

* new fusion is better
2026-04-16 19:06:40 +08:00
George Hotz
7610bdc59e block multistore, it's not supported (#15708) 2026-04-13 20:57:59 +08:00
George Hotz
4c1fb18a09 Revert "Revert "Tests for GatedDeltaNetBlock + fix multi after assign issue (…" (#15703)
This reverts commit 0cec42db71.
2026-04-13 19:09:38 +08:00
George Hotz
0cec42db71 Revert "Tests for GatedDeltaNetBlock + fix multi after assign issue (#15700)" (#15702)
This reverts commit 6f5d756282.
2026-04-13 19:06:44 +08:00
George Hotz
6f5d756282 Tests for GatedDeltaNetBlock + fix multi after assign issue (#15700)
* broken after/assign test

* test for GatedDeltaNet

* better comments

* fix issue 1 with multi kernel

* fix 2

* fix

* linter

* public api + cleanup
2026-04-13 18:43:23 +08:00
Christopher Milan
acf239e4d2 specify renderer in DEV, <dev>_<ren>=1 is deprecated (#15551) 2026-03-31 18:35:14 -04:00
chenyu
02afb45f29 remove UOp.assign [pr] (#15300)
* remove UOp.assign [pr]

it's all store and after, UOp is immutable

* fix test
2026-03-16 21:45:41 -04:00
chenyu
cd14e8e64b allocations contiguous is store+after (#15280) 2026-03-15 11:58:40 -04:00
chenyu
a53187eef7 fix TestPartialAssignToSharedBuffer (#15202)
bufferize_to_store issue with assign
2026-03-09 23:14:23 -04:00
chenyu
fae400d300 update assign tests to also test the expected behavior (#15132) 2026-03-04 11:34:43 -05:00
chenyu
5dcf29b1a0 use clone in test_swap_slices (#15096) 2026-03-02 22:05:12 -05:00
George Hotz
d483e4153a buffer view is like buffer (#15082)
* buffer view is like buffer

* fix

* swap_reshape_shrink

* contiguous on gguf, fix overlap

* revert that

* _device_supports_view

* this

* fix that test

* 0 buffers

* that test was wrong

* this

* check correct size

* contig BUFFER_VIEW

* this

* fix tests

* buffer view tests

* om

* fix torch

* no MOCKGPU

* skip
2026-03-03 09:52:33 +08:00
chenyu
d345f7f5dc remove _pending_assigns (#15040) 2026-02-26 22:38:10 -05:00
George Hotz
37e31e7da4 gguf gemv test (#15039)
* add gemv tests

* gguf big

* skip

* make realize optional
2026-02-27 10:54:43 +08:00
George Hotz
fe3ee8c27e fix symbolic shapes in calls (#15021)
* fix symbolic shapes in calls

* fix after in the big graph

* real tests
2026-02-26 17:17:18 +08:00
chenyu
fde7a40bb0 allow dtype mismatched assign on disk (#14993)
reverted #14473, that was a bad idea. also added a test that safe_save only has copy
2026-02-24 20:49:55 -05:00
chenyu
24e8919438 raise explicitly for test_crossunder_assign (#14948) 2026-02-21 21:21:13 -05:00
chenyu
9764e2561c more assign into unrealize silent fail cases (#14944) 2026-02-21 18:12:57 -05:00
chenyu
0dbcd764ad a few assign into unrealized failed test case (#14940) 2026-02-21 13:18:45 -05:00
George Hotz
55d3a5def9 preallocate all realized buffers (#14823)
* preallocate all realized buffers

* contiguous

* work

* comment that out

* move to schedule

* better

* correct fix

* just buffer

* disk bufs

* fixes disk tensor stuff

* fix symbolic stuff

* fix multi

* 162 failures

* bugfixes

* don't check that anymore

* fix schedule tests

* mnist should be contiguious

* type and buffer

* fix tests

* shrink axis correction

* mypy fixes

* tests skips

* same 37 failures

* dedup

* no shrink in the graph

* 29 failures

* skips

* fix custom kernel

* fix training

* those optimizations aren't supported currently

* simpler

* more correct

* tests

* 14 failures

* works

* fix that test

* broken

* 11 failures

* only kernel counts left

* fixes

* all tests pass

* remove tensor_map

* op test

* 200 -> 230

* test fixes

* fixes

* revert test_tiny thing

* guard

* revert that

* test tiny passes

* no contigs there

* base realize back

* Revert "no contigs there"

This reverts commit c45bb9fcfd.

* revert that

* chop many assigns

* 12 failures

* fix tests

* tests

* apply after

* pre-commit

* remove old code

* delete that

* fix types

* remove extra contig

* fix dataloader

* torch fix

* disk fix

* update kernel fusion numbres

* runs on amd

* restore kernel count

* add that rule back

* that

* disable that

* wrong

* add the correct rule for that folding

* more tests

* guard c1.arg

* no newlines

* realize those

* split into a different file

* remove detach/contig back

* skip 2

* update that
2026-02-20 20:05:54 +08:00
George Hotz
d5636fba90 assign after copy shouldn't contig (#14847)
* assign after copy shouldn't contig

* fix assign copy
2026-02-18 12:23:49 +08:00
chenyu
e3c120c8e1 exclude 100 in test_assign_add (#14846)
this can crash, not sure why. skip 100 to see if it's better
2026-02-17 19:12:47 -05:00
chenyu
aec8a6c85b Revert "one run_schedule for assign realize (#14835)" (#14837)
This reverts commit df7c37f611.
2026-02-17 14:34:26 -05:00
chenyu
df7c37f611 one run_schedule for assign realize (#14835)
concat schedules. separate out the execution part
2026-02-17 14:01:55 -05:00
chenyu
f147791105 update test to reset and test kernel_count directly (#14832) 2026-02-17 11:48:46 -05:00
chenyu
9d4937ab5e remove assign test @unittest.skip("this test is crashing!") (#14831) 2026-02-17 10:30:58 -05:00
chenyu
f2f039cc0f fix chained full-buffer assign (#14828)
this shows issue that pm_remove_bufferize drops tags, will fix in bufferize next. this also fixed rand being different in jit vs no-jit
2026-02-17 09:11:04 -05:00
chenyu
58fa82eef5 stronger test_assign_add (#14826)
also test self add 10 and 100 times
2026-02-17 08:36:09 -05:00
chenyu
5bca5be2d2 test slice assign twice retains the buffer (#14807) 2026-02-16 20:01:47 -05:00
chenyu
9b44fbe0b8 update test_assign_add_twice (#14806)
failed test case to show that `+=1` twice returns a different buffer
2026-02-16 17:52:11 -05:00
chenyu
043f5dbfa0 fix write-after-read tracking (#14754)
AFTER-AFTER was silently dropped, which breaks write-after-read
2026-02-14 17:23:05 -05:00
chenyu
d79c63a0ff test_multi_step_assign_read_write_same_buffer (#14752)
pattern in LAMB that can be off subtly
2026-02-14 16:39:08 -05:00
chenyu
0c63f63ee4 recursive resolve assign dependency (#14688)
remove the .realize in llm.py
2026-02-11 17:41:05 -05:00
chenyu
cbbc2fdea5 update test_assign_slice_then_read (#14687)
passes locally now
2026-02-11 15:02:44 -05:00
chenyu
9e3f24db9f assign realize fix (#14649)
fix the need for explicit assign. track pending assigns for each buffer, and run those before the main realize in order
2026-02-09 17:46:46 -05:00
chenyu
15d3344d9e use int inputs in test_assign (#14580)
int is less flaky
2026-02-06 00:07:31 -05:00
chenyu
b09dc646f5 revert some late_buffer_view change (#14578)
revert #14478 which breaks tinyfs
2026-02-05 22:51:40 -05:00
chenyu
03d0fa9c3f merge as_buf into buf_uop [pr] (#14541) 2026-02-04 16:32:23 -05:00
chenyu
3ff390159b don't implicitly change dtype in assign (#14481)
broadcast shape is fine, but implicitly cast dtype is hard to find
2026-02-01 11:48:54 -05:00
chenyu
5d38db9da6 generic bitcast assign (#14474)
a.bitcast(X).assign(src) -> a.assign(src.bitcast(a.dtype))
2026-01-31 17:29:20 -05:00
chenyu
b38fc43b07 assert assign dtype mismatch for disk [pr] (#14473)
the disk hack is generally wrong, now force bitcast on the source before assign
2026-01-31 17:08:54 -05:00
chenyu
ced886f26c failed test case for assign into bitcast (#14469)
* failed test case for assign into bitcast

DISK assign has custom hack for this. need to fix before we can unify assign

* test_assign_bitcast_different_size
2026-01-31 14:26:47 -05:00
chenyu
86a204d22a allow Tensor setitem input to be list/tuple (#14432)
matches assign, and generally matches numpy
2026-01-29 21:26:58 -05:00
Christopher Milan
289a3e415e also skip test_nonoverlapping_shrink_assignment (#14382) 2026-01-27 16:26:26 -05:00
chenyu
c22667b0c4 also skip test_overlapping_shrink_assignment_reverse (#14375)
crashing
2026-01-27 12:20:39 -05:00