chenyu
b38fc43b07
assert assign dtype mismatch for disk [pr] ( #14473 )
...
the disk hack is generally wrong, now force bitcast on the source before assign
2026-01-31 17:08:54 -05:00
chenyu
ced886f26c
failed test case for assign into bitcast ( #14469 )
...
* failed test case for assign into bitcast
DISK assign has custom hack for this. need to fix before we can unify assign
* test_assign_bitcast_different_size
2026-01-31 14:26:47 -05:00
chenyu
c765641215
remove unused allow_any_len [pr] ( #14464 )
...
STORE has 2 src, RESHAPE has 2 src, BUFFER has 2 src
added some tests for the untested allow_any_len
2026-01-31 11:05:42 -05:00
chenyu
b4f5a51ebb
move tests to unit ( #14463 )
...
test_uop_graph does not need device, test_memory_planner can use NULL
2026-01-31 10:49:31 -05:00
chenyu
99b44121bc
failed test case for non-consecutive disk read ( #14455 )
...
silently fails now
2026-01-30 23:44:04 -05:00
chenyu
03613e83ad
update TestTensorMetadata ( #14443 )
...
run with SCACHE=0, some more TODOs
2026-01-30 12:39:01 -05:00
chenyu
26f5c00265
move TestTensorMetadata to unit ( #14442 )
2026-01-30 12:14:21 -05:00
George Hotz
838cd078bc
use atomics for embedding backward ( #14400 )
...
* embedding is slow
* failing
* float is fine
* null
* it fails
* simplify embedding with broadcasting
* ATOMIC_ADD incoming
* min change
* simpler test
* better test
* fix test
* real test
* simpler
* cleanups
* types and names
* _zero_kernel
* grad multi
* hack
* none
* multi unshard
* more for call
* don't tag in call
* good
* call_multi
* call_multi wow claude is useless
* embedding backward multi test
* test passes
* fix as_param
* shape_to_shape_arg
* add clip
* before cast
* fix spec=2, use atomics
2026-01-30 18:10:59 +08:00
George Hotz
7a9dee4e50
add call/param UOps ( #14433 )
...
* add call/param UOps
* resolve call
* skip that for now
* grad on call
* fix tests
2026-01-30 14:51:45 +08:00
chenyu
86a204d22a
allow Tensor setitem input to be list/tuple ( #14432 )
...
matches assign, and generally matches numpy
2026-01-29 21:26:58 -05:00
chenyu
ddc041854b
failed test case for disk setitem ( #14426 )
...
strided setitem is wrong
2026-01-29 14:54:19 -05:00
Christopher Milan
5e36482314
decompose long to ints where unsupported, try 2 ( #14383 )
2026-01-27 23:20:43 -05:00
George Hotz
065b95cfb0
Revert "add retry to fetch ( #14370 )" ( #14385 )
...
This reverts commit dc4d7f2d55 .
2026-01-28 09:35:37 +08:00
Eitan Turok
dc4d7f2d55
add retry to fetch ( #14370 )
2026-01-27 14:04:25 -08:00
Christopher Milan
289a3e415e
also skip test_nonoverlapping_shrink_assignment ( #14382 )
2026-01-27 16:26:26 -05:00
chenyu
db010a31be
IGNORE_OOB -> CHECK_OOB [pr] ( #14374 )
...
flip the meaning
2026-01-27 12:20:59 -05:00
chenyu
c22667b0c4
also skip test_overlapping_shrink_assignment_reverse ( #14375 )
...
crashing
2026-01-27 12:20:39 -05:00
George Hotz
0ced258726
HOTFIX: skip crashing assign test
2026-01-27 20:35:17 +08:00
imaolo
14574c68fa
Add ContextVar to disable the scheduler cache ( #14257 )
...
* add scheduler cache ContextVar
* test scheduler cache context var
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2026-01-27 19:55:29 +08:00
Christopher Milan
2e72625652
Revert "decompose dtypes.long to ints where unsupported ( #14261 )" ( #14362 )
2026-01-27 02:04:59 -05:00
Christopher Milan
0793319929
decompose dtypes.long to ints where unsupported ( #14261 )
...
* add works
* use carry not overflow
* bitwise ops
* use tag instead of vec
* cleaner
* mul somewhat works
* mul actually works
* SUB and NEG work
* SHL/SHR
* ulong support
* this should work?
* oops
* fix indexing
* all ALU mostly works
* refactor
* test_dtype passing
* signed division works
* format
* clean
* some tests
* ruff
2026-01-26 18:34:13 -05:00
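
The "use carry not overflow" step in the commit above can be illustrated outside tinygrad. The following is a minimal sketch in plain Python, assuming a 64-bit value is represented as two unsigned 32-bit halves; the helper names (`split64`, `join64`, `add64`, `add_long`) are hypothetical and are not taken from the tinygrad source.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def split64(x: int) -> tuple[int, int]:
    """Split a value into (low, high) unsigned 32-bit halves (hypothetical helper)."""
    x &= MASK64  # wrap negatives and oversized values into 64 bits
    return x & MASK32, x >> 32

def join64(lo: int, hi: int) -> int:
    """Recombine two unsigned 32-bit halves into one unsigned 64-bit value."""
    return (hi << 32) | lo

def add64(a_lo: int, a_hi: int, b_lo: int, b_hi: int) -> tuple[int, int]:
    """Add two 64-bit values using only 32-bit wrapping arithmetic."""
    lo = (a_lo + b_lo) & MASK32
    # The low half wrapped around exactly when the result is smaller than an operand.
    carry = 1 if lo < a_lo else 0
    hi = (a_hi + b_hi + carry) & MASK32
    return lo, hi

def add_long(a: int, b: int) -> int:
    """64-bit wrapping add built from the 32-bit pieces above."""
    return join64(*add64(*split64(a), *split64(b)))
```

Detecting the carry by comparing the wrapped low half against an operand only needs an unsigned 32-bit compare, which is why it suits targets without a native 64-bit integer type.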
chenyu
d641e63189
improve min/max for AND ( #14356 )
2026-01-26 15:44:18 -05:00
chenyu
f16372487a
fix assign hazard on shrink ( #14355 )
...
* fix assign hazard on shrink
a race is possible if both the assign src and dest are shrinks
* test_nonoverlapping_shrink_assignment
2026-01-26 14:46:30 -05:00
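
One way such a hazard can arise is sketched below with plain Python lists standing in for two overlapping views of the same buffer. This is an illustration under that assumption only; the function names are hypothetical, and the commit does not state that the actual fix copies the source.

```python
def shift_right_in_place(buf: list) -> list:
    """Equivalent in intent to buf[1:] = buf[:-1], but reading and writing the same storage."""
    for i in range(len(buf) - 1):
        # buf[i] may already have been overwritten by the previous iteration
        buf[i + 1] = buf[i]
    return buf

def shift_right_with_copy(buf: list) -> list:
    """Same assignment, but the source region is materialized before any write."""
    src = list(buf[:-1])
    for i, value in enumerate(src):
        buf[i + 1] = value
    return buf
```

With `[1, 2, 3, 4]` as input, the in-place version smears the first element across the whole buffer, while the copying version produces the intended shifted result.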
chenyu
823bc17fb5
failed test case for shrink overlap assigns ( #14350 )
...
* failed test case for shrink overlap assigns
current logic can create a race that results in wrong output
* skip for now
2026-01-26 11:58:45 -05:00
George Hotz
984cdc4840
add wrapper class for the -0.0 != 0.0 issue ( #14339 )
...
* add wrapper class for the -0.0 != 0.0 issue
* fixes
* spec fix
* missed one
2026-01-26 16:52:37 +08:00
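
For context on the issue named in the commit above: in Python, `0.0 == -0.0` is true and both values hash the same, so the two collide as dictionary or cache keys even though they are distinct IEEE 754 values. The sketch below shows one possible wrapper that compares by bit pattern instead. The class name `ExactFloat` is hypothetical and this is not claimed to be the class the commit adds.

```python
import struct

class ExactFloat:
    """Hypothetical wrapper: equality and hashing use the IEEE 754 bit pattern."""
    __slots__ = ("value",)

    def __init__(self, value: float):
        self.value = float(value)

    def _bits(self) -> bytes:
        # 8-byte little-endian double; 0.0 and -0.0 differ in the sign bit
        return struct.pack("<d", self.value)

    def __eq__(self, other) -> bool:
        return isinstance(other, ExactFloat) and self._bits() == other._bits()

    def __hash__(self) -> int:
        return hash(self._bits())

    def __repr__(self) -> str:
        return f"ExactFloat({self.value!r})"
```

Used as keys, `ExactFloat(0.0)` and `ExactFloat(-0.0)` occupy separate dictionary slots, whereas the bare floats would share one.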
George Hotz
cc49e47ea2
tinygrad changes from ucode ( #14336 )
...
* tinygrad changes from ucode
* dtype
2026-01-26 11:30:18 +08:00
chenyu
cb69b7b2b2
comment out fold_where_closure ( #14316 )
2026-01-24 10:15:42 -05:00
chenyu
e65bc7a7c5
where closure folding ( #14304 )
2026-01-23 10:55:13 -05:00
chenyu
5f32f7a06b
fix winograd padding order ( #14294 )
2026-01-22 23:00:14 -05:00
chenyu
6279ae4a94
remove llm generate always reset start_pos ( #14276 )
...
* remove llm generate always reset start_pos
by itself seems like a bug, also added a test to repro forward_jit.reset() issue
* issue is jit graph, so revert that test
2026-01-21 16:54:30 -05:00
chenyu
e64111ad08
update all_same [pr] ( #14270 )
...
add type annotation and unit test
2026-01-21 11:26:15 -05:00
George Hotz
5e24643889
minor import speedups ( #14244 )
...
* minor import speedups
* server stuff in server places
* pre-commit
* fix
2026-01-20 15:05:36 +09:00
qazal
b1c5a242b7
Revert "move is_dtype_supported logic to renderer ( #14188 )" ( #14237 )
...
This reverts commit 161fee9a48 .
2026-01-20 12:19:14 +09:00
Christopher Milan
161fee9a48
move is_dtype_supported logic to renderer ( #14188 )
...
* move is_dtype_supported logic to renderer
* fix CPU_COUNT
* mypy happy
* early import libclang too with llvm
* run with debug
* skip autogen tests if MTLCompiler or llvm is loaded
* run autogen tests separately in CI
* lint
2026-01-18 22:37:04 -05:00
chenyu
e7c2df9113
improve consecutive Tensor indexing ( #14208 )
...
* improve consecutive Tensor indexing
instead of O(idx_counts*src_dims), it can just be O(idx_counts)
* test correctness
2026-01-18 15:14:33 -05:00
chenyu
c7b8f6496f
remove dtypes.index_like and dtypes.fields [pr] ( #14207 )
...
barely used, so just use inline and DTYPES_DICT
2026-01-18 11:49:01 -05:00
Christopher Milan
a021b84604
autogen: fix enum ( #14171 )
2026-01-16 01:30:11 -05:00
chenyu
14e9a71a41
move test_assign to unit ( #14165 )
...
scheduling these should not depend on device
2026-01-15 17:10:13 -05:00
Christopher Milan
0cb024a5bb
remove ctypes.Structure ( #13651 )
2026-01-15 05:06:22 -05:00
qazal
164bc678a6
scheduler: sched_cache bugfix for different Tensor.custom_kernel schedules ( #14161 )
...
* simplest failing test
* min fix
* same function reuses the cache
* SPEC=2 never worked for custom_kernel
2026-01-15 14:59:14 +09:00
chenyu
35c9701df0
update outdated tests and comments ( #14090 )
2026-01-10 01:00:48 -05:00
chenyu
92246ea731
update tests, WEBGPU=1 pytest . passes ( #14089 )
...
* update tests, `WEBGPU=1 pytest .` passes
* minor update
2026-01-10 00:03:02 -05:00
chenyu
eacccc5ace
more disk assign tests ( #14087 )
...
covers more edge cases
2026-01-09 14:14:52 -05:00
chenyu
ed295e74dc
don't skip gguf test if ggml is not installed ( #14086 )
...
* don't skip gguf test if ggml is not installed
should just let it fail
* fix
2026-01-09 12:05:58 -05:00
chenyu
cff33c8d78
add some disk assign tests ( #14085 )
2026-01-09 11:50:59 -05:00
Garret Castro
16b652302e
skip bf16 test if not supported by device ( #14070 )
2026-01-08 13:37:24 -05:00
Christopher Milan
b2a0b9c551
autogen: dump patch in CI ( #14010 )
...
* autogen: don't fast-fail, produce patch artifact on differences
All verification steps now use continue-on-error to run completely.
Each job generates a patch artifact containing all differences found.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* add gen from header test
* fix tests
* fail if diff
* add forward decl autogen test
* remove confusing/wrong comments
* macos unittests set LIBCLANG_PATH
---------
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-04 22:38:12 -05:00
chenyu
cfb8bf5814
faster image load ( #13977 )
...
sometimes image load does not need to init with NAN
2026-01-04 13:09:59 -05:00
chenyu
8003db2a28
test case of NOOP store load folding ( #13997 )
2026-01-03 14:39:26 -05:00
chenyu
2e2b5fed12
fix misspellings ( #13976 )
2026-01-02 10:37:38 -05:00