George Hotz
3f2d401464
all tests pass with NOOPT=1 ( #16257 )
...
* all tests pass with NOOPT=1
* fix a few more
* noopt 100% pass
* noopt 100% pass
2026-05-18 20:39:51 -07:00
chenyu
754344087a
assign for deviceless const source ( #16248 )
2026-05-18 17:39:53 -04:00
chenyu
dcee90aa3f
remove requires_grad use in extra/examples ( #16238 )
...
except the ones fed into optimizer
2026-05-16 18:40:26 -04:00
chenyu
8631b6f17d
remove use of requires_grad in test/ ( #16237 )
2026-05-16 17:21:07 -04:00
chenyu
0ddc50d050
do not gate backward on requires_grad ( #16230 )
...
DETACH is filtered in _deepwalk. instead of None, it gets 0 grad now
2026-05-16 12:29:49 -04:00
chenyu
07a172dbbb
remove noop requires_grad_ calls ( #16213 )
2026-05-15 13:31:10 -04:00
chenyu
c6cf9e8f0c
remove test_svd_nonfull_5_5 ( #16217 )
...
flaky, kinda overlaps with test_svd_general
2026-05-15 13:10:02 -04:00
chenyu
409bb0c9ad
requires_grad cannot be None ( #16212 )
...
final goal is to remove requires_grad, first change the default to True, and don't allow None
2026-05-15 02:01:04 -04:00
chenyu
a75c14f010
some setitem tests ( #16209 )
2026-05-14 22:36:25 -04:00
chenyu
ffa1aac7b1
gradient for STORE/AFTER ala clone ( #16205 )
2026-05-14 20:17:27 -04:00
chenyu
09096ea565
test_gradient_through_clone ( #16203 )
...
backward through clone crashes now
2026-05-14 19:26:47 -04:00
b1tg
3c806ff406
clean up gguf ( #16160 )
2026-05-12 21:16:10 -07:00
chenyu
38d407fd58
simplify svd more ( #16181 )
...
all the slowness is scheduling
2026-05-12 23:48:22 -04:00
chenyu
2172363be5
don't use Tensor indexing in svd ( #16174 )
...
prepare mixin, also about 4X faster for 8x8 input
2026-05-12 21:56:19 -04:00
wozeparrot
a613bcfc6d
allow after on contiguous in spec ( #16169 )
...
* feat: allow after on contiguous
* feat: add test
2026-05-12 13:11:44 -07:00
chenyu
da3b7e89a4
atol in test_custom_kernel_multi_output_backward_interacting ( #16166 )
2026-05-12 14:42:12 -04:00
George Hotz
8294d105a7
Update the spec in spec.py to match the current state ( #16132 )
...
* start work on specv2
* more spec
* more spec
* fix amd emulator
* more spec
* more
* fix test_uop_graph
* move those
* spec=2
* skip those questionable tests
* ptx fix
* more spec=2
* store
* allow custom function in tensor
* spec 2
* fix beam search for tensor cores
* delete the old specs
* fix import
2026-05-11 20:07:47 -07:00
chenyu
3942a80f66
fix wrong kwargs passed into rands ( #16149 )
...
working towards explicit args for these
2026-05-11 22:22:06 -04:00
chenyu
63c1f00b80
disable test_svd_general again ( #16146 )
...
flaky on CI
2026-05-11 19:24:32 -04:00
chenyu
fbe8be0b8b
style cleanup to Tensor.qr and svd ( #16142 )
...
* style cleanup to Tensor.qr and svd
same kernels
* more
* enable
2026-05-11 17:16:59 -04:00
wozeparrot
4d1a9dca41
fix: don't copy precompiled custom kernel outputs ( #16084 )
2026-05-07 14:02:38 -07:00
nimlgen
5fa0016ffc
supports_exec_item -> supports_uop ( #16033 )
2026-05-05 22:41:13 +03:00
wozeparrot
419d525553
feat: handle multioutput kernel grads ( #16028 )
2026-05-02 22:31:45 -07:00
George Hotz
5f441ecffc
unify reduce + reduce_axis ( #15973 )
...
* unify reduce + reduce_axis
* fix all tests
* lil cleanups
2026-04-29 10:29:56 -07:00
nimlgen
4164666c72
programinfo ( #15942 )
...
* programinfo
* fix
* m
* x
* x
* changes
* x
* fix
* rm
2026-04-27 23:12:03 +03:00
nimlgen
96165ff0d1
validate_with_cpu as rewrite ( #15938 )
...
* validate_with_cpu as rewrite
* compil
* x
* linter
* moved
* fix
2026-04-26 19:58:53 +03:00
nimlgen
d3378010ee
schedule() -> schedule_linear() in tests (batch 1) ( #15915 )
...
* schedule_with_vars -> linear_with_vars in tests
* tests batch 1
* batch 2
* estimate_uop
* simpler
* rm
2026-04-24 23:40:53 +03:00
b1tg
af93a677ae
llm: glm 4.5 air ( #15771 )
...
* llm: glm 4.5 air
* clean
* clean
* remove gguf_size
2026-04-22 22:47:37 +08:00
Christopher Milan
99a0debd62
Device.count() ( #15842 )
2026-04-21 16:46:38 -04:00
chenyu
9192c93b7e
Tensor.invalid -> Tensor.invalids ( #15849 )
...
matches ones and zeros, and to not share name with UOp.invalid
2026-04-21 11:19:51 -04:00
nimlgen
01ac1c8c15
remove all run_schedule from tests ( #15846 )
2026-04-21 12:02:10 +03:00
Christopher Milan
1a8ba4cbd6
CPU renderers use arch ( #15839 )
2026-04-20 23:38:29 -04:00
George Hotz
5819c0abed
fix gc in gguf ( #15820 )
...
* fix gc in gguf
* fix mypy
2026-04-20 10:15:03 +08:00
George Hotz
67ed4c4eb3
move gguf stuff from nn/state.py to llm/gguf.py ( #15783 )
...
* move gguf stuff from nn/state.py to llm/gguf.py
* docs
2026-04-20 09:41:43 +08:00
Kartik Vashishta
a1696e8413
objc: fix _classmethods_ dispatch flag ( #14854 )
...
* objc: fix _classmethods_ dispatch flag
* test: add objc _classmethods_ regression
2026-04-20 09:35:03 +08:00
chenyu
5bdfd4883f
update test_assign ( #15809 )
...
clean up old skips and update tests
2026-04-18 21:25:44 -04:00
Christopher Milan
6adf4c3cd9
MOCKGPU interfaces ( #15796 )
2026-04-17 21:56:29 -04:00
chenyu
8da308573f
update test_assign_changes_alt with clone ( #15802 )
2026-04-17 20:17:37 -04:00
qazal
9f2a578e26
unskip TestCall.test_call_gemm_uop [pr] ( #15786 )
2026-04-17 16:18:51 +03:00
George Hotz
e1d13bc4fe
add GGUF IQ4_XS support ( #15766 )
...
* add GGUF IQ4_XS support
* gguf 21
* gguf 21
* use plus
* ggml_common autogen for constant arrays
* fix
* ggml_common in autogen
* inline
2026-04-17 14:43:39 +08:00
George Hotz
a9b6cfece0
refactor llm into files ( #15780 )
...
* refactor llm into files
* chat.html
* tokenizer cleanup
* cleanup
* tests
2026-04-17 12:33:11 +08:00
George Hotz
ec00cefa5b
llm is the only app ( #15779 )
...
* tinygrad/llm is the only app
* upd pyproject
* claude refs
* scoping
* min diff
2026-04-17 10:44:48 +08:00
chenyu
f0c12a2004
another form of assign to itself ( #15770 )
2026-04-16 15:17:19 -04:00
chenyu
d147e2a549
update test_nested_after_contiguous_store ( #15763 )
...
add kernel counts and some TODOs
2026-04-16 09:59:26 -04:00
George Hotz
f57380cbc2
simplify GatedDeltaNetBlock using two state tensors ( #15704 )
...
* test double after
* simpler ssm
* no double test
2026-04-16 21:14:00 +08:00
George Hotz
d1cce7a476
put the ranges on store instead of after ( #15759 )
...
* put the ranges on store instead of after
* better assert
* fix stuff
* comment out slow rules i don't understand
* simpler rule
* closer
* return false for store
* fix loop
* only a few schedule failures remain
* remove stores to self
* all tests pass locally
* remove junk
* regression test and fix
* better test, bump broken torch count
* bugfix with regression test
* new fusion is better
2026-04-16 19:06:40 +08:00
George Hotz
d24466c844
CALL with return value is FUNCTION ( #15758 )
...
* CALL with return value is FUNCTION (GPT try)
* cleanups
2026-04-16 13:25:07 +08:00
chenyu
10c262ced8
update tests that use UOp.size ( #15753 )
2026-04-15 21:58:27 -04:00
George Hotz
1ae6528bb6
move schedule into schedule ( #15736 )
...
* move schedule into schedule
* callify to root
* sched docs
2026-04-15 11:03:25 +08:00
wozeparrot
2b8d303f75
allreduce in precast dtype ( #15689 )
2026-04-13 20:24:12 -07:00