chenyu
52c92e15ae
no replacement multinomial ( #15995 )
...
* no replacement multinomial
Efraimidis–Spirakis
* num_samples == 1 can use fast path
2026-04-30 17:35:26 -04:00
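For reference, the Efraimidis–Spirakis scheme named in this commit draws weighted samples without replacement by giving each item a random key `u ** (1 / w)` and keeping the items with the largest keys. The sketch below is a minimal pure-Python illustration of that idea, not the code from this commit; the function name `sample_without_replacement` and its parameters are hypothetical.

```python
import heapq
import math
import random


def sample_without_replacement(weights, num_samples, rng=None):
    """Return num_samples distinct indices, favoring larger weights.

    Hypothetical helper illustrating Efraimidis-Spirakis keys; it is not
    part of any library API.
    """
    rng = rng or random.Random()
    # Items with zero weight can never be drawn, so drop them up front.
    candidates = [(i, w) for i, w in enumerate(weights) if w > 0]
    if num_samples > len(candidates):
        raise ValueError("not enough items with positive weight")
    # log(u) / w is a monotone transform of u ** (1 / w), so it gives the
    # same ordering while avoiding underflow for tiny weights.
    # 1 - random() lies in (0, 1], which keeps log() finite.
    keyed = [(math.log(1.0 - rng.random()) / w, i) for i, w in candidates]
    # The num_samples largest keys form the sample.
    return [i for _, i in heapq.nlargest(num_samples, keyed)]


picked = sample_without_replacement([0.1, 0.0, 0.6, 0.3], 2, random.Random(0))
print(picked)
```

With `num_samples == 1` the result has the same distribution as an ordinary single weighted draw, which is presumably why the commit notes that this case can take a fast path.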
chenyu
e0b09f288f
input validation for rand functions ( #15990 )
2026-04-30 14:00:44 -04:00
nimlgen
11e1a2b89f
cleaner and faster run_linear ( #15987 )
...
* cleaner and faster run_linear
* x
* assert for now
* x
* x
* sym_infer
* remove sink
2026-04-30 20:15:22 +03:00
qazal
58b34e71bd
failing test for llama useless copies ( #15989 )
2026-05-01 00:55:29 +09:00
nimlgen
dfd2d07005
remove CompiledRunner ( #15970 )
...
* rm usage of CompiledRunner
* more tests
* last
* linter
* sink
* remove
* linter
2026-04-29 22:45:48 +03:00
George Hotz
5f441ecffc
unify reduce + reduce_axis ( #15973 )
...
* unify reduce + reduce_axis
* fix all tests
* lil cleanups
2026-04-29 10:29:56 -07:00
nimlgen
7787f76dcc
get_runner -> get_runtime ( #15967 )
...
* get_runner -> get_runtime
* do not use get_runner
* fix
* remove get_runner
* remove
* fix
* x
2026-04-29 18:29:49 +03:00
nimlgen
77965a22e5
local optimize as rewrite ( #15953 )
...
* local optimize as rewrite
* better
* x
* slightly rename
* fix
* ugh
* remove
* x
* remove
* not weak
2026-04-28 22:51:04 +03:00
nimlgen
4164666c72
programinfo ( #15942 )
...
* programinfo
* fix
* m
* x
* x
* changes
* x
* fix
* rm
2026-04-27 23:12:03 +03:00
nimlgen
96165ff0d1
validate_with_cpu as rewrite ( #15938 )
...
* validate_with_cpu as rewrite
* compil
* x
* linter
* moved
* fix
2026-04-26 19:58:53 +03:00
nimlgen
117e9e22dd
estimates from graph ( #15937 )
...
* estimates from graph
* test
* x
2026-04-26 18:22:53 +03:00
nimlgen
e0ff6cc15c
remove old schedule ( #15930 )
...
* remove old schedule
* tests
* r
* x
2026-04-25 16:46:36 +03:00
nimlgen
a5e9ea7a60
remove schedule batch 4 ( #15927 )
...
* remove schedule batch 4
* fini
2026-04-25 12:36:55 +03:00
nimlgen
d2ab6ea7a6
remove schedule batch 3 ( #15924 )
...
* remove schedule batch 3
* batch 6
* batch 7
2026-04-25 11:53:16 +03:00
nimlgen
3c8a2db870
remove schedule() from tests batch 2 ( #15923 )
...
* remove schedule() from tests batch 2
* batch 4
2026-04-25 10:44:41 +03:00
nimlgen
f2751955cb
remove linear_to_schedule from tests ( #15912 )
...
* remove linear_to_schedule from tests
* x
2026-04-24 20:02:10 +03:00
chenyu
7a1adfd2aa
update Tensor.allclose to return Tensor ( #15904 )
...
matches jax
2026-04-24 08:27:17 -04:00
nimlgen
c0f77c2e1c
hcq graph to linear ( #15888 )
...
* hcq
* f
* f
* linter
2026-04-24 12:42:49 +03:00
nimlgen
5cf4ad2fb6
fix resolve param ( #15889 )
2026-04-23 17:41:44 +03:00
George Hotz
0c3260d5d9
rename VECTORIZE to STACK ( #15880 )
2026-04-23 10:43:42 +08:00
chenyu
f911a63a6b
don't allow negative num_classes in one_hot ( #15859 )
...
no auto infer num_classes, matches jax
2026-04-21 19:39:29 -04:00
Christopher Milan
99a0debd62
Device.count() ( #15842 )
2026-04-21 16:46:38 -04:00
chenyu
9192c93b7e
Tensor.invalid -> Tensor.invalids ( #15849 )
...
matches ones and zeros, and to not share name with UOp.invalid
2026-04-21 11:19:51 -04:00
nimlgen
ae9b84d32f
rm beam uop ( #15844 )
2026-04-21 13:10:26 +03:00
nimlgen
01ac1c8c15
remove all run_schedule from tests ( #15846 )
2026-04-21 12:02:10 +03:00
nimlgen
c0d7135b5f
do not use jit_cache in test ( #15823 )
...
* do not use jit_cache in test
* fix
2026-04-20 11:45:17 +03:00
oxrinz
f551a4bded
add threefry const folding ( #15787 )
...
* prim threefry
* test fix
* clean test
* cleanup
* cleanup 2
* cleanup 3
* fix conflict markers in test_const_folding.py
* update test
* fix lint
* use const instead of value for test
2026-04-20 09:30:03 +08:00
Christopher Milan
6adf4c3cd9
MOCKGPU interfaces ( #15796 )
2026-04-17 21:56:29 -04:00
chenyu
0191cc73dc
update arange range check ( #15794 )
...
it was not checking negative steps correctly
2026-04-17 16:07:50 -04:00
nimlgen
23ca680a3a
run_linear ( #15784 )
...
* run_linear try 2
* x
* f
* tests
* ctx, cleaner
* r
* x
2026-04-17 22:44:16 +03:00
Christopher Milan
9f4b7bed25
add pickled jit regression test ( #15774 )
2026-04-16 16:59:09 -04:00
qazal
12c653a743
remove opts arg in get_program, everything uses opts_to_apply [pr] ( #15767 )
...
* check Ops.BEAM in process replay
* remove opts from the get_program api
* lint
* simplify
* cleanup
2026-04-16 22:42:43 +03:00
George Hotz
d1cce7a476
put the ranges on store instead of after ( #15759 )
...
* put the ranges on store instead of after
* better assert
* fix stuff
* comment out slow rules i don't understand
* simpler rule
* closer
* return false for store
* fix loop
* only a few schedule failures remain
* remove stores to self
* all tests pass locally
* remove junk
* regression test and fix
* better test, bump broken torch count
* bugfix with regression test
* new fusion is better
2026-04-16 19:06:40 +08:00
chenyu
218d6b8988
delete old UOp.size [pr] ( #15756 )
2026-04-15 23:21:00 -04:00
chenyu
8bd4fead26
UOp.size -> prod(max_shape) ( #15755 )
...
and more test updates
2026-04-15 22:41:30 -04:00
chenyu
10c262ced8
update tests that use UOp.size ( #15753 )
2026-04-15 21:58:27 -04:00
nimlgen
164495678c
test_graph to use uops ( #15746 )
...
* test_graph to use uops
* x
* n
2026-04-15 21:59:41 +03:00
George Hotz
1ae6528bb6
move schedule into schedule ( #15736 )
...
* move schedule into schedule
* callify to root
* sched docs
2026-04-15 11:03:25 +08:00
chenyu
3394d18066
size*itemsize -> nbytes ( #15729 )
...
and some UOp.size removal to prep for size to mixin change
2026-04-14 16:27:54 -04:00
chenyu
e706f408cb
suppress test warnings from numpy ( #15688 )
2026-04-11 22:33:20 -04:00
chenyu
8e7fcc8ca3
remove _include_initial in _cumalu ( #15674 )
...
handle negative pad in caller
2026-04-10 08:33:30 -04:00
chenyu
4cf2759fc8
fix merge_reduce_ends ( #15659 )
...
* fix merge_reduce_ends
same range with different nesting should not merge, like cumsum twice should not merge
* skip that
2026-04-08 17:20:01 -04:00
qazal
39a029ec55
remove ASM_GEMM context var ( #15645 )
2026-04-08 18:02:40 +09:00
wozeparrot
70dbd35023
llama: move custom_kernel into flat_llama ( #15643 )
2026-04-08 00:19:14 -07:00
chenyu
01b49c8647
support int operand for shifts ( #15618 )
...
matches torch/jax, also symbolic rule to remove mask
2026-04-06 12:32:12 -04:00
Andrew Cappelli
e39cfe685a
validate lr, momentum, weight_decay in optimizers ( #15576 )
2026-04-06 06:37:34 +08:00
wozeparrot
7e54992bf6
fp8 llama ( #15588 )
...
Co-authored-by: qazal <qazal.software@gmail.com>
2026-04-04 18:24:57 -07:00
Christopher Milan
645d45d968
DEV has arch ( #15577 )
...
Co-authored-by: Comma Device <device@comma.ai>
2026-04-03 19:17:19 -04:00
chenyu
8fdef2d3e4
mean/std/var to mixin ( #15593 )
2026-04-03 10:42:41 -04:00
Christopher Milan
0ed8d9271d
Renderers accept Target or nothing ( #15590 )
2026-04-03 01:09:41 -04:00