George Hotz
5b8ff231cd
Merge branch 'master' into remove_vec_3
2026-04-29 13:31:51 -07:00
qazal
55915584e5
viz: fix cfg for emulated amd on the null device ( #15976 )
...
* simple failing when i test it end to end
* pass
* linter
* assemble
2026-04-30 05:18:09 +09:00
nimlgen
dfd2d07005
remove CompiledRunner ( #15970 )
...
* rm usage of CompiledRunner
* more tests
* last
* linter
* sink
* remove
* linter
2026-04-29 22:45:48 +03:00
wozeparrot
0080489abe
llama: use env vars ( #15978 )
2026-04-29 12:37:15 -07:00
qazal
a37b605523
remove arch from asm kernel class ( #15977 )
...
* rm arch from kernel
* update other tests
* update abstractions4.py
2026-04-30 03:39:52 +09:00
Christopher Milan
7a79c2948a
DEV visible device filter supports hyphenated syntax ( #15971 )
2026-04-29 14:02:21 -04:00
Christopher Milan
6b9a45568c
autogen: better version handling for llvm and libclang ( #15975 )
2026-04-29 14:01:33 -04:00
chenyu
654e611a29
_bits_to_rand to mixin ( #15972 )
2026-04-29 13:47:25 -04:00
George Hotz
a448b47e9c
Merge branch 'master' into remove_vec_3
2026-04-29 10:30:18 -07:00
George Hotz
5f441ecffc
unify reduce + reduce_axis ( #15973 )
...
* unify reduce + reduce_axis
* fix all tests
* lil cleanups
2026-04-29 10:29:56 -07:00
qazal
b63e0a5f74
viz/sqtt: move amd decoder to extra, don't import from ops_amd ( #15969 )
...
* don't import from ops_amd
* start
* cleanup
2026-04-30 00:49:15 +09:00
nimlgen
7787f76dcc
get_runner -> get_runtime ( #15967 )
...
* get_runner -> get_runtime
* do not use get_runner
* fix
* remove get_runner
* remove
* fix
* x
2026-04-29 18:29:49 +03:00
chenyu
fb188c3c23
UOp.bitcast noop early return ( #15968 )
...
matches Tensor
2026-04-29 09:41:40 -04:00
qazal
30403c1e25
viz/cli: merge DEBUG=6 and -i ( #15966 )
...
* print_step contiguous
* merge
2026-04-29 19:52:17 +09:00
qazal
86621e9e7c
gate f32_to_fp8 renderer ( #15964 )
2026-04-29 19:12:46 +09:00
wozeparrot
ef09071073
llama: speed 2 ( #15960 )
2026-04-28 20:44:37 -07:00
Christopher Milan
e6863a1cc5
autogen: fewer type: ignores ( #15956 )
2026-04-28 21:58:13 -04:00
George Hotz
6a983bb72b
fixes, reduce is broken
2026-04-28 18:54:15 -07:00
George Hotz
fedad1681f
remove dtype vec, try 3
2026-04-28 18:34:31 -07:00
chenyu
836af56513
some RandMixin cleanup ( #15961 )
...
cleaner to just put inside OpMixin
2026-04-28 19:58:02 -04:00
chenyu
c4bea54e9c
_threefry_random_bits to mixin ( #15959 )
...
start RandMixin
2026-04-28 19:13:57 -04:00
George Hotz
796fdf9fd8
end has no shape ( #15958 )
2026-04-28 15:15:48 -07:00
Miguel Villa Floran
b36010c55a
DGX Spark and Jetson Thor support ( #15939 )
2026-04-28 18:08:21 -04:00
Nino Risteski
5eb1fd5d3c
cleanup: untrack wait Metal buffers ( #15954 )
2026-04-28 12:54:59 -07:00
nimlgen
77965a22e5
local optimize as rewrite ( #15953 )
...
* local optimize as rewrite
* better
* x
* slightly rename
* fix
* ugh
* remove
* x
* remove
* not weak
2026-04-28 22:51:04 +03:00
qazal
b3f0f8d349
llama: fix missing label_smoothing arg ( #15955 )
2026-04-29 02:12:14 +09:00
wozeparrot
5e861cd2c4
llama: move llama kernels to llama_kernels ( #15952 )
2026-04-27 22:48:53 -07:00
Christopher Milan
987b6dd193
python -m tinygrad.device prints interface info ( #15950 )
2026-04-27 22:15:38 -04:00
qazal
54f00e1013
sqtt: correct rdna4 structs ( #15948 )
2026-04-28 07:35:50 +09:00
Charlie Kerfoot
890d7be0c3
fix: muon not using device ( #15936 )
2026-04-27 14:56:48 -07:00
qazal
c58fd85a99
sqtt: add needs_rocprof decorator ( #15947 )
...
* sqtt: add needs_rocprof decorator
* version string
2026-04-28 06:22:50 +09:00
Christopher Milan
3f508810d8
cpu: lowercase arch ( #15943 )
2026-04-27 17:05:25 -04:00
chenyu
77f9125c21
move Tensor.pad to OpMixin ( #15946 )
2026-04-27 16:56:04 -04:00
nimlgen
4164666c72
programinfo ( #15942 )
...
* programinfo
* fix
* m
* x
* x
* changes
* x
* fix
* rm
2026-04-27 23:12:03 +03:00
chenyu
fe38d6de94
_pad_circular and _pad_reflect_replicate to mixin ( #15944 )
2026-04-27 16:07:05 -04:00
qazal
8c174bdad4
viz/sqtt: correct exec pipes ( #15885 )
...
* wmma
* p2
* test
* left
* work
* pickle
* handwritten failing tests
* start work
* test the pipes
* empirical evidence
* update rdna4 enum types
* VALU pipe 1
* TRANSCENDENTAL pipe
* transcendental function units
* reorder
* wmma pipe
* cleanup and notes
* smaller
* work
* diff cleanup
* pickle
* use se:1
* int
2026-04-28 05:05:49 +09:00
qazal
eeb8d5eb0c
viz: small ui changes ( #15940 )
...
* rename colors
* keep ctrl c
2026-04-27 04:00:13 +09:00
nimlgen
96165ff0d1
validate_with_cpu as rewrite ( #15938 )
...
* validate_with_cpu as rewrite
* compil
* x
* linter
* moved
* fix
2026-04-26 19:58:53 +03:00
nimlgen
117e9e22dd
estimates from graph ( #15937 )
...
* estimates from graph
* test
* x
2026-04-26 18:22:53 +03:00
chenyu
e9983e3516
remove unused QCOMTextureInfo, QueueType [pr] ( #15935 )
2026-04-25 14:32:31 -04:00
nimlgen
ac3494a7cc
remove some runners ( #15934 )
...
* remove runners
* mypy
2026-04-25 21:27:05 +03:00
nimlgen
bb652352c7
remove execitem ( #15932 )
...
* remove execitem
* f
* x
2026-04-25 19:33:04 +03:00
chenyu
e27444a0ff
remove unused UOp.shard_size [pr] ( #15933 )
2026-04-25 12:27:58 -04:00
nimlgen
e0ff6cc15c
remove old schedule ( #15930 )
...
* remove old schedule
* tests
* r
* x
2026-04-25 16:46:36 +03:00
qazal
9a23de7d27
viz/cli: unify profile and rewrites, -s ALL default ( #15931 )
...
* work
* workg
* better
* cleanup
* better defaults
* --ls
* better
* work
* update llama
* update
2026-04-25 22:31:24 +09:00
nimlgen
768106a542
remove schedule from extra/docs/examples ( #15929 )
...
* remove schedule from extra/docs/examples
* f
2026-04-25 14:09:12 +03:00
nimlgen
a5e9ea7a60
remove schedule batch 4 ( #15927 )
...
* remove schedule batch 4
* fini
2026-04-25 12:36:55 +03:00
nimlgen
d2ab6ea7a6
remove schedule batch 3 ( #15924 )
...
* remove schedule batch 3
* batch 6
* batch 7
2026-04-25 11:53:16 +03:00
nimlgen
3c8a2db870
remove schedule() from tests batch 2 ( #15923 )
...
* remove schedule() from tests batch 2
* batch 4
2026-04-25 10:44:41 +03:00
Denys Melnyk
1fdcb13bfb
webgpu: fix weight lookup in export_model after compile_net key change ( #15919 )
...
* fix lookup site in export_model_webgpu after refactoring
webgpu (sd): fix export_model weight lookup after compile_net changes
* add regression test
2026-04-25 10:04:55 +03:00