George Hotz
02da66053e
null schedule test
2026-03-03 17:28:19 +08:00
George Hotz
654459a189
we are using walk
2026-03-03 17:15:58 +08:00
George Hotz
920fe858ae
no after removal
2026-03-03 17:07:04 +08:00
qazal
e3a0598d0b
viz: the whole pc should be in view (#15101)
2026-03-03 17:17:53 +09:00
b1tg
a9ea36de79
assembly/amd: v_cmp_lg_f32 is ordered not-equal (#14982)
2026-03-03 15:37:48 +08:00
wozeparrot
c35de9bd68
asm_gemm: support more sharding (#15002)
2026-03-02 23:16:37 -08:00
wozeparrot
824ba4386a
llama3 dp fix (#15098)
2026-03-02 22:43:07 -08:00
chenyu
5dcf29b1a0
use clone in test_swap_slices (#15096)
2026-03-02 22:05:12 -05:00
Christopher Milan
c70e8af068
move IMAGE FLOAT16 logic to allocations (#15095)
* FLOAT16 logic in allocations
* cleanup
* separate that
* only apply when IMAGE == 1
* test passing now
* create image buffers earlier
2026-03-02 22:00:05 -05:00
George Hotz
d483e4153a
buffer view is like buffer (#15082)
* buffer view is like buffer
* fix
* swap_reshape_shrink
* contiguous on gguf, fix overlap
* revert that
* _device_supports_view
* this
* fix that test
* 0 buffers
* that test was wrong
* this
* check correct size
* contig BUFFER_VIEW
* this
* fix tests
* buffer view tests
* om
* fix torch
* no MOCKGPU
* skip
2026-03-03 09:52:33 +08:00
qazal
62ee976c1b
gemm/asm: cleanup repeated patterns to helper functions (#15094)
2026-03-03 08:14:47 +09:00
qazal
848f5cea96
viz: sqtt instruction packet trace (#15065)
2026-03-03 07:55:04 +09:00
chenyu
14d1c5fdfd
assign fusion tests on detach and contiguous_backward (#15092)
2026-03-02 15:21:51 -05:00
nimlgen
dfa180413d
tbgpu: sign nv (#15087)
2026-03-02 22:58:30 +03:00
chenyu
71f228f80f
test exact kernel count in torch_backend/test_kernel_fusion (#15091)
2026-03-02 14:26:32 -05:00
chenyu
f80b1033c5
simpler Tensor.all (#15089)
same generated kernel
2026-03-02 11:08:55 -05:00
chenyu
4008f7d4e8
move Tensor.one_hot +1 to python (#15088)
2026-03-02 10:56:41 -05:00
nimlgen
dafbe9733a
am: cleanup (#15086)
2026-03-02 17:06:21 +03:00
qazal
f7aeff6061
viz: cli.py cleanups, do not require PYTHONPATH (#15085)
* cleanup the print
* sys.exit
* equal check
* cleanup unpacker
* cli doesn't need PYTHONPATH
* no semicolons
* %s/PYTHONPATH=. //g
2026-03-02 19:24:38 +09:00
George Hotz
5ff278446c
add contiguous_view_offset (#15084)
* add contiguous_view_offset
* no int
2026-03-02 18:05:04 +08:00
Christopher Milan
977c270774
IMAGE=1 kernel count failing tests (#15083)
2026-03-02 04:35:26 -05:00
George Hotz
3539693555
Support triu variable on diagonal + SDPA symbolic (#15081)
* triu variable
* fails
* dumbbb
* no commutative in reshape
* real fix
* revert that
* sdpa symbolic tests
2026-03-02 12:19:48 +08:00
wozeparrot
a4f6365929
llama3: fstep takes grads (#15069)
2026-03-01 20:05:07 -08:00
Nick
8e8e9f6ff6
assert removal for _tri() + tests (#15073)
* assert removal for _tri() and tests
* removed import
* tests triu/tril like in prefill
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2026-03-02 10:34:28 +08:00
nimlgen
ccbbca05ef
beam: add dev_timeout for am (#15063)
* beam: add dev_timeout for am
* all covered
* fk
* x
* fuzz
* reset
* f
2026-03-01 16:57:29 +03:00
chenyu
8cb4368967
delete unused END NOOP rule [pr] (#15077)
2026-03-01 00:09:05 -05:00
chenyu
efce99adc9
skip isComposing key press in llm.py (#15076)
for the CJK input user
2026-02-28 20:31:53 -05:00
chenyu
103ea16ec0
add contiguous back to svd (#15074)
can cause infinite loop
2026-02-28 16:49:26 -05:00
chenyu
fe0fa8333b
Revert "improve Tensor.sort indices (#15070)" (#15072)
This reverts commit e3003631f2.
2026-02-28 14:40:30 -05:00
chenyu
e3003631f2
improve Tensor.sort indices (#15070)
* improve Tensor.sort indices
instead of N^2 match at the end, have an arange to start and go through the same N(logN)^2 path
* contiguous
2026-02-28 14:16:16 -05:00
wozeparrot
cfc5cf65ad
llama3: vocab padding fix + jit copies on fakedata (#15067)
2026-02-28 08:44:55 -08:00
chenyu
76170d035a
relax atol for test_xlm_roberta_large (#15066)
2026-02-28 11:22:35 -05:00
qazal
cfb8e6922d
viz: arrow keys move through time (#15064)
* work
* automatic zoom, keeping scale
* the whole shape should be out of view
2026-02-28 23:52:36 +09:00
nimlgen
9b3450c9da
test gpu crash on cdna (#15062)
2026-02-28 13:17:59 +03:00
nimlgen
6bbf813dd3
ci: switch to tinygrad/amdcomgr_dylib (#15061)
2026-02-28 13:09:39 +03:00
nimlgen
77846300b2
am: reset vm fault (#15060)
2026-02-28 12:58:56 +03:00
George Hotz
dc54441e1f
add better printing to tinygrad.apps.llm (#15059)
* add better printing to tinygrad.apps.llm
* add gc.collect
* comment
2026-02-28 16:38:50 +08:00
George Hotz
bb84e389cf
functions for llama trainer (#15045)
* functions for llama trainer
* function there
* axis match
* fix multi
* lil cleaner
* there's a bug with HK_FLASH_ATTENTION
* training functions
* for commit
2026-02-28 12:15:18 +08:00
chenyu
9b4ba3f838
remove ReduceContext.range_to_ends [pr] (#15055)
* remove ReduceContext.range_to_ends [pr]
make merge_reduce_ends pure. this state is causing issue when introducing more reduce merging rewrites
* tag
2026-02-27 22:15:44 -05:00
chenyu
151608aa90
update test_multiple_to_single_device (#15056)
follow up to #14482, add SCACHE=0 to the test
2026-02-27 21:44:33 -05:00
chenyu
5fd06f4f02
differentiable setitem (#15054)
* differentiable setitem
go through the where path for bw
* no return
2026-02-27 17:25:15 -05:00
chenyu
db6b3e1edc
fix mixed setitem with both basic and tensor indexing (#15050)
2026-02-27 15:35:48 -05:00
chenyu
c9f6d8751b
don't remove_bufferize for Invalid (#15053)
* don't remove_bufferize for Invalid
* replaced
2026-02-27 15:16:09 -05:00
qazal
b8a55d5f68
sqtt: new packet types, add discovery script (#14960)
2026-02-28 04:27:27 +09:00
nimlgen
4e12fc3fe6
am: mi3xx recovery (#15051)
2026-02-27 22:10:47 +03:00
chenyu
81a35cef38
rearrange Tensor.getitem code (#15049)
no-op change to prepare setitem fix
2026-02-27 12:57:16 -05:00
chenyu
1406d49eef
failed test cases for advanced setitem (#15048)
2026-02-27 10:50:18 -05:00
qazal
ef1017f7ed
viz: skip drawing offscreen tracks in profiler (#15047)
2026-02-27 22:19:08 +09:00
qazal
ad99b77f6d
assembly/amd: add gfx12_asm_vflat llvm tests, disasm fixes (#15046)
* add gfx12_asm_vflat.s
* work
2026-02-27 20:20:31 +09:00
George Hotz
010d2790ce
fix multi minimal (#15044)
2026-02-27 14:31:58 +08:00