Commit Graph

13563 Commits

Author SHA1 Message Date
Chen-Yu Yang
5842f434af PADTO pads Invalids 2026-06-10 22:09:49 -04:00
Christopher Milan
6e1b61f16f cleanup some amd deps (#16563)
don't load hsa runtime, remove ib autogen
2026-06-10 19:01:56 -04:00
George Hotz
7e6d617935 addrspace cleanups (#16565)
* addrspace cleanups

* bumps

* eh, relax a little
2026-06-10 15:57:18 -07:00
nimlgen
2c9d2c0d31 jit: memplan before compile (#16560) 2026-06-10 15:05:15 +03:00
qazal
34481830f1 rangeify: fix cost function for AFTER(out, CALL) (#16559)
* simple failing test

* fix rangeify cost function

* new ops count
2026-06-10 17:30:50 +09:00
chenyu
623b66e0e4 more tensor and mixin cleanups [PR] (#16558) 2026-06-10 00:39:33 -04:00
chenyu
7366d32247 getitem cleanups [PR] (#16556) 2026-06-09 22:48:58 -04:00
George Hotz
fd76ac992e cstyle renderer is new style [pr] (#16484)
* cstyle new style

* switch cstyle renderer to new style

* fix hip

* fixes

* fix webgpu

* correct webgpu is_packed

* fix dsp

* fixes

* fix Ops.RANGE must be CONST

* old style render access

* this is correct

* fix cstyle to good

* dl/dr

* as array

* fix spec

* remove define_local/define_reg

* buffer in shrink

* fix test_tiny

* all tests fix

* param args aren't realized

* wgsl fix

* work

* new gate

* fix opencl qcom

* process replay

* sort order

* fix render index
2026-06-09 18:36:01 -07:00
Christopher Milan
97d483350c ci: download prebuilt ocelot (#16554) 2026-06-09 19:51:33 -04:00
Christopher Milan
f9d88d3c3a fix race in test_quantize_onnx (#16555) 2026-06-09 18:39:48 -04:00
wozeparrot
2bdc360606 gemm: mxfp8 hipkittens gemm (#16541)
* gemm: mxfp8 hipkittens gemm

* feat: update hipkittens

* feat: kernel signature

* clean: just kernel

* feat: from tinygrad

* feat: test

* fix: add back utils

* clean: no diff

* clean: no diff
2026-06-09 15:20:05 -07:00
chenyu
12addee14f tensor and mixin cleanups [PR] (#16553) 2026-06-09 15:33:13 -04:00
nimlgen
2ab2d51099 hcq2: fix repeated calls (#16552) 2026-06-09 19:11:42 +03:00
chenyu
3f053a3370 move functional part of rand to RandMixin (#16551) 2026-06-09 09:40:48 -04:00
nimlgen
fa31c744b9 hcq2: cleaner (#16550) 2026-06-09 16:33:05 +03:00
qazal
598cc13ad2 more readable null graph profile in VIZ (#16548)
* more readable null graph profile in VIZ

* change

* fix flaky test
2026-06-09 18:35:05 +09:00
qazal
d18ad49f20 fix flaky test_disktensor (#16549) 2026-06-09 18:23:22 +09:00
qazal
fa400f9790 less E kernels in all2all (#16546) 2026-06-09 13:51:57 +09:00
qazal
b8931440ae add all2all schedule test (#16545) 2026-06-09 12:41:35 +09:00
wozeparrot
5ef30005fa update hipkittens (#16544) 2026-06-08 18:53:25 -07:00
Christopher Milan
4e2e2e9956 ocelot: use c.DLL (#16540) 2026-06-08 21:27:28 -04:00
chenyu
11fee53527 RandMixin [PR] (#16543) 2026-06-08 19:11:28 -04:00
chenyu
e2ef5cf5c9 no args and kwargs for _multi_like [PR] (#16539) 2026-06-08 17:35:15 -04:00
chenyu
12764161c9 UOp.shard support axis=None [PR] (#16538)
match Tensor
2026-06-08 11:36:50 -04:00
chenyu
ebc5390c9a advance indexing to mixin [PR] (#16532) 2026-06-08 09:24:49 -04:00
nimlgen
95d63d6c07 hcq2: lower to ins (#16535)
* hcq2: lower to ins

* pm4

* f
2026-06-08 16:15:30 +03:00
nimlgen
8baca185d5 hcq2: add kfd (#16537) 2026-06-08 13:48:27 +03:00
chenyu
03943cd1a0 use more _uop for cleanup [PR] (#16531)
`t.uop if isinstance(t, Tensor) else t` -> `t._uop`
2026-06-07 17:41:36 -04:00
chenyu
937aeaec60 remove device= from UPat.const [PR] (#16530) 2026-06-07 16:38:43 -04:00
George Hotz
eb1238436a more prereqs for DL/DR -> BUFFER (#16529) 2026-06-07 12:25:11 -07:00
George Hotz
0336ba8eb1 buffer param arg + dsp fixups (#16528) 2026-06-07 12:07:00 -07:00
Dmitriy Strunin
75e903d533 remove unused device arg from _get_winograd_matcols (#16527) 2026-06-07 08:15:09 -04:00
chenyu
90b556ca48 move gradient to mixin [PR] (#16526) 2026-06-07 00:05:02 -04:00
chenyu
4e7c6260b0 clean up test_tesnor_uop_mixin (#16525)
most of those don't have UNIQUE anymore
2026-06-06 23:25:44 -04:00
George Hotz
2a2f81dd3d remove ANON from addrspace, refactor marg (#16523)
* remove ANON from addrspace, refactor marg

* as_shape

* as_shape is cached
2026-06-06 09:49:09 -07:00
qazal
e69b4189b0 viz: hide STACK on PARAM by default (#16522) 2026-06-06 16:41:15 +09:00
Christopher Milan
857b1f5399 ci: more parallelism, less duplication (#16509) 2026-06-05 21:26:19 -04:00
wozeparrot
a1ec32cfd2 llama: current grad scaling (#16518) 2026-06-05 15:39:41 -07:00
Christopher Milan
8c0ba1da5c cleanup more from test/backend (#16521) 2026-06-05 18:38:46 -04:00
chenyu
9982185b14 remove unused AFTER rules in pm_add_buffers[PR] (#16519) 2026-06-05 14:58:34 -04:00
nimlgen
5ebd44aa12 hcq2: merge queues (#16514)
* hcq2: merge queues

* cleaner
2026-06-05 21:20:25 +03:00
chenyu
a51b5ba424 remove early fixup const copy [PR] (#16516) 2026-06-05 11:35:34 -04:00
Nueramarcos
8274140134 uop/ops: fix ~bool deprecation warning on Python 3.12+ (ORANGE Grok helped with the patch) (#16512) 2026-06-05 10:54:30 -04:00
chenyu
588c759a3d remove unused GroupOp.Buffer [PR] (#16515) 2026-06-05 10:38:52 -04:00
qazal
79a13310b3 viz: kernel_graph.txt unique is per schedule (#16511) 2026-06-05 16:17:28 +09:00
Christopher Milan
9b0f75622c many jit tests belong in unit (#16508) 2026-06-04 21:36:53 -04:00
chenyu
bb407d8b3c fix transform_precompiled_call for MULTI (#16510)
based on my understanding for https://github.com/tinygrad/tinygrad/pull/16084
2026-06-04 20:09:58 -04:00
wozeparrot
f11f63007d llama: immediate scaling on flag (#16494) 2026-06-04 10:30:00 -07:00
George Hotz
4fb8ce1831 update buffer in spec (#16507) 2026-06-04 10:12:31 -07:00
chenyu
4a8bf07a87 remove CONST(DEVICE) (#16506) 2026-06-04 11:29:46 -04:00