13536 Commits

Author SHA1 Message Date
chenyu
03943cd1a0 use more _uop for cleanup [PR] (#16531)
`t.uop if isinstance(t, Tensor) else t` -> `t._uop`
2026-06-07 17:41:36 -04:00
chenyu
937aeaec60 remove device= from UPat.const [PR] (#16530) 2026-06-07 16:38:43 -04:00
George Hotz
eb1238436a more prereqs for DL/DR -> BUFFER (#16529) 2026-06-07 12:25:11 -07:00
George Hotz
0336ba8eb1 buffer param arg + dsp fixups (#16528) 2026-06-07 12:07:00 -07:00
Dmitriy Strunin
75e903d533 remove unused device arg from _get_winograd_matcols (#16527) 2026-06-07 08:15:09 -04:00
chenyu
90b556ca48 move gradient to mixin [PR] (#16526) 2026-06-07 00:05:02 -04:00
chenyu
4e7c6260b0 clean up test_tensor_uop_mixin (#16525)
most of those don't have UNIQUE anymore
2026-06-06 23:25:44 -04:00
George Hotz
2a2f81dd3d remove ANON from addrspace, refactor marg (#16523)
* remove ANON from addrspace, refactor marg

* as_shape

* as_shape is cached
2026-06-06 09:49:09 -07:00
qazal
e69b4189b0 viz: hide STACK on PARAM by default (#16522) 2026-06-06 16:41:15 +09:00
Christopher Milan
857b1f5399 ci: more parallelism, less duplication (#16509) 2026-06-05 21:26:19 -04:00
wozeparrot
a1ec32cfd2 llama: current grad scaling (#16518) 2026-06-05 15:39:41 -07:00
Christopher Milan
8c0ba1da5c cleanup more from test/backend (#16521) 2026-06-05 18:38:46 -04:00
chenyu
9982185b14 remove unused AFTER rules in pm_add_buffers [PR] (#16519) 2026-06-05 14:58:34 -04:00
nimlgen
5ebd44aa12 hcq2: merge queues (#16514)
* hcq2: merge queues

* cleaner
2026-06-05 21:20:25 +03:00
chenyu
a51b5ba424 remove early fixup const copy [PR] (#16516) 2026-06-05 11:35:34 -04:00
Nueramarcos
8274140134 uop/ops: fix ~bool deprecation warning on Python 3.12+ (ORANGE Grok helped with the patch) (#16512) 2026-06-05 10:54:30 -04:00
chenyu
588c759a3d remove unused GroupOp.Buffer [PR] (#16515) 2026-06-05 10:38:52 -04:00
qazal
79a13310b3 viz: kernel_graph.txt unique is per schedule (#16511) 2026-06-05 16:17:28 +09:00
Christopher Milan
9b0f75622c many jit tests belong in unit (#16508) 2026-06-04 21:36:53 -04:00
chenyu
bb407d8b3c fix transform_precompiled_call for MULTI (#16510)
based on my understanding of https://github.com/tinygrad/tinygrad/pull/16084
2026-06-04 20:09:58 -04:00
wozeparrot
f11f63007d llama: immediate scaling on flag (#16494) 2026-06-04 10:30:00 -07:00
George Hotz
4fb8ce1831 update buffer in spec (#16507) 2026-06-04 10:12:31 -07:00
chenyu
4a8bf07a87 remove CONST(DEVICE) (#16506) 2026-06-04 11:29:46 -04:00
nimlgen
3838c8df1b hcq2: move global sync (#16504) 2026-06-04 17:32:40 +03:00
chenyu
0faaf6df26 remove kwargs from arange and linspace [PR] (#16505)
it used to have requires_grad and device, now both are removed
2026-06-04 10:32:37 -04:00
qazal
3b1a5f9770 llama: a_bT and aT_b bf16 gemms (#16487)
* hk_bf16_gemm

* enable in 8b

* cleanups

* rename to USE_HK_BF16_GEMM

* work

* work

* work

* work

* change the gemms

* work

* work

* set as default

* work

* change
2026-06-04 23:30:21 +09:00
chenyu
5fad87252d no device= into arange and eye (#16503) 2026-06-04 09:21:50 -04:00
nimlgen
11af81f96f hcq2: cleaner (#16502) 2026-06-04 15:26:37 +03:00
chenyu
2c915c61ed no CONST(DEVICE) in torch_backend (#16499) 2026-06-04 00:26:47 -04:00
wozeparrot
fd13080636 deviceless const skip axis check (#16496) 2026-06-03 19:13:20 -07:00
qazal
f7f03bd7e5 viz: better name for src id in kernel_graph.txt (#16495)
* viz: better name for src id in kernel_graph.txt

* better order

* cleanup
2026-06-04 11:09:29 +09:00
Christopher Milan
9dac781e45 ci: use uv (#16492) 2026-06-03 21:38:50 -04:00
George Hotz
9fdeaa402b no anon addrspace, don't write hacks (#16491)
* no anon addrspace, don't write hacks

* revert that

* no reg there
2026-06-03 16:19:30 -07:00
chenyu
2f83d01ccf fix deviceless materialize device (#16493)
symbolic arange currently does not fuse, which creates a deviceless UOp post rangeify that needs a device to bufferize
2026-06-03 19:13:21 -04:00
chenyu
19eb72ff60 remove use of full with buffer=False and non-None device= (#16489) 2026-06-03 16:21:24 -04:00
nimlgen
6f2a2857c8 hcq2: refactor deps (#16490) 2026-06-03 23:20:24 +03:00
chenyu
243446b44f remove CONST(DEVICE) from const_like (#16488) 2026-06-03 14:04:51 -04:00
George Hotz
cee472a0ef renderer Estimates uses maxel (#16485) 2026-06-03 10:55:00 -07:00
chenyu
8a4203638a make full with buffer=False deviceless (#16483)
affects arange and eye
2026-06-03 12:35:59 -04:00
qazal
405866f2b7 viz: improve kernel_graph.py usability (#16486)
* better default

* always format kernel output

* also show ref

* sched num
2026-06-03 21:12:44 +09:00
Christopher Milan
f43cba5765 ci: native python where possible (#16473)
linters stay at 3.11
2026-06-02 22:40:12 -04:00
wozeparrot
7dcfd144b6 llama: columnwise fp8 scaling (#16480) 2026-06-02 18:55:45 -07:00
George Hotz
ffadd7a315 remove intel and amx support (#16482) 2026-06-02 18:53:05 -07:00
George Hotz
5f439e3b7c refactor cstyle to avoid dtype [PR] (#16478)
* refactor cstyle to avoid dtype

* clean up rules

* add new style option
2026-06-02 18:27:12 -07:00
Christopher Milan
80eeb4dd21 mockgpu: use autogen.libc (#16479) 2026-06-02 19:59:36 -04:00
chenyu
a43b55d480 deviceless const folding schedule test (#16477) 2026-06-02 18:46:30 -04:00
George Hotz
14f843737b renderer cleanups (pt 3) [PR] (#16475)
* renderer cleanups (pt 3)

* point refactors

* fix bugs

* fix PR
2026-06-02 14:24:24 -07:00
nimlgen
99e37b1ee3 hcq2: deps (#16459)
* start

* sin

* f
2026-06-02 22:34:25 +03:00
George Hotz
82f1c983d4 clean renderer migrations [pr] (#16472)
* clean renderer migrations

* minor webgpu

* use PARAM UOp as API

* make linter happy
2026-06-02 11:19:00 -07:00
Christopher Milan
9897658895 ci: fix ocelot compilation on macos (#16471) 2026-06-02 12:43:31 -04:00
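
Note on 8274140134 (`~bool` deprecation): the commit body is not shown in this listing, so the following is only a minimal sketch of the general Python behavior, not the actual patch. Since Python 3.12, applying `~` to a `bool` emits a `DeprecationWarning`, because the operator acts on the underlying integer and returns a truthy `int` for both inputs. The helper names `invert_flag` and `bitwise_invert` are hypothetical and do not come from tinygrad.

```python
import warnings

def invert_flag(flag: bool) -> bool:
    # Logical negation: stays a bool and emits no warning on any Python version.
    return not flag

def bitwise_invert(flag: bool) -> int:
    # Bitwise inversion works on the underlying int, so the result is
    # -2 for True and -1 for False; both values are truthy.
    return ~flag

# Record any warnings raised while calling the bitwise version.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = bitwise_invert(True)

# True on Python 3.12 and later; False on older interpreters.
saw_deprecation = any(issubclass(w.category, DeprecationWarning) for w in caught)

print("~True ->", result)
print("DeprecationWarning seen:", saw_deprecation)
print("not True ->", invert_flag(True))
```

Replacing `~flag` with `not flag` (or `flag ^ True` where an operator form is required) keeps the result a real boolean and avoids the warning.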