George Hotz
8612385ccb
add all codegen stages to spec_tensor
2026-05-12 10:23:03 -07:00
chenyu
f3e3c3851f
explicit args to Tensor.rand ( #16161 )
...
added requires_grad, other kwargs were silently dropped
2026-05-12 12:53:39 -04:00
nimlgen
e93fb5f9b9
hcq2: remove hcqprogram ( #16157 )
...
* hcq2 rm program
* nonbeauty
* no prog
* tiny
* f
* x
2026-05-12 18:49:13 +03:00
nimlgen
a708542308
fix ci spec ( #16156 )
2026-05-12 17:57:11 +03:00
nimlgen
e5729935c6
time_call ( #16152 )
...
* time_call
* x
* fix caches
2026-05-12 16:58:28 +03:00
qazal
fe39cf148a
add Ops.SOURCE test ( #16155 )
...
* simple failing test
* raises
* change
2026-05-12 22:49:32 +09:00
qazal
5cd0494b14
viz: canonicalize ast for schedule to codegen linking ( #16154 )
...
* simple failing test
* always null device
* viz: canonicalize ast for schedule to codegen linking
* SCACHE
2026-05-12 22:40:21 +09:00
qazal
c1d125ff3b
llm: add markers to --benchmark ( #16153 )
...
* markers in llm
* ui fix
2026-05-12 20:14:11 +09:00
wozeparrot
e9359d9e7d
more llama mp fixes ( #16151 )
...
* llama: SPLIT_W13
* llama: fix with no fused kernels
* llama: cast to bf16 on non asm_gemm path
* llama: new mp flags
2026-05-11 21:29:23 -07:00
chenyu
09fd80fba6
fix randperm and _multi_like drop requires_grad ( #16150 )
2026-05-11 23:23:34 -04:00
George Hotz
8294d105a7
Update the spec in spec.py to match the current state ( #16132 )
...
* start work on specv2
* more spec
* more spec
* fix amd emulator
* more spec
* more
* fix test_uop_graph
* move those
* spec=2
* skip those questionable tests
* ptx fix
* more spec=2
* store
* allow custom function in tensor
* spec 2
* fix beam search for tensor cores
* delete the old specs
* fix import
2026-05-11 20:07:47 -07:00
chenyu
3942a80f66
fix wrong kwargs passed into rands ( #16149 )
...
working towards explicit args for these
2026-05-11 22:22:06 -04:00
Christopher Milan
039d84ff02
Revert "onnx: deduplicate simple proto parsers" ( #16148 )
...
This reverts commit 83eaefcd0f.
2026-05-11 21:45:17 -04:00
Christopher Milan
20f587d5d5
nv: rm _download ( #16147 )
2026-05-11 19:56:37 -04:00
chenyu
371ab2023f
clean up image_dot and image_conv2d ( #16145 )
2026-05-11 19:37:58 -04:00
Vikram Rangarajan
effa263865
Torch backend aten::cat.out fix ( #16121 )
...
* Handle empty 1D tensors in cat_out
* Undid other changes
* Fixed torch cat
* Improved cat.out, added more tests
* Cleaned code
* Type hinted dim
* Removed whitespace
2026-05-11 16:28:16 -07:00
chenyu
63c1f00b80
disable test_svd_general again ( #16146 )
...
flaky on CI
2026-05-11 19:24:32 -04:00
Christopher Milan
2dccd4a3eb
am: autogen pmc ( #16143 )
...
* am: autogen pmc
* cleanup
* fix
* type
2026-05-11 19:22:12 -04:00
Christopher Milan
7ba55ad3ba
nv: autogen regs ( #16139 )
...
* nv: autogen regs
* flcn cot
* ci
* gen
2026-05-11 18:52:24 -04:00
chenyu
0b02fb6797
Revert "[pr] match torch rmsnorm ( #16122 )" ( #16144 )
...
This reverts commit 692257dd70.
2026-05-11 17:53:42 -04:00
chenyu
fbe8be0b8b
style cleanup to Tensor.qr and svd ( #16142 )
...
* style cleanup to Tensor.qr and svd
same kernels
* more
* enable
2026-05-11 17:16:59 -04:00
qazal
fc2cc1d77a
viz: call graph renderer example ( #16141 )
...
* work
* emits
* this
* cleaner repr for custom binaries
* --call-graph
* _ref
* this
* start
* this
* everything except the pyrender
* bring pyrender back
2026-05-12 05:07:30 +09:00
chenyu
f65e343fb3
spec.py cleanups ( #16140 )
...
removed END from shared_spec and NOOP from full_spec
2026-05-11 15:59:49 -04:00
Joshua James Venter
692257dd70
[pr] match torch rmsnorm ( #16122 )
...
* [pr] match rmsnorm torch
Signed-off-by: Joshua James Venter <venter.joshua@gmail.com>
* 1e-5
* ops.md
---------
Signed-off-by: Joshua James Venter <venter.joshua@gmail.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2026-05-11 14:36:41 -04:00
Sachith Shetty
59a81559d4
fix: add self.device to qr, svd, masked_select intermediates ( #16131 )
2026-05-11 11:22:54 -04:00
nimlgen
70c2480e71
hcq2 to extra ( #16126 )
...
* hcq2 in extra
* correct
* some revert from non-extra
* cln
* cpu
* x
* attach
* min
* remove attach
* linter
2026-05-11 17:17:30 +03:00
nimlgen
ad9738892c
get_buf() for Buffer ( #16134 )
...
* p
* mypy
* x
2026-05-11 16:36:14 +03:00
qazal
2dd84416bf
viz/cli: schedule renderer ( #16101 )
...
* simpler steps
* work
* work
* iterate
* faster
* better
* simplify more
* sys stdin
* less
* work
* work and mv
* better
* seen bufs
* all call graphs
* print query
* ux
* param to buffer / buffer_view
* work
* respect NO_COLOR in uop_to_json
* less
* render uops
* rm custom renderer
* call can't pyrender.
* unrelated diff
* assert
* 5
2026-05-11 01:56:16 +09:00
George Hotz
53f9587099
add canary
2026-05-10 09:38:18 -07:00
George Hotz
28cb7f1bcc
update readme with contributing guidelines
2026-05-10 09:35:48 -07:00
George Hotz
daed602569
rename BUFFERIZE to STAGE ( #16125 )
2026-05-10 09:26:46 -07:00
qazal
39ce780907
viz/cli: emit all runs of selected kernel, json fixes ( #16124 )
...
* keep print
* --json in tests, sqtt --json err
* work
* import
* less
* line
2026-05-10 21:45:51 +09:00
qazal
51c7dafb0d
split viz cli test helpers ( #16123 )
2026-05-10 19:42:24 +09:00
chenyu
b2a682ec60
remove _shape check in pm_mops [pr] ( #16120 )
...
seems fine now
2026-05-09 17:54:22 -04:00
wozeparrot
026688f03f
llama: move to correct dir ( #16118 )
2026-05-08 19:42:16 -07:00
Christopher Milan
a7512e0d12
PYTHON: images have no alignment constraints (by default) ( #16115 )
2026-05-08 20:35:03 -04:00
Christopher Milan
105b037c3c
cl: image alignment in arch ( #16106 )
2026-05-08 19:33:33 -04:00
Charlie Kerfoot
71a8c0da09
fix: trailing space format string ( #16005 )
2026-05-08 16:31:10 -07:00
Pawan
4dd6ad3514
gradient: add TRUNC backward ( #15925 )
...
* gradient: add TRUNC backward
* test: move round quantization gradient to test_ops
2026-05-08 16:27:55 -07:00
chenyu
5152ff95e7
_pad_constant and avg_pool2d cleanups ( #16110 )
2026-05-08 18:09:47 -04:00
chenyu
e6584532f4
minor elementwise cleanups ( #16102 )
2026-05-08 13:38:34 -04:00
nimlgen
49b55af619
jit: simpler free_intermediates ( #16099 )
2026-05-08 19:08:33 +03:00
chenyu
0f46c08582
div mixin cleanups ( #16100 )
2026-05-08 12:05:37 -04:00
chenyu
235044c9d8
Ops.IDIV -> Ops.CDIV, Ops.MOD -> Ops.CMOD ( #16093 )
...
* Ops.IDIV -> Ops.CDIV, Ops.MOD -> Ops.CMOD
* ruff
2026-05-07 23:18:15 -04:00
Christopher Milan
faabe6aa42
nv: remaining firmware from /lib/firmware ( #16088 )
2026-05-07 23:07:43 -04:00
b1tg
7ef901a81d
llm: moe speedup ( #16059 )
2026-05-07 19:06:35 -07:00
George Hotz
80da8a4b9c
add spec to main tinygrad repo ( #16092 )
2026-05-07 18:52:49 -07:00
June
83eaefcd0f
onnx: deduplicate simple proto parsers ( #16085 )
...
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2026-05-07 18:44:27 -07:00
George Hotz
c106c73e51
remove the gate from index ( #16081 )
...
* remove the gate from index
* gpt says this works
* remove hanging casts
* simplify
* move that down
* move gates
* ptr
* remove that simplify
* move that
2026-05-07 18:42:00 -07:00
wozeparrot
d11f4d0ec2
fix: don't copy on slice of DP weight ( #16089 )
2026-05-07 17:58:01 -07:00