13228 Commits

Author SHA1 Message Date
George Hotz
b910f1d5c0 something 2026-05-08 17:32:30 -07:00
George Hotz
e14b2b41c6 move image index 2026-05-08 17:27:34 -07:00
George Hotz
bf05a2762e Merge branch 'master' into image_no_vec 2026-05-08 16:32:08 -07:00
Charlie Kerfoot
71a8c0da09 fix: trailing space format string (#16005) 2026-05-08 16:31:10 -07:00
Pawan
4dd6ad3514 gradient: add TRUNC backward (#15925)
* gradient: add TRUNC backward

* test: move round quantization gradient to test_ops
2026-05-08 16:27:55 -07:00
chenyu
5152ff95e7 _pad_constant and avg_pool2d cleanups (#16110) 2026-05-08 18:09:47 -04:00
George Hotz
08747264cf fixes 2026-05-08 11:07:09 -07:00
George Hotz
f68c224b71 don't use vec(2) for image index 2026-05-08 10:52:24 -07:00
chenyu
e6584532f4 minor elementwise cleanups (#16102) 2026-05-08 13:38:34 -04:00
nimlgen
49b55af619 jit: simpler free_intermediates (#16099) 2026-05-08 19:08:33 +03:00
chenyu
0f46c08582 div mixin cleanups (#16100) 2026-05-08 12:05:37 -04:00
chenyu
235044c9d8 Ops.IDIV -> Ops.CDIV, Ops.MOD -> Ops.CMOD (#16093)
* Ops.IDIV -> Ops.CDIV, Ops.MOD -> Ops.CMOD

* ruff
2026-05-07 23:18:15 -04:00
Christopher Milan
faabe6aa42 nv: remaining firmware from /lib/firmware (#16088) 2026-05-07 23:07:43 -04:00
b1tg
7ef901a81d llm: moe speedup (#16059) 2026-05-07 19:06:35 -07:00
George Hotz
80da8a4b9c add spec to main tinygrad repo (#16092) 2026-05-07 18:52:49 -07:00
June
83eaefcd0f onnx: deduplicate simple proto parsers (#16085)
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2026-05-07 18:44:27 -07:00
George Hotz
c106c73e51 remove the gate from index (#16081)
* remove the gate from index

* gpt says this works

* remove hanging casts

* simplify

* move that down

* move gates

* ptr

* remove that simplify

* move that
2026-05-07 18:42:00 -07:00
wozeparrot
d11f4d0ec2 fix: don't copy on slice of DP weight (#16089) 2026-05-07 17:58:01 -07:00
George Hotz
1d1b726cf6 hotfix: disable flaky framework pytest 2026-05-07 17:05:06 -07:00
Christopher Milan
9a6f7f7576 nv: look for fmc firmware in /lib/firmware (#16080) 2026-05-07 18:08:27 -04:00
George Hotz
b796bbae87 fix valid in indexing tests (#16087) 2026-05-07 14:11:28 -07:00
wozeparrot
4d1a9dca41 fix: don't copy precompiled custom kernel outputs (#16084) 2026-05-07 14:02:38 -07:00
qazal
f9083cf901 use subactions for benchmark.yml process replay [pr] (#13396) 2026-05-08 03:46:25 +09:00
nimlgen
2f0aa884d5 tinygpu: minimal is macos13 for resets (#16075) 2026-05-07 21:25:56 +03:00
chenyu
072db9924c div to mixin (#16078)
also deleted idiv method
2026-05-07 12:52:37 -04:00
chenyu
516b00e286 mod and fmod to mixin (#16077) 2026-05-07 12:13:39 -04:00
qazal
a9a87ad8fd viz/cli: less flags (#16076)
* viz/cli: merge -s and -i flags

* only -t

* merge parser

* fix
2026-05-08 00:22:40 +09:00
qazal
f813a04b3f viz: pickle path in str (#16073) 2026-05-07 18:49:21 +09:00
wozeparrot
730fa66bf3 llama speed 6 (#16071) 2026-05-06 20:51:03 -07:00
Christopher Milan
7b91f7c90c nv: look for gsp firmware in /lib/firmware (#16068) 2026-05-06 21:35:47 -04:00
George Hotz
8e84317743 the renderer part of gate moving from index to load/store (#16064)
* the renderer part of gate moving from index to load/store

* fixed

* fix gated stores

* fix spec

* better?

* Where after gated load becomes alt value

* cleaner expression

* fix python backend

* remove dead code
2026-05-06 13:47:04 -07:00
chenyu
ef085304bc stronger divmod_recombine (#16066) 2026-05-06 15:41:54 -04:00
qazal
d7d32d82ee viz/cli: print first uop with DEBUG=6 (#16065)
* viz/cli: print first uop with DEBUG=6

* rename fmt to emit

* define inst
2026-05-07 03:39:34 +09:00
chenyu
af4140f3be fix divmod recombine for floordiv (#16062) 2026-05-06 14:22:42 -04:00
chenyu
c6ad3d3ac2 better divmod late rewrite (#16061)
better order
2026-05-06 11:31:48 -04:00
chenyu
aaabe42373 relax fold_divmod_general (#16058) 2026-05-05 21:37:56 -04:00
Christopher Milan
1de14cf33a am: autogen soc (#16055) 2026-05-05 20:39:43 -04:00
chenyu
869eae6b37 fix double div rewrites (#16054) 2026-05-05 19:34:35 -04:00
Christopher Milan
bd06ea9f97 am: simplify import_module (#16046) 2026-05-05 19:25:53 -04:00
qazal
795501e1da fix device in null graph events (#16053)
* failing test

* fix compute

* fix sdma
2026-05-06 07:44:08 +09:00
wozeparrot
ab6218bc92 llama mp fixes (#16050) 2026-05-05 15:35:32 -07:00
chenyu
34fe37d64e use FLOORDIV and FLOORMOD (#16048)
* use FLOORDIV and FLOORMOD

also removed CORRECT_DIVMOD_FOLDING

* fix

* Revert "fix"

This reverts commit 86af33b88ef31943c61e67189b072eca4896409a.

* fix

* fix
2026-05-05 18:32:54 -04:00
Christopher Milan
76ff378007 autogen: fewer apt dependencies (#16049) 2026-05-05 17:22:41 -04:00
nimlgen
5fa0016ffc supports_exec_item -> supports_uop (#16033) 2026-05-05 22:41:13 +03:00
qazal
cee17e0d2f viz: fix diff color (#16045) 2026-05-06 03:40:53 +09:00
chenyu
9c37a0c75d Ops.FLOORDIV and Ops.FLOORMOD (#16038)
* Ops.FLOORDIV and Ops.FLOORMOD

lowered into IDIV and MOD in get_late_rewrite_patterns

* still need this

* exclude

* like that?
2026-05-05 11:42:14 -04:00
qazal
d79bf356c2 viz: add CALL -> codegen link (#16044)
* work

* cleaner

* details

* rm
2026-05-05 23:34:44 +09:00
Christopher Milan
1c8cb0769a am: autogen asic_regs (#16004) 2026-05-04 22:52:07 -04:00
George Hotz
26406bed83 amd uses .valid, not index src valid (#16042) 2026-05-04 18:35:15 -07:00
chenyu
a357a0449a Tensor.div cleanup (#16041) 2026-05-04 19:27:36 -04:00