13438 Commits

Author SHA1 Message Date
George Hotz
4218cc9257 fix spec 2026-05-27 17:35:46 -07:00
George Hotz
17419edc4a fix slice store to remove the index 2026-05-27 17:21:49 -07:00
qazal
88e88d63d6 viz: click on +- toggles sources (#16409) 2026-05-28 09:12:43 +09:00
George Hotz
b21afb4883 marg line cleanup (#16408)
* marg line cleanup

* bitcast is a mop
2026-05-27 16:41:04 -07:00
wozeparrot
dac3743d75 llama: delayed scaling in optim (#16407) 2026-05-27 15:40:03 -07:00
George Hotz
8ee3a37524 shrink/pad use (new_shape, offset) (#16405)
* shrink uses offset and shape

* pad does too

* fix
2026-05-27 15:13:08 -07:00
Christopher Milan
171401e8df skip modulo by zero in test_dtype_alu (#16404) 2026-05-27 17:09:05 -04:00
qazal
452c7d4230 llama: don't allocate grad_xw13 in bf16 (#16359) 2026-05-28 04:33:07 +09:00
nimlgen
0c385e31c6 hcq2 rewrite (#16375)
* hcq2 rewrite

* fi

* x

* simpler
2026-05-27 22:25:35 +03:00
chenyu
c33b767407 bring back test and torch backend change for unique const (#16403) 2026-05-27 15:16:08 -04:00
Christopher Milan
bacabf0866 webgpu: fix enums (#16402) 2026-05-27 13:09:50 -04:00
chenyu
6da785562b test_custom_kernel_precompile_multidevice (#16401)
add a test to show what invalids need
2026-05-27 11:19:16 -04:00
chenyu
3e80f375ee skip test_setitem_fancy_on_unrealized_view (#16400)
crashes in linux llvm ci
2026-05-27 09:50:26 -04:00
chenyu
945ed4f689 revert const unique changes (#16395) 2026-05-27 00:06:41 -04:00
Christopher Milan
aacc8addf4 ci: use ubuntu 24.04 (#16393) 2026-05-26 23:22:01 -04:00
chenyu
fa14cde05c test update for arange and eye (#16394)
these will need explicit clone to make a buffer
2026-05-26 22:48:34 -04:00
wozeparrot
3a7a6da7d5 llama: fakedata uses real vocab size (#16389) 2026-05-26 18:58:55 -07:00
George Hotz
156a4438d9 rename BUFFER_VIEW to SLICE (#16391)
* rename BUFFER_VIEW to SLICE

* fix comments
2026-05-26 18:15:00 -07:00
Christopher Milan
3adf7f5d95 disable flaky cl test (#16388) 2026-05-26 19:56:57 -04:00
Christopher Milan
d23659d38b cleanup some old test skips (#16384) 2026-05-26 19:07:22 -04:00
George Hotz
fd963038a0 remove allow_any_len from store (#16385)
* remove allow_any_len from store

* a few more

* no bv there

* more fixes

* fixes

* oh that
2026-05-26 15:26:53 -07:00
chenyu
0b88827482 remove CONST(UNIQUE) (#16383) 2026-05-26 14:45:22 -04:00
chenyu
d861c50dce remove unique_const (#16382) 2026-05-26 13:53:31 -04:00
George Hotz
bac82d4949 fix emu bug in gfx950 (#16381)
* fix emu bug in gfx950

* fix renderer
2026-05-26 10:32:03 -07:00
chenyu
9b00defc8c Revert "remove unique_const (#16372)" (#16380)
This reverts commit 09019d6761.
2026-05-26 12:30:07 -04:00
chenyu
09019d6761 remove unique_const (#16372)
* remove unique_const

* fix SDWA thing

* that?
2026-05-26 12:18:03 -04:00
George Hotz
7f1b02854e bufferview offset is units of input dtype (#16378) 2026-05-26 08:49:31 -07:00
qazal
846a809af7 viz: add +- toggle for hidden UOps (#16368)
* first

* remove

* move src toggles to client side

* line

* update viz server tests

* remove those

* logic

* cleanup

* call matches

* fix const arg

* add labels

* keep changes

* the stack on movement ops hiding change

* structure

* rename to expandedNodes

* work

* test intention
2026-05-26 22:31:54 +09:00
nimlgen
032905dec9 hcq2: simpler (#16361) 2026-05-26 14:28:48 +03:00
George Hotz
322693dcd3 hotfix: bump Mac pytest timeout to 4 minutes (try 2) 2026-05-25 18:23:21 -07:00
George Hotz
41ee7dab1c script to generate testsig for DSP (#16371)
* script to generate testsig for DSP

* cleanups
2026-05-25 17:54:58 -07:00
wozeparrot
76fc39ccc0 gather to single device (#16354) 2026-05-25 17:27:08 -07:00
George Hotz
942cb42b97 Revert "hotfix: bump Mac pytest timeout to 4 minutes"
This reverts commit 695a0069ed.
2026-05-25 17:25:11 -07:00
Christopher Milan
8ddd1328df remove getenv(CI) (#16365)
gone everywhere except test_interop, because torch MPS does not work in actions
2026-05-25 20:23:33 -04:00
George Hotz
695a0069ed hotfix: bump Mac pytest timeout to 4 minutes 2026-05-25 17:20:19 -07:00
George Hotz
689ab6a49f move buffer view offset to src (#16364)
* this work?

* failed
2026-05-25 17:07:55 -07:00
Christopher Milan
d8f86be613 webgpu: shader-f16 support in arch (#16370) 2026-05-25 19:20:59 -04:00
qazal
4bcc53eb26 viz: stable node position for +- toggle (#16367) 2026-05-26 06:30:47 +09:00
qazal
3506eb08ec viz: sidebar toggles always recenter (#16366)
* viz: sidebar toggles always recenters

* python brain
2026-05-26 06:14:32 +09:00
chenyu
cdeb861828 invalids is empty [pr] (#16353) 2026-05-25 16:11:38 -04:00
qazal
b73d2d17b9 viz/cli: add --interval (#16363)
* interval support

* add test_interval

* llama uses interval
2026-05-26 03:35:06 +09:00
C T
2ab90f31b1 use windows-specific alias nvcuda when loading cuda on windows (#16260)
This also makes it possible to use cuda on windows by specifying 3 env
vars with direct dll paths: NVCUDA_PATH, NVRTC_PATH and NVJITLINK_PATH
without name collision with CUDA_PATH which is used for cuda headers
include path in NVRTCCompiler.
2026-05-25 08:50:50 -07:00
wozeparrot
68d2102fd2 llama: offload master weights (#16355) 2026-05-25 08:48:13 -07:00
qazal
eecd4706ff fix mailbox comment, add types (#16360) 2026-05-25 22:24:00 +09:00
nimlgen
64095cf2e2 use get_buf in exec_kernel (#16356) 2026-05-25 15:13:40 +03:00
chenyu
5d5e02871f remove Tensor.from_uop (#16344)
and no device for const in Tensor init
2026-05-24 18:53:09 -04:00
nimlgen
a891727c9f hcq2: multi (#16347)
* hcq2: multi

* cleaner a bit
2026-05-24 19:28:33 +03:00
chenyu
926d125a63 update test_stack (#16345)
also skip COMPILE_ONLY, it was comparing 0==0
2026-05-23 10:42:35 -04:00
chenyu
149a87dac2 deviceless const cleanups (#16341) 2026-05-22 20:11:01 -04:00
Christopher Milan
35461d4d8f ci: cleanup some deps [pr] (#16340) 2026-05-22 19:16:08 -04:00