951 Commits

Author SHA1 Message Date
Christopher Milan
857b1f5399 ci: more parallelism, less duplication (#16509) 2026-06-05 21:26:19 -04:00
Christopher Milan
9dac781e45 ci: use uv (#16492) 2026-06-03 21:38:50 -04:00
Christopher Milan
f43cba5765 ci: native python where possible (#16473)
linters stays at 3.11
2026-06-02 22:40:12 -04:00
George Hotz
ffadd7a315 remove intel and amx support (#16482) 2026-06-02 18:53:05 -07:00
Christopher Milan
c6cad1ad67 ci: standardize runs-on (#16466)
* ci: use macos 26

* ugh github

* stick with github for arm
2026-06-01 21:39:58 -04:00
Christopher Milan
b0ecbb34d9 ci: cleanup python backend tests (#16465) 2026-06-01 20:08:05 -04:00
Christopher Milan
2d0f132a3b ci: cleanup more duplicate tests (#16462) 2026-06-01 18:56:29 -04:00
Christopher Milan
c377d01491 ci: run dsp on tinygrad[testing] (#16442) 2026-05-29 21:16:56 -04:00
Christopher Milan
d943493b79 ci: remove duplicate op compile test (#16441) 2026-05-29 19:20:31 -04:00
Christopher Milan
ef50a49693 ci: macos dev matrix (#16436) 2026-05-29 17:40:32 -04:00
Christopher Milan
6e0d5262dc ci: autocancel outdated pr jobs (#16424) 2026-05-28 23:14:35 -04:00
Christopher Milan
69aa2054f6 rename clangjit to clang (#16423) 2026-05-28 22:41:58 -04:00
Christopher Milan
a909acb882 move llvmspeed to benchmarks (#16422) 2026-05-28 22:26:22 -04:00
Christopher Milan
7d38edffdb ci: dev matrix (#16420)
windows just runs test_tiny
2026-05-28 22:04:04 -04:00
George Hotz
c87f3433d1 use namespace runners (#16387)
Co-authored-by: Christopher Milan <chrismilan@ucla.edu>
2026-05-28 18:05:46 -04:00
Christopher Milan
c8af163d2b disable process replay by default (#16419)
enable process replay with [pr] and assert with [PR]
process replay no longer captures on master
2026-05-28 17:36:28 -04:00
Christopher Milan
aacc8addf4 ci: use ubuntu 24.04 (#16393) 2026-05-26 23:22:01 -04:00
Christopher Milan
35461d4d8f ci: cleanup some deps [pr] (#16340) 2026-05-22 19:16:08 -04:00
Christopher Milan
518e60534e only load tinymesa_cpu when LVP is explicitly requested (#16320) 2026-05-21 19:03:13 -04:00
George Hotz
58d58c1659 remove DEVECTORIZE (#16290)
* remove DEVECTORIZE

* fully remove DEVECTORIZE
2026-05-20 13:25:49 -07:00
George Hotz
da7414d6dc fix RUN_PICKLE and test it (#16272)
* add test for openpilot RUN_PICKLE

* fix RUN_PICKLE and test it
2026-05-19 17:00:25 -07:00
ttomsa
aa1e59ab97 X86 with Ops.INS (#14873)
* draft

* cleanup test_encodings

* cleanup test_isel

* model flag state and support rematerialization

* woops

* add vbroadcastss instruction

* don't fuse load if used multiple times in src

* add movabs instruction and fix idiv

* fixes

* add x86 backend to tests

* float16 fix

* rm TwoAddress2nd

* add BARRIER

* test windows ci

* yup isel fixes the mask stuff too and its beautiful

* add cmoves to the spec

* support storing imms

* no TUPLE_ORDER, breaks tests

* fix remaining seg faults

* add float max

* always fuse index

* minor

* fix DEFINE_VAR/SPECIAL and enable multithreading

* linter

* more linter

* more

* more

* more

* let's try this

* perhaps

* start new scheduler

* more scheduling info

* cleaner shuffle functions

* fixup isel tests

* skip bounds check when NOOPs exist

* skip inf rewrite tests

* fix const tag hack and add x86ops to _shape

* fix

* skip a few tests

* func arg order independent from op value

* x86 goes in own linearize

* switch to PARAM

* more

* add min x86op and neg in decomps

* do mulacc in isel

* use def_reg in test_encodings

* enable emulated int64 tests

* how much does this fix

* Ops becomes OpType

* fix

* rm noqa

* rm machine scheduler stuff

* and this

* allow for extending enums and move X86Ops out of uop

* fix imports

* rm X86GroupOp from ops.py

* spacing

* tell mypy to shut up

* more linter

* add x86op test

* allow set[X86Ops] in upat

* move NOOPs to pre_isel_matcher and rm NOOP from spec

* more asserts

* also this

* cleanup encode

* simplify live range

* fix idiv

* add Ops.INS to x86

* more changes

* more changes

* more changes

* fix

* fix

* fix

* fix

* print formatted assembly

* fix 8bit idiv?

* oops

* enable float16 and unaligned vector load/store

* actually no

* move x86 tests

* no more bool cast

* fix

* linter

* linter

* move X86Ops to x86.py

* fix vpbroadcast

* cleanups

* linter

* print correct reg names

* canonical max

* move max/min and add test

* support float16 vector load/store

* rm bad rewrite

* vpsrldq can't access memory

* regalloc takes renderer

* enable vector load/store on all dtypes

* more isel tests

* rm this for now

* a lot better

* fix

* fix

* fix

* deal with flags correctly

* fix

* enable gep noop rule

* fix

* fix

* fix

* add callee saved registers

* use Ops.CONST instead of X86Ops.IMM

* fix

* enable TUPLE_ORDER

* fix

* rm x86 code in linearizer

* fix

* fix

* fix

* move isa rewrites to codegen

* fix

* fix

* skip test_linearizer.py

* skip more tests

* fix

* fix for idiv/mod changes

* fix

* don't use fmadd if it duplicates fused op

* hacky

* fix

* cleanups

* cleanups

* fix

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2026-05-19 12:42:54 -07:00
George Hotz
b2e8102209 25000 lines for x86 backend 2026-05-19 11:27:41 -07:00
Christopher Milan
7515824a6d ci: actually use clang-20, enable bfloat16 (#16249) 2026-05-18 19:06:43 -04:00
Christopher Milan
891a1ae7c2 onnx: remove dtype_fallback (#15717) 2026-05-14 22:06:57 -04:00
Christopher Milan
9a365d9978 ci: fix null image tests (#16188) 2026-05-13 18:00:05 -04:00
Christopher Milan
f1fdd2ccec ci: add IMAGE=1 compile-only tests (#16182)
* ci: add IMAGE=1 compile-only tests

* fix
2026-05-12 23:40:32 -04:00
Christopher Milan
3844a31f87 ci: untangle cuda/ocelot, less apt (#16171)
* ci: untangle cuda/ocelot, less apt

* ldconfig
2026-05-12 18:14:03 -04:00
Christopher Milan
316607f004 dsp: don't use docker in ci (#16167)
* dsp: don't use docker in ci

* add setup script for macos docker
2026-05-12 17:11:03 -04:00
Christopher Milan
a7512e0d12 PYTHON: images have no alignment constraints (by default) (#16115) 2026-05-08 20:35:03 -04:00
Christopher Milan
105b037c3c cl: image alignment in arch (#16106) 2026-05-08 19:33:33 -04:00
chenyu
34fe37d64e use FLOORDIV and FLOORMOD (#16048)
* use FLOORDIV and FLOORMOD

also removed CORRECT_DIVMOD_FOLDING

* fix

* Revert "fix"

This reverts commit 86af33b88ef31943c61e67189b072eca4896409a.

* fix

* fix
2026-05-05 18:32:54 -04:00
chenyu
9c37a0c75d Ops.FLOORDIV and Ops.FLOORMOD (#16038)
* Ops.FLOORDIV and Ops.FLOORMOD

lowered into IDIV and MOD in get_late_rewrite_patterns

* still need this

* exclude

* like that?
2026-05-05 11:42:14 -04:00
Christopher Milan
697e7aa819 MOCK+AMD and MOCK+NV interfaces (#15858)
MOCK+AMD is an alias for MOCKKFD+AMD, MOCKNVK+NV is renamed to MOCK+NV
2026-04-21 18:22:16 -04:00
qazal
e36ff22538 fix dev syntax in emulated amd tests, skip test_tk (#15856)
* fix dev syntax in emulated amd tests

* skip test_tk
2026-04-21 23:47:29 +03:00
Christopher Milan
1a8ba4cbd6 CPU renderers use arch (#15839) 2026-04-20 23:38:29 -04:00
Christopher Milan
6adf4c3cd9 MOCKGPU interfaces (#15796) 2026-04-17 21:56:29 -04:00
George Hotz
ec00cefa5b llm is the only app (#15779)
* tinygrad/llm is the only app

* upd pyproject

* claude refs

* scoping

* min diff
2026-04-17 10:44:48 +08:00
George Hotz
16f50a40a5 remove REMU from tree (#15706)
* no more compare emulators

* remove remu from tree
2026-04-13 20:43:08 +08:00
b1tg
2b5ba0095d qwen3.5 (#15210)
* qwen3.5

* faster

* or

* rm zero hack

* less float

* T=1

* clean

* clean

* 4b

* rope_dim

* Revert "jit: captures linears, not execitems (#15399)"

This reverts commit 9656d97d97.

* DeltaNetBlock

* pairwise_topk

* clean

* Reapply "jit: captures linears, not execitems (#15399)"

This reverts commit cf3deff53d.

* clean topk, _swiglu

* common

* FFNBlock

* clean

* half

* no mix

* qwen3.5 test

* fix ssm cache invalidation

* TransformerConfig

* SSMConfig

* clean

* reset_state

* llm: reuse server conversation tokens to avoid BPE roundtrip cache miss

* import error

* prefill

* none check

* put it back

* clean pairwise_topk

* symbolic: fold BIND(CONST, CONST) to CONST

* clean

* simpler pm

* _cached_msg_count

* stream decoder; ssm checkpoints

* rm checkpoint

* attn_output_gate

* conflict, attn_output_gate

* clean, less has_ssm, assert

* chunked prefill

* _reset_cache

* _reusable_prefix_len

* revert loop

---------

Co-authored-by: b1tg <b1tg@users.noreply.github.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2026-04-13 15:35:24 +08:00
Christopher Milan
19e96497ee interface in DEV (#15620) 2026-04-06 19:59:28 -04:00
Christopher Milan
645d45d968 DEV has arch (#15577)
Co-authored-by: Comma Device <device@comma.ai>
2026-04-03 19:17:19 -04:00
chenyu
5aeb2273db add amd_copy_matmul.py to CI (#15555)
more tests before cleanup
2026-03-31 22:39:18 -04:00
Christopher Milan
acf239e4d2 specify renderer in DEV, <dev>_<ren>=1 is deprecated (#15551) 2026-03-31 18:35:14 -04:00
Christopher Milan
313937ad6d fix IMAGE TestEnd2End.test_linear_mnist (#15488) 2026-03-26 04:12:47 -04:00
Christopher Milan
bc180a963c deprecate <dev>=1 in favor of DEV=<dev> (#15467)
* start work on target

* add test

* update actions to use DEV

* update docs

* update readmes

* tests need that too

* update example

* update tests (comments)

* fix that test

* ruff

* mypy

* oops

* remove getenvs

* don't add Target yet

* and the test

* lint

* and docs

* more stuff

* assert

* few more fixes

* test assert
2026-03-26 03:48:03 -04:00
nimlgen
9d2d0774b4 remote: disk copies (#15482)
* remote: disk copies

* linter

* r

* nv

* x
2026-03-25 22:14:25 +03:00
Salman Chishti
84049fdc07 Upgrade GitHub Actions to latest versions (#15446)
Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2026-03-24 10:28:49 -04:00
Salman Chishti
9567075e20 Upgrade GitHub Actions for Node 24 compatibility (#15445)
Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2026-03-24 10:28:19 -04:00
nimlgen
fa4cdb422e memplan on linears (#15422)
* memplan

* test

* x

* arenas

* correct

* set any size

* ugh

* make hevc happy

* x

* x

* held

* rm old

* del

* x

* fu

* f

* cl

* cl

* ok
2026-03-23 19:50:16 +08:00