Commit Graph

1017 Commits

Author SHA1 Message Date
Sachith Shetty
74567c1958 fix: pass input device to ONNX helper internal tensors (#16242)
* fix: pass input device to onnx methods internal tensors

* test: onnx helper internal tensors use input device
2026-05-19 11:16:33 -07:00
chenyu
dcee90aa3f remove requires_grad use in extra/examples (#16238)
except the ones fed into optimizer
2026-05-16 18:40:26 -04:00
chenyu
8631b6f17d remove use of requires_grad in test/ (#16237) 2026-05-16 17:21:07 -04:00
chenyu
d62c1d83c0 remove Tensor.eye override (#16219)
* remove Tensor.eye override

was only needed for requires_grad arg

* README
2026-05-15 15:40:34 -04:00
Christopher Milan
891a1ae7c2 onnx: remove dtype_fallback (#15717) 2026-05-14 22:06:57 -04:00
George Hotz
83ec66da34 fix a fastdiv edge case (#16199) 2026-05-14 13:12:18 -07:00
George Hotz
8294d105a7 Update the spec in spec.py to match the current state (#16132)
* start work on specv2

* more spec

* more spec

* fix amd emulator

* more spec

* more

* fix test_uop_graph

* move those

* spec=2

* skip those questionable tests

* ptx fix

* more spec=2

* store

* allow custom function in tensor

* spec 2

* fix beam search for tensor cores

* delete the old specs

* fix import
2026-05-11 20:07:47 -07:00
Christopher Milan
039d84ff02 Revert "onnx: deduplicate simple proto parsers" (#16148)
This reverts commit 83eaefcd0f.
2026-05-11 21:45:17 -04:00
nimlgen
70c2480e71 hcq2 to extra (#16126)
* hcq2 in extra

* correct

* some revert from non-extra

* cln

* cpu

* x

* attach

* min

* remove attach

* linter
2026-05-11 17:17:30 +03:00
George Hotz
daed602569 rename BUFFERIZE to STAGE (#16125) 2026-05-10 09:26:46 -07:00
chenyu
235044c9d8 Ops.IDIV -> Ops.CDIV, Ops.MOD -> Ops.CMOD (#16093)
* Ops.IDIV -> Ops.CDIV, Ops.MOD -> Ops.CMOD

* ruff
2026-05-07 23:18:15 -04:00
June
83eaefcd0f onnx: deduplicate simple proto parsers (#16085)
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2026-05-07 18:44:27 -07:00
chenyu
34fe37d64e use FLOORDIV and FLOORMOD (#16048)
* use FLOORDIV and FLOORMOD

also removed CORRECT_DIVMOD_FOLDING

* fix

* Revert "fix"

This reverts commit 86af33b88ef31943c61e67189b072eca4896409a.

* fix

* fix
2026-05-05 18:32:54 -04:00
nimlgen
f6d92b55e6 am: use per pipe reset for gfx11+ (#16006) 2026-05-01 12:56:43 +03:00
nimlgen
dfd2d07005 remove CompiledRunner (#15970)
* rm usage of CompiledRunner

* more tests

* last

* linter

* sink

* remove

* linter
2026-04-29 22:45:48 +03:00
nimlgen
7787f76dcc get_runner -> get_runtime (#15967)
* get_runner -> get_runtime

* do not use get_runner

* fix

* remove get_tunner

* remove

* fix

* x
2026-04-29 18:29:49 +03:00
nimlgen
77965a22e5 local optimize as rewrite (#15953)
* local optimize as rewrite

* better

* x

* slightly rename

* fix

* ugh

* remove

* x

* remove

* not weak
2026-04-28 22:51:04 +03:00
nimlgen
4164666c72 programinfo (#15942)
* programinfo

* fix

* m

* x

* x

* changes

* x

* fix

* rm
2026-04-27 23:12:03 +03:00
nimlgen
e0ff6cc15c remove old schedule (#15930)
* remove old schedule

* tests

* r

* x
2026-04-25 16:46:36 +03:00
nimlgen
a5e9ea7a60 remove schedule batch 4 (#15927)
* remove schedule batch 4

* fini
2026-04-25 12:36:55 +03:00
nimlgen
f2751955cb remove linear_to_schedule from tests (#15912)
* remove linear_to_schedule from tests

* x
2026-04-24 20:02:10 +03:00
nimlgen
e5891acab2 jit: precompile (#15848)
* x

* jit: precompile as sep step

* x

* s

* x

* x

* x

* ?

* ?

* x

* x

* viz

* f

* x

* u

* x

* x
2026-04-23 00:23:32 +03:00
nimlgen
ae9b84d32f rm beam uop (#15844) 2026-04-21 13:10:26 +03:00
nimlgen
01ac1c8c15 remove all run_schedule from tests (#15846) 2026-04-21 12:02:10 +03:00
Christopher Milan
6adf4c3cd9 MOCKGPU interfaces (#15796) 2026-04-17 21:56:29 -04:00
George Hotz
ec00cefa5b llm is the only app (#15779)
* tinygrad/llm is the only app

* upd pyproject

* claude refs

* scoping

* min diff
2026-04-17 10:44:48 +08:00
qazal
12c653a743 remove opts arg in get_program, everything uses opts_to_apply [pr] (#15767)
* check Ops.BEAM in process replay

* remove opts from the get_program api

* lint

* simplify

* cleanup
2026-04-16 22:42:43 +03:00
qazal
96092d110c fix process_replay Ops.BEAM [pr] (#15752) 2026-04-16 07:35:28 +09:00
George Hotz
1ae6528bb6 move schedule into schedule (#15736)
* move schedule into schedule

* callify to root

* sched docs
2026-04-15 11:03:25 +08:00
chenyu
3394d18066 size*itemsize -> nbytes (#15729)
and some UOp.size removal to prep for size to mixin change
2026-04-14 16:27:54 -04:00
George Hotz
f930579b7a llm: change the default port to 8000 so you can remember it (match vLLM) 2026-04-08 11:25:38 +08:00
chenyu
a444be172d lower fuzz_symbolic_symbolic_div timeout (#15619)
mitigate timeout crash due to high total time
2026-04-06 12:58:29 -04:00
nimlgen
604cdbf2f7 am: large allocs aligned to 2mb to use 2mb pages (#15609) 2026-04-05 18:01:31 +03:00
Christopher Milan
645d45d968 DEV has arch (#15577)
Co-authored-by: Comma Device <device@comma.ai>
2026-04-03 19:17:19 -04:00
nimlgen
902edc3781 hcq: hcqbuf in copy (#15595) 2026-04-03 22:47:36 +03:00
Christopher Milan
acf239e4d2 specify renderer in DEV, <dev>_<ren>=1 is deprecated (#15551) 2026-03-31 18:35:14 -04:00
Christopher Milan
adbfd82d1d DEV is ContextVar, setting Device.DEFAULT is deprecated (#15508) 2026-03-30 17:10:49 -04:00
chenyu
f485d0b664 UOp.sum -> usum, prod -> uprod [pr] (#15522)
rename to prep reduce mixin
2026-03-29 04:51:55 -04:00
Christopher Milan
bc180a963c deprecate <dev>=1 in favor of DEV=<dev> (#15467)
* start work on target

* add test

* update actions to use DEV

* update docs

* update readmes

* tests need that too

* update example

* update tests (comments)

* fix that test

* ruff

* mypy

* oops

* remove getenvs

* don't add Target yet

* and the test

* lint

* and docs

* more stuff

* assert

* few more fixes

* test assert
2026-03-26 03:48:03 -04:00
George Hotz
fe2690399b llm: support assistant prefill + refactor to TransformerConfig (#15457)
* llm: support assistant prefill

* refactor to ModelConfig

* TransformerConfig

* more
2026-03-25 10:50:48 +08:00
George Hotz
a33ac869aa llm server: temperature + test client (#15444)
* improvements to the llm server

* eval script

* eval llm

* better eval gets 58.71

* cleanups

* add temperature, but multinomial is absurdly slow

* claude is so smart

* lint

* remove slop

* no more stop
2026-03-24 21:07:15 +08:00
nimlgen
9656d97d97 jit: captures linears, not execitems (#15399)
* jit: captures linears, not execitems

* x

* um

* tests

* mockcuda
2026-03-21 16:32:12 +08:00
chenyu
da1700e16b dtypes.index -> dtypes.weakint (#15377) 2026-03-20 01:08:46 -04:00
nimlgen
d720d50e12 memory: traverse all valid ranges only (#15338)
* memory: traverse all valid ranges only

* x
2026-03-18 14:03:39 +08:00
Christopher Milan
864d3917d5 add openpilot onnx parser test (#15334) 2026-03-18 00:12:02 -04:00
nimlgen
4b42bb54aa am: reset sdma to start from 0 (#15109) 2026-03-03 18:14:46 +03:00
nimlgen
ccbbca05ef beam: add dev_timeout for am (#15063)
* beam: add dev_timeout for am

* all covered

* fk

* x

* fuzz

* reset

* f
2026-03-01 16:57:29 +03:00
nimlgen
9b3450c9da test gpu crash on cdna (#15062) 2026-02-28 13:17:59 +03:00
nimlgen
faa66e0a61 mi350 hive_reset am repro (#15014) 2026-02-25 21:30:18 +03:00
George Hotz
2611907afb start ripping out old scheduler -- no maps (#14909)
* start ripping out old scheduler -- no maps

* no more metadata
2026-02-20 21:05:04 +08:00