tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-06-16 01:48:24 +08:00

Author	SHA1	Message	Date
Christopher Milan	171401e8df	skip modulo by zero in test_dtype_alu (#16404 )	2026-05-27 17:09:05 -04:00
chenyu	c33b767407	bring back test and torch backend change for unique const (#16403 )	2026-05-27 15:16:08 -04:00
chenyu	6da785562b	test_custom_kernel_precompile_multidevice (#16401 ) add a test to show what invalids need	2026-05-27 11:19:16 -04:00
chenyu	3e80f375ee	skip test_setitem_fancy_on_unrealized_view (#16400 ) crashes in linux llvm ci	2026-05-27 09:50:26 -04:00
chenyu	945ed4f689	revert const unique changes (#16395 )	2026-05-27 00:06:41 -04:00
chenyu	fa14cde05c	test update for arange and eye (#16394 ) these will need explicit clone to make a buffer	2026-05-26 22:48:34 -04:00
George Hotz	156a4438d9	rename BUFFER_VIEW to SLICE (#16391 ) * rename BUFFER_VIEW to SLICE * fix comments	2026-05-26 18:15:00 -07:00
Christopher Milan	3adf7f5d95	disable flaky cl test (#16388 )	2026-05-26 19:56:57 -04:00
Christopher Milan	d23659d38b	cleanup some old test skips (#16384 )	2026-05-26 19:07:22 -04:00
George Hotz	fd963038a0	remove allow_any_len from store (#16385 ) * remove allow_any_len from store * a few more * no bv there * more fixes * fixes * oh that	2026-05-26 15:26:53 -07:00
chenyu	0b88827482	remove CONST(UNIQUE) (#16383 )	2026-05-26 14:45:22 -04:00
chenyu	d861c50dce	remove unique_const (#16382 )	2026-05-26 13:53:31 -04:00
George Hotz	bac82d4949	fix emu bug in gfx950 (#16381 ) * fix emu bug in gfx950 * fix renderer	2026-05-26 10:32:03 -07:00
chenyu	9b00defc8c	Revert "remove unique_const (#16372 )" (#16380 ) This reverts commit `09019d6761`.	2026-05-26 12:30:07 -04:00
chenyu	09019d6761	remove unique_const (#16372 ) * remove unique_const * fix SDWA thing * that?	2026-05-26 12:18:03 -04:00
George Hotz	7f1b02854e	bufferview offset is units of input dtype (#16378 )	2026-05-26 08:49:31 -07:00
qazal	846a809af7	viz: add +- toggle for hidden UOps (#16368 ) * first * remove * move src toggles to client side * line * update viz server tests * remove those * logic * cleanup * call matches * fix const arg * add labels * keep changes * the stack on movement ops hiding change * structure * rename to expandedNodes * work * test intention	2026-05-26 22:31:54 +09:00
wozeparrot	76fc39ccc0	gather to single device (#16354 )	2026-05-25 17:27:08 -07:00
Christopher Milan	8ddd1328df	remove getenv(CI) (#16365 ) gone everywhere except test_interop, because torch MPS does not work in actions	2026-05-25 20:23:33 -04:00
George Hotz	689ab6a49f	move buffer view offset to src (#16364 ) * this work? * failed	2026-05-25 17:07:55 -07:00
Christopher Milan	d8f86be613	webgpu: shader-f16 support in arch (#16370 )	2026-05-25 19:20:59 -04:00
qazal	b73d2d17b9	viz/cli: add --interval (#16363 ) * interval support * add test_interval * llama uses interval	2026-05-26 03:35:06 +09:00
chenyu	5d5e02871f	remove Tensor.from_uop (#16344 ) and no device for const in Tensor init	2026-05-24 18:53:09 -04:00
chenyu	926d125a63	update test_stack (#16345 ) also skip COMPILE_ONLY, it was comparing 0==0	2026-05-23 10:42:35 -04:00
chenyu	149a87dac2	deviceless const cleanups (#16341 )	2026-05-22 20:11:01 -04:00
Christopher Milan	451f38155c	start cleanup of the slowest tests (#16339 )	2026-05-22 18:39:36 -04:00
qazal	bbfe4f80ec	quantize_fp8 kernels in uops (#16288 ) * add tests * simple UOp kernel is n^2 * fast kernel matching c++, opts_to_apply=() * remove cpp * simple o(n) kernel, two passes * fuse the loops * works on DEV=CPU * multi regression test * fix multi, this can possibly be its own bugfix * test cleanups * minimal diff * match C in UOps * Revert "match C in UOps" This reverts commit `0bef740c30`. * edit test * match speed with C try 2 * needs_second_gpu * cleanup	2026-05-22 20:54:06 +09:00
chenyu	3115952266	more unique const removal prerequisite (#16328 )	2026-05-21 23:51:40 -04:00
Christopher Milan	c2d06570a5	remove getenv(CI) from core tinygrad (#16326 )	2026-05-21 22:20:33 -04:00
Christopher Milan	150a82de1f	start cleaning up dtype tests (#16324 )	2026-05-21 21:11:49 -04:00
chenyu	31424cda71	Tensor.requires_grad -> is_param (#16325 ) for optimizer	2026-05-21 19:39:57 -04:00
chenyu	720a27bed8	remove many requires_grad= args (#16321 ) * remove many requires_grad= args * doc and example * not cifar	2026-05-21 18:37:11 -04:00
chenyu	73ea36f4ac	full(buffer=True) (#16311 ) make full a buffer with flag to turn off	2026-05-21 16:34:44 -04:00
George Hotz	6815f28849	dtype.vec shapes (#16287 ) * dtype.vec shapes * something * Closer * more passes * shape is in spec * fix reduce * image dtype shape correct * lil * use reshape on image * need BUFFER there * remove that test * fix ptx + x86 * fix nir * x86 fix maybe * x86 fixups * x86 fix * don't check that for NOOP	2026-05-21 11:56:49 -07:00
George Hotz	ec547250ef	don't use dtype vec for image idx (#16298 ) * don't use dtype vec for image idx * double gate * y/x confused * upd * fix nir * simplify_valid_image_load	2026-05-20 18:45:13 -07:00
Christopher Milan	172f9493e1	move is_dtype_supported to renderer (#16226 )	2026-05-20 21:19:37 -04:00
chenyu	beea4633fc	UOp.clone [pr] (#16295 ) generates the store after structure	2026-05-20 17:47:49 -04:00
George Hotz	58d58c1659	remove DEVECTORIZE (#16290 ) * remove DEVECTORIZE * fully remove DEVECTORIZE	2026-05-20 13:25:49 -07:00
Philipp Braun	a01d5918af	fix: qlinearconv quant params (#16234 ) * fix: qlinearconv quant params * fix: simplify reshape --------- Co-authored-by: Philipp Braun <braunphilipp@users.noreply.github.com>	2026-05-20 11:31:41 -07:00
chenyu	4dbe6a2ee7	remove _force_unique from Tensor init (#16277 )	2026-05-20 14:13:05 -04:00
qazal	1e0fffe256	fused ce llama kernel in UOps (#16263 ) * work * using uops * delete things * work * work * higher level uops * cleanups	2026-05-20 19:45:28 +09:00
chenyu	e1715b3b92	extent jit const error to deviceless inputs (#16276 )	2026-05-20 02:02:45 -04:00
chenyu	170b857da9	clean up deviceless const _buffer (#16274 ) process on CPU similar to multi	2026-05-19 22:47:45 -04:00
chenyu	188d7ec15e	clone can take device (#16271 ) useful to materialize const on a specific device	2026-05-19 21:29:27 -04:00
George Hotz	55515747b7	Remove Ops.VCONST (#16267 ) * start removing vconst * remove a lot of vconst * const folding + strict ordering * update tests * spec from minigen * move that	2026-05-19 16:35:24 -07:00
Christopher Milan	7cdd9cbdeb	PYTHONREMU: V_CVT_PK_BF8_F32 saturation (#16268 )	2026-05-19 19:29:59 -04:00
Christopher Milan	bb2a51f1ea	fix mypy mockgpu and add tinygrad.renderer.isa to packages (#16265 )	2026-05-19 16:45:03 -04:00
chenyu	890b731b1e	more prerequisuite test changed for deviceless const (#16264 )	2026-05-19 15:43:45 -04:00
ttomsa	aa1e59ab97	X86 with Ops.INS (#14873 ) * draft * cleanup test_encodings * cleanup test_isel * model flag state and support rematerialization * woops * add vbroadcastss instruction * don't fuse load if used multiple times in src * add movabs instruction and fix idiv * fixes * add x86 backend to tests * float16 fix * rm TwoAddress2nd * add BARRIER * test windows ci * yup isel fixes the mask stuff too and its beautiful * add cmoves to the spec * support storing imms * no TUPLE_ORDER, breaks tests * fix remaining seg faults * add float max * always fuse index * minor * fix DEFINE_VAR/SPECIAL and enable multithreading * linter * more linter * more * more * more * let's try this * perhaps * start new scheduler * more scheduling info * cleaner shuffle functions * fixup isel tests * skip bounds check when NOOPs exist * skip inf rewrite tests * fix const tag hack and add x86ops to _shape * fix * skip a few tests * func arg order independent from op value * x86 goes in own linearize * switch to PARAM * more * add min x86op and neg in decomps * do mulacc in isel * use def_reg in test_encodings * enable emulated int64 tests * how much does this fix * Ops becomes OpType * fix * rm noqa * rm machine scheduler stuff * and this * allow for extending enums and move X86Ops out of uop * fix imports * rm X86GroupOp from ops.py * spacing * tell mypy to shut up * more linter * add x86op test * allow set[X86Ops] in upat * move NOOPs to pre_isel_matcher and rm NOOP from spec * more asserts * also this * cleanup encode * simplify live range * fix idiv * add Ops.INS to x86 * more changes * more changes * more changes * fix * fix * fix * fix * print formatted assembly * fix 8bit idiv? 
* oops * enable float16 and unaligned vector load/store * actually no * move x86 tests * no more bool cast * fix * linter * linter * move X86Ops to x86.py * fix vpbroadcast * cleanups * linter * print correct reg names * canonical max * move max/min and add test * support float16 vector load/store * rm bad rewrite * vpsrldq can't access memory * regalloc takes renderer * enable vector load/store on all dtypes * more isel tests * rm this for now * a lot better * fix * fix * fix * deal with flags correctly * fix * enable gep noop rule * fix * fix * fix * add callee saved registers * use Ops.CONST instead of X86Ops.IMM * fix * enable TUPLE_ORDER * fix * rm x86 code in linearizer * fix * fix * fix * move isa rewrites to codegen * fix * fix * skip test_linearizer.py * skip more tests * fix * fix for idiv/mod changes * fix * don't use fmadd if it duplicates fused op * hacky * fix * cleanups * cleanups * fix --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2026-05-19 12:42:54 -07:00
Sachith Shetty	74567c1958	fix: pass input device to ONNX helper internal tensors (#16242 ) * fix: pass input device to onnx methods internal tensors * test: onnx helper internal tensors use input device	2026-05-19 11:16:33 -07:00

5685 Commits