tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-06-11 23:46:02 +08:00

Author	SHA1	Message	Date
Sachith Shetty	74567c1958	fix: pass input device to ONNX helper internal tensors (#16242) * fix: pass input device to onnx methods internal tensors * test: onnx helper internal tensors use input device	2026-05-19 11:16:33 -07:00
Christopher Milan	a178301dbe	PYTHONREMU: fix CDNA VOP3 conditional writes (#16258)	2026-05-19 13:31:31 -04:00
George Hotz	a120709671	tighten shape spec for broadcasting (#16206) * tighten shape spec for broadcasting * use IndexError, not ValueError * needs size	2026-05-18 22:12:04 -07:00
George Hotz	3f2d401464	all tests pass with NOOPT=1 (#16257) * all tests pass with NOOPT=1 * fix a few more * noopt 100% pass * noopt 100% pass	2026-05-18 20:39:51 -07:00
chenyu	e694d7f222	more deviceless const prerequisites [pr] (#16256) * more deviceless const prerequisites [pr] * remove that * arange.contiguous -> arange.clone in tests arange will become deviceless const soon, update tests where it needs to be a buffer	2026-05-18 23:14:12 -04:00
chenyu	c1076ed56c	Tensor.device and UOp.device can be None (#16255)	2026-05-18 22:08:10 -04:00
chenyu	d532b4f533	multi alu with deviceless const (#16251)	2026-05-18 19:31:53 -04:00
Christopher Milan	7515824a6d	ci: actually use clang-20, enable bfloat16 (#16249)	2026-05-18 19:06:43 -04:00
chenyu	754344087a	assign for deviceless const source (#16248)	2026-05-18 17:39:53 -04:00
chenyu	73e6b4963b	to and shard is noop for deviceless uop (#16247)	2026-05-18 16:11:10 -04:00
chenyu	db639ebe3e	deviceless const from UOp (#16243)	2026-05-18 14:14:12 -04:00
chenyu	5ae4dbd599	make slow tests faster (#16244)	2026-05-18 11:42:02 -04:00
chenyu	dcee90aa3f	remove requires_grad use in extra/examples (#16238) except the ones fed into optimizer	2026-05-16 18:40:26 -04:00
chenyu	8631b6f17d	remove use of requires_grad in test/ (#16237)	2026-05-16 17:21:07 -04:00
chenyu	0ddc50d050	do not gate backward on requires_grad (#16230) DETACH is filtered in _deepwalk. instead of None, it gets 0 grad now	2026-05-16 12:29:49 -04:00
qazal	ebcb7b7cc0	fp8 gemm tests with scale args (#16231) * update atol * update fp8 path * more work * update profile.sh	2026-05-16 20:47:58 +09:00
wozeparrot	2d48d7ab09	remove more invalid (#16227)	2026-05-16 02:52:27 -07:00
Christopher Milan	79c0ae5b89	metal: arch is GPU family (#16223)	2026-05-15 21:22:48 -04:00
chenyu	d62c1d83c0	remove Tensor.eye override (#16219) * remove Tensor.eye override was only needed for requires_grad arg * README	2026-05-15 15:40:34 -04:00
chenyu	07a172dbbb	remove noop requires_grad_ calls (#16213)	2026-05-15 13:31:10 -04:00
chenyu	c6cf9e8f0c	remove test_svd_nonfull_5_5 (#16217) flaky, kinda overlap with test_svd_general	2026-05-15 13:10:02 -04:00
chenyu	409bb0c9ad	requires_grad cannot be None (#16212) final goal is to remove requires_grad, first change the default to True, and don't allow None	2026-05-15 02:01:04 -04:00
chenyu	a612b88abb	better assert when setitem a refed tensor (#16210) also decouple from requires_grad	2026-05-14 23:40:29 -04:00
chenyu	a75c14f010	some setitem tests (#16209)	2026-05-14 22:36:25 -04:00
Christopher Milan	891a1ae7c2	onnx: remove dtype_fallback (#15717)	2026-05-14 22:06:57 -04:00
chenyu	ffa1aac7b1	gradient for STORE/AFTER ala clone (#16205)	2026-05-14 20:17:27 -04:00
chenyu	09096ea565	test_gradient_through_clone (#16203) backward through clone crashes now	2026-05-14 19:26:47 -04:00
George Hotz	83ec66da34	fix a fastdiv edge case (#16199)	2026-05-14 13:12:18 -07:00
George Hotz	3b8cc31759	disable fast idiv by default, it's broken (#16197) * disable fast idiv by default, it's broken * fix fast idiv tests	2026-05-14 11:48:27 -07:00
C T	1b779a9058	add gelu approximate="none" (match pytorch) (#16162) * add gelu approximate="none" (match pytorch) * lint * pass through onnx Gelu approximate * type annotate * explicit math.sqrt * keep tinygrad's gelu approximate="tanh" default	2026-05-13 18:53:24 -07:00
b1tg	3c806ff406	clean up gguf (#16160)	2026-05-12 21:16:10 -07:00
chenyu	38d407fd58	simplify svd more (#16181) all the slowness is scheduling	2026-05-12 23:48:22 -04:00
chenyu	32138c2418	svd to mixin (#16175)	2026-05-12 22:29:01 -04:00
chenyu	2172363be5	don't use Tensor indexing in svd (#16174) prepare mixin, also about 4X faster for 8x8 input	2026-05-12 21:56:19 -04:00
chenyu	420a08c6d1	qr to mixin (#16173)	2026-05-12 21:23:25 -04:00
chenyu	bdcdf1f1a1	jittable masked_select and nonzero (#16170) * jittable masked_select and nonzero make jittable with `size=`, matches jax * COMPILE_ONLY	2026-05-12 16:39:36 -04:00
wozeparrot	a613bcfc6d	allow after on contiguous in spec (#16169) * feat: allow after on contiguous * feat: add test	2026-05-12 13:11:44 -07:00
chenyu	7c3e3fa154	fix empty input for masked_select and nonzero (#16168)	2026-05-12 15:36:51 -04:00
chenyu	da3b7e89a4	atol in test_custom_kernel_multi_output_backward_interacting (#16166)	2026-05-12 14:42:12 -04:00
chenyu	25583f6dc1	fix cumsum dtype for 0d input (#16164)	2026-05-12 14:18:08 -04:00
George Hotz	64c81dfd24	add all codegen stages to spec_tensor (#16163)	2026-05-12 10:35:38 -07:00
chenyu	f3e3c3851f	explicit args to Tensor.rand (#16161) added requires_grad, other kwargs were silently dropped	2026-05-12 12:53:39 -04:00
nimlgen	e5729935c6	time_call (#16152) * time_call * x * fix caches	2026-05-12 16:58:28 +03:00
qazal	fe39cf148a	add Ops.SOURCE test (#16155) * simple failing test * raises * change	2026-05-12 22:49:32 +09:00
qazal	5cd0494b14	viz: canonicalize ast for schedule to codegen linking (#16154) * simple failing test * always null device * viz: canonicalize ast for schedule to codegen linking * SCACHE	2026-05-12 22:40:21 +09:00
chenyu	09fd80fba6	fix randperm and _multi_like drop requires_grad (#16150)	2026-05-11 23:23:34 -04:00
George Hotz	8294d105a7	Update the spec in spec.py to match the current state (#16132) * start work on specv2 * more spec * more spec * fix amd emulator * more spec * more * fix test_uop_graph * move those * spec=2 * skip those questionable tests * ptx fix * more spec=2 * store * allow custom function in tensor * spec 2 * fix beam search for tensor cores * delete the old specs * fix import	2026-05-11 20:07:47 -07:00
chenyu	3942a80f66	fix wrong kwargs passed into rands (#16149) working towards explicit args for these	2026-05-11 22:22:06 -04:00
Christopher Milan	039d84ff02	Revert "onnx: deduplicate simple proto parsers" (#16148) This reverts commit `83eaefcd0f`.	2026-05-11 21:45:17 -04:00
chenyu	63c1f00b80	disable test_svd_general again (#16146) flaky on CI	2026-05-11 19:24:32 -04:00

5636 Commits