Commit Graph

  • 6815f28849 dtype.vec shapes (#16287) George Hotz 2026-05-21 11:56:49 -07:00
  • afc5bfa183 llama: remove fused grad accum (#16301) wozeparrot 2026-05-21 12:38:40 -04:00
  • a321700baa hcq2: multi prereqs (#16304) nimlgen 2026-05-21 17:00:52 +03:00
  • e33e058d34 set SPLIT_W13=0 for 8b DP by default (#16302) qazal 2026-05-21 16:09:10 +03:00
  • dd279ee25e print dtype decomp warning in DEBUG=2 (#16300) Christopher Milan 2026-05-20 19:08:48 -07:00
  • ec547250ef don't use dtype vec for image idx (#16298) George Hotz 2026-05-20 18:45:13 -07:00
  • 172f9493e1 move is_dtype_supported to renderer (#16226) Christopher Milan 2026-05-20 18:19:37 -07:00
  • d548f8d0f3 use clone instead of unique_const in allreduce [pr] (#16297) chenyu 2026-05-20 18:58:47 -04:00
  • 9e88b08f93 x86: don't use id (#16296) qazal 2026-05-21 01:36:40 +03:00
  • da07b28998 am: override smu 13_0_7 to 13_0_0 (#16292) Christopher Milan 2026-05-20 15:14:30 -07:00
  • beea4633fc UOp.clone [pr] (#16295) chenyu 2026-05-20 17:47:49 -04:00
  • a19fa2908f fix x86 nondeterminism (#16293) qazal 2026-05-20 23:48:05 +03:00
  • 58d58c1659 remove DEVECTORIZE (#16290) George Hotz 2026-05-20 13:25:49 -07:00
  • 825f30bf18 llama: apply_grad saves memory (#16275) wozeparrot 2026-05-20 16:14:06 -04:00
  • a88feef40f hcq2: cleanups (#16278) nimlgen 2026-05-20 21:48:50 +03:00
  • a01d5918af fix: qlinearconv quant params (#16234) Philipp Braun 2026-05-20 21:31:41 +03:00
  • f597ec8d5f Merge branch 'master' into minigen minigen George Hotz 2026-05-20 11:22:40 -07:00
  • 19535df53c enable broadcasting in _shape (#16285) George Hotz 2026-05-20 11:21:51 -07:00
  • 4dbe6a2ee7 remove _force_unique from Tensor init (#16277) chenyu 2026-05-20 14:13:05 -04:00
  • fe2d8d1ecf filter by base_class in pci_scan_bus on macOS (#16282) Christopher Bradford 2026-05-20 10:09:35 -07:00
  • 1e0fffe256 fused ce llama kernel in UOps (#16263) qazal 2026-05-20 13:45:28 +03:00
  • e1715b3b92 extent jit const error to deviceless inputs (#16276) chenyu 2026-05-20 02:02:45 -04:00
  • 170b857da9 clean up deviceless const _buffer (#16274) chenyu 2026-05-19 22:47:45 -04:00
  • 7af7b6703a relax policy ASSERT_MIN_STEP_TIME to 3.2 (#16273) chenyu 2026-05-19 22:29:09 -04:00
  • 188d7ec15e clone can take device (#16271) chenyu 2026-05-19 21:29:27 -04:00
  • 361553c0a8 llama: match flat_llama with model_train (#16269) wozeparrot 2026-05-19 20:25:56 -04:00
  • da7414d6dc fix RUN_PICKLE and test it (#16272) George Hotz 2026-05-19 17:00:25 -07:00
  • 55515747b7 Remove Ops.VCONST (#16267) George Hotz 2026-05-19 16:35:24 -07:00
  • 7cdd9cbdeb PYTHONREMU: V_CVT_PK_BF8_F32 saturation (#16268) Christopher Milan 2026-05-19 16:29:59 -07:00
  • 4524c78e87 broadcast George Hotz 2026-05-19 15:02:05 -07:00
  • 651031e3da Merge branch 'master' into minigen George Hotz 2026-05-19 14:12:51 -07:00
  • bb2a51f1ea fix mypy mockgpu and add tinygrad.renderer.isa to packages (#16265) Christopher Milan 2026-05-19 13:45:03 -07:00
  • 890b731b1e more prerequisuite test changed for deviceless const (#16264) chenyu 2026-05-19 15:43:45 -04:00
  • aa1e59ab97 X86 with Ops.INS (#14873) ttomsa 2026-05-19 20:42:54 +01:00
  • b2e8102209 25000 lines for x86 backend George Hotz 2026-05-19 11:27:41 -07:00
  • 74567c1958 fix: pass input device to ONNX helper internal tensors (#16242) Sachith Shetty 2026-05-19 11:16:33 -07:00
  • a178301dbe PYTHONREMU: fix CDNA VOP3 conditional writes (#16258) Christopher Milan 2026-05-19 10:31:31 -07:00
  • b3dcf8f452 hcq2: split into schedule/realize (#16216) nimlgen 2026-05-19 16:40:17 +03:00
  • e4350e7de9 set hipcc mac docker to 7.1 (#16261) qazal 2026-05-19 15:30:39 +03:00
  • 79ad8f3ef4 Merge branch 'master' into minigen George Hotz 2026-05-18 22:14:50 -07:00
  • a120709671 tighten shape spec for broadcasting (#16206) George Hotz 2026-05-18 22:12:04 -07:00
  • c54151235e expanded George Hotz 2026-05-18 21:00:57 -07:00
  • 8a25d18135 Merge branch 'master' into minigen George Hotz 2026-05-18 20:40:14 -07:00
  • 3f2d401464 all tests pass with NOOPT=1 (#16257) George Hotz 2026-05-18 20:39:51 -07:00
  • e694d7f222 more deviceless const prerequisites [pr] (#16256) chenyu 2026-05-18 23:14:12 -04:00
  • c1076ed56c Tensor.device and UOp.device can be None (#16255) chenyu 2026-05-18 22:08:10 -04:00
  • 4e3c3e66b0 fix weakint George Hotz 2026-05-18 19:06:29 -07:00
  • 474ec6e44d more amd fixes George Hotz 2026-05-18 18:34:16 -07:00
  • 7274239315 emu fixes George Hotz 2026-05-18 18:23:10 -07:00
  • a3d59faef6 llama: don't save weight (#16252) wozeparrot 2026-05-18 20:05:45 -04:00
  • 18b102f355 llama: also use 7.1 comgr, update startup_walltime.sh (#16253) qazal 2026-05-19 02:59:02 +03:00
  • d532b4f533 multi alu with deviceless const (#16251) chenyu 2026-05-18 19:31:53 -04:00
  • 3378626308 fix custom kernel George Hotz 2026-05-18 16:12:46 -07:00
  • 98b8a2b407 llama: use hipcc 7.1 version (#16250) qazal 2026-05-19 02:09:57 +03:00
  • 7515824a6d ci: actually use clang-20, enable bfloat16 (#16249) Christopher Milan 2026-05-18 16:06:43 -07:00
  • 9e408e239f fix rangeify tests George Hotz 2026-05-18 15:56:57 -07:00
  • b7e3b92c54 fix pow George Hotz 2026-05-18 15:37:26 -07:00
  • 754344087a assign for deviceless const source (#16248) chenyu 2026-05-18 17:39:53 -04:00
  • 96ad6f05bf Merge branch 'master' into minigen George Hotz 2026-05-18 14:14:41 -07:00
  • 73e6b4963b to and shard is noop for deviceless uop (#16247) chenyu 2026-05-18 16:11:10 -04:00
  • 50481ec9b4 cl: check for cl_khr_fp64 (#16246) Christopher Milan 2026-05-18 11:42:43 -07:00
  • db639ebe3e deviceless const from UOp (#16243) chenyu 2026-05-18 14:14:12 -04:00
  • bfb2d1f89a Revert "fp8 gemm speedup (#16236)" (#16245) qazal 2026-05-18 20:01:44 +03:00
  • 5ae4dbd599 make slow tests faster (#16244) chenyu 2026-05-18 11:42:02 -04:00
  • 981c12182f remove requires_grad= in tinygrad/ (#16241) chenyu 2026-05-17 16:55:37 -04:00
  • fcdd1af880 remove Tensor.detach override [pr] (#16239) chenyu 2026-05-16 23:58:12 -04:00
  • dcee90aa3f remove requires_grad use in extra/examples (#16238) chenyu 2026-05-16 18:40:26 -04:00
  • 8631b6f17d remove use of requires_grad in test/ (#16237) chenyu 2026-05-16 17:21:07 -04:00
  • d95bf394e1 fp8 gemm speedup (#16236) qazal 2026-05-16 22:58:28 +03:00
  • 0ddc50d050 do not gate backward on requires_grad (#16230) chenyu 2026-05-16 12:29:49 -04:00
  • bef5f717bc fix nolocals and beam (#16232) nimlgen 2026-05-16 18:09:19 +03:00
  • ebcb7b7cc0 fp8 gemm tests with scale args (#16231) qazal 2026-05-16 14:47:58 +03:00
  • e575f778f9 move debug prints (#16218) nimlgen 2026-05-16 13:57:34 +03:00
  • 2d48d7ab09 remove more invalid (#16227) wozeparrot 2026-05-16 05:52:27 -04:00
  • 159694347e llama: fix running flat_llama (#16224) wozeparrot 2026-05-15 23:16:48 -04:00
  • 6d082d46ce fp32 paeak qcom_mmapeak Comma Device 2026-05-16 03:03:59 +00:00
  • 9169a9b674 thread128 support Comma Device 2026-05-16 03:00:57 +00:00
  • f3e7efd39e control flow George Hotz 2026-05-15 19:42:29 -07:00
  • a4b9f67153 add qcom_fp16_mad_peak.py Comma Device 2026-05-16 02:41:31 +00:00
  • a263eca378 Merge branch 'master' into minigen George Hotz 2026-05-15 18:41:57 -07:00
  • 79c0ae5b89 metal: arch is GPU family (#16223) Christopher Milan 2026-05-15 18:22:48 -07:00
  • 2c61f65211 cl: device extensions in arch (#16220) Christopher Milan 2026-05-15 15:59:20 -07:00
  • 2549b14ec2 fix caformer onnx run (#16222) George Hotz 2026-05-15 15:08:36 -07:00
  • 2570bded8b update spec for LOAD (#16221) George Hotz 2026-05-15 14:46:00 -07:00
  • d62c1d83c0 remove Tensor.eye override (#16219) chenyu 2026-05-15 15:40:34 -04:00
  • 07a172dbbb remove noop requires_grad_ calls (#16213) chenyu 2026-05-15 13:31:10 -04:00
  • c6cf9e8f0c remove test_svd_nonfull_5_5 (#16217) chenyu 2026-05-15 13:10:02 -04:00
  • fede39db53 minigen can render more George Hotz 2026-05-15 09:07:20 -07:00
  • 0e90a46ae0 start minigen, a minimal correct codegen George Hotz 2026-05-15 08:43:06 -07:00
  • d54fa86b71 viz/cli: select all calls in graph by default (#16214) qazal 2026-05-15 15:01:44 +03:00
  • 28b98e529d nv: move structs to vram (#16184) nimlgen 2026-05-15 13:41:42 +03:00
  • 409bb0c9ad requires_grad cannot be None (#16212) chenyu 2026-05-15 02:01:04 -04:00
  • c7870f11ff mesa: suggest curl install tip (#16211) Christopher Milan 2026-05-14 21:29:06 -07:00
  • a612b88abb better assert when setitem a refed tensor (#16210) chenyu 2026-05-14 23:40:29 -04:00
  • a75c14f010 some setitem tests (#16209) chenyu 2026-05-14 22:36:25 -04:00
  • 891a1ae7c2 onnx: remove dtype_fallback (#15717) Christopher Milan 2026-05-14 19:06:57 -07:00
  • b4d267dfd4 llama: only save when small (#16208) wozeparrot 2026-05-14 20:46:29 -04:00
  • ffa1aac7b1 gradient for STORE/AFTER ala clone (#16205) chenyu 2026-05-14 20:17:27 -04:00
  • 770dac0e0d broadcast broadcast_shape_2 George Hotz 2026-05-14 17:04:37 -07:00
  • b827858479 broadcast shape George Hotz 2026-05-14 17:01:20 -07:00