Commit Graph

  • 69eefdca20 images with height=1 have less strict width rules (#15325) Christopher Milan 2026-03-17 04:07:22 -07:00
  • 14eb8170e4 skip TestRunAsModule if libclang is loaded (#15323) chenyu 2026-03-17 06:02:53 -04:00
  • e7c26b6319 viz: rename to Start Cycle for the sqtt graph (#15320) qazal 2026-03-17 11:53:06 +02:00
  • e89a103984 remove dmaref (#15321) nimlgen 2026-03-17 17:52:09 +08:00
  • 3090d4a6e0 disallow reshape from None shape [pr] (#15322) chenyu 2026-03-17 05:46:53 -04:00
  • a50fdb0528 nvcc macos (#15308) nimlgen 2026-03-17 17:25:33 +08:00
  • 9d95321be3 set allow_implicit=False by default (#15319) George Hotz 2026-03-17 17:14:38 +08:00
  • e1c2d09720 system: rebar to remote devs (#15316) nimlgen 2026-03-17 16:09:12 +08:00
  • 79d2e83853 tighter ALU/variable min==max -> CONST rule [pr] (#15317) chenyu 2026-03-17 03:44:24 -04:00
  • 584ec75aa2 precompile backward (#15311) George Hotz 2026-03-17 15:28:40 +08:00
  • 34d2ef6447 Merge branch 'master' into precompile_backward George Hotz 2026-03-17 15:17:41 +08:00
  • 125d987886 no NOOPT George Hotz 2026-03-17 15:14:25 +08:00
  • 6b6d1814ca update no_vectorized_index [pr] (#15313) chenyu 2026-03-17 03:05:23 -04:00
  • c83f2a7154 simpler George Hotz 2026-03-17 15:01:46 +08:00
  • 32a8b3aaa0 split v not split George Hotz 2026-03-17 14:57:53 +08:00
  • 39aae38b2a compact grad George Hotz 2026-03-17 14:37:40 +08:00
  • f909d5c983 fix George Hotz 2026-03-17 14:19:47 +08:00
  • f7512e9595 cleanups George Hotz 2026-03-17 14:14:41 +08:00
  • 856a839efc llm: fix qwen3 moe topk renormalization (#15201) b1tg 2026-03-17 12:57:33 +08:00
  • 1283b57b4e update fix_store_after_hazard (#15309) chenyu 2026-03-16 23:55:59 -04:00
  • 9f591b42d1 add precompile backward support George Hotz 2026-03-17 11:40:09 +08:00
  • 575b40b93a determine image shapes before index devectorization (#15304) Christopher Milan 2026-03-16 20:16:33 -07:00
  • 3ff03be413 call always has tuple (#15297) George Hotz 2026-03-17 10:58:46 +08:00
  • 1b8b151195 simpler Tensor.assign (#15302) chenyu 2026-03-16 22:37:25 -04:00
  • 674c760974 embedded bwd vocab shard (#15001) wozeparrot 2026-03-17 10:37:16 +08:00
  • 62bfd48d95 smarter padding in image_conv2d (#15289) Christopher Milan 2026-03-16 19:17:48 -07:00
  • e1fab4d2a9 UOp.store is always void [pr] (#15301) chenyu 2026-03-16 21:58:05 -04:00
  • 02afb45f29 remove UOp.assign [pr] (#15300) chenyu 2026-03-16 21:45:41 -04:00
  • 33bd33e783 sqtt: add CDNA ops enum, show in viz (#15140) qazal 2026-03-17 02:38:42 +02:00
  • 3e2b7803e6 view assign replaces at buffer identity (#15298) chenyu 2026-03-16 19:58:38 -04:00
  • 346596cdce viz: nanoseconds time axis in sqtt (#15299) qazal 2026-03-17 00:20:18 +02:00
  • 1bc4cb254c signed tinygpu as default (#15296) nimlgen 2026-03-16 19:29:41 +08:00
  • 0de519c7c2 [pr] fewer simplify calls in image_fixup (#15283) Christopher Milan 2026-03-16 03:57:52 -07:00
  • 27e29127b5 system: remote prereqs (#15290) nimlgen 2026-03-16 18:45:41 +08:00
  • 837b06c609 style cleanups in allocations.py [pr] (#15295) chenyu 2026-03-16 05:45:24 -04:00
  • 476276f4b4 support grads on tuples (#15287) George Hotz 2026-03-16 17:39:34 +08:00
  • 20799df10b remove Ops.ASSIGN [pr] (#15294) chenyu 2026-03-16 05:22:21 -04:00
  • b3378e7022 UOp.assign is store+after [pr] (#15292) chenyu 2026-03-16 04:51:50 -04:00
  • 2e1c81c23f allow_implicit to disable implicit params (#15291) George Hotz 2026-03-16 16:40:14 +08:00
  • a0d1444790 Tensor.assign is store+after [pr] (#15288) chenyu 2026-03-16 04:04:55 -04:00
  • 08662bc4ab add TUPLE/GETTUPLE, simple tests pass (#15286) George Hotz 2026-03-16 15:06:02 +08:00
  • e7705fe311 system: pcidev doesn't care about bars (#15284) nimlgen 2026-03-16 14:45:43 +08:00
  • e4d8e03954 more test tuplegtuple George Hotz 2026-03-16 12:37:16 +08:00
  • 09574a096a maketuple George Hotz 2026-03-16 12:04:39 +08:00
  • b89c233917 fix George Hotz 2026-03-16 11:56:36 +08:00
  • 00d847afdf single backward gradient George Hotz 2026-03-16 11:44:15 +08:00
  • 8d75eed0a4 fix precompile George Hotz 2026-03-16 11:33:52 +08:00
  • e59cbf78bd Add TUPLE and GETTUPLE George Hotz 2026-03-16 11:22:03 +08:00
  • ff0bcc8de0 system: iface p1 changes (#15278) nimlgen 2026-03-16 10:48:25 +08:00
  • 4445f50356 viz: variable duration rdna barriers (#15277) qazal 2026-03-15 23:06:19 +02:00
  • 5cd1daa3bc cdna asm_gemm in one file, remove old rdna3 asm (#15281) qazal 2026-03-15 21:32:30 +02:00
  • cd14e8e64b allocations contiguous is store+after (#15280) chenyu 2026-03-15 11:58:40 -04:00
  • 7b6211fdd7 sqtt: remove discover_ops script (#15279) qazal 2026-03-15 15:17:06 +02:00
  • 473e5e4368 feat: make USE_ATOMICS embedding bwd faster (#15151) wozeparrot 2026-03-15 12:21:10 +08:00
  • 3858bfc83d sqtt: CDNA inst decodes (#15274) qazal 2026-03-14 14:03:46 +02:00
  • d753c5d7e5 IMAGE=1 image_conv2d pads for bank conflicts (#15252) Christopher Milan 2026-03-14 04:59:16 -07:00
  • 9047249a7c m.where(x.pad_to(m.shape), Invalid) ranges shrink (#15275) Christopher Milan 2026-03-14 04:26:36 -07:00
  • f392c53c66 system: merge remote into pciiface (#15273) nimlgen 2026-03-14 18:44:20 +08:00
  • 13eec8fbe8 remove unused assign rules [pr] (#15268) chenyu 2026-03-14 05:37:49 -04:00
  • dabdc986df shrink guarded ranges, try 2 (#15272) Christopher Milan 2026-03-14 01:24:05 -07:00
  • 7cf4b16c91 Revert "shrink guarded ranges" (#15271) Christopher Milan 2026-03-14 00:44:38 -07:00
  • d9951e2f8e shrink guarded ranges (#15263) Christopher Milan 2026-03-14 00:38:48 -07:00
  • 43ffd66fda viz: oneline inst list (#15269) qazal 2026-03-14 08:37:18 +02:00
  • 86f17468ed store in spec + USB BOT fix (#15265) George Hotz 2026-03-14 13:25:05 +08:00
  • 06d7cddb33 amd_copy_matmul is cleaner (#15248) George Hotz 2026-03-14 12:56:09 +08:00
  • b3600e4774 don't emit assign in transform_precompiled_call [pr] (#15262) chenyu 2026-03-13 22:42:35 -04:00
  • 4d60312f7f viz: asm python dsl syntax highlighting (#15259) qazal 2026-03-13 23:37:43 +02:00
  • 6209ddfc90 viz: improve disasm of s_code_end (#15258) qazal 2026-03-13 20:31:14 +02:00
  • a191ac0566 llama: use mlperf model (#15257) wozeparrot 2026-03-13 23:08:32 +08:00
  • 4b59083d7c assign into empty works (#15256) Sieds Lykles 2026-03-13 15:24:29 +01:00
  • 60b1b908c6 sqtt: CDNA layout header packet is the same size (#15255) qazal 2026-03-13 15:28:24 +02:00
  • 4e21735f31 system: update tinygpu app (#15247) nimlgen 2026-03-13 20:36:57 +08:00
  • 1fbe1fef2c move write_configs to drivers (#15253) nimlgen 2026-03-13 19:02:34 +08:00
  • 018c01508d test case for call precompile multi (#15254) chenyu 2026-03-13 06:28:43 -04:00
  • bc16f80b50 am: remove dma_regions param (#15251) nimlgen 2026-03-13 18:12:48 +08:00
  • 576e7f985f remove handle_assign_mops [pr] (#15249) chenyu 2026-03-13 01:53:21 -04:00
  • c251fc67c5 ci: consider arch in venv and apt caches and go back to 3.12 (#15250) Christopher Milan 2026-03-12 21:36:49 -07:00
  • d4b947ea9a ci: explicitly request python 3.12.10 instead of 3.12 (#15246) Christopher Milan 2026-03-12 20:00:46 -07:00
  • a7d2429c21 amd_uop_matmul more cleanups (#15240) George Hotz 2026-03-13 10:24:43 +08:00
  • a164ec328c Merge branch 'master' into more_uop_mm George Hotz 2026-03-13 10:11:31 +08:00
  • d893b14193 sqtt: update cdna packet names (#15243) qazal 2026-03-13 01:49:09 +02:00
  • 749162bd2f llama memory tweaks (#15223) wozeparrot 2026-03-13 03:36:23 +08:00
  • 9a7173b7a0 viz: visualize full range of shader clock frequency, auto zoom to kernel range (#15225) qazal 2026-03-12 17:07:31 +02:00
  • d9c09397c0 Ops.STORE is shapeless [pr] (#15239) chenyu 2026-03-12 09:05:30 -04:00
  • d746ccb791 system: fix vfio (#15235) nimlgen 2026-03-12 18:31:00 +08:00
  • d104a903f8 system: print output when err (#15230) nimlgen 2026-03-12 18:30:49 +08:00
  • 0c31a8f63b amd_uop_matmul more cleanups George Hotz 2026-03-12 18:19:58 +08:00
  • e560a46f59 update amd_uop_matmul (#15236) George Hotz 2026-03-12 17:33:12 +08:00
  • 90b7f4341d failed two level divmod recombine case (#15233) chenyu 2026-03-12 04:04:36 -04:00
  • ca36ef0186 found_assign assign_try_3 George Hotz 2026-03-12 15:49:43 +08:00
  • 0e28667c0a after George Hotz 2026-03-12 15:30:41 +08:00
  • 6759b66c2b store George Hotz 2026-03-12 15:22:43 +08:00
  • a8b04efec7 ASSIGN is STORE+AFTER (try 3) George Hotz 2026-03-12 15:13:32 +08:00
  • 8b8d9a443c remove unused invalid rules [pr] (#15231) chenyu 2026-03-12 03:10:34 -04:00
  • b1ed293239 go no_assign_2 George Hotz 2026-03-12 14:52:37 +08:00
  • 810d7ee7ec fixes George Hotz 2026-03-12 14:49:02 +08:00
  • 743e5334a4 Merge branch 'master' into no_assign_2 George Hotz 2026-03-12 14:25:13 +08:00
  • bdd62fd484 remove unneeded realize map entries (#15229) George Hotz 2026-03-12 14:23:19 +08:00
  • 75d311d766 no contig there George Hotz 2026-03-12 13:53:11 +08:00
  • fee1f4ccb9 ASSIGN is STORE+AFTER George Hotz 2026-03-12 11:59:10 +08:00