Commit Graph

  • 9275f283e5 viz: update flag and display names (#15566) qazal 2026-04-01 15:48:37 +03:00
  • f5c0794df2 fix Tensor.const_like (#15565) chenyu 2026-04-01 08:35:19 -04:00
  • 09f60d80fd llama: fix FP8=1 FAKEDATA=1 (#15564) qazal 2026-04-01 14:53:03 +03:00
  • 6d1e992e89 copyout sharded w/o ioring (#15562) nimlgen 2026-04-01 14:47:29 +03:00
  • 150c456977 add OSError to suppress_finalizing (#15558) nimlgen 2026-04-01 12:33:59 +03:00
  • fc5b94b902 fix UOp.where(const, const) (#15560) chenyu 2026-04-01 05:28:49 -04:00
  • 5aeb2273db add amd_copy_matmul.py to CI (#15555) chenyu 2026-03-31 22:39:18 -04:00
  • 034f617971 NVCCRenderer is separate from CUDARenderer (#15554) Christopher Milan 2026-03-31 18:26:13 -07:00
  • 53ef0d36ec Merge remote-tracking branch 'upstream/master' into new_x86_backend ttomsa 2026-04-01 00:49:39 +01:00
  • 8b5b9a0e90 llama: run_and_time (#15533) wozeparrot 2026-04-01 06:46:16 +08:00
  • acf239e4d2 specify renderer in DEV, <dev>_<ren>=1 is deprecated (#15551) Christopher Milan 2026-03-31 15:35:14 -07:00
  • 5181c8e23a llm: fix nan in kvcache (#15552) nimlgen 2026-04-01 00:38:45 +03:00
  • 3af25ccdb4 docs: minor tinygpu changes (#15550) nimlgen 2026-03-31 21:29:15 +03:00
  • 477d194630 hipcomgr and tinygpu scripts (#15549) nimlgen 2026-03-31 20:07:52 +03:00
  • 83085f103c tinygpu docs (#15545) nimlgen 2026-03-31 19:49:38 +03:00
  • ca89215a59 nv: use nvcc over nak by default (#15547) nimlgen 2026-03-31 18:54:56 +03:00
  • a15345a53e viz/cli: improve --help message (#15546) qazal 2026-03-31 16:31:33 +03:00
  • 10d570b3d5 signed tinygpu (#15541) nimlgen 2026-03-31 14:55:09 +03:00
  • 4ac2552642 improve ReduceMixin.all (#15544) chenyu 2026-03-31 07:54:27 -04:00
  • 89ec22131a tests to show double negation in min is not cancelled (#15543) chenyu 2026-03-31 06:59:13 -04:00
  • 8feb8edc68 gemm/asm: add fp8 support to cdna asm_gemm (#15542) qazal 2026-03-31 13:32:54 +03:00
  • 2939ae8b22 more mixin (#15540) chenyu 2026-03-31 05:46:55 -04:00
  • e69f5f9f69 more movement methods to mixin (#15536) chenyu 2026-03-31 05:16:47 -04:00
  • ceb63c8c2f new bundle id (#15307) nimlgen 2026-03-31 12:16:03 +03:00
  • 467c0af8aa viz: skip flaky sever tests (#15538) qazal 2026-03-31 11:20:30 +03:00
  • f88e255cea gemm/asm: split and parameterize dtype in llama gemm tests (#15408) qazal 2026-03-31 11:12:44 +03:00
  • a63392a565 llm: pairwise ranking topk for MoE expert selection (#15499) b1tg 2026-03-31 12:46:39 +08:00
  • 79cccf3003 write sz output to file (#15534) wozeparrot 2026-03-31 11:16:17 +08:00
  • 6fb038d109 replace CompilerSet with list (#15530) Christopher Milan 2026-03-30 20:07:52 -07:00
  • bc866a93f0 viz: rename exec to sqtt (#15527) qazal 2026-03-31 02:06:51 +03:00
  • adbfd82d1d DEV is ContextVar, setting Device.DEFAULT is deprecated (#15508) Christopher Milan 2026-03-30 14:10:49 -07:00
  • 9583489068 add mlx driver to extra (#15526) nimlgen 2026-03-30 20:28:49 +03:00
  • ad6347f6d8 sqtt: allow mapping sopk to IMMEDIATE packets (#15525) qazal 2026-03-30 17:12:17 +03:00
  • 301b2cea57 move matmul to mixin (#15524) chenyu 2026-03-30 07:39:09 -04:00
  • f0eaac4235 reduce mixin (#15523) chenyu 2026-03-30 05:23:58 -04:00
  • d333ac1242 Merge branch 'master' into new_x86_backend ttomsa 2026-03-29 16:51:49 +01:00
  • f485d0b664 UOp.sum -> usum, prod -> uprod [pr] (#15522) chenyu 2026-03-29 04:51:55 -04:00
  • 36a925e2a2 viz: color wmma, one color map for cli and web (#15519) qazal 2026-03-28 21:53:01 +02:00
  • 0c3e438229 llama: mllog (#15502) wozeparrot 2026-03-29 02:18:25 +08:00
  • 7e57e101d5 better oor message in profiles (#15516) nimlgen 2026-03-28 20:25:07 +03:00
  • 266fb07721 viz: show exec duration (#15484) qazal 2026-03-28 15:48:59 +02:00
  • fe705def0d move more broadcast method to mixin [pr] (#15513) chenyu 2026-03-28 01:48:08 -04:00
  • c0753ab62f XOR simplifcation rules (#15512) chenyu 2026-03-27 23:23:27 -04:00
  • ccaa6bfc19 viz/cli cleanups (#15511) qazal 2026-03-28 01:50:38 +02:00
  • dcc2a5d23b viz/cli: simplify to --source and --item flags (#15510) qazal 2026-03-27 21:46:39 +02:00
  • 0d6fc0f571 jit: graphing in uops (#15489) nimlgen 2026-03-27 19:09:02 +03:00
  • 30ebbe7f17 few more fold valid tests (#15509) chenyu 2026-03-27 10:38:42 -04:00
  • 9e0cc5c6ae create image buffers in late codegen (#15493) Christopher Milan 2026-03-27 01:50:53 -07:00
  • 1198d6e908 move pow to mixin (#15507) chenyu 2026-03-27 03:16:40 -04:00
  • 323fcefd7d Revert "DEV is a ContextVar (#15505)" (#15506) chenyu 2026-03-27 02:22:40 -04:00
  • fdb30cba96 DEV is a ContextVar (#15505) Christopher Milan 2026-03-26 21:57:09 -07:00
  • a65e958be9 llama: new apply_grad (#15503) wozeparrot 2026-03-27 10:39:25 +08:00
  • 67a50fb738 move where on load with casts (#15492) Christopher Milan 2026-03-26 19:11:27 -07:00
  • 85e6b77c13 Merge branch 'master' into new_x86_backend ttomsa 2026-03-26 23:34:27 +00:00
  • 586c49642f viz/cli: test in CI (#15501) qazal 2026-03-26 23:47:15 +02:00
  • 3f9f0fa846 viz: yield sqtt alt events (#15500) qazal 2026-03-26 21:43:41 +02:00
  • a2b32b3abf Merge remote-tracking branch 'upstream/master' into new_x86_backend ttomsa 2026-03-26 16:28:02 +00:00
  • 237c25031f sqtt: construct OTHER_SIMD op types with for loop (#15495) qazal 2026-03-26 16:07:18 +02:00
  • 7193f90746 test view input in jit (#15497) nimlgen 2026-03-26 21:59:47 +08:00
  • de24b3fe37 jit: pass init params straight to base (#15496) nimlgen 2026-03-26 21:59:10 +08:00
  • ec5b7a249e viz: refactor sqtt timeline builder (#15494) qazal 2026-03-26 14:16:15 +02:00
  • 313937ad6d fix IMAGE TestEnd2End.test_linear_mnist (#15488) Christopher Milan 2026-03-26 01:12:47 -07:00
  • bc180a963c deprecate <dev>=1 in favor of DEV=<dev> (#15467) Christopher Milan 2026-03-26 00:48:03 -07:00
  • 8426f820a1 Tensor.sub to mixin (#15486) chenyu 2026-03-25 23:20:56 -04:00
  • 1ca178f379 llama: stochastic rounding (#15456) wozeparrot 2026-03-26 09:16:31 +08:00
  • 7c8f992894 move EXPAND dtype cast back to gradient.py (#15481) chenyu 2026-03-25 19:25:26 -04:00
  • 9d2d0774b4 remote: disk copies (#15482) nimlgen 2026-03-26 03:14:25 +08:00
  • 7c2c8d3905 viz: small ux improvements (#15483) qazal 2026-03-25 20:18:25 +02:00
  • 737d5f67f9 viz: compute canvas dims for auto zoom (#15474) qazal 2026-03-25 17:05:23 +02:00
  • 60bd546593 sqtt: add cycle count to rdna3 enums (#15473) qazal 2026-03-25 16:19:54 +02:00
  • 142bf11926 logical_not to mixin [pr] (#15472) chenyu 2026-03-25 09:16:45 -04:00
  • 25ff7146f2 add a status line to REMOTE with DEBUG=1 (#15471) George Hotz 2026-03-25 20:54:56 +08:00
  • f819f82c89 r4 qazal 2026-03-25 14:21:12 +02:00
  • be78f614f9 Merge branch 'master' into rdna4_gemm qazal 2026-03-25 14:19:47 +02:00
  • c973b508b8 viz/cli: pass ctrlc (#15470) qazal 2026-03-25 14:13:28 +02:00
  • c1a7d90ccc python speedups of hot paths (#15469) George Hotz 2026-03-25 20:02:42 +08:00
  • ae7090b13b print function timing with DEBUG=2 (#15468) George Hotz 2026-03-25 19:07:32 +08:00
  • e7f389efda fix height=1 images on macos (#15460) Christopher Milan 2026-03-25 02:59:56 -07:00
  • 789628df2e hotfix: add USE_BOT flag to ASM24 USB George Hotz 2026-03-25 15:00:08 +08:00
  • cd1a276f47 llm: support gguf path or url (#15464) George Hotz 2026-03-25 14:43:19 +08:00
  • 713b322e70 add weakint to promo_lattice (#15463) chenyu 2026-03-25 00:27:34 -04:00
  • 02878c5a2f move _broadcasted to OpMixin (#15461) chenyu 2026-03-24 23:56:01 -04:00
  • 1fb940f762 Merge branch 'master' into new_x86_backend ttomsa 2026-03-25 03:08:17 +00:00
  • 519ba22470 more Tensor._broadcasted cleanup (#15459) chenyu 2026-03-24 22:55:45 -04:00
  • e401546bf0 deepsek work deepsek George Hotz 2026-03-24 21:09:54 +08:00
  • fe2690399b llm: support assistant prefill + refactor to TransformerConfig (#15457) George Hotz 2026-03-25 10:50:48 +08:00
  • fd92aec094 cleanup unused image pitch code (#15458) Christopher Milan 2026-03-24 19:47:16 -07:00
  • f6ed4da268 Tensor.ufix (#15452) chenyu 2026-03-24 22:34:43 -04:00
  • 1b3d00d6ac viz/cli: remove --offset and --limit flags (#15439) qazal 2026-03-25 02:52:27 +02:00
  • da2031266a llama: correct 8b init (#15397) wozeparrot 2026-03-25 04:41:41 +08:00
  • 652bab8aad viz: support nested track_rewrites (#15454) qazal 2026-03-24 22:01:30 +02:00
  • 41eb2cc41b viz: preserve zoom between re renders (#15451) qazal 2026-03-24 20:11:10 +02:00
  • 84049fdc07 Upgrade GitHub Actions to latest versions (#15446) Salman Chishti 2026-03-24 14:28:49 +00:00
  • 9567075e20 Upgrade GitHub Actions for Node 24 compatibility (#15445) Salman Chishti 2026-03-24 14:28:19 +00:00
  • b7960841af support shape broadcast in UOp.alu (#15442) chenyu 2026-03-24 10:14:57 -04:00
  • a33ac869aa llm server: temperature + test client (#15444) George Hotz 2026-03-24 21:07:15 +08:00
  • 9db5d677c7 jit in viz (#15447) nimlgen 2026-03-24 18:23:53 +08:00
  • 2e4fbbcc9c ir3: fix texture mapping and benchmark (#15443) Christopher Milan 2026-03-24 01:52:54 -07:00
  • d5320a9ddf QCOM cleanups (#15435) Christopher Milan 2026-03-23 19:18:38 -07:00
  • 85dee83f5d amd flash attention cleanups + emulator fixes (#15431) George Hotz 2026-03-24 10:10:46 +08:00