Commit Graph

  • 4e05b06c57 Deployed a837103 with MkDocs version: 1.6.1 gh-pages github-actions[bot] 2026-06-11 09:15:12 +00:00
  • a83710396c support mselect input to CALL, less kernels in allreduce (#16567) master qazal 2026-06-11 17:10:47 +08:00
  • 7d4a77dce4 relax comma benchmark timeout (#16568) qazal 2026-06-11 17:03:37 +08:00
  • be87a115fc resolve mstack update_benchmark qazal 2026-06-11 15:41:15 +08:00
  • f4d94c35fb support mselect input to CALL, less kernels in allreduce qazal 2026-06-11 15:24:53 +08:00
  • 21f1101691 add allreduce kernel count test (#16566) qazal 2026-06-11 14:54:12 +08:00
  • c38d6a7e3a mxfp8 part 2 (#16561) wozeparrot 2026-06-11 02:36:11 -04:00
  • 83971860d8 ci: simplify webgpu install (#16557) Christopher Milan 2026-06-10 19:57:19 -07:00
  • 6e1b61f16f cleanup some amd deps (#16563) Christopher Milan 2026-06-10 16:01:56 -07:00
  • 7e6d617935 addrspace cleanups (#16565) George Hotz 2026-06-10 15:57:18 -07:00
  • 8dcfea4a96 fixes memreg George Hotz 2026-06-10 15:11:24 -07:00
  • 3c14bf2568 fix idx George Hotz 2026-06-10 14:59:48 -07:00
  • ed27e2ccda fix cast and flaky test George Hotz 2026-06-10 14:53:34 -07:00
  • 8fea97bb3c MEMREG is more correct for cstyle/llvm stuff George Hotz 2026-06-10 14:40:05 -07:00
  • 2c9d2c0d31 jit: memplan before compile (#16560) nimlgen 2026-06-10 15:05:15 +03:00
  • 34481830f1 rangeify: fix cost function for AFTER(out, CALL) (#16559) qazal 2026-06-10 16:30:50 +08:00
  • 623b66e0e4 more tensor and mixin cleanups [PR] (#16558) chenyu 2026-06-10 00:39:33 -04:00
  • 7366d32247 getitem cleanups [PR] (#16556) chenyu 2026-06-09 22:48:58 -04:00
  • fd76ac992e cstyle renderer is new style [pr] (#16484) George Hotz 2026-06-09 18:36:01 -07:00
  • 97d483350c ci: download prebuilt ocelot (#16554) Christopher Milan 2026-06-09 16:51:33 -07:00
  • f9d88d3c3a fix race in test_quantize_onnx (#16555) Christopher Milan 2026-06-09 15:39:48 -07:00
  • 2bdc360606 gemm: mxfp8 hipkittens gemm (#16541) wozeparrot 2026-06-09 18:20:05 -04:00
  • 12addee14f tesnor and mixin cleanups [PR] (#16553) chenyu 2026-06-09 15:33:13 -04:00
  • 2ab2d51099 hcq2: fix repeated calls (#16552) nimlgen 2026-06-09 19:11:42 +03:00
  • 3f053a3370 move functional part of rand to RandMixin (#16551) chenyu 2026-06-09 09:40:48 -04:00
  • fa31c744b9 hcq2: cleaner (#16550) nimlgen 2026-06-09 16:33:05 +03:00
  • 598cc13ad2 more readable null graph profile in VIZ (#16548) qazal 2026-06-09 17:35:05 +08:00
  • d18ad49f20 fix flaky test_disktensor (#16549) qazal 2026-06-09 17:23:22 +08:00
  • fa400f9790 less E kernels in all2all (#16546) qazal 2026-06-09 12:51:57 +08:00
  • b8931440ae add all2all schedule test (#16545) qazal 2026-06-09 11:41:35 +08:00
  • 5ef30005fa update hipkittens (#16544) wozeparrot 2026-06-08 21:53:25 -04:00
  • 4e2e2e9956 ocelot: use c.DLL (#16540) Christopher Milan 2026-06-08 18:27:28 -07:00
  • 11fee53527 RandMixin [PR] (#16543) chenyu 2026-06-08 19:11:28 -04:00
  • e2ef5cf5c9 no args and kwargs for _multi_like [PR] (#16539) chenyu 2026-06-08 17:35:15 -04:00
  • 12764161c9 UOp.shard support axis=None [PR] (#16538) chenyu 2026-06-08 11:36:50 -04:00
  • ebc5390c9a advance indexing to mixin [PR] (#16532) chenyu 2026-06-08 09:24:49 -04:00
  • 95d63d6c07 hcq2: lower to ins (#16535) nimlgen 2026-06-08 16:15:30 +03:00
  • 8baca185d5 hcq2: add kfd (#16537) nimlgen 2026-06-08 13:48:27 +03:00
  • 03943cd1a0 use more _uop for cleanup [PR] (#16531) chenyu 2026-06-07 17:41:36 -04:00
  • 937aeaec60 remove device= from UPat.const [PR] (#16530) chenyu 2026-06-07 16:38:43 -04:00
  • eb1238436a more prereqs for DL/DR -> BUFFER (#16529) George Hotz 2026-06-07 12:25:11 -07:00
  • 0336ba8eb1 buffer param arg + dsp fixups (#16528) George Hotz 2026-06-07 12:07:00 -07:00
  • 75e903d533 remove unused device arg from _get_winograd_matcols (#16527) Dmitriy Strunin 2026-06-07 05:15:09 -07:00
  • 90b556ca48 move gradient to mixin [PR] (#16526) chenyu 2026-06-07 00:05:02 -04:00
  • 4e7c6260b0 clean up test_tesnor_uop_mixin (#16525) chenyu 2026-06-06 23:25:44 -04:00
  • 2a2f81dd3d remove ANON from addrspace, refactor marg (#16523) George Hotz 2026-06-06 09:49:09 -07:00
  • e69b4189b0 viz: hide STACK on PARAM by default (#16522) qazal 2026-06-06 15:41:15 +08:00
  • 857b1f5399 ci: more parallelism, less duplication (#16509) Christopher Milan 2026-06-05 18:26:19 -07:00
  • a1ec32cfd2 llama: current grad scaling (#16518) wozeparrot 2026-06-05 18:39:41 -04:00
  • 8c0ba1da5c cleanup more from test/backend (#16521) Christopher Milan 2026-06-05 15:38:46 -07:00
  • 9982185b14 remove unused AFTER rules in pm_add_buffers[PR] (#16519) chenyu 2026-06-05 14:58:34 -04:00
  • 5ebd44aa12 hcq2: merge queues (#16514) nimlgen 2026-06-05 21:20:25 +03:00
  • a51b5ba424 remove early fixup const copy [PR] (#16516) chenyu 2026-06-05 11:35:34 -04:00
  • 8274140134 uop/ops: fix ~bool deprecation warning on Python 3.12+ (ORANGE Grok helped with the patch) (#16512) Nueramarcos 2026-06-05 07:54:30 -07:00
  • 588c759a3d remove unused GroupOp.Buffer [PR] (#16515) chenyu 2026-06-05 10:38:52 -04:00
  • 79a13310b3 viz: kernel_graph.txt unique is per schedule (#16511) qazal 2026-06-05 15:17:28 +08:00
  • 9b0f75622c many jit tests belong in unit (#16508) Christopher Milan 2026-06-04 18:36:53 -07:00
  • bb407d8b3c fix transform_precompiled_call for MULTI (#16510) chenyu 2026-06-04 20:09:58 -04:00
  • f11f63007d llama: immediate scaling on flag (#16494) wozeparrot 2026-06-04 13:30:00 -04:00
  • 4fb8ce1831 update buffer in spec (#16507) George Hotz 2026-06-04 10:12:31 -07:00
  • 4a8bf07a87 remove CONST(DEVICE) (#16506) chenyu 2026-06-04 11:29:46 -04:00
  • 3838c8df1b hcq2: move global sync (#16504) nimlgen 2026-06-04 17:32:40 +03:00
  • 0faaf6df26 remove kwargs from arange and linspace [PR] (#16505) chenyu 2026-06-04 10:32:37 -04:00
  • 3b1a5f9770 llama: a_bT and aT_b bf16 gemms (#16487) qazal 2026-06-04 22:30:21 +08:00
  • 5fad87252d no device= into arange and eye (#16503) chenyu 2026-06-04 09:21:50 -04:00
  • 11af81f96f hcq2: cleaner (#16502) nimlgen 2026-06-04 15:26:37 +03:00
  • 2c915c61ed no CONST(DEVICE) in torch_backend (#16499) chenyu 2026-06-04 00:26:47 -04:00
  • fd13080636 deviceless const skip axis check (#16496) wozeparrot 2026-06-03 22:13:20 -04:00
  • f7f03bd7e5 viz: better name for src id in kernel_graph.txt (#16495) qazal 2026-06-04 10:09:29 +08:00
  • 9dac781e45 ci: use uv (#16492) Christopher Milan 2026-06-03 18:38:50 -07:00
  • 9fdeaa402b no anon addrspace, don't write hacks (#16491) George Hotz 2026-06-03 16:19:30 -07:00
  • 2f83d01ccf fix deviceless materialize device (#16493) chenyu 2026-06-03 19:13:21 -04:00
  • 19eb72ff60 remove use of full with buffer=False and non-None device= (#16489) chenyu 2026-06-03 16:21:24 -04:00
  • 6f2a2857c8 hcq2: refactor deps (#16490) nimlgen 2026-06-03 23:20:24 +03:00
  • 243446b44f remove CONST(DEVICE) from const_like (#16488) chenyu 2026-06-03 14:04:51 -04:00
  • cee472a0ef renderer Estimates uses maxel (#16485) George Hotz 2026-06-03 10:55:00 -07:00
  • 8a4203638a make full with buffer=False deviceless (#16483) chenyu 2026-06-03 12:35:59 -04:00
  • 405866f2b7 viz: improve kernel_graph.py usability (#16486) qazal 2026-06-03 20:12:44 +08:00
  • f43cba5765 ci: native python where possible (#16473) Christopher Milan 2026-06-02 19:40:12 -07:00
  • 7dcfd144b6 llama: columnwise fp8 scaling (#16480) wozeparrot 2026-06-02 21:55:45 -04:00
  • ffadd7a315 remove intel and amx support (#16482) George Hotz 2026-06-02 18:53:05 -07:00
  • 5f439e3b7c refactor cstyle to avoid dtype [PR] (#16478) George Hotz 2026-06-02 18:27:12 -07:00
  • 80eeb4dd21 mockgpu: use autogen.libc (#16479) Christopher Milan 2026-06-02 16:59:36 -07:00
  • a43b55d480 deviceless const folding schedule test (#16477) chenyu 2026-06-02 18:46:30 -04:00
  • 14f843737b renderer cleanups (pt 3) [PR] (#16475) George Hotz 2026-06-02 14:24:24 -07:00
  • c7b6ee0c7d dt.count more_ren_clean George Hotz 2026-06-02 13:19:04 -07:00
  • 13f5d39fcf count George Hotz 2026-06-02 13:16:42 -07:00
  • 431accc9b7 bitsize in nir George Hotz 2026-06-02 13:11:31 -07:00
  • df000116ea more renderer cleanups George Hotz 2026-06-02 13:02:28 -07:00
  • e76de41110 fixes shrink_in_render George Hotz 2026-06-02 12:53:00 -07:00
  • 5768042e3f fix after merge George Hotz 2026-06-02 12:45:51 -07:00
  • 99e37b1ee3 hcq2: deps (#16459) nimlgen 2026-06-02 22:34:25 +03:00
  • ef9c60238e Merge branch 'master' into shrink_in_render George Hotz 2026-06-02 11:20:24 -07:00
  • 82f1c983d4 clean renderer migrations [pr] (#16472) George Hotz 2026-06-02 11:19:00 -07:00
  • 9897658895 ci: fix ocelot compilation on macos (#16471) Christopher Milan 2026-06-02 09:43:31 -07:00
  • 6b7d2b91df update test_uop_graph (#16470) chenyu 2026-06-02 08:53:54 -04:00
  • 854eac09c6 llama: no E_ copy after bf16 GEMM (#16458) qazal 2026-06-02 13:14:13 +08:00
  • 5825d4d833 Merge branch 'master' into shrink_in_render George Hotz 2026-06-01 22:08:00 -07:00
  • 7d8ed8d4d7 add store to buffer's addrspace (#16468) George Hotz 2026-06-01 22:07:43 -07:00
  • 394afe40c5 fix for nir George Hotz 2026-06-01 21:46:45 -07:00