Commit Graph

  • b4d267dfd4 llama: only save when small (#16208) wozeparrot 2026-05-14 20:46:29 -04:00
  • ffa1aac7b1 gradient for STORE/AFTER ala clone (#16205) chenyu 2026-05-14 20:17:27 -04:00
  • 770dac0e0d broadcast broadcast_shape_2 George Hotz 2026-05-14 17:04:37 -07:00
  • b827858479 broadcast shape George Hotz 2026-05-14 17:01:20 -07:00
  • 09096ea565 test_gradient_through_clone (#16203) chenyu 2026-05-14 19:26:47 -04:00
  • d4dcd8487b aggressive shape check to prepare for broadcasting (#16202) George Hotz 2026-05-14 16:15:44 -07:00
  • 83ec66da34 fix a fastdiv edge case (#16199) George Hotz 2026-05-14 13:12:18 -07:00
  • ae6c2d2b96 remove dtype.vec(2) from image always novecimg George Hotz 2026-05-14 19:34:04 +00:00
  • 62ea73719d hcq2: share more with graph (#16196) nimlgen 2026-05-14 22:28:11 +03:00
  • 3b8cc31759 disable fast idiv by default, it's broken (#16197) George Hotz 2026-05-14 11:48:27 -07:00
  • 8f811649ff better compiler_cpu invalid arch errors (#16194) Christopher Milan 2026-05-14 11:36:14 -07:00
  • f03a7fd6d1 viz/cli: readable uop json (#16195) qazal 2026-05-14 15:33:10 +03:00
  • 1b779a9058 add gelu approximate="none" (match pytorch) (#16162) C T 2026-05-14 04:53:24 +03:00
  • dd9187d9ee minor hash cleanups (#16190) chenyu 2026-05-13 20:59:24 -04:00
  • 88ac2ac1fd llama: cleanups (#16189) wozeparrot 2026-05-13 20:08:06 -04:00
  • 9a365d9978 ci: fix null image tests (#16188) Christopher Milan 2026-05-13 15:00:05 -07:00
  • ad1fb7c981 hcq2: graph (#16186) nimlgen 2026-05-13 22:49:43 +03:00
  • 3f9f6a51b2 minor image_conv2d cleanup (#16187) chenyu 2026-05-13 15:47:40 -04:00
  • 59c34b9fe0 llm: precise device (#16159) b1tg 2026-05-13 12:16:42 +08:00
  • 3c806ff406 clean up gguf (#16160) b1tg 2026-05-13 12:16:10 +08:00
  • e97f2c1114 llama: only gemm + fa custom kernel (#16180) wozeparrot 2026-05-13 00:03:49 -04:00
  • 38d407fd58 simplify svd more (#16181) chenyu 2026-05-12 23:48:22 -04:00
  • 42c1f4d5b6 fixes of load alt value earlier_gater George Hotz 2026-05-12 20:44:22 -07:00
  • f1fdd2ccec ci: add IMAGE=1 compile-only tests (#16182) Christopher Milan 2026-05-12 20:40:32 -07:00
  • e36288d047 Merge branch 'master' into earlier_gater George Hotz 2026-05-12 20:27:59 -07:00
  • faf7fb7513 update nir renderer for new image style (#16179) George Hotz 2026-05-12 20:25:01 -07:00
  • 7d0c5ab689 ci: ocelot needs nvcc on linux (#16178) Christopher Milan 2026-05-12 20:13:48 -07:00
  • 41c14d3558 Merge branch 'master' into earlier_gater George Hotz 2026-05-12 19:36:50 -07:00
  • eaa362f49e even earlier George Hotz 2026-05-12 19:32:14 -07:00
  • 32138c2418 svd to mixin (#16175) chenyu 2026-05-12 22:29:01 -04:00
  • ebf45b4f34 move gater earlier George Hotz 2026-05-12 19:26:36 -07:00
  • 69e1f3b551 remove vec2 from image in gater (#16165) George Hotz 2026-05-12 19:25:52 -07:00
  • 2172363be5 don't use Tensor indexing in svd (#16174) chenyu 2026-05-12 21:56:19 -04:00
  • 420a08c6d1 qr to mixin (#16173) chenyu 2026-05-12 21:23:25 -04:00
  • c6a82fe927 functional qr and svd (#16172) chenyu 2026-05-12 19:12:08 -04:00
  • 3844a31f87 ci: untangle cuda/ocelot, less apt (#16171) Christopher Milan 2026-05-12 15:14:03 -07:00
  • 316607f004 dsp: don't use docker in ci (#16167) Christopher Milan 2026-05-12 14:11:03 -07:00
  • bdcdf1f1a1 jittable masked_select and nonzero (#16170) chenyu 2026-05-12 16:39:36 -04:00
  • a613bcfc6d allow after on contiguous in spec (#16169) wozeparrot 2026-05-12 16:11:44 -04:00
  • 7c3e3fa154 fix empty input for masked_select and nonzero (#16168) chenyu 2026-05-12 15:36:51 -04:00
  • da3b7e89a4 atol in test_custom_kernel_multi_output_backward_interacting (#16166) chenyu 2026-05-12 14:42:12 -04:00
  • 25583f6dc1 fix cumsum dtype for 0d input (#16164) chenyu 2026-05-12 14:18:08 -04:00
  • 64c81dfd24 add all codegen stages to spec_tensor (#16163) George Hotz 2026-05-12 10:35:38 -07:00
  • 8612385ccb add all codegen stages to spec_tensor fix_src_spec George Hotz 2026-05-12 10:23:03 -07:00
  • f3e3c3851f explicit args to Tensor.rand (#16161) chenyu 2026-05-12 12:53:39 -04:00
  • e93fb5f9b9 hcq2: remove hcqprogram (#16157) nimlgen 2026-05-12 18:49:13 +03:00
  • a708542308 fix ci spec (#16156) nimlgen 2026-05-12 17:57:11 +03:00
  • e5729935c6 time_call (#16152) nimlgen 2026-05-12 16:58:28 +03:00
  • fe39cf148a add Ops.SOURCE test (#16155) qazal 2026-05-12 16:49:32 +03:00
  • 5cd0494b14 viz: canonicalize ast for schedule to codegen linking (#16154) qazal 2026-05-12 16:40:21 +03:00
  • c1d125ff3b llm: add markers to --benchmark (#16153) qazal 2026-05-12 14:14:11 +03:00
  • e9359d9e7d more llama mp fixes (#16151) wozeparrot 2026-05-12 00:29:23 -04:00
  • 09fd80fba6 fix randperm and _multi_like drop requires_grad (#16150) chenyu 2026-05-11 23:23:34 -04:00
  • 8294d105a7 Update the spec in spec.py to match the current state (#16132) George Hotz 2026-05-11 20:07:47 -07:00
  • 3942a80f66 fix wrong kwargs passed into rands (#16149) chenyu 2026-05-11 22:22:06 -04:00
  • 039d84ff02 Revert "onnx: deduplicate simple proto parsers" (#16148) Christopher Milan 2026-05-11 18:45:17 -07:00
  • 20f587d5d5 nv: rm _download (#16147) Christopher Milan 2026-05-11 16:56:37 -07:00
  • 371ab2023f clean up image_dot and image_conv2d (#16145) chenyu 2026-05-11 19:37:58 -04:00
  • effa263865 Torch backend aten::cat.out fix (#16121) Vikram Rangarajan 2026-05-11 19:28:16 -04:00
  • 63c1f00b80 disable test_svd_general again (#16146) chenyu 2026-05-11 19:24:32 -04:00
  • 2dccd4a3eb am: autogen pmc (#16143) Christopher Milan 2026-05-11 16:22:12 -07:00
  • 7ba55ad3ba nv: autogen regs (#16139) Christopher Milan 2026-05-11 15:52:24 -07:00
  • 0b02fb6797 Revert "[pr] match torch rmsnorm (#16122)" (#16144) chenyu 2026-05-11 17:53:42 -04:00
  • fbe8be0b8b style cleanup to Tensor.qr and svd (#16142) chenyu 2026-05-11 17:16:59 -04:00
  • fc2cc1d77a viz: call graph renderer example (#16141) qazal 2026-05-11 23:07:30 +03:00
  • f65e343fb3 spec.py cleanups (#16140) chenyu 2026-05-11 15:59:49 -04:00
  • 692257dd70 [pr] match torch rmsnorm (#16122) Joshua James Venter 2026-05-11 20:36:41 +02:00
  • 01848d1e17 feat: results mlperf_training_v6.0 Woze Parrot 2026-05-11 17:56:29 +00:00
  • 9938b5da8b feat: training 6.0 Woze Parrot 2026-05-11 16:22:47 +00:00
  • 59a81559d4 fix: add self.device to qr, svd, masked_select intermediates (#16131) Sachith Shetty 2026-05-11 08:22:54 -07:00
  • 70c2480e71 hcq2 to extra (#16126) nimlgen 2026-05-11 17:17:30 +03:00
  • ad9738892c get_buf() for Buffer (#16134) nimlgen 2026-05-11 16:36:14 +03:00
  • 2dd84416bf viz/cli: schedule renderer (#16101) qazal 2026-05-10 19:56:16 +03:00
  • 53f9587099 add canary George Hotz 2026-05-10 09:38:18 -07:00
  • 28cb7f1bcc update readme with contributing guidelines George Hotz 2026-05-10 09:35:48 -07:00
  • daed602569 rename BUFFERIZE to STAGE (#16125) George Hotz 2026-05-10 09:26:46 -07:00
  • 39ce780907 viz/cli: emit all runs of selected kernel, json fixes (#16124) qazal 2026-05-10 15:45:51 +03:00
  • 51c7dafb0d split viz cli test helpers (#16123) qazal 2026-05-10 13:42:24 +03:00
  • b2a682ec60 remove _shape check in pm_mops [pr] (#16120) chenyu 2026-05-09 17:54:22 -04:00
  • 026688f03f llama: move to correct dir (#16118) wozeparrot 2026-05-08 22:42:16 -04:00
  • a7512e0d12 PYTHON: images have no alignment constraints (by default) (#16115) Christopher Milan 2026-05-08 17:35:03 -07:00
  • b910f1d5c0 something image_no_vec George Hotz 2026-05-08 17:32:30 -07:00
  • e14b2b41c6 move image index George Hotz 2026-05-08 17:27:01 -07:00
  • 105b037c3c cl: image alignment in arch (#16106) Christopher Milan 2026-05-08 16:33:33 -07:00
  • bf05a2762e Merge branch 'master' into image_no_vec George Hotz 2026-05-08 16:32:08 -07:00
  • 71a8c0da09 fix: trailing space format string (#16005) Charlie Kerfoot 2026-05-08 18:31:10 -05:00
  • 4dd6ad3514 gradient: add TRUNC backward (#15925) Pawan 2026-05-09 04:57:55 +05:30
  • 5152ff95e7 _pad_constant and avg_pool2d cleanups (#16110) chenyu 2026-05-08 18:09:47 -04:00
  • 08747264cf fixes George Hotz 2026-05-08 11:07:09 -07:00
  • f68c224b71 don't use vec(2) for image index George Hotz 2026-05-08 10:52:24 -07:00
  • e6584532f4 minor elementwise cleanups (#16102) chenyu 2026-05-08 13:38:34 -04:00
  • 49b55af619 jit: simpler free_intermediates (#16099) nimlgen 2026-05-08 19:08:33 +03:00
  • 0f46c08582 div mixin cleanups (#16100) chenyu 2026-05-08 12:05:37 -04:00
  • 235044c9d8 Ops.IDIV -> Ops.CDIV, Ops.MOD -> Ops.CMOD (#16093) chenyu 2026-05-07 23:18:15 -04:00
  • faabe6aa42 nv: remaining firmware from /lib/firmware (#16088) Christopher Milan 2026-05-07 20:07:43 -07:00
  • 7ef901a81d llm: moe speedup (#16059) b1tg 2026-05-08 10:06:35 +08:00
  • 80da8a4b9c add spec to main tinygrad repo (#16092) George Hotz 2026-05-07 18:52:49 -07:00
  • 83eaefcd0f onnx: deduplicate simple proto parsers (#16085) June 2026-05-07 18:44:27 -07:00
  • c106c73e51 remove the gate from index (#16081) George Hotz 2026-05-07 18:42:00 -07:00
  • d11f4d0ec2 fix: don't copy on slice of DP weight (#16089) wozeparrot 2026-05-07 20:58:01 -04:00