qazal
|
bfb2d1f89a
|
Revert "fp8 gemm speedup (#16236)" (#16245)
This reverts commit d95bf394e1.
|
2026-05-19 02:01:44 +09:00 |
|
qazal
|
d95bf394e1
|
fp8 gemm speedup (#16236)
* add asm_gemm option
* milestone
* work
* edit
* only the fast kernel
* diff
|
2026-05-17 04:58:28 +09:00 |
|
wozeparrot
|
528d35e306
|
llama speed 4 (#15993)
|
2026-04-30 17:14:41 -07:00 |
|
chenyu
|
9192c93b7e
|
Tensor.invalid -> Tensor.invalids (#15849)
matches ones and zeros, and to not share name with UOp.invalid
|
2026-04-21 11:19:51 -04:00 |
|
wozeparrot
|
9e60e4a7e7
|
llama: native fp8 (#15733)
|
2026-04-16 22:16:05 -07:00 |
|
wozeparrot
|
457508d5a0
|
llama: save more 2 (#15681)
|
2026-04-11 01:03:36 -07:00 |
|
wozeparrot
|
7e54992bf6
|
fp8 llama (#15588)
Co-authored-by: qazal <qazal.software@gmail.com>
|
2026-04-04 18:24:57 -07:00 |
|
Christopher Milan
|
645d45d968
|
DEV has arch (#15577)
Co-authored-by: Comma Device <device@comma.ai>
|
2026-04-03 19:17:19 -04:00 |
|
qazal
|
8feb8edc68
|
gemm/asm: add fp8 support to cdna asm_gemm (#15542)
* work
* hmm, mixins
* rhs_transposed
* also fix the dtype
* check for hipcc
* Exception
* select dev
* default
|
2026-03-31 19:32:54 +09:00 |
|
George Hotz
|
6e196195d8
|
add test for flat llama (#15327)
* add test for flat llama
* simpler
* back to split w1/w3
* env
* still too much ram
* invalid
|
2026-03-18 15:16:33 +08:00 |
|
wozeparrot
|
be23772d43
|
llama3 fixes part2 (#15150)
|
2026-03-04 23:43:50 -08:00 |
|
wozeparrot
|
4e9b85ecfd
|
fa: pull inputs out of call (#15127)
|
2026-03-04 03:15:49 -08:00 |
|
George Hotz
|
8ebd24637b
|
fix fa forward building with clang 22 (#15124)
* fix fa forward building with clang 22
* fix: override rocm path
---------
Co-authored-by: Woze Parrot <wozeparrot@gmail.com>
|
2026-03-04 02:32:25 -08:00 |
|
wozeparrot
|
df23057984
|
fa: change bwd grid dim + unshuffle using mops (#15068)
|
2026-03-04 01:23:40 -08:00 |
|
wozeparrot
|
25565b2410
|
fa: test for mp (#14907)
|
2026-02-22 21:47:36 -08:00 |
|
wozeparrot
|
9317e96881
|
fa: explicitly pass shapes (#14857)
|
2026-02-19 05:26:16 -08:00 |
|
wozeparrot
|
45aebe1572
|
hipkittens fa backward (#14723)
|
2026-02-16 00:38:44 -08:00 |
|
wozeparrot
|
0613c0ac0c
|
hipkittens fa forward (#14692)
|
2026-02-12 20:16:43 -08:00 |
|