mirror of
https://github.com/tinygrad/tinygrad.git
synced 2026-06-13 08:28:55 +08:00
* Add mmapeak implementation for 7900 XTX
* Change indentation
* Use a template instead of multiple assembly files
* Fix output formatting
* Reduce register file bank conflicts
* More accurate measurement for quick instructions
* Add support for gfx1201
* RDNA4 wmma requires fewer VGPRs
* RDNA4 does not have s_cmpk instructions
* Add v_wmma_i32_16x16x32_iu4 for gfx1201
* Add sparse wmma instructions
* split to tinybox red MLPerf Benchmark

---------

Co-authored-by: Panagiotis Kourouklidis <panagiotis.kourouklidis@gmail.com>