105 Commits

Author SHA1 Message Date
Christopher Milan
9a6f7f7576 nv: look for fmc firmware in /lib/firmware (#16080) 2026-05-07 18:08:27 -04:00
qazal
b63e0a5f74 viz/sqtt: move amd decoder to extra, don't import from ops_amd (#15969)
* don't import from ops_amd

* start

* cleanup
2026-04-30 00:49:15 +09:00
qazal
8c174bdad4 viz/sqtt: correct exec pipes (#15885)
* wmma

* p2

* test

* left

* work

* pickle

* handwritten failing tests

* start work

* test the pipes

* empirical evidence

* update rdna4 enum types

* VALU pipe 1

* TRANSCENDENTAL pipe

* transcendental function units

* reorder

* wmma pipe

* cleanup and notes

* smaller

* work

* diff cleanup

* pickle

* use se:1

* int
2026-04-28 05:05:49 +09:00
qazal
f379b5a40a sqtt: match amd's TS_DELTA_SHORT offset (#15901) 2026-04-24 06:41:22 +03:00
qazal
ac027055ef viz: no global state (#15705)
* start viz data

* get_full_rewrites also moves

* update ref_map

* work

* update consumers

* cleaner cli

* linter

* cleanup tests

* back

* better

* sqtt tests
2026-04-13 21:35:20 +09:00
qazal
3ac16b3bea viz: add wmma row, update exec duration logic (#15646)
* viz: split wmma to its own row, fix duration logic

* regs

* decrease number of loops, add pickle

* assert overlaps
2026-04-08 20:24:23 +09:00
qazal
266fb07721 viz: show exec duration (#15484)
* duration

* handwritten tests

* rdna3 pickle

* rdna4 pickle

* asserts

* rm that

* wmma work

* r4

* this shows the overlap well

* ohh okay it goes back

* are ds_load and ds_store different queues on RDNA4?

* print msg, v_mul_lo_u32 is 4 cycles?

* discover

* wmma something

* wmma comment

* less

* less

* better comments

* work

* inst st

* delay column

* better cli

* emit_alt

* update test_handwritten

* work
2026-03-28 22:48:59 +09:00
qazal
109472c37e sqtt: new s_barrier pickles, handle rdna4 barriers in emulator (#15437) 2026-03-24 03:25:28 +09:00
qazal
4445f50356 viz: variable duration rdna barriers (#15277)
* viz: variable length rdna barriers

* work

* tiny changes

* simple wave simd test

* small wave sync test

* good multi barrier bug find

* simple fix

* wave_sync asserts

* rdna4 work

* more rdna4

* find more bugs in my model

* it's so much simpler

* wave_sync tests duration

* r4

* should just call this rdna4
2026-03-16 06:06:19 +09:00
qazal
7b6211fdd7 sqtt: remove discover_ops script (#15279) 2026-03-15 22:17:06 +09:00
qazal
3858bfc83d sqtt: CDNA inst decodes (#15274)
* sqtt: CDNA inst decodes

* JUMP packets other way

* cdna insts

* r3

* r4

* lds from simd1 and simd2
2026-03-14 21:03:46 +09:00
qazal
83f1faa142 sqtt: update CDNA wave packet field, start unskipping tests (#15168)
* correct field names

* packet types

* packet 5 is regc

* test skips
2026-03-06 21:37:44 +09:00
qazal
33a1970045 sqtt: simplify inst mapping, validate JUMP processing in CI (#15139)
* jump cleanup

* assert there's a JUMP

* new example for JUMP

* regenerate examples

* rdna4 work

* new packets

* work

* less for branch handling

* less verbose

* fix err message
2026-03-05 09:53:12 +09:00
qazal
8dd691761d sqtt: remove old files (#15108) 2026-03-03 22:43:24 +09:00
qazal
b8a55d5f68 sqtt: new packet types, add discovery script (#14960) 2026-02-28 04:27:27 +09:00
qazal
d6145736c7 sqtt: examples generator changes from inst_discovery (#14961)
* sqtt examples generator changes from inst_discovery

* rdna4

* rdna3

* cdna

* sad reality for mi300x
2026-02-23 14:42:48 +09:00
qazal
32f569b573 viz/sqtt: decoder fixes pre rdna4/cdna4 work (#14900)
* viz/sqtt: decoder fixes pre rdna4/cdna4 work

* fix

* branch_inst + more tests

* smaller
2026-02-20 12:10:15 +09:00
George Hotz
c331798201 move tests to test/backend (#14691)
* move tests to test/backend

* fix imports

* fix CI

* revert that one

* Fix formatting in README for test command
2026-02-12 11:09:44 +08:00
nimlgen
fbeb978170 diff devices for sdma (#14589)
* start

* x

* fix

* sdma

* c

* clean

* x

* hm

* cleaer
2026-02-06 16:39:12 +03:00
qazal
965940dd00 sqtt: update examples after event field change (#14493)
* regen sqtt examples

* cdna

* rdna4

* packet counts for rdna3

* sqttmap work
2026-02-02 21:39:48 +09:00
qazal
647e527a7e viz: replace llvm disasm with our disasm (#14325) 2026-01-25 13:56:56 +09:00
qazal
f3b0e42863 remove extra sqtt pickles in gfx1200 (#14302) 2026-01-23 20:13:48 +09:00
qazal
d7afa02085 clean up the extra/sqtt directory (#14284)
* remove legacy test_timing stuff

* remove legacy test_pmc, update active_sqtt_parse
2026-01-22 19:10:59 +09:00
qazal
4548fcc1b8 amd/sqtt: add rdna4 and cdna sqtt examples (#14251)
* amd/sqtt: add rdna4 and cdna sqtt examples

* work

* comment out rdna and cdna tests
2026-01-20 21:11:48 +09:00
George Hotz
50554115ee fix VALU_SALU / IMMED_MASK and improve amd_asm_matmul (#14196)
* fix VALU_SALU / IMMED_MASK and improve amd_asm_matmul

* immed

* wave override

* restore ALT

* advance sgprs correctly

* no helpers

* decrease to 192 VGPRs
2026-01-17 11:58:34 +09:00
Christopher Milan
0cb024a5bb remove ctypes.Structure (#13651) 2026-01-15 05:06:22 -05:00
qazal
76b577ee76 viz: only SIMD name in sqtt timeline rows (#14146) 2026-01-14 20:13:27 +09:00
qazal
2917ed1616 roc: propagate decoder errors to main thread (#14081)
* roc: propagate decoder errors to main thread

* types

* add cause
2026-01-09 21:10:45 +09:00
chenyu
2e2b5fed12 fix misspellings (#13976) 2026-01-02 10:37:38 -05:00
qazal
d7e1f26e3d command line interface for sqtt viz (#13891)
* command line interface for sqtt viz

* cleanup

* api surface area

* this confuses the llms

* document
2025-12-30 12:33:21 +09:00
qazal
2180eee5e4 use the asm dsl in remu hwtest.py (#13856)
* remu hw test with the asm dsl

* simpler

* nthreads and exec mask

* cmp/cmpx

* assembler error in s_mov_b32

* vopd in dsl?
2025-12-28 11:32:41 +09:00
qazal
f6c660f7fa simplify sqtt decoder infra (#13849)
* more work

* simpler
2025-12-28 00:31:16 +09:00
qazal
a2da61d096 use new style amd compiler in viz (#13848)
* working version, handcode gfx1100 arch

* get target from device properties

* lib in cfg test program spec
2025-12-27 23:59:30 +09:00
qazal
389f01c7f4 viz: amdgpu assembly basic block graph (#13755) 2025-12-22 23:17:16 +08:00
qazal
81d9053013 roc: cast to nullptr instead of changing header (#13801) 2025-12-22 22:34:06 +08:00
qazal
019e71f8ca lds bank count tests from pmc counters (#13667)
* lds bank count tests from pmc counters

* these tests run on the RDNA3 card too

* rename duration to cycles, other rename comment

* add SQ_LDS_IDX_ACTIVE to gfx9 defaults
2025-12-13 17:39:32 +08:00
qazal
93ad1f7732 viz: readable pmc print, share unpacker with tests (#13655)
* viz: readable pmc print, share unpacker with tests

* sections

* static analyzer

* rm that
2025-12-12 19:29:59 +08:00
qazal
d7caae5f61 viz: tabulate pmc (#13574)
* viz: tabulate pmc

* linter

* enable nesting

* pmc comes before waves
2025-12-05 03:08:39 +08:00
qazal
512a8f3dd4 viz: start global memory PMC tests (#13569) 2025-12-05 00:40:27 +08:00
George Hotz
ddf3f2d0c4 rdna3 asm + zip_extract (#13499)
* rdna3 asm + zip_extract

* include sqtt

* fix end parsing

* disassembler working

* parsing fields

* instruction

* op

* more parsing
2025-12-02 22:56:01 -08:00
qazal
c65aa93081 refactor sqtt loader to enable PMC=1 SQTT=0 (#13526) 2025-12-02 22:50:38 +08:00
qazal
a5ec3b24be viz: start PMC in the counters view (#13510) 2025-12-02 00:01:57 +08:00
qazal
9023ca30ef show number of waves in each SE/CU (#13491)
* show number of waves in each SE/CU

* update to test_ones
2025-11-30 22:29:16 +08:00
qazal
d457ee0ba4 viz: correctly handle multiple sqtt traces of the same prg (#13460) 2025-11-29 20:52:41 +08:00
qazal
5520f1fb0b viz: per cu timeline (#13451)
* add cu_loc

* work

* WAVE -> W
2025-11-26 00:05:20 +08:00
qazal
2a9bd12700 sqtt: add occupancy events to the timeline (#13430) 2025-11-24 22:28:05 +08:00
qazal
712c7a6448 sqtt loader cleanups from the occupancy branch (#13431)
* cleanup err handling

* from disasms

* s/wave_execs/wave_insts
2025-11-23 21:50:34 +08:00
George Hotz
9d7a17ee39 beautiful SQTT_PARSE=1 with color (#13428)
* beautiful SQTT_PARSE=1 with color

* linter

* linter 2

* a few more labels

* filter and or

* wave alloc

* a few more
2025-11-23 01:05:14 -08:00
George Hotz
da0aa57a3b add cu parsing to attempt_sqtt_parse 2025-11-22 22:09:05 -08:00
qazal
320ed78803 can view wave timeline with SQTT_ITRACE_SE_MASK=0 (#13427) 2025-11-23 13:55:47 +08:00