mirror of
https://github.com/tinygrad/tinygrad.git
synced 2026-06-13 00:15:35 +08:00
* draft
* cleanup test_encodings
* cleanup test_isel
* model flag state and support rematerialization
* woops
* add vbroadcastss instruction
* don't fuse load if used multiple times in src
* add movabs instruction and fix idiv
* fixes
* add x86 backend to tests
* float16 fix
* rm TwoAddress2nd
* add BARRIER
* test windows ci
* yup isel fixes the mask stuff too and its beautiful
* add cmoves to the spec
* support storing imms
* no TUPLE_ORDER, breaks tests
* fix remaining seg faults
* add float max
* always fuse index
* minor
* fix DEFINE_VAR/SPECIAL and enable multithreading
* linter
* more linter
* more
* more
* more
* let's try this
* perhaps
* start new scheduler
* more scheduling info
* cleaner shuffle functions
* fixup isel tests
* skip bounds check when NOOPs exist
* skip inf rewrite tests
* fix const tag hack and add x86ops to _shape
* fix
* skip a few tests
* func arg order independent from op value
* x86 goes in own linearize
* switch to PARAM
* more
* add min x86op and neg in decomps
* do mulacc in isel
* use def_reg in test_encodings
* enable emulated int64 tests
* how much does this fix
* Ops becomes OpType
* fix
* rm noqa
* rm machine scheduler stuff
* and this
* allow for extending enums and move X86Ops out of uop
* fix imports
* rm X86GroupOp from ops.py
* spacing
* tell mypy to shut up
* more linter
* add x86op test
* allow set[X86Ops] in upat
* move NOOPs to pre_isel_matcher and rm NOOP from spec
* more asserts
* also this
* cleanup encode
* simplify live range
* fix idiv
* add Ops.INS to x86
* more changes
* more changes
* more changes
* fix
* fix
* fix
* fix
* print formatted assembly
* fix 8bit idiv?
* oops
* enable float16 and unaligned vector load/store
* actually no
* move x86 tests
* no more bool cast
* fix
* linter
* linter
* move X86Ops to x86.py
* fix vpbroadcast
* cleanups
* linter
* print correct reg names
* canonical max
* move max/min and add test
* support float16 vector load/store
* rm bad rewrite
* vpsrldq can't access memory
* regalloc takes renderer
* enable vector load/store on all dtypes
* more isel tests
* rm this for now
* a lot better
* fix
* fix
* fix
* deal with flags correctly
* fix
* enable gep noop rule
* fix
* fix
* fix
* add callee saved registers
* use Ops.CONST instead of X86Ops.IMM
* fix
* enable TUPLE_ORDER
* fix
* rm x86 code in linearizer
* fix
* fix
* fix
* move isa rewrites to codegen
* fix
* fix
* skip test_linearizer.py
* skip more tests
* fix
* fix for idiv/mod changes
* fix
* don't use fmadd if it duplicates fused op
* hacky
* fix
* cleanups
* cleanups
* fix

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>