mirror of
https://github.com/tinygrad/tinygrad.git
synced 2026-06-11 23:46:02 +08:00
* draft * cleanup test_encodings * cleanup test_isel * model flag state and support rematerialization * woops * add vbroadcastss instruction * don't fuse load if used multiple times in src * add movabs instruction and fix idiv * fixes * add x86 backend to tests * float16 fix * rm TwoAddress2nd * add BARRIER * test windows ci * yup isel fixes the mask stuff too and its beautiful * add cmoves to the spec * support storing imms * no TUPLE_ORDER, breaks tests * fix remaining seg faults * add float max * always fuse index * minor * fix DEFINE_VAR/SPECIAL and enable multithreading * linter * more linter * more * more * more * let's try this * perhaps * start new scheduler * more scheduling info * cleaner shuffle functions * fixup isel tests * skip bounds check when NOOPs exist * skip inf rewrite tests * fix const tag hack and add x86ops to _shape * fix * skip a few tests * func arg order independent from op value * x86 goes in own linearize * switch to PARAM * more * add min x86op and neg in decomps * do mulacc in isel * use def_reg in test_encodings * enable emulated int64 tests * how much does this fix * Ops becomes OpType * fix * rm noqa * rm machine scheduler stuff * and this * allow for extending enums and move X86Ops out of uop * fix imports * rm X86GroupOp from ops.py * spacing * tell mypy to shut up * more linter * add x86op test * allow set[X86Ops] in upat * move NOOPs to pre_isel_matcher and rm NOOP from spec * more asserts * also this * cleanup encode * simplify live range * fix idiv * add Ops.INS to x86 * more changes * more changes * more changes * fix * fix * fix * fix * print formatted assembly * fix 8bit idiv? 
* oops * enable float16 and unaligned vector load/store * actually no * move x86 tests * no more bool cast * fix * linter * linter * move X86Ops to x86.py * fix vpbroadcast * cleanups * linter * print correct reg names * canonical max * move max/min and add test * support float16 vector load/store * rm bad rewrite * vpsrldq can't access memory * regalloc takes renderer * enable vector load/store on all dtypes * more isel tests * rm this for now * a lot better * fix * fix * fix * deal with flags correctly * fix * enable gep noop rule * fix * fix * fix * add callee saved registers * use Ops.CONST instead of X86Ops.IMM * fix * enable TUPLE_ORDER * fix * rm x86 code in linearizer * fix * fix * fix * move isa rewrites to codegen * fix * fix * skip test_linearizer.py * skip more tests * fix * fix for idiv/mod changes * fix * don't use fmadd if it duplicates fused op * hacky * fix * cleanups * cleanups * fix --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
27 lines
1.1 KiB
Python
import unittest, io
from contextlib import redirect_stdout
from tinygrad import Tensor, Device
from tinygrad.helpers import Target
from tinygrad.renderer.nir import LVPRenderer
from tinygrad.renderer.isa.x86 import X86Renderer
from tinygrad.codegen import to_program

@unittest.skipIf(Device.DEFAULT != "CPU", "only run on CPU")
class TestCPU(unittest.TestCase):
  def test_arch_feats(self):
    ast = (Tensor.empty(16) + Tensor.empty(16)).schedule_linear().src[-1].src[0]
    for ren in Device[Device.DEFAULT].renderers:
      for arch, expect_vmov in [("x86_64,x86-64,avx", True), ("x86_64,x86-64,-avx", False)]:
        with self.subTest(arch=arch):
          if ren is X86Renderer: continue # X86 requires avx support
          if ren is LVPRenderer: continue # LVP does not play nice with cross compilation
          r = ren(Target(device="CPU", arch=arch))
          p = to_program(ast, r)
          lib = r.compiler.compile(p.src[3].arg)
          out = io.StringIO()
          with redirect_stdout(out): r.compiler.disassemble(lib)
          self.assertEqual("vmov" in out.getvalue(), expect_vmov, out.getvalue())

if __name__ == '__main__':
  unittest.main()
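The core check in this test is to capture whatever the disassembler prints and then look for a mnemonic substring. That check can be tried in isolation, without tinygrad installed. The sketch below is only an illustration under stated assumptions: `fake_disassemble` and the assembly strings it prints are hypothetical stand-ins for `r.compiler.disassemble(lib)`, and `captured_has_vmov` is a helper name invented here. Only `io.StringIO` and `contextlib.redirect_stdout` come from the test itself.

```python
import io
from contextlib import redirect_stdout

def fake_disassemble(use_avx: bool) -> None:
  # Hypothetical stand-in for a compiler's disassemble(): it prints assembly to stdout.
  # With AVX the VEX-encoded vmovups is printed, otherwise the SSE movups.
  print("vmovups (%rdi), %xmm0" if use_avx else "movups (%rdi), %xmm0")

def captured_has_vmov(use_avx: bool) -> bool:
  # Redirect stdout into a buffer so the printed text can be inspected afterwards.
  out = io.StringIO()
  with redirect_stdout(out): fake_disassemble(use_avx)
  return "vmov" in out.getvalue()

if __name__ == '__main__':
  for use_avx in (True, False):
    print(use_avx, captured_has_vmov(use_avx))
```

Matching on the substring `vmov` rather than a full instruction keeps the check independent of which specific AVX move the compiler happens to emit, which appears to be why the test compares against `"vmov" in out.getvalue()`.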