tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-06-13 00:15:35 +08:00

Author	SHA1	Message	Date
George Hotz	f5467cfedc	Devicebufferless (#708 ) * runs one metal kernel * conv2d works * ops tests are passing * const folding * all ops work * pre commit always passes * torch works * working still * fix graph test * tests passing * image almost works * image conv works * most images * fix custom * fix assignment * fix compile enet * clean up comments * fix realize return value * include shapetracker in LB repr * copy should make a copy * reenable method cache * fix lna * dtypes in graph * forward only for IMAGE=2 * simple realize * getting close * fixup new api, it's good except the kernel count * back to 197 kernels * tests should pass * go to a real float * no type_on_cpu * fix the docs * put shapetracker back in it's proper place	2023-03-18 14:40:23 -07:00
George Hotz	d8dda2af3a	openpilot fixups	2023-03-06 14:14:44 -08:00
George Hotz	382f346523	clean up opt (#649 ) * clean up opt * don't let global kernels get too small * 8192 -> 1024 * disable local shape for clang * fix can_merge * unroll the 5x5 depthwise convs in op * load float4 check	2023-03-05 20:49:36 -08:00
George Hotz	c53efb3635	optimize for CL (#633 ) * required opt * simplify * works * shift_to_last * required is fine * print shape in colored * better shape * args was wrong * debugs * fix empty shape * colored shape printer	2023-03-03 22:00:09 -08:00
George Hotz	1a84976d4d	fix thneed gflops	2023-03-03 16:52:59 -08:00
George Hotz	b9ce20c374	openpilot test wasn't running, factor out image idx	2023-03-03 07:41:53 -08:00
George Hotz	2e26286294	speed like you wouldn't believe (#626 ) * speed like you wouldn't believe * fix tests	2023-03-02 07:49:19 -08:00
George Hotz	bfcec234a2	Refactor ASTs (#622 ) * ugh worst branch name * compiler refactor continues * scc -> cloc * buf -> _buf * finish _buf, and program -> runtime * gpu is still working, clang isn't * clang in new style * ops_metal * something broke it * improve metal * clean up tons of cl crap * hack fix sync * cleaner gpu * gpu metal clang * cleanups * minor refactor * GPUCodegen * fix up LLVM * blind CUDA refactor * codegen / runtime * keep ops naming * linter passes * woah, llvm was allocing 4x what it needed to * bugfixes * fix openpilot compiler * fix compile_efficientnet * method cache should fix tests * deal with duped functions	2023-03-01 18:57:29 -08:00
voidz	94bec40110	moved extras/jit.py -> tinygrad/jit.py (#599 ) * moved extras/jit.py to tinygrad/jit.py * fixed indent * removed tinygrad.helpers.DEBUG from jit.py	2023-02-25 08:32:33 -08:00
George Hotz	d3029c91c5	no rng for op test	2023-02-24 00:23:20 -08:00
George Hotz	661812ffef	don't ignore type	2023-02-23 19:38:52 -08:00
George Hotz	8b0082540b	openpilot compile cleanups	2023-02-20 09:16:03 -08:00
George Hotz	de71c13934	test speed v torch uses jit	2023-02-12 07:43:17 -08:00
George Hotz	031edd01e6	switch openpilot compile to TinyJit	2023-02-11 09:51:44 -08:00
George Hotz	3d63934995	refactor to keep cl in the runtime (#545 ) * refactor to keep cl in the runtime * fix thneed, rename cl to _cl * bugfix + _cuda * fix tests * thneed more correct	2023-02-08 16:46:09 -06:00
Jacky Lee	799b3f185a	Refactor getenv into helpers (#508 ) * Refactor getenv into helpers * Remove unused os * Fix default value * Fix more defaults for CI * Fix bracket * Revert changes to openpilot/compile.py * Use getenv from helpers when possible	2023-01-31 15:09:09 -08:00
George Hotz	92001a06e1	openpilot/go.sh	2023-01-28 13:57:43 -08:00
George Hotz	6d7658db12	delete opencl <celebration>	2023-01-24 14:18:35 -08:00
George Hotz	e313c8af20	update openpilot tests from OPENCL to GPU	2023-01-24 14:05:20 -08:00
George Hotz	281b0db773	three from image	2023-01-12 12:26:58 -08:00
George Hotz	4885fce56e	shapetracker from newgpu (#456 ) * shapetracker from newgpu * touchup ops * test * testst * thneed deletes unused inputs * test * bugfix	2023-01-09 12:40:01 -08:00
George Hotz	e6b65f8e01	fix graph in openpilot/compile.py	2022-10-28 08:55:34 -07:00
George Hotz	ef62db3186	cleanups, remove E701	2022-10-28 08:28:56 -07:00
George Hotz	b65b70812a	Exec AST (#404 ) * working exec ast * exec_ast is staticmethod * GenericExecAST * fold that sometimes * ExplicitExecAST * exec_ast for GPU * gpu working * get_lazyop_shape * now gpubuffer is ExplicitExecAST * dedup * add a type * RESHAPE in opencl code * fix linter * that too for linter * cleanups * remove dead code * GenericShape is less lines * add ALLOWED_KERNEL_COUNT to tests * fix mypy * that's gotta be recursive * fix opencl shape processing * remove unneeded lambda	2022-10-28 08:27:03 -07:00
George Hotz	6a8fb53304	move ops.py into lazy.py (#402 ) * move ops.py into lazy.py * fix graph and linter * ugh, didn't add	2022-10-25 13:58:03 -07:00
George Hotz	3b9b7eda48	remove run_thneed dead code	2022-10-20 17:24:18 -07:00
George Hotz	1bec4651b3	fix nonstatic weights	2022-10-20 17:04:14 -07:00
George Hotz	50c95c7d9a	add assert to catch issue in attention	2022-10-20 15:13:00 -07:00
George Hotz	26c78ccf7d	remove useless buffer	2022-10-20 14:07:28 -07:00
George Hotz	a18c1f3178	zero out the inputs	2022-10-20 13:46:52 -07:00
George Hotz	61ee428e4c	rerun	2022-10-20 13:29:14 -07:00
George Hotz	5dae64b7b0	read input shapes and break down the layers	2022-10-20 13:11:24 -07:00
George Hotz	e00601faea	fix thneed self test	2022-10-20 12:55:02 -07:00
George Hotz	ace8db29f8	ReduceSum	2022-10-20 12:48:14 -07:00
George Hotz	c400ee0beb	refactoring thneed (#400 ) * refactoring thneed * continue * minor update * looks like it's working * big refactor * confirm thneed got the right output * code is there but it's broken * works now * always OPTWG, input -> dat * fix type issue	2022-10-20 12:35:59 -07:00
YassineYousfi	ae0f9b17df	openpilot: new models and onnx ops (#401 ) * ngrl stuff * fngrl * fix typo in compile script * workflow dispatch * new models in tests * dont need to up this threshold Co-authored-by: HaraldSchafer <harald.the.engineer@gmail.com>	2022-10-20 11:49:19 -07:00
George Hotz	d6f499fd69	improve opencl, why is it OOMing	2022-09-05 20:14:31 -07:00
George Hotz	2e9b7637b3	don't save input buffers	2022-08-31 15:37:38 -07:00
George Hotz	a3fc64a585	fix batchnorm folding in openpilot compile	2022-08-31 13:04:49 -07:00
Comma Device	a734df98fa	TEST_ENET for openpilot compiler	2022-08-31 13:23:36 -04:00
George Hotz	d919ac32af	fix wrong size input	2022-08-31 09:07:34 -07:00
George Hotz	040640a580	fix cl import error	2022-08-31 08:43:44 -07:00
George Hotz	33ac355bcd	still broken	2022-08-29 19:08:07 -07:00
George Hotz	5efab7cf1d	add reciprocal	2022-08-29 18:00:24 -07:00
George Hotz	880707f2d2	no torch test if no torch	2022-08-29 15:29:19 -07:00
George Hotz	5eba228844	print inputs	2022-08-29 08:56:04 -07:00
George Hotz	dd587d26e3	oops, compare with abs	2022-08-28 11:23:21 -07:00
George Hotz	dc7af8c3ac	thneed run float32	2022-08-28 11:03:35 -07:00
Comma Device	f0d11f29c7	float32 in image desc	2022-08-28 08:47:43 -07:00
George Hotz	11626053b0	run_thneed with test	2022-08-22 09:45:46 -07:00

71 Commits