101 Commits

Author SHA1 Message Date
Christopher Milan
99a0debd62 Device.count() (#15842) 2026-04-21 16:46:38 -04:00
Christopher Milan
bc180a963c deprecate <dev>=1 in favor of DEV=<dev> (#15467)
* start work on target

* add test

* update actions to use DEV

* update docs

* update readmes

* tests need that too

* update example

* update tests (comments)

* fix that test

* ruff

* mypy

* oops

* remove getenvs

* don't add Target yet

* and the test

* lint

* and docs

* more stuff

* assert

* few more fixes

* test assert
2026-03-26 03:48:03 -04:00
chenyu
1f96cc2b51 update non-contiguous buffer error message [pr] (#15131)
* update non-contiguous buffer error message [pr]

also cleaned up the tests

* order
2026-03-04 11:13:26 -05:00
chenyu
fde7a40bb0 allow dtype mismatched assign on disk (#14993)
reverted #14473, that was a bad idea. also added a test that safe_save only has copy
2026-02-24 20:49:55 -05:00
George Hotz
55d3a5def9 preallocate all realized buffers (#14823)
* preallocate all realized buffers

* contiguous

* work

* comment that out

* move to schedule

* better

* correct fix

* just buffer

* disk bufs

* fixes disk tensor stuff

* fix symbolic stuff

* fix multi

* 162 failures

* bugfixes

* don't check that anymore

* fix schedule tests

* mnist should be contiguous

* type and buffer

* fix tests

* shrink axis correction

* mypy fixes

* tests skips

* same 37 failures

* dedup

* no shrink in the graph

* 29 failures

* skips

* fix custom kernel

* fix training

* those optimizations aren't supported currently

* simpler

* more correct

* tests

* 14 failures

* works

* fix that test

* broken

* 11 failures

* only kernel counts left

* fixes

* all tests pass

* remove tensor_map

* op test

* 200 -> 230

* test fixes

* fixes

* revert test_tiny thing

* guard

* revert that

* test tiny passes

* no contigs there

* base realize back

* Revert "no contigs there"

This reverts commit c45bb9fcfd.

* revert that

* chop many assigns

* 12 failures

* fix tests

* tests

* apply after

* pre-commit

* remove old code

* delete that

* fix types

* remove extra contig

* fix dataloader

* torch fix

* disk fix

* update kernel fusion numbers

* runs on amd

* restore kernel count

* add that rule back

* that

* disable that

* wrong

* add the correct rule for that folding

* more tests

* guard c1.arg

* no newlines

* realize those

* split into a different file

* remove detach/contig back

* skip 2

* update that
2026-02-20 20:05:54 +08:00
Bautista Garcia
0f1ca8eb43 torch_load: fix shared storage slicing (#14771)
* faster zip_extract + usage in torch load

* clean zip in torch load

* working zipextract in torchload

* tar_extract in tar path

* faster tar path

* tests passing, cleanup needed

* faster tar with 1MB buffer

* comments

* unify storage_source with all paths

* use bufferedreader in zip path

* fix ruff

* clean

* removed unnecessary string conversion

* fix for tensors that share storage

* less hacky

* shared storage test

* test comment

* linter

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2026-02-16 14:30:13 +08:00
George Hotz
0e215c433d remove hack from cast (#14760)
* remove hack from cast

* skip tests

* linters to 3.12, another skip

* fix rand

* m_
2026-02-15 13:56:38 +08:00
chenyu
e9f40f49d4 explicitly check advanced setitem (#14644)
advanced setitem on DISK would fail in rangeify with a bad error, now it's checked directly in setitem. eventually DISK can use the regular setitem path
2026-02-09 13:36:46 -05:00
George Hotz
03af2404e2 small changes and test fixes from kernel is call (#14586) 2026-02-06 17:08:33 +08:00
George Hotz
3c26ce29b2 make disk tensor tests process safe (#14584) 2026-02-06 15:39:55 +08:00
chenyu
ea1f1d2b9d test_assign_to_bitcast_view (#14483)
currently disk allows assign same size dtype into a bitcasted view
2026-02-01 16:46:04 -05:00
chenyu
5705398a1f assign cleanup [pr] (#14479)
share more code path between disk and non-disk. also raise RuntimeError instead of Assert for mismatches
2026-02-01 09:10:22 -05:00
chenyu
b38fc43b07 assert assign dtype mismatch for disk [pr] (#14473)
the disk hack is generally wrong, now force bitcast on the source before assign
2026-01-31 17:08:54 -05:00
chenyu
99b44121bc failed test case for non-consecutive disk read (#14455)
silently fail now
2026-01-30 23:44:04 -05:00
chenyu
ddc041854b failed test case for disk setitem (#14426)
strided setitem is wrong
2026-01-29 14:54:19 -05:00
chenyu
c7b8f6496f remove dtypes.index_like and dtypes.fields [pr] (#14207)
barely used, so just use inline and DTYPES_DICT
2026-01-18 11:49:01 -05:00
chenyu
35c9701df0 update outdated tests and comments (#14090) 2026-01-10 01:00:48 -05:00
chenyu
92246ea731 update tests, WEBGPU=1 pytest . passes (#14089)
* update tests, `WEBGPU=1 pytest .` passes

* minor update
2026-01-10 00:03:02 -05:00
chenyu
eacccc5ace more disk assign tests (#14087)
covers more edge cases
2026-01-09 14:14:52 -05:00
chenyu
cff33c8d78 add some disk assign tests (#14085) 2026-01-09 11:50:59 -05:00
Garret Castro
16b652302e skip bf16 test if not supported by device (#14070) 2026-01-08 13:37:24 -05:00
chenyu
af0392efea only set DiskDevice.size if it opens successfully (#13962) 2026-01-01 19:33:26 -05:00
chenyu
e036d6df89 properly fix DiskDevice reuse (#13961) 2026-01-01 18:08:23 -05:00
George Hotz
aeb7516c8a tests passing on tinybox h3 (#13742) 2025-12-17 19:04:34 -04:00
George Hotz
3dbde178c1 mark slow tests as slow instead of as CI (#13736)
* mark slow tests as slow instead of as CI

* CI shouldn't have different behavior

* more skips / CI

* slow
2025-12-17 10:29:57 -04:00
wozeparrot
82f10cfe2e feat: assert on bufferview math (#12772) 2025-10-17 14:20:08 -07:00
George Hotz
af4479c169 faster stable diffusion load (#12725)
* faster stable diffusion load

* failing tests
2025-10-16 18:31:59 +08:00
qazal
4756971c88 skip test_bf16_disk_write_read on CL=1 (#12256) 2025-09-20 17:11:06 +03:00
nimlgen
9182948951 remove llvm_bf16_cast (#12075) 2025-09-08 20:51:15 +03:00
b1tg
fcbefde8f5 fix DiskDevice reuse (#11039)
* fix DiskDevice reuse

* fix mypy and DiskDevice.count

* mypy

* add test

---------

Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-07-01 10:29:21 -04:00
George Hotz
53ed64e133 ci speed work 1 (#10676)
* skip a few slow tests

* use a venv for python packages

* create venv

* no user, it's in venv

* ignore venv

* venv

* new cache key

* try that

* this

* version the python cache
2025-06-07 16:33:11 -07:00
qazal
9a9aba4cd5 setitem tests (some failing) from kernelize (#9940) 2025-04-20 18:47:55 +08:00
George Hotz
8919370c76 hotfix: fix test_save_all_dtypes on METAL 2025-04-18 08:42:31 +01:00
hooved
136cf7b8b1 hotfix: load >2 GiB from disk on macOS (#9361)
* enable loading >2 GiB buffer from disk on macOS

* handle None case raised by mypy

* add test

* revert fix to repro bug in CI

* tell CI to run a unit test for macOS

* reapply fix
2025-03-07 14:51:58 +08:00
chenyu
2e7c2780a9 CLANG -> CPU (#9189) 2025-02-20 18:03:09 -05:00
George Hotz
af2c2837f6 hotfix: skip broken test, add KERNEL Op 2025-02-03 14:02:55 +08:00
Ahmed Harmouche
07d3676019 weights_only=False (#8839) 2025-01-31 17:16:47 -05:00
George Hotz
80089536e5 Revert "move llvm_bf16_cast to renderer for CLANG and LLVM [pr] (#8720)" (#8786)
This reverts commit af0452f116.
2025-01-28 18:59:02 +09:00
mesozoic-egg
af0452f116 move llvm_bf16_cast to renderer for CLANG and LLVM [pr] (#8720)
* handle bf16 via bitcasting for CLANG and LLVM

* On LLVM, skip float16 cast

* float32 on llvm lite, float32 elsewhere

* code format

* trigger pr

* move to rewriter

---------

Co-authored-by: Mesozoic Egg <mesozoic.egg@proton.mail>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-01-28 18:16:43 +09:00
George Hotz
018edd934b don't use view in copy [pr] (#8704)
* don't use view in copy [pr]

* oh, remove double contig

* fix reps
2025-01-21 09:57:47 -08:00
qazal
2b7db9b45d delete unused cast/bitcast lines from ops.py [pr] (#8651)
* move cast and bitcast out

* more deletion of bitcast arg

* fix test_bitcast_fuses

* update tests

* work
2025-01-17 03:04:18 -05:00
chenyu
40a4c603b9 remove more test skip for webgpu [pr] (#8192) 2024-12-12 14:06:35 -05:00
qazal
df84dc6444 unrelated test fixups from delete_lazy [pr] (#8088)
* unrelated test fixups from delete_lazy [pr]

* fine if it's scheduled later
2024-12-06 17:31:02 +02:00
George Hotz
df18e7cc37 accept filename decorator [pr] (#8049)
* accept filename decorator [pr]

* add test for safe_load

* bring old tar tests back
2024-12-05 11:40:59 +08:00
leopf
f0401e14e8 tar_extract with Tensors (#7853)
* initial

* USTAR, PAX and GNU support + testing

* from_bytes byteorder

* use TarInfo.frombuf

* tensor only usage

* remove contextlib.suppress

* shorter ow,pax

* more tests

* testing length + move tests

* cleanup

* new approach: RawTensorIO

* fix fetch

* enable read test

* cleanup and ignore fix

* fix for python < 3.12

* make it RawIO

* functions

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-04 17:03:19 +08:00
George Hotz
205befa788 move is_dtype_supported to device [pr] (#7575) 2024-11-07 20:38:03 +08:00
George Hotz
be64ac417e move GGUF test to its own file [pr] (#7208)
* move GGUF test to its own file [pr]

* skip tests if modules aren't installed
2024-10-22 13:24:55 +08:00
chenyu
f37e6b453b load_gguf -> gguf_load in doc and test (#7199) 2024-10-21 14:03:33 -04:00
leopf
815e1a340c GGUF Cleanup - raise if type is not supported (#7194)
* raise if ggml type is unsupported

* test raise
2024-10-21 11:32:11 -04:00
leopf
87877d7a91 GGUF cleanup (#7192)
* cleanup

* remove vocab size hard code
2024-10-21 10:44:54 -04:00