### Welcome to the tinygrad documentation General instructions you will find in [README.md](https://github.com/geohot/tinygrad/blob/master/README.md) [abstraction.py](https://github.com/geohot/tinygrad/blob/master/docs/abstractions.py) is a well documented showcase of the abstraction stack. There are plenty of [tests](https://github.com/geohot/tinygrad/tree/master/test) you can read through [Examples](https://github.com/geohot/tinygrad/tree/master/examples) contains tinygrad implementations of popular models (vision and language) and neural networks. LLama, Stable diffusion, GANs and Yolo to name a few ### Environment variables Here is a list of environment variables you can use with tinygrad. Most of these are self-explanatory, and used to enable an option at runtime. Example : `GPU=1 DEBUG=4 python3 -m pytest` The columns are: Variable, Value and Description They are also grouped into either general tinygrad or specific files ##### General tinygrad DEBUG: [1-4], enable debugging output, with 4 you get operations, timings, speed, generated code and more GPU: [1], enable the GPU backend CPU: [1], enable CPU backend MPS: [1], emable MPS device (for Mac M1 and after) METAL: [1], enable Metal backend (for Mac M1 and after) METAL_XCODE: [1], enable Metal using MacOS Xcode sdk TORCH: [1], enable Torch backend CLANG: [1], enable Clang backend LLVM: [1], enable LLVM backend LLVMOPT: [1], enable LLVM optimization LAZY: [1], enable lazy operations OPT: [1-4], enable optimization OPTLOCAL: [1], enable local optimization JIT: [1], enable Jit GRAPH: [1], Create a graph of all operations GRAPHPATH: [/path/to], what path to generate the graph image PRUNEGRAPH, [1], prune movementops and loadops from the graph PRINT_PRG: [1], print program FLOAT16: [1], use float16 instead of float32 ENABLE_METHOD_CACHE: [1], enable method cache EARLY_STOPPING: [1], stop early DISALLOW_ASSIGN: [1], enable not assigning the realized lazydata to the lazy output buffer ##### tinygrad/codegen/cstyle.py NATIVE_EXPLOG: [1], enable using native explog ##### accel/ane/2_compile/hwx_parse.py PRINTALL: [1], print all ane registers ##### extra/onnx.py ONNXLIMIT: [ ], set a limit for Onnx DEBUGONNX: [1], enable Onnx debugging ##### extra/thneed.py DEBUGCL: [1-4], enable Debugging for OpenCL PRINT_KERNEL: [1], Print OpenCL Kernels ##### extra/kernel_search.py OP: [1-3], different operations NOTEST: [1], enable not testing ast DUMP: [1], enable dumping of intervention cache REDUCE: [1], enable reduce operations SIMPLE_REDUCE: [1], enable simpler reduce operations BC: [1], enable big conv operations CONVW: [1], enable convw operations FASTCONV: [1], enable faster conv operations GEMM: [1], enable general matrix multiply operations BROKEN: [1], enable a kind of operation BROKEN3: [1], enable a kind of operation ##### examples/vit.py LARGE: [1], enable larger dimension model ##### examples/llama.py WEIGHTS: [1], enable using weights ##### examples/mlperf MODEL: [resnet,retinanet,unet3d,rnnt,bert,maskrcnn], what models to use ##### examples/benchmark_train_efficientnet.py CNT: [10], the amount of times to loop the benchmark BACKWARD: [1], enable backward call TRAINING: [1], set Tensor.training CLCACHE: [1], enable Cache for OpenCL ##### examples/hlb_cifar10.py TORCHWEIGHTS: [1], use torch to initialize weights DISABLE_BACKWARD: [1], dont use backward operations ##### examples/benchmark_train_efficientnet.py & examples/hlb_cifar10.py ADAM: [1], enable Adam optimization ##### examples/hlb_cifar10.py & xamples/hlb_cifar10_torch.py STEPS: [0-10], number of steps FAKEDATA: [1], enable to use random data ##### examples/train_efficientnet.py STEPS: [1024 dividable], number of steps TINY: [1], use a tiny convolution network IMAGENET: [1], use imagenet for training ##### examples/train_efficientnet.py & examples/train_resnet.py TRANSFER: [1], enable to use pretrained data ##### examples & test/external/external_test_opt.py NUM: [18, 2], what ResNet[18] / EfficientNet[2] to train ##### test/test_ops.py PRINT_TENSORS: [1], print tensors FORWARD_ONLY: [1], use forward operations only ##### test/test_speed_v_torch.py TORCHCUDA: [1], enable the torch cuda backend ##### test/external/external_test_gpu_ast.py KOPT: [1], enable kernel optimization KCACHE: [1], enable kernel cache ##### test/external/external_test_opt.py ENET_NUM: [-2,-1], what EfficientNet to use ##### test/test_dtype.py & test/extra/test_utils.py & extra/training.py CI: [1], enable to avoid some tests to run in CI ##### examples & extra & test BS: [8, 16, 32, 64, 128], bytesize