Commit Graph

124 Commits

Author SHA1 Message Date
Armand du Parc Locmaria
d9553eeec7 Reapply "modeld: split warp" (#38085) (#38086)
* Reapply "modeld: split warp" (#38085)

This reverts commit d489dd8909.

* don't time make_random_inputs

* depend on chunk targets

* also depend on compile_modeld's dependencies
2026-05-23 15:55:40 -07:00
Armand du Parc Locmaria
d489dd8909 Revert "modeld: split warp" (#38085)
Revert "modeld: split warp (#38079)"

This reverts commit a3cc9c7ac3.
2026-05-23 00:53:12 -07:00
Armand du Parc Locmaria
a3cc9c7ac3 modeld: split warp (#38079)
* compiles

* runs

* dedupe compiling model

* always build for both res

* fix does not bind loop variable

* rm size multiplier
2026-05-22 17:50:26 -07:00
Armand du Parc Locmaria
52e182611d initial usbgpu support (#37906)
* zero ll patched big model

* probe in a subprocess so usbgpu lock gets released

* compiles

* runs

* num_jobs gets overwritten, use side effect

* poll tg devices

* make sure build crashes on missing gpu

* fine not to rely on Device.default

* separate tg env for each model runner

* comment

* Revert "separate tg env for each model runner"

This reverts commit f6470cc4258eaeb3e8e37907ef370871c9af5aa4.

* env is shared, gate on flag

* no fallback warp dev must be set

* build for current device only, unless pc/release

* comment

* list

* listen for plug in

* add icon to status bar, read params on every frame (?)

* log available devices

* try copy out when loading?

* Revert "log available devices"

This reverts commit e8c52a5d59456d4820ecb13b99a6c46ea1386a20.

* Revert "try copy out when loading?"

This reverts commit 518f403aa03faeda1950fe3dbce0d9e4c1584455.

* don't trigger device probe/caching on modeld prepare

* re-export with ll and road edges

* dont cache devices in manager process

* get USBGPU from params

* no usbgpu env

* missed one

* sconscript don't poll

* unconditional env

* always explicitly set devices on input tensors

* set DEV so amd uses right compiler and iface??

* fix flag

* bump tg

* rm xdg_cache_home

* tg don't bump all the way

* missing gmmu=0 at compile time

* dm set dev

* tg backend

* update gitignore

* missing import

* unused imports

* rely on Device.DEFAULT at compile time (already the case bc onnxrunner)

* comments

* dm warp needs DEV set too

* build both smol and big

* misc typos

* set dev at compile time

* don't need

* DEV=CPU when getting metadata, ensure we don't grab gpu lock

* this would also grab lock

* put bool

* warp compile always prepare only

* missed one

* poll ui

* missing here

* don't force usbgpu at build time

* tmp patch fetch_fw

* catch all, follow hardwared patterns

* simpler

* compile make input queues

* revert this

* group this more readable

* rm empty line

* make dummy frame using numpy

* revert compile make input queues

* no compiler at runtime

* cleanup

* fine to rebuild all on change to device node for now

* fix usbgpu_present

* fix sconscript

* no size in header stream decompress

* DEBUG=2

* minimal viable feedback

* egpu gray

* oops

* gotta do this actually

* modeld build only depends on modeld devices

* don't ship onnx to release? or chunk

* don't need

* can only set compiler on dev=

* none device works, will use default

* make linter happy

* chunk agnostic onnx input to compile_modeld

* chunk big onnx

* +x chunker

* fix #!

* and don't ship chunked onnx to release

* firmware now in correct location

* better err on missing onnx/chunk

* SConscript also need to accept chunked onnx

* metadata also need to load maybe chunked

* dedupe cmd

* this needs to be on cpu

* devices are set in the tgflags, we already depend on them

* rebuilding on changed order is fine

* read file chunked can already load either chunked or not

* chunk all big onnx

* less confusing

* unused import

* python device to load onnx bytes

* default device for runners, python for metadata

* why not

* chunked to shm
2026-05-19 22:41:57 -07:00
Armand du Parc Locmaria
6941a913a3 modeld/SConscript: fix pkl chunking (#38067)
2026-05-18 21:23:24 -07:00
Armand du Parc Locmaria
d4a83deb7d modeld/SConscript: rm unused line (#38047)
rm unused line
2026-05-15 13:26:18 -07:00
Armand du Parc Locmaria
4cfd774855 modeld/dmonitoringmodeld: explicitly set input devices (#38044)
* modeld/dmonitoringmodeld: explicitly set input devices

* lint

* ignore metadata json file
2026-05-14 23:09:50 -07:00
Armand du Parc Locmaria
74554a523f modeld: fold metadata into jit pkl (#38042)
* modeld: fold metadata into jit pkl

* modeld

* no more metadata deps
2026-05-14 22:24:07 -07:00
Armand du Parc Locmaria
2d4ac33ed7 modeld: DEV=AMD dedupe weights across camera resolutions (#38041)
* modeld: dedupe weight across resolutions

* cleanup

* rm compileconfig

* depends on camera targets

* dedupe doesn't work on qcom as is
2026-05-14 16:42:55 -07:00
Armand du Parc Locmaria
4b81dda1b5 modeld: build single camera (#38008)
* Reapply "modeld: build single camera" (#38007)

This reverts commit edc3ce89fa.

* don't build same cam twice
2026-05-11 16:05:30 -07:00
Armand du Parc Locmaria
edc3ce89fa Revert "modeld: build single camera" (#38007)
Revert "modeld: build single camera (#37990)"

This reverts commit 628e230b63.
2026-05-11 15:57:18 -07:00
Armand du Parc Locmaria
628e230b63 modeld: build single camera (#37990)
* modeld: build single camera

* rm old

* detect release only once

* acados

* rm whitespace change
2026-05-11 15:26:04 -07:00
Armand du Parc Locmaria
b3f369612d modeld: cleanup tg flags (#37903)
* modeld: remove deprecated tg flags

* trigger ci

* JIT_BATCH_SIZE still a thing
2026-04-29 21:28:59 -07:00
Armand du Parc Locmaria
551e2f77bf modeld: standalone compile script (#37851)
* modeld: standalone compile script

* cleanup

* frame skip

* rm last op import

* dm warp

* no graph break

* +x compile_dm_warp.py

* don't import tg before setting device

* compile_modeld exports metadata

* update help

* namedtuple

* lint

* Revert "compile_modeld exports metadata"

This reverts commit 93c3c223567b4d4a074c9071d7f734c56f5aedcc.

* import
2026-04-23 11:55:07 -07:00
Armand du Parc Locmaria
d81d66193f modeld: single jit (#37758)
* compile_modeld.py

* update estimates

* missing image=2?

* Revert "missing image=2?"

This reverts commit 2f5952eb63ba1e3f24cbf5769e6b5e9170d7f0a6.

* Revert "update estimates"

This reverts commit 1f72feef2ffdec6126e3c941e899b46ace7b4b65.

* Revert "compile_modeld.py"

This reverts commit f10541502efca02725f368deda2a21d1f786f57d.

* load warp in ModelState init

* dead code

* prep

* compile modeld

* update SConscript

* tmp save plot locally

* Revert "tmp save plot locally"

This reverts commit ec22f15161ad3b0241a097546b35860f989219f5.

* openpilot hacks?

* no float16

* tmp more chunks

* Revert "tmp more chunks"

This reverts commit 9e1d9b4d0dc36ff530d2a70b565fbfabd7afb00d.

* Revert "no float16"

This reverts commit 6204956e98e3c0818ed1985ede8eeccb810f63e3.

* realize boundaries

* Revert "realize boundaries"

This reverts commit ffaa19259eba70944e7793e8f51a0f87089531b3.

* prune=False?

* Reapply "tmp more chunks"

This reverts commit 2599c41cea93b4a6b4e946cdffc6a617663a7d23.

* tg bug?

* load first?

* Revert "load first?"

This reverts commit f643d082d76a424b23295e254179eb111e936e61.

* revert

* Reapply "tmp save plot locally"

This reverts commit 1b95b82ee58654bd908b1cb04ab0ddbcd1a5955d.

* 0 tol pc

* warp -> modeld

* rename

* bypass chunking?

* dont chunk

* Revert "dont chunk"

This reverts commit cc97fc67b3203456e123f02babe5c83b87c7e264.

* dont chunk

* debug

* Revert "debug"

This reverts commit b3c2f2e7a095fd32f8d8562a68fd1cca42357eac.

* Revert "dont chunk"

This reverts commit 42bd9b6f6ad0722c50348ba11ba7e2a64fdf997d.

* Revert "bypass chunking?"

This reverts commit ad5422a93483ffd8a59ba62e5fb72ced3b5d04d0.

* corrupt model outputs

* Revert "corrupt model outputs"

This reverts commit 245feb94480e02f83a20b65a9488652bcbfc88b0.

* image=0 for warp, match master

* dedupe enqueue

* pass traffic convention

* tg buffer for desire

* dedupe buffer creation

* compile_modeld: nuke stale cached pkl before compiling

The UNSAFE CI checkout keeps gitignored files (.pkl, .sconsign.dblite),
so stale pkl files from previous commits can persist and be reused
instead of being recompiled. Delete them explicitly before compiling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test vs compile

* all outputs need to be different on different inputs

* randomize numpy inputs

* randomize on every step

* SConscript: nuke stale pkl+chunks before compile_modeld

Move the stale artifact cleanup from compile_modeld.py into the
SConscript build command. This ensures stale gitignored pkl and chunk
files are deleted even if scons decides to skip the compile step
(due to a stale .sconsign.dblite from UNSAFE CI checkout).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* compile_modeld: restore Context(IMAGE=0) for warp

The warp operations must run under IMAGE=0 to avoid QCOM image texture
optimizations that corrupt the output buffer after ~33 frames.
This was accidentally commented out in a855173.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* modeld: create SubMaster before model loading

Move PubMaster/SubMaster creation before the model loading step.
During model loading (3.5s+), process_replay may send liveCalibration.
If SubMaster doesn't exist yet, the message is dropped and the warp
transform stays as zeros, producing garbage warped images.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Revert "modeld: create SubMaster before model loading"

This reverts commit 968c987c2fbb3fce141c4e345d10ddea559b6c50.

* stale metadata?

* claude debug

* Revert "claude debug"

This reverts commit 49e754c6affa45a8ea8834588a00227b8090b17a.

* Revert "stale metadata?"

This reverts commit 870388513c0d4a67dcf970cd277b6db56cb2b478.

* modeld: realize jit outputs before parsing

* Update modeld.py

* modeld: fix NameError by removing redundant MODELS_DIR definition

* test buffers in test vs. compile

* 2x inputs before running

* fixup 2x inputs test

* realize onnx weights?

* Revert "realize onnx weights?"

This reverts commit 49c8b9a505db38ff22f342db011a3a6b6526d398.

* move openpilot_hacks flag to sconscript

* stricter test vs compile

* correct timings

* more run more fail?

* Revert "more run more fail?"

This reverts commit 9e94bb63940751ec29e81b634c42449113e1f2e5.

* numpy shenanigans

* correct shapes

* dont assert timings for now

* Revert "correct shapes"

This reverts commit 5b9ff6c84c0022327d21801d179e9e51c39e8f78.

* Revert "numpy shenanigans"

This reverts commit b4f6fb3078d7e9b09698895b88728fd8eea8c8a8.

* no need to nuke

* comment unused

* don't use NPY device

* copy instead of from_blob

* to device before jit

* Revert "to device before jit"

This reverts commit 7a59ed9b1ac88657b5a3917986b6ff92e59a2ee3.

* Revert "copy instead of from_blob"

This reverts commit 196c4892a06ffba89ef631876372cecf137cc1b4.

* Revert "don't use NPY device"

This reverts commit 18abf43bbac46ad47a60c03dd8d1ef40b3f59227.

* 3 runs is enough

* no_memory_planner=1

* lint

* restore model_replay.py

* on policy -> policy

* unused

* prepare only enqueues full images

* warp with image=2?

* unused args

* test vs compile, check different inputs different outputs

* avoid uop cache collision

* dont need realize here

* misc

* input queues diverged

* strict zip

* monkey patch for now

* memory planner

* prev desire correct order

* dedupe pkl paths / compile targets

* don't change behavior, warp and enqueue frames when skipping model eval

* actually prepare only

* warm up warp jit

* correct path

* oops

* explicit warmup

* need continuous + can't have duplicate jit inputs

* whitespace

* bufs -> input_queues

* master tg

* /N_RUNS

* bump tg, remove uop cache patch

* more readable

* Revert "bump tg, remove uop cache patch"

This reverts commit 499acca2591becd389de4025943f9e776a5b337c.

* missing dep

---------

Co-authored-by: Bruce Wayne <harald.the.engineer@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 12:37:56 -07:00
Harald Schäfer
c91a0a83f6 Revert OP (#37812)
* Revert "OP model 7 (#37760)"

This reverts commit 052692b25d.

* Revert "OP model (#37740)"

This reverts commit cb32793300.

* dead

* parse_model_outputs: drop extra space
2026-04-12 23:47:43 -04:00
Harald Schäfer
09a55a7833 autodetect tg backend: use CPU:LLVM on Linux (#37785)
autodetect tg backend: use CPU:LLVM on Linux, CPU on Darwin
2026-04-08 16:16:39 -07:00
Harald Schäfer
21538e5a09 autodetect tg backend (#37778)
* pick fastest

* save config

* fix

* Ignore generated tg_compiled_flags file

* helper

* cleaner

* whitespace not needed

* no shebang

* whitespace
2026-04-08 13:55:00 -07:00
Armand du Parc Locmaria
55c3885742 bump tg (#37700)
* bump tg

* bump tg

* assign

* bump

* cpu llvm

* frame buffer updated in place, no need to return

* don't bake in stale pointers

* fix update image output indices

* lint

* bump
2026-04-02 09:16:11 -07:00
Harald Schäfer
cb32793300 OP model (#37740)
* Off policy model

* 2f70b996-c604-4a46-9ac9-13ce7534605b/100

* misc fixes

* 1cc1791b-4555-41ce-a5cb-ce046967075a/100

* fix model

* 6ab6fae5-fbbd-4ad0-928a-b33794f60dba/100

* recomp

* update models

* qxfinally correct

* b8b96ac6-7918-401a-a862-eaf1fdbba88d/100

* wrong plan

* wrong plan

* Vf9b3fb5f-4d0d-4dcb-bc3a-5e94d1fdcdaa/200

* bump dbc

* ready to merge

* rename to on-policy

* Just cleanup big models for now

---------

Co-authored-by: Kacper Rączy <gfw.kra@gmail.com>
2026-04-01 16:24:50 -07:00
Shane Smiskol
6e7587a75c modeld: quiet do_chunk output during scons build (#37654)
* modeld: quiet do_chunk output during scons build

SCons default-prints Python function actions with all their args.
The do_chunk function has 1259 tinygrad source files as deps, causing
a wall of text during builds. Wrap in SAction with a short strfunction.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* split compile and chunk into separate Commands

cleaner fix: do_chunk only depends on the pkl, not tinygrad files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 23:35:56 -07:00
Harald Schäfer
159d3a30e3 RM onnx (#37377)
* Give tf flags to onnx parse

* rm onnx again

* update lock
2026-02-24 15:35:52 -08:00
Harald Schäfer
16dda06a0c Reapply chunker (#37292)
* Reapply chunker

* good size

* rm glob

* cleaner

* back to 45mb

* warp need not be fixed

* add manifest path

* lil cleaner
2026-02-23 16:49:48 -08:00
Bruce Wayne
d6af0e6eb5 Revert "Simpler file chunker (#37276)"
This reverts commit b27fa58444.
2026-02-20 16:43:43 -08:00
Harald Schäfer
b27fa58444 Simpler file chunker (#37276)
* Chunk tinygrad pkl below GitHub max size

* pull that out

* rm glob

* make work

* Single name def

* unused comment

* more cleanups

* revert that

* 10MB overhead

---------

Co-authored-by: Adeeb Shihadeh <adeebshihadeh@gmail.com>
2026-02-20 10:37:14 -08:00
YassineYousfi
2ba6df2506 chunk tinygrad pkl below GitHub max size - NoCache and AlwaysBuild (#37194)
* nocache

* +

* fixes

* lint

* not split

* use pathlib

* cleanup

* better

* even better
2026-02-13 10:14:24 -08:00
Harald Schäfer
af1583cdfc Reapply tgwarp w NV12 fix (#37168)
* Revert "Revert tgwarp again (#37161)"

This reverts commit 45099e7fcd.

* Weird uv sizes

* Fix interleaving

* Fix on CPU

* make CPU safe

* Prevent corruption without clone

* Claude knows speeed

* fix interleaving

* less kernels

* blob caching

* This is still slightly faster

* Comment for blob cache
2026-02-12 08:59:19 -08:00
Harald Schäfer
45099e7fcd Revert tgwarp again (#37161)
* Reapply "revert tg calib and opencl cleanup (#37113)" (#37115)

This reverts commit 667f3bb32f.

* revert msgq too

* msgq on master
2026-02-10 23:12:41 -08:00
Harald Schäfer
3d11e8ef36 Revert "Chunk big model files (#37134)" (#37139)
This reverts commit a941e8f78f.
2026-02-09 20:58:22 -08:00
Harald Schäfer
a941e8f78f Chunk big model files (#37134)
* file chunking

* try this

* more cleanup

* cleaner
2026-02-09 15:29:50 -08:00
Adeeb Shihadeh
667f3bb32f Revert "revert tg calib and opencl cleanup (#37113)" (#37115)
* Revert "revert tg calib and opencl cleanup (#37113)"

This reverts commit 51312afd3d.

* power draw is a lil higher

* just don't miss a cycle

* fix warp targets

* fix tinygrad dep
2026-02-07 21:36:44 -08:00
Harald Schäfer
51312afd3d revert tg calib and opencl cleanup (#37113)
* Revert "Remove all the OpenCL (#37105)"

This reverts commit d5cbb89d84.

* Revert "rm common/mat.h"

This reverts commit 4ce701150a.

* Revert "Calibrate in tg (#36621)"

This reverts commit 593c3a0c8e.
2026-02-07 09:10:29 -08:00
Harald Schäfer
593c3a0c8e Calibrate in tg (#36621)
* squash

* bump tg

* fix lint

* Ready to merge

* cleaner

* match modeld

* more dead stuff
2026-02-06 14:13:46 -08:00
King Art
db3df61c34 fix non-determinism in modeld build (#37042)
* fix non-determinism in selfservice model build

also trim down model compile dependencies to the minimum required

* Apply suggestions from code review

---------

Co-authored-by: Shane Smiskol <shane@smiskol.com>
2026-01-30 17:16:56 -08:00
Matt Purnell
1f9efd9311 transformations: move Cython to pure Python (#36830)
* Remove cython for transformations

* Add new test

* Switch back to program to fix mac builds

* Convert to Python instead

* Fix failing builds

* lint

* Implement conversion in pure python/numpy

* Add more tests

* Fix bugs in tests
2026-01-16 22:31:26 -08:00
ZwX1616
b778da1d7c dmonitoringmodeld: clean up data structures (#36624)
* update onnx

* get meta

* start

* cast

* deprecate notready

* more

* line too long

* 2
2025-11-14 14:29:04 -08:00
Harald Schäfer
a1795f80dd Latest tinygrad (#36615)
* Latest tinygrad

* jit batch size

* bump again

* limit upcast

* latest tgf

* upstream tg
2025-11-13 17:08:14 -08:00
Adeeb Shihadeh
cf5b743de6 build system cleanups (#36202)
* it's all common

* never getting fixed

* it's just tici

* reorders

* qcom2 -> tici

* Revert "qcom2 -> tici"

This reverts commit f4d849b2952cb0e662975805db6a1d32511ed392.

* Reapply "qcom2 -> tici"

This reverts commit 58b193cb8de872830f8a7821a339edca14e4a337.

* is tici

* lil more

* Revert "is tici"

This reverts commit a169be18d3fdcb3ef8317a63a89d8becadabfad8.

* Revert "Reapply "qcom2 -> tici""

This reverts commit 26f9c0e7d068fc8a1a5f07383b3616e619cd4e8c.

* qcom2 -> __tici__

* lil more

* mv lenv

* clean that up

* lil more

* fix

* lil more
2025-09-25 20:55:14 -07:00
commaci-public
b6e0d4807a [bot] Update Python packages (#36184)
* Update Python packages

* not available anymore

* also this

* also this

* maybe?

* version

* try

* Revert "version"

This reverts commit 9ac4401b9ca59677b82736faff8baf66861df5f2.

* revert

* cffi

* issue

* comment

---------

Co-authored-by: Vehicle Researcher <user@comma.ai>
Co-authored-by: Maxime Desroches <desroches.maxime@gmail.com>
2025-09-20 20:10:51 -07:00
Harald Schäfer
35ed6bc3a9 Tinygrad DEV=DEVICE (#35814)
* Reapply "Tinygrad DEV=DEVICE (#35809)"

This reverts commit 5e07636d54.

* bump tg
2025-07-26 21:21:25 -07:00
Bruce Wayne
5e07636d54 Revert "Tinygrad DEV=DEVICE (#35809)"
This reverts commit 47f23828d2.
2025-07-25 12:54:11 -07:00
Harald Schäfer
47f23828d2 Tinygrad DEV=DEVICE (#35809)
* bump tg

* step one cleanup

* cleanup

* typo

* cleaner

* cleaner

* Revert "cleaner"

This reverts commit 9c1abd0dc06b4564e61dd32b0e93375badbc9ca5.

* usbgpu

* bit cleaner

* cleaner sconscript
2025-07-25 11:53:08 -07:00
Shane Smiskol
6f1a1b3213 Revert "modeld: autodetect tinygrad backend" (#35701)
Revert "modeld: autodetect tinygrad backend (#35405)"

This reverts commit ce92fd1a0f.
2025-07-12 00:52:18 -07:00
Andrei Radulescu
ce92fd1a0f modeld: autodetect tinygrad backend (#35405)
* modeld: autodetect tinygrad backend

* modeld: autodetect tinygrad CUDA backend

* Revert "modeld: autodetect tinygrad CUDA backend"

This reverts commit 0e9755fb3c5c2021de27f4d230bd0a162883bc37.

* comment why llvm@19

Co-authored-by: Adeeb Shihadeh <adeebshihadeh@gmail.com>

* backend from jit

* fix static analysis

* simplify

* compile flags log

---------

Co-authored-by: Adeeb Shihadeh <adeebshihadeh@gmail.com>
2025-07-11 19:48:35 -07:00
eFini
c3c5992f88 modeld: avoid using USB GPU on a AMD laptop (#35602)
modeld: avoid using usb GPU if 'USBGPU' is not in os.environ

Co-authored-by: Adeeb Shihadeh <adeebshihadeh@gmail.com>
2025-06-29 14:37:51 -07:00
Kacper Rączy
0218ae82ed Fix openpilot-prebuilt image build (#35607)
Fix tinygrad shell exec
2025-06-27 02:51:20 +00:00
Adeeb Shihadeh
350a235303 modeld: more USB GPU fixes (#35306)
* fixups

* builds
2025-05-20 19:41:58 -07:00
Andrei Radulescu
f630cac06f modeld: replace CLANG=1 with CPU=1 (#35270)
Replace CLANG=1 with CPU=1
2025-05-18 05:57:45 -07:00
Adeeb Shihadeh
d0bf2be6f0 External GPU support for big models (#35172)
* usb gpu

* cleanup

---------

Co-authored-by: Comma Device <device@comma.ai>
2025-05-13 17:12:32 -07:00
Adeeb Shihadeh
67486ff92d bump tinygrad (#35208)
* bump tinygrad

* fix

* why is mac different?

* fix sim

* relax that
2025-05-13 16:59:35 -07:00