modeld: retain SNPE and thneed drive model support (#555)

* modeld: Retain pre-20hz drive model support

* Method not available anymore on OP

* some fixes

* Revert "Long planner get accel: new function args (#34288)"

* Revert "Fix low-speed allow_throttle behavior in long planner (#33894)"

* Revert "long planner: allow throttle reflects usage (#33792)"

* Revert "Gate acceleration on model gas press predictions (#33643)"

* Reapply "Gate acceleration on model gas press predictions (#33643)"

This reverts commit 76b08e37cb.

* Reapply "long planner: allow throttle reflects usage (#33792)"

This reverts commit c75244ca4e.

* Reapply "Fix low-speed allow_throttle behavior in long planner (#33894)"

This reverts commit b2b7d21b7b.

* Reapply "Long planner get accel: new function args (#34288)"

This reverts commit 74dca2fccf.

* don't need

* retain snpe

* wrong

* they're symlinks

* remove

* put back into VCS

* add back

* don't include built

* Refactor model runner retrieval with caching support

Added caching for active model runner type via `ModelRunnerTypeCache` to enhance performance and avoid redundant checks. Introduced a `force_check` flag to bypass the cache when necessary. Updated related code to handle cache clearing during onroad transitions.

* Update model runner determination logic with caching fix

Enhances `get_active_model_runner` to utilize caching more effectively by ensuring type consistency and updating cache only when necessary. Also updates `is_snpe_model` to pass the `started` state to the runner determination function, improving behavior for dynamic checks.

* default to none

* enable in next PR

* more

---------

Co-authored-by: DevTekVE <devtekve@gmail.com>
This commit is contained in:
Jason Wen
2025-01-10 18:34:06 -05:00
committed by GitHub
parent 7329128325
commit acd46aa94b
109 changed files with 9475 additions and 3 deletions

3
.gitignore vendored
View File

@@ -74,6 +74,9 @@ comma*.sh
selfdrive/modeld/thneed/compile
selfdrive/modeld/models/*.thneed
selfdrive/modeld/models/*.pkl
sunnypilot/modeld/thneed/compile
sunnypilot/modeld/models/*.thneed
sunnypilot/modeld/models/*.pkl
*.bz2
*.zst

2
.gitmodules vendored
View File

@@ -15,4 +15,4 @@
url = https://github.com/commaai/teleoprtc
[submodule "tinygrad"]
path = tinygrad_repo
url = https://github.com/tinygrad/tinygrad.git
url = https://github.com/commaai/tinygrad.git

View File

@@ -392,6 +392,8 @@ SConscript(['third_party/SConscript'])
SConscript(['selfdrive/SConscript'])
SConscript(['sunnypilot/SConscript'])
if Dir('#tools/cabana/').exists() and GetOption('extras'):
SConscript(['tools/replay/SConscript'])
if arch != "larch64":

View File

@@ -64,6 +64,11 @@ struct ModelManagerSP @0xaedffd8f31e7b55d {
progress @1 :Float32;
eta @2 :UInt32;
}
enum Runner {
snpe @0;
tinygrad @1;
}
struct ModelBundle {
index @0 :UInt32;

View File

@@ -203,6 +203,7 @@ std::unordered_map<std::string, uint32_t> keys = {
// --- sunnypilot params --- //
{"ApiCache_DriveStats", PERSISTENT},
{"EnableGithubRunner", PERSISTENT | BACKUP},
{"ModelRunnerTypeCache", CLEAR_ON_ONROAD_TRANSITION},
// MADS params
{"Mads", PERSISTENT | BACKUP},

1
sunnypilot/SConscript Normal file
View File

@@ -0,0 +1 @@
SConscript(['modeld/SConscript'])

1
sunnypilot/modeld/.gitignore vendored Normal file
View File

@@ -0,0 +1 @@
*_pyx.cpp

View File

@@ -0,0 +1,65 @@
import glob
Import('env', 'envCython', 'arch', 'cereal', 'messaging', 'common', 'gpucommon', 'visionipc', 'transformations')
lenv = env.Clone()
lenvCython = envCython.Clone()
libs = [cereal, messaging, visionipc, gpucommon, common, 'capnp', 'kj', 'pthread']
frameworks = []
common_src = [
"models/commonmodel.cc",
"transforms/loadyuv.cc",
"transforms/transform.cc",
]
thneed_src_common = [
"thneed/thneed_common.cc",
"thneed/serialize.cc",
]
thneed_src_qcom = thneed_src_common + ["thneed/thneed_qcom2.cc"]
thneed_src_pc = thneed_src_common + ["thneed/thneed_pc.cc"]
thneed_src = thneed_src_qcom if arch == "larch64" else thneed_src_pc
# SNPE except on Mac and ARM Linux
snpe_lib = []
if arch != "Darwin" and arch != "aarch64":
common_src += ['runners/snpemodel.cc']
snpe_lib += ['SNPE']
# OpenCL is a framework on Mac
if arch == "Darwin":
frameworks += ['OpenCL']
else:
libs += ['OpenCL']
# Set path definitions
for pathdef, fn in {'TRANSFORM': 'transforms/transform.cl', 'LOADYUV': 'transforms/loadyuv.cl'}.items():
for xenv in (lenv, lenvCython):
xenv['CXXFLAGS'].append(f'-D{pathdef}_PATH=\\"{File(fn).abspath}\\"')
# Compile cython
snpe_rpath_qcom = "/data/pythonpath/third_party/snpe/larch64"
snpe_rpath_pc = f"{Dir('#').abspath}/third_party/snpe/x86_64-linux-clang"
snpe_rpath = lenvCython['RPATH'] + [snpe_rpath_qcom if arch == "larch64" else snpe_rpath_pc]
cython_libs = envCython["LIBS"] + libs
snpemodel_lib = lenv.Library('snpemodel', ['runners/snpemodel.cc'])
commonmodel_lib = lenv.Library('commonmodel', common_src)
lenvCython.Program('runners/runmodel_pyx.so', 'runners/runmodel_pyx.pyx', LIBS=cython_libs, FRAMEWORKS=frameworks)
lenvCython.Program('runners/snpemodel_pyx.so', 'runners/snpemodel_pyx.pyx', LIBS=[snpemodel_lib, snpe_lib, *cython_libs], FRAMEWORKS=frameworks, RPATH=snpe_rpath)
lenvCython.Program('models/commonmodel_pyx.so', 'models/commonmodel_pyx.pyx', LIBS=[commonmodel_lib, *cython_libs], FRAMEWORKS=frameworks)
tinygrad_files = ["#"+x for x in glob.glob(env.Dir("#tinygrad_repo").relpath + "/**", recursive=True, root_dir=env.Dir("#").abspath)]
# Get model metadata
fn = File("models/supercombo").abspath
cmd = f'python3 {Dir("#selfdrive/modeld").abspath}/get_model_metadata.py {fn}.onnx'
lenv.Command(fn + "_metadata.pkl", [fn + ".onnx"] + tinygrad_files, cmd)
if arch == "larch64":
thneed_lib = env.SharedLibrary('thneed', thneed_src, LIBS=[gpucommon, common, 'OpenCL', 'dl'])
thneedmodel_lib = env.Library('thneedmodel', ['runners/thneedmodel.cc'])
lenvCython.Program('runners/thneedmodel_pyx.so', 'runners/thneedmodel_pyx.pyx', LIBS=envCython["LIBS"]+[thneedmodel_lib, thneed_lib, gpucommon, common, 'dl', 'OpenCL'])

View File

View File

@@ -0,0 +1,85 @@
import numpy as np
def index_function(idx, max_val=192, max_idx=32):
return (max_val) * ((idx/max_idx)**2)
class ModelConstants:
# time and distance indices
IDX_N = 33
T_IDXS = [index_function(idx, max_val=10.0) for idx in range(IDX_N)]
X_IDXS = [index_function(idx, max_val=192.0) for idx in range(IDX_N)]
LEAD_T_IDXS = [0., 2., 4., 6., 8., 10.]
LEAD_T_OFFSETS = [0., 2., 4.]
META_T_IDXS = [2., 4., 6., 8., 10.]
# model inputs constants
MODEL_FREQ = 20
FEATURE_LEN = 512
HISTORY_BUFFER_LEN = 99
DESIRE_LEN = 8
TRAFFIC_CONVENTION_LEN = 2
LAT_PLANNER_STATE_LEN = 4
LATERAL_CONTROL_PARAMS_LEN = 2
PREV_DESIRED_CURV_LEN = 1
# model outputs constants
FCW_THRESHOLDS_5MS2 = np.array([.05, .05, .15, .15, .15], dtype=np.float32)
FCW_THRESHOLDS_3MS2 = np.array([.7, .7], dtype=np.float32)
FCW_5MS2_PROBS_WIDTH = 5
FCW_3MS2_PROBS_WIDTH = 2
DISENGAGE_WIDTH = 5
POSE_WIDTH = 6
WIDE_FROM_DEVICE_WIDTH = 3
SIM_POSE_WIDTH = 6
LEAD_WIDTH = 4
LANE_LINES_WIDTH = 2
ROAD_EDGES_WIDTH = 2
PLAN_WIDTH = 15
DESIRE_PRED_WIDTH = 8
LAT_PLANNER_SOLUTION_WIDTH = 4
DESIRED_CURV_WIDTH = 1
NUM_LANE_LINES = 4
NUM_ROAD_EDGES = 2
LEAD_TRAJ_LEN = 6
DESIRE_PRED_LEN = 4
PLAN_MHP_N = 5
LEAD_MHP_N = 2
PLAN_MHP_SELECTION = 1
LEAD_MHP_SELECTION = 3
FCW_THRESHOLD_5MS2_HIGH = 0.15
FCW_THRESHOLD_5MS2_LOW = 0.05
FCW_THRESHOLD_3MS2 = 0.7
CONFIDENCE_BUFFER_LEN = 5
RYG_GREEN = 0.01165
RYG_YELLOW = 0.06157
POLY_PATH_DEGREE = 4
# model outputs slices
class Plan:
POSITION = slice(0, 3)
VELOCITY = slice(3, 6)
ACCELERATION = slice(6, 9)
T_FROM_CURRENT_EULER = slice(9, 12)
ORIENTATION_RATE = slice(12, 15)
class Meta:
ENGAGED = slice(0, 1)
# next 2, 4, 6, 8, 10 seconds
GAS_DISENGAGE = slice(1, 31, 6)
BRAKE_DISENGAGE = slice(2, 31, 6)
STEER_OVERRIDE = slice(3, 31, 6)
HARD_BRAKE_3 = slice(4, 31, 6)
HARD_BRAKE_4 = slice(5, 31, 6)
HARD_BRAKE_5 = slice(6, 31, 6)
# next 0, 2, 4, 6, 8, 10 seconds
GAS_PRESS = slice(31, 55, 4)
BRAKE_PRESS = slice(32, 55, 4)
LEFT_BLINKER = slice(33, 55, 4)
RIGHT_BLINKER = slice(34, 55, 4)

View File

@@ -0,0 +1,220 @@
import os
import capnp
import numpy as np
from cereal import log
from openpilot.sunnypilot.modeld.constants import ModelConstants, Plan, Meta
SEND_RAW_PRED = os.getenv('SEND_RAW_PRED')
ConfidenceClass = log.ModelDataV2.ConfidenceClass
class PublishState:
def __init__(self):
self.disengage_buffer = np.zeros(ModelConstants.CONFIDENCE_BUFFER_LEN*ModelConstants.DISENGAGE_WIDTH, dtype=np.float32)
self.prev_brake_5ms2_probs = np.zeros(ModelConstants.FCW_5MS2_PROBS_WIDTH, dtype=np.float32)
self.prev_brake_3ms2_probs = np.zeros(ModelConstants.FCW_3MS2_PROBS_WIDTH, dtype=np.float32)
def fill_xyzt(builder, t, x, y, z, x_std=None, y_std=None, z_std=None):
builder.t = t
builder.x = x.tolist()
builder.y = y.tolist()
builder.z = z.tolist()
if x_std is not None:
builder.xStd = x_std.tolist()
if y_std is not None:
builder.yStd = y_std.tolist()
if z_std is not None:
builder.zStd = z_std.tolist()
def fill_xyvat(builder, t, x, y, v, a, x_std=None, y_std=None, v_std=None, a_std=None):
builder.t = t
builder.x = x.tolist()
builder.y = y.tolist()
builder.v = v.tolist()
builder.a = a.tolist()
if x_std is not None:
builder.xStd = x_std.tolist()
if y_std is not None:
builder.yStd = y_std.tolist()
if v_std is not None:
builder.vStd = v_std.tolist()
if a_std is not None:
builder.aStd = a_std.tolist()
def fill_xyz_poly(builder, degree, x, y, z):
xyz = np.stack([x, y, z], axis=1)
coeffs = np.polynomial.polynomial.polyfit(ModelConstants.T_IDXS, xyz, deg=degree)
builder.xCoefficients = coeffs[:, 0].tolist()
builder.yCoefficients = coeffs[:, 1].tolist()
builder.zCoefficients = coeffs[:, 2].tolist()
def fill_model_msg(base_msg: capnp._DynamicStructBuilder, extended_msg: capnp._DynamicStructBuilder,
net_output_data: dict[str, np.ndarray], publish_state: PublishState,
vipc_frame_id: int, vipc_frame_id_extra: int, frame_id: int, frame_drop: float,
timestamp_eof: int, model_execution_time: float, valid: bool) -> None:
frame_age = frame_id - vipc_frame_id if frame_id > vipc_frame_id else 0
frame_drop_perc = frame_drop * 100
extended_msg.valid = valid
base_msg.valid = valid
driving_model_data = base_msg.drivingModelData
driving_model_data.frameId = vipc_frame_id
driving_model_data.frameIdExtra = vipc_frame_id_extra
driving_model_data.frameDropPerc = frame_drop_perc
driving_model_data.modelExecutionTime = model_execution_time
action = driving_model_data.action
action.desiredCurvature = float(net_output_data['desired_curvature'][0,0])
modelV2 = extended_msg.modelV2
modelV2.frameId = vipc_frame_id
modelV2.frameIdExtra = vipc_frame_id_extra
modelV2.frameAge = frame_age
modelV2.frameDropPerc = frame_drop_perc
modelV2.timestampEof = timestamp_eof
modelV2.modelExecutionTime = model_execution_time
# plan
position = modelV2.position
fill_xyzt(position, ModelConstants.T_IDXS, *net_output_data['plan'][0,:,Plan.POSITION].T, *net_output_data['plan_stds'][0,:,Plan.POSITION].T)
velocity = modelV2.velocity
fill_xyzt(velocity, ModelConstants.T_IDXS, *net_output_data['plan'][0,:,Plan.VELOCITY].T)
acceleration = modelV2.acceleration
fill_xyzt(acceleration, ModelConstants.T_IDXS, *net_output_data['plan'][0,:,Plan.ACCELERATION].T)
orientation = modelV2.orientation
fill_xyzt(orientation, ModelConstants.T_IDXS, *net_output_data['plan'][0,:,Plan.T_FROM_CURRENT_EULER].T)
orientation_rate = modelV2.orientationRate
fill_xyzt(orientation_rate, ModelConstants.T_IDXS, *net_output_data['plan'][0,:,Plan.ORIENTATION_RATE].T)
# temporal pose
temporal_pose = modelV2.temporalPose
temporal_pose.trans = net_output_data['plan'][0,0,Plan.VELOCITY].tolist()
temporal_pose.transStd = net_output_data['plan_stds'][0,0,Plan.VELOCITY].tolist()
temporal_pose.rot = net_output_data['plan'][0,0,Plan.ORIENTATION_RATE].tolist()
temporal_pose.rotStd = net_output_data['plan_stds'][0,0,Plan.ORIENTATION_RATE].tolist()
# poly path
poly_path = driving_model_data.path
fill_xyz_poly(poly_path, ModelConstants.POLY_PATH_DEGREE, *net_output_data['plan'][0,:,Plan.POSITION].T)
# lateral planning
action = modelV2.action
action.desiredCurvature = float(net_output_data['desired_curvature'][0,0])
# times at X_IDXS according to model plan
PLAN_T_IDXS = [np.nan] * ModelConstants.IDX_N
PLAN_T_IDXS[0] = 0.0
plan_x = net_output_data['plan'][0,:,Plan.POSITION][:,0].tolist()
for xidx in range(1, ModelConstants.IDX_N):
tidx = 0
# increment tidx until we find an element that's further away than the current xidx
while tidx < ModelConstants.IDX_N - 1 and plan_x[tidx+1] < ModelConstants.X_IDXS[xidx]:
tidx += 1
if tidx == ModelConstants.IDX_N - 1:
# if the Plan doesn't extend far enough, set plan_t to the max value (10s), then break
PLAN_T_IDXS[xidx] = ModelConstants.T_IDXS[ModelConstants.IDX_N - 1]
break
# interpolate to find `t` for the current xidx
current_x_val = plan_x[tidx]
next_x_val = plan_x[tidx+1]
p = (ModelConstants.X_IDXS[xidx] - current_x_val) / (next_x_val - current_x_val) if abs(next_x_val - current_x_val) > 1e-9 else float('nan')
PLAN_T_IDXS[xidx] = p * ModelConstants.T_IDXS[tidx+1] + (1 - p) * ModelConstants.T_IDXS[tidx]
# lane lines
modelV2.init('laneLines', 4)
for i in range(4):
lane_line = modelV2.laneLines[i]
fill_xyzt(lane_line, PLAN_T_IDXS, np.array(ModelConstants.X_IDXS), net_output_data['lane_lines'][0,i,:,0], net_output_data['lane_lines'][0,i,:,1])
modelV2.laneLineStds = net_output_data['lane_lines_stds'][0,:,0,0].tolist()
modelV2.laneLineProbs = net_output_data['lane_lines_prob'][0,1::2].tolist()
lane_line_meta = driving_model_data.laneLineMeta
lane_line_meta.leftY = modelV2.laneLines[1].y[0]
lane_line_meta.leftProb = modelV2.laneLineProbs[1]
lane_line_meta.rightY = modelV2.laneLines[2].y[0]
lane_line_meta.rightProb = modelV2.laneLineProbs[2]
# road edges
modelV2.init('roadEdges', 2)
for i in range(2):
road_edge = modelV2.roadEdges[i]
fill_xyzt(road_edge, PLAN_T_IDXS, np.array(ModelConstants.X_IDXS), net_output_data['road_edges'][0,i,:,0], net_output_data['road_edges'][0,i,:,1])
modelV2.roadEdgeStds = net_output_data['road_edges_stds'][0,:,0,0].tolist()
# leads
modelV2.init('leadsV3', 3)
for i in range(3):
lead = modelV2.leadsV3[i]
fill_xyvat(lead, ModelConstants.LEAD_T_IDXS, *net_output_data['lead'][0,i].T, *net_output_data['lead_stds'][0,i].T)
lead.prob = net_output_data['lead_prob'][0,i].tolist()
lead.probTime = ModelConstants.LEAD_T_OFFSETS[i]
# meta
meta = modelV2.meta
meta.desireState = net_output_data['desire_state'][0].reshape(-1).tolist()
meta.desirePrediction = net_output_data['desire_pred'][0].reshape(-1).tolist()
meta.engagedProb = net_output_data['meta'][0,Meta.ENGAGED].item()
meta.init('disengagePredictions')
disengage_predictions = meta.disengagePredictions
disengage_predictions.t = ModelConstants.META_T_IDXS
disengage_predictions.brakeDisengageProbs = net_output_data['meta'][0,Meta.BRAKE_DISENGAGE].tolist()
disengage_predictions.gasDisengageProbs = net_output_data['meta'][0,Meta.GAS_DISENGAGE].tolist()
disengage_predictions.steerOverrideProbs = net_output_data['meta'][0,Meta.STEER_OVERRIDE].tolist()
disengage_predictions.brake3MetersPerSecondSquaredProbs = net_output_data['meta'][0,Meta.HARD_BRAKE_3].tolist()
disengage_predictions.brake4MetersPerSecondSquaredProbs = net_output_data['meta'][0,Meta.HARD_BRAKE_4].tolist()
disengage_predictions.brake5MetersPerSecondSquaredProbs = net_output_data['meta'][0,Meta.HARD_BRAKE_5].tolist()
disengage_predictions.gasPressProbs = net_output_data['meta'][0,Meta.GAS_PRESS].tolist()
disengage_predictions.brakePressProbs = net_output_data['meta'][0,Meta.BRAKE_PRESS].tolist()
publish_state.prev_brake_5ms2_probs[:-1] = publish_state.prev_brake_5ms2_probs[1:]
publish_state.prev_brake_5ms2_probs[-1] = net_output_data['meta'][0,Meta.HARD_BRAKE_5][0]
publish_state.prev_brake_3ms2_probs[:-1] = publish_state.prev_brake_3ms2_probs[1:]
publish_state.prev_brake_3ms2_probs[-1] = net_output_data['meta'][0,Meta.HARD_BRAKE_3][0]
hard_brake_predicted = (publish_state.prev_brake_5ms2_probs > ModelConstants.FCW_THRESHOLDS_5MS2).all() and \
(publish_state.prev_brake_3ms2_probs > ModelConstants.FCW_THRESHOLDS_3MS2).all()
meta.hardBrakePredicted = hard_brake_predicted.item()
# confidence
if vipc_frame_id % (2*ModelConstants.MODEL_FREQ) == 0:
# any disengage prob
brake_disengage_probs = net_output_data['meta'][0,Meta.BRAKE_DISENGAGE]
gas_disengage_probs = net_output_data['meta'][0,Meta.GAS_DISENGAGE]
steer_override_probs = net_output_data['meta'][0,Meta.STEER_OVERRIDE]
any_disengage_probs = 1-((1-brake_disengage_probs)*(1-gas_disengage_probs)*(1-steer_override_probs))
# independent disengage prob for each 2s slice
ind_disengage_probs = np.r_[any_disengage_probs[0], np.diff(any_disengage_probs) / (1 - any_disengage_probs[:-1])]
# rolling buf for 2, 4, 6, 8, 10s
publish_state.disengage_buffer[:-ModelConstants.DISENGAGE_WIDTH] = publish_state.disengage_buffer[ModelConstants.DISENGAGE_WIDTH:]
publish_state.disengage_buffer[-ModelConstants.DISENGAGE_WIDTH:] = ind_disengage_probs
score = 0.
for i in range(ModelConstants.DISENGAGE_WIDTH):
score += publish_state.disengage_buffer[i*ModelConstants.DISENGAGE_WIDTH+ModelConstants.DISENGAGE_WIDTH-1-i].item() / ModelConstants.DISENGAGE_WIDTH
if score < ModelConstants.RYG_GREEN:
modelV2.confidence = ConfidenceClass.green
elif score < ModelConstants.RYG_YELLOW:
modelV2.confidence = ConfidenceClass.yellow
else:
modelV2.confidence = ConfidenceClass.red
# raw prediction if enabled
if SEND_RAW_PRED:
modelV2.rawPredictions = net_output_data['raw_pred'].tobytes()
def fill_pose_msg(msg: capnp._DynamicStructBuilder, net_output_data: dict[str, np.ndarray],
vipc_frame_id: int, vipc_dropped_frames: int, timestamp_eof: int, live_calib_seen: bool) -> None:
msg.valid = live_calib_seen & (vipc_dropped_frames < 1)
cameraOdometry = msg.cameraOdometry
cameraOdometry.frameId = vipc_frame_id
cameraOdometry.timestampEof = timestamp_eof
cameraOdometry.trans = net_output_data['pose'][0,:3].tolist()
cameraOdometry.rot = net_output_data['pose'][0,3:].tolist()
cameraOdometry.wideFromDeviceEuler = net_output_data['wide_from_device_euler'][0,:].tolist()
cameraOdometry.roadTransformTrans = net_output_data['road_transform'][0,:3].tolist()
cameraOdometry.transStd = net_output_data['pose_stds'][0,:3].tolist()
cameraOdometry.rotStd = net_output_data['pose_stds'][0,3:].tolist()
cameraOdometry.wideFromDeviceEulerStd = net_output_data['wide_from_device_euler_stds'][0,:].tolist()
cameraOdometry.roadTransformTransStd = net_output_data['road_transform_stds'][0,:3].tolist()

View File

@@ -0,0 +1,28 @@
#!/usr/bin/env python3
import sys
import pathlib
import onnx
import codecs
import pickle
def get_name_and_shape(value_info:onnx.ValueInfoProto) -> tuple[str, tuple[int,...]]:
shape = tuple([int(dim.dim_value) for dim in value_info.type.tensor_type.shape.dim])
name = value_info.name
return name, shape
if __name__ == "__main__":
model_path = pathlib.Path(sys.argv[1])
model = onnx.load(str(model_path))
i = [x.key for x in model.metadata_props].index('output_slices')
output_slices = model.metadata_props[i].value
metadata = {}
metadata['output_slices'] = pickle.loads(codecs.decode(output_slices.encode(), "base64"))
metadata['input_shapes'] = dict([get_name_and_shape(x) for x in model.graph.input])
metadata['output_shapes'] = dict([get_name_and_shape(x) for x in model.graph.output])
metadata_path = model_path.parent / (model_path.stem + '_metadata.pkl')
with open(metadata_path, 'wb') as f:
pickle.dump(metadata, f)
print(f'saved metadata to {metadata_path}')

10
sunnypilot/modeld/modeld Executable file
View File

@@ -0,0 +1,10 @@
#!/usr/bin/env bash
DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" >/dev/null && pwd)"
cd "$DIR/../../"
if [ -f "$DIR/libthneed.so" ]; then
export LD_PRELOAD="$DIR/libthneed.so"
fi
exec "$DIR/modeld.py" "$@"

296
sunnypilot/modeld/modeld.py Executable file
View File

@@ -0,0 +1,296 @@
#!/usr/bin/env python3
import os
import time
import pickle
import numpy as np
import cereal.messaging as messaging
from cereal import car, log
from pathlib import Path
from setproctitle import setproctitle
from cereal.messaging import PubMaster, SubMaster
from msgq.visionipc import VisionIpcClient, VisionStreamType, VisionBuf
from opendbc.car.car_helpers import get_demo_car_params
from openpilot.common.swaglog import cloudlog
from openpilot.common.params import Params
from openpilot.common.filter_simple import FirstOrderFilter
from openpilot.common.realtime import config_realtime_process
from openpilot.common.transformations.camera import DEVICE_CAMERAS
from openpilot.common.transformations.model import get_warp_matrix
from openpilot.system import sentry
from openpilot.selfdrive.controls.lib.desire_helper import DesireHelper
from openpilot.sunnypilot.modeld.runners import ModelRunner, Runtime
from openpilot.sunnypilot.modeld.parse_model_outputs import Parser
from openpilot.sunnypilot.modeld.fill_model_msg import fill_model_msg, fill_pose_msg, PublishState
from openpilot.sunnypilot.modeld.constants import ModelConstants
from openpilot.sunnypilot.modeld.models.commonmodel_pyx import ModelFrame, CLContext
PROCESS_NAME = "sunnypilot.modeld.modeld"
SEND_RAW_PRED = os.getenv('SEND_RAW_PRED')
MODEL_PATHS = {
ModelRunner.THNEED: Path(__file__).parent / 'models/supercombo.thneed',
ModelRunner.ONNX: Path(__file__).parent / 'models/supercombo.onnx'}
METADATA_PATH = Path(__file__).parent / 'models/supercombo_metadata.pkl'
class FrameMeta:
frame_id: int = 0
timestamp_sof: int = 0
timestamp_eof: int = 0
def __init__(self, vipc=None):
if vipc is not None:
self.frame_id, self.timestamp_sof, self.timestamp_eof = vipc.frame_id, vipc.timestamp_sof, vipc.timestamp_eof
class ModelState:
frame: ModelFrame
wide_frame: ModelFrame
inputs: dict[str, np.ndarray]
output: np.ndarray
prev_desire: np.ndarray # for tracking the rising edge of the pulse
model: ModelRunner
def __init__(self, context: CLContext):
self.frame = ModelFrame(context)
self.wide_frame = ModelFrame(context)
self.prev_desire = np.zeros(ModelConstants.DESIRE_LEN, dtype=np.float32)
self.inputs = {
'desire': np.zeros(ModelConstants.DESIRE_LEN * (ModelConstants.HISTORY_BUFFER_LEN+1), dtype=np.float32),
'traffic_convention': np.zeros(ModelConstants.TRAFFIC_CONVENTION_LEN, dtype=np.float32),
'lateral_control_params': np.zeros(ModelConstants.LATERAL_CONTROL_PARAMS_LEN, dtype=np.float32),
'prev_desired_curv': np.zeros(ModelConstants.PREV_DESIRED_CURV_LEN * (ModelConstants.HISTORY_BUFFER_LEN+1), dtype=np.float32),
'features_buffer': np.zeros(ModelConstants.HISTORY_BUFFER_LEN * ModelConstants.FEATURE_LEN, dtype=np.float32),
}
with open(METADATA_PATH, 'rb') as f:
model_metadata = pickle.load(f)
self.output_slices = model_metadata['output_slices']
net_output_size = model_metadata['output_shapes']['outputs'][1]
self.output = np.zeros(net_output_size, dtype=np.float32)
self.parser = Parser()
self.model = ModelRunner(MODEL_PATHS, self.output, Runtime.GPU, False, context)
self.model.addInput("input_imgs", None)
self.model.addInput("big_input_imgs", None)
for k,v in self.inputs.items():
self.model.addInput(k, v)
def slice_outputs(self, model_outputs: np.ndarray) -> dict[str, np.ndarray]:
parsed_model_outputs = {k: model_outputs[np.newaxis, v] for k,v in self.output_slices.items()}
if SEND_RAW_PRED:
parsed_model_outputs['raw_pred'] = model_outputs.copy()
return parsed_model_outputs
def run(self, buf: VisionBuf, wbuf: VisionBuf, transform: np.ndarray, transform_wide: np.ndarray,
inputs: dict[str, np.ndarray], prepare_only: bool) -> dict[str, np.ndarray] | None:
# Model decides when action is completed, so desire input is just a pulse triggered on rising edge
inputs['desire'][0] = 0
self.inputs['desire'][:-ModelConstants.DESIRE_LEN] = self.inputs['desire'][ModelConstants.DESIRE_LEN:]
self.inputs['desire'][-ModelConstants.DESIRE_LEN:] = np.where(inputs['desire'] - self.prev_desire > .99, inputs['desire'], 0)
self.prev_desire[:] = inputs['desire']
self.inputs['traffic_convention'][:] = inputs['traffic_convention']
self.inputs['lateral_control_params'][:] = inputs['lateral_control_params']
# if getCLBuffer is not None, frame will be None
self.model.setInputBuffer("input_imgs", self.frame.prepare(buf, transform.flatten(), self.model.getCLBuffer("input_imgs")))
if wbuf is not None:
self.model.setInputBuffer("big_input_imgs", self.wide_frame.prepare(wbuf, transform_wide.flatten(), self.model.getCLBuffer("big_input_imgs")))
if prepare_only:
return None
self.model.execute()
outputs = self.parser.parse_outputs(self.slice_outputs(self.output))
self.inputs['features_buffer'][:-ModelConstants.FEATURE_LEN] = self.inputs['features_buffer'][ModelConstants.FEATURE_LEN:]
self.inputs['features_buffer'][-ModelConstants.FEATURE_LEN:] = outputs['hidden_state'][0, :]
self.inputs['prev_desired_curv'][:-ModelConstants.PREV_DESIRED_CURV_LEN] = self.inputs['prev_desired_curv'][ModelConstants.PREV_DESIRED_CURV_LEN:]
self.inputs['prev_desired_curv'][-ModelConstants.PREV_DESIRED_CURV_LEN:] = outputs['desired_curvature'][0, :]
return outputs
def main(demo=False):
cloudlog.warning("modeld init")
sentry.set_tag("daemon", PROCESS_NAME)
cloudlog.bind(daemon=PROCESS_NAME)
setproctitle(PROCESS_NAME)
config_realtime_process(7, 54)
cloudlog.warning("setting up CL context")
cl_context = CLContext()
cloudlog.warning("CL context ready; loading model")
model = ModelState(cl_context)
cloudlog.warning("models loaded, modeld starting")
# visionipc clients
while True:
available_streams = VisionIpcClient.available_streams("camerad", block=False)
if available_streams:
use_extra_client = VisionStreamType.VISION_STREAM_WIDE_ROAD in available_streams and VisionStreamType.VISION_STREAM_ROAD in available_streams
main_wide_camera = VisionStreamType.VISION_STREAM_ROAD not in available_streams
break
time.sleep(.1)
vipc_client_main_stream = VisionStreamType.VISION_STREAM_WIDE_ROAD if main_wide_camera else VisionStreamType.VISION_STREAM_ROAD
vipc_client_main = VisionIpcClient("camerad", vipc_client_main_stream, True, cl_context)
vipc_client_extra = VisionIpcClient("camerad", VisionStreamType.VISION_STREAM_WIDE_ROAD, False, cl_context)
cloudlog.warning(f"vision stream set up, main_wide_camera: {main_wide_camera}, use_extra_client: {use_extra_client}")
while not vipc_client_main.connect(False):
time.sleep(0.1)
while use_extra_client and not vipc_client_extra.connect(False):
time.sleep(0.1)
cloudlog.warning(f"connected main cam with buffer size: {vipc_client_main.buffer_len} ({vipc_client_main.width} x {vipc_client_main.height})")
if use_extra_client:
cloudlog.warning(f"connected extra cam with buffer size: {vipc_client_extra.buffer_len} ({vipc_client_extra.width} x {vipc_client_extra.height})")
# messaging
pm = PubMaster(["modelV2", "drivingModelData", "cameraOdometry"])
sm = SubMaster(["deviceState", "carState", "roadCameraState", "liveCalibration", "driverMonitoringState", "carControl"])
publish_state = PublishState()
params = Params()
# setup filter to track dropped frames
frame_dropped_filter = FirstOrderFilter(0., 10., 1. / ModelConstants.MODEL_FREQ)
frame_id = 0
last_vipc_frame_id = 0
run_count = 0
model_transform_main = np.zeros((3, 3), dtype=np.float32)
model_transform_extra = np.zeros((3, 3), dtype=np.float32)
live_calib_seen = False
buf_main, buf_extra = None, None
meta_main = FrameMeta()
meta_extra = FrameMeta()
if demo:
CP = get_demo_car_params()
else:
CP = messaging.log_from_bytes(params.get("CarParams", block=True), car.CarParams)
cloudlog.info("modeld got CarParams: %s", CP.carName)
# TODO this needs more thought, use .2s extra for now to estimate other delays
steer_delay = CP.steerActuatorDelay + .2
DH = DesireHelper()
while True:
# Keep receiving frames until we are at least 1 frame ahead of previous extra frame
while meta_main.timestamp_sof < meta_extra.timestamp_sof + 25000000:
buf_main = vipc_client_main.recv()
meta_main = FrameMeta(vipc_client_main)
if buf_main is None:
break
if buf_main is None:
cloudlog.debug("vipc_client_main no frame")
continue
if use_extra_client:
# Keep receiving extra frames until frame id matches main camera
while True:
buf_extra = vipc_client_extra.recv()
meta_extra = FrameMeta(vipc_client_extra)
if buf_extra is None or meta_main.timestamp_sof < meta_extra.timestamp_sof + 25000000:
break
if buf_extra is None:
cloudlog.debug("vipc_client_extra no frame")
continue
if abs(meta_main.timestamp_sof - meta_extra.timestamp_sof) > 10000000:
cloudlog.error(f"frames out of sync! main: {meta_main.frame_id} ({meta_main.timestamp_sof / 1e9:.5f}),\
extra: {meta_extra.frame_id} ({meta_extra.timestamp_sof / 1e9:.5f})")
else:
# Use single camera
buf_extra = buf_main
meta_extra = meta_main
sm.update(0)
desire = DH.desire
is_rhd = sm["driverMonitoringState"].isRHD
frame_id = sm["roadCameraState"].frameId
lateral_control_params = np.array([sm["carState"].vEgo, steer_delay], dtype=np.float32)
if sm.updated["liveCalibration"] and sm.seen['roadCameraState'] and sm.seen['deviceState']:
device_from_calib_euler = np.array(sm["liveCalibration"].rpyCalib, dtype=np.float32)
dc = DEVICE_CAMERAS[(str(sm['deviceState'].deviceType), str(sm['roadCameraState'].sensor))]
model_transform_main = get_warp_matrix(device_from_calib_euler, dc.ecam.intrinsics if main_wide_camera else dc.fcam.intrinsics, False).astype(np.float32)
model_transform_extra = get_warp_matrix(device_from_calib_euler, dc.ecam.intrinsics, True).astype(np.float32)
live_calib_seen = True
traffic_convention = np.zeros(2)
traffic_convention[int(is_rhd)] = 1
vec_desire = np.zeros(ModelConstants.DESIRE_LEN, dtype=np.float32)
if desire >= 0 and desire < ModelConstants.DESIRE_LEN:
vec_desire[desire] = 1
# tracked dropped frames
vipc_dropped_frames = max(0, meta_main.frame_id - last_vipc_frame_id - 1)
frames_dropped = frame_dropped_filter.update(min(vipc_dropped_frames, 10))
if run_count < 10: # let frame drops warm up
frame_dropped_filter.x = 0.
frames_dropped = 0.
run_count = run_count + 1
frame_drop_ratio = frames_dropped / (1 + frames_dropped)
prepare_only = vipc_dropped_frames > 0
if prepare_only:
cloudlog.error(f"skipping model eval. Dropped {vipc_dropped_frames} frames")
inputs:dict[str, np.ndarray] = {
'desire': vec_desire,
'traffic_convention': traffic_convention,
'lateral_control_params': lateral_control_params,
}
mt1 = time.perf_counter()
model_output = model.run(buf_main, buf_extra, model_transform_main, model_transform_extra, inputs, prepare_only)
mt2 = time.perf_counter()
model_execution_time = mt2 - mt1
if model_output is not None:
modelv2_send = messaging.new_message('modelV2')
drivingdata_send = messaging.new_message('drivingModelData')
posenet_send = messaging.new_message('cameraOdometry')
fill_model_msg(drivingdata_send, modelv2_send, model_output, publish_state, meta_main.frame_id, meta_extra.frame_id, frame_id,
frame_drop_ratio, meta_main.timestamp_eof, model_execution_time, live_calib_seen)
desire_state = modelv2_send.modelV2.meta.desireState
l_lane_change_prob = desire_state[log.Desire.laneChangeLeft]
r_lane_change_prob = desire_state[log.Desire.laneChangeRight]
lane_change_prob = l_lane_change_prob + r_lane_change_prob
DH.update(sm['carState'], sm['carControl'].latActive, lane_change_prob)
modelv2_send.modelV2.meta.laneChangeState = DH.lane_change_state
modelv2_send.modelV2.meta.laneChangeDirection = DH.lane_change_direction
drivingdata_send.drivingModelData.meta.laneChangeState = DH.lane_change_state
drivingdata_send.drivingModelData.meta.laneChangeDirection = DH.lane_change_direction
fill_pose_msg(posenet_send, model_output, meta_main.frame_id, vipc_dropped_frames, meta_main.timestamp_eof, live_calib_seen)
pm.send('modelV2', modelv2_send)
pm.send('drivingModelData', drivingdata_send)
pm.send('cameraOdometry', posenet_send)
last_vipc_frame_id = meta_main.frame_id
if __name__ == "__main__":
try:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--demo', action='store_true', help='A boolean for demo mode.')
args = parser.parse_args()
main(demo=args.demo)
except KeyboardInterrupt:
cloudlog.warning(f"child {PROCESS_NAME} got SIGINT")
except Exception:
sentry.capture_exception()
raise

View File

View File

@@ -0,0 +1,50 @@
#include "sunnypilot/modeld/models/commonmodel.h"
#include <cassert>
#include <cmath>
#include <cstring>
#include "common/clutil.h"
ModelFrame::ModelFrame(cl_device_id device_id, cl_context context) {
input_frames = std::make_unique<float[]>(buf_size);
q = CL_CHECK_ERR(clCreateCommandQueue(context, device_id, 0, &err));
y_cl = CL_CHECK_ERR(clCreateBuffer(context, CL_MEM_READ_WRITE, MODEL_WIDTH * MODEL_HEIGHT, NULL, &err));
u_cl = CL_CHECK_ERR(clCreateBuffer(context, CL_MEM_READ_WRITE, (MODEL_WIDTH / 2) * (MODEL_HEIGHT / 2), NULL, &err));
v_cl = CL_CHECK_ERR(clCreateBuffer(context, CL_MEM_READ_WRITE, (MODEL_WIDTH / 2) * (MODEL_HEIGHT / 2), NULL, &err));
net_input_cl = CL_CHECK_ERR(clCreateBuffer(context, CL_MEM_READ_WRITE, MODEL_FRAME_SIZE * sizeof(float), NULL, &err));
transform_init(&transform, context, device_id);
loadyuv_init(&loadyuv, context, device_id, MODEL_WIDTH, MODEL_HEIGHT);
}
float* ModelFrame::prepare(cl_mem yuv_cl, int frame_width, int frame_height, int frame_stride, int frame_uv_offset, const mat3 &projection, cl_mem *output) {
transform_queue(&this->transform, q,
yuv_cl, frame_width, frame_height, frame_stride, frame_uv_offset,
y_cl, u_cl, v_cl, MODEL_WIDTH, MODEL_HEIGHT, projection);
if (output == NULL) {
loadyuv_queue(&loadyuv, q, y_cl, u_cl, v_cl, net_input_cl);
std::memmove(&input_frames[0], &input_frames[MODEL_FRAME_SIZE], sizeof(float) * MODEL_FRAME_SIZE);
CL_CHECK(clEnqueueReadBuffer(q, net_input_cl, CL_TRUE, 0, MODEL_FRAME_SIZE * sizeof(float), &input_frames[MODEL_FRAME_SIZE], 0, nullptr, nullptr));
clFinish(q);
return &input_frames[0];
} else {
loadyuv_queue(&loadyuv, q, y_cl, u_cl, v_cl, *output, true);
// NOTE: Since thneed is using a different command queue, this clFinish is needed to ensure the image is ready.
clFinish(q);
return NULL;
}
}
ModelFrame::~ModelFrame() {
transform_destroy(&transform);
loadyuv_destroy(&loadyuv);
CL_CHECK(clReleaseMemObject(net_input_cl));
CL_CHECK(clReleaseMemObject(v_cl));
CL_CHECK(clReleaseMemObject(u_cl));
CL_CHECK(clReleaseMemObject(y_cl));
CL_CHECK(clReleaseCommandQueue(q));
}

View File

@@ -0,0 +1,36 @@
#pragma once
#include <cfloat>
#include <cstdlib>
#include <memory>
#define CL_USE_DEPRECATED_OPENCL_1_2_APIS
#ifdef __APPLE__
#include <OpenCL/cl.h>
#else
#include <CL/cl.h>
#endif
#include "common/mat.h"
#include "sunnypilot/modeld/transforms/loadyuv.h"
#include "sunnypilot/modeld/transforms/transform.h"
class ModelFrame {
public:
ModelFrame(cl_device_id device_id, cl_context context);
~ModelFrame();
float* prepare(cl_mem yuv_cl, int width, int height, int frame_stride, int frame_uv_offset, const mat3& transform, cl_mem *output);
const int MODEL_WIDTH = 512;
const int MODEL_HEIGHT = 256;
const int MODEL_FRAME_SIZE = MODEL_WIDTH * MODEL_HEIGHT * 3 / 2;
const int buf_size = MODEL_FRAME_SIZE * 2;
private:
Transform transform;
LoadYUVState loadyuv;
cl_command_queue q;
cl_mem y_cl, u_cl, v_cl, net_input_cl;
std::unique_ptr<float[]> input_frames;
};

View File

@@ -0,0 +1,18 @@
# distutils: language = c++
from msgq.visionipc.visionipc cimport cl_device_id, cl_context, cl_mem
cdef extern from "common/mat.h":
cdef struct mat3:
float v[9]
cdef extern from "common/clutil.h":
cdef unsigned long CL_DEVICE_TYPE_DEFAULT
cl_device_id cl_get_device_id(unsigned long)
cl_context cl_create_context(cl_device_id)
cdef extern from "sunnypilot/modeld/models/commonmodel.h":
cppclass ModelFrame:
int buf_size
ModelFrame(cl_device_id, cl_context)
float * prepare(cl_mem, int, int, int, int, mat3, cl_mem*)

View File

@@ -0,0 +1,13 @@
# distutils: language = c++
from msgq.visionipc.visionipc cimport cl_mem
from msgq.visionipc.visionipc_pyx cimport CLContext as BaseCLContext
cdef class CLContext(BaseCLContext):
pass
cdef class CLMem:
cdef cl_mem * mem
@staticmethod
cdef create(void*)

View File

@@ -0,0 +1,45 @@
# distutils: language = c++
# cython: c_string_encoding=ascii, language_level=3
import numpy as np
cimport numpy as cnp
from libc.string cimport memcpy
from msgq.visionipc.visionipc cimport cl_mem
from msgq.visionipc.visionipc_pyx cimport VisionBuf, CLContext as BaseCLContext
from openpilot.sunnypilot.modeld.models.commonmodel cimport CL_DEVICE_TYPE_DEFAULT, cl_get_device_id, cl_create_context
from openpilot.sunnypilot.modeld.models.commonmodel cimport mat3, ModelFrame as cppModelFrame
cdef class CLContext(BaseCLContext):
def __cinit__(self):
self.device_id = cl_get_device_id(CL_DEVICE_TYPE_DEFAULT)
self.context = cl_create_context(self.device_id)
cdef class CLMem:
@staticmethod
cdef create(void * cmem):
mem = CLMem()
mem.mem = <cl_mem*> cmem
return mem
cdef class ModelFrame:
cdef cppModelFrame * frame
def __cinit__(self, CLContext context):
self.frame = new cppModelFrame(context.device_id, context.context)
def __dealloc__(self):
del self.frame
def prepare(self, VisionBuf buf, float[:] projection, CLMem output):
cdef mat3 cprojection
memcpy(cprojection.v, &projection[0], 9*sizeof(float))
cdef float * data
if output is None:
data = self.frame.prepare(buf.buf.buf_cl, buf.width, buf.height, buf.stride, buf.uv_offset, cprojection, NULL)
else:
data = self.frame.prepare(buf.buf.buf_cl, buf.width, buf.height, buf.stride, buf.uv_offset, cprojection, output.mem)
if not data:
return None
return np.asarray(<cnp.float32_t[:self.frame.buf_size]> data)

Binary file not shown.

View File

@@ -0,0 +1,105 @@
import numpy as np
from openpilot.sunnypilot.modeld.constants import ModelConstants
def safe_exp(x, out=None):
# -11 is around 10**14, more causes float16 overflow
return np.exp(np.clip(x, -np.inf, 11), out=out)
def sigmoid(x):
return 1. / (1. + safe_exp(-x))
def softmax(x, axis=-1):
x -= np.max(x, axis=axis, keepdims=True)
if x.dtype == np.float32 or x.dtype == np.float64:
safe_exp(x, out=x)
else:
x = safe_exp(x)
x /= np.sum(x, axis=axis, keepdims=True)
return x
class Parser:
def __init__(self, ignore_missing=False):
self.ignore_missing = ignore_missing
def check_missing(self, outs, name):
if name not in outs and not self.ignore_missing:
raise ValueError(f"Missing output {name}")
return name not in outs
def parse_categorical_crossentropy(self, name, outs, out_shape=None):
if self.check_missing(outs, name):
return
raw = outs[name]
if out_shape is not None:
raw = raw.reshape((raw.shape[0],) + out_shape)
outs[name] = softmax(raw, axis=-1)
def parse_binary_crossentropy(self, name, outs):
if self.check_missing(outs, name):
return
raw = outs[name]
outs[name] = sigmoid(raw)
def parse_mdn(self, name, outs, in_N=0, out_N=1, out_shape=None):
if self.check_missing(outs, name):
return
raw = outs[name]
raw = raw.reshape((raw.shape[0], max(in_N, 1), -1))
n_values = (raw.shape[2] - out_N)//2
pred_mu = raw[:,:,:n_values]
pred_std = safe_exp(raw[:,:,n_values: 2*n_values])
if in_N > 1:
weights = np.zeros((raw.shape[0], in_N, out_N), dtype=raw.dtype)
for i in range(out_N):
weights[:,:,i - out_N] = softmax(raw[:,:,i - out_N], axis=-1)
if out_N == 1:
for fidx in range(weights.shape[0]):
idxs = np.argsort(weights[fidx][:,0])[::-1]
weights[fidx] = weights[fidx][idxs]
pred_mu[fidx] = pred_mu[fidx][idxs]
pred_std[fidx] = pred_std[fidx][idxs]
full_shape = tuple([raw.shape[0], in_N] + list(out_shape))
outs[name + '_weights'] = weights
outs[name + '_hypotheses'] = pred_mu.reshape(full_shape)
outs[name + '_stds_hypotheses'] = pred_std.reshape(full_shape)
pred_mu_final = np.zeros((raw.shape[0], out_N, n_values), dtype=raw.dtype)
pred_std_final = np.zeros((raw.shape[0], out_N, n_values), dtype=raw.dtype)
for fidx in range(weights.shape[0]):
for hidx in range(out_N):
idxs = np.argsort(weights[fidx,:,hidx])[::-1]
pred_mu_final[fidx, hidx] = pred_mu[fidx, idxs[0]]
pred_std_final[fidx, hidx] = pred_std[fidx, idxs[0]]
else:
pred_mu_final = pred_mu
pred_std_final = pred_std
if out_N > 1:
final_shape = tuple([raw.shape[0], out_N] + list(out_shape))
else:
final_shape = tuple([raw.shape[0],] + list(out_shape))
outs[name] = pred_mu_final.reshape(final_shape)
outs[name + '_stds'] = pred_std_final.reshape(final_shape)
def parse_outputs(self, outs: dict[str, np.ndarray]) -> dict[str, np.ndarray]:
self.parse_mdn('plan', outs, in_N=ModelConstants.PLAN_MHP_N, out_N=ModelConstants.PLAN_MHP_SELECTION,
out_shape=(ModelConstants.IDX_N,ModelConstants.PLAN_WIDTH))
self.parse_mdn('lane_lines', outs, in_N=0, out_N=0, out_shape=(ModelConstants.NUM_LANE_LINES,ModelConstants.IDX_N,ModelConstants.LANE_LINES_WIDTH))
self.parse_mdn('road_edges', outs, in_N=0, out_N=0, out_shape=(ModelConstants.NUM_ROAD_EDGES,ModelConstants.IDX_N,ModelConstants.LANE_LINES_WIDTH))
self.parse_mdn('pose', outs, in_N=0, out_N=0, out_shape=(ModelConstants.POSE_WIDTH,))
self.parse_mdn('road_transform', outs, in_N=0, out_N=0, out_shape=(ModelConstants.POSE_WIDTH,))
self.parse_mdn('wide_from_device_euler', outs, in_N=0, out_N=0, out_shape=(ModelConstants.WIDE_FROM_DEVICE_WIDTH,))
self.parse_mdn('lead', outs, in_N=ModelConstants.LEAD_MHP_N, out_N=ModelConstants.LEAD_MHP_SELECTION,
out_shape=(ModelConstants.LEAD_TRAJ_LEN,ModelConstants.LEAD_WIDTH))
if 'lat_planner_solution' in outs:
self.parse_mdn('lat_planner_solution', outs, in_N=0, out_N=0, out_shape=(ModelConstants.IDX_N,ModelConstants.LAT_PLANNER_SOLUTION_WIDTH))
if 'desired_curvature' in outs:
self.parse_mdn('desired_curvature', outs, in_N=0, out_N=0, out_shape=(ModelConstants.DESIRED_CURV_WIDTH,))
for k in ['lead_prob', 'lane_lines_prob', 'meta']:
self.parse_binary_crossentropy(k, outs)
self.parse_categorical_crossentropy('desire_state', outs, out_shape=(ModelConstants.DESIRE_PRED_WIDTH,))
self.parse_categorical_crossentropy('desire_pred', outs, out_shape=(ModelConstants.DESIRE_PRED_LEN,ModelConstants.DESIRE_PRED_WIDTH))
return outs

View File

@@ -0,0 +1,27 @@
import os
from openpilot.system.hardware import TICI
from openpilot.sunnypilot.modeld.runners.runmodel_pyx import RunModel, Runtime
assert Runtime
USE_THNEED = int(os.getenv('USE_THNEED', str(int(TICI))))
USE_SNPE = int(os.getenv('USE_SNPE', str(int(TICI))))
class ModelRunner(RunModel):
THNEED = 'THNEED'
SNPE = 'SNPE'
ONNX = 'ONNX'
def __new__(cls, paths, *args, **kwargs):
if ModelRunner.THNEED in paths and USE_THNEED:
from openpilot.sunnypilot.modeld.runners.thneedmodel_pyx import ThneedModel as Runner
runner_type = ModelRunner.THNEED
elif ModelRunner.SNPE in paths and USE_SNPE:
from openpilot.sunnypilot.modeld.runners.snpemodel_pyx import SNPEModel as Runner
runner_type = ModelRunner.SNPE
elif ModelRunner.ONNX in paths:
from openpilot.sunnypilot.modeld.runners.onnxmodel import ONNXModel as Runner
runner_type = ModelRunner.ONNX
else:
raise Exception("Couldn't select a model runner, make sure to pass at least one valid model path")
return Runner(str(paths[runner_type]), *args, **kwargs)

View File

@@ -0,0 +1,98 @@
import onnx
import itertools
import os
import sys
import numpy as np
from typing import Any
from openpilot.sunnypilot.modeld.runners.runmodel_pyx import RunModel
ORT_TYPES_TO_NP_TYPES = {'tensor(float16)': np.float16, 'tensor(float)': np.float32, 'tensor(uint8)': np.uint8}
def attributeproto_fp16_to_fp32(attr):
float32_list = np.frombuffer(attr.raw_data, dtype=np.float16)
attr.data_type = 1
attr.raw_data = float32_list.astype(np.float32).tobytes()
def convert_fp16_to_fp32(onnx_path_or_bytes):
if isinstance(onnx_path_or_bytes, bytes):
model = onnx.load_from_string(onnx_path_or_bytes)
elif isinstance(onnx_path_or_bytes, str):
model = onnx.load(onnx_path_or_bytes)
for i in model.graph.initializer:
if i.data_type == 10:
attributeproto_fp16_to_fp32(i)
for i in itertools.chain(model.graph.input, model.graph.output):
if i.type.tensor_type.elem_type == 10:
i.type.tensor_type.elem_type = 1
for i in model.graph.node:
if i.op_type == 'Cast' and i.attribute[0].i == 10:
i.attribute[0].i = 1
for a in i.attribute:
if hasattr(a, 't'):
if a.t.data_type == 10:
attributeproto_fp16_to_fp32(a.t)
return model.SerializeToString()
def create_ort_session(path, fp16_to_fp32):
os.environ["OMP_NUM_THREADS"] = "4"
os.environ["OMP_WAIT_POLICY"] = "PASSIVE"
import onnxruntime as ort
print("Onnx available providers: ", ort.get_available_providers(), file=sys.stderr)
options = ort.SessionOptions()
options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL
provider: str | tuple[str, dict[Any, Any]]
if 'OpenVINOExecutionProvider' in ort.get_available_providers() and 'ONNXCPU' not in os.environ:
provider = 'OpenVINOExecutionProvider'
elif 'CUDAExecutionProvider' in ort.get_available_providers() and 'ONNXCPU' not in os.environ:
options.intra_op_num_threads = 2
provider = ('CUDAExecutionProvider', {'cudnn_conv_algo_search': 'DEFAULT'})
else:
options.intra_op_num_threads = 2
options.execution_mode = ort.ExecutionMode.ORT_SEQUENTIAL
options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
provider = 'CPUExecutionProvider'
model_data = convert_fp16_to_fp32(path) if fp16_to_fp32 else path
print("Onnx selected provider: ", [provider], file=sys.stderr)
ort_session = ort.InferenceSession(model_data, options, providers=[provider])
print("Onnx using ", ort_session.get_providers(), file=sys.stderr)
return ort_session
class ONNXModel(RunModel):
def __init__(self, path, output, runtime, use_tf8, cl_context):
self.inputs = {}
self.output = output
self.session = create_ort_session(path, fp16_to_fp32=True)
self.input_names = [x.name for x in self.session.get_inputs()]
self.input_shapes = {x.name: [1, *x.shape[1:]] for x in self.session.get_inputs()}
self.input_dtypes = {x.name: ORT_TYPES_TO_NP_TYPES[x.type] for x in self.session.get_inputs()}
# run once to initialize CUDA provider
if "CUDAExecutionProvider" in self.session.get_providers():
self.session.run(None, {k: np.zeros(self.input_shapes[k], dtype=self.input_dtypes[k]) for k in self.input_names})
print("ready to run onnx model", self.input_shapes, file=sys.stderr)
def addInput(self, name, buffer):
assert name in self.input_names
self.inputs[name] = buffer
def setInputBuffer(self, name, buffer):
assert name in self.inputs
self.inputs[name] = buffer
def getCLBuffer(self, name):
return None
def execute(self):
inputs = {k: v.view(self.input_dtypes[k]) for k,v in self.inputs.items()}
inputs = {k: v.reshape(self.input_shapes[k]).astype(self.input_dtypes[k]) for k,v in inputs.items()}
outputs = self.session.run(None, inputs)
assert len(outputs) == 1, "Only single model outputs are supported"
self.output[:] = outputs[0]
return self.output

View File

@@ -0,0 +1,4 @@
#pragma once
#include "sunnypilot/modeld/runners/runmodel.h"
#include "sunnypilot/modeld/runners/snpemodel.h"

View File

@@ -0,0 +1,49 @@
#pragma once
#include <string>
#include <vector>
#include <memory>
#include <cassert>
#include "common/clutil.h"
#include "common/swaglog.h"
#define USE_CPU_RUNTIME 0
#define USE_GPU_RUNTIME 1
#define USE_DSP_RUNTIME 2
struct ModelInput {
const std::string name;
float *buffer;
int size;
ModelInput(const std::string _name, float *_buffer, int _size) : name(_name), buffer(_buffer), size(_size) {}
virtual void setBuffer(float *_buffer, int _size) {
assert(size == _size || size == 0);
buffer = _buffer;
size = _size;
}
};
class RunModel {
public:
std::vector<std::unique_ptr<ModelInput>> inputs;
virtual ~RunModel() {}
virtual void execute() {}
virtual void* getCLBuffer(const std::string name) { return nullptr; }
virtual void addInput(const std::string name, float *buffer, int size) {
inputs.push_back(std::unique_ptr<ModelInput>(new ModelInput(name, buffer, size)));
}
virtual void setInputBuffer(const std::string name, float *buffer, int size) {
for (auto &input : inputs) {
if (name == input->name) {
input->setBuffer(buffer, size);
return;
}
}
LOGE("Tried to update input `%s` but no input with this name exists", name.c_str());
assert(false);
}
};

View File

@@ -0,0 +1,14 @@
# distutils: language = c++
from libcpp.string cimport string
cdef extern from "sunnypilot/modeld/runners/runmodel.h":
cdef int USE_CPU_RUNTIME
cdef int USE_GPU_RUNTIME
cdef int USE_DSP_RUNTIME
cdef cppclass RunModel:
void addInput(string, float*, int)
void setInputBuffer(string, float*, int)
void * getCLBuffer(string)
void execute()

View File

@@ -0,0 +1,6 @@
# distutils: language = c++
from .runmodel cimport RunModel as cppRunModel
cdef class RunModel:
cdef cppRunModel * model

View File

@@ -0,0 +1,37 @@
# distutils: language = c++
# cython: c_string_encoding=ascii, language_level=3
from libcpp.string cimport string
from .runmodel cimport USE_CPU_RUNTIME, USE_GPU_RUNTIME, USE_DSP_RUNTIME
from sunnypilot.modeld.models.commonmodel_pyx cimport CLMem
class Runtime:
CPU = USE_CPU_RUNTIME
GPU = USE_GPU_RUNTIME
DSP = USE_DSP_RUNTIME
cdef class RunModel:
def __dealloc__(self):
del self.model
def addInput(self, string name, float[:] buffer):
if buffer is not None:
self.model.addInput(name, &buffer[0], len(buffer))
else:
self.model.addInput(name, NULL, 0)
def setInputBuffer(self, string name, float[:] buffer):
if buffer is not None:
self.model.setInputBuffer(name, &buffer[0], len(buffer))
else:
self.model.setInputBuffer(name, NULL, 0)
def getCLBuffer(self, string name):
cdef void * cl_buf = self.model.getCLBuffer(name)
if not cl_buf:
return None
return CLMem.create(cl_buf)
def execute(self):
self.model.execute()

View File

@@ -0,0 +1,116 @@
#pragma clang diagnostic ignored "-Wexceptions"
#include "sunnypilot/modeld/runners/snpemodel.h"
#include <cstring>
#include <memory>
#include <string>
#include <utility>
#include <vector>
#include "common/util.h"
#include "common/timing.h"
void PrintErrorStringAndExit() {
std::cerr << zdl::DlSystem::getLastErrorString() << std::endl;
std::exit(EXIT_FAILURE);
}
SNPEModel::SNPEModel(const std::string path, float *_output, size_t _output_size, int runtime, bool _use_tf8, cl_context context) {
output = _output;
output_size = _output_size;
use_tf8 = _use_tf8;
#ifdef QCOM2
if (runtime == USE_GPU_RUNTIME) {
snpe_runtime = zdl::DlSystem::Runtime_t::GPU;
} else if (runtime == USE_DSP_RUNTIME) {
snpe_runtime = zdl::DlSystem::Runtime_t::DSP;
} else {
snpe_runtime = zdl::DlSystem::Runtime_t::CPU;
}
assert(zdl::SNPE::SNPEFactory::isRuntimeAvailable(snpe_runtime));
#endif
model_data = util::read_file(path);
assert(model_data.size() > 0);
// load model
std::unique_ptr<zdl::DlContainer::IDlContainer> container = zdl::DlContainer::IDlContainer::open((uint8_t*)model_data.data(), model_data.size());
if (!container) { PrintErrorStringAndExit(); }
LOGW("loaded model with size: %lu", model_data.size());
// create model runner
zdl::SNPE::SNPEBuilder snpe_builder(container.get());
while (!snpe) {
#ifdef QCOM2
snpe = snpe_builder.setOutputLayers({})
.setRuntimeProcessor(snpe_runtime)
.setUseUserSuppliedBuffers(true)
.setPerformanceProfile(zdl::DlSystem::PerformanceProfile_t::HIGH_PERFORMANCE)
.build();
#else
snpe = snpe_builder.setOutputLayers({})
.setUseUserSuppliedBuffers(true)
.setPerformanceProfile(zdl::DlSystem::PerformanceProfile_t::HIGH_PERFORMANCE)
.build();
#endif
if (!snpe) std::cerr << zdl::DlSystem::getLastErrorString() << std::endl;
}
// create output buffer
zdl::DlSystem::UserBufferEncodingFloat ub_encoding_float;
zdl::DlSystem::IUserBufferFactory &ub_factory = zdl::SNPE::SNPEFactory::getUserBufferFactory();
const auto &output_tensor_names_opt = snpe->getOutputTensorNames();
if (!output_tensor_names_opt) throw std::runtime_error("Error obtaining output tensor names");
const auto &output_tensor_names = *output_tensor_names_opt;
assert(output_tensor_names.size() == 1);
const char *output_tensor_name = output_tensor_names.at(0);
const zdl::DlSystem::TensorShape &buffer_shape = snpe->getInputOutputBufferAttributes(output_tensor_name)->getDims();
if (output_size != 0) {
assert(output_size == buffer_shape[1]);
} else {
output_size = buffer_shape[1];
}
std::vector<size_t> output_strides = {output_size * sizeof(float), sizeof(float)};
output_buffer = ub_factory.createUserBuffer(output, output_size * sizeof(float), output_strides, &ub_encoding_float);
output_map.add(output_tensor_name, output_buffer.get());
}
void SNPEModel::addInput(const std::string name, float *buffer, int size) {
const int idx = inputs.size();
const auto &input_tensor_names_opt = snpe->getInputTensorNames();
if (!input_tensor_names_opt) throw std::runtime_error("Error obtaining input tensor names");
const auto &input_tensor_names = *input_tensor_names_opt;
const char *input_tensor_name = input_tensor_names.at(idx);
const bool input_tf8 = use_tf8 && strcmp(input_tensor_name, "input_img") == 0; // TODO: This is a terrible hack, get rid of this name check both here and in onnx_runner.py
LOGW("adding index %d: %s", idx, input_tensor_name);
zdl::DlSystem::UserBufferEncodingFloat ub_encoding_float;
zdl::DlSystem::UserBufferEncodingTf8 ub_encoding_tf8(0, 1./255); // network takes 0-1
zdl::DlSystem::IUserBufferFactory &ub_factory = zdl::SNPE::SNPEFactory::getUserBufferFactory();
zdl::DlSystem::UserBufferEncoding *input_encoding = input_tf8 ? (zdl::DlSystem::UserBufferEncoding*)&ub_encoding_tf8 : (zdl::DlSystem::UserBufferEncoding*)&ub_encoding_float;
const auto &buffer_shape_opt = snpe->getInputDimensions(input_tensor_name);
const zdl::DlSystem::TensorShape &buffer_shape = *buffer_shape_opt;
size_t size_of_input = input_tf8 ? sizeof(uint8_t) : sizeof(float);
std::vector<size_t> strides(buffer_shape.rank());
strides[strides.size() - 1] = size_of_input;
size_t product = 1;
for (size_t i = 0; i < buffer_shape.rank(); i++) product *= buffer_shape[i];
size_t stride = strides[strides.size() - 1];
for (size_t i = buffer_shape.rank() - 1; i > 0; i--) {
stride *= buffer_shape[i];
strides[i-1] = stride;
}
auto input_buffer = ub_factory.createUserBuffer(buffer, product*size_of_input, strides, input_encoding);
input_map.add(input_tensor_name, input_buffer.get());
inputs.push_back(std::unique_ptr<SNPEModelInput>(new SNPEModelInput(name, buffer, size, std::move(input_buffer))));
}
void SNPEModel::execute() {
if (!snpe->execute(input_map, output_map)) {
PrintErrorStringAndExit();
}
}

View File

@@ -0,0 +1,52 @@
#pragma once
#pragma clang diagnostic ignored "-Wdeprecated-declarations"
#include <memory>
#include <string>
#include <utility>
#include <DlContainer/IDlContainer.hpp>
#include <DlSystem/DlError.hpp>
#include <DlSystem/ITensor.hpp>
#include <DlSystem/ITensorFactory.hpp>
#include <DlSystem/IUserBuffer.hpp>
#include <DlSystem/IUserBufferFactory.hpp>
#include <SNPE/SNPE.hpp>
#include <SNPE/SNPEBuilder.hpp>
#include <SNPE/SNPEFactory.hpp>
#include "sunnypilot/modeld/runners/runmodel.h"
struct SNPEModelInput : public ModelInput {
std::unique_ptr<zdl::DlSystem::IUserBuffer> snpe_buffer;
SNPEModelInput(const std::string _name, float *_buffer, int _size, std::unique_ptr<zdl::DlSystem::IUserBuffer> _snpe_buffer) : ModelInput(_name, _buffer, _size), snpe_buffer(std::move(_snpe_buffer)) {}
void setBuffer(float *_buffer, int _size) {
ModelInput::setBuffer(_buffer, _size);
assert(snpe_buffer->setBufferAddress(_buffer) == true);
}
};
class SNPEModel : public RunModel {
public:
SNPEModel(const std::string path, float *_output, size_t _output_size, int runtime, bool use_tf8 = false, cl_context context = NULL);
void addInput(const std::string name, float *buffer, int size);
void execute();
private:
std::string model_data;
#ifdef QCOM2
zdl::DlSystem::Runtime_t snpe_runtime;
#endif
// snpe model stuff
std::unique_ptr<zdl::SNPE::SNPE> snpe;
zdl::DlSystem::UserBufferMap input_map;
zdl::DlSystem::UserBufferMap output_map;
std::unique_ptr<zdl::DlSystem::IUserBuffer> output_buffer;
bool use_tf8;
float *output;
size_t output_size;
};

View File

@@ -0,0 +1,9 @@
# distutils: language = c++
from libcpp.string cimport string
from msgq.visionipc.visionipc cimport cl_context
cdef extern from "sunnypilot/modeld/runners/snpemodel.h":
cdef cppclass SNPEModel:
SNPEModel(string, float*, size_t, int, bool, cl_context)

View File

@@ -0,0 +1,17 @@
# distutils: language = c++
# cython: c_string_encoding=ascii, language_level=3
import os
from libcpp cimport bool
from libcpp.string cimport string
from .snpemodel cimport SNPEModel as cppSNPEModel
from sunnypilot.modeld.models.commonmodel_pyx cimport CLContext
from sunnypilot.modeld.runners.runmodel_pyx cimport RunModel
from sunnypilot.modeld.runners.runmodel cimport RunModel as cppRunModel
os.environ['ADSP_LIBRARY_PATH'] = "/data/pythonpath/third_party/snpe/dsp/"
cdef class SNPEModel(RunModel):
def __cinit__(self, string path, float[:] output, int runtime, bool use_tf8, CLContext context):
self.model = <cppRunModel *> new cppSNPEModel(path, &output[0], len(output), runtime, use_tf8, context.context)

View File

@@ -0,0 +1,58 @@
#include "sunnypilot/modeld/runners/thneedmodel.h"
#include <string>
#include "common/swaglog.h"
ThneedModel::ThneedModel(const std::string path, float *_output, size_t _output_size, int runtime, bool luse_tf8, cl_context context) {
thneed = new Thneed(true, context);
thneed->load(path.c_str());
thneed->clexec();
recorded = false;
output = _output;
}
void* ThneedModel::getCLBuffer(const std::string name) {
int index = -1;
for (int i = 0; i < inputs.size(); i++) {
if (name == inputs[i]->name) {
index = i;
break;
}
}
if (index == -1) {
LOGE("Tried to get CL buffer for input `%s` but no input with this name exists", name.c_str());
assert(false);
}
if (thneed->input_clmem.size() >= inputs.size()) {
return &thneed->input_clmem[inputs.size() - index - 1];
} else {
return nullptr;
}
}
void ThneedModel::execute() {
if (!recorded) {
thneed->record = true;
float *input_buffers[inputs.size()];
for (int i = 0; i < inputs.size(); i++) {
input_buffers[inputs.size() - i - 1] = inputs[i]->buffer;
}
thneed->copy_inputs(input_buffers);
thneed->clexec();
thneed->copy_output(output);
thneed->stop();
recorded = true;
} else {
float *input_buffers[inputs.size()];
for (int i = 0; i < inputs.size(); i++) {
input_buffers[inputs.size() - i - 1] = inputs[i]->buffer;
}
thneed->execute(input_buffers, output);
}
}

View File

@@ -0,0 +1,17 @@
#pragma once
#include <string>
#include "sunnypilot/modeld/runners/runmodel.h"
#include "sunnypilot/modeld/thneed/thneed.h"
class ThneedModel : public RunModel {
public:
ThneedModel(const std::string path, float *_output, size_t _output_size, int runtime, bool use_tf8 = false, cl_context context = NULL);
void *getCLBuffer(const std::string name);
void execute();
private:
Thneed *thneed = NULL;
bool recorded;
float *output;
};

View File

@@ -0,0 +1,9 @@
# distutils: language = c++
from libcpp.string cimport string
from msgq.visionipc.visionipc cimport cl_context
cdef extern from "sunnypilot/modeld/runners/thneedmodel.h":
cdef cppclass ThneedModel:
ThneedModel(string, float*, size_t, int, bool, cl_context)

View File

@@ -0,0 +1,14 @@
# distutils: language = c++
# cython: c_string_encoding=ascii, language_level=3
from libcpp cimport bool
from libcpp.string cimport string
from .thneedmodel cimport ThneedModel as cppThneedModel
from sunnypilot.modeld.models.commonmodel_pyx cimport CLContext
from sunnypilot.modeld.runners.runmodel_pyx cimport RunModel
from sunnypilot.modeld.runners.runmodel cimport RunModel as cppRunModel
cdef class ThneedModel(RunModel):
def __cinit__(self, string path, float[:] output, int runtime, bool use_tf8, CLContext context):
self.model = <cppRunModel *> new cppThneedModel(path, &output[0], len(output), runtime, use_tf8, context.context)

View File

@@ -0,0 +1,8 @@
thneed is an SNPE accelerator. I know SNPE is already an accelerator, but sometimes things need to go even faster..
It runs on the local device, and caches a single model run. Then it replays it, but fast.
thneed slices through abstraction layers like a fish.
You need a thneed.

View File

View File

@@ -0,0 +1,154 @@
#include <cassert>
#include <set>
#include "third_party/json11/json11.hpp"
#include "common/util.h"
#include "common/clutil.h"
#include "common/swaglog.h"
#include "sunnypilot/modeld/thneed/thneed.h"
using namespace json11;
extern map<cl_program, string> g_program_source;
void Thneed::load(const char *filename) {
LOGD("Thneed::load: loading from %s\n", filename);
string buf = util::read_file(filename);
int jsz = *(int *)buf.data();
string jsonerr;
string jj(buf.data() + sizeof(int), jsz);
Json jdat = Json::parse(jj, jsonerr);
map<cl_mem, cl_mem> real_mem;
real_mem[NULL] = NULL;
int ptr = sizeof(int)+jsz;
for (auto &obj : jdat["objects"].array_items()) {
auto mobj = obj.object_items();
int sz = mobj["size"].int_value();
cl_mem clbuf = NULL;
if (mobj["buffer_id"].string_value().size() > 0) {
// image buffer must already be allocated
clbuf = real_mem[*(cl_mem*)(mobj["buffer_id"].string_value().data())];
assert(mobj["needs_load"].bool_value() == false);
} else {
if (mobj["needs_load"].bool_value()) {
clbuf = clCreateBuffer(context, CL_MEM_COPY_HOST_PTR | CL_MEM_READ_WRITE, sz, &buf[ptr], NULL);
if (debug >= 1) printf("loading %p %d @ 0x%X\n", clbuf, sz, ptr);
ptr += sz;
} else {
// TODO: is there a faster way to init zeroed out buffers?
void *host_zeros = calloc(sz, 1);
clbuf = clCreateBuffer(context, CL_MEM_COPY_HOST_PTR | CL_MEM_READ_WRITE, sz, host_zeros, NULL);
free(host_zeros);
}
}
assert(clbuf != NULL);
if (mobj["arg_type"] == "image2d_t" || mobj["arg_type"] == "image1d_t") {
cl_image_desc desc = {0};
desc.image_type = (mobj["arg_type"] == "image2d_t") ? CL_MEM_OBJECT_IMAGE2D : CL_MEM_OBJECT_IMAGE1D_BUFFER;
desc.image_width = mobj["width"].int_value();
desc.image_height = mobj["height"].int_value();
desc.image_row_pitch = mobj["row_pitch"].int_value();
assert(sz == desc.image_height*desc.image_row_pitch);
#ifdef QCOM2
desc.buffer = clbuf;
#else
// TODO: we are creating unused buffers on PC
clReleaseMemObject(clbuf);
#endif
cl_image_format format = {0};
format.image_channel_order = CL_RGBA;
format.image_channel_data_type = mobj["float32"].bool_value() ? CL_FLOAT : CL_HALF_FLOAT;
cl_int errcode;
#ifndef QCOM2
if (mobj["needs_load"].bool_value()) {
clbuf = clCreateImage(context, CL_MEM_COPY_HOST_PTR | CL_MEM_READ_WRITE, &format, &desc, &buf[ptr-sz], &errcode);
} else {
clbuf = clCreateImage(context, CL_MEM_READ_WRITE, &format, &desc, NULL, &errcode);
}
#else
clbuf = clCreateImage(context, CL_MEM_READ_WRITE, &format, &desc, NULL, &errcode);
#endif
if (clbuf == NULL) {
LOGE("clError: %s create image %zux%zu rp %zu with buffer %p\n", cl_get_error_string(errcode),
desc.image_width, desc.image_height, desc.image_row_pitch, desc.buffer);
}
assert(clbuf != NULL);
}
real_mem[*(cl_mem*)(mobj["id"].string_value().data())] = clbuf;
}
map<string, cl_program> g_programs;
for (const auto &[name, source] : jdat["programs"].object_items()) {
if (debug >= 1) printf("building %s with size %zu\n", name.c_str(), source.string_value().size());
g_programs[name] = cl_program_from_source(context, device_id, source.string_value());
}
for (auto &obj : jdat["inputs"].array_items()) {
auto mobj = obj.object_items();
int sz = mobj["size"].int_value();
cl_mem aa = real_mem[*(cl_mem*)(mobj["buffer_id"].string_value().data())];
input_clmem.push_back(aa);
input_sizes.push_back(sz);
LOGD("Thneed::load: adding input %s with size %d\n", mobj["name"].string_value().data(), sz);
cl_int cl_err;
void *ret = clEnqueueMapBuffer(command_queue, aa, CL_TRUE, CL_MAP_WRITE, 0, sz, 0, NULL, NULL, &cl_err);
if (cl_err != CL_SUCCESS) LOGE("clError: %s map %p %d\n", cl_get_error_string(cl_err), aa, sz);
assert(cl_err == CL_SUCCESS);
inputs.push_back(ret);
}
for (auto &obj : jdat["outputs"].array_items()) {
auto mobj = obj.object_items();
int sz = mobj["size"].int_value();
LOGD("Thneed::save: adding output with size %d\n", sz);
// TODO: support multiple outputs
output = real_mem[*(cl_mem*)(mobj["buffer_id"].string_value().data())];
assert(output != NULL);
}
for (auto &obj : jdat["binaries"].array_items()) {
string name = obj["name"].string_value();
size_t length = obj["length"].int_value();
if (debug >= 1) printf("binary %s with size %zu\n", name.c_str(), length);
g_programs[name] = cl_program_from_binary(context, device_id, (const uint8_t*)&buf[ptr], length);
ptr += length;
}
for (auto &obj : jdat["kernels"].array_items()) {
auto gws = obj["global_work_size"];
auto lws = obj["local_work_size"];
auto kk = shared_ptr<CLQueuedKernel>(new CLQueuedKernel(this));
kk->name = obj["name"].string_value();
kk->program = g_programs[kk->name];
kk->work_dim = obj["work_dim"].int_value();
for (int i = 0; i < kk->work_dim; i++) {
kk->global_work_size[i] = gws[i].int_value();
kk->local_work_size[i] = lws[i].int_value();
}
kk->num_args = obj["num_args"].int_value();
for (int i = 0; i < kk->num_args; i++) {
string arg = obj["args"].array_items()[i].string_value();
int arg_size = obj["args_size"].array_items()[i].int_value();
kk->args_size.push_back(arg_size);
if (arg_size == 8) {
cl_mem val = *(cl_mem*)(arg.data());
val = real_mem[val];
kk->args.push_back(string((char*)&val, sizeof(val)));
} else {
kk->args.push_back(arg);
}
}
kq.push_back(kk);
}
clFinish(command_queue);
}

View File

@@ -0,0 +1,133 @@
#pragma once
#ifndef __user
#define __user __attribute__(())
#endif
#include <cstdint>
#include <cstdlib>
#include <memory>
#include <string>
#include <vector>
#include <CL/cl.h>
#include "third_party/linux/include/msm_kgsl.h"
using namespace std;
cl_int thneed_clSetKernelArg(cl_kernel kernel, cl_uint arg_index, size_t arg_size, const void *arg_value);
namespace json11 {
class Json;
}
class Thneed;
class GPUMalloc {
public:
GPUMalloc(int size, int fd);
~GPUMalloc();
void *alloc(int size);
private:
uint64_t base;
int remaining;
};
class CLQueuedKernel {
public:
CLQueuedKernel(Thneed *lthneed) { thneed = lthneed; }
CLQueuedKernel(Thneed *lthneed,
cl_kernel _kernel,
cl_uint _work_dim,
const size_t *_global_work_size,
const size_t *_local_work_size);
cl_int exec();
void debug_print(bool verbose);
int get_arg_num(const char *search_arg_name);
cl_program program;
string name;
cl_uint num_args;
vector<string> arg_names;
vector<string> arg_types;
vector<string> args;
vector<int> args_size;
cl_kernel kernel = NULL;
json11::Json to_json() const;
cl_uint work_dim;
size_t global_work_size[3] = {0};
size_t local_work_size[3] = {0};
private:
Thneed *thneed;
};
class CachedIoctl {
public:
virtual void exec() {}
};
class CachedSync: public CachedIoctl {
public:
CachedSync(Thneed *lthneed, string ldata) { thneed = lthneed; data = ldata; }
void exec();
private:
Thneed *thneed;
string data;
};
class CachedCommand: public CachedIoctl {
public:
CachedCommand(Thneed *lthneed, struct kgsl_gpu_command *cmd);
void exec();
private:
void disassemble(int cmd_index);
struct kgsl_gpu_command cache;
unique_ptr<kgsl_command_object[]> cmds;
unique_ptr<kgsl_command_object[]> objs;
Thneed *thneed;
vector<shared_ptr<CLQueuedKernel> > kq;
};
class Thneed {
public:
Thneed(bool do_clinit=false, cl_context _context = NULL);
void stop();
void execute(float **finputs, float *foutput, bool slow=false);
void wait();
vector<cl_mem> input_clmem;
vector<void *> inputs;
vector<size_t> input_sizes;
cl_mem output = NULL;
cl_context context = NULL;
cl_command_queue command_queue;
cl_device_id device_id;
int context_id;
// protected?
bool record = false;
int debug;
int timestamp;
#ifdef QCOM2
unique_ptr<GPUMalloc> ram;
vector<unique_ptr<CachedIoctl> > cmds;
int fd;
#endif
// all CL kernels
void copy_inputs(float **finputs, bool internal=false);
void copy_output(float *foutput);
cl_int clexec();
vector<shared_ptr<CLQueuedKernel> > kq;
// pending CL kernels
vector<shared_ptr<CLQueuedKernel> > ckq;
// loading
void load(const char *filename);
private:
void clinit();
};

View File

@@ -0,0 +1,216 @@
#include "sunnypilot/modeld/thneed/thneed.h"
#include <cassert>
#include <cstring>
#include <map>
#include "common/clutil.h"
#include "common/timing.h"
map<pair<cl_kernel, int>, string> g_args;
map<pair<cl_kernel, int>, int> g_args_size;
map<cl_program, string> g_program_source;
void Thneed::stop() {
//printf("Thneed::stop: recorded %lu commands\n", cmds.size());
record = false;
}
void Thneed::clinit() {
device_id = cl_get_device_id(CL_DEVICE_TYPE_DEFAULT);
if (context == NULL) context = CL_CHECK_ERR(clCreateContext(NULL, 1, &device_id, NULL, NULL, &err));
//cl_command_queue_properties props[3] = {CL_QUEUE_PROPERTIES, CL_QUEUE_PROFILING_ENABLE, 0};
cl_command_queue_properties props[3] = {CL_QUEUE_PROPERTIES, 0, 0};
command_queue = CL_CHECK_ERR(clCreateCommandQueueWithProperties(context, device_id, props, &err));
printf("Thneed::clinit done\n");
}
cl_int Thneed::clexec() {
if (debug >= 1) printf("Thneed::clexec: running %lu queued kernels\n", kq.size());
for (auto &k : kq) {
if (record) ckq.push_back(k);
cl_int ret = k->exec();
assert(ret == CL_SUCCESS);
}
return clFinish(command_queue);
}
void Thneed::copy_inputs(float **finputs, bool internal) {
for (int idx = 0; idx < inputs.size(); ++idx) {
if (debug >= 1) printf("copying %lu -- %p -> %p (cl %p)\n", input_sizes[idx], finputs[idx], inputs[idx], input_clmem[idx]);
if (internal) {
// if it's internal, using memcpy is fine since the buffer sync is cached in the ioctl layer
if (finputs[idx] != NULL) memcpy(inputs[idx], finputs[idx], input_sizes[idx]);
} else {
if (finputs[idx] != NULL) CL_CHECK(clEnqueueWriteBuffer(command_queue, input_clmem[idx], CL_TRUE, 0, input_sizes[idx], finputs[idx], 0, NULL, NULL));
}
}
}
void Thneed::copy_output(float *foutput) {
if (output != NULL) {
size_t sz;
clGetMemObjectInfo(output, CL_MEM_SIZE, sizeof(sz), &sz, NULL);
if (debug >= 1) printf("copying %lu for output %p -> %p\n", sz, output, foutput);
CL_CHECK(clEnqueueReadBuffer(command_queue, output, CL_TRUE, 0, sz, foutput, 0, NULL, NULL));
} else {
printf("CAUTION: model output is NULL, does it have no outputs?\n");
}
}
// *********** CLQueuedKernel ***********
CLQueuedKernel::CLQueuedKernel(Thneed *lthneed,
cl_kernel _kernel,
cl_uint _work_dim,
const size_t *_global_work_size,
const size_t *_local_work_size) {
thneed = lthneed;
kernel = _kernel;
work_dim = _work_dim;
assert(work_dim <= 3);
for (int i = 0; i < work_dim; i++) {
global_work_size[i] = _global_work_size[i];
local_work_size[i] = _local_work_size[i];
}
char _name[0x100];
clGetKernelInfo(kernel, CL_KERNEL_FUNCTION_NAME, sizeof(_name), _name, NULL);
name = string(_name);
clGetKernelInfo(kernel, CL_KERNEL_NUM_ARGS, sizeof(num_args), &num_args, NULL);
// get args
for (int i = 0; i < num_args; i++) {
char arg_name[0x100] = {0};
clGetKernelArgInfo(kernel, i, CL_KERNEL_ARG_NAME, sizeof(arg_name), arg_name, NULL);
arg_names.push_back(string(arg_name));
clGetKernelArgInfo(kernel, i, CL_KERNEL_ARG_TYPE_NAME, sizeof(arg_name), arg_name, NULL);
arg_types.push_back(string(arg_name));
args.push_back(g_args[make_pair(kernel, i)]);
args_size.push_back(g_args_size[make_pair(kernel, i)]);
}
// get program
clGetKernelInfo(kernel, CL_KERNEL_PROGRAM, sizeof(program), &program, NULL);
}
int CLQueuedKernel::get_arg_num(const char *search_arg_name) {
for (int i = 0; i < num_args; i++) {
if (arg_names[i] == search_arg_name) return i;
}
printf("failed to find %s in %s\n", search_arg_name, name.c_str());
assert(false);
}
cl_int CLQueuedKernel::exec() {
if (kernel == NULL) {
kernel = clCreateKernel(program, name.c_str(), NULL);
arg_names.clear();
arg_types.clear();
for (int j = 0; j < num_args; j++) {
char arg_name[0x100] = {0};
clGetKernelArgInfo(kernel, j, CL_KERNEL_ARG_NAME, sizeof(arg_name), arg_name, NULL);
arg_names.push_back(string(arg_name));
clGetKernelArgInfo(kernel, j, CL_KERNEL_ARG_TYPE_NAME, sizeof(arg_name), arg_name, NULL);
arg_types.push_back(string(arg_name));
cl_int ret;
if (args[j].size() != 0) {
assert(args[j].size() == args_size[j]);
ret = thneed_clSetKernelArg(kernel, j, args[j].size(), args[j].data());
} else {
ret = thneed_clSetKernelArg(kernel, j, args_size[j], NULL);
}
assert(ret == CL_SUCCESS);
}
}
if (thneed->debug >= 1) {
debug_print(thneed->debug >= 2);
}
return clEnqueueNDRangeKernel(thneed->command_queue,
kernel, work_dim, NULL, global_work_size, local_work_size, 0, NULL, NULL);
}
void CLQueuedKernel::debug_print(bool verbose) {
printf("%p %56s -- ", kernel, name.c_str());
for (int i = 0; i < work_dim; i++) {
printf("%4zu ", global_work_size[i]);
}
printf(" -- ");
for (int i = 0; i < work_dim; i++) {
printf("%4zu ", local_work_size[i]);
}
printf("\n");
if (verbose) {
for (int i = 0; i < num_args; i++) {
string arg = args[i];
printf(" %s %s", arg_types[i].c_str(), arg_names[i].c_str());
void *arg_value = (void*)arg.data();
int arg_size = arg.size();
if (arg_size == 0) {
printf(" (size) %d", args_size[i]);
} else if (arg_size == 1) {
printf(" = %d", *((char*)arg_value));
} else if (arg_size == 2) {
printf(" = %d", *((short*)arg_value));
} else if (arg_size == 4) {
if (arg_types[i] == "float") {
printf(" = %f", *((float*)arg_value));
} else {
printf(" = %d", *((int*)arg_value));
}
} else if (arg_size == 8) {
cl_mem val = (cl_mem)(*((uintptr_t*)arg_value));
printf(" = %p", val);
if (val != NULL) {
cl_mem_object_type obj_type;
clGetMemObjectInfo(val, CL_MEM_TYPE, sizeof(obj_type), &obj_type, NULL);
if (arg_types[i] == "image2d_t" || arg_types[i] == "image1d_t" || obj_type == CL_MEM_OBJECT_IMAGE2D) {
cl_image_format format;
size_t width, height, depth, array_size, row_pitch, slice_pitch;
cl_mem buf;
clGetImageInfo(val, CL_IMAGE_FORMAT, sizeof(format), &format, NULL);
assert(format.image_channel_order == CL_RGBA);
assert(format.image_channel_data_type == CL_HALF_FLOAT || format.image_channel_data_type == CL_FLOAT);
clGetImageInfo(val, CL_IMAGE_WIDTH, sizeof(width), &width, NULL);
clGetImageInfo(val, CL_IMAGE_HEIGHT, sizeof(height), &height, NULL);
clGetImageInfo(val, CL_IMAGE_ROW_PITCH, sizeof(row_pitch), &row_pitch, NULL);
clGetImageInfo(val, CL_IMAGE_DEPTH, sizeof(depth), &depth, NULL);
clGetImageInfo(val, CL_IMAGE_ARRAY_SIZE, sizeof(array_size), &array_size, NULL);
clGetImageInfo(val, CL_IMAGE_SLICE_PITCH, sizeof(slice_pitch), &slice_pitch, NULL);
assert(depth == 0);
assert(array_size == 0);
assert(slice_pitch == 0);
clGetImageInfo(val, CL_IMAGE_BUFFER, sizeof(buf), &buf, NULL);
size_t sz = 0;
if (buf != NULL) clGetMemObjectInfo(buf, CL_MEM_SIZE, sizeof(sz), &sz, NULL);
printf(" image %zu x %zu rp %zu @ %p buffer %zu", width, height, row_pitch, buf, sz);
} else {
size_t sz;
clGetMemObjectInfo(val, CL_MEM_SIZE, sizeof(sz), &sz, NULL);
printf(" buffer %zu", sz);
}
}
}
printf("\n");
}
}
}
cl_int thneed_clSetKernelArg(cl_kernel kernel, cl_uint arg_index, size_t arg_size, const void *arg_value) {
g_args_size[make_pair(kernel, arg_index)] = arg_size;
if (arg_value != NULL) {
g_args[make_pair(kernel, arg_index)] = string((char*)arg_value, arg_size);
} else {
g_args[make_pair(kernel, arg_index)] = string("");
}
cl_int ret = clSetKernelArg(kernel, arg_index, arg_size, arg_value);
return ret;
}

View File

@@ -0,0 +1,32 @@
#include "sunnypilot/modeld/thneed/thneed.h"
#include <cassert>
#include "common/clutil.h"
#include "common/timing.h"
Thneed::Thneed(bool do_clinit, cl_context _context) {
context = _context;
if (do_clinit) clinit();
char *thneed_debug_env = getenv("THNEED_DEBUG");
debug = (thneed_debug_env != NULL) ? atoi(thneed_debug_env) : 0;
}
void Thneed::execute(float **finputs, float *foutput, bool slow) {
uint64_t tb, te;
if (debug >= 1) tb = nanos_since_boot();
// ****** copy inputs
copy_inputs(finputs);
// ****** run commands
clexec();
// ****** copy outputs
copy_output(foutput);
if (debug >= 1) {
te = nanos_since_boot();
printf("model exec in %lu us\n", (te-tb)/1000);
}
}

View File

@@ -0,0 +1,258 @@
#include "sunnypilot/modeld/thneed/thneed.h"
#include <dlfcn.h>
#include <sys/mman.h>
#include <cassert>
#include <cerrno>
#include <cstring>
#include <map>
#include <string>
#include "common/clutil.h"
#include "common/timing.h"
Thneed *g_thneed = NULL;
int g_fd = -1;
void hexdump(uint8_t *d, int len) {
assert((len%4) == 0);
printf(" dumping %p len 0x%x\n", d, len);
for (int i = 0; i < len/4; i++) {
if (i != 0 && (i%0x10) == 0) printf("\n");
printf("%8x ", d[i]);
}
printf("\n");
}
// *********** ioctl interceptor ***********
extern "C" {
int (*my_ioctl)(int filedes, unsigned long request, void *argp) = NULL;
#undef ioctl
int ioctl(int filedes, unsigned long request, void *argp) {
request &= 0xFFFFFFFF; // needed on QCOM2
if (my_ioctl == NULL) my_ioctl = reinterpret_cast<decltype(my_ioctl)>(dlsym(RTLD_NEXT, "ioctl"));
Thneed *thneed = g_thneed;
// save the fd
if (request == IOCTL_KGSL_GPUOBJ_ALLOC) g_fd = filedes;
// note that this runs always, even without a thneed object
if (request == IOCTL_KGSL_DRAWCTXT_CREATE) {
struct kgsl_drawctxt_create *create = (struct kgsl_drawctxt_create *)argp;
create->flags &= ~KGSL_CONTEXT_PRIORITY_MASK;
create->flags |= 6 << KGSL_CONTEXT_PRIORITY_SHIFT; // priority from 1-15, 1 is max priority
printf("IOCTL_KGSL_DRAWCTXT_CREATE: creating context with flags 0x%x\n", create->flags);
}
if (thneed != NULL) {
if (request == IOCTL_KGSL_GPU_COMMAND) {
struct kgsl_gpu_command *cmd = (struct kgsl_gpu_command *)argp;
if (thneed->record) {
thneed->timestamp = cmd->timestamp;
thneed->context_id = cmd->context_id;
thneed->cmds.push_back(unique_ptr<CachedCommand>(new CachedCommand(thneed, cmd)));
}
if (thneed->debug >= 1) {
printf("IOCTL_KGSL_GPU_COMMAND(%2zu): flags: 0x%lx context_id: %u timestamp: %u numcmds: %d numobjs: %d\n",
thneed->cmds.size(),
cmd->flags,
cmd->context_id, cmd->timestamp, cmd->numcmds, cmd->numobjs);
}
} else if (request == IOCTL_KGSL_GPUOBJ_SYNC) {
struct kgsl_gpuobj_sync *cmd = (struct kgsl_gpuobj_sync *)argp;
struct kgsl_gpuobj_sync_obj *objs = (struct kgsl_gpuobj_sync_obj *)(cmd->objs);
if (thneed->debug >= 2) {
printf("IOCTL_KGSL_GPUOBJ_SYNC count:%d ", cmd->count);
for (int i = 0; i < cmd->count; i++) {
printf(" -- offset:0x%lx len:0x%lx id:%d op:%d ", objs[i].offset, objs[i].length, objs[i].id, objs[i].op);
}
printf("\n");
}
if (thneed->record) {
thneed->cmds.push_back(unique_ptr<CachedSync>(new
CachedSync(thneed, string((char *)objs, sizeof(struct kgsl_gpuobj_sync_obj)*cmd->count))));
}
} else if (request == IOCTL_KGSL_DEVICE_WAITTIMESTAMP_CTXTID) {
struct kgsl_device_waittimestamp_ctxtid *cmd = (struct kgsl_device_waittimestamp_ctxtid *)argp;
if (thneed->debug >= 1) {
printf("IOCTL_KGSL_DEVICE_WAITTIMESTAMP_CTXTID: context_id: %d timestamp: %d timeout: %d\n",
cmd->context_id, cmd->timestamp, cmd->timeout);
}
} else if (request == IOCTL_KGSL_SETPROPERTY) {
if (thneed->debug >= 1) {
struct kgsl_device_getproperty *prop = (struct kgsl_device_getproperty *)argp;
printf("IOCTL_KGSL_SETPROPERTY: 0x%x sizebytes:%zu\n", prop->type, prop->sizebytes);
if (thneed->debug >= 2) {
hexdump((uint8_t *)prop->value, prop->sizebytes);
if (prop->type == KGSL_PROP_PWR_CONSTRAINT) {
struct kgsl_device_constraint *constraint = (struct kgsl_device_constraint *)prop->value;
hexdump((uint8_t *)constraint->data, constraint->size);
}
}
}
} else if (request == IOCTL_KGSL_DRAWCTXT_CREATE || request == IOCTL_KGSL_DRAWCTXT_DESTROY) {
// this happens
} else if (request == IOCTL_KGSL_GPUOBJ_ALLOC || request == IOCTL_KGSL_GPUOBJ_FREE) {
// this happens
} else {
if (thneed->debug >= 1) {
printf("other ioctl %lx\n", request);
}
}
}
int ret = my_ioctl(filedes, request, argp);
// NOTE: This error message goes into stdout and messes up pyenv
// if (ret != 0) printf("ioctl returned %d with errno %d\n", ret, errno);
return ret;
}
}
// *********** GPUMalloc ***********
GPUMalloc::GPUMalloc(int size, int fd) {
struct kgsl_gpuobj_alloc alloc;
memset(&alloc, 0, sizeof(alloc));
alloc.size = size;
alloc.flags = 0x10000a00;
ioctl(fd, IOCTL_KGSL_GPUOBJ_ALLOC, &alloc);
void *addr = mmap64(NULL, alloc.mmapsize, 0x3, 0x1, fd, alloc.id*0x1000);
assert(addr != MAP_FAILED);
base = (uint64_t)addr;
remaining = size;
}
GPUMalloc::~GPUMalloc() {
// TODO: free the GPU malloced area
}
void *GPUMalloc::alloc(int size) {
void *ret = (void*)base;
size = (size+0xff) & (~0xFF);
assert(size <= remaining);
remaining -= size;
base += size;
return ret;
}
// *********** CachedSync, at the ioctl layer ***********
void CachedSync::exec() {
struct kgsl_gpuobj_sync cmd;
cmd.objs = (uint64_t)data.data();
cmd.obj_len = data.length();
cmd.count = data.length() / sizeof(struct kgsl_gpuobj_sync_obj);
int ret = ioctl(thneed->fd, IOCTL_KGSL_GPUOBJ_SYNC, &cmd);
assert(ret == 0);
}
// *********** CachedCommand, at the ioctl layer ***********
CachedCommand::CachedCommand(Thneed *lthneed, struct kgsl_gpu_command *cmd) {
thneed = lthneed;
assert(cmd->numsyncs == 0);
memcpy(&cache, cmd, sizeof(cache));
if (cmd->numcmds > 0) {
cmds = make_unique<struct kgsl_command_object[]>(cmd->numcmds);
memcpy(cmds.get(), (void *)cmd->cmdlist, sizeof(struct kgsl_command_object)*cmd->numcmds);
cache.cmdlist = (uint64_t)cmds.get();
for (int i = 0; i < cmd->numcmds; i++) {
void *nn = thneed->ram->alloc(cmds[i].size);
memcpy(nn, (void*)cmds[i].gpuaddr, cmds[i].size);
cmds[i].gpuaddr = (uint64_t)nn;
}
}
if (cmd->numobjs > 0) {
objs = make_unique<struct kgsl_command_object[]>(cmd->numobjs);
memcpy(objs.get(), (void *)cmd->objlist, sizeof(struct kgsl_command_object)*cmd->numobjs);
cache.objlist = (uint64_t)objs.get();
for (int i = 0; i < cmd->numobjs; i++) {
void *nn = thneed->ram->alloc(objs[i].size);
memset(nn, 0, objs[i].size);
objs[i].gpuaddr = (uint64_t)nn;
}
}
kq = thneed->ckq;
thneed->ckq.clear();
}
void CachedCommand::exec() {
cache.timestamp = ++thneed->timestamp;
int ret = ioctl(thneed->fd, IOCTL_KGSL_GPU_COMMAND, &cache);
if (thneed->debug >= 1) printf("CachedCommand::exec got %d\n", ret);
if (thneed->debug >= 2) {
for (auto &it : kq) {
it->debug_print(false);
}
}
assert(ret == 0);
}
// *********** Thneed ***********
Thneed::Thneed(bool do_clinit, cl_context _context) {
// TODO: QCOM2 actually requires a different context
//context = _context;
if (do_clinit) clinit();
assert(g_fd != -1);
fd = g_fd;
ram = make_unique<GPUMalloc>(0x80000, fd);
timestamp = -1;
g_thneed = this;
char *thneed_debug_env = getenv("THNEED_DEBUG");
debug = (thneed_debug_env != NULL) ? atoi(thneed_debug_env) : 0;
}
void Thneed::wait() {
struct kgsl_device_waittimestamp_ctxtid wait;
wait.context_id = context_id;
wait.timestamp = timestamp;
wait.timeout = -1;
uint64_t tb = nanos_since_boot();
int wret = ioctl(fd, IOCTL_KGSL_DEVICE_WAITTIMESTAMP_CTXTID, &wait);
uint64_t te = nanos_since_boot();
if (debug >= 1) printf("wait %d after %lu us\n", wret, (te-tb)/1000);
}
void Thneed::execute(float **finputs, float *foutput, bool slow) {
uint64_t tb, te;
if (debug >= 1) tb = nanos_since_boot();
// ****** copy inputs
copy_inputs(finputs, true);
// ****** run commands
int i = 0;
for (auto &it : cmds) {
++i;
if (debug >= 1) printf("run %2d @ %7lu us: ", i, (nanos_since_boot()-tb)/1000);
it->exec();
if ((i == cmds.size()) || slow) wait();
}
// ****** copy outputs
copy_output(foutput);
if (debug >= 1) {
te = nanos_since_boot();
printf("model exec in %lu us\n", (te-tb)/1000);
}
}

View File

@@ -0,0 +1,74 @@
#include "sunnypilot/modeld/transforms/loadyuv.h"
#include <cassert>
#include <cstdio>
#include <cstring>
void loadyuv_init(LoadYUVState* s, cl_context ctx, cl_device_id device_id, int width, int height) {
memset(s, 0, sizeof(*s));
s->width = width;
s->height = height;
char args[1024];
snprintf(args, sizeof(args),
"-cl-fast-relaxed-math -cl-denorms-are-zero "
"-DTRANSFORMED_WIDTH=%d -DTRANSFORMED_HEIGHT=%d",
width, height);
cl_program prg = cl_program_from_file(ctx, device_id, LOADYUV_PATH, args);
s->loadys_krnl = CL_CHECK_ERR(clCreateKernel(prg, "loadys", &err));
s->loaduv_krnl = CL_CHECK_ERR(clCreateKernel(prg, "loaduv", &err));
s->copy_krnl = CL_CHECK_ERR(clCreateKernel(prg, "copy", &err));
// done with this
CL_CHECK(clReleaseProgram(prg));
}
void loadyuv_destroy(LoadYUVState* s) {
CL_CHECK(clReleaseKernel(s->loadys_krnl));
CL_CHECK(clReleaseKernel(s->loaduv_krnl));
CL_CHECK(clReleaseKernel(s->copy_krnl));
}
void loadyuv_queue(LoadYUVState* s, cl_command_queue q,
cl_mem y_cl, cl_mem u_cl, cl_mem v_cl,
cl_mem out_cl, bool do_shift) {
cl_int global_out_off = 0;
if (do_shift) {
// shift the image in slot 1 to slot 0, then place the new image in slot 1
global_out_off += (s->width*s->height) + (s->width/2)*(s->height/2)*2;
CL_CHECK(clSetKernelArg(s->copy_krnl, 0, sizeof(cl_mem), &out_cl));
CL_CHECK(clSetKernelArg(s->copy_krnl, 1, sizeof(cl_int), &global_out_off));
const size_t copy_work_size = global_out_off/8;
CL_CHECK(clEnqueueNDRangeKernel(q, s->copy_krnl, 1, NULL,
&copy_work_size, NULL, 0, 0, NULL));
}
CL_CHECK(clSetKernelArg(s->loadys_krnl, 0, sizeof(cl_mem), &y_cl));
CL_CHECK(clSetKernelArg(s->loadys_krnl, 1, sizeof(cl_mem), &out_cl));
CL_CHECK(clSetKernelArg(s->loadys_krnl, 2, sizeof(cl_int), &global_out_off));
const size_t loadys_work_size = (s->width*s->height)/8;
CL_CHECK(clEnqueueNDRangeKernel(q, s->loadys_krnl, 1, NULL,
&loadys_work_size, NULL, 0, 0, NULL));
const size_t loaduv_work_size = ((s->width/2)*(s->height/2))/8;
global_out_off += (s->width*s->height);
CL_CHECK(clSetKernelArg(s->loaduv_krnl, 0, sizeof(cl_mem), &u_cl));
CL_CHECK(clSetKernelArg(s->loaduv_krnl, 1, sizeof(cl_mem), &out_cl));
CL_CHECK(clSetKernelArg(s->loaduv_krnl, 2, sizeof(cl_int), &global_out_off));
CL_CHECK(clEnqueueNDRangeKernel(q, s->loaduv_krnl, 1, NULL,
&loaduv_work_size, NULL, 0, 0, NULL));
global_out_off += (s->width/2)*(s->height/2);
CL_CHECK(clSetKernelArg(s->loaduv_krnl, 0, sizeof(cl_mem), &v_cl));
CL_CHECK(clSetKernelArg(s->loaduv_krnl, 1, sizeof(cl_mem), &out_cl));
CL_CHECK(clSetKernelArg(s->loaduv_krnl, 2, sizeof(cl_int), &global_out_off));
CL_CHECK(clEnqueueNDRangeKernel(q, s->loaduv_krnl, 1, NULL,
&loaduv_work_size, NULL, 0, 0, NULL));
}

View File

@@ -0,0 +1,47 @@
#define UV_SIZE ((TRANSFORMED_WIDTH/2)*(TRANSFORMED_HEIGHT/2))
__kernel void loadys(__global uchar8 const * const Y,
__global float * out,
int out_offset)
{
const int gid = get_global_id(0);
const int ois = gid * 8;
const int oy = ois / TRANSFORMED_WIDTH;
const int ox = ois % TRANSFORMED_WIDTH;
const uchar8 ys = Y[gid];
const float8 ysf = convert_float8(ys);
// 02
// 13
__global float* outy0;
__global float* outy1;
if ((oy & 1) == 0) {
outy0 = out + out_offset; //y0
outy1 = out + out_offset + UV_SIZE*2; //y2
} else {
outy0 = out + out_offset + UV_SIZE; //y1
outy1 = out + out_offset + UV_SIZE*3; //y3
}
vstore4(ysf.s0246, 0, outy0 + (oy/2) * (TRANSFORMED_WIDTH/2) + ox/2);
vstore4(ysf.s1357, 0, outy1 + (oy/2) * (TRANSFORMED_WIDTH/2) + ox/2);
}
__kernel void loaduv(__global uchar8 const * const in,
__global float8 * out,
int out_offset)
{
const int gid = get_global_id(0);
const uchar8 inv = in[gid];
const float8 outv = convert_float8(inv);
out[gid + out_offset / 8] = outv;
}
__kernel void copy(__global float8 * inout,
int in_offset)
{
const int gid = get_global_id(0);
inout[gid] = inout[gid + in_offset / 8];
}

View File

@@ -0,0 +1,16 @@
#pragma once
#include "common/clutil.h"
typedef struct {
int width, height;
cl_kernel loadys_krnl, loaduv_krnl, copy_krnl;
} LoadYUVState;
void loadyuv_init(LoadYUVState* s, cl_context ctx, cl_device_id device_id, int width, int height);
void loadyuv_destroy(LoadYUVState* s);
void loadyuv_queue(LoadYUVState* s, cl_command_queue q,
cl_mem y_cl, cl_mem u_cl, cl_mem v_cl,
cl_mem out_cl, bool do_shift = false);

View File

@@ -0,0 +1,97 @@
#include "sunnypilot/modeld/transforms/transform.h"
#include <cassert>
#include <cstring>
#include "common/clutil.h"
void transform_init(Transform* s, cl_context ctx, cl_device_id device_id) {
memset(s, 0, sizeof(*s));
cl_program prg = cl_program_from_file(ctx, device_id, TRANSFORM_PATH, "");
s->krnl = CL_CHECK_ERR(clCreateKernel(prg, "warpPerspective", &err));
// done with this
CL_CHECK(clReleaseProgram(prg));
s->m_y_cl = CL_CHECK_ERR(clCreateBuffer(ctx, CL_MEM_READ_WRITE, 3*3*sizeof(float), NULL, &err));
s->m_uv_cl = CL_CHECK_ERR(clCreateBuffer(ctx, CL_MEM_READ_WRITE, 3*3*sizeof(float), NULL, &err));
}
void transform_destroy(Transform* s) {
CL_CHECK(clReleaseMemObject(s->m_y_cl));
CL_CHECK(clReleaseMemObject(s->m_uv_cl));
CL_CHECK(clReleaseKernel(s->krnl));
}
void transform_queue(Transform* s,
cl_command_queue q,
cl_mem in_yuv, int in_width, int in_height, int in_stride, int in_uv_offset,
cl_mem out_y, cl_mem out_u, cl_mem out_v,
int out_width, int out_height,
const mat3& projection) {
const int zero = 0;
// sampled using pixel center origin
// (because that's how fastcv and opencv does it)
mat3 projection_y = projection;
// in and out uv is half the size of y.
mat3 projection_uv = transform_scale_buffer(projection, 0.5);
CL_CHECK(clEnqueueWriteBuffer(q, s->m_y_cl, CL_TRUE, 0, 3*3*sizeof(float), (void*)projection_y.v, 0, NULL, NULL));
CL_CHECK(clEnqueueWriteBuffer(q, s->m_uv_cl, CL_TRUE, 0, 3*3*sizeof(float), (void*)projection_uv.v, 0, NULL, NULL));
const int in_y_width = in_width;
const int in_y_height = in_height;
const int in_y_px_stride = 1;
const int in_uv_width = in_width/2;
const int in_uv_height = in_height/2;
const int in_uv_px_stride = 2;
const int in_u_offset = in_uv_offset;
const int in_v_offset = in_uv_offset + 1;
const int out_y_width = out_width;
const int out_y_height = out_height;
const int out_uv_width = out_width/2;
const int out_uv_height = out_height/2;
CL_CHECK(clSetKernelArg(s->krnl, 0, sizeof(cl_mem), &in_yuv)); // src
CL_CHECK(clSetKernelArg(s->krnl, 1, sizeof(cl_int), &in_stride)); // src_row_stride
CL_CHECK(clSetKernelArg(s->krnl, 2, sizeof(cl_int), &in_y_px_stride)); // src_px_stride
CL_CHECK(clSetKernelArg(s->krnl, 3, sizeof(cl_int), &zero)); // src_offset
CL_CHECK(clSetKernelArg(s->krnl, 4, sizeof(cl_int), &in_y_height)); // src_rows
CL_CHECK(clSetKernelArg(s->krnl, 5, sizeof(cl_int), &in_y_width)); // src_cols
CL_CHECK(clSetKernelArg(s->krnl, 6, sizeof(cl_mem), &out_y)); // dst
CL_CHECK(clSetKernelArg(s->krnl, 7, sizeof(cl_int), &out_y_width)); // dst_row_stride
CL_CHECK(clSetKernelArg(s->krnl, 8, sizeof(cl_int), &zero)); // dst_offset
CL_CHECK(clSetKernelArg(s->krnl, 9, sizeof(cl_int), &out_y_height)); // dst_rows
CL_CHECK(clSetKernelArg(s->krnl, 10, sizeof(cl_int), &out_y_width)); // dst_cols
CL_CHECK(clSetKernelArg(s->krnl, 11, sizeof(cl_mem), &s->m_y_cl)); // M
const size_t work_size_y[2] = {(size_t)out_y_width, (size_t)out_y_height};
CL_CHECK(clEnqueueNDRangeKernel(q, s->krnl, 2, NULL,
(const size_t*)&work_size_y, NULL, 0, 0, NULL));
const size_t work_size_uv[2] = {(size_t)out_uv_width, (size_t)out_uv_height};
CL_CHECK(clSetKernelArg(s->krnl, 2, sizeof(cl_int), &in_uv_px_stride)); // src_px_stride
CL_CHECK(clSetKernelArg(s->krnl, 3, sizeof(cl_int), &in_u_offset)); // src_offset
CL_CHECK(clSetKernelArg(s->krnl, 4, sizeof(cl_int), &in_uv_height)); // src_rows
CL_CHECK(clSetKernelArg(s->krnl, 5, sizeof(cl_int), &in_uv_width)); // src_cols
CL_CHECK(clSetKernelArg(s->krnl, 6, sizeof(cl_mem), &out_u)); // dst
CL_CHECK(clSetKernelArg(s->krnl, 7, sizeof(cl_int), &out_uv_width)); // dst_row_stride
CL_CHECK(clSetKernelArg(s->krnl, 8, sizeof(cl_int), &zero)); // dst_offset
CL_CHECK(clSetKernelArg(s->krnl, 9, sizeof(cl_int), &out_uv_height)); // dst_rows
CL_CHECK(clSetKernelArg(s->krnl, 10, sizeof(cl_int), &out_uv_width)); // dst_cols
CL_CHECK(clSetKernelArg(s->krnl, 11, sizeof(cl_mem), &s->m_uv_cl)); // M
CL_CHECK(clEnqueueNDRangeKernel(q, s->krnl, 2, NULL,
(const size_t*)&work_size_uv, NULL, 0, 0, NULL));
CL_CHECK(clSetKernelArg(s->krnl, 3, sizeof(cl_int), &in_v_offset)); // src_ofset
CL_CHECK(clSetKernelArg(s->krnl, 6, sizeof(cl_mem), &out_v)); // dst
CL_CHECK(clEnqueueNDRangeKernel(q, s->krnl, 2, NULL,
(const size_t*)&work_size_uv, NULL, 0, 0, NULL));
}

View File

@@ -0,0 +1,54 @@
#define INTER_BITS 5
#define INTER_TAB_SIZE (1 << INTER_BITS)
#define INTER_SCALE 1.f / INTER_TAB_SIZE
#define INTER_REMAP_COEF_BITS 15
#define INTER_REMAP_COEF_SCALE (1 << INTER_REMAP_COEF_BITS)
__kernel void warpPerspective(__global const uchar * src,
int src_row_stride, int src_px_stride, int src_offset, int src_rows, int src_cols,
__global uchar * dst,
int dst_row_stride, int dst_offset, int dst_rows, int dst_cols,
__constant float * M)
{
int dx = get_global_id(0);
int dy = get_global_id(1);
if (dx < dst_cols && dy < dst_rows)
{
float X0 = M[0] * dx + M[1] * dy + M[2];
float Y0 = M[3] * dx + M[4] * dy + M[5];
float W = M[6] * dx + M[7] * dy + M[8];
W = W != 0.0f ? INTER_TAB_SIZE / W : 0.0f;
int X = rint(X0 * W), Y = rint(Y0 * W);
int sx = convert_short_sat(X >> INTER_BITS);
int sy = convert_short_sat(Y >> INTER_BITS);
short sx_clamp = clamp(sx, 0, src_cols - 1);
short sx_p1_clamp = clamp(sx + 1, 0, src_cols - 1);
short sy_clamp = clamp(sy, 0, src_rows - 1);
short sy_p1_clamp = clamp(sy + 1, 0, src_rows - 1);
int v0 = convert_int(src[mad24(sy_clamp, src_row_stride, src_offset + sx_clamp*src_px_stride)]);
int v1 = convert_int(src[mad24(sy_clamp, src_row_stride, src_offset + sx_p1_clamp*src_px_stride)]);
int v2 = convert_int(src[mad24(sy_p1_clamp, src_row_stride, src_offset + sx_clamp*src_px_stride)]);
int v3 = convert_int(src[mad24(sy_p1_clamp, src_row_stride, src_offset + sx_p1_clamp*src_px_stride)]);
short ay = (short)(Y & (INTER_TAB_SIZE - 1));
short ax = (short)(X & (INTER_TAB_SIZE - 1));
float taby = 1.f/INTER_TAB_SIZE*ay;
float tabx = 1.f/INTER_TAB_SIZE*ax;
int dst_index = mad24(dy, dst_row_stride, dst_offset + dx);
int itab0 = convert_short_sat_rte( (1.0f-taby)*(1.0f-tabx) * INTER_REMAP_COEF_SCALE );
int itab1 = convert_short_sat_rte( (1.0f-taby)*tabx * INTER_REMAP_COEF_SCALE );
int itab2 = convert_short_sat_rte( taby*(1.0f-tabx) * INTER_REMAP_COEF_SCALE );
int itab3 = convert_short_sat_rte( taby*tabx * INTER_REMAP_COEF_SCALE );
int val = v0 * itab0 + v1 * itab1 + v2 * itab2 + v3 * itab3;
uchar pix = convert_uchar_sat((val + (1 << (INTER_REMAP_COEF_BITS-1))) >> INTER_REMAP_COEF_BITS);
dst[dst_index] = pix;
}
}

View File

@@ -0,0 +1,25 @@
#pragma once
#define CL_USE_DEPRECATED_OPENCL_1_2_APIS
#ifdef __APPLE__
#include <OpenCL/cl.h>
#else
#include <CL/cl.h>
#endif
#include "common/mat.h"
typedef struct {
cl_kernel krnl;
cl_mem m_y_cl, m_uv_cl;
} Transform;
void transform_init(Transform* s, cl_context ctx, cl_device_id device_id);
void transform_destroy(Transform* transform);
void transform_queue(Transform* s, cl_command_queue q,
cl_mem yuv, int in_width, int in_height, int in_stride, int in_uv_offset,
cl_mem out_y, cl_mem out_u, cl_mem out_v,
int out_width, int out_height,
const mat3& projection);

View File

@@ -21,7 +21,8 @@ async def verify_file(file_path: str, expected_hash: str) -> bool:
return sha256_hash.hexdigest().lower() == expected_hash.lower()
def get_active_bundle(params: Params) -> custom.ModelManagerSP.ModelBundle:
def get_active_bundle(params: Params = None) -> custom.ModelManagerSP.ModelBundle:
"""Gets the active model bundle from cache"""
if params is None:
params = Params()
@@ -30,3 +31,49 @@ def get_active_bundle(params: Params) -> custom.ModelManagerSP.ModelBundle:
return messaging.log_from_bytes(active_bundle, custom.ModelManagerSP.ModelBundle)
return None
def get_model_runner_by_filename(filename: str) -> custom.ModelManagerSP.Runner:
if filename.endswith(".thneed"):
return custom.ModelManagerSP.Runner.snpe
if filename.endswith("_tinygrad.pkl"):
return custom.ModelManagerSP.Runner.tinygrad
def get_active_model_runner(params: Params = None, force_check=False) -> custom.ModelManagerSP.Runner:
"""
Determines and returns the active model runner type, based on provided parameters.
The function utilizes caching to prevent redundant calculations and checks.
If the cached "ModelRunnerTypeCache" exists in the provided parameters and `force_check`
is set to False, the cached value is directly returned. Otherwise, the function determines
the runner type based on the active model bundle. If a model bundle containing a drive
model exists, the runner type is derived based on the filename of the drive model.
Finally, it updates the cache with the determined runner type, if needed.
:param params: The parameter set used to retrieve caching and runner details. If `None`,
a default `Params` instance is created internally.
:type params: Params
:param force_check: A flag indicating whether to bypass cached results and always
re-determine the runner type. Defaults to `False`.
:type force_check: bool
:return: The determined or cached model runner type.
:rtype: custom.ModelManagerSP.Runner
"""
if params is None:
params = Params()
if (cached_runner_type := params.get("ModelRunnerTypeCache")) and not force_check:
return int(cached_runner_type)
runner_type = custom.ModelManagerSP.Runner.tinygrad
if active_bundle := get_active_bundle(params):
drive_model = next(model for model in active_bundle.models if model.type == custom.ModelManagerSP.Type.drive)
runner_type = get_model_runner_by_filename(drive_model.fileName)
if cached_runner_type != runner_type:
params.put("ModelRunnerTypeCache", str(runner_type))
return runner_type

View File

@@ -5,6 +5,7 @@ from cereal import car
from openpilot.common.params import Params
from openpilot.system.hardware import PC, TICI
from openpilot.system.manager.process import PythonProcess, NativeProcess, DaemonProcess
from sunnypilot.sunnylink.utils import sunnylink_need_register, sunnylink_ready, use_sunnylink_uploader
WEBCAM = os.getenv("USE_WEBCAM") is not None
@@ -70,6 +71,14 @@ def use_sunnylink_uploader_shim(started, params, CP: car.CarParams) -> bool:
"""Shim for use_sunnylink_uploader to match the process manager signature."""
return use_sunnylink_uploader(params)
def is_snpe_model(started, params, CP: car.CarParams) -> bool:
"""Check if the active model runner is SNPE."""
return False # FIXME-SP: Enable in future PR
def is_stock_model(started, params, CP: car.CarParams) -> bool:
"""Check if the active model runner is stock."""
return not is_snpe_model(started, params, CP)
def or_(*fns):
return lambda *args: operator.or_(*(fn(*args) for fn in fns))
@@ -93,7 +102,7 @@ procs = [
NativeProcess("stream_encoderd", "system/loggerd", ["./encoderd", "--stream"], notcar),
NativeProcess("loggerd", "system/loggerd", ["./loggerd"], logging),
# TODO Make python process once TG allows opening QCOM from child proc
NativeProcess("modeld", "selfdrive/modeld", ["./modeld"], only_onroad),
NativeProcess("modeld", "selfdrive/modeld", ["./modeld"], and_(only_onroad, is_stock_model)),
NativeProcess("sensord", "system/sensord", ["./sensord"], only_onroad, enabled=not PC),
NativeProcess("ui", "selfdrive/ui", ["./ui"], always_run, watchdog_max_dt=(5 if not PC else None)),
PythonProcess("soundd", "selfdrive.ui.soundd", only_onroad),
@@ -135,6 +144,7 @@ procs = [
# sunnypilot
procs += [
PythonProcess("models_manager", "sunnypilot.models.manager", only_offroad),
NativeProcess("modeld_snpe", "sunnypilot/modeld", ["./modeld"], and_(only_onroad, is_snpe_model)),
]
if os.path.exists("./github_runner.sh"):

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

BIN
third_party/snpe/dsp/libcalculator_skel.so LFS vendored Normal file

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

View File

@@ -0,0 +1,84 @@
//=============================================================================
//
// Copyright (c) 2015, 2020 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//=============================================================================
#ifndef __IDIAGLOG_HPP_
#define __IDIAGLOG_HPP_
#include <string>
#include "DiagLog/Options.hpp"
#include "DlSystem/String.hpp"
#include "DlSystem/ZdlExportDefine.hpp"
namespace zdl
{
namespace DiagLog
{
/** @addtogroup c_plus_plus_apis C++
@{ */
/// @brief .
///
/// Interface for controlling logging for zdl components.
class ZDL_EXPORT IDiagLog
{
public:
/// @brief .
///
/// Sets the options after initialization occurs.
///
/// @param[in] loggingOptions The options to set up diagnostic logging.
///
/// @return False if the options could not be set. Ensure logging is not started.
virtual bool setOptions(const Options& loggingOptions) = 0;
/// @brief .
///
/// Gets the curent options for the diag logger.
///
/// @return Diag log options object.
virtual Options getOptions() = 0;
/// @brief .
///
/// Allows for setting the log mask once diag logging has started
///
/// @return True if the level was set successfully, false if a failure occurred.
virtual bool setDiagLogMask(const std::string& mask) = 0;
/// @brief .
///
/// Allows for setting the log mask once diag logging has started
///
/// @return True if the level was set successfully, false if a failure occurred.
virtual bool setDiagLogMask(const zdl::DlSystem::String& mask) = 0;
/// @brief .
///
/// Enables logging for zdl components.
///
/// Logging should be started prior to the instantiation of zdl components
/// to ensure all events are captured.
///
/// @return False if diagnostic logging could not be started.
virtual bool start(void) = 0;
/// @brief Disables logging for zdl components.
virtual bool stop(void) = 0;
virtual ~IDiagLog() {};
};
} // DiagLog namespace
} // zdl namespace
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
#endif

View File

@@ -0,0 +1,79 @@
//=============================================================================
//
// Copyright (c) 2015, 2020 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//=============================================================================
#ifndef __DIAGLOG_OPTIONS_HPP_
#define __DIAGLOG_OPTIONS_HPP_
#include <string>
#include "DlSystem/ZdlExportDefine.hpp"
namespace zdl
{
namespace DiagLog
{
/** @addtogroup c_plus_plus_apis C++
@{ */
/// @brief .
///
/// Options for setting up diagnostic logging for zdl components.
class ZDL_EXPORT Options
{
public:
Options() :
DiagLogMask(""),
LogFileDirectory("diaglogs"),
LogFileName("DiagLog"),
LogFileRotateCount(20),
LogFileReplace(true)
{
// Solves the empty string problem with multiple std libs
DiagLogMask.reserve(1);
}
/// @brief .
///
/// Enables diag logging only on the specified area mask (DNN_RUNTIME=ON | OFF)
std::string DiagLogMask;
/// @brief .
///
/// The path to the directory where log files will be written.
/// The path may be relative or absolute. Relative paths are interpreted
/// from the current working directory.
/// Default value is "diaglogs"
std::string LogFileDirectory;
/// @brief .
///
//// The name used for log files. If this value is empty then BaseName will be
/// used as the default file name.
/// Default value is "DiagLog"
std::string LogFileName;
/// @brief .
///
/// The maximum number of log files to create. If set to 0 no log rotation
/// will be used and the log file name specified will be used each time, overwriting
/// any existing log file that may exist.
/// Default value is 20
uint32_t LogFileRotateCount;
/// @brief
///
/// If the log file already exists, control whether it will be replaced
/// (existing contents truncated), or appended.
/// Default value is true
bool LogFileReplace;
};
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
} // DiagLog namespace
} // zdl namespace
#endif

View File

@@ -0,0 +1,191 @@
//=============================================================================
//
// Copyright (c) 2015-2020 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//=============================================================================
#ifndef ZEROTH_IDNC_CONTAINER_HPP
#define ZEROTH_IDNC_CONTAINER_HPP
#include <memory>
#include <stdint.h>
#include <string>
#include <vector>
#include <set>
#include "DlSystem/ZdlExportDefine.hpp"
#include "DlSystem/String.hpp"
namespace zdl {
namespace DlContainer {
/** @addtogroup c_plus_plus_apis C++
@{ */
class IDlContainer;
class dlc_error;
/**
* The structure of a record in a DL container.
*/
struct ZDL_EXPORT DlcRecord
{
/// Name of the record.
std::string name;
/// Byte blob holding the data for the record.
std::vector<uint8_t> data;
DlcRecord();
DlcRecord( DlcRecord&& other )
: name(std::move(other.name))
, data(std::move(other.data))
{}
DlcRecord(const std::string& new_name)
: name(new_name)
, data()
{
if(name.empty())
{
name.reserve(1);
}
}
DlcRecord(const DlcRecord&) = delete;
};
// The maximum length of any record name.
extern const uint32_t RECORD_NAME_MAX_SIZE;
// The maximum size of the record payload (bytes).
extern const uint32_t RECORD_DATA_MAX_SIZE;
// The maximum number of records in an archive at one time.
extern const uint32_t ARCHIVE_MAX_RECORDS;
/**
* Represents a container for a neural network model which can
* be used to load the model into the SNPE runtime.
*/
class ZDL_EXPORT IDlContainer
{
public:
/**
* Initializes a container from a container archive file.
*
* @param[in] filename Container archive file path.
*
* @return A pointer to the initialized container
*/
static std::unique_ptr<IDlContainer>
open(const std::string &filename) noexcept;
/**
* Initializes a container from a container archive file.
*
* @param[in] filename Container archive file path.
*
* @return A pointer to the initialized container
*/
static std::unique_ptr<IDlContainer>
open(const zdl::DlSystem::String &filename) noexcept;
/**
* Initializes a container from a byte buffer.
*
* @param[in] buffer Byte buffer holding the contents of an archive
* file.
*
* @return A pointer to the initialized container
*/
static std::unique_ptr<IDlContainer>
open(const std::vector<uint8_t> &buffer) noexcept;
/**
* Initializes a container from a byte buffer.
*
* @param[in] buffer Byte buffer holding the contents of an archive
* file.
*
* @param[in] size Size of the byte buffer.
*
* @return A pointer to the initialized container
*/
static std::unique_ptr<IDlContainer>
open(const uint8_t* buffer, const size_t size) noexcept;
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
/**
* Get the record catalog for a container.
*
* @param[out] catalog Buffer that will hold the record names on
* return.
*/
virtual void getCatalog(std::set<std::string> &catalog) const = 0;
/**
* Get the record catalog for a container.
*
* @param[out] catalog Buffer that will hold the record names on
* return.
*/
virtual void getCatalog(std::set<zdl::DlSystem::String> &catalog) const = 0;
/**
* Get a record from a container by name.
*
* @param[in] name Name of the record to fetch.
* @param[out] record The passed in record will be populated with the
* record data on return. Note that the caller
* will own the data in the record and is
* responsible for freeing it if needed.
*/
virtual void getRecord(const std::string &name, DlcRecord &record) const = 0;
/**
* Get a record from a container by name.
*
* @param[in] name Name of the record to fetch.
* @param[out] record The passed in record will be populated with the
* record data on return. Note that the caller
* will own the data in the record and is
* responsible for freeing it if needed.
*/
virtual void getRecord(const zdl::DlSystem::String &name, DlcRecord &record) const = 0;
/**
* Save the container to an archive on disk. This function will save the
* container if the filename is different from the file that it was opened
* from, or if at least one record was modified since the container was
* opened.
*
* It will truncate any existing file at the target path.
*
* @param filename Container archive file path.
*
* @return indication of success/failure
*/
virtual bool save(const std::string &filename) = 0;
/**
* Save the container to an archive on disk. This function will save the
* container if the filename is different from the file that it was opened
* from, or if at least one record was modified since the container was
* opened.
*
* It will truncate any existing file at the target path.
*
* @param filename Container archive file path.
*
* @return indication of success/failure
*/
virtual bool save (const zdl::DlSystem::String &filename) = 0;
virtual ~IDlContainer() {}
};
} // ns DlContainer
} // ns zdl
#endif

View File

@@ -0,0 +1,234 @@
//==============================================================================
//
// Copyright (c) 2014-2021 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//==============================================================================
#ifndef _DL_ENUMS_HPP_
#define _DL_ENUMS_HPP_
#include "DlSystem/ZdlExportDefine.hpp"
namespace zdl {
namespace DlSystem
{
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* Enumeration of supported target runtimes.
*/
enum class Runtime_t
{
/// Run the processing on Snapdragon CPU.
/// Data: float 32bit
/// Math: float 32bit
CPU_FLOAT32 = 0,
/// Run the processing on the Adreno GPU.
/// Data: float 16bit
/// Math: float 32bit
GPU_FLOAT32_16_HYBRID = 1,
/// Run the processing on the Hexagon DSP.
/// Data: 8bit fixed point Tensorflow style format
/// Math: 8bit fixed point Tensorflow style format
DSP_FIXED8_TF = 2,
/// Run the processing on the Adreno GPU.
/// Data: float 16bit
/// Math: float 16bit
GPU_FLOAT16 = 3,
/// Run the processing on Snapdragon AIX+HVX.
/// Data: 8bit fixed point Tensorflow style format
/// Math: 8bit fixed point Tensorflow style format
AIP_FIXED8_TF = 5,
AIP_FIXED_TF = AIP_FIXED8_TF,
/// Default legacy enum to retain backward compatibility.
/// CPU = CPU_FLOAT32
CPU = CPU_FLOAT32,
/// Default legacy enum to retain backward compatibility.
/// GPU = GPU_FLOAT32_16_HYBRID
GPU = GPU_FLOAT32_16_HYBRID,
/// Default legacy enum to retain backward compatibility.
/// DSP = DSP_FIXED8_TF
DSP = DSP_FIXED8_TF,
/// Special value indicating the property is unset.
UNSET = -1
};
/**
* Enumeration of runtime available check options.
*/
enum class RuntimeCheckOption_t
{
/// Perform standard runtime available check
DEFAULT = 0,
/// Perform standard runtime available check
NORMAL_CHECK = 0,
/// Perform basic runtime available check, may be runtime specific
BASIC_CHECK = 1,
/// Perform unsignedPD runtime available check
UNSIGNEDPD_CHECK = 2,
};
/**
* Enumeration of various performance profiles that can be requested.
*/
enum class PerformanceProfile_t
{
/// Run in a standard mode.
/// This mode will be deprecated in the future and replaced with BALANCED.
DEFAULT = 0,
/// Run in a balanced mode.
BALANCED = 0,
/// Run in high performance mode
HIGH_PERFORMANCE = 1,
/// Run in a power sensitive mode, at the expense of performance.
POWER_SAVER = 2,
/// Use system settings. SNPE makes no calls to any performance related APIs.
SYSTEM_SETTINGS = 3,
/// Run in sustained high performance mode
SUSTAINED_HIGH_PERFORMANCE = 4,
/// Run in burst mode
BURST = 5,
/// Run in lower clock than POWER_SAVER, at the expense of performance.
LOW_POWER_SAVER = 6,
/// Run in higher clock and provides better performance than POWER_SAVER.
HIGH_POWER_SAVER = 7,
/// Run in lower balanced mode
LOW_BALANCED = 8,
};
/**
* Enumeration of various profilngLevels that can be requested.
*/
enum class ProfilingLevel_t
{
/// No profiling.
/// Collects no runtime stats in the DiagLog
OFF = 0,
/// Basic profiling
/// Collects some runtime stats in the DiagLog
BASIC = 1,
/// Detailed profiling
/// Collects more runtime stats in the DiagLog, including per-layer statistics
/// Performance may be impacted
DETAILED = 2,
/// Moderate profiling
/// Collects more runtime stats in the DiagLog, no per-layer statistics
MODERATE = 3
};
/**
* Enumeration of various execution priority hints.
*/
enum class ExecutionPriorityHint_t
{
/// Normal priority
NORMAL = 0,
/// Higher than normal priority
HIGH = 1,
/// Lower priority
LOW = 2
};
/** @} */ /* end_addtogroup c_plus_plus_apis C++*/
/**
* Enumeration that lists the supported image encoding formats.
*/
enum class ImageEncoding_t
{
/// For unknown image type. Also used as a default value for ImageEncoding_t.
UNKNOWN = 0,
/// The RGB format consists of 3 bytes per pixel: one byte for
/// Red, one for Green, and one for Blue. The byte ordering is
/// endian independent and is always in RGB byte order.
RGB = 1,
/// The ARGB32 format consists of 4 bytes per pixel: one byte for
/// Red, one for Green, one for Blue, and one for the alpha channel.
/// The alpha channel is ignored. The byte ordering depends on the
/// underlying CPU. For little endian CPUs, the byte order is BGRA.
/// For big endian CPUs, the byte order is ARGB.
ARGB32 = 2,
/// The RGBA format consists of 4 bytes per pixel: one byte for
/// Red, one for Green, one for Blue, and one for the alpha channel.
/// The alpha channel is ignored. The byte ordering is endian independent
/// and is always in RGBA byte order.
RGBA = 3,
/// The GRAYSCALE format is for 8-bit grayscale.
GRAYSCALE = 4,
/// NV21 is the Android version of YUV. The Chrominance is down
/// sampled and has a subsampling ratio of 4:2:0. Note that this
/// image format has 3 channels, but the U and V channels
/// are subsampled. For every four Y pixels there is one U and one V pixel. @newpage
NV21 = 5,
/// The BGR format consists of 3 bytes per pixel: one byte for
/// Red, one for Green and one for Blue. The byte ordering is
/// endian independent and is always BGR byte order.
BGR = 6
};
/**
* Enumeration that lists the supported LogLevels that can be set by users.
*/
enum class LogLevel_t
{
/// Enumeration variable to be used by user to set logging level to FATAL.
LOG_FATAL = 0,
/// Enumeration variable to be used by user to set logging level to ERROR.
LOG_ERROR = 1,
/// Enumeration variable to be used by user to set logging level to WARN.
LOG_WARN = 2,
/// Enumeration variable to be used by user to set logging level to INFO.
LOG_INFO = 3,
/// Enumeration variable to be used by user to set logging level to VERBOSE.
LOG_VERBOSE = 4
};
typedef enum : int
{
UNSPECIFIED = 0,
FLOATING_POINT_32 = 1,
FLOATING_POINT_16 = 2,
FIXED_POINT_8 = 3,
FIXED_POINT_16 = 4
} IOBufferDataType_t;
}} // namespaces end
#endif

View File

@@ -0,0 +1,259 @@
//==============================================================================
//
// Copyright (c) 2016-2021 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//==============================================================================
#ifndef _DL_ERROR_HPP_
#define _DL_ERROR_HPP_
#include <stdint.h>
#include <limits> // numeric_limits
#include "DlSystem/ZdlExportDefine.hpp"
namespace zdl {
namespace DlSystem {
// clang and arm gcc different in how ZDL_EXPORT is used with enum class
#if !defined (__clang__)
enum class ErrorCode : uint32_t ZDL_EXPORT {
#else
enum class ZDL_EXPORT ErrorCode : uint32_t {
#endif // ARM64V8A
NONE = 0,
// System config errors
SNPE_CONFIG_MISSING_PARAM = 100,
SNPE_CONFIG_INVALID_PARAM = 101,
SNPE_CONFIG_MISSING_FILE = 102,
SNPE_CONFIG_NNCONFIG_NOT_SET = 103,
SNPE_CONFIG_NNCONFIG_INVALID = 104,
SNPE_CONFIG_WRONG_INPUT_NAME = 105,
SNPE_CONFIG_INCORRECT_INPUT_DIMENSIONS = 106,
SNPE_CONFIG_DIMENSIONS_MODIFICATION_NOT_SUPPORTED = 107,
SNPE_CONFIG_BOTH_OUTPUT_LAYER_TENSOR_NAMES_SET = 108,
SNPE_CONFIG_NNCONFIG_ONLY_TENSOR_SUPPORTED = 120,
SNPE_CONFIG_NNCONFIG_ONLY_USER_BUFFER_SUPPORTED = 121,
// DlSystem errors
SNPE_DLSYSTEM_MISSING_BUFFER = 200,
SNPE_DLSYSTEM_TENSOR_CAST_FAILED = 201,
SNPE_DLSYSTEM_FIXED_POINT_PARAM_INVALID = 202,
SNPE_DLSYSTEM_SIZE_MISMATCH = 203,
SNPE_DLSYSTEM_NAME_NOT_FOUND = 204,
SNPE_DLSYSTEM_VALUE_MISMATCH = 205,
SNPE_DLSYSTEM_INSERT_FAILED = 206,
SNPE_DLSYSTEM_TENSOR_FILE_READ_FAILED = 207,
SNPE_DLSYSTEM_DIAGLOG_FAILURE = 208,
SNPE_DLSYSTEM_LAYER_NOT_SET = 209,
SNPE_DLSYSTEM_WRONG_NUMBER_INPUT_BUFFERS = 210,
SNPE_DLSYSTEM_RUNTIME_TENSOR_SHAPE_MISMATCH = 211,
SNPE_DLSYSTEM_TENSOR_MISSING = 212,
SNPE_DLSYSTEM_TENSOR_ITERATION_UNSUPPORTED = 213,
SNPE_DLSYSTEM_BUFFER_MANAGER_MISSING = 214,
SNPE_DLSYSTEM_RUNTIME_BUFFER_SOURCE_UNSUPPORTED = 215,
SNPE_DLSYSTEM_BUFFER_CAST_FAILED = 216,
SNPE_DLSYSTEM_WRONG_TRANSITION_TYPE = 217,
SNPE_DLSYSTEM_LAYER_ALREADY_REGISTERED = 218,
SNPE_DLSYSTEM_TENSOR_DIM_INVALID = 219,
SNPE_DLSYSTEM_BUFFERENCODING_UNKNOWN = 240,
SNPE_DLSYSTEM_BUFFER_INVALID_PARAM = 241,
// DlContainer errors
SNPE_DLCONTAINER_MODEL_PARSING_FAILED = 300,
SNPE_DLCONTAINER_UNKNOWN_LAYER_CODE = 301,
SNPE_DLCONTAINER_MISSING_LAYER_PARAM = 302,
SNPE_DLCONTAINER_LAYER_PARAM_NOT_SUPPORTED = 303,
SNPE_DLCONTAINER_LAYER_PARAM_INVALID = 304,
SNPE_DLCONTAINER_TENSOR_DATA_MISSING = 305,
SNPE_DLCONTAINER_MODEL_LOAD_FAILED = 306,
SNPE_DLCONTAINER_MISSING_RECORDS = 307,
SNPE_DLCONTAINER_INVALID_RECORD = 308,
SNPE_DLCONTAINER_WRITE_FAILURE = 309,
SNPE_DLCONTAINER_READ_FAILURE = 310,
SNPE_DLCONTAINER_BAD_CONTAINER = 311,
SNPE_DLCONTAINER_BAD_DNN_FORMAT_VERSION = 312,
SNPE_DLCONTAINER_UNKNOWN_AXIS_ANNOTATION = 313,
SNPE_DLCONTAINER_UNKNOWN_SHUFFLE_TYPE = 314,
SNPE_DLCONTAINER_TEMP_FILE_FAILURE = 315,
// Network errors
SNPE_NETWORK_EMPTY_NETWORK = 400,
SNPE_NETWORK_CREATION_FAILED = 401,
SNPE_NETWORK_PARTITION_FAILED = 402,
SNPE_NETWORK_NO_OUTPUT_DEFINED = 403,
SNPE_NETWORK_MISMATCH_BETWEEN_NAMES_AND_DIMS = 404,
SNPE_NETWORK_MISSING_INPUT_NAMES = 405,
SNPE_NETWORK_MISSING_OUTPUT_NAMES = 406,
SNPE_NETWORK_EXECUTION_FAILED = 407,
// Host runtime errors
SNPE_HOST_RUNTIME_TARGET_UNAVAILABLE = 500,
// CPU runtime errors
SNPE_CPU_LAYER_NOT_SUPPORTED = 600,
SNPE_CPU_LAYER_PARAM_NOT_SUPPORTED = 601,
SNPE_CPU_LAYER_PARAM_INVALID = 602,
SNPE_CPU_LAYER_PARAM_COMBINATION_INVALID = 603,
SNPE_CPU_BUFFER_NOT_FOUND = 604,
SNPE_CPU_NETWORK_NOT_SUPPORTED = 605,
SNPE_CPU_UDO_OPERATION_FAILED = 606,
// CPU fixed-point runtime errors
SNPE_CPU_FXP_LAYER_NOT_SUPPORTED = 700,
SNPE_CPU_FXP_LAYER_PARAM_NOT_SUPPORTED = 701,
SNPE_CPU_FXP_LAYER_PARAM_INVALID = 702,
// GPU runtime errors
SNPE_GPU_LAYER_NOT_SUPPORTED = 800,
SNPE_GPU_LAYER_PARAM_NOT_SUPPORTED = 801,
SNPE_GPU_LAYER_PARAM_INVALID = 802,
SNPE_GPU_LAYER_PARAM_COMBINATION_INVALID = 803,
SNPE_GPU_KERNEL_COMPILATION_FAILED = 804,
SNPE_GPU_CONTEXT_NOT_SET = 805,
SNPE_GPU_KERNEL_NOT_SET = 806,
SNPE_GPU_KERNEL_PARAM_INVALID = 807,
SNPE_GPU_OPENCL_CHECK_FAILED = 808,
SNPE_GPU_OPENCL_FUNCTION_ERROR = 809,
SNPE_GPU_BUFFER_NOT_FOUND = 810,
SNPE_GPU_TENSOR_DIM_INVALID = 811,
SNPE_GPU_MEMORY_FLAGS_INVALID = 812,
SNPE_GPU_UNEXPECTED_NUMBER_OF_IO = 813,
SNPE_GPU_LAYER_PROXY_ERROR = 814,
SNPE_GPU_BUFFER_IN_USE = 815,
SNPE_GPU_BUFFER_MODIFICATION_ERROR = 816,
SNPE_GPU_DATA_ARRANGEMENT_INVALID = 817,
SNPE_GPU_UDO_OPERATION_FAILED = 818,
// DSP runtime errors
SNPE_DSP_LAYER_NOT_SUPPORTED = 900,
SNPE_DSP_LAYER_PARAM_NOT_SUPPORTED = 901,
SNPE_DSP_LAYER_PARAM_INVALID = 902,
SNPE_DSP_LAYER_PARAM_COMBINATION_INVALID = 903,
SNPE_DSP_STUB_NOT_PRESENT = 904,
SNPE_DSP_LAYER_NAME_TRUNCATED = 905,
SNPE_DSP_LAYER_INPUT_BUFFER_NAME_TRUNCATED = 906,
SNPE_DSP_LAYER_OUTPUT_BUFFER_NAME_TRUNCATED = 907,
SNPE_DSP_RUNTIME_COMMUNICATION_ERROR = 908,
SNPE_DSP_RUNTIME_INVALID_PARAM_ERROR = 909,
SNPE_DSP_RUNTIME_SYSTEM_ERROR = 910,
SNPE_DSP_RUNTIME_CRASHED_ERROR = 911,
SNPE_DSP_BUFFER_SIZE_ERROR = 912,
SNPE_DSP_UDO_EXECUTE_ERROR = 913,
SNPE_DSP_UDO_LIB_NOT_REGISTERED_ERROR = 914,
SNPE_DSP_UDO_INVALID_QUANTIZATION_TYPE_ERROR = 915,
SNPE_DSP_RUNTIME_INVALID_RPC_DRIVER = 916,
SNPE_DSP_RUNTIME_RPC_PERMISSION_ERROR = 917,
SNPE_DSP_RUNTIME_DSP_FILE_OPEN_ERROR = 918,
// Model validataion errors
SNPE_MODEL_VALIDATION_LAYER_NOT_SUPPORTED = 1000,
SNPE_MODEL_VALIDATION_LAYER_PARAM_NOT_SUPPORTED = 1001,
SNPE_MODEL_VALIDATION_LAYER_PARAM_INVALID = 1002,
SNPE_MODEL_VALIDATION_LAYER_PARAM_MISSING = 1003,
SNPE_MODEL_VALIDATION_LAYER_PARAM_COMBINATION_INVALID = 1004,
SNPE_MODEL_VALIDATION_LAYER_ORDERING_INVALID = 1005,
SNPE_MODEL_VALIDATION_INVALID_CONSTRAINT = 1006,
SNPE_MODEL_VALIDATION_MISSING_BUFFER = 1007,
SNPE_MODEL_VALIDATION_BUFFER_REUSE_NOT_SUPPORTED = 1008,
SNPE_MODEL_VALIDATION_LAYER_COULD_NOT_BE_ASSIGNED = 1009,
SNPE_MODEL_VALIDATION_UDO_LAYER_FAILED = 1010,
// UDL errors
SNPE_UDL_LAYER_EMPTY_UDL_NETWORK = 1100,
SNPE_UDL_LAYER_PARAM_INVALID = 1101,
SNPE_UDL_LAYER_INSTANCE_MISSING = 1102,
SNPE_UDL_LAYER_SETUP_FAILED = 1103,
SNPE_UDL_EXECUTE_FAILED = 1104,
SNPE_UDL_BUNDLE_INVALID = 1105,
SNPE_UDO_REGISTRATION_FAILED = 1106,
SNPE_UDO_GET_PACKAGE_FAILED = 1107,
SNPE_UDO_GET_IMPLEMENTATION_FAILED = 1108,
// Dependent library errors
SNPE_STD_LIBRARY_ERROR = 1200,
// Unknown exception (catch (...)), Has no component attached to this
SNPE_UNKNOWN_EXCEPTION = 1210,
// Storage Errors
SNPE_STORAGE_INVALID_KERNEL_REPO = 1300,
// AIP runtime errors
SNPE_AIP_LAYER_NOT_SUPPORTED = 1400,
SNPE_AIP_LAYER_PARAM_NOT_SUPPORTED = 1401,
SNPE_AIP_LAYER_PARAM_INVALID = 1402,
SNPE_AIP_LAYER_PARAM_COMBINATION_INVALID = 1403,
SNPE_AIP_STUB_NOT_PRESENT = 1404,
SNPE_AIP_LAYER_NAME_TRUNCATED = 1405,
SNPE_AIP_LAYER_INPUT_BUFFER_NAME_TRUNCATED = 1406,
SNPE_AIP_LAYER_OUTPUT_BUFFER_NAME_TRUNCATED = 1407,
SNPE_AIP_RUNTIME_COMMUNICATION_ERROR = 1408,
SNPE_AIP_RUNTIME_INVALID_PARAM_ERROR = 1409,
SNPE_AIP_RUNTIME_SYSTEM_ERROR = 1410,
SNPE_AIP_RUNTIME_TENSOR_MISSING = 1411,
SNPE_AIP_RUNTIME_TENSOR_SHAPE_MISMATCH = 1412,
SNPE_AIP_RUNTIME_BAD_AIX_RECORD = 1413,
// DlCaching errors
SNPE_DLCACHING_INVALID_METADATA = 1500,
SNPE_DLCACHING_INVALID_INITBLOB = 1501,
// Infrastructure Errors
SNPE_INFRA_CLUSTERMGR_INSTANCE_INVALID = 1600,
SNPE_INFRA_CLUSTERMGR_EXECUTE_SYNC_FAILED = 1601,
// Memory Errors
SNPE_MEMORY_CORRUPTION_ERROR = 1700
};
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* Returns the error code of the last error encountered.
*
* @return The error code.
*
* @note The returned error code is significant only when the return
* value of the call indicated an error.
*/
ZDL_EXPORT ErrorCode getLastErrorCode();
/**
* Returns the error string of the last error encountered.
*
* @return The error string.
*
* @note The returned error string is significant only when the return
* value of the call indicated an error.
*/
ZDL_EXPORT const char* getLastErrorString();
/**
* Returns the info string of the last error encountered.
*/
ZDL_EXPORT const char* getLastInfoString();
/**
* Returns the uint32_t representation of the error code enum.
*
* @param[in] code The error code to be converted.
*
* @return uint32_t representation of the error code.
*/
ZDL_EXPORT uint32_t enumToUInt32(zdl::DlSystem::ErrorCode code);
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
} // DlSystem
} // zdl
#endif // _DL_ERROR_HPP_

View File

@@ -0,0 +1,225 @@
//==============================================================================
//
// Copyright (c) 2016, 2020 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//==============================================================================
#ifndef _DL_SYSTEM_OPTIONAL_HPP_
#define _DL_SYSTEM_OPTIONAL_HPP_
#include <cstdio>
#include <utility>
#include <stdexcept>
#include "DlSystem/ZdlExportDefine.hpp"
namespace zdl {
namespace DlSystem {
template <typename T>
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* @brief .
*
* Class to manage a value that may or may not exist. The boolean value
* of the Optional class is true if the object contains a value and false
* if it does not contain a value.
*
* The class must be evaluated and confirmed as true (containing a value)
* before being dereferenced.
*/
class ZDL_EXPORT Optional final {
public:
enum class LIFECYCLE {
NONE = 0,
REFERENCE_OWNED = 1,
POINTER_OWNED = 2,
POINTER_NOT_OWNED = 3
};
struct ReferenceCount {
size_t count = 0;
void increment() { count++; }
size_t decrement() {
if (count > 0) {
count--;
}
return count;
}
};
using U = typename std::remove_pointer<T>::type;
/**
* The default constructor is set to not have any value, and is
* therefore evaluated as false.
*/
// Do not explicit it so we can return {}
Optional() {
m_Type = LIFECYCLE::NONE;
}
/**
* Construct an Optional class using an object.
* @param[in] Reference to an object v
* @param[out] Optional instance of object v
*/
template <typename Q = T>
Optional (const T& v, typename std::enable_if<!std::is_pointer<Q>::value>::type* = 0)
: m_Type(LIFECYCLE::REFERENCE_OWNED) {
try {
m_StoragePtr = new T(v);
} catch (...) {
m_StoragePtr = nullptr;
m_Type = LIFECYCLE::NONE;
}
}
template <typename Q = T>
Optional(U* v, LIFECYCLE type, typename std::enable_if<std::is_pointer<Q>::value>::type* = 0)
: m_Type(type) {
switch (m_Type) {
case LIFECYCLE::POINTER_OWNED:
m_StoragePtr = v;
m_Count = new ReferenceCount();
m_Count->increment();
break;
case LIFECYCLE::POINTER_NOT_OWNED:
m_StoragePtr = v;
break;
case LIFECYCLE::REFERENCE_OWNED:
throw std::bad_exception();
case LIFECYCLE::NONE:
break;
}
}
Optional(const Optional &other) : m_Type(other.m_Type), m_Count(other.m_Count) {
if (isReference()) {
m_StoragePtr = new U(*other.m_StoragePtr);
} else if (isPointer()) {
m_StoragePtr = other.m_StoragePtr;
if (isOwned()) {
m_Count->increment();
}
}
}
Optional& operator=(const Optional& other) noexcept {
Optional tmp(other);
swap(std::move(tmp));
return *this;
}
Optional(Optional&& other) noexcept {
swap(std::move(other));
}
Optional& operator=(Optional&& other) noexcept {
swap(std::move(other));
return *this;
}
~Optional() {
if (isOwned()) {
if (isReference() || (isPointer() && m_Count->decrement() == 0)) {
delete m_StoragePtr;
delete m_Count;
}
}
}
/**
* Boolean value of Optional class is only true when there exists a value.
*/
operator bool() const noexcept { return isValid(); }
bool operator!() const noexcept { return !isValid(); }
/**
* Get reference of Optional object
* @warning User must validate Optional has value before.
*/
const T& operator*() { return this->GetReference(); }
/**
* Get reference of Optional object
* @warning User must validate Optional has value before.
*/
const T& operator*() const { return this->GetReference(); }
operator T&() { return this->GetReference(); }
T operator->() {
T self = this->GetReference();
return self;
}
private:
void swap(Optional&& other) {
m_Type = other.m_Type;
m_StoragePtr = other.m_StoragePtr;
m_Count = other.m_Count;
other.m_Type = LIFECYCLE::NONE;
other.m_StoragePtr = nullptr;
other.m_Count = nullptr;
}
template <typename Q = T>
typename std::enable_if<std::is_same<U, Q>::value, const Q&>::type GetReference() const noexcept {
if (!isReference()) std::terminate();
return *static_cast<const Q*>(m_StoragePtr);
}
template <typename Q = T>
typename std::enable_if<std::is_same<U*, Q>::value, const Q&>::type GetReference() const noexcept {
if (!isPointer()) std::terminate();
return static_cast<const Q&>(m_StoragePtr);
}
template <typename Q = T>
typename std::enable_if<std::is_same<U, Q>::value, Q&>::type GetReference() noexcept {
if (!isReference()) std::terminate();
return *m_StoragePtr;
}
template <typename Q = T>
typename std::enable_if<std::is_same<U*, Q>::value, Q&>::type GetReference() noexcept {
if (!isPointer()) std::terminate();
return m_StoragePtr;
}
bool isPointer() const {
return m_Type == LIFECYCLE::POINTER_OWNED || m_Type == LIFECYCLE::POINTER_NOT_OWNED;
}
bool isOwned() const {
return m_Type == LIFECYCLE::REFERENCE_OWNED || m_Type == LIFECYCLE::POINTER_OWNED;
}
bool isReference() const {
return m_Type == LIFECYCLE::REFERENCE_OWNED;
}
bool isValid() const {
return m_Type != LIFECYCLE::NONE;
}
U* m_StoragePtr = nullptr;
LIFECYCLE m_Type;
ReferenceCount *m_Count = nullptr;
};
} // ns DlSystem
} // ns zdl
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
#endif // _DL_SYSTEM_OPTIONAL_HPP_

View File

@@ -0,0 +1,78 @@
//==============================================================================
//
// Copyright (c) 2014-2015 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//==============================================================================
#ifndef _DL_VERSION_HPP_
#define _DL_VERSION_HPP_
#include "ZdlExportDefine.hpp"
#include <stdint.h>
#include <string>
#include "DlSystem/String.hpp"
namespace zdl {
namespace DlSystem
{
class Version_t;
}}
namespace zdl { namespace DlSystem
{
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* A class that contains the different portions of a version number.
*/
class ZDL_EXPORT Version_t
{
public:
/// Holds the major version number. Changes in this value indicate
/// major changes that break backward compatibility.
int32_t Major;
/// Holds the minor version number. Changes in this value indicate
/// minor changes made to library that are backwards compatible
/// (such as additions to the interface).
int32_t Minor;
/// Holds the teeny version number. Changes in this value indicate
/// changes such as bug fixes and patches made to the library that
/// do not affect the interface.
int32_t Teeny;
/// This string holds information about the build version.
///
std::string Build;
static zdl::DlSystem::Version_t fromString(const std::string &stringValue);
static zdl::DlSystem::Version_t fromString(const zdl::DlSystem::String &stringValue);
/**
* @brief Returns a string in the form Major.Minor.Teeny.Build
*
* @return A formatted string holding the version information.
*/
const std::string toString() const;
/**
* @brief Returns a string in the form Major.Minor.Teeny.Build
*
* @return A formatted string holding the version information.
*/
const zdl::DlSystem::String asString() const;
};
}}
/** @} */ /* end_addtogroup c_plus_plus_apis */
#endif

View File

@@ -0,0 +1,86 @@
//==============================================================================
//
// Copyright (c) 2017-2019 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//==============================================================================
#ifndef _IBUFFER_ATTRIBUTES_HPP
#define _IBUFFER_ATTRIBUTES_HPP
#include "IUserBuffer.hpp"
#include "TensorShape.hpp"
#include "ZdlExportDefine.hpp"
namespace zdl {
namespace DlSystem {
class UserBufferEncoding;
}
}
namespace zdl {
namespace DlSystem {
/**
* @brief IBufferAttributes returns a buffer's dimension and alignment
* requirements, along with info on its encoding type
*/
class ZDL_EXPORT IBufferAttributes {
public:
/**
* @brief Gets the buffer's element size, in bytes
*
* This can be used to compute the memory size required
* to back this buffer.
*
* @return Element size, in bytes
*/
virtual size_t getElementSize() const noexcept = 0;
/**
* @brief Gets the element's encoding type
*
* @return encoding type
*/
virtual zdl::DlSystem::UserBufferEncoding::ElementType_t getEncodingType() const noexcept = 0;
/**
* @brief Gets the number of elements in each dimension
*
* @return Dimension size, in terms of number of elements
*/
virtual const TensorShape getDims() const noexcept = 0;
/**
* @brief Gets the alignment requirement of each dimension
*
* Alignment per each dimension is expressed as an multiple, for
* example, if one particular dimension can accept multiples of 8,
* the alignment will be 8.
*
* @return Alignment in each dimension, in terms of multiple of
* number of elements
*/
virtual const TensorShape getAlignments() const noexcept = 0;
/**
* @brief Gets the buffer encoding returned from the network responsible
* for generating this buffer. Depending on the encoding type, this will
* be an instance of an encoding type specific derived class.
*
* @return Derived user buffer encoding object.
*/
virtual zdl::DlSystem::UserBufferEncoding* getEncoding() const noexcept = 0;
virtual ~IBufferAttributes() {}
};
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
}
}
#endif

View File

@@ -0,0 +1,127 @@
//=============================================================================
//
// Copyright (c) 2021-2022 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//=============================================================================
#ifndef DL_SYSTEM_IOBUFFER_DATATYPE_MAP_HPP
#define DL_SYSTEM_IOBUFFER_DATATYPE_MAP_HPP
#include <cstddef>
#include <memory>
#include "DlSystem/DlEnums.hpp"
namespace DlSystem
{
// Forward declaration of IOBufferDataTypeMapImpl implementation.
class IOBufferDataTypeMapImpl;
}
namespace zdl
{
namespace DlSystem
{
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* @brief .
*
* The IoBufferDataTypeMap class definition
*/
class ZDL_EXPORT IOBufferDataTypeMap final
{
public:
/**
* @brief .
*
* Creates a new Buffer Data type map
*
*/
IOBufferDataTypeMap();
/**
* @brief Adds a name and the corresponding buffer data type
* to the map
*
* @param[name] name The name of the buffer
* @param[bufferDataType] buffer Data Type of the buffer
*
* @note If a buffer with the same name already exists, no new
* buffer is added.
*/
void add(const char* name, zdl::DlSystem::IOBufferDataType_t bufferDataType);
/**
* @brief Removes a buffer name from the map
*
* @param[name] name The name of the buffer
*
*/
void remove(const char* name);
/**
* @brief Returns the type of the named buffer
*
* @param[name] name The name of the buffer
*
* @return The type of the buffer, or UNSPECIFIED if the buffer does not exist
*
*/
zdl::DlSystem::IOBufferDataType_t getBufferDataType(const char* name);
/**
* @brief Returns the type of the first buffer
*
* @return The type of the first buffer, or UNSPECIFIED if the map is empty.
*
*/
zdl::DlSystem::IOBufferDataType_t getBufferDataType();
/**
* @brief Returns the size of the buffer type map.
*
* @return The size of the map
*
*/
size_t size();
/**
* @brief Checks the existence of the named buffer in the map
*
* @return True if the named buffer exists, false otherwise.
*
*/
bool find(const char* name);
/**
* @brief Resets the map
*
*/
void clear();
/**
* @brief Checks whether the map is empty
*
* @return True if the map is empty, false otherwise.
*
*/
bool empty();
/**
* @brief Destroys the map
*
*/
~IOBufferDataTypeMap();
private:
std::shared_ptr<::DlSystem::IOBufferDataTypeMapImpl> m_IOBufferDataTypeMapImpl;
};
}
}
#endif

View File

@@ -0,0 +1,146 @@
//=============================================================================
//
// Copyright (c) 2015-2020 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//=============================================================================
#ifndef _ITENSOR_HPP_
#define _ITENSOR_HPP_
#include "ITensorItr.hpp"
#include "ITensorItrImpl.hpp"
#include "TensorShape.hpp"
#include "ZdlExportDefine.hpp"
#include <memory>
#include <ostream>
#include <cmath>
namespace zdl {
namespace DlSystem
{
class ITensor;
}}
namespace zdl { namespace DlSystem
{
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* Represents a tensor which holds n-dimensional data. It is important to
* understand how the tensor data is represented in memory
* relative to the tensor dimensions. Tensors store data in
* memory in row-major order (i.e. the last tensor dimension is
* the fastest varying one). For example, if you have a two
* dimensional tensor with 3 rows and 2 columns (i.e. the tensor
* dimensions are 3,2 as returned in tensor dimension vectors)
* with the following data in terms rows and columns:
*
* | 1 2 | <br/>
* | 3 4 | <br/>
* | 5 6 | <br/>
*
* This data would be stored in memory as 1,2,3,4,5,6.
*/
class ZDL_EXPORT ITensor
{
public:
typedef zdl::DlSystem::ITensorItr<false> iterator;
typedef zdl::DlSystem::ITensorItr<true> const_iterator;
virtual ~ITensor() {}
/**
* Returns a tensor iterator pointing to the beginning
* of the data in the tensor.
*
* @return A tensor iterator that points to the first data
* element in the tensor.
*/
virtual iterator begin() = 0;
/**
* Returns the const version of a tensor iterator
* pointing to the beginning of the data in the tensor.
*
* @return A tensor const iterator that points to the first data
* element in the tensor.
*/
virtual const_iterator cbegin() const = 0;
/**
* Returns a tensor iterator pointing to the end of the
* data in the tensor. This tensor should not be
* dereferenced.
*
* @return A tensor iterator that points to the end of the data
* (one past the last element) in the tensor.
*/
virtual iterator end() = 0;
/**
* Returns the const version of a tensor iterator
* pointing to the end of the data in the tensor. This
* tensor should not be dereferenced.
*
* @return A tensor const iterator that points to the end of the
* data (one past the last element) in the tensor.
*/
virtual const_iterator cend() const = 0;
/**
* @brief Gets the shape of this tensor.
*
* The last element of the vector represents the fastest varying
* dimension and the zeroth element represents the slowest
* varying dimension, etc.
*
* @return A shape class holding the tensor dimensions.
*/
virtual TensorShape getShape() const = 0;
/**
* Returns the element size of the data in the tensor
* (discounting strides). This is how big a buffer would
* need to be to hold the tensor data contiguously in
* memory.
*
* @return The size of the tensor (in elements).
*/
virtual size_t getSize() const = 0;
/**
* @brief Serializes the tensor to an output stream.
*
* @param[in] output The output stream to which to write the tensor
*
* @throw std::runtime_error If the stream is ever in a bad
* state before the tensor is fully serialized.
*/
virtual void serialize(std::ostream &output) const = 0;
friend iterator;
friend const_iterator;
virtual bool isQuantized() {return false;}
virtual float GetDelta() {return NAN;};
virtual float GetOffset() {return NAN;};
protected:
/**
* Returns the tensor iterator implementation.
*
* @return A pointer to the tensor iterator implementation.
*/
virtual std::unique_ptr<::DlSystem::ITensorItrImpl> getItrImpl() const = 0;
};
}}
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
#endif

View File

@@ -0,0 +1,92 @@
//=============================================================================
//
// Copyright (c) 2015-2016 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//=============================================================================
#ifndef _ITENSOR_FACTORY_HPP
#define _ITENSOR_FACTORY_HPP
#include "ITensor.hpp"
#include "TensorShape.hpp"
#include "ZdlExportDefine.hpp"
#include <istream>
namespace zdl {
namespace DlSystem
{
class ITensor;
class TensorShape;
}
}
namespace zdl { namespace DlSystem
{
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* Factory interface class to create ITensor objects.
*/
class ZDL_EXPORT ITensorFactory
{
public:
virtual ~ITensorFactory() = default;
/**
* Creates a new ITensor with uninitialized data.
*
* The strides for the tensor will match the tensor dimensions
* (i.e., the tensor data is contiguous in memory).
*
* @param[in] shape The dimensions for the tensor in which the last
* element of the vector represents the fastest varying
* dimension and the zeroth element represents the slowest
* varying, etc.
*
* @return A pointer to the created tensor or nullptr if creating failed.
*/
virtual std::unique_ptr<ITensor>
createTensor(const TensorShape &shape) noexcept = 0;
/**
* Creates a new ITensor by loading it from a file.
*
* @param[in] input The input stream from which to read the tensor
* data.
*
* @return A pointer to the created tensor or nullptr if creating failed.
*
*/
virtual std::unique_ptr<ITensor> createTensor(std::istream &input) noexcept = 0;
/**
* Create a new ITensor with specific data.
* (i.e. the tensor data is contiguous in memory). This tensor is
* primarily used to create a tensor where tensor size can't be
* computed directly from dimension. One such example is
* NV21-formatted image, or any YUV formatted image
*
* @param[in] shape The dimensions for the tensor in which the last
* element of the vector represents the fastest varying
* dimension and the zeroth element represents the slowest
* varying, etc.
*
* @param[in] data The actual data with which the Tensor object is filled.
*
* @param[in] dataSize The size of data
*
* @return A pointer to the created tensor
*/
virtual std::unique_ptr<ITensor>
createTensor(const TensorShape &shape, const unsigned char *data, size_t dataSize) noexcept = 0;
};
}}
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
#endif

View File

@@ -0,0 +1,182 @@
//=============================================================================
//
// Copyright (c) 2015 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//=============================================================================
#ifndef _ITENSOR_ITR_HPP_
#define _ITENSOR_ITR_HPP_
#include "ZdlExportDefine.hpp"
#include "ITensorItrImpl.hpp"
#include <memory>
#include <iterator>
#include <iostream>
namespace zdl {
namespace DlSystem
{
template<bool IS_CONST> class ITensorItr;
class ITensor;
void ZDL_EXPORT fill(ITensorItr<false> first, ITensorItr<false> end, float val);
template<class InItr, class OutItr> OutItr ZDL_EXPORT copy(InItr first, InItr last, OutItr result)
{
return std::copy(first, last, result);
}
}}
namespace DlSystem
{
class ITensorItrImpl;
}
namespace zdl { namespace DlSystem
{
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* A bidirectional iterator (with limited random access
* capabilities) for the zdl::DlSystem::ITensor class.
*
* This is a standard bidrectional iterator and is compatible
* with standard algorithm functions that operate on bidirectional
* access iterators (e.g., std::copy, std::fill, etc.). It uses a
* template parameter to create const and non-const iterators
* from the same code. Iterators are easiest to declare via the
* typedefs iterator and const_iterator in the ITensor class
* (e.g., zdl::DlSystem::ITensor::iterator).
*
* Note that if the tensor the iterator is traversing was
* created with nondefault (i.e., nontrivial) strides, the
* iterator will obey the strides when traversing the tensor
* data.
*
* Also note that nontrivial strides dramatically affect the
* performance of the iterator (on the order of 20x slower).
*/
template<bool IS_CONST=true>
class ZDL_EXPORT ITensorItr : public std::iterator<std::bidirectional_iterator_tag, float>
{
public:
typedef typename std::conditional<IS_CONST, const float&, float&>::type VALUE_REF;
ITensorItr() = delete;
virtual ~ITensorItr() {}
ITensorItr(std::unique_ptr<::DlSystem::ITensorItrImpl> impl,
bool isTrivial = false,
float* data = nullptr)
: m_Impl(impl->clone())
, m_IsTrivial(isTrivial)
, m_Data(data)
, m_DataStart(data) {}
ITensorItr(const ITensorItr<IS_CONST>& itr)
: m_Impl(itr.m_Impl->clone()),
m_IsTrivial(itr.m_IsTrivial),
m_Data(itr.m_Data),
m_DataStart(itr.m_DataStart) {}
zdl::DlSystem::ITensorItr<IS_CONST>& operator=(const ITensorItr<IS_CONST>& other)
{
if (this == &other) return *this;
m_Impl = std::move(other.m_Impl->clone());
m_IsTrivial = other.m_IsTrivial;
m_Data = other.m_Data;
m_DataStart = other.m_DataStart;
return *this;
}
inline zdl::DlSystem::ITensorItr<IS_CONST>& operator++()
{
if (m_IsTrivial) m_Data++; else m_Impl->increment();
return *this;
}
inline zdl::DlSystem::ITensorItr<IS_CONST> operator++(int)
{
ITensorItr tmp(*this);
operator++();
return tmp;
}
inline zdl::DlSystem::ITensorItr<IS_CONST>& operator--()
{
if (m_IsTrivial) m_Data--; else m_Impl->decrement();
return *this;
}
inline zdl::DlSystem::ITensorItr<IS_CONST> operator--(int)
{
ITensorItr tmp(*this);
operator--();
return tmp;
}
inline zdl::DlSystem::ITensorItr<IS_CONST>& operator+=(int rhs)
{
if (m_IsTrivial) m_Data += rhs; else m_Impl->increment(rhs);
return *this;
}
inline friend zdl::DlSystem::ITensorItr<IS_CONST> operator+(zdl::DlSystem::ITensorItr<IS_CONST> lhs, int rhs)
{ lhs += rhs; return lhs; }
inline zdl::DlSystem::ITensorItr<IS_CONST>& operator-=(int rhs)
{
if (m_IsTrivial) m_Data -= rhs; else m_Impl->decrement(rhs);
return *this;
}
inline friend zdl::DlSystem::ITensorItr<IS_CONST> operator-(zdl::DlSystem::ITensorItr<IS_CONST> lhs, int rhs)
{ lhs -= rhs; return lhs; }
inline size_t operator-(const zdl::DlSystem::ITensorItr<IS_CONST>& rhs)
{
if (m_IsTrivial) return (m_Data - m_DataStart) - (rhs.m_Data - rhs.m_DataStart);
return m_Impl->getPosition() - rhs.m_Impl->getPosition();
}
inline friend bool operator<(const ITensorItr<IS_CONST>& lhs, const ITensorItr<IS_CONST>& rhs)
{
if (lhs.m_IsTrivial) return lhs.m_Data < rhs.m_Data;
return lhs.m_Impl->dataPointer() < rhs.m_Impl->dataPointer();
}
inline friend bool operator>(const ITensorItr<IS_CONST>& lhs, const ITensorItr<IS_CONST>& rhs)
{ return rhs < lhs; }
inline friend bool operator<=(const ITensorItr<IS_CONST>& lhs, const ITensorItr<IS_CONST>& rhs)
{ return !(lhs > rhs); }
inline friend bool operator>=(const ITensorItr<IS_CONST>& lhs, const ITensorItr<IS_CONST>& rhs)
{ return !(lhs < rhs); }
inline bool operator==(const ITensorItr<IS_CONST>& rhs) const
{
if (m_IsTrivial) return m_Data == rhs.m_Data;
return m_Impl->dataPointer() == rhs.m_Impl->dataPointer();
}
inline bool operator!=(const ITensorItr<IS_CONST>& rhs) const
{ return !operator==(rhs); }
inline VALUE_REF operator[](size_t idx)
{
if (m_IsTrivial) return *(m_DataStart + idx);
return m_Impl->getReferenceAt(idx);
}
inline VALUE_REF operator*()
{ if (m_IsTrivial) return *m_Data; else return m_Impl->getReference(); }
inline VALUE_REF operator->()
{ return *(*this); }
inline float* dataPointer() const
{ if (m_IsTrivial) return m_Data; else return m_Impl->dataPointer(); }
protected:
std::unique_ptr<::DlSystem::ITensorItrImpl> m_Impl;
bool m_IsTrivial = false;
float* m_Data = nullptr;
float* m_DataStart = nullptr;
};
}}
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
#endif

View File

@@ -0,0 +1,42 @@
//=============================================================================
//
// Copyright (c) 2015-2020 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//=============================================================================
#ifndef _ITENSOR_ITR_IMPL_HPP_
#define _ITENSOR_ITR_IMPL_HPP_
#include "ZdlExportDefine.hpp"
#include <memory>
#include <iterator>
namespace DlSystem
{
class ITensorItrImpl;
}
class ZDL_EXPORT DlSystem::ITensorItrImpl
{
public:
ITensorItrImpl() {}
virtual ~ITensorItrImpl() {}
virtual float getValue() const = 0;
virtual float& getReference() = 0;
virtual float& getReferenceAt(size_t idx) = 0;
virtual float* dataPointer() const = 0;
virtual void increment(int incVal = 1) = 0;
virtual void decrement(int decVal = 1) = 0;
virtual size_t getPosition() = 0;
virtual std::unique_ptr<DlSystem::ITensorItrImpl> clone() = 0;
private:
ITensorItrImpl& operator=(const ITensorItrImpl& other) = delete;
ITensorItrImpl(const ITensorItrImpl& other) = delete;
};
#endif

View File

@@ -0,0 +1,105 @@
//=============================================================================
//
// Copyright (c) 2016-2021 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//=============================================================================
#ifndef _DL_SYSTEM_IUDL_HPP_
#define _DL_SYSTEM_IUDL_HPP_
#include "ZdlExportDefine.hpp"
namespace zdl {
namespace DlSystem {
/**
* NOTE: DEPRECATED, MAY BE REMOVED IN THE FUTURE.
*
* @brief .
*
* Base class user concrete UDL implementation.
*
* All functions are marked as:
*
* - virtual
* - noexcept
*
* User should make sure no exceptions are propagated outside of
* their module. Errors can be communicated via return values.
*/
class ZDL_EXPORT IUDL {
public:
/**
* NOTE: DEPRECATED, MAY BE REMOVED IN THE FUTURE.
*
* @brief .
*
* Destructor
*/
virtual ~IUDL() = default;
/**
* NOTE: DEPRECATED, MAY BE REMOVED IN THE FUTURE.
*
* @brief Sets up the user's environment.
* This is called by the SNPE framework to allow the user the
* opportunity to setup anything which is needed for running
* user defined layers.
*
* @param cookie User provided opaque data returned by the SNPE
* runtime
*
* @param insz How many elements in input size array
* @param indim Pointer to a buffer that holds input dimension
* array
* @param indimsz Input dimension size array of the buffer
* 'indim'. Corresponds to indim
*
* @param outsz How many elements in output size array
* @param outdim Pointer to a buffer that holds output
* dimension array
* @param outdimsz Output dimension size of the buffer 'oudim'.
* Corresponds to indim
*
* @return true on success, false otherwise
*/
virtual bool setup(void *cookie,
size_t insz, const size_t **indim, const size_t *indimsz,
size_t outsz, const size_t **outdim, const size_t *outdimsz) = 0;
/**
* NOTE: DEPRECATED, MAY BE REMOVED IN THE FUTURE.
*
* @brief Close the instance. Invoked by the SNPE
* framework to allow the user the opportunity to release any resources
* allocated during setup.
*
* @param cookie - User provided opaque data returned by the SNPE runtime
*/
virtual void close(void *cookie) noexcept = 0;
/**
* NOTE: DEPRECATED, MAY BE REMOVED IN THE FUTURE.
*
* @brief Execute the user defined layer
*
* @param cookie User provided opaque data returned by the SNPE
* runtime
*
* @param input Const pointer to a float buffer that contains
* the input
*
* @param output Float pointer to a buffer that would hold
* the user defined layer's output. This buffer
* is allocated and owned by SNPE runtime.
*/
virtual bool execute(void *cookie, const float **input, float **output) = 0;
};
} // ns DlSystem
} // ns zdl
#endif // _DL_SYSTEM_IUDL_HPP_

View File

@@ -0,0 +1,358 @@
//==============================================================================
//
// Copyright (c) 2017-2020 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//==============================================================================
#ifndef _IUSER_BUFFER_HPP
#define _IUSER_BUFFER_HPP
#include "TensorShape.hpp"
#include "ZdlExportDefine.hpp"
#include <math.h>
namespace zdl {
namespace DlSystem {
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* @brief .
*
* A base class buffer encoding type
*/
class ZDL_EXPORT UserBufferEncoding {
public:
/**
* @brief .
*
* An enum class of all supported element types in a IUserBuffer
*/
enum class ElementType_t
{
/// Unknown element type.
UNKNOWN = 0,
/// Each element is presented by float.
FLOAT = 1,
/// Each element is presented by an unsigned int.
UNSIGNED8BIT = 2,
/// Each element is presented by an 8-bit quantized value.
TF8 = 10,
/// Each element is presented by an 16-bit quantized value.
TF16 = 11
};
/**
* @brief Retrieves the size of the element, in bytes.
*
* @return Size of the element, in bytes.
*/
virtual size_t getElementSize() const noexcept = 0;
/**
* @brief Retrieves the element type
*
* @return Element type
*/
ElementType_t getElementType() const noexcept {return m_ElementType;};
virtual ~UserBufferEncoding() {}
protected:
UserBufferEncoding(ElementType_t elementType) : m_ElementType(elementType) {};
private:
const ElementType_t m_ElementType;
};
/**
* @brief .
*
* A base class buffer source type
*
* @note User buffer from CPU support all kinds of runtimes;
* User buffer from GLBUFFER support only GPU runtime.
*/
class ZDL_EXPORT UserBufferSource {
public:
enum class SourceType_t
{
/// Unknown buffer source type.
UNKNOWN = 0,
/// The network inputs are from CPU buffer.
CPU = 1,
/// The network inputs are from OpenGL buffer.
GLBUFFER = 2
};
/**
* @brief Retrieves the source type
*
* @return Source type
*/
SourceType_t getSourceType() const noexcept {return m_SourceType;};
protected:
UserBufferSource(SourceType_t sourceType): m_SourceType(sourceType) {};
private:
const SourceType_t m_SourceType;
};
/**
* @brief .
*
* An source type where input data is delivered from OpenGL buffer
*/
class ZDL_EXPORT UserBufferSourceGLBuffer : public UserBufferSource{
public:
UserBufferSourceGLBuffer() : UserBufferSource(SourceType_t::GLBUFFER) {};
};
/**
* @brief .
*
* An encoding type where each element is represented by an unsigned int
*/
class ZDL_EXPORT UserBufferEncodingUnsigned8Bit : public UserBufferEncoding {
public:
UserBufferEncodingUnsigned8Bit() : UserBufferEncoding(ElementType_t::UNSIGNED8BIT) {};
size_t getElementSize() const noexcept override;
protected:
UserBufferEncodingUnsigned8Bit(ElementType_t elementType) : UserBufferEncoding(elementType) {};
};
/**
* @brief .
*
* An encoding type where each element is represented by a float
*/
class ZDL_EXPORT UserBufferEncodingFloat : public UserBufferEncoding {
public:
UserBufferEncodingFloat() : UserBufferEncoding(ElementType_t::FLOAT) {};
size_t getElementSize() const noexcept override;
};
/**
* @brief .
*
* An encoding type where each element is represented by tf8, which is an
* 8-bit quantizd value, which has an exact representation of 0.0
*/
class ZDL_EXPORT UserBufferEncodingTfN : public UserBufferEncoding {
public:
UserBufferEncodingTfN() = delete;
UserBufferEncodingTfN(uint64_t stepFor0, float stepSize, uint8_t bWidth=8):
UserBufferEncoding(getTypeFromWidth(bWidth)),
bitWidth(bWidth),
m_StepExactly0(stepFor0),
m_QuantizedStepSize(stepSize){};
UserBufferEncodingTfN(const zdl::DlSystem::UserBufferEncoding &ubEncoding) : UserBufferEncoding(ubEncoding.getElementType()){
const zdl::DlSystem::UserBufferEncodingTfN* ubEncodingTfN
= dynamic_cast <const zdl::DlSystem::UserBufferEncodingTfN*> (&ubEncoding);
if (ubEncodingTfN) {
m_StepExactly0 = ubEncodingTfN->getStepExactly0();
m_QuantizedStepSize = ubEncodingTfN->getQuantizedStepSize();
bitWidth = ubEncodingTfN->bitWidth;
}
}
size_t getElementSize() const noexcept override;
/**
* @brief Sets the step value that represents 0
*
* @param[in] stepExactly0 The step value that represents 0
*
*/
void setStepExactly0(uint64_t stepExactly0) {
m_StepExactly0 = stepExactly0;
}
/**
* @brief Sets the float value that each step represents
*
* @param[in] quantizedStepSize The float value of each step size
*
*/
void setQuantizedStepSize(const float quantizedStepSize) {
m_QuantizedStepSize = quantizedStepSize;
}
/**
* @brief Retrieves the step that represents 0.0
*
* @return Step value
*/
uint64_t getStepExactly0() const {
return m_StepExactly0;
}
/**
* Calculates the minimum floating point value that
* can be represented with this encoding.
*
* @return Minimum representable floating point value
*/
float getMin() const {
return static_cast<float>(m_QuantizedStepSize * (0 - (double)m_StepExactly0));
}
/**
* Calculates the maximum floating point value that
* can be represented with this encoding.
*
* @return Maximum representable floating point value
*/
float getMax() const{
return static_cast<float>(m_QuantizedStepSize * (pow(2,bitWidth)-1 - (double)m_StepExactly0));
};
/**
* @brief Retrieves the step size
*
* @return Step size
*/
float getQuantizedStepSize() const {
return m_QuantizedStepSize;
}
ElementType_t getTypeFromWidth(uint8_t width);
uint8_t bitWidth;
protected:
uint64_t m_StepExactly0;
float m_QuantizedStepSize;
};
class ZDL_EXPORT UserBufferEncodingTf8 : public UserBufferEncodingTfN {
public:
UserBufferEncodingTf8() = delete;
UserBufferEncodingTf8(unsigned char stepFor0, float stepSize) :
UserBufferEncodingTfN(stepFor0, stepSize) {};
UserBufferEncodingTf8(const zdl::DlSystem::UserBufferEncoding &ubEncoding) : UserBufferEncodingTfN(ubEncoding){}
/**
* @brief Sets the step value that represents 0
*
* @param[in] stepExactly0 The step value that represents 0
*
*/
void setStepExactly0(const unsigned char stepExactly0) {
UserBufferEncodingTfN::m_StepExactly0 = stepExactly0;
}
/**
* @brief Retrieves the step that represents 0.0
*
* @return Step value
*/
unsigned char getStepExactly0() const {
return UserBufferEncodingTfN::m_StepExactly0;
}
};
/**
* @brief UserBuffer contains a pointer and info on how to walk it and interpret its content.
*/
class ZDL_EXPORT IUserBuffer {
public:
virtual ~IUserBuffer() = default;
/**
* @brief Retrieves the total number of bytes between elements in each dimension if
* the buffer were to be interpreted as a multi-dimensional array.
*
* @return Number of bytes between elements in each dimension.
* e.g. A tightly packed tensor of floats with dimensions [4, 3, 2] would
* return strides of [24, 8, 4].
*/
virtual const TensorShape& getStrides() const = 0;
/**
* @brief Retrieves the size of the buffer, in bytes.
*
* @return Size of the underlying buffer, in bytes.
*/
virtual size_t getSize() const = 0;
/**
* @brief Retrieves the size of the inference data in the buffer, in bytes.
*
* The inference results from a dynamic-sized model may not be exactly the same size
* as the UserBuffer provided to SNPE. This function can be used to get the amount
* of output inference data, which may be less or greater than the size of the UserBuffer.
*
* If the inference results fit in the UserBuffer, getOutputSize() would be less than
* or equal to getSize(). But if the inference results were more than the capacity of
* the provided UserBuffer, the results would be truncated to fit the UserBuffer. But,
* getOutputSize() would be greater than getSize(), which indicates a bigger buffer
* needs to be provided to SNPE to hold all of the inference results.
*
* @return Size required for the buffer to hold all inference results, which can be less
* or more than the size of the buffer, in bytes.
*/
virtual size_t getOutputSize() const = 0;
/**
* @brief Changes the underlying memory that backs the UserBuffer.
*
* This can be used to avoid creating multiple UserBuffer objects
* when the only thing that differs is the memory location.
*
* @param[in] buffer Pointer to the memory location
*
* @return Whether the set succeeds.
*/
virtual bool setBufferAddress(void *buffer) noexcept = 0;
/**
* @brief Gets a const reference to the data encoding object of
* the underlying buffer
*
* This is necessary when the UserBuffer is filled by SNPE with
* data types such as TF8, where the caller needs to know the quantization
* parameters in order to interpret the data properly
*
* @return A read-only encoding object
*/
virtual const UserBufferEncoding& getEncoding() const noexcept = 0;
/**
* @brief Gets a reference to the data encoding object of
* the underlying buffer
*
* This is necessary when the UserBuffer is re-used, and the encoding
* parameters can change. For example, each input can be quantized with
* different step sizes.
*
* @return Data encoding meta-data
*/
virtual UserBufferEncoding& getEncoding() noexcept = 0;
};
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
}
}
#endif

View File

@@ -0,0 +1,81 @@
//=============================================================================
//
// Copyright (c) 2017 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//=============================================================================
#ifndef _IUSERBUFFER_FACTORY_HPP
#define _IUSERBUFFER_FACTORY_HPP
#include "IUserBuffer.hpp"
#include "TensorShape.hpp"
#include "ZdlExportDefine.hpp"
#include "DlEnums.hpp"
namespace zdl {
namespace DlSystem {
class IUserBuffer;
class TensorShape;
}
}
namespace zdl {
namespace DlSystem {
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* Factory interface class to create IUserBuffer objects.
*/
class ZDL_EXPORT IUserBufferFactory {
public:
virtual ~IUserBufferFactory() = default;
/**
* @brief Creates a UserBuffer
*
* @param[in] buffer Pointer to the buffer that the caller supplies
*
* @param[in] bufSize Buffer size, in bytes
*
* @param[in] strides Total number of bytes between elements in each dimension.
* E.g. A tightly packed tensor of floats with dimensions [4, 3, 2] would have strides of [24, 8, 4].
*
* @param[in] userBufferEncoding Reference to an UserBufferEncoding object
*
* @note Caller has to ensure that memory pointed to by buffer stays accessible
* for the lifetime of the object created
*/
virtual std::unique_ptr<IUserBuffer>
createUserBuffer(void *buffer, size_t bufSize, const zdl::DlSystem::TensorShape &strides, zdl::DlSystem::UserBufferEncoding* userBufferEncoding) noexcept = 0;
/**
* @brief Creates a UserBuffer
*
* @param[in] buffer Pointer to the buffer that the caller supplies
*
* @param[in] bufSize Buffer size, in bytes
*
* @param[in] strides Total number of bytes between elements in each dimension.
* E.g. A tightly packed tensor of floats with dimensions [4, 3, 2] would have strides of [24, 8, 4].
*
* @param[in] userBufferEncoding Reference to an UserBufferEncoding object
*
* @param[in] userBufferSource Reference to an UserBufferSource object
*
* @note Caller has to ensure that memory pointed to by buffer stays accessible
* for the lifetime of the object created
*/
virtual std::unique_ptr<IUserBuffer>
createUserBuffer(void *buffer, size_t bufSize, const zdl::DlSystem::TensorShape &strides, zdl::DlSystem::UserBufferEncoding* userBufferEncoding, zdl::DlSystem::UserBufferSource* userBufferSource) noexcept = 0;
};
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
}
}
#endif

View File

@@ -0,0 +1,230 @@
//=============================================================================
//
// Copyright (c) 2017-2018,2021 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//=============================================================================
#ifndef _DL_SYSTEM_PLATFORM_CONFIG_HPP_
#define _DL_SYSTEM_PLATFORM_CONFIG_HPP_
#include "DlSystem/ZdlExportDefine.hpp"
#include <string>
namespace zdl{
namespace DlSystem
{
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* @brief .
*
* A structure OpenGL configuration
*
* @note When certain OpenGL context and display are provided to UserGLConfig for using
* GPU buffer as input directly, the user MUST ensure the particular OpenGL
* context and display remain vaild throughout the execution of neural network models.
*/
struct ZDL_EXPORT UserGLConfig
{
/// Holds user EGL context.
///
void* userGLContext = nullptr;
/// Holds user EGL display.
void* userGLDisplay = nullptr;
};
/**
* @brief .
*
* A structure Gpu configuration
*/
struct ZDL_EXPORT UserGpuConfig{
/// Holds user OpenGL configuration.
///
UserGLConfig userGLConfig;
};
/**
* @brief .
*
* A class user platform configuration
*/
class ZDL_EXPORT PlatformConfig
{
public:
/**
* @brief .
*
* An enum class of all supported platform types
*/
enum class PlatformType_t
{
/// Unknown platform type.
UNKNOWN = 0,
/// Snapdragon CPU.
CPU = 1,
/// Adreno GPU.
GPU = 2,
/// Hexagon DSP.
DSP = 3
};
/**
* @brief .
*
* A union class user platform configuration information
*/
union PlatformConfigInfo
{
/// Holds user GPU Configuration.
///
UserGpuConfig userGpuConfig;
PlatformConfigInfo(){};
};
PlatformConfig() : m_PlatformType(PlatformType_t::UNKNOWN),
m_PlatformOptions("") {};
/**
* @brief Retrieves the platform type
*
* @return Platform type
*/
PlatformType_t getPlatformType() const {return m_PlatformType;};
/**
* @brief Indicates whther the plaform configuration is valid.
*
* @return True if the platform configuration is valid; false otherwise.
*/
bool isValid() const {return (PlatformType_t::UNKNOWN != m_PlatformType);};
/**
* @brief Retrieves the Gpu configuration
*
* @param[out] userGpuConfig The passed in userGpuConfig populated with the Gpu configuration on return.
*
* @return True if Gpu configuration was retrieved; false otherwise.
*/
bool getUserGpuConfig(UserGpuConfig& userGpuConfig) const
{
if(m_PlatformType == PlatformType_t::GPU)
{
userGpuConfig = m_PlatformConfigInfo.userGpuConfig;
return true;
}
else
{
return false;
}
}
/**
* @brief Sets the Gpu configuration
*
* @param[in] userGpuConfig Gpu Configuration
*
* @return True if Gpu configuration was successfully set; false otherwise.
*/
bool setUserGpuConfig(UserGpuConfig& userGpuConfig)
{
if((userGpuConfig.userGLConfig.userGLContext != nullptr) && (userGpuConfig.userGLConfig.userGLDisplay != nullptr))
{
switch (m_PlatformType)
{
case PlatformType_t::GPU:
m_PlatformConfigInfo.userGpuConfig = userGpuConfig;
return true;
case PlatformType_t::UNKNOWN:
m_PlatformType = PlatformType_t::GPU;
m_PlatformConfigInfo.userGpuConfig = userGpuConfig;
return true;
default:
return false;
}
}
else
return false;
}
/**
* @brief Sets the platform options
*
* @param[in] options Options as a string in the form of "keyword:options"
*
* @return True if options are pass validation; otherwise false. If false, the options are not updated.
*/
bool setPlatformOptions(std::string options) {
std::string oldOptions = m_PlatformOptions;
m_PlatformOptions = options;
if (isOptionsValid()) {
return true;
} else {
m_PlatformOptions = oldOptions;
return false;
}
}
/**
* @brief Indicates whther the plaform configuration is valid.
*
* @return True if the platform configuration is valid; false otherwise.
*/
bool isOptionsValid() const;
/**
* @brief Gets the platform options
*
* @return Options as a string
*/
std::string getPlatformOptions() const { return m_PlatformOptions; }
/**
* @brief Sets the platform options
*
* @param[in] optionName Name of platform options"
* @param[in] value Value of specified optionName
*
* @return If true, add "optionName:value" to platform options if optionName don't exist, otherwise update the
* value of specified optionName.
* If false, the platform options will not be changed.
*/
bool setPlatformOptionValue(const std::string& optionName, const std::string& value);
/**
* @brief Removes the platform options
*
* @param[in] optionName Name of platform options"
* @param[in] value Value of specified optionName
*
* @return If true, removed "optionName:value" to platform options if optionName don't exist, do nothing.
* If false, the platform options will not be changed.
*/
bool removePlatformOptionValue(const std::string& optionName, const std::string& value);
static void SetIsUserGLBuffer(bool isUserGLBuffer);
static bool GetIsUserGLBuffer();
private:
PlatformType_t m_PlatformType;
PlatformConfigInfo m_PlatformConfigInfo;
std::string m_PlatformOptions;
static bool m_IsUserGLBuffer;
};
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
}} //namespace end
#endif

View File

@@ -0,0 +1,154 @@
//=============================================================================
//
// Copyright (c) 2019 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//=============================================================================
#include "ZdlExportDefine.hpp"
#include "DlSystem/DlEnums.hpp"
#include "DlSystem/StringList.hpp"
#include <cstddef>
#include <memory>
#ifndef DL_SYSTEM_RUNTIME_LIST_HPP
#define DL_SYSTEM_RUNTIME_LIST_HPP
namespace DlSystem
{
// Forward declaration of Runtime List implementation.
class RuntimeListImpl;
}
namespace zdl
{
namespace DlSystem
{
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* @brief .
*
* A class representing list of runtimes
*/
class ZDL_EXPORT RuntimeList final
{
public:
/**
* @brief .
*
* Creates a new runtime list
*
*/
RuntimeList();
/**
* @brief .
*
* copy constructor.
* @param[in] other object to copy.
*/
RuntimeList(const RuntimeList& other);
/**
* @brief .
*
* constructor with single Runtime_t object
* @param[in] Runtime_t object
*/
RuntimeList(const zdl::DlSystem::Runtime_t& runtime);
/**
* @brief .
*
* assignment operator.
*/
RuntimeList& operator=(const RuntimeList& other);
/**
* @brief .
*
* subscript operator.
*/
Runtime_t& operator[](size_t index);
/**
* @brief Adds runtime to the end of the runtime list
* order of precedence is former followed by latter entry
*
* @param[in] runtime to add
*
* Ruturns false If the runtime already exists
*/
bool add(const zdl::DlSystem::Runtime_t& runtime);
/**
* @brief Removes the runtime from the list
*
* @param[in] runtime to be removed
*
* @note If the runtime is not found, nothing is done.
*/
void remove(const zdl::DlSystem::Runtime_t runtime) noexcept;
/**
* @brief Returns the number of runtimes in the list
*/
size_t size() const noexcept;
/**
* @brief Returns true if the list is empty
*/
bool empty() const noexcept;
/**
* @brief .
*
* Removes all runtime from the list
*/
void clear() noexcept;
/**
* @brief .
*
* Returns a StringList of names from the runtime list in
* order of precedence
*/
zdl::DlSystem::StringList getRuntimeListNames() const;
/**
* @brief .
*
* @param[in] runtime string
* Returns a Runtime enum corresponding to the in param string
*
*/
static zdl::DlSystem::Runtime_t stringToRuntime(const char* runtimeStr);
/**
* @brief .
*
* @param[in] runtime
* Returns a string corresponding to the in param runtime enum
*
*/
static const char* runtimeToString(const zdl::DlSystem::Runtime_t runtime);
~RuntimeList();
private:
void deepCopy(const RuntimeList &other);
std::unique_ptr<::DlSystem::RuntimeListImpl> m_RuntimeListImpl;
};
} // DlSystem namespace
} // zdl namespace
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
#endif // DL_SYSTEM_RUNTIME_LIST_HPP

View File

@@ -0,0 +1,104 @@
//=============================================================================
//
// Copyright (c) 2017, 2020 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//=============================================================================
#ifndef PLATFORM_STANDARD_STRING_HPP
#define PLATFORM_STANDARD_STRING_HPP
#include <cstdio>
#include <string>
#include <ostream>
#include "DlSystem/ZdlExportDefine.hpp"
namespace zdl
{
namespace DlSystem
{
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* @brief .
*
* Class for wrapping char * as a really stripped down std::string replacement.
*/
class ZDL_EXPORT String final
{
public:
String() = delete;
/**
* Construct a string from std::string reference.
* @param str Reference to a std::string
*/
explicit String(const std::string& str);
/**
* Construct a string from char* reference.
* @param a char*
*/
explicit String(const char* str);
/**
* move constructor.
*/
String(String&& other) noexcept;
/**
* copy constructor.
*/
String(const String& other) = delete;
/**
* assignment operator.
*/
String& operator=(const String&) = delete;
/**
* move assignment operator.
*/
String& operator=(String&&) = delete;
/**
* class comparators
*/
bool operator<(const String& rhs) const noexcept;
bool operator>(const String& rhs) const noexcept;
bool operator<=(const String& rhs) const noexcept;
bool operator>=(const String& rhs) const noexcept;
bool operator==(const String& rhs) const noexcept;
bool operator!=(const String& rhs) const noexcept;
/**
* class comparators against std::string
*/
bool operator<(const std::string& rhs) const noexcept;
bool operator>(const std::string& rhs) const noexcept;
bool operator<=(const std::string& rhs) const noexcept;
bool operator>=(const std::string& rhs) const noexcept;
bool operator==(const std::string& rhs) const noexcept;
bool operator!=(const std::string& rhs) const noexcept;
const char* c_str() const noexcept;
~String();
private:
char* m_string;
};
/**
* overloaded << operator
*/
ZDL_EXPORT std::ostream& operator<<(std::ostream& os, const String& str) noexcept;
} // DlSystem namespace
} // zdl namespace
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
#endif // PLATFORM_STANDARD_STRING_HPP

View File

@@ -0,0 +1,107 @@
//=============================================================================
//
// Copyright (c) 2016 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//=============================================================================
#include <cstdio>
#include "ZdlExportDefine.hpp"
#ifndef DL_SYSTEM_STRINGLIST_HPP
#define DL_SYSTEM_STRINGLIST_HPP
namespace zdl
{
namespace DlSystem
{
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* @brief .
*
* Class for holding an order list of null-terminated ASCII strings.
*/
class ZDL_EXPORT StringList final
{
public:
StringList() {}
/**
* Construct a string list with some pre-allocated memory.
* @warning Contents of the list will be uninitialized
* @param[in] length Number of elements for which to pre-allocate space.
*/
explicit StringList(size_t length);
/**
* Append a string to the list.
* @param[in] str Null-terminated ASCII string to append to the list.
*/
void append(const char* str);
/**
* Returns the string at the indicated position,
* or an empty string if the positions is greater than the size
* of the list.
* @param[in] idx Position in the list of the desired string
*/
const char* at(size_t idx) const noexcept;
/**
* Pointer to the first string in the list.
* Can be used to iterate through the list.
*/
const char** begin() const noexcept;
/**
* Pointer to one after the last string in the list.
* Can be used to iterate through the list.
*/
const char** end() const noexcept;
/**
* Return the number of valid string pointers held by this list.
*/
size_t size() const noexcept;
/**
* assignment operator.
*/
StringList& operator=(const StringList&) noexcept;
/**
* copy constructor.
* @param[in] other object to copy.
*/
StringList(const StringList& other);
/**
* move constructor.
* @param[in] other object to move.
*/
StringList(StringList&& other) noexcept;
~StringList();
private:
void copy(const StringList& other);
void resize(size_t length);
void clear();
static const char* s_Empty;
const char** m_Strings = nullptr;
const char** m_End = nullptr;
size_t m_Size = 0;
};
} // DlSystem namespace
} // zdl namespace
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
#endif // DL_SYSTEM_STRINGLIST_HPP

View File

@@ -0,0 +1,120 @@
//=============================================================================
//
// Copyright (c) 2016 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//=============================================================================
#include <memory>
#include "ZdlExportDefine.hpp"
#include "ITensor.hpp"
#include "StringList.hpp"
#ifndef DL_SYSTEM_TENSOR_MAP_HPP
#define DL_SYSTEM_TENSOR_MAP_HPP
namespace DlSystem
{
// Forward declaration of tensor map implementation.
class TensorMapImpl;
}
namespace zdl
{
namespace DlSystem
{
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* @brief .
*
* A class representing the map of tensor.
*/
class ZDL_EXPORT TensorMap final
{
public:
/**
* @brief .
*
* Creates a new empty tensor map
*/
TensorMap();
/**
* copy constructor.
* @param[in] other object to copy.
*/
TensorMap(const TensorMap& other);
/**
* assignment operator.
*/
TensorMap& operator=(const TensorMap& other);
/**
* @brief Adds a name and the corresponding tensor pointer
* to the map
*
* @param[in] name The name of the tensor
* @param[out] tensor The pointer to the tensor
*
* @note If a tensor with the same name already exists, the
* tensor is replaced with the existing tensor.
*/
void add(const char *name, zdl::DlSystem::ITensor *tensor);
/**
* @brief Removes a mapping of tensor and its name by its name
*
* @param[in] name The name of tensor to be removed
*
* @note If no tensor with the specified name is found, nothing
* is done.
*/
void remove(const char *name) noexcept;
/**
* @brief Returns the number of tensors in the map
*/
size_t size() const noexcept;
/**
* @brief .
*
* Removes all tensors from the map
*/
void clear() noexcept;
/**
* @brief Returns the tensor given its name.
*
* @param[in] name The name of the tensor to get.
*
* @return nullptr if no tensor with the specified name is
* found; otherwise, a valid pointer to the tensor.
*/
zdl::DlSystem::ITensor* getTensor(const char *name) const noexcept;
/**
* @brief .
*
* Returns the names of all tensors
*/
zdl::DlSystem::StringList getTensorNames() const;
~TensorMap();
private:
void swap(const TensorMap &other);
std::unique_ptr<::DlSystem::TensorMapImpl> m_TensorMapImpl;
};
} // DlSystem namespace
} // zdl namespace
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
#endif // DL_SYSTEM_TENSOR_MAP_HPP

View File

@@ -0,0 +1,203 @@
//=============================================================================
//
// Copyright (c) 2016 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//=============================================================================
#include <initializer_list>
#include <cstdio>
#include <memory>
#include <vector>
#include "ZdlExportDefine.hpp"
#ifndef DL_SYSTEM_TENSOR_SHAPE_HPP
#define DL_SYSTEM_TENSOR_SHAPE_HPP
namespace DlSystem
{
// Forward declaration of tensor shape implementation.
class TensorShapeImpl;
}
namespace zdl
{
namespace DlSystem
{
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* @brief .
*
* Convenient typedef to represent dimension
*/
using Dimension = size_t;
/**
* @brief .
*
* A class representing the shape of tensor. It is used at the
* time of creation of tensor.
*/
class ZDL_EXPORT TensorShape final
{
public:
/**
* @brief .
*
* Creates a new shape with a list of dims specified in
* initializer list fashion.
*
* @param[in] dims The dimensions are specified in which the last
* element of the vector represents the fastest varying
* dimension and the zeroth element represents the slowest
* varying, etc.
*
*/
TensorShape(std::initializer_list<Dimension> dims);
/**
* @brief .
*
* Creates a new shape with a list of dims specified in array
*
* @param[in] dims The dimensions are specified in which the last
* element of the vector represents the fastest varying
* dimension and the zeroth element represents the slowest
* varying, etc.
*
* @param[in] size Size of the array.
*
*/
TensorShape(const Dimension *dims, size_t size);
/**
* @brief .
*
* Creates a new shape with a vector of dims specified in
* vector fashion.
*
* @param[in] dims The dimensions are specified in which the last
* element of the vector represents the fastest varying
* dimension and the zeroth element represents the slowest
* varying, etc.
*
*/
TensorShape(std::vector<Dimension> dims);
/**
* @brief .
*
* copy constructor.
* @param[in] other object to copy.
*/
TensorShape(const TensorShape& other);
/**
* @brief .
*
* assignment operator.
*/
TensorShape& operator=(const TensorShape& other);
/**
* @brief .
*
* Creates a new shape with no dims. It can be extended later
* by invoking concatenate.
*/
TensorShape();
/**
* @brief .
*
* Concatenates additional dimensions specified in
* initializer list fashion to the existing dimensions.
*
* @param[in] dims The dimensions are specified in which the last
* element of the vector represents the fastest varying
* dimension and the zeroth element represents the slowest
* varying, etc.
*
*/
void concatenate(std::initializer_list<Dimension> dims);
/**
* @brief .
*
* Concatenates additional dimensions specified in
* the array to the existing dimensions.
*
* @param[in] dims The dimensions are specified in which the last
* element of the vector represents the fastest varying
* dimension and the zeroth element represents the slowest
* varying, etc.
*
* @param[in] size Size of the array.
*
*/
void concatenate(const Dimension *dims, size_t size);
/**
* @brief .
*
* Concatenates an additional dimension to the existing
* dimensions.
*
* @param[in] dim The dimensions are specified in which the last element
* of the vector represents the fastest varying dimension and the
* zeroth element represents the slowest varying, etc.
*
*/
void concatenate(const Dimension &dim);
/**
* @brief .
*
* Retrieves a single dimension, based on its index.
*
* @return The value of dimension
*
* @throws std::out_of_range if the index is >= the number of
* dimensions (or rank).
*/
Dimension& operator[](size_t index);
Dimension& operator[](size_t index) const;
/**
* @brief .
*
* Retrieves the rank i.e. number of dimensions.
*
* @return The rank
*/
size_t rank() const;
/**
* @brief .
*
* Retrieves a pointer to the first dimension of shape
*
* @return nullptr if no dimension exists; otherwise, points to
* the first dimension.
*
*/
const Dimension* getDimensions() const;
~TensorShape();
private:
void swap(const TensorShape &other);
std::unique_ptr<::DlSystem::TensorShapeImpl> m_TensorShapeImpl;
};
} // DlSystem namespace
} // zdl namespace
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
#endif // DL_SYSTEM_TENSOR_SHAPE_HPP

View File

@@ -0,0 +1,127 @@
//=============================================================================
//
// Copyright (c) 2017-2020 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//=============================================================================
#include <initializer_list>
#include <cstdio>
#include <memory>
#include "ZdlExportDefine.hpp"
#include "DlSystem/TensorShape.hpp"
#include "DlSystem/StringList.hpp"
#ifndef DL_SYSTEM_TENSOR_SHAPE_MAP_HPP
#define DL_SYSTEM_TENSOR_SHAPE_MAP_HPP
namespace DlSystem
{
// Forward declaration of tensor shape map implementation.
class TensorShapeMapImpl;
}
namespace zdl
{
namespace DlSystem
{
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* @brief .
*
* A class representing the map of names and tensorshapes.
*/
class ZDL_EXPORT TensorShapeMap final
{
public:
/**
* @brief .
*
* Creates a new tensor shape map
*
*/
TensorShapeMap();
/**
* @brief .
*
* copy constructor.
* @param[in] other object to copy.
*/
TensorShapeMap(const TensorShapeMap& other);
/**
* @brief .
*
* assignment operator.
*/
TensorShapeMap& operator=(const TensorShapeMap& other);
/**
* @brief Adds a name and the corresponding tensor pointer
* to the map
*
* @param[in] name The name of the tensor
* @param[out] tensor The pointer to the tensor
*
* @note If a tensor with the same name already exists, no new
* tensor is added.
*/
void add(const char *name, const zdl::DlSystem::TensorShape& tensorShape);
/**
* @brief Removes a mapping of tensor and its name by its name
*
* @param[in] name The name of tensor to be removed
*
* @note If no tensor with the specified name is found, nothing
* is done.
*/
void remove(const char *name) noexcept;
/**
* @brief Returns the number of tensors in the map
*/
size_t size() const noexcept;
/**
* @brief .
*
* Removes all tensors from the map
*/
void clear() noexcept;
/**
* @brief Returns the tensor given its name.
*
* @param[in] name The name of the tensor to get.
*
* @return nullptr if no tensor with the specified name is
* found; otherwise, a valid pointer to the tensor.
*/
zdl::DlSystem::TensorShape getTensorShape(const char *name) const noexcept;
/**
* @brief .
*
* Returns the names of all tensor shapes
*/
zdl::DlSystem::StringList getTensorShapeNames() const;
~TensorShapeMap();
private:
void swap(const TensorShapeMap &other);
std::unique_ptr<::DlSystem::TensorShapeMapImpl> m_TensorShapeMapImpl;
};
} // DlSystem namespace
} // zdl namespace
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
#endif // DL_SYSTEM_TENSOR_SHAPE_MAP_HPP

View File

@@ -0,0 +1,243 @@
//==============================================================================
//
// Copyright (c) 2016-2021 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//==============================================================================
#ifndef UDL_CONTEXT_HPP
#define UDL_CONTEXT_HPP
#include <cstring> // memset
#include <tuple>
#include "ZdlExportDefine.hpp"
namespace zdl { namespace DlSystem {
/**
* NOTE: DEPRECATED, MAY BE REMOVED IN THE FUTURE.
*
* @brief .
*
* UDLContext holds the user defined layer context which
* consists of a layer name, layer ID, blob and blob size.
*
* An instance of UDLContext is passed as an argument to the
* UDLFactoryFunc provided by the user every time the SNPE
* runtime encounters an unknown layer descriptor. The instance
* of a UDLContext is created by the SNPE runtime and is
* consumed by the user's factory function. The user should
* obtain a copy of this class and should not assume any
* prolonged object lifetime beyond the UDLFactoryFunction.
*/
class ZDL_EXPORT UDLContext final {
public:
/**
* @brief Constructor
*
* @param[in] name name of the layer
*
* @param[in] type layer type
*
* @param[in] id identifier for the layer
*
* @param[in] id Blob/bytes as packed by the user code as part of
* the Python converter script
*/
UDLContext(const std::string& name,
const std::string& type,
int32_t id,
const std::string& blob) :
m_Name(name), m_Type(type), m_Size(blob.size()), m_Id(id) {
// FIXME not dealing with alloc error
m_Buffer = new uint8_t[m_Size];
std::memcpy(m_Buffer, blob.data(), m_Size);
}
/**
* @brief .
*
* Empty constructor is useful for
* creating an empty UDLContext and then run copy constructor
* from a fully initialized one.
*/
explicit UDLContext() {}
/**
* @brief .
*
* destructor Deallocates any internal allocated memory
*/
~UDLContext() { release(); }
/**
* @brief .
*
* Deallocate any internally allocated memory
*/
void release() {
if (m_Buffer && m_Size)
std::memset(m_Buffer, 0, m_Size);
delete []m_Buffer;
m_Buffer = nullptr;
m_Size = 0;
}
/**
* @brief .
*
* Copy Constructor - makes a copy from ctx
*
* @param[in] ctx Source UDLContext to copy from
*/
UDLContext(const UDLContext& ctx) : m_Name(ctx.m_Name),
m_Type(ctx.m_Type),
m_Id(ctx.m_Id) {
std::tuple<uint8_t*, size_t> cpy = ctx.getCopy();
// current compiler does not support get<type>
m_Buffer = std::get<0>(cpy);
m_Size = std::get<1>(cpy);
}
/**
* @brief
*
* Assignment operator - makes a copy from ctx
*
* @param[in] ctx Source UDLContext to copy from
*
* @return this
*/
UDLContext& operator=(const UDLContext& ctx) {
UDLContext c (ctx);
this->swap(c); // non throwing swap
return *this;
}
/**
* @brief .
*
* Move Constructor - Move internals from ctx into this
*
* @param[in] ctx Source UDLContext to move from
*/
UDLContext(UDLContext&& ctx) :
m_Name(std::move(ctx.m_Name)),
m_Type(std::move(ctx.m_Type)),
m_Buffer(ctx.m_Buffer),
m_Size(ctx.m_Size),
m_Id(ctx.m_Id) {
ctx.clear();
}
/**
* @brief .
*
* Assignment move - Move assignment operator from ctx
*
* @param[in] ctx Source UDLContext to move from
*
* @return this
*/
UDLContext& operator=(UDLContext&& ctx) {
m_Name = std::move(ctx.m_Name);
m_Type = std::move(ctx.m_Type);
m_Buffer = ctx.m_Buffer;
m_Size = ctx.m_Size;
m_Id = ctx.m_Id;
ctx.clear();
return *this;
}
/**
* @brief .
*
* Obtain the name of the layer
*
* @return const reference to the name of the layer
*/
const std::string& getName() const noexcept { return m_Name; }
/**
* @brief .
*
* Obtain the type of the layer
*
* @return const reference to the type of the layer
*/
const std::string& getType() const noexcept { return m_Type; }
/**
* @brief .
*
* Obtain the Id of the layer
*
* @return The id of the layer
*/
int32_t getId() const noexcept { return m_Id; }
/**
* @brief .
*
* Obtain the size of the blob
*
* @return Size of the internal blob
*/
size_t getSize() const noexcept { return m_Size; }
/**
* @brief .
*
* Get a const pointer to the internal blob
*
* @return Const pointer to the internal blob
*/
const uint8_t* getBlob() const noexcept { return m_Buffer; }
/**
* @brief .
*
* Get a copy of the blob/size into a tuple
*
* @return A tuple with a pointer to a copy of the blob and a
* size
*/
std::tuple<uint8_t*, size_t> getCopy() const {
uint8_t* buf = new uint8_t[m_Size];
// FIXME missing memcpy
std::memcpy(buf, m_Buffer, m_Size);
return std::make_tuple(buf, m_Size);
}
/**
* @brief .
*
* Set zeros in the internals members
*/
void clear() {
m_Name.clear();
m_Type.clear();
m_Buffer = 0;
m_Size = 0;
m_Id = -1;
}
private:
void swap(UDLContext& c) noexcept {
std::swap(m_Name, c.m_Name);
std::swap(m_Type, c.m_Type);
std::swap(m_Id, c.m_Id);
std::swap(m_Buffer, c.m_Buffer);
std::swap(m_Size, c.m_Size);
}
std::string m_Name; // name of the layer instance
std::string m_Type; // The actual layer type
uint8_t* m_Buffer = nullptr;
size_t m_Size = 0;
int32_t m_Id = -1;
};
}}
#endif /* UDL_CONTEXT_HPP */

View File

@@ -0,0 +1,87 @@
//==============================================================================
//
// Copyright (c) 2015-2021 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//==============================================================================
#ifndef _UDL_FUNC_HPP_
#define _UDL_FUNC_HPP_
#include <functional>
#include "ZdlExportDefine.hpp"
#include <DlSystem/IUDL.hpp>
namespace zdl {
namespace DlSystem {
class UDLContext;
}
}
namespace zdl { namespace DlSystem {
/**
* NOTE: DEPRECATED, MAY BE REMOVED IN THE FUTURE.
*
* @brief .
*
* Definition of UDLFactoyFunc, using/typedef and default FactoryFunction
* UDLBundle - a simple way to bundle func and cookie into one type
*/
/**
* @brief .
*
* Convenient typedef for user defined layer creation factory
*
* @param[out] void* Cookie - a user opaque data that was passed during SNPE's runtime's
* CreateInstance. SNPE's runtime is passing this back to the user.
*
* @param[out] DlSystem::UDLContext* - The specific Layer Description context what is passe
* SNPE runtime.
*
* @return IUDL* - a Concrete instance of IUDL derivative
*/
using UDLFactoryFunc = std::function<zdl::DlSystem::IUDL* (void*, const zdl::DlSystem::UDLContext*)>;
/**
* NOTE: DEPRECATED, MAY BE REMOVED IN THE FUTURE.
*
* @brief .
*
* default UDL factory implementation
*
* @param[out] DlSystem::UDLContext* - The specific Layer Description context what is passe
* SNPE runtime.
*
* @param[out] void* Cookie - a user opaque data that was passed during SNPE's runtime's
* CreateInstance. SNPE's runtime is passing this back to the user.
*
* @return IUDL* - nullptr to indicate SNPE's runtime that there is no specific
* implementation for UDL. When SNPE's runtime sees nullptr as a return
* value from the factory, it will halt execution if model has an unknown layer
*
*/
inline ZDL_EXPORT zdl::DlSystem::IUDL* DefaultUDLFunc(void*, const zdl::DlSystem::UDLContext*) { return nullptr; }
/**
* NOTE: DEPRECATED, MAY BE REMOVED IN THE FUTURE.
*
* @brief .
*
* Simple struct to bundle 2 elements.
* A user defined cookie that would be returned for each
* IUDL call. The user can place anything there and the
* SNPE runtime will provide it back
*/
struct ZDL_EXPORT UDLBundle {
void *cookie = nullptr;
UDLFactoryFunc func = DefaultUDLFunc;
};
}}
#endif // _UDL_FUNC_HPP_

View File

@@ -0,0 +1,122 @@
//=============================================================================
//
// Copyright (c) 2017 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//=============================================================================
#include <memory>
#include "ZdlExportDefine.hpp"
#include "StringList.hpp"
#ifndef DL_SYSTEM_USER_BUFFER_MAP_HPP
#define DL_SYSTEM_USER_BUFFER_MAP_HPP
namespace DlSystem
{
// Forward declaration of UserBuffer map implementation.
class UserBufferMapImpl;
}
namespace zdl
{
namespace DlSystem
{
class IUserBuffer;
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* @brief .
*
* A class representing the map of UserBuffer.
*/
class ZDL_EXPORT UserBufferMap final
{
public:
/**
* @brief .
*
* Creates a new empty UserBuffer map
*/
UserBufferMap();
/**
* copy constructor.
* @param[in] other object to copy.
*/
UserBufferMap(const UserBufferMap& other);
/**
* assignment operator.
*/
UserBufferMap& operator=(const UserBufferMap& other);
/**
* @brief Adds a name and the corresponding UserBuffer pointer
* to the map
*
* @param[in] name The name of the UserBuffer
* @param[in] userBuffer The pointer to the UserBuffer
*
* @note If a UserBuffer with the same name already exists, the new
* UserBuffer pointer would be updated.
*/
void add(const char *name, zdl::DlSystem::IUserBuffer *buffer);
/**
* @brief Removes a mapping of one UserBuffer and its name by its name
*
* @param[in] name The name of UserBuffer to be removed
*
* @note If no UserBuffer with the specified name is found, nothing
* is done.
*/
void remove(const char *name) noexcept;
/**
* @brief Returns the number of UserBuffers in the map
*/
size_t size() const noexcept;
/**
* @brief .
*
* Removes all UserBuffers from the map
*/
void clear() noexcept;
/**
* @brief Returns the UserBuffer given its name.
*
* @param[in] name The name of the UserBuffer to get.
*
* @return nullptr if no UserBuffer with the specified name is
* found; otherwise, a valid pointer to the UserBuffer.
*/
zdl::DlSystem::IUserBuffer* getUserBuffer(const char *name) const noexcept;
/**
* @brief .
*
* Returns the names of all UserBuffers
*
* @return A list of UserBuffer names.
*/
zdl::DlSystem::StringList getUserBufferNames() const;
~UserBufferMap();
private:
void swap(const UserBufferMap &other);
std::unique_ptr<::DlSystem::UserBufferMapImpl> m_UserBufferMapImpl;
};
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
} // DlSystem namespace
} // zdl namespace
#endif // DL_SYSTEM_TENSOR_MAP_HPP

View File

@@ -0,0 +1,129 @@
//=============================================================================
//
// Copyright (c) 2021 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//=============================================================================
#include <memory>
#include "ZdlExportDefine.hpp"
#include "StringList.hpp"
#ifndef DL_SYSTEM_USER_MEMORY_MAP_HPP
#define DL_SYSTEM_USER_MEMORY_MAP_HPP
namespace DlSystem
{
// Forward declaration of UserMemory map implementation.
class UserMemoryMapImpl;
}
namespace zdl
{
namespace DlSystem
{
class IUserBuffer;
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* @brief .
*
* A class representing the map of UserMemory.
*/
class ZDL_EXPORT UserMemoryMap final
{
public:
/**
* @brief .
*
* Creates a new empty UserMemory map
*/
UserMemoryMap();
/**
* copy constructor.
* @param[in] other object to copy.
*/
UserMemoryMap(const UserMemoryMap& other);
/**
* assignment operator.
*/
UserMemoryMap& operator=(const UserMemoryMap& other);
/**
* @brief Adds a name and the corresponding buffer address
* to the map
*
* @param[in] name The name of the UserMemory
* @param[in] address The pointer to the Buffer Memory
*
* @note If a UserBuffer with the same name already exists, the new
* address would be updated.
*/
void add(const char *name, void *address);
/**
* @brief Removes a mapping of one Buffer address and its name by its name
*
* @param[in] name The name of Memory address to be removed
*
* @note If no UserBuffer with the specified name is found, nothing
* is done.
*/
void remove(const char *name) noexcept;
/**
* @brief Returns the number of User Memory addresses in the map
*/
size_t size() const noexcept;
/**
* @brief .
*
* Removes all User Memory from the map
*/
void clear() noexcept;
/**
* @brief .
*
* Returns the names of all User Memory
*
* @return A list of Buffer names.
*/
zdl::DlSystem::StringList getUserBufferNames() const;
/**
* @brief Returns the no of UserMemory addresses mapped to the buffer
*
* @param[in] name The name of the UserMemory
*
*/
size_t getUserMemoryAddressCount(const char *name) const noexcept;
/**
* @brief Returns address at a specified index corresponding to a UserMemory buffer name
*
* @param[in] name The name of the buffer
* @param[in] index The index in the list of addresses
*
*/
void* getUserMemoryAddressAtIndex(const char *name, uint32_t index) const noexcept;
~UserMemoryMap();
private:
void swap(const UserMemoryMap &other);
std::unique_ptr<::DlSystem::UserMemoryMapImpl> m_UserMemoryMapImpl;
};
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
} // DlSystem namespace
} // zdl namespace
#endif // DL_SYSTEM_TENSOR_MAP_HPP

View File

@@ -0,0 +1,13 @@
//=============================================================================
//
// Copyright (c) 2015, 2020 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//=============================================================================
#pragma once
#ifndef ZDL_EXPORT
#define ZDL_EXPORT
#endif

View File

@@ -0,0 +1,118 @@
// =============================================================================
//
// Copyright (c) 2018-2020 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
// =============================================================================
#ifndef SNPE_PLATFORMVALIDATOR_HPP
#define SNPE_PLATFORMVALIDATOR_HPP
#include "DlSystem/DlEnums.hpp"
#include "DlSystem/ZdlExportDefine.hpp"
#define DO_PRAGMA(s) _Pragma(#s)
#define NO_WARNING "-Wunused-variable"
#ifdef __clang__
#define SNPE_DISABLE_WARNINGS(clang_warning,gcc_warning) \
_Pragma("clang diagnostic push") \
DO_PRAGMA(clang diagnostic ignored clang_warning)
#define SNPE_ENABLE_WARNINGS \
_Pragma("clang diagnostic pop")
#elif defined __GNUC__
#define SNPE_DISABLE_WARNINGS(clang_warning,gcc_warning) \
_Pragma("GCC diagnostic push") \
DO_PRAGMA(GCC diagnostic ignored gcc_warning)
#define SNPE_ENABLE_WARNINGS \
_Pragma("GCC diagnostic pop")
#else
#define SNPE_DISABLE_WARNINGS(...)
#define SNPE_ENABLE_WARNINGS
#endif
SNPE_DISABLE_WARNINGS("-Wdelete-non-virtual-dtor","-Wdelete-non-virtual-dtor")
#include <string>
#include <memory>
SNPE_ENABLE_WARNINGS
namespace zdl
{
namespace SNPE
{
class PlatformValidator;
class IPlatformValidatorRuntime;
}
}
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* The class for checking SNPE compatibility/capability of a device.
*
*/
class ZDL_EXPORT zdl::SNPE::PlatformValidator
{
public:
/**
* @brief Default Constructor of the PlatformValidator Class
*
* @return A new instance of a PlatformValidator object
* that can be used to check the SNPE compatibility
* of a device
*/
PlatformValidator();
~PlatformValidator();
/**
* @brief Sets the runtime processor for compatibility check
*
* @return Void
*/
void setRuntime(zdl::DlSystem::Runtime_t runtime);
/**
* @brief Checks if the Runtime prerequisites for SNPE are available.
*
* @return True if the Runtime prerequisites are available, else false.
*/
bool isRuntimeAvailable();
/**
* @brief Returns the core version for the Runtime selected.
*
* @return String which contains the actual core version value
*/
std::string getCoreVersion();
/**
* @brief Returns the library version for the Runtime selected.
*
* @return String which contains the actual lib version value
*/
std::string getLibVersion();
/**
* @brief Runs a small program on the runtime and Checks if SNPE is supported for Runtime.
*
* @return If True, the device is ready for SNPE execution, else not.
*/
bool runtimeCheck();
private:
zdl::DlSystem::Runtime_t m_runtimeType;
std::unique_ptr<IPlatformValidatorRuntime> m_platformValidatorRuntime;
};
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
#endif //SNPE_PLATFORMVALIDATOR_HPP

View File

@@ -0,0 +1,101 @@
//==============================================================================
//
// Copyright (c) 2019 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//==============================================================================
#ifndef PSNPE_APPLICATIONBUFFERMAP_HPP
#define PSNPE_APPLICATIONBUFFERMAP_HPP
#include <vector>
#include <string>
#include <unordered_map>
#include "DlSystem/UserBufferMap.hpp"
#include "DlSystem/ZdlExportDefine.hpp"
namespace zdl
{
namespace PSNPE
{
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* @brief .
*
* A class representing the UserBufferMap of Input and Output asynchronous mode.
*/
class ZDL_EXPORT ApplicationBufferMap final
{
public:
/**
* @brief Adds a name and the corresponding buffer
* to the map
*
* @param[in] name The name of the UserBuffer
* @param[in] buffer The vector of the uint8_t data
*
* @note If a UserBuffer with the same name already exists, the new
* UserBuffer pointer would be updated.
*/
void add(const char* name, std::vector<uint8_t>& buff) noexcept;
void add(const char* name, std::vector<float>& buff) noexcept;
/**
* @brief Removes a mapping of one UserBuffer and its name by its name
*
* @param[in] name The name of UserBuffer to be removed
*
* @note If no UserBuffer with the specified name is found, nothing
* is done.
*/
void remove(const char* name) noexcept;
/**
* @brief Returns the number of UserBuffers in the map
*/
size_t size() const noexcept;
/**
* @brief .
*
* Removes all UserBuffers from the map
*/
void clear() noexcept;
/**
* @brief Returns the UserBuffer given its name.
*
* @param[in] name The name of the UserBuffer to get.
*
* @return nullptr if no UserBuffer with the specified name is
* found; otherwise, a valid pointer to the UserBuffer.
*/
const std::vector<uint8_t>& getUserBuffer(const char* name) const;
const std::vector<uint8_t>& operator[](const char* name) const;
/**
* @brief .
*
* Returns the names of all UserAsyncBufferMap
*
* @return A list of UserBuffer names.
*/
zdl::DlSystem::StringList getUserBufferNames() const;
const std::unordered_map<std::string, std::vector<uint8_t>>& getUserBuffer() const;
explicit ApplicationBufferMap();
~ApplicationBufferMap();
explicit ApplicationBufferMap(
const std::unordered_map<std::string, std::vector<uint8_t>> buffer);
private:
std::unordered_map<std::string, std::vector<uint8_t>> m_UserMap;
};
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
} // namespace PSNPE
} // namespace zdl
#endif // PSNPE_APPLICATIONBUFFERMAP_HPP

205
third_party/snpe/include/SNPE/PSNPE.hpp vendored Normal file
View File

@@ -0,0 +1,205 @@
// =============================================================================
//
// Copyright (c) 2019-2021 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
// =============================================================================
#ifndef PSNPE_HPP
#define PSNPE_HPP
#include <cstdlib>
#include <functional>
#include "SNPE/SNPE.hpp"
#include "DlSystem/UserBufferMap.hpp"
#include "DlContainer/IDlContainer.hpp"
#include "DlSystem/DlEnums.hpp"
#include "DlSystem/ZdlExportDefine.hpp"
#include "UserBufferList.hpp"
#include "RuntimeConfigList.hpp"
#include "ApplicationBufferMap.hpp"
namespace zdl
{
namespace PSNPE
{
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
*@ brief build snpe instance in serial or parallel
*
*/
enum ZDL_EXPORT BuildMode {
SERIAL = 0,
PARALLEL = 1
};
/**
* @brief Input and output transmission mode
*/
enum ZDL_EXPORT InputOutputTransmissionMode
{
sync = 0,
outputAsync = 1,
inputOutputAsync = 2
};
/**
* @brief A structure representing parameters of callback function of Async Output mode
*/
struct ZDL_EXPORT OutputAsyncCallbackParam
{
size_t dataIndex;
bool executeStatus;
std::string errorMsg;
OutputAsyncCallbackParam(size_t _index,bool _status, const std::string& _errorMsg = std::string())
: dataIndex(_index),executeStatus(_status), errorMsg(_errorMsg){};
};
/**
* @brief A structure representing parameters of callback function of Async Input/Output mode
*/
struct ZDL_EXPORT InputOutputAsyncCallbackParam
{
size_t dataIndex;
const ApplicationBufferMap& outputMap;
bool executeStatus;
std::string errorMsg;
InputOutputAsyncCallbackParam(size_t _index, const ApplicationBufferMap& output_map,bool _status,
const std::string _ErrorMsg = std::string())
: dataIndex(_index)
, outputMap(output_map)
, executeStatus(_status)
, errorMsg(_ErrorMsg){};
};
/**
* @brief This callback is called when the output data is ready, only use for Output Async mode
*/
using OutputAsyncCallbackFunc = std::function<void(OutputAsyncCallbackParam)>;
/**
* @brief This callback is called when the output data is ready, only use for Output-Input Async mode
*/
using InputOutputAsyncCallbackFunc = std::function<void(InputOutputAsyncCallbackParam)>;
/**
* @brief This callback is called when the input data is ready,only use for Output-Input Async mode
*/
using InputOutputAsyncInputCallback = std::function<std::shared_ptr<ApplicationBufferMap>(const std::vector<std::string> &,
const zdl::DlSystem::StringList &)>;
/**
* @brief .
*
* A structure PSNPE configuration
*
*/
struct ZDL_EXPORT BuildConfig final
{
BuildMode buildMode = BuildMode::SERIAL; ///< Specify build in serial mode or parallel mode
zdl::DlContainer::IDlContainer* container;///< The opened container ptr
zdl::DlSystem::StringList outputBufferNames;///< Specify the output layer name
zdl::DlSystem::StringList outputTensors;///< Specify the output layer name
RuntimeConfigList runtimeConfigList;///< The runtime config list for PSNPE, @see RuntimeConfig
size_t inputThreadNumbers = 1;///< Specify the number of threads used in the execution phase to process input data, only used in inputOutputAsync mode
size_t outputThreadNumbers = 1;///< Specify the number of threads used in the execution phase to process output data, only used in inputOutputAsync and outputAsync mode
OutputAsyncCallbackFunc outputCallback;///< The callback to deal with output data ,only used in outputAsync mode
InputOutputAsyncCallbackFunc inputOutputCallback;///< The callback to deal with output data ,only used in inputOutputAsync mode
InputOutputAsyncInputCallback inputOutputInputCallback;///< The callback to deal with input data ,only used in inputOutputAsync mode
InputOutputTransmissionMode inputOutputTransmissionMode = InputOutputTransmissionMode::sync;///< Specify execution mode
zdl::DlSystem::ProfilingLevel_t profilingLevel = zdl::DlSystem::ProfilingLevel_t::OFF;///< Specify profiling level for Diaglog
uint64_t encode[2] = {0, 0};
bool enableInitCache = false;
std::string platformOptions;
std::string diaglogOutputDir = "./diaglogs/"; ///< Specify a diaglog output directory to save the generated Diaglog files.
};
/**
* @brief .
*
* The class for executing SNPE instances in parallel.
*/
class ZDL_EXPORT PSNPE final
{
public:
~PSNPE();
explicit PSNPE() noexcept :m_TransmissionMode(InputOutputTransmissionMode::sync){};
/**
* @brief Build snpe instances.
*
*/
bool build(BuildConfig& buildConfig) noexcept;
/**
* @brief Execute snpe instances in Async Output mode and Sync mode
*
* @param[in] inputBufferList A list of user buffers that contains the input data
*
* @param[in,out] outputBufferList A list of user buffers that will hold the output data
*
*/
bool execute(UserBufferList& inputBufferList, UserBufferList& outputBufferList) noexcept;
/**
* @brief Execute snpe instances in Async Input/Output mode
*
* @param[in]inputMap A map of input buffers that contains input data. The names of buffers
* need to be matched with names retrived through getInputTensorNames()
*
* @param dataIndex Index of the input data
*
* @param isTF8buff Whether prefer to using 8 bit quantized element for inference
*
* @return True if executed successfully; flase, otherwise.
*/
bool executeInputOutputAsync(const std::vector<std::string>& inputMap, size_t dataIndex, bool isTF8buff) noexcept;
bool executeInputOutputAsync(const std::vector<std::string>& inputMap, size_t dataIndex, bool isTF8buff,bool isTF8Outputbuff) noexcept;
/**
* @brief Returns the input layer names of the network.
*
* @return StringList which contains the input layer names
*/
const zdl::DlSystem::StringList getInputTensorNames() const noexcept;
/**
* @brief Returns the output layer names of the network.
*
* @return StringList which contains the output layer names
*/
const zdl::DlSystem::StringList getOutputTensorNames() const noexcept;
/**
* @brief Returns the input tensor dimensions of the network.
*
* @return TensorShape which contains the dimensions.
*/
const zdl::DlSystem::TensorShape getInputDimensions() const noexcept;
const zdl::DlSystem::TensorShape getInputDimensions(const char *name) const noexcept;
/**
* @brief Returns attributes of buffers.
*
* @see zdl::SNPE
*
* @return BufferAttributes of input/output tensor named.
*/
const zdl::DlSystem::TensorShape getBufferAttributesDims(const char *name) const noexcept;
zdl::DlSystem::Optional<zdl::DlSystem::IBufferAttributes*> getInputOutputBufferAttributes(const char *name) const noexcept;
bool registerIonBuffers(const zdl::DlSystem::UserMemoryMap& ionBufferMap) const noexcept;
bool deregisterIonBuffers(const zdl::DlSystem::StringList& ionBufferNames) const noexcept;
const char* getLastErrorString();
private:
PSNPE(const PSNPE&) = delete;
PSNPE& operator=(const PSNPE&) = delete;
zdl::PSNPE::InputOutputTransmissionMode m_TransmissionMode;
};
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
} // namespace PSNPE
} // namespace zdl
#endif // PSNPE_HPP

View File

@@ -0,0 +1,85 @@
//==============================================================================
//
// Copyright (c) 2019-2020 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//==============================================================================
#ifndef PSNPE_RUNTIMECONFIGLIST_HPP
#define PSNPE_RUNTIMECONFIGLIST_HPP
#include <iostream>
#include "DlContainer/IDlContainer.hpp"
#include "DlSystem/DlEnums.hpp"
#include "DlSystem/RuntimeList.hpp"
#include "DlSystem/TensorShapeMap.hpp"
#include "DlSystem/ZdlExportDefine.hpp"
namespace zdl {
namespace PSNPE {
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* @brief .
*
* The structure for configuring a BulkSNPE runtime
*
*/
struct ZDL_EXPORT RuntimeConfig final {
zdl::DlSystem::Runtime_t runtime;
zdl::DlSystem::RuntimeList runtimeList;
zdl::DlSystem::PerformanceProfile_t perfProfile;
zdl::DlSystem::TensorShapeMap inputDimensionsMap;
bool enableCPUFallback;
RuntimeConfig()
: runtime{zdl::DlSystem::Runtime_t::CPU_FLOAT32},
perfProfile{zdl::DlSystem::PerformanceProfile_t::HIGH_PERFORMANCE},
enableCPUFallback{false} {}
RuntimeConfig(const RuntimeConfig& other) {
runtime = other.runtime;
runtimeList = other.runtimeList;
perfProfile = other.perfProfile;
enableCPUFallback = other.enableCPUFallback;
inputDimensionsMap = other.inputDimensionsMap;
}
RuntimeConfig& operator=(const RuntimeConfig& other) {
this->runtimeList = other.runtimeList;
this->runtime = other.runtime;
this->perfProfile = other.perfProfile;
this->enableCPUFallback = other.enableCPUFallback;
this->inputDimensionsMap = other.inputDimensionsMap;
return *this;
}
~RuntimeConfig() {}
};
/**
* @brief .
*
* The class for creating a RuntimeConfig container.
*
*/
class ZDL_EXPORT RuntimeConfigList final {
public:
RuntimeConfigList();
RuntimeConfigList(const size_t size);
void push_back(const RuntimeConfig& runtimeConfig);
RuntimeConfig& operator[](const size_t index);
RuntimeConfigList& operator=(const RuntimeConfigList& other);
size_t size() const noexcept;
size_t capacity() const noexcept;
void clear() noexcept;
~RuntimeConfigList() = default;
private:
void swap(const RuntimeConfigList& other);
std::vector<RuntimeConfig> m_runtimeConfigs;
};
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
} // namespace PSNPE
} // namespace zdl
#endif // PSNPE_RUNTIMECONFIGLIST_HPP

258
third_party/snpe/include/SNPE/SNPE.hpp vendored Normal file
View File

@@ -0,0 +1,258 @@
//==============================================================================
//
// Copyright (c) 2015-2021 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//==============================================================================
#ifndef _SNPE_SNPE_HPP_
#define _SNPE_SNPE_HPP_
#include "DlSystem/DlOptional.hpp"
#include "DlSystem/DlVersion.hpp"
#include "DlSystem/IBufferAttributes.hpp"
#include "DlSystem/ITensor.hpp"
#include "DlSystem/TensorShape.hpp"
#include "DlSystem/TensorMap.hpp"
#include "DlSystem/String.hpp"
#include "DlSystem/StringList.hpp"
#include "DlSystem/IUserBuffer.hpp"
#include "DlSystem/UserBufferMap.hpp"
#include "DlSystem/UserMemoryMap.hpp"
#include "DlSystem/ZdlExportDefine.hpp"
namespace zdl {
namespace SNPE
{
class SnpeRuntime;
}
}
namespace zdl {
namespace DiagLog
{
class IDiagLog;
}
}
namespace zdl { namespace SNPE {
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* @brief .
*
* The SNPE interface class definition
*/
class ZDL_EXPORT SNPE final
{
public:
// keep this undocumented to be hidden in doxygen using HIDE_UNDOC_MEMBERS
explicit SNPE(std::unique_ptr<zdl::SNPE::SnpeRuntime>&& runtime) noexcept;
~SNPE();
/**
* @brief Gets the names of input tensors to the network
*
* To support multiple input scenarios, where multiple tensors are
* passed through execute() in a TensorMap, each tensor needs to
* be uniquely named. The names of tensors can be retrieved
* through this function.
*
* In the case of a single input, one name will be returned.
*
* @note Note that because the returned value is an Optional list,
* the list must be verified as boolean true value before being
* dereferenced.
*
* @return An Optional List of input tensor names.
*
* @see zdl::DlSystem::Optional
*/
zdl::DlSystem::Optional<zdl::DlSystem::StringList>
getInputTensorNames() const noexcept;
/**
* @brief Gets the names of output tensors to the network
*
* @return List of output tensor names.
*/
zdl::DlSystem::Optional<zdl::DlSystem::StringList>
getOutputTensorNames() const noexcept;
/**
* @brief Gets the name of output tensor from the input layer name
*
* @return Output tensor name.
*/
zdl::DlSystem::StringList
getOutputTensorNamesByLayerName(const char *name) const noexcept;
/**
* @brief Processes the input data and returns the output
*
* @param[in] A map of tensors that contains the input data for
* each input. The names of tensors needs to be
* matched with names retrieved through
* getInputTensorNames()
*
* @param[in,out] An empty map of tensors that will contain the output
* data of potentially multiple layers (the key
* in the map is the layer name) upon return
*
* @note output tensormap has to be empty. To forward propagate
* and get results in user-supplied tensors, use
* executeWithSuppliedOutputTensors.
*/
bool execute(const zdl::DlSystem::TensorMap &input,
zdl::DlSystem::TensorMap &output) noexcept;
/**
* @brief Processes the input data and returns the output
*
* @param[in] A single tensor contains the input data.
*
* @param[in,out] An empty map of tensors that will contain the output
* data of potentially multiple layers (the key
* in the map is the layer name) upon return
*
* @note output tensormap has to be empty.
*/
bool execute(const zdl::DlSystem::ITensor *input,
zdl::DlSystem::TensorMap &output) noexcept;
/**
* @brief Processes the input data and returns the output, using
* user-supplied buffers
*
* @param[in] A map of UserBuffers that contains the input data for
* each input. The names of UserBuffers needs to be
* matched with names retrieved through
* getInputTensorNames()
*
* @param[in,out] A map of UserBuffers that will hold the output
* data of potentially multiple layers (the key
* in the map is the UserBuffer name)
*
* @note input and output UserBuffer maps must be fully pre-populated. with
* dimensions matching what the network expects.
* For example, if there are 5 output UserBuffers they all have to be
* present in map.
*
* Caller must guarantee that for the duration of execute(), the buffer
* stored in UserBuffer would remain valid. For more detail on buffer
* ownership and lifetime requirements, please refer to zdl::DlSystem::UserBuffer
* documentation.
*/
bool execute(const zdl::DlSystem::UserBufferMap &input,
const zdl::DlSystem::UserBufferMap &output) noexcept;
/**
* @brief Regiter Client ION Buffers
* @param[in] A UserMemoryMap of virtual addresses
*
*/
bool registerIonBuffers(const zdl::DlSystem::UserMemoryMap& ionBufferMap) noexcept;
/**
* @brief Regiter Client ION Buffers
* @param[in] A StringList of ION Buffer names
*
*/
bool deregisterIonBuffers(const zdl::DlSystem::StringList& ionBufferNames) noexcept;
/**
* @brief Returns the version string embedded at model conversion
* time.
*
* @return Model version string, which is a free-form string
* supplied at the time of the conversion
*
*/
zdl::DlSystem::String getModelVersion() const noexcept;
/**
* @brief Returns the dimensions of the input data to the model in the
* form of TensorShape. The dimensions in TensorShape corresponds to
* what the tensor dimensions would need to be for an input tensor to
* the model.
*
* @param[in] layer input name.
*
* @note Note that this function only makes sense for networks
* that have a fixed input size. For networks in which the
* input size varies with each call of Execute(), this
* function should not be used.
*
* @note Because the returned type is an Optional instance, it must
* be verified as a boolean true value before being dereferenced.
*
* @return An Optional instance of TensorShape that maintains dimensions,
* matching the tensor dimensions for input to the model,
* where the last entry is the fastest varying dimension, etc.
*
* @see zdl::DlSystem::ITensor
* @see zdl::DlSystem::TensorShape
* @see zdl::DlSystem::Optional
*/
zdl::DlSystem::Optional<zdl::DlSystem::TensorShape>
getInputDimensions() const noexcept;
zdl::DlSystem::Optional<zdl::DlSystem::TensorShape>
getInputDimensions(const char *name) const noexcept;
/**
* @brief Gets the output layer(s) for the network.
*
* Note that the output layers returned by this function may be
* different than those specified when the network was created
* via the zdl::SNPE::SNPEBuilder. For example, if the
* network was created in debug mode with no explicit output
* layers specified, this will contain all layers.
*
* @note Note that because the returned value is an Optional StringList,
* the list must be verified as a boolean true value before being
* dereferenced.
*
* @return A List of output layer names.
*
* @see zdl::DlSystem::Optional
*/
zdl::DlSystem::Optional<zdl::DlSystem::StringList>
getOutputLayerNames() const noexcept;
/**
* @brief Returns attributes of buffers used to feed input tensors and receive result from output tensors.
*
* @param[in] Tensor name.
*
* @return BufferAttributes of input/output tensor named
*/
zdl::DlSystem::Optional<zdl::DlSystem::IBufferAttributes*> getInputOutputBufferAttributes(const char *name) const noexcept;
/**
* @brief .
*
* Get the diagnostic logging interface
*
* @note Note that because the returned type is an Optional instance,
* it must be verified as a boolean true value before being
* dereferenced.
*
* @see zdl::DlSystem::Optional
*/
zdl::DlSystem::Optional<zdl::DiagLog::IDiagLog*>
getDiagLogInterface() noexcept;
private:
SNPE(const SNPE&) = delete;
SNPE& operator=(const SNPE&) = delete;
std::unique_ptr<SnpeRuntime> m_Runtime;
};
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
}}
#endif

View File

@@ -0,0 +1,306 @@
//==============================================================================
//
// Copyright (c) 2017-2021 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//==============================================================================
#ifndef _SNPE_BUILDER_HPP_
#define _SNPE_BUILDER_HPP_
#include "SNPE/SNPE.hpp"
#include "DlSystem/DlEnums.hpp"
#include "DlSystem/UDLFunc.hpp"
#include "DlSystem/DlOptional.hpp"
#include "DlSystem/TensorShapeMap.hpp"
#include "DlSystem/PlatformConfig.hpp"
#include "DlSystem/IOBufferDataTypeMap.hpp"
#include "DlSystem/RuntimeList.hpp"
namespace zdl {
namespace DlContainer
{
class IDlContainer;
}
}
struct SNPEBuilderImpl;
namespace zdl { namespace SNPE {
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* The builder class for creating SNPE objects.
* Not meant to be extended.
*/
class ZDL_EXPORT SNPEBuilder final
{
private:
std::unique_ptr<::SNPEBuilderImpl> m_Impl;
public:
/**
* @brief Constructor of NeuralNetwork Builder with a supplied model.
*
* @param[in] container A container holding the model.
*
* @return A new instance of a SNPEBuilder object
* that can be used to configure and build
* an instance of SNPE.
*
*/
explicit SNPEBuilder(
zdl::DlContainer::IDlContainer* container);
~SNPEBuilder();
/**
* NOTE: DEPRECATED, MAY BE REMOVED IN THE FUTURE. Please use
* setRuntimeProcessorOrder()
*
* @brief Sets the runtime processor.
*
* @param[in] targetRuntimeProcessor The target runtime.
*
* @return The current instance of SNPEBuilder.
*/
SNPEBuilder& setRuntimeProcessor(
zdl::DlSystem::Runtime_t targetRuntimeProcessor);
/**
* @brief Requests a performance profile.
*
* @param[in] targetRuntimeProfile The target performance profile.
*
* @return The current instance of SNPEBuilder.
*/
SNPEBuilder& setPerformanceProfile(
zdl::DlSystem::PerformanceProfile_t performanceProfile);
/**
* @brief Sets the profiling level. Default profiling level for
* SNPEBuilder is off. Off and basic only applies to DSP runtime.
*
* @param[in] profilingLevel The target profiling level.
*
* @return The current instance of SNPEBuilder.
*/
SNPEBuilder& setProfilingLevel(
zdl::DlSystem::ProfilingLevel_t profilingLevel);
/**
* @brief Sets a preference for execution priority.
*
* This allows the caller to give coarse hint to SNPE runtime
* about the priority of the network. SNPE runtime is free to use
* this information to co-ordinate between different workloads
* that may or may not extend beyond SNPE.
*
* @param[in] ExecutionPriorityHint_t The target performance profile.
*
* @return The current instance of SNPEBuilder.
*/
SNPEBuilder& setExecutionPriorityHint(
zdl::DlSystem::ExecutionPriorityHint_t priority);
/**
* @brief Sets the layers that will generate output.
*
* @param[in] outputLayerNames List of layer names to
* output. An empty list will
* result in only the final
* layer of the model being
* the output layer. The list
* will be copied.
*
* @return The current instance of SNPEBuilder.
*/
SNPEBuilder& setOutputLayers(
const zdl::DlSystem::StringList& outputLayerNames);
/**
* @brief Sets the output tensor names.
*
* @param[in] outputTensorNames List of tensor names to
* output. An empty list will
* result in producing output for the final
* output tensor of the model.
* The list will be copied.
*
* @return The current instance of SNPEBuilder.
*/
SNPEBuilder& setOutputTensors(
const zdl::DlSystem::StringList& outputTensorNames);
/**
* @brief Passes in a User-defined layer.
*
* @param udlBundle Bundle of udl factory function and a cookie
*
* @return The current instance of SNPEBuilder.
*/
SNPEBuilder& setUdlBundle(
zdl::DlSystem::UDLBundle udlBundle);
/**
* @brief Sets whether this neural network will perform inference with
* input from user-supplied buffers, and write output to user-supplied
* buffers. Default behaviour is to use tensors created by
* ITensorFactory.
*
* @param[in] bufferMode Whether to use user-supplied buffer or not.
*
* @return The current instance of SNPEBuilder.
*/
SNPEBuilder& setUseUserSuppliedBuffers(
bool bufferMode);
/**
* @brief Sets the debug mode of the runtime.
*
* @param[in] debugMode This enables debug mode for the runtime. It
* does two things. For an empty
* outputLayerNames list, all layers will be
* output. It might also disable some internal
* runtime optimizations (e.g., some networks
* might be optimized by combining layers,
* etc.).
*
* @return The current instance of SNPEBuilder.
*/
SNPEBuilder& setDebugMode(
bool debugMode);
/**
* NOTE: DEPRECATED, MAY BE REMOVED IN THE FUTURE. Please use
* setRuntimeProcessorOrder()
*
* @brief Sets the mode of CPU fallback functionality.
*
* @param[in] mode This flag enables/disables the functionality
* of CPU fallback. When the CPU fallback
* functionality is enabled, layers in model that
* violates runtime constraints will run on CPU
* while the rest of non-violating layers will
* run on the chosen runtime processor. In
* disabled mode, models with layers violating
* runtime constraints will NOT run on the chosen
* runtime processor and will result in runtime
* exception. By default, the functionality is
* enabled.
*
* @return The current instance of SNPEBuilder.
*/
SNPEBuilder& setCPUFallbackMode(
bool mode);
/**
* @brief Sets network's input dimensions to enable resizing of
* the spatial dimensions of each layer for fully convolutional networks,
* and the batch dimension for all networks.
*
* @param[in] tensorShapeMap The map of input names and their new dimensions.
* The new dimensions overwrite the input dimensions
* embedded in the model and then resize each layer
* of the model. If the model contains
* layers whose dimensions cannot be resized e.g FullyConnected,
* exception will be thrown when SNPE instance is actually built.
* In general the batch dimension is always resizable.
* After resizing of layers' dimensions in model based
* on new input dimensions, the new model is revalidated
* against all runtime constraints, whose failures may
* result in cpu fallback situation.
*
* @return The current instance of SNPEBuilder.
*/
SNPEBuilder& setInputDimensions(const zdl::DlSystem::TensorShapeMap& inputDimensionsMap);
/**
* @brief Sets the mode of init caching functionality.
*
* @param[in] mode This flag enables/disables the functionality of init caching.
* When init caching functionality is enabled, a set of init caches
* will be created during network building/initialization process
* and will be added to DLC container. If such DLC container is saved
* by the user, in subsequent network building/initialization processes
* these init caches will be loaded from the DLC so as to reduce initialization time.
* In disable mode, no init caches will be added to DLC container.
*
* @return The current instance of SNPEBuilder.
*/
SNPEBuilder& setInitCacheMode(
bool cacheMode);
/**
* @brief Returns an instance of SNPE based on the current parameters.
*
* @return A new instance of a SNPE object that can be used
* to execute models or null if any errors occur.
*/
std::unique_ptr<SNPE> build() noexcept;
/**
* @brief Sets the platform configuration.
*
* @param[in] platformConfig The platform configuration.
*
* @return The current instance of SNPEBuilder.
*/
SNPEBuilder& setPlatformConfig(const zdl::DlSystem::PlatformConfig& platformConfig);
/**
* @brief Sets network's runtime order of precedence. Example:
* CPU_FLOAT32, GPU_FLOAT16, AIP_FIXED8_TF
* Note:- setRuntimeProcessor() or setCPUFallbackMode() will be silently ignored when
* setRuntimeProcessorOrder() is invoked
*
* @param[in] runtimeList The list of runtime in order of precedence
*
* @return The current instance of SNPEBuilder.
*/
SNPEBuilder& setRuntimeProcessorOrder(const zdl::DlSystem::RuntimeList& runtimeList);
/**
* @brief Sets the unconsumed tensors as output
*
* @param[in] setOutput This enables unconsumed tensors (i.e)
* outputs which are not inputs to any
* layer (basically dead ends) to be marked
* for output
*
* @return The current instance of SNPEBuilder.
*/
SNPEBuilder& setUnconsumedTensorsAsOutputs(
bool setOutput);
/**
* @brief Execution terminated when exceeding time limit.
* Only valid for dsp runtime currently.
*
* @param[in] timeout Time limit value
*
* @return The current instance of SNPEBuilder.
*/
SNPEBuilder& setTimeOut(
uint64_t timeout);
/**
* @brief Sets the datatype of the buffer.
* Only valid for dsp runtime currently.
*
* @param[in] Map of the buffer names and the datatype that needs to be set.
*
* @return The current instance of SNPEBuilder.
*/
SNPEBuilder& setBufferDataType(const zdl::DlSystem::IOBufferDataTypeMap& dataTypeMap);
};
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
}}
#endif

View File

@@ -0,0 +1,220 @@
//==============================================================================
//
// Copyright (c) 2015-2021 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//==============================================================================
#ifndef _SNPE_FACTORY_HPP_
#define _SNPE_FACTORY_HPP_
#include "SNPE/SNPE.hpp"
#include "DlSystem/DlEnums.hpp"
#include "DlSystem/UDLFunc.hpp"
#include "DlSystem/ZdlExportDefine.hpp"
#include "DlSystem/DlOptional.hpp"
namespace zdl {
namespace DlSystem
{
class ITensorFactory;
class IUserBufferFactory;
}
namespace DlContainer
{
class IDlContainer;
}
}
namespace zdl { namespace SNPE {
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* The factory class for creating SNPE objects.
*
*/
class ZDL_EXPORT SNPEFactory
{
public:
/**
* Indicates whether the supplied runtime is available on the
* current platform.
*
* @param[in] runtime The target runtime to check.
*
* @return True if the supplied runtime is available; false,
* otherwise.
*/
static bool isRuntimeAvailable(zdl::DlSystem::Runtime_t runtime);
/**
* Indicates whether the supplied runtime is available on the
* current platform.
*
* @param[in] runtime The target runtime to check.
*
* @param[in] option Extent to perform runtime available check.
*
* @return True if the supplied runtime is available; false,
* otherwise.
*/
static bool isRuntimeAvailable(zdl::DlSystem::Runtime_t runtime,
zdl::DlSystem::RuntimeCheckOption_t option);
/**
* Gets a reference to the tensor factory.
*
* @return A reference to the tensor factory.
*/
static zdl::DlSystem::ITensorFactory& getTensorFactory();
/**
* Gets a reference to the UserBuffer factory.
*
* @return A reference to the UserBuffer factory.
*/
static zdl::DlSystem::IUserBufferFactory& getUserBufferFactory();
/**
* Gets the version of the SNPE library.
*
* @return Version of the SNPE library.
*
*/
static zdl::DlSystem::Version_t getLibraryVersion();
/**
* Set the SNPE storage location for all SNPE instances in this
* process. Note that this may only be called once, and if so
* must be called before creating any SNPE instances.
*
* @param[in] storagePath Absolute path to a directory which SNPE may
* use for caching and other storage purposes.
*
* @return True if the supplied path was succesfully set as
* the SNPE storage location, false otherwise.
*/
static bool setSNPEStorageLocation(const char* storagePath);
/**
* @brief Register a user-defined op package with SNPE.
*
* @param[in] regLibraryPath Path to the registration library
* that allows clients to register a set of operations that are
* part of the package, and share op info with SNPE
*
* @return True if successful, False otherwise.
*/
static bool addOpPackage( const std::string& regLibraryPath );
/**
* Indicates whether the OpenGL and OpenCL interoperability is supported
* on GPU platform.
*
* @return True if the OpenGL and OpenCl interop is supported; false,
* otherwise.
*/
static bool isGLCLInteropSupported();
static const char* getLastError();
/**
* Initializes logging with the specified log level.
* initializeLogging with level, is used on Android platforms
* and after successful initialization, SNPE
* logs are printed in android logcat logs.
*
* It is recommended to initializeLogging before creating any
* SNPE instances, in order to capture information related to
* core initialization. If this is called again after first
* time initialization, subsequent calls are ignored.
* Also, Logging can be re-initialized after a call to
* terminateLogging API by calling initializeLogging again.
*
* A typical usage of Logging life cycle can be
* initializeLogging()
* any other SNPE API like isRuntimeAvailable()
* * setLogLevel() - optional - can be called anytime
* between initializeLogging & terminateLogging
* SNPE instance creation, inference, destroy
* terminateLogging().
*
* Please note, enabling logging can have performance impact.
*
* @param[in] LogLevel_t Log level (LOG_INFO, LOG_WARN, etc.).
*
* @return True if successful, False otherwise.
*/
static bool initializeLogging(const zdl::DlSystem::LogLevel_t& level);
/**
* Initializes logging with the specified log level and log path.
* initializeLogging with level & log path, is used on non Android
* platforms and after successful initialization, SNPE
* logs are printed in std output & into log files created in the
* log path.
*
* It is recommended to initializeLogging before creating any
* SNPE instances, in order to capture information related to
* core initialization. If this is called again after first
* time initialization, subsequent calls are ignored.
* Also, Logging can be re-initialized after a call to
* terminateLogging API by calling initializeLogging again.
*
* A typical usage of Logging life cycle can be
* initializeLogging()
* any other SNPE API like isRuntimeAvailable()
* * setLogLevel() - optional - can be called anytime
* between initializeLogging & terminateLogging
* SNPE instance creation, inference, destroy
* terminateLogging()
*
* Please note, enabling logging can have performance impact
*
* @param[in] LogLevel_t Log level (LOG_INFO, LOG_WARN, etc.).
*
* @param[in] Path of directory to store logs.
* If path is empty, the default path is "./Log".
* For android, the log path is ignored.
*
* @return True if successful, False otherwise.
*/
static bool initializeLogging(const zdl::DlSystem::LogLevel_t& level, const std::string& logPath);
/**
* Updates the current logging level with the specified level.
* setLogLevel is optional, called anytime after initializeLogging
* and before terminateLogging, to update the log level set.
* Log levels can be updated multiple times by calling setLogLevel
* A call to setLogLevel() is ignored if it is made before
* initializeLogging() or after terminateLogging()
*
* @param[in] LogLevel_t Log level (LOG_INFO, LOG_WARN, etc.).
*
* @return True if successful, False otherwise.
*/
static bool setLogLevel(const zdl::DlSystem::LogLevel_t& level);
/**
* Terminates logging.
*
* It is recommended to terminateLogging after initializeLogging
* in order to disable logging information.
* If this is called before initialization or after first time termination,
* calls are ignored.
*
* @return True if successful, False otherwise.
*/
static bool terminateLogging(void);
};
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
}}
#endif

View File

@@ -0,0 +1,49 @@
//==============================================================================
//
// Copyright (c) 2019 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//==============================================================================
#ifndef PSNPE_USERBUFFERLIST_HPP
#define PSNPE_USERBUFFERLIST_HPP
#include <vector>
#include "DlSystem/UserBufferMap.hpp"
#include "DlSystem/ZdlExportDefine.hpp"
namespace zdl {
namespace PSNPE
{
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* @brief .
*
* The class for creating a UserBufferMap container.
*
*/
class ZDL_EXPORT UserBufferList final
{
public:
UserBufferList();
UserBufferList(const size_t size);
void push_back(const zdl::DlSystem::UserBufferMap &userBufferMap);
zdl::DlSystem::UserBufferMap& operator[](const size_t index);
UserBufferList& operator =(const UserBufferList &other);
size_t size() const noexcept;
size_t capacity() const noexcept;
void clear() noexcept;
~UserBufferList() = default;
private:
void swap(const UserBufferList &other);
std::vector<zdl::DlSystem::UserBufferMap> m_userBufferMaps;
};
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
} // namespace PSNPE
} // namespace zdl
#endif //PSNPE_USERBUFFERLIST_HPP

View File

@@ -0,0 +1,537 @@
//==============================================================================
//
// Copyright (c) 2019-2021 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//==============================================================================
#ifndef SNPE_UDO_BASE_H
#define SNPE_UDO_BASE_H
#include <stdint.h>
// Provide values to use for API version.
#define API_VERSION_MAJOR 1
#define API_VERSION_MINOR 6
#define API_VERSION_TEENY 0
/** @addtogroup c_plus_plus_apis C++
@{ */
// Defines a bitmask of enum values.
typedef uint32_t SnpeUdo_Bitmask_t;
typedef SnpeUdo_Bitmask_t Udo_Bitmask_t;
// A string of characters, rather than an array of bytes.
// Assumed to be UTF-8.
typedef char* SnpeUdo_String_t;
typedef SnpeUdo_String_t Udo_String_t;
// The maximum allowable length of a SnpeUdo_String_t in bytes,
// including null terminator. SNPE will truncate strings longer
// than this.
#define SNPE_UDO_MAX_STRING_SIZE 1024
/**
* An enum which holds the various error types.
* The error types are divided to classes :
* 0 - 99 : generic errors
* 100 - 200 : errors related to configuration
*
*/
typedef enum
{
/// No Error
SNPE_UDO_NO_ERROR = 0, UDO_NO_ERROR = 0,
/// Unsupported value for core type
SNPE_UDO_WRONG_CORE = 1, UDO_WRONG_CORE = 1,
/// Invalid attribute/argument passed into UDO API
SNPE_UDO_INVALID_ARGUMENT = 2, UDO_INVALID_ARGUMENT = 2,
/// Unsupported feature error
SNPE_UDO_UNSUPPORTED_FEATURE = 3, UDO_UNSUPPORTED_FEATURE = 3,
/// Error relating to memory allocation
SNPE_UDO_MEM_ALLOC_ERROR = 4, UDO_MEM_ALLOC_ERROR = 4,
/* Configuration Specific errors */
/// No op with given attributes available in library
SNPE_UDO_WRONG_OPERATION = 100, UDO_WRONG_OPERATION = 100,
/// Unsupported value for core type in UDO configuration
SNPE_UDO_WRONG_CORE_TYPE = 101, UDO_WRONG_CORE_TYPE = 101,
/// Wrong number of params in UDO definition
SNPE_UDO_WRONG_NUM_OF_PARAMS = 102, UDO_WRONG_NUM_OF_PARAMS = 102,
/// Wrong number of dimensions for tensor(s) in UDO definition
SNPE_UDO_WRONG_NUM_OF_DIMENSIONS = 103, UDO_WRONG_NUM_OF_DIMENSIONS = 103,
/// Wrong number of input tensors in UDO definition
SNPE_UDO_WRONG_NUM_OF_INPUTS = 104, UDO_WRONG_NUM_OF_INPUTS = 104,
/// Wrong number of output tensors in UDO definition
SNPE_UDO_WRONG_NUM_OF_OUTPUTS = 105, UDO_WRONG_NUM_OF_OUTPUTS = 105,
SNPE_UDO_PROGRAM_CACHE_NOT_FOUND = 106, UDO_PROGRAM_CACHE_NOT_FOUND = 106,
SNPE_UDO_UNKNOWN_ERROR = 0xFFFFFFFF, UDO_UNKNOWN_ERROR = 0xFFFFFFFF
} SnpeUdo_ErrorType_t;
typedef SnpeUdo_ErrorType_t Udo_ErrorType_t;
/**
* An enum which holds the various data types.
* Designed to be used as single values or combined into a bitfield parameter
* (0x1, 0x2, 0x4, etc)
* \n FIXED_XX types are targeted for data in tensors.
* \n UINT / INT types are targeted for scalar params
*/
typedef enum
{
/// data type: 16-bit floating point
SNPE_UDO_DATATYPE_FLOAT_16 = 0x01, UDO_DATATYPE_FLOAT_16 = 0x01,
/// data type: 32-bit floating point
SNPE_UDO_DATATYPE_FLOAT_32 = 0x02, UDO_DATATYPE_FLOAT_32 = 0x02,
/// data type: 4-bit fixed point
SNPE_UDO_DATATYPE_FIXED_4 = 0x04, UDO_DATATYPE_FIXED_4 = 0x04,
/// data type: 8-bit fixed point
SNPE_UDO_DATATYPE_FIXED_8 = 0x08, UDO_DATATYPE_FIXED_8 = 0x08,
/// data type: 16-bit fixed point
SNPE_UDO_DATATYPE_FIXED_16 = 0x10, UDO_DATATYPE_FIXED_16 = 0x10,
/// data type: 32-bit fixed point
SNPE_UDO_DATATYPE_FIXED_32 = 0x20, UDO_DATATYPE_FIXED_32 = 0x20,
/// data type: 8-bit unsigned integer
SNPE_UDO_DATATYPE_UINT_8 = 0x100, UDO_DATATYPE_UINT_8 = 0x100,
/// data type: 16-bit unsigned integer
SNPE_UDO_DATATYPE_UINT_16 = 0x200, UDO_DATATYPE_UINT_16 = 0x200,
/// data type: 32-bit unsigned integer
SNPE_UDO_DATATYPE_UINT_32 = 0x400, UDO_DATATYPE_UINT_32 = 0x400,
/// data type: 8-bit signed integer
SNPE_UDO_DATATYPE_INT_8 = 0x1000, UDO_DATATYPE_INT_8 = 0x1000,
/// data type: 16-bit signed integer
SNPE_UDO_DATATYPE_INT_16 = 0x2000, UDO_DATATYPE_INT_16 = 0x2000,
/// data type: 32-bit signed integer
SNPE_UDO_DATATYPE_INT_32 = 0x4000, UDO_DATATYPE_INT_32 = 0x4000,
SNPE_UDO_DATATYPE_LAST = 0xFFFFFFFF, UDO_DATATYPE_LAST = 0xFFFFFFFF
} SnpeUdo_DataType_t;
typedef SnpeUdo_DataType_t Udo_DataType_t;
/**
* An enum which holds the various layouts.
* Designed to be used as single values or combined into a bitfield parameter
* (0x1, 0x2, 0x4, etc)
*/
typedef enum
{
/// data layout (4D): NHWC (batch-height-width-channel)
SNPE_UDO_LAYOUT_NHWC = 0x01, UDO_LAYOUT_NHWC = 0x01,
/// data layout (4D): NCHW (batch-channel-height-width)
SNPE_UDO_LAYOUT_NCHW = 0x02, UDO_LAYOUT_NCHW = 0x02,
/// data layout (5D): NDHWC (batch-dimension-height-width-channel)
SNPE_UDO_LAYOUT_NDHWC = 0x04, UDO_LAYOUT_NDHWC = 0x04,
SNPE_UDO_LAYOUT_GPU_OPTIMAL1 = 0x08, UDO_LAYOUT_GPU_OPTIMAL1 = 0x08,
SNPE_UDO_LAYOUT_GPU_OPTIMAL2 = 0x10, UDO_LAYOUT_GPU_OPTIMAL2 = 0x10,
SNPE_UDO_LAYOUT_DSP_OPTIMAL1 = 0x11, UDO_LAYOUT_DSP_OPTIMAL1 = 0x11,
SNPE_UDO_LAYOUT_DSP_OPTIMAL2 = 0x12, UDO_LAYOUT_DSP_OPTIMAL2 = 0x12,
// Indicates no data will be allocated for this tensor.
// Used to specify optional inputs/outputs positionally.
SNPE_UDO_LAYOUT_NULL = 0x13, UDO_LAYOUT_NULL = 0x13,
SNPE_UDO_LAYOUT_LAST = 0xFFFFFFFF, UDO_LAYOUT_LAST = 0xFFFFFFFF
} SnpeUdo_TensorLayout_t;
typedef SnpeUdo_TensorLayout_t Udo_TensorLayout_t;
/**
* An enum which holds the UDO library Core type .
* Designed to be used as single values or combined into a bitfield parameter
* (0x1, 0x2, 0x4, etc)
*/
typedef enum
{
/// Library target IP Core is undefined
SNPE_UDO_CORETYPE_UNDEFINED = 0x00, UDO_CORETYPE_UNDEFINED = 0x00,
/// Library target IP Core is CPU
SNPE_UDO_CORETYPE_CPU = 0x01, UDO_CORETYPE_CPU = 0x01,
/// Library target IP Core is GPU
SNPE_UDO_CORETYPE_GPU = 0x02, UDO_CORETYPE_GPU = 0x02,
/// Library target IP Core is DSP
SNPE_UDO_CORETYPE_DSP = 0x04, UDO_CORETYPE_DSP = 0x04,
SNPE_UDO_CORETYPE_LAST = 0xFFFFFFFF, UDO_CORETYPE_LAST = 0xFFFFFFFF
} SnpeUdo_CoreType_t;
typedef SnpeUdo_CoreType_t Udo_CoreType_t;
/**
* An enum to specify the parameter type : Scalar or Tensor
*/
typedef enum
{
/// UDO static param type: scalar
SNPE_UDO_PARAMTYPE_SCALAR = 0x00, UDO_PARAMTYPE_SCALAR = 0x00,
/// UDO static param type: string
SNPE_UDO_PARAMTYPE_STRING = 0x01, UDO_PARAMTYPE_STRING = 0x01,
/// UDO static param type: tensor
SNPE_UDO_PARAMTYPE_TENSOR = 0x02, UDO_PARAMTYPE_TENSOR = 0x02,
SNPE_UDO_PARAMTYPE_LAST = 0xFFFFFFFF, UDO_PARAMTYPE_LAST = 0xFFFFFFFF
} SnpeUdo_ParamType_t;
typedef SnpeUdo_ParamType_t Udo_ParamType_t;
/**
* An enum to specify quantization type
*/
typedef enum
{
/// Tensor Quantization type: NONE. Signifies unquantized tensor data
SNPE_UDO_QUANTIZATION_NONE = 0x00, UDO_QUANTIZATION_NONE = 0x00,
/// Tensor Quantization type: Tensorflow-style
SNPE_UDO_QUANTIZATION_TF = 0x01, UDO_QUANTIZATION_TF = 0x01,
SNPE_UDO_QUANTIZATION_QMN = 0x02, UDO_QUANTIZATION_QMN = 0x02,
SNPE_UDO_QUANTIZATION_LAST = 0xFFFFFFFF, UDO_QUANTIZATION_LAST = 0xFFFFFFFF
} SnpeUdo_QuantizationType_t;
typedef SnpeUdo_QuantizationType_t Udo_QuantizationType_t;
/**
* @brief A struct which is used to provide a version number using 3 values : major, minor, teeny
*
*/
typedef struct
{
/// version field: major - for backward-incompatible changes
uint32_t major;
/// version field: minor - for backward-compatible feature updates
uint32_t minor;
/// version field: teeny - for minor bug-fixes and clean-up
uint32_t teeny;
} SnpeUdo_Version_t;
typedef SnpeUdo_Version_t Udo_Version_t;
/**
* @brief A struct returned from version query, contains the Library version and API version
*
*/
typedef struct
{
/// Version of UDO library. Controlled by users
SnpeUdo_Version_t libVersion;
/// Version of SNPE UDO API used in compiling library. Determined by SNPE
SnpeUdo_Version_t apiVersion;
} SnpeUdo_LibVersion_t;
/**
* @brief A struct returned from version query, contains the package version
*
*/
typedef struct
{
/// Version of UDO API used in package.
Udo_Version_t apiVersion;
} Udo_PkgVersion_t;
/**
* @brief A union to hold the value of a generic type. Allows defining a parameter struct
* in a generic way, with a "value" location that holds the data regardless of the type.
*
*/
typedef union
{
/// value type: float
float floatValue;
/// value type: unsigned 32-bit integer
uint32_t uint32Value;
/// value type: signed 32-bit integer
int32_t int32Value;
/// value type: unsigned 16-bit integer
uint16_t uint16Value;
/// value type: signed 16-bit integer
int16_t int16Value;
/// value type: unsigned 8-bit integer
uint8_t uint8Value;
/// value type: signed 8-bit integer
int8_t int8Value;
} SnpeUdo_Value_t;
typedef SnpeUdo_Value_t Udo_Value_t;
/**
* @brief A struct which defines a scalar parameter : name, data type, and union of values
*
*/
typedef struct
{
/// The parameter data type : float, int, etc.
SnpeUdo_DataType_t dataType;
/// a union of specified type which holds the data
SnpeUdo_Value_t dataValue;
} SnpeUdo_ScalarParam_t;
typedef SnpeUdo_ScalarParam_t Udo_ScalarParam_t;
/**
* @brief A struct which defines the quantization parameters in case of Tensorflow style quantization
*
*/
typedef struct
{
/// minimum value of the quantization range of data
float minValue;
/// maximum value of the quantization range of data
float maxValue;
} SnpeUdo_TFQuantize_t;
typedef SnpeUdo_TFQuantize_t Udo_TFQuantize_t;
/**
* @brief A struct which defines the quantization type, and union of supported quantization structs
*
*/
typedef struct
{
/// quantization type (only TF-style currently supported)
SnpeUdo_QuantizationType_t quantizeType;
union
{
/// TF-style min-max quantization ranges
SnpeUdo_TFQuantize_t TFParams;
};
} SnpeUdo_QuantizeParams_t;
typedef SnpeUdo_QuantizeParams_t Udo_QuantizeParams_t;
/**
* @brief A struct which defines the datatype associated with a specified core-type
* This should be used to denote the datatypes for a single tensor info, depending
* on the intended execution core.
*
*/
typedef struct
{
/// The IP Core
SnpeUdo_CoreType_t coreType;
/// The associated datatype for this coreType
SnpeUdo_DataType_t dataType;
} SnpeUdo_PerCoreDatatype_t;
typedef SnpeUdo_PerCoreDatatype_t Udo_PerCoreDatatype_t;
/**
* @brief A struct which defines a tensor parameter : name, data type, layout, quantization, more.
* Also holds a pointer to the tensor data.
*
*/
typedef struct
{
/// The maximum allowable dimensions of the tensor. The memory held in
/// _tensorData_ is guaranteed to be large enough for this.
uint32_t* maxDimensions;
/// The current dimensions of the tensor. An operation may modify the current
/// dimensions of its output, to indicate cases where the output has been
/// "resized".
/// Note that for static parameters, the current and max dimensions must
/// match.
uint32_t* currDimensions;
/// Quantization params applicable to the tensor. Currently only supports
/// Tensorflow quantization style.
SnpeUdo_QuantizeParams_t quantizeParams;
/// Number of dimensions to the tensor: 3D, 4D, etc.
uint32_t tensorRank;
/// The parameter data type: float, int, etc.
SnpeUdo_DataType_t dataType;
/// The tensor layout type: NCHW, NHWC, etc.
SnpeUdo_TensorLayout_t layout;
/// Opaque pointer to tensor data. User may be required to re-interpret the pointer
/// based on core-specific definitions.
void* tensorData;
} SnpeUdo_TensorParam_t;
typedef SnpeUdo_TensorParam_t Udo_TensorParam_t;
/**
* @brief A struct which defines tensor information for activation tensors only
*
* It describes an activation tensor object using its name, the intended layout and the datatype
* it will take depending on the intended runtime core. The repeated field indicates that
* that the tensor info describes several input/output activation tensors, which all share the
* aforementioned properties.
*/
typedef struct
{
/// The tensor name
SnpeUdo_String_t tensorName;
/// The tensor layout type: NCHW, NHWC, etc.
SnpeUdo_TensorLayout_t layout;
/// The per core datatype: {SNPE_UDO_DATATYPE, SNPE_UDO_CORE_TYPE}
SnpeUdo_PerCoreDatatype_t* perCoreDatatype;
/// A boolean field indicating that this tensorinfo will be repeated e.x for ops such as Concat or Split
bool repeated;
/// A boolean field indicating whether input is static or not.
bool isStatic;
} SnpeUdo_TensorInfo_t;
typedef SnpeUdo_TensorInfo_t Udo_TensorInfo_t;
/**
* @brief struct which defines a UDO parameter - a union of scalar, tensor and string parameters
*
*/
typedef struct
{
/// Type is scalar or tensor
SnpeUdo_ParamType_t paramType;
/// The param name, for example : "offset", "activation_type"
SnpeUdo_String_t paramName;
union
{
/// scalar param value
SnpeUdo_ScalarParam_t scalarParam;
/// tensor param value
SnpeUdo_TensorParam_t tensorParam;
/// string param value
SnpeUdo_String_t stringParam;
};
} SnpeUdo_Param_t;
typedef SnpeUdo_Param_t Udo_Param_t;
/**
* @brief A struct which defines Operation information which is specific for IP core (CPU, GPU, DSP ...)
*
*/
typedef struct
{
/// The IP Core
SnpeUdo_CoreType_t udoCoreType;
/// Bitmask, defines supported internal calculation types (like FLOAT_32, etc)
/// Based on SnpeUdo_DataType
SnpeUdo_Bitmask_t operationCalculationTypes;
} SnpeUdo_OpCoreInfo_t;
typedef SnpeUdo_OpCoreInfo_t Udo_OpCoreInfo_t;
/**
* @brief A struct which defines the common and core-specific Operation information
*
*/
typedef struct
{
/// Operation type
SnpeUdo_String_t operationType;
/// A bitmask describing which IP Cores (CPU, GPU, DSP ...) support this operation
/// Translated based on SnpeUdo_CoreType
SnpeUdo_Bitmask_t supportedByCores;
/// Number of static parameters defined by the op
uint32_t numOfStaticParams;
/// Array of static parameters. Can be scalar or tensor params
SnpeUdo_Param_t* staticParams;
/// Number of input tensors this op receives
uint32_t numOfInputs;
/// Array of input tensor names to this operation
SnpeUdo_String_t* inputNames;
/// Number of output tensors this op receives
uint32_t numOfOutputs;
/// Array of output tensor names to this operation
SnpeUdo_String_t* outputNames;
/// Number of cores that the op can execute on
uint32_t numOfCoreInfo;
/// Array of per-core information entries
SnpeUdo_OpCoreInfo_t* opPerCoreInfo;
/// Array of input tensor infos for this operation
SnpeUdo_TensorInfo_t* inputInfos;
/// Array of output tensor infos for this operation
SnpeUdo_TensorInfo_t* outputInfos;
} SnpeUdo_OperationInfo_t;
typedef SnpeUdo_OperationInfo_t Udo_OperationInfo_t;
/**
* @brief A struct which provides the implementation library info : type, name
*
*/
typedef struct
{
/// Defines the IP Core that this implementation library is targeting
SnpeUdo_CoreType_t udoCoreType;
/// library name. will be looked at in the standard library path
SnpeUdo_String_t libraryName;
} SnpeUdo_LibraryInfo_t;
typedef SnpeUdo_LibraryInfo_t Udo_LibraryInfo_t;
/**
* @brief A struct returned by the registration library and contains information on the UDO package :
* name, operations, libraries, etc.
*
*/
typedef struct
{
/// A string containing the package name
SnpeUdo_String_t packageName;
/// A bitmask describing supported IP cores (CPU, GPU, DSP ...)
/// Translated based on SnpeUdo_CoreType
SnpeUdo_Bitmask_t supportedCoreTypes;
/// The number of implementation libraries in the package
uint32_t numOfImplementationLib;
/// Array of implementation libraries names/types
SnpeUdo_LibraryInfo_t* implementationLib;
/// A string containing all operation types separated by space
SnpeUdo_String_t operationsString;
/// Number of supported operations
uint32_t numOfOperations;
/// Array of Operation info structs. Each entry describes one
/// Operation (name, params, inputs, outputs)
SnpeUdo_OperationInfo_t* operationsInfo;
} SnpeUdo_RegInfo_t;
typedef SnpeUdo_RegInfo_t Udo_RegInfo_t;
/**
* @brief A struct returned by the implementation library and contains information on the
* specific library: name, IP Core, operations, etc.
*
*/
typedef struct
{
/// Defines the IP Core that this implementation library is targeting
SnpeUdo_CoreType_t udoCoreType;
/// A string containing the package name
SnpeUdo_String_t packageName;
/// A string containing all operation types separated by space
SnpeUdo_String_t operationsString;
/// Number of supported operations
uint32_t numOfOperations;
} SnpeUdo_ImpInfo_t;
typedef SnpeUdo_ImpInfo_t Udo_ImpInfo_t;
/**
* @brief This struct defines an operation. It is used for validation
* or creation of an operation.
* In case of using it for creation, the static params which are tensors
* contain pointers to the real data (weights, for example), and input/output
* tensors also include pointers to the buffers used.
*/
typedef struct
{
/// The IP Core that the operation is defined for - CPU, GPU, DSP...
SnpeUdo_CoreType_t udoCoreType;
/// Operation type
SnpeUdo_String_t operationType;
/// The number of static parameters provided in the staticParams array.
/// this number has to match the number provided by the UDO Registration library information
uint32_t numOfStaticParams;
/// Array of static parameters
SnpeUdo_Param_t* staticParams;
/// The number of input parameters provided in inputs array.
/// this number has to match the number provided by the UDO Registration library information
uint32_t numOfInputs;
/// Array of input tensors, providing layout, data type, sizes, etc
/// When used to create an operation, also contains the initial location of the data
SnpeUdo_TensorParam_t* inputs;
/// The number of output parameters provided in inputs array.
/// this number has to match the number provided by the UDO Registration library information
uint32_t numOfOutputs;
/// Array of output tensors, providing layout, data type, sizes, etc
/// When used to create an operation, also contains the initial location of the data
SnpeUdo_TensorParam_t* outputs;
} SnpeUdo_OpDefinition_t;
typedef SnpeUdo_OpDefinition_t Udo_OpDefinition_t;
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
#endif //SNPE_UDO_BASE_H

View File

@@ -0,0 +1,343 @@
//==============================================================================
//
// Copyright (c) 2019-2021 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//==============================================================================
#ifndef SNPE_UDO_IMPL_H
#define SNPE_UDO_IMPL_H
#include <stdbool.h>
#include "SnpeUdo/UdoShared.h"
#ifdef __cplusplus
extern "C"
{
#endif
/** @addtogroup c_plus_plus_apis C++
@{ */
typedef struct _SnpeUdo_OpFactory_t* SnpeUdo_OpFactory_t;
typedef struct _SnpeUdo_Operation_t* SnpeUdo_Operation_t;
typedef SnpeUdo_OpFactory_t Udo_OpFactory_t;
typedef SnpeUdo_Operation_t Udo_Operation_t;
/**
* @brief Initialize the shared library's data structures. Calling any other
* library function before this one will result in error.
*
* @param[in] globalInfrastructure Global core-specific infrastructure to be
* used by operations created in this library. The definition and
* semantics of this object will be defined in the corresponding
* implementation header for the core type.
* @return Error code
*/
SnpeUdo_ErrorType_t
SnpeUdo_initImplLibrary(void* globalInfrastructure);
typedef SnpeUdo_ErrorType_t
(*SnpeUdo_InitImplLibraryFunction_t)(void*);
/**
* @brief A function to query the API version of the UDO implementation library.
* The function populates a SnpeUdo_LibVersion_t struct, which contains a SnpeUdo_Version_t
* struct for API version and library version.
*
* @param[in, out] version A pointer to struct which contains major, minor, teeny information for
* library and api versions.
*
* @return Error code
*/
SnpeUdo_ErrorType_t
SnpeUdo_getImplVersion(SnpeUdo_LibVersion_t** version);
typedef SnpeUdo_ErrorType_t
(*SnpeUdo_getImplVersion_t)(SnpeUdo_LibVersion_t** version);
/**
* @brief Release the shared library's data structures, and invalidate any
* handles returned by the library. The behavior of any outstanding
* asynchronous calls made to this library when this function is called
* are undefined. All library functions (except SnpeUdo_initImplLibrary) will
* return an error after this function has been successfully called.
*
* It should be possible to call SnpeUdo_initImplLibrary after calling this
* function, and re-initialize the library.
*
* @return Error code
*/
SnpeUdo_ErrorType_t
SnpeUdo_terminateImplLibrary(void);
typedef SnpeUdo_ErrorType_t
(*SnpeUdo_TerminateImplLibraryFunction_t)(void);
/**
* @brief A function to query info on the UDO implementation library.
* The function populates a structure which contains information about
* operations that are part of this library
*
* @param[in, out] implementationInfo A pointer to struct which contains information
* on the operations
*
* @return error code
*
*/
SnpeUdo_ErrorType_t
SnpeUdo_getImpInfo(SnpeUdo_ImpInfo_t** implementationInfo);
typedef SnpeUdo_ErrorType_t
(*SnpeUdo_GetImpInfoFunction_t)(SnpeUdo_ImpInfo_t** implementationInfo);
typedef SnpeUdo_GetImpInfoFunction_t Udo_GetImpInfoFunction_t;
/**
* @brief A function to create an operation factory.
* The function receives the operation type, and an array of static parameters,
* and returns operation factory handler
*
* @param[in] udoCoreType The Core type to create the operation on. An error will
* be returned if this does not match the core type of the library.
*
* @param[in] perFactoryInfrastructure CreateOpFactory infrastructure appropriate to this
* core type. The definition and semantics of this object will be defined
* in the corresponding implementation header for the core type.
*
* @param[in] operationType A string containing Operation type. for example "MY_CONV"
*
* @param[in] numOfStaticParams The number of static parameters.
*
* @param[in] staticParams Array of static parameters
*
* @param[in,out] opFactory Handler to Operation Factory, to be used when creating operations
*
* @return Error Code
*/
SnpeUdo_ErrorType_t
SnpeUdo_createOpFactory(SnpeUdo_CoreType_t udoCoreType,
void* perFactoryInfrastructure,
SnpeUdo_String_t operationType,
uint32_t numOfStaticParams,
SnpeUdo_Param_t* staticParams,
SnpeUdo_OpFactory_t* opFactory);
typedef SnpeUdo_ErrorType_t
(*SnpeUdo_CreateOpFactoryFunction_t)(SnpeUdo_CoreType_t,
void*,
SnpeUdo_String_t,
uint32_t,
SnpeUdo_Param_t*,
SnpeUdo_OpFactory_t*);
typedef SnpeUdo_CreateOpFactoryFunction_t Udo_CreateOpFactoryFunction_t;
/**
* @brief A function to release the resources allocated for an operation factory
* created by this library.
*
* @param[in] factory The operation factory to release. Upon success this handle will be invalidated.
*
* @return Error Code
*/
SnpeUdo_ErrorType_t
SnpeUdo_releaseOpFactory(SnpeUdo_OpFactory_t opFactory);
typedef SnpeUdo_ErrorType_t
(*SnpeUdo_ReleaseOpFactoryFunction_t)(SnpeUdo_OpFactory_t);
typedef SnpeUdo_ReleaseOpFactoryFunction_t Udo_ReleaseOpFactoryFunction_t;
/**
* @brief A function to create an operation from the factory.
* The function receives array of inputs and array of outputs, and creates an operation
* instance, returning the operation instance handler.
*
* @param[in] opFactory OpFactory instance containing the parameters for this operation.
*
* @param[in] perOpInfrastructure Per-Op infrastructure for this operation. The definition
* and semantics of this object will be defined in the implementation header
* appropriate to this core type.
*
* @param[in] numOfInputs The number of input tensors this operation will receive.
*
* @param[in] inputs Array of input tensors, providing both the sizes and initial
* location of the data.
*
* @param[in] numOfOutputs Number of output tensors this operation will produce.
*
* @param[in] outputs Array of output tensors, providing both the sizes and
* initial location of the data.
*
* @param[in,out] operation Handle for newly created operation instance.
*
* @return Error Code
*/
SnpeUdo_ErrorType_t
SnpeUdo_createOperation(SnpeUdo_OpFactory_t opFactory,
void* perOpInfrastructure,
uint32_t numOfInputs,
SnpeUdo_TensorParam_t* inputs,
uint32_t numOfOutputs,
SnpeUdo_TensorParam_t* outputs,
SnpeUdo_Operation_t* operation);
typedef SnpeUdo_ErrorType_t
(*SnpeUdo_CreateOperationFunction_t)(SnpeUdo_OpFactory_t,
void*,
uint32_t,
SnpeUdo_TensorParam_t*,
uint32_t,
SnpeUdo_TensorParam_t*,
SnpeUdo_Operation_t*);
typedef SnpeUdo_CreateOperationFunction_t Udo_CreateOperationFunction_t;
/**
* @brief A pointer to notification function.
*
* The notification function supports the non-blocking (e.g. asynchronous) execution use-case.
* In case an "executeUdoOp" function is called with "blocking" set to zero, and a
* notify function, this function will be called by the implementation library at the
* end of execution. The implementation library will pass the notify function the ID
* that was provided to it when "executeUdoOp" was called.
*
* @param[in] ID 32-bit value, that was provided to executeUdoOp by the calling entity.
* Can be used to track the notifications, in case of multiple execute calls issued.
*
* @return Error code
*
*/
typedef SnpeUdo_ErrorType_t
(*SnpeUdo_ExternalNotify_t)(const uint32_t ID);
typedef SnpeUdo_ExternalNotify_t Udo_ExternalNotify_t;
/**
* @brief Operation execution function.
*
* Calling this function will run the operation on set of inputs, generating a set of outputs.
* The call can be blocking (synchronous) or non-blocking (asynchronous). To support the
* non-blocking mode, the calling entity can pass an ID and a notification function.
* At the end of the execution this notification function would be called, passing it the ID.
* <b> NOTE: Asynchronous execution mode not supported in this release. </b>
*
* @param[in] operation handle to the operation on which execute is invoked
* @param[in] blocking flag to indicate execution mode.
* If set, execution is blocking,
* e.g SnpeUdo_executeOp call does not return until execution is done.
* If not set, SnpeUdo_executeOp returns immediately, and the
* library will call the notification function (if set) when execution is done.
*
* @param[in] ID 32-bit number that can be used by the calling entity to track execution
* in case of non-blocking execution.
* For example, it can be a sequence number, increased by one on each call.
*
* @param[in] notifyFunc Pointer to notification function. if the pointer is set, and execution is
* non-blocking, the library will call this function at end of execution,
* passing the number provided as ID
*
* @return Error code
*
*/
SnpeUdo_ErrorType_t
SnpeUdo_executeOp(SnpeUdo_Operation_t operation,
bool blocking,
const uint32_t ID,
SnpeUdo_ExternalNotify_t notifyFunc);
typedef SnpeUdo_ErrorType_t
(*SnpeUdo_ExecuteOpFunction_t)(SnpeUdo_Operation_t,
bool,
const uint32_t,
SnpeUdo_ExternalNotify_t);
typedef SnpeUdo_ExecuteOpFunction_t Udo_ExecuteOpFunction_t;
/**
* @brief A function to setting the inputs & outputs. part of SnpeUdo_Operation struct,
* returned from creation of a new operation instance.
* <b> Not supported in this release. </b>
*
* This function allows the calling entity to change some of the inputs and outputs
* between calls to execute.
* Note that the change is limited to changing the <b> pointer </b> to the tensor data only.
* Any other change may be rejected by the implementation library, causing
* immediate invalidation of the operation instance
*
* @param[in] operation Operation on which IO tensors are set
*
* @param[in] inputs array of tensor parameters. The calling entity may provide a subset of the
* operation inputs, providing only those that it wants to change.
*
* @param[in] outputs array of tensor parameters. The calling entity may provide a subset of the
* operation outputs, providing only those that it wants to change.
*
* @return Error code
*
*/
SnpeUdo_ErrorType_t
SnpeUdo_setOpIO(SnpeUdo_Operation_t operation,
SnpeUdo_TensorParam_t* inputs,
SnpeUdo_TensorParam_t* outputs);
typedef SnpeUdo_ErrorType_t
(*SnpeUdo_SetOpIOFunction_t)(SnpeUdo_Operation_t,
SnpeUdo_TensorParam_t*,
SnpeUdo_TensorParam_t*);
typedef SnpeUdo_SetOpIOFunction_t Udo_SetOpIOFunction_t;
/**
* @brief A function to return execution times.
*
* This function can be called to query the operation execution times on the IP core
* on which the operation is run. The time is provided in micro-seconds
*
* @param[in] operation Handle to operation whose execution time is being profiled
*
* @param[in,out] executionTime pointer to a uint32 value.This function writes the operation
* execution time in usec into this value.
*
* @return Error code
*
*/
SnpeUdo_ErrorType_t
SnpeUdo_profileOp(SnpeUdo_Operation_t operation, uint32_t *executionTime);
typedef SnpeUdo_ErrorType_t
(*SnpeUdo_ProfileOpFunction_t)(SnpeUdo_Operation_t, uint32_t*);
typedef SnpeUdo_ProfileOpFunction_t Udo_ProfileOpFunction_t;
/**
* @brief A function to release the operation instance
* \n When it is called, the implementation library needs to release all resources
* allocated for this operation instance.
* \n Note that all function pointers which are part of SnpeUdo_Operation become
* <b> invalid </b> once releaseUdoOp call returns.
*
* @param[in] operation Handle to operation to be released
* @return Error code
*
*/
SnpeUdo_ErrorType_t
SnpeUdo_releaseOp(SnpeUdo_Operation_t operation);
typedef SnpeUdo_ErrorType_t
(*SnpeUdo_ReleaseOpFunction_t)(SnpeUdo_Operation_t);
typedef SnpeUdo_ReleaseOpFunction_t Udo_ReleaseOpFunction_t;
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
#ifdef __cplusplus
} // extern "C"
#endif
#endif //SNPE_UDO_IMPL_H

View File

@@ -0,0 +1,44 @@
//==============================================================================
//
// Copyright (c) 2019-2020 Qualcomm Technologies, Inc.
// All Rights Reserved.
// Confidential and Proprietary - Qualcomm Technologies, Inc.
//
//==============================================================================
// Header to be used by a CPU UDO Implementation library
#ifndef SNPE_UDO_IMPL_CPU_H
#define SNPE_UDO_IMPL_CPU_H
#include <stdio.h>
/** @addtogroup c_plus_plus_apis C++
@{ */
/**
* @brief This struct provides the infrastructure needed by a developer of
* CPU UDO Implementation library.
*
* The framework/runtime which loads the CPU UDO implementation library provides
* this infrastructure data to the loaded library at the time of op factory creation.
* as an opaque pointer. It contains hooks for the UDO library to invoke supported
* functionality at the time of execution
*
* @param getData function pointer to retrieve raw tensor data from opaque pointer
* passed into the UDO when creating an instance.
* @param getDataSize function pointer to retrieve tensor data size from opaque pointer
*/
typedef struct
{
/// function pointer to retrieve raw tensor data from opaque pointer
/// passed into the UDO when creating an instance.
float* (*getData)(void*);
/// function pointer to retrieve tensor data size from opaque pointer
size_t (*getDataSize) (void*);
} SnpeUdo_CpuInfrastructure_t;
/** @} */ /* end_addtogroup c_plus_plus_apis C++ */
#endif // SNPE_UDO_IMPL_CPU_H

Some files were not shown because too many files have changed in this diff Show More