octomizer package

Submodules

octomizer.auth module

Helper classes for authenticating via gRPC.

class octomizer.auth.AuthInterceptor(access_token)

Bases: UnaryUnaryClientInterceptor

This class intercepts outbound gRPC calls and adds an Authorization: Bearer header carrying an access token.

Parameters

access_token – The token to send in the Authorization header.

intercept_unary_unary(continuation, client_call_details, request)

Invoked by gRPC when issuing a request using this interceptor.
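As an illustration of what this interceptor adds, the sketch below builds the metadata entry described above. This is a local sketch of the documented behavior, not the SDK's internals, and the helper name is hypothetical.

```python
def bearer_metadata(access_token: str):
    # Hypothetical helper mirroring AuthInterceptor's documented behavior:
    # each outbound gRPC call gains an Authorization: Bearer <token> header.
    # gRPC metadata keys must be lowercase.
    return (("authorization", f"Bearer {access_token}"),)
```

In normal use the interceptor is set up for you when an OctomizerClient is constructed with an access token, so this class is rarely instantiated directly.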

octomizer.client module

Python client for the OctoML Platform API.

octomizer.client.OCTOMIZER_API_HOST = 'api.octoml.ai'

Default value for the OctoML API host, if not specified manually or via the environment.

class octomizer.client.OctomizerClient(host: Optional[str] = None, port: Optional[int] = None, insecure: bool = False, access_token: Optional[str] = None, check_connection: bool = True)

Bases: object

A client to the OctoML Platform service.

Parameters
  • host – The hostname of the OctoML RPC service. Note that this is typically not the same as the hostname used to access the OctoML web interface. The default is the value of the environment variable OCTOMIZER_API_HOST. This defaults to api.octoml.ai if not specified.

  • port – The port of the OctoML RPC service. The default is the value of the environment variable OCTOMIZER_API_PORT; this defaults to 443 if not specified.

  • insecure – whether to create an insecure (non-TLS) channel to the gRPC server. This is typically used only for testing.

  • access_token – An OctoML access token. These can be obtained from the web interface or via the CreateAccessToken RPC call.

  • check_connection – If True, check that the connection is live when the client is created, raising a RuntimeError if the connection cannot be established.
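The host and port defaulting described above can be sketched as follows. This is a reading of the documented defaults, not the client's actual code, and the function name is hypothetical.

```python
import os

def resolve_endpoint(environ=None):
    # Sketch of the documented defaults: host from OCTOMIZER_API_HOST
    # (falling back to api.octoml.ai), port from OCTOMIZER_API_PORT
    # (falling back to 443).
    env = environ if environ is not None else os.environ
    host = env.get("OCTOMIZER_API_HOST", "api.octoml.ai")
    port = int(env.get("OCTOMIZER_API_PORT", "443"))
    return host, port
```

Explicit host and port arguments to the constructor take precedence over both the environment and these fallbacks.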

add_user(given_name: str, family_name: str, email: str, active: bool = True, is_own_account_admin: bool = False, can_accelerate: bool = True, account_uuid: Optional[str] = None) User

Creates and returns a user with the given information.

Parameters
  • given_name – The given name of the new user.

  • family_name – The family name of the new user.

  • email – The email address associated with the new user.

  • active – Whether the new user is active (not offboarded).

  • is_own_account_admin – Whether the new user is an admin of their own account.

  • can_accelerate – Whether the new user has the ability to trigger octomizations.

  • account_uuid – UUID of the account with which the new user should be associated.

Returns

the newly created user

cancel_workflow(uuid: str) Workflow

Cancels the workflow with the given id.

Parameters

uuid – the id of the workflow to cancel.

Returns

the requested workflow, if it exists.

property channel: Channel

Returns the underlying gRPC channel used by this OctomizerClient.

delete_model(uuid: str)

Deletes the model with the given id. This deletes all ModelVariants and Workflows associated with a model, and cannot be undone.

Parameters

uuid – the id of the model to delete.

get_account(uuid: str) Account

Returns the account with the given id.

Parameters

uuid – the id of the account to get.

Returns

the requested account, if it exists.

get_account_users(uuid: str) Iterator[User]

Returns all users associated with the account with the given uuid.

Parameters

uuid – the id of the account whose users to get.

Returns

all the users in the account

get_current_user() User

Returns the currently-authenticated user.

get_dataref(uuid: str) DataRef

Returns the DataRef with the given id.

Parameters

uuid – the id of the DataRef to get.

Returns

the requested DataRef, if it exists.

get_hardware_targets() List[HardwareTarget]

Gets the available hardware targets for the current user’s account.

Returns

the list of hardware targets available to the user’s account.

get_model(uuid: str) Model

Returns the model with the given id.

Parameters

uuid – the id of the model to get.

Returns

the requested model, if it exists.

get_model_variant(uuid: str) ModelVariant

Returns the model variant with the given id.

Parameters

uuid – the id of the model variant to get.

Returns

the requested model variant, if it exists.

get_package_workflow_group(uuid: str) PackageWorkflowGroup

Retrieves the PackageWorkflowGroup with the given id.

Parameters

uuid – the id of the PackageWorkflowGroup to retrieve.

Returns

the requested PackageWorkflowGroup, if it exists.

get_project(uuid: str) Project

Returns the Project with the given id.

Parameters

uuid – the id of the Project to get.

Returns

the requested Project, if it exists.

get_usage(start_time: Optional[datetime] = None, end_time: Optional[datetime] = None, account_uuid: Optional[str] = None) List[HardwareUsage]

Returns the usage records associated with the user’s account.

Parameters
  • start_time – Timestamp of the beginning of when to check usage metrics. The default start_time is the first day of the month.

  • end_time – Timestamp of the end of when to check usage metrics. The default end_time is the first day of next month.

  • account_uuid – The UUID of the account to check usage against. Defaults to the UUID of the active user’s account.

Returns

a list of usage records associated with the given account.
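The default time window described above (the first day of the current month through the first day of the next month) can be computed like this. A sketch of the documented defaults, not the SDK's implementation; the function name is hypothetical.

```python
from datetime import datetime, timezone

def default_usage_window(now=None):
    # Sketch of the documented get_usage defaults: the window is
    # [first day of this month, first day of next month).
    now = now or datetime.now(timezone.utc)
    start = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if start.month == 12:
        end = start.replace(year=start.year + 1, month=1)
    else:
        end = start.replace(month=start.month + 1)
    return start, end
```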

get_user(uuid: str) User

Returns the user with the given id.

Parameters

uuid – the id of the user to get.

Returns

the requested user, if it exists.

get_workflow(uuid: str) Workflow

Returns the workflow with the given id.

Parameters

uuid – the id of the workflow to get.

Returns

the requested workflow, if it exists.

property host: str

Returns the OctoML host that the OctomizerClient connects to.

list_models() Iterator[Model]

Returns all Models associated with the current user.

Returns

all the current user’s Models.

list_projects() Iterator[Project]

Returns all Projects associated with the current user.

Returns

all the current user’s Projects.

list_users(account_uuid: Optional[str] = None) Iterator[User]

Returns all users associated with the account with the given uuid.

Parameters

account_uuid – the id of the account whose users to get.

Returns

all the users in the account

property stub: OctomizerServiceStub

Return the underlying gRPC client stub used by this OctomizerClient. This is useful for cases where you wish to invoke gRPC calls directly.

update_user(user_uuid: str, given_name: Optional[str] = None, family_name: Optional[str] = None, email: Optional[str] = None, account_uuid: Optional[str] = None, active: Optional[bool] = None, is_own_account_admin: Optional[bool] = None, can_accelerate: Optional[bool] = None) User

Updates a user’s data with the provided parameters.

Parameters
  • user_uuid – UUID of the user whose information should be updated.

  • given_name – The new given name for the user.

  • family_name – The new family name for the user.

  • email – A new email address for the user.

  • account_uuid – The UUID of the new account with which the user should be associated.

  • active – A new value for whether the user is active (not offboarded).

  • is_own_account_admin – A new value for whether the user is an admin of their own account.

  • can_accelerate – A new value for whether the user has the ability to trigger octomizations.

Returns

the updated user
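Since None means "leave unchanged" for every optional parameter of update_user, a caller can assemble only the fields that actually changed. The helper below is hypothetical, shown only to illustrate that convention.

```python
def changed_fields(**updates):
    # Hypothetical helper: keep only the fields the caller actually set,
    # mirroring how update_user treats None as "leave unchanged".
    return {k: v for k, v in updates.items() if v is not None}
```

The resulting dict could then be splatted into update_user, e.g. client.update_user(user_uuid, **changed_fields(email=new_email)).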

octomizer.logging_ module

octomizer.model module

Generic wrapper for Models in the OctoML Platform.

octomizer.model.DEFAULT_JOB_POLL_INTERVAL_SECONDS = 5

The default number of seconds to wait between polling for job statuses.

octomizer.model.DEFAULT_JOB_TIMEOUT_SECONDS = 7200

The default number of seconds to wait for a job to finish.

exception octomizer.model.InvalidInputError

Bases: ModelCreationError

The specified inputs are invalid.

class octomizer.model.Model(client: OctomizerClient, name: Optional[str] = None, model: Optional[Union[bytes, str]] = None, description: Optional[str] = None, labels: Optional[List[str]] = None, uuid: Optional[str] = None, proto: Optional[Model] = None, model_format: Optional[str] = None, model_input_shapes: Optional[Dict[str, List[int]]] = None, model_input_dtypes: Optional[Dict[str, str]] = None, project: Optional[Project] = None, timeout: int = 7200, relaxed_ingest: bool = False)

Bases: object

Represents a generic Model in the OctoML Platform.

__init__(client: OctomizerClient, name: Optional[str] = None, model: Optional[Union[bytes, str]] = None, description: Optional[str] = None, labels: Optional[List[str]] = None, uuid: Optional[str] = None, proto: Optional[Model] = None, model_format: Optional[str] = None, model_input_shapes: Optional[Dict[str, List[int]]] = None, model_input_dtypes: Optional[Dict[str, str]] = None, project: Optional[Project] = None, timeout: int = 7200, relaxed_ingest: bool = False)

Creates a new Model.

There are three ways to use this constructor:
  1. The client passes in a model object, name, description, and labels. A new model is created on the service with the given parameters.

  2. The client passes in a model UUID. An existing model with the given UUID is fetched from the service.

  3. The client provides a fully-populated models_pb2.Model protobuf message.

Parameters
  • client – an instance of the OctoML client. Required.

  • name – the name of the model. Required.

  • model – The model data in the appropriate format, or the name of a file containing the model.

  • description – a description of the model.

  • labels – tags for the Model.

  • uuid – UUID of a Model already existing in the OctoML Platform. If provided, no other values other than client should be specified.

  • proto – the underlying protobuf object wrapped by this Model. If provided, no other values other than client should be specified.

  • model_format – the type of the underlying model. Supported formats: onnx, tflite, tf_graph_def (tensorflow graph def), tf_saved_model (includes keras saved model).

  • model_input_shapes – The model’s input shapes in key, value format, e.g. {"input0": [1, 3, 224, 224]}.

  • model_input_dtypes – The model’s input dtypes in key, value format, e.g. {"input0": "float32"}.

  • project – the Project this model belongs to. Must be octomizer.project.Project or None, and None when providing the UUID.

  • timeout – the ingestion job timeout in seconds.

create_package_workflow_group(package_workflow_group_spec: PackageWorkflowGroup) PackageWorkflowGroup

Creates a new PackageWorkflowGroup for this Model.

Parameters

package_workflow_group_spec – the specification for the PackageWorkflowGroup to be created.

Returns

the new PackageWorkflowGroup.

create_packages(platform: str, acceleration_mode: workflows_pb2.AccelerationMode = 0, input_shapes: Optional[Dict[str, List[int]]] = None, input_dtypes: Optional[Dict[str, str]] = None, package_name: Optional[str] = None, metadata: Optional[Dict[str, str]] = None) package_workflow_group.PackageWorkflowGroup

Create packages for this Model. This is a convenience function that creates a PackageWorkflowGroup with Workflows created from available tuners and runtimes.

Parameters
  • platform – The hardware platform to target. Available platforms can be queried via the get_hardware_targets method on an OctomizerClient. To benchmark on other hardware platforms, please submit a feature request.

  • acceleration_mode

    The acceleration mode to use for the PackageWorkflowGroup. There are currently two modes:

    • AUTO (default): Default settings search

    • EXPRESS: Non-exhaustive search

  • input_shapes – dict mapping input name to shape. Must be provided if input_dtypes is provided.

  • input_dtypes – dict mapping input name to dtype. Must be provided if input_shapes is provided.

  • package_name – The name of the package. If unset or empty, will default to the name of the model. Note: Non-alphanumeric characters in the name will be replaced with underscores (‘_’) and trailing/leading underscores will be stripped. Valid package names must only contain lower case letters, numbers, and single (non leading/trailing) underscores (‘_’).

  • metadata – Metadata tagged onto PackageWorkflowGroup and its Workflows.
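The package_name normalization described above can be sketched as follows. This is one plausible reading of the documented rule, not the service's code; in particular, collapsing runs of replaced characters into a single underscore is an assumption drawn from the "single underscores" wording.

```python
import re

def sanitize_package_name(name: str) -> str:
    # Sketch of the documented rule (assumed behavior, not the SDK's code):
    # lowercase, replace non-alphanumeric runs with a single underscore,
    # then strip leading/trailing underscores.
    cleaned = re.sub(r"[^a-z0-9]+", "_", name.lower())
    return cleaned.strip("_")
```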

get_model_variant(uuid: str) ModelVariant

Retrieves the ModelVariant with the given id associated with this Model.

Parameters

uuid – the id of the ModelVariant to retrieve.

Returns

the ModelVariant associated with this Model that has the given id.

get_uploaded_model_variant() ModelVariant

Returns the original, uploaded ModelVariant for this Model.

get_workflow(uuid: str) Workflow

Deprecated. Retrieves the Workflow with the given id associated with this Model.

Parameters

uuid – the id of the Workflow to retrieve.

Returns

the Workflow associated with this Model that has the given id.

property inputs: Tuple[Dict[str, List[int]], Dict[str, str]]

Return the input shapes and dtypes for this Model. Shape dimensions are expected to be positive, but -1 may be used as a sentinel for an unknown dimension that the user is expected to clarify.

list_model_variants() Iterator[ModelVariant]

Retrieves all ModelVariants associated with this Model.

Returns

all ModelVariants associated with this Model.

list_workflows() Iterator[Workflow]

Retrieves all Workflows associated with this Model.

Returns

all Workflows associated with this Model.

property project: Optional[Project]

Return the Project this Model belongs to.

property proto: Model

Return the underlying protobuf describing this Model.

property status: ingest_model_status_pb2.IngestModelStatus.Status

Return the status of the IngestModel job.

static upload_data(client: OctomizerClient, model_bytes: bytes, filename: str = '') DataRef

property uuid: str

Return the UUID for this Model.

wait_for_ingestion(timeout: int = 7200, poll_interval: int = 5) IngestModelStatus

Polls the model ingestion job until it completes.

Parameters
  • timeout – the timeout in seconds

  • poll_interval – the polling interval in seconds

Returns

the IngestModelStatus with the COMPLETED status

Raise

ModelCreationError if polling times out or ingestion fails

exception octomizer.model.ModelCreationError

Bases: RuntimeError

octomizer.model_inputs module

octomizer.model_inputs.inputs_are_dynamic(input_shapes: Dict[str, List[int]]) bool

Returns True if inputs are dynamic, i.e. at least one shape is -1.

octomizer.model_inputs.inputs_are_valid(input_shapes: Optional[Dict[str, List[int]]], input_dtypes: Optional[Dict[str, str]]) Optional[str]

Returns None when this model’s inputs are valid, otherwise returns a message indicating what updates are necessary to fix the inputs.
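A local sketch of these two checks, under stated assumptions: inputs_are_dynamic mirrors the documented -1 sentinel rule, while describe_input_problems is a hypothetical stand-in for inputs_are_valid whose exact message text is invented.

```python
def inputs_are_dynamic(input_shapes):
    # Documented rule: inputs are dynamic when any dimension is the
    # -1 sentinel.
    return any(-1 in shape for shape in input_shapes.values())

def describe_input_problems(input_shapes, input_dtypes):
    # Hypothetical stand-in for inputs_are_valid: returns None when the
    # shape and dtype dicts name the same inputs, else a message saying
    # which inputs are missing a shape or dtype.
    mismatched = set(input_shapes or {}) ^ set(input_dtypes or {})
    if mismatched:
        return f"inputs missing a shape or dtype: {sorted(mismatched)}"
    return None
```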

octomizer.model_inputs.inputs_to_input_proto(input_shapes: Optional[Dict[str, List[int]]] = None, input_dtypes: Optional[Dict[str, str]] = None) ModelInputs

Constructs a ModelInputs message from the input_shapes and input_dtypes dicts.

Parameters
  • input_shapes – optional dict of input name to input shape.

  • input_dtypes – optional dict of input name to input dtype.

Returns

ModelInputs proto constructed from the input dicts.

octomizer.model_variant module

Generic wrapper for ModelVariants in the OctoML Platform.

class octomizer.model_variant.AutoTVMOptions(kernel_trials: int = 2000, exploration_trials: int = 0, random_trials: int = 0, early_stopping_threshold: int = 500)

Bases: object

Specifies options for autotuning using AutoTVM.

early_stopping_threshold: int = 500

Threshold to terminate autotuning if results have not improved in this many iterations.

exploration_trials: int = 0

[experimental, use at your own risk] Minimum number of trials to tune from scratch during autotuning. Note for each tuning job, max(kernel_trials - cached trials, exploration_trials) number of trials are actively tuned.

kernel_trials: int = 2000

Number of trials for each kernel during autotuning – records are pulled from a cache if available, and the remaining trials are actively tuned.

random_trials: int = 0

[experimental, use at your own risk] On top of any cached trials, this indicates the maximum additional random records from the cache to seed autotuning if available.

class octomizer.model_variant.AutoschedulerOptions(trials_per_kernel: int = 1000, early_stopping_threshold: int = 250, top_trials_per_kernel: int = 10)

Bases: object

Specifies options for autotuning using Autoscheduler.

early_stopping_threshold: int = 250

Threshold to terminate autotuning if results have not improved in this many iterations.

top_trials_per_kernel: int = 10

Number of top trials to retrieve from the cache, if possible.

trials_per_kernel: int = 1000

Number of trials for each kernel during autotuning.

class octomizer.model_variant.MetascheduleOptions(exploration_trials_per_kernel: int = 1000, top_trials_per_kernel: int = 10, memory_layout: autotune_pb2.MetaSchedulerSpec.MemoryLayout = 0)

Bases: object

Specifies options for autotuning using Metaschedule.

exploration_trials_per_kernel: int = 1000

Number of trials to explore for each kernel during autotuning.

memory_layout: autotune_pb2.MetaSchedulerSpec.MemoryLayout = 0

The preferred memory layout for this tuning run.

top_trials_per_kernel: int = 10

Number of trials to import from previous tuning runs.

class octomizer.model_variant.ModelVariant(client: OctomizerClient, model: Model, uuid: Optional[str] = None, proto: Optional[ModelVariant] = None)

Bases: object

Represents a ModelVariant on the OctoML Platform.

__init__(client: OctomizerClient, model: Model, uuid: Optional[str] = None, proto: Optional[ModelVariant] = None)

Initializes a new ModelVariant.

Parameters
  • client – an instance of the OctoML client.

  • model – the Model this ModelVariant is associated with.

  • uuid – the id of this ModelVariant in the OctoML Platform.

  • proto – the underlying protobuf object wrapped by this ModelVariant.

accelerate(platform: str, relay_opt_lvl: int = 3, enable_profiler: bool = True, tvm_num_threads: int = 0, kernel_trials: Optional[int] = None, exploration_trials: Optional[int] = None, random_trials: Optional[int] = None, early_stopping_threshold: Optional[int] = None, num_benchmark_trials: int = 30, num_runs_per_trial: int = 1, max_time_seconds: int = 120, min_time_seconds: int = 10, input_shapes: Optional[Dict[str, List[int]]] = None, input_dtypes: Optional[Dict[str, str]] = None, tuning_options: Optional[Union[AutoTVMOptions, AutoschedulerOptions, MetascheduleOptions]] = None, create_package: bool = True, package_name: Optional[str] = None, benchmark_tvm_in_onnxruntime: bool = False) Workflow

Accelerate this ModelVariant. This is a convenience function that creates a Workflow consisting of autotuning, benchmarking, and (optional) packaging stages.

Parameters
  • platform

    The hardware platform to target. Available platforms can be queried via the get_hardware_targets method on an OctomizerClient. To benchmark on other hardware platforms, please submit a feature request.

  • relay_opt_lvl – The Relay optimization level to use.

  • enable_profiler – Whether to enable the RelayVM profiler when benchmarking. Profiling is done as an additional step, so it does not affect the values of the standard metrics that are reported.

  • tvm_num_threads – Number of threads the TVM runtime uses when running inference. By default, this is set to vcpu_count/2 for hyperthreading hardware targets and vcpu_count for non-hyperthreading hardware targets, to give the best performance. Setting it to 0 lets TVM automatically decide the number of threads.

  • kernel_trials – deprecated, specify tuning_options instead.

  • exploration_trials – deprecated, specify tuning_options instead.

  • random_trials – deprecated, specify tuning_options instead.

  • early_stopping_threshold – deprecated, specify tuning_options instead.

  • num_benchmark_trials – Number of benchmarking trials to execute; if zero, then max_time_seconds value dictates benchmark duration.

  • num_runs_per_trial – Number of benchmarks to run per trial.

  • max_time_seconds – The maximum benchmark duration; zero implies no time limit. Note that the experiment may consist of fewer trials than specified.

  • min_time_seconds – The minimum benchmark duration; zero implies no minimum time. Note that the experiment may consist of more trials than specified.

  • input_shapes – dict mapping input name to shape. Must be provided if input_dtypes is provided.

  • input_dtypes – dict mapping input name to dtype. Must be provided if input_shapes is provided.

  • tuning_options – options to control the autotuning search. Provide either AutoTVMOptions or AutoschedulerOptions or MetascheduleOptions.

  • create_package – Whether a package should be created or not. Defaults to True.

  • package_name – The name of the package. If unset or empty, will default to the name of the model. Note: Non-alphanumeric characters in the name will be replaced with underscores (‘_’) and trailing/leading underscores will be stripped. Valid package names must only contain lower case letters, numbers, and single (non leading/trailing) underscores (‘_’).

  • benchmark_tvm_in_onnxruntime – Whether we should run the benchmark as a TVM -> ONNX custom op model.

Returns

A Workflow instance.

Raise

ValueError when the package name is not a valid package name.

benchmark(platform: str, num_benchmark_trials: int = 30, num_runs_per_trial: int = 1, max_time_seconds: int = 120, min_time_seconds: int = 10, relay_opt_lvl: int = 3, enable_profiler: bool = True, tvm_num_threads: int = 0, untuned_tvm: bool = False, input_shapes: Optional[Dict[str, List[int]]] = None, input_dtypes: Optional[Dict[str, str]] = None, create_package: bool = False, package_name: Optional[str] = None, use_onnx_engine: bool = False, intra_op_num_threads: int = 0, onnx_execution_provider: Optional[ONNXRuntimeExecutionProvider] = None, benchmark_tvm_in_onnxruntime: bool = False, reduced_precision_conversion: workflow_pb2.ReducedPrecisionConversion = 0) workflow.Workflow

Benchmark this ModelVariant. This is a convenience function that creates a Workflow consisting of a single benchmarking stage.

Parameters
  • platform

    The hardware platform to target. Available platforms can be queried via the get_hardware_targets method on an OctomizerClient. To benchmark on other hardware platforms, please submit a feature request.

  • num_benchmark_trials – Number of benchmarking trials to execute.

  • num_runs_per_trial – Number of benchmarks to run per trial.

  • max_time_seconds – The maximum benchmark duration; zero implies no time limit. Note that the experiment may consist of fewer trials than specified.

  • min_time_seconds – The minimum benchmark duration; zero implies no minimum time. Note that the experiment may consist of more trials than specified.

  • relay_opt_lvl – The Relay optimization level to use, if the model format is Relay.

  • enable_profiler – Whether to enable the RelayVM profiler when benchmarking, if the model format is Relay. Profiling is done as an additional step, so it does not affect the values of the standard metrics that are reported.

  • tvm_num_threads – Number of threads the TVM runtime uses when running inference. By default, this is set to vcpu_count/2 for hyperthreading hardware targets and vcpu_count for non-hyperthreading hardware targets, to give the best performance. Setting it to 0 lets TVM automatically decide the number of threads.

  • untuned_tvm – Whether this is a baseline untuned TVM benchmark.

  • create_package – Whether a package should be created or not. Defaults to False.

  • package_name – The name of the package. If unset or empty, will default to the name of the model. Note: Non-alphanumeric characters in the name will be replaced with underscores (‘_’) and trailing/leading underscores will be stripped. Valid package names must only contain lower case letters, numbers, and single (non leading/trailing) underscores (‘_’).

  • intra_op_num_threads – The number of threads to use for ONNX-RT’s CPUExecutionProvider, TensorFlow, and PyTorch benchmarks/packages. The default is 0, which uses the physical core count of the platform.

  • onnx_execution_provider – The execution provider to use for ONNX benchmarks. Note that not every execution provider is valid for every platform.

  • benchmark_tvm_in_onnxruntime – Whether we should run the benchmark as a TVM -> ONNX custom op model. (Only for Relay models.)

  • reduced_precision_conversion – Whether to use reduced precision math for the workflows. (only for ONNX models).

Returns

A Workflow instance.

create_workflow(workflow_spec: Workflow) Workflow

Creates a new Workflow for this ModelVariant.

Parameters

workflow_spec – the specification for the Workflow to be created.

Returns

the new Workflow.

property format: ModelVariantFormat

Returns the ModelVariantFormat of this ModelVariant.

Returns

the ModelVariantFormat of this ModelVariant.

property inputs: Tuple[Dict[str, List[int]], Dict[str, str]]

Return the input shapes and dtypes for this ModelVariant. Shape dimensions are expected to be positive, but -1 may be used as a sentinel for an unknown dimension that the user is expected to clarify.

package(platform: str, relay_opt_lvl: Optional[int] = None, tvm_num_threads: Optional[int] = None, input_shapes: Optional[Dict[str, List[int]]] = None, input_dtypes: Optional[Dict[str, str]] = None, package_name: Optional[str] = None, package_options: Optional[PackageOptions] = None) Workflow

Package this ModelVariant. This is a convenience function that creates a Workflow consisting of a single packaging stage.

Parameters
  • platform

    The hardware platform to target. Available platforms can be queried via the get_hardware_targets method on an OctomizerClient. To benchmark on other hardware platforms, please submit a feature request.

  • relay_opt_lvl – Deprecated; specify the Relay optimization level via package_options.

  • tvm_num_threads – Deprecated, the number set here will not affect the package.

  • input_shapes – dict mapping input name to shape. Must be provided if input_dtypes is provided.

  • input_dtypes – dict mapping input name to dtype. Must be provided if input_shapes is provided.

  • package_name – The name of the package. If unset or empty, will default to the name of the model. Note: Non-alphanumeric characters in the name will be replaced with underscores (‘_’) and trailing/leading underscores will be stripped. Valid package names must only contain lower case letters, numbers, and single (non leading/trailing) underscores (‘_’).

  • package_options – Selects and configures runtime: TVMPackageOptions (TVM), or OnnxPackageOption (ONNX). All available package types for the specified runtime engine will be created.

Returns

A Workflow instance.

Raise

ValueError when the package name is not a valid package name.

property uuid: str

Return the UUID for this ModelVariant.

class octomizer.model_variant.ModelVariantFormat(value)

Bases: Enum

An enum representing the formats a ModelVariant can be in.

ONNX = 'onnx'
RELAY = 'relay'
TENSORFLOW = 'tensorflow'
TFLITE = 'tflite'
TORCHSCRIPT = 'torchscript'

class octomizer.model_variant.OnnxPackageOptions(engine_spec: ~octoml.octomizer.v1.engine_pb2.EngineSpec = onnxruntime_engine_spec { })

Bases: PackageOptions

Specifies packaging for the ONNX runtime.

engine_spec: EngineSpec = onnxruntime_engine_spec { }

The engine spec to use for the ONNX package.

class octomizer.model_variant.PackageOptions

Bases: ABC

class octomizer.model_variant.TVMPackageOptions(engine_spec: Optional[EngineSpec] = None)

Bases: PackageOptions

Specifies packaging options for the TVM runtime.

engine_spec: Optional[EngineSpec] = None

The engine spec to use for the TVM package.

class octomizer.model_variant.TensorFlowPackageOptions

Bases: PackageOptions

Specifies packaging for the TensorFlow runtime.

class octomizer.model_variant.TorchscriptPackageOptions

Bases: PackageOptions

Specifies packaging for the TorchScript runtime.

octomizer.package_type module

class octomizer.package_type.PackageType(value)

Bases: Enum

An enum representing the types of possible packages.

DOCKER_BUILD_TRITON = 'DOCKER_BUILD_TRITON'
LINUX_SHARED_OBJECT = 'LINUX_SHARED_OBJECT'
ONNXRUNTIME_CUSTOM_OPERATOR_LINUX = 'ONNXRUNTIME_CUSTOM_OPERATOR_LINUX'
PYTHON_PACKAGE = 'PYTHON_PACKAGE'

octomizer.package_workflow_group module

Resource representing a PackageWorkflowGroup.

octomizer.package_workflow_group.DEFAULT_PACKAGE_WORKFLOW_GROUP_POLL_INTERVAL_SECONDS = 30

The default number of seconds to wait between polling for statuses.

octomizer.package_workflow_group.DEFAULT_PACKAGE_WORKFLOW_GROUP_TIMEOUT_SECONDS = 1800

The default number of seconds to wait for a PackageWorkflowGroup to finish.

class octomizer.package_workflow_group.PackageWorkflowGroup(client: OctomizerClient, uuid: Optional[str] = None, proto: Optional[PackageWorkflowGroup] = None)

Bases: object

Represents an OctoML PackageWorkflowGroup.

__init__(client: OctomizerClient, uuid: Optional[str] = None, proto: Optional[PackageWorkflowGroup] = None)

Initializes a new PackageWorkflowGroup.

Parameters
  • client – an instance of the OctoML client.

  • uuid – the id of this PackageWorkflowGroup in the OctoML Platform.

  • proto – the underlying protobuf object wrapped by this PackageWorkflowGroup.

cancel() PackageWorkflowGroup

Cancels the Workflows in this PackageWorkflowGroup.

done() bool

Returns True if all Workflows have finished.

property proto: PackageWorkflowGroup

Return the raw representation for this PackageWorkflowGroup.

refresh() PackageWorkflowGroup

Get the latest status of this PackageWorkflowGroup from the OctoML Platform.

property uuid: str

Return the UUID for this PackageWorkflowGroup.

wait(timeout: Optional[int] = 1800, poll_interval: int = 30, poll_callback: Optional[Callable[[PackageWorkflowGroup], None]] = None) bool

Waits until this PackageWorkflowGroup has finished or the given timeout has elapsed.

Parameters
  • timeout – the number of seconds to wait for this PackageWorkflowGroup.

  • poll_interval – the number of seconds to wait between polling for the status of this PackageWorkflowGroup.

  • poll_callback – Optional callback invoked with self as an argument each time the PackageWorkflowGroup status is polled.

Returns

whether the PackageWorkflowGroup completed or not.

Raise

PackageWorkflowGroupTimeoutError if timeout seconds elapse before the Workflows in the group reach a terminal state.

property workflows: List[Workflow]

exception octomizer.package_workflow_group.PackageWorkflowGroupTimeoutError

Bases: Exception

Indicates a PackageWorkflowGroup timed out while being awaited.

octomizer.project module

Generic wrapper for Projects in the Octomizer.

class octomizer.project.Project(client: OctomizerClient, name: Optional[str] = None, description: Optional[str] = None, labels: Optional[List[str]] = None, uuid: Optional[str] = None, proto: Optional[Project] = None)

Bases: object

Represents a Project in the Octomizer system.

__init__(client: OctomizerClient, name: Optional[str] = None, description: Optional[str] = None, labels: Optional[List[str]] = None, uuid: Optional[str] = None, proto: Optional[Project] = None)

Creates a new Project.

There are three ways to use this constructor:
  1. The client passes in a project object, name, description, and labels. A new project is created on the service with the given parameters.

  2. The client passes in a project UUID. An existing project with the given UUID is fetched from the service.

  3. The client provides a fully-populated projects_pb2.Project protobuf message.

Parameters
  • client – an instance of the Octomizer client. Required.

  • name – the name of the project. Required.

  • description – a description of the project.

  • labels – tags for the Project.

  • uuid – UUID of a Project already existing in the Octomizer. If provided, no other values other than client should be specified.

  • proto – the underlying protobuf object wrapped by this Project. If provided, no values other than client should be specified.

list_models() Iterator[Model]

Retrieves all Models associated with this Project.

Returns

all Models associated with this Project.
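Since list_models() returns an iterator, it can be consumed like any Python generator. The helper below is an illustrative sketch; it assumes each Model exposes a uuid property, matching the pattern used throughout this API.

```python
def model_uuids(project):
    """Collect the UUID of every Model in a Project (illustrative sketch)."""
    # list_models() yields Model objects lazily; keep only their UUIDs.
    return [model.uuid for model in project.list_models()]
```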

property proto: Project

Return the underlying protobuf describing this Project.

property uuid: str

Return the UUID for this Project.

octomizer.user module

class octomizer.user.User(client: OctomizerClient, given_name: Optional[str] = None, family_name: Optional[str] = None, email: Optional[str] = None, account_uuid: Optional[str] = None, active: Optional[bool] = None, is_own_account_admin: Optional[bool] = None, can_accelerate: Optional[bool] = None, uuid: Optional[str] = None, proto: Optional[User] = None)

Bases: object

__init__(client: OctomizerClient, given_name: Optional[str] = None, family_name: Optional[str] = None, email: Optional[str] = None, account_uuid: Optional[str] = None, active: Optional[bool] = None, is_own_account_admin: Optional[bool] = None, can_accelerate: Optional[bool] = None, uuid: Optional[str] = None, proto: Optional[User] = None)

Creates a new User.

There are three ways to use this constructor:
  1. The client passes in a given name, family name, email, account UUID, active flag, and permission flags. A new user is created on the service with the given parameters.

  2. The client passes in a user UUID. An existing user with the given UUID is fetched from the service.

  3. The client provides a fully-populated users_pb2.User protobuf message.

Parameters
  • client – an instance of the OctoML Platform client. Required.

  • given_name – the given name of the user.

  • family_name – the family name of the user.

  • email – the email of the user.

  • account_uuid – the UUID corresponding to the account that owns this user.

  • active – if the user is active (not offboarded).

  • is_own_account_admin – whether the user is an administrator of their own account.

  • can_accelerate – whether the user is permitted to run acceleration workflows.

  • uuid – UUID of a User already existing in the OctoML Platform. If provided, no values other than client should be specified.

  • proto – the underlying protobuf object wrapped by this User. If provided, no values other than client should be specified.
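New users are typically created through OctomizerClient.add_user (documented in octomizer.client above). The helper below is a hedged sketch that simply mirrors that method's documented defaults; the usage line assumes an already-configured client.

```python
def new_user_kwargs(given_name, family_name, email, account_uuid=None):
    """Assemble keyword arguments for OctomizerClient.add_user (sketch)."""
    return dict(
        given_name=given_name,
        family_name=family_name,
        email=email,
        active=True,                 # defaults mirror the add_user signature
        is_own_account_admin=False,
        can_accelerate=True,
        account_uuid=account_uuid,
    )

# Usage, assuming a configured client:
#   user = client.add_user(**new_user_kwargs("Ada", "Lovelace", "ada@example.com"))
```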

property account_uuid: str

Return the account UUID for this User.

property active: bool

Return True if this User is active (not offboarded), False otherwise.

property family_name: str

Return the family name for this User.

property given_name: str

Return the given name for this User.

property permissions: Permissions

Returns the permissions this User has.

property proto: User

Return the underlying protobuf describing this User.

property uuid: str

Return the UUID for this User.

exception octomizer.user.UserCreationError

Bases: RuntimeError

octomizer.workflow module

Resource representing a Workflow.

octomizer.workflow.DEFAULT_WORKFLOW_POLL_INTERVAL_SECONDS = 10

The default number of seconds to wait between polling for Workflow statuses.

octomizer.workflow.DEFAULT_WORKFLOW_TIMEOUT_SECONDS = 300

The default number of seconds to wait for a Workflow to finish.

class octomizer.workflow.Workflow(client: OctomizerClient, model: Optional[Model] = None, uuid: Optional[str] = None, proto: Optional[Workflow] = None)

Bases: object

Represents an OctoML Workflow.

__init__(client: OctomizerClient, model: Optional[Model] = None, uuid: Optional[str] = None, proto: Optional[Workflow] = None)

Initializes a new Workflow.

Parameters
  • client – an instance of the OctoML client.

  • model – deprecated, the Model this Workflow is associated with.

  • uuid – the id of this Workflow in the OctoML Platform.

  • proto – the underlying protobuf object wrapped by this Workflow.

cancel() Workflow

Cancel this Workflow if it hasn’t already been canceled/completed/failed.

completed() bool

Returns True if this Workflow has completed successfully.

docker_build_triton(tag)

Downloads the Triton Docker build package, extracts all of the files and builds a Docker image.

Parameters

tag – the name and tag (name:tag) of the Docker image.

done() bool

Returns True if this Workflow has finished.

has_benchmark_stage() bool

Returns True when the workflow contains a benchmark stage, False otherwise.

is_terminal() bool

Returns True if this Workflow is in a terminal state.

metrics() BenchmarkMetrics

Return the BenchmarkMetrics for this Workflow.

The result is only valid if the Workflow had a Benchmark stage, and the Workflow state is WorkflowState.COMPLETED.
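A defensive accessor can combine the two guards documented here. This is a sketch: returning None for an incomplete or benchmark-less Workflow is this example's choice, not SDK behavior.

```python
def benchmark_metrics_or_none(workflow):
    """Return BenchmarkMetrics only when they are valid, else None (sketch)."""
    # metrics() is only valid for a COMPLETED Workflow with a benchmark stage.
    if workflow.has_benchmark_stage() and workflow.completed():
        return workflow.metrics()
    return None
```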

property model: Model

Return the model associated with this Workflow.

package_url(package_type: Optional[PackageType] = PackageType.PYTHON_PACKAGE) str

Return the URL of the package output for this Workflow.

Parameters

package_type – the package type to get the URL for. Defaults to the Python wheel.

The result is only valid if the Workflow had a Package stage, the package type is available for the specified runtime engine, and the Workflow state is WorkflowState.COMPLETED.

Raises

ValueError – if the package type isn’t available or there is no package DataRef UUID.

progress() Progress

Returns the progress of this Workflow.

property proto: Workflow

Return the raw representation for this Workflow.

refresh() Workflow

Get the latest status of this Workflow from the OctoML Platform.

result() WorkflowResult

Returns the result of this Workflow. The result is only valid if the Workflow has finished running.

save_package(out_dir: str, package_type: Optional[PackageType] = PackageType.PYTHON_PACKAGE) str

Downloads the package result for this Workflow to the given directory and returns the full path to the package.

Parameters
  • out_dir – the directory the package will be saved to.

  • package_type – the package type to save. Defaults to the Python wheel.

The result is only valid if the Workflow had a Package stage, and the Workflow state is WorkflowState.COMPLETED.
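A guarded download might look like the sketch below. Note the hedge in the except clause: ValueError is documented above for package_url, and this sketch assumes save_package surfaces the same error for a missing package.

```python
def try_save_package(workflow, out_dir):
    """Save the Workflow's package if available, else return None (sketch)."""
    if not workflow.completed():
        return None
    try:
        # The default package type is the Python wheel, per the docs above.
        return workflow.save_package(out_dir)
    except ValueError:
        # package_url documents ValueError for a missing package; we assume
        # save_package surfaces the same error (an assumption of this sketch).
        return None
```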

state() workflows_pb2.WorkflowStatus.WorkflowState

Returns the current state of this Workflow.

status() WorkflowStatus

Returns the current status of this Workflow.

property uuid: str

Return the UUID for this Workflow.

wait(timeout: Optional[int] = 300, poll_interval: int = 10, poll_callback: Optional[Callable[[Workflow], None]] = None) workflows_pb2.WorkflowStatus.WorkflowState

Waits until this Workflow has finished or the given timeout has elapsed.

Parameters
  • timeout – the number of seconds to wait for this Workflow.

  • poll_interval – the number of seconds to wait between polling for the status of this Workflow.

  • poll_callback – Optional callback invoked with self as an argument each time the Workflow status is polled.

Returns

the terminal state of this Workflow.

Raises

WorkflowTimeoutError – if timeout seconds elapse before the Workflow reaches a terminal state.
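Combining wait() with cancel() gives a simple timeout policy. This is an illustrative sketch; the `try`/`except ImportError` guard only lets the snippet run without the SDK installed, and canceling on timeout is this example's policy choice.

```python
try:
    from octomizer.workflow import WorkflowTimeoutError
except ImportError:  # stand-in so this sketch runs without the SDK installed
    class WorkflowTimeoutError(Exception):
        pass

def wait_or_cancel(workflow, timeout=300):
    """Wait for a Workflow; cancel it if the timeout elapses (sketch)."""
    try:
        # wait() returns the terminal WorkflowState on success.
        return workflow.wait(timeout=timeout)
    except WorkflowTimeoutError:
        workflow.cancel()  # per the docs, cancel() skips already-terminal runs
        return None
```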

exception octomizer.workflow.WorkflowTimeoutError

Bases: Exception

Indicates a Workflow timed out while being awaited.

Module contents