Welcome to Octomizer’s documentation!

Octomizer is a machine learning model optimization service powered by OctoML.ai. This documentation provides an overview of how to use the Octomizer service via the web interface as well as the Python SDK.

Example usage

Here is a simple Python program that uploads an ONNX model to Octomizer, optimizes it, and downloads the result as a Python package:

from octomizer import client, workflow
from octomizer.models import onnx_model

# Pass your API token below (MY_ACCESS_TOKEN is a placeholder for your own token):
client = client.OctomizerClient(access_token=MY_ACCESS_TOKEN)

# Specify the model file to upload.
model_file = "mnist.onnx"

# Upload the model to Octomizer.
model = onnx_model.ONNXModel(client, name=model_file, model=model_file)

# Octomize it. By default, the resulting package will be a Python wheel.
wrkflow = model.get_uploaded_model_variant().octomize(platform="broadwell")
wrkflow.wait()
# Save the resulting Python wheel to the current directory.
wrkflow.save_package(".")

# If you would like to view benchmark metrics, you can either visit the UI
# or invoke something similar to:
for w in model.list_workflows():
   platform = w.hardware.platform
   if w.completed() and w.has_benchmark_stage():
      engine = w.proto.benchmark_stage_spec.engine
      metrics = w.metrics()
      print(platform, engine, metrics)
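If you want to compare runs rather than just print them, it can help to reduce the (platform, engine, metrics) tuples to the best result per platform. The helper below is a plain-Python sketch over placeholder data; the latency field name and sample numbers are illustrative, not the SDK's actual metric schema.

```python
def best_latency_by_platform(results):
    """Given (platform, engine, mean_latency_ms) tuples, return the
    lowest-latency engine per platform as {platform: (engine, latency)}."""
    best = {}
    for platform, engine, latency in results:
        if platform not in best or latency < best[platform][1]:
            best[platform] = (engine, latency)
    return best

# Illustrative data only:
sample = [
    ("broadwell", "onnxruntime", 4.2),
    ("broadwell", "tvm", 2.9),
    ("skylake", "tvm", 2.1),
]
print(best_latency_by_platform(sample))
```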

To request a different package output type, pass a package_type when Octomizing:

from octomizer import model_variant
model.get_uploaded_model_variant().octomize(
   platform="broadwell",
   package_type=model_variant.PackageType.LINUX_SHARED_OBJECT,
)

To see a list of possible hardware platforms, you may execute:

print(client.get_hardware_targets())
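The returned targets can be narrowed down with ordinary Python. The sketch below assumes each target exposes a platform attribute, as the workflow hardware specs above do; the helper and the stand-in objects are illustrative, not SDK calls.

```python
def platforms_matching(targets, substring):
    """Return the sorted platform names from targets whose name contains substring."""
    return sorted(t.platform for t in targets if substring in t.platform)

# Illustrative stand-in objects for demonstration:
class _Target:
    def __init__(self, platform):
        self.platform = platform

demo = [_Target("broadwell"), _Target("skylake"), _Target("rasp4b")]
print(platforms_matching(demo, "well"))
```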

Please contact us if you have any questions or problems.