Releases: roboflow/inference

v1.2.2

10 Apr 17:33
dd5cfa5

What's Changed

Full Changelog: v1.2.1...v1.2.2

v1.2.1

07 Apr 21:29
d198cb7

What's Changed

New Contributors

Full Changelog: v1.2.0...v1.2.1

v1.2.0

27 Mar 20:56
57b58c0

🚀 Added

🚗 Switched to inference-models as default inference engine

As announced at the beginning of the 1.x.y release series, we've been working to make inference-models the default engine — and it's now live. The old inference backend remains available in opt-out mode.

Along with this change (and related updates to torch handling), we've updated the recommended installation flow for the inference-gpu Python package. Install torch and torchvision first — selecting the variant and CUDA index that matches your environment — then install inference-gpu:

pip install --index-url https://download.pytorch.org/whl/cu128 torch torchvision  # adjust CUDA version as needed
pip install inference-gpu

Additionally, since inference-models depends on pycuda, you'll need CUDA installed with the development toolkit (including the headers required to build pycuda). Follow the appropriate CUDA installation guide for your platform.
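As a quick sanity check before installing (a sketch; adapt it to your platform), you can verify that the CUDA compiler pycuda builds against is available:

```shell
# pycuda is compiled against the CUDA development toolkit, so the nvcc
# compiler (and the headers shipped with it) must be on PATH.
if command -v nvcc > /dev/null 2>&1; then
    CUDA_TOOLKIT_OK=1
    nvcc --version
else
    CUDA_TOOLKIT_OK=0
    echo "nvcc not found - install the CUDA toolkit (with headers) first"
fi
```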

Tip

To continue using the old inference backend, set the environment variable USE_INFERENCE_MODELS=False.
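For example (the container image name below is illustrative; pass the variable however your deployment injects environment):

```shell
# Opt out of the new engine: with USE_INFERENCE_MODELS=False the old
# inference backend keeps handling predictions.
export USE_INFERENCE_MODELS=False

# For a containerized server, pass the variable through instead, e.g.:
#   docker run -e USE_INFERENCE_MODELS=False <your-inference-server-image>
```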

Important

inference-models manages its cache differently from the old backend. To enable automatic model eviction in long-running containers, activate the Cache Watchdog — it monitors disk usage and removes files when storage exceeds the configured threshold.

Set MAX_INFERENCE_MODELS_CACHE_SIZE_MB to enable it. You can also control how often it runs with INFERENCE_MODELS_CACHE_WATCHDOG_INTERVAL_MINUTES. We recommend enabling this only if there's a risk of running out of disk space on your server.
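A minimal sketch of enabling the watchdog (the threshold and interval values are illustrative, not recommended defaults):

```shell
# Cache Watchdog configuration: evict cached models once the cache exceeds
# ~10 GB, checking disk usage every 15 minutes.
export MAX_INFERENCE_MODELS_CACHE_SIZE_MB=10240
export INFERENCE_MODELS_CACHE_WATCHDOG_INTERVAL_MINUTES=15
```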

🛤️ trackers 🤝 Workflows

trackers, the new Roboflow open-source tracking library, has just been onboarded to Workflows.

Thanks to @leeclemnet (#2130) we have three new blocks:

| New Block | Type Slug | Algorithm |
| --- | --- | --- |
| bytetrack/v1.py | roboflow_core/trackers_bytetrack@v1 | ByteTrack |
| sort/v1.py | roboflow_core/trackers_sort@v1 | SORT |
| ocsort/v1.py | roboflow_core/trackers_ocsort@v1 | OC-SORT |
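A sketch of what a Workflows step using one of the new blocks could look like. The type slug comes from the table above; the field names (`name`, `image`, `detections`) are illustrative assumptions, not copied from the block's actual manifest:

```shell
# Hypothetical Workflows step definition using the new ByteTrack block.
cat > tracker_step.json <<'EOF'
{
  "type": "roboflow_core/trackers_bytetrack@v1",
  "name": "tracker",
  "image": "$inputs.image",
  "detections": "$steps.detector.predictions"
}
EOF
# Pretty-print to confirm the fragment is well-formed JSON.
python3 -m json.tool tracker_step.json
```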

🔥 New Workflows blocks

  • The GLM-OCR model now has Workflows coverage - after adding the model to inference-models last week, @Erol444 followed up this week with a Workflows contribution 💪
  • @jeku46 added a structured Event Write block to the pool of Enterprise plugins in #2171

Workflows Community plugins

Check out our new documentation page - Workflows Community plugins - which highlights community work across the Workflows ecosystem.

🔧 Fixed

🚧 Maintenance

Full Changelog: v1.1.2...v1.2.0

v1.1.2

20 Mar 20:03
a3baebd

What's Changed

New Contributors

Full Changelog: v1.1.1...v1.1.2

v1.1.1

13 Mar 17:41
af014f7

🚀 Added

🌀 Execution Engine v1.8.0

Steps gated by control flow (e.g. after a ContinueIf block) can now run even when they have no data-derived lineage — meaning they don't receive batch-oriented inputs from upstream steps. Lineage and execution dimensionality are now derived from control flow predecessor steps. Existing workflows are unaffected.

  • 🔀 Control flow lineage — The compiler now tracks lineage coming from control flow steps (e.g. branches after ContinueIf). When a step has no batch-oriented data inputs but is preceded by control flow steps, its execution slices and batch structure are taken from those control flow predecessors.
  • 🔓 Loosened compatibility check — Previously, steps with control flow predecessors but no data-derived lineage would fail at compile time with ControlFlowDefinitionError. That check is now relaxed: lineage is derived from control flow predecessors when no input data lineage exists. The strict check still runs when the step does have data-derived lineage.
  • New step patterns — Steps triggered only by control flow that don't consume batch data now compile and run correctly. For example, you can send email notifications or run other side-effect steps after a ContinueIf without wiring any data into parameters like message_parameters — the step will execute once per control flow branch.
  • 🐛 Batch.remove_by_indices nested batch fix (breaking) — When removing indices via Batch.remove_by_indices, nested Batch elements are now recursively filtered by the same index set. Previously, only the top-level batch was filtered while nested batches were left unchanged, which could cause downstream blocks to silently process None values or fail outright.
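The newly-allowed step pattern can be sketched as a workflow fragment. Block types and all field names below are illustrative assumptions, not copied from real block manifests:

```shell
# Hypothetical workflow fragment: a notification step gated only by a
# ContinueIf branch, with no batch data wired into its parameters.
cat > control_flow_sketch.json <<'EOF'
{
  "steps": [
    {
      "type": "roboflow_core/continue_if@v1",
      "name": "gate",
      "condition_statement": {"type": "StatementGroup", "statements": []},
      "next_steps": ["$steps.notify"]
    },
    {
      "type": "roboflow_core/email_notification@v1",
      "name": "notify",
      "subject": "Condition met",
      "message": "Triggered purely by control flow - no batch inputs"
    }
  ]
}
EOF
# Confirm the fragment is well-formed JSON.
python3 -m json.tool control_flow_sketch.json > /dev/null && echo "valid JSON"
```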

Please review our changelog 🥼, which outlines all introduced changes. PR: #2106 by @dkosowski87

Warning

One breaking change is included due to a bug fix in Batch.remove_by_indices with nested batches (see above); the impact is expected to be minimal.

🚧 Maintenance

Full Changelog: v1.1.0...v1.1.1

v1.1.0

11 Mar 15:59
6d5e50a

ℹ️ About 1.1.0 release

This inference release brings important changes to the ecosystem:

  • We have dropped support for Python 3.9, which has reached EOL
  • We have not made inference-models the default backend for running predictions - this change has been postponed until version 1.2.0

🚀 Added

🧠 Qwen3.5

Thanks to @Matvezy, inference now supports the new Qwen3.5 model.

Qwen3.5 is Alibaba's latest open-source model family (released Feb 2026), ranging from 0.8B to 397B parameters. The headline feature is native multimodal (text + vision) support. inference and Workflows support the small 0.8B-parameter version.

The model is available only with the inference-models backend, released in inference-models 0.20.0.

🪄 GPT-5.4 support

Thanks to @Erol444, the LLM Workflows block now supports GPT-5.4, keeping inference current with the latest OpenAI model lineup.

⚙️ Selectable inference backend for batch processing

Following up on the inference 1.0.0 release, Roboflow clients can now select which inference backend is used for batch processing - giving more fine-grained control when mixing legacy and new engine workloads.

Using inference-cli, you can specify which model backend is used: inference-models or old-inference.

inference rf-cloud batch-processing process-images-with-workflow \
    --workflow-id <your-workflow> \
    --batch-id <your-batch> \
    --api-key <your-api-key> \
    --inference-backend inference-models
# or - for videos
inference rf-cloud batch-processing process-videos-with-workflow \
    --workflow-id <your-workflow> \
    --batch-id <your-batch> \
    --api-key <your-api-key> \
    --inference-backend inference-models

The same can be configured in the Roboflow App and via the HTTP integration - check out the Swagger docs.

Caution

Currently, the default backend is old-inference, but that will change in the near future. Roboflow clients should verify the new backend and, if they want to keep using the old-inference backend, make the necessary adjustments to their integrations.

🦺 Maintenance

🐍 Dropped Python 3.9 and upgraded to transformers>=5

We've ported all public builds to Python versions newer than 3.9, as supporting 3.9 was slowing down the onboarding of new features. With that deprecation in place, we could migrate to transformers>=5 and enable the new Qwen3.5 model.

Other changes

Full Changelog: v1.0.5...v1.1.0

v1.0.5

06 Mar 20:40
c3eedf0

What's Changed

New Contributors

Full Changelog: v1.0.4...v1.0.5

v1.0.4

04 Mar 14:13
52a37e4

What's Changed

Full Changelog: v1.0.3...v1.0.4

v1.0.3

03 Mar 18:35
79519d4

What's Changed

New Contributors

Full Changelog: v1.0.2...v1.0.3

v1.0.2

27 Feb 21:22
f4b5fdf

What's Changed

Full Changelog: v1.0.1...v1.0.2