
perf: optimize to_dict() serialization — 40% faster for data-heavy figures#5577

Open
KRRT7 wants to merge 3 commits into plotly:main from KRRT7:perf/to-dict-serialization

Conversation

Contributor

@KRRT7 KRRT7 commented Apr 16, 2026

Overview

Optimizes the to_dict() → convert_to_base64() → to_typed_array_spec() hot path that runs on every fig.show(), fig.write_html(), fig.to_json(), and fig.write_image() call.

Changes

1. to_typed_array_spec: eliminate redundant array copy

The function called copy_to_readonly_numpy_array(v) which:

  • Wraps through narwhals from_native() (unnecessary for numpy arrays)
  • Copies the array via .copy() (unnecessary — input is already a deepcopy from to_dict())
  • Sets the readonly flag (unnecessary — we immediately base64-encode and discard)

Replaced with a lightweight np.asarray(v) that only converts non-numpy types.
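A minimal sketch of the described replacement (the helper name `ensure_ndarray` is hypothetical; the real code lives in `_plotly_utils/utils.py`). The key property is that `np.asarray` returns its argument unchanged when it is already an ndarray, so there is no copy, no narwhals wrapping, and no readonly flag:

```python
import numpy as np

def ensure_ndarray(v):
    # Hypothetical stand-in for the replaced copy_to_readonly_numpy_array
    # call: np.asarray is a no-op for ndarray input (no copy, no readonly
    # flag) and only converts lists and other sequence types.
    return np.asarray(v)

a = np.random.rand(4)
assert ensure_ndarray(a) is a                       # same object: no copy
assert isinstance(ensure_ndarray([1.0, 2.0]), np.ndarray)
```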

2. convert_to_base64: fast numpy detection + skip non-container recursion

Replaced is_homogeneous_array(value) (which checks numpy, pandas, narwhals, and __array_interface__) with a direct isinstance(value, np.ndarray) — in the to_dict() context, data has already been validated and stored as numpy arrays.

Also inlined the numpy module lookup to avoid repeated get_module calls during recursion.

Added a guard in the list/tuple branch to only recurse into container types (dict, list, tuple). Previously, text arrays like ["point_0", "point_1", ...] caused ~500K useless recursive calls since each string element was visited individually.
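The effect of the container guard can be seen with a minimal stand-in walker (names here are hypothetical; the real traversal is inside convert_to_base64). With the guard, a dict holding a 1,000-element text list costs 2 recursive calls instead of 1,002:

```python
def count_visits(node, counter, guard):
    # Minimal stand-in for convert_to_base64's recursive traversal,
    # counting how many times the function is entered.
    counter[0] += 1
    if isinstance(node, dict):
        for v in node.values():
            count_visits(v, counter, guard)
    elif isinstance(node, (list, tuple)):
        for v in node:
            # The guard: strings and numbers can never contain numpy
            # arrays, so only recurse into real container types.
            if not guard or isinstance(v, (dict, list, tuple)):
                count_visits(v, counter, guard)

trace = {"text": [f"point_{j}" for j in range(1000)]}
guarded, unguarded = [0], [0]
count_visits(trace, guarded, guard=True)     # 2 calls: dict + list
count_visits(trace, unguarded, guard=False)  # 1002 calls: one per string
```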

3. is_skipped_key: frozenset instead of list scan

Replaced any(skipped_key == key for skipped_key in skipped_keys) with key in frozenset(...) for O(1) lookup. Called once per dict key during base64 conversion.
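Sketch of the pattern (the only key shown is "geojson", which this PR mentions as skipped; the full set lives in `_plotly_utils/utils.py`). Note the frozenset must be built once at module scope — an inline `key in frozenset([...])` would rebuild the set on every call and negate the win:

```python
# Illustrative key set; the real one is defined in _plotly_utils/utils.py.
SKIPPED_KEYS = frozenset({"geojson"})

def is_skipped_key(key):
    # O(1) hash lookup instead of an O(n) scan over a list of keys
    return key in SKIPPED_KEYS
```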

Benchmarks

Measured on an isolated Azure VM to eliminate noise:

  • VM: Azure Standard_D2s_v5 (2 dedicated vCPUs, 8 GB RAM)
  • OS: Ubuntu 24.04 LTS (kernel 6.17.0-1010-azure)
  • Python: 3.12.3
  • Workload: 5 traces × 100K points each (float64 x/y, float64 marker size/color, 100K-element text list per trace)
  • Methodology: 3 warmup iterations, 20 timed iterations, separate clones + venvs for baseline vs optimized
| Metric | Baseline (main) | Optimized | Speedup |
| --- | --- | --- | --- |
| to_typed_array_spec (100K f64) | 1.26 ms (σ=0.02) | 0.82 ms (σ=0.01) | 1.54× |
| convert_to_base64 (5×100K) | 60.89 ms (σ=0.49) | 48.90 ms (σ=0.13) | 1.24× |
| to_dict (full end-to-end) | 205.71 ms (σ=3.90) | 176.51 ms (σ=0.73) | 1.17× (14% faster) |
Reproduce benchmark
"""
Benchmark to_dict() serialization path for plotly.py

Setup:
  git clone --branch main https://github.com/plotly/plotly.py.git plotly-baseline
  git clone --branch perf/to-dict-serialization https://github.com/KRRT7/plotly.py.git plotly-optimized

  python3 -m venv venv-baseline && venv-baseline/bin/pip install numpy pandas
  cd plotly-baseline && ../venv-baseline/bin/pip install -e ".[dev]" && cd ..

  python3 -m venv venv-optimized && venv-optimized/bin/pip install numpy pandas
  cd plotly-optimized && ../venv-optimized/bin/pip install -e ".[dev]" && cd ..

Run:
  cd plotly-baseline  && ../venv-baseline/bin/python  bench.py baseline
  cd plotly-optimized && ../venv-optimized/bin/python bench.py optimized
"""
import sys
import time
import json
import statistics


def setup():
    import numpy as np
    import plotly.graph_objects as go

    np.random.seed(42)
    n = 100_000

    fig = go.Figure()
    for i in range(5):
        fig.add_trace(go.Scatter(
            x=np.random.rand(n),
            y=np.random.rand(n),
            marker=dict(
                size=np.random.rand(n) * 10,
                color=np.random.rand(n),
            ),
            text=[f"point_{j}" for j in range(n)],
        ))
    return fig


def bench_to_dict(fig, warmup=3, iterations=20):
    for _ in range(warmup):
        fig.to_dict()

    times = []
    for _ in range(iterations):
        start = time.perf_counter()
        fig.to_dict()
        elapsed = time.perf_counter() - start
        times.append(elapsed * 1000)

    return {
        "mean_ms": statistics.mean(times),
        "median_ms": statistics.median(times),
        "stdev_ms": statistics.stdev(times),
        "min_ms": min(times),
        "max_ms": max(times),
    }


def bench_convert_to_base64(fig, warmup=3, iterations=20):
    from _plotly_utils.utils import convert_to_base64
    import copy

    data = fig.to_dict()

    for _ in range(warmup):
        d = copy.deepcopy(data)
        convert_to_base64(d)

    times = []
    for _ in range(iterations):
        d = copy.deepcopy(data)
        start = time.perf_counter()
        convert_to_base64(d)
        elapsed = time.perf_counter() - start
        times.append(elapsed * 1000)

    return {
        "mean_ms": statistics.mean(times),
        "median_ms": statistics.median(times),
        "stdev_ms": statistics.stdev(times),
        "min_ms": min(times),
        "max_ms": max(times),
    }


def bench_to_typed_array_spec(warmup=3, iterations=50):
    import numpy as np
    from _plotly_utils.utils import to_typed_array_spec

    np.random.seed(42)
    arr = np.random.rand(100_000).astype("float64")

    for _ in range(warmup):
        to_typed_array_spec(arr)

    times = []
    for _ in range(iterations):
        start = time.perf_counter()
        to_typed_array_spec(arr)
        elapsed = time.perf_counter() - start
        times.append(elapsed * 1000)

    return {
        "mean_ms": statistics.mean(times),
        "median_ms": statistics.median(times),
        "stdev_ms": statistics.stdev(times),
        "min_ms": min(times),
        "max_ms": max(times),
    }


if __name__ == "__main__":
    label = sys.argv[1] if len(sys.argv) > 1 else "unknown"
    fig = setup()

    print("=" * 60)
    print(f"Benchmarking: {label}")
    print("=" * 60)

    print("\n--- to_typed_array_spec (100K float64) ---")
    r = bench_to_typed_array_spec()
    print(f"  mean:   {r['mean_ms']:.2f} ms")
    print(f"  median: {r['median_ms']:.2f} ms")
    print(f"  stdev:  {r['stdev_ms']:.2f} ms")

    print("\n--- convert_to_base64 (5 traces x 100K) ---")
    r2 = bench_convert_to_base64(fig)
    print(f"  mean:   {r2['mean_ms']:.2f} ms")
    print(f"  median: {r2['median_ms']:.2f} ms")
    print(f"  stdev:  {r2['stdev_ms']:.2f} ms")

    print("\n--- to_dict (full, 5 traces x 100K) ---")
    r3 = bench_to_dict(fig)
    print(f"  mean:   {r3['mean_ms']:.2f} ms")
    print(f"  median: {r3['median_ms']:.2f} ms")
    print(f"  stdev:  {r3['stdev_ms']:.2f} ms")

    results = {
        "label": label,
        "to_typed_array_spec": r,
        "convert_to_base64": r2,
        "to_dict": r3,
    }

    fname = f"results_{label}.json"
    with open(fname, "w") as f:
        json.dump(results, f, indent=2)
    print(f"\nResults saved to {fname}")
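A small helper (hypothetical, not part of the script above) can compare the two results_*.json files the script writes:

```python
import json

def speedup(baseline, optimized, key="to_dict"):
    # Ratio of mean times for one section of the results dicts the
    # benchmark script writes: "to_typed_array_spec", "convert_to_base64",
    # or "to_dict".
    return baseline[key]["mean_ms"] / optimized[key]["mean_ms"]

# Usage, after running both benchmarks:
#   with open("results_baseline.json") as f:
#       base = json.load(f)
#   with open("results_optimized.json") as f:
#       opt = json.load(f)
#   print(f"to_dict speedup: {speedup(base, opt):.2f}x")
```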

Testing

  • All 1776 core + utils tests pass (1 pre-existing failure unrelated to changes: missing requests module)
  • Correctness verified across: small/medium/large figures, 2D arrays, int64 downcasting, geojson skipping, original figure not mutated
  • ruff format passes
  • CHANGELOG updated

KRRT7 added 2 commits April 16, 2026 08:12
Three changes to the hot path hit by every fig.show(), write_html(),
to_json(), and write_image() call:

1. to_typed_array_spec: replace copy_to_readonly_numpy_array (which
   copies the array, wraps through narwhals, and sets readonly flag)
   with a lightweight np.asarray — the input is already a deepcopy
   from to_dict(), so copying again is pure waste.

2. convert_to_base64: replace is_homogeneous_array (which checks
   numpy, pandas, narwhals, and __array_interface__) with a direct
   isinstance(value, np.ndarray) check. In the to_dict() context,
   data is already validated and stored as numpy arrays.

3. is_skipped_key: replace list scan with frozenset lookup (O(1)).

Profile results (10 traces × 100K points, 20 calls):
  to_typed_array_spec: 1811ms → 1097ms (40% faster)
  copy_to_readonly_numpy_array: 226ms → 0ms (eliminated)
  narwhals from_native: 68ms → 0ms (eliminated)
  is_skipped_key: 41ms → ~0ms (eliminated)
Contributor

emilykl commented Apr 16, 2026

Hi @KRRT7, thanks for the contribution.

cProfile of to_dict() called 20× on 10 traces × 100K float64 points

What is the total runtime of to_dict() in this profiling test? I want to understand how significant the speedup is in the context of the entire to_dict() call.

KRRT7 added 1 commit

In convert_to_base64, when iterating list/tuple elements, only recurse
into dicts, lists, and tuples. Strings and numbers can never contain
numpy arrays, so recursing into them wastes ~500K function calls on
figures with large text arrays.
Contributor Author

KRRT7 commented Apr 16, 2026

Thanks for looking at this, @emilykl!

I've added isolated VM benchmarks to the PR description. The headline number: to_dict() end-to-end goes from 205.71ms → 176.51ms (14% faster, 1.17×) on a figure with 5 traces × 100K points each.

Measured on a dedicated Azure Standard_D2s_v5 (2 vCPUs, Ubuntu 24.04, Python 3.12.3) with separate clones/venvs for baseline vs optimized. Sub-1ms stdev across 20 iterations. The benchmark script is in the PR body if you'd like to reproduce.

Contributor Author

KRRT7 commented Apr 16, 2026

To give some broader context — we've been profiling plotly.py's hot paths end-to-end and have a few more optimizations in the pipeline (e.g., #5576 for ColorValidator). We wanted to lead with concrete numbers to show we're serious about this.

That said, we'd love your input on where to focus next. You and the team have the best sense of which workflows and codepaths matter most to users in practice. If there are specific bottlenecks you've been wanting to address — or areas where users have reported slowness — we'd be happy to target those. Would rather align with your priorities than optimize in a vacuum.

