Skip to content

chroma: gate candidates on the musicbrainz plugin being enabled#6522

Open
AnonymousAAArdvark wants to merge 5 commits intobeetbox:masterfrom
AnonymousAAArdvark:fix/chroma-gate-musicbrainz-6212
Open

chroma: gate candidates on the musicbrainz plugin being enabled#6522
AnonymousAAArdvark wants to merge 5 commits intobeetbox:masterfrom
AnonymousAAArdvark:fix/chroma-gate-musicbrainz-6212

Conversation

@AnonymousAAArdvark
Copy link
Copy Markdown

Summary

Fixes #6212. The chroma plugin used Acoustid fingerprinting to look
up MusicBrainz release and recording IDs, then unconditionally resolved
those IDs via MusicBrainz — even when the user had not enabled the
musicbrainz plugin. The result: users with (e.g.) only
chroma + spotify enabled would still see MusicBrainz-sourced
album and track candidates appear in the autotagger, defeating their
choice to opt out of MusicBrainz as a metadata source.

This PR replaces the direct MusicBrainzPlugin() instantiation with a
lookup through metadata_plugins.get_metadata_source(\"musicbrainz\").
When the musicbrainz plugin is not loaded, both candidates and
item_candidates short-circuit and return empty. Acoustid fingerprinting
itself is unaffected — the acoustid_id and acoustid_fingerprint
item fields are still populated exactly as before.

Note that the original issue thread mentions only the album-level
candidates path, but the same bug exists in item_candidates (the
singleton/track path). Both are fixed.

Bonus fix

The previous cached_property mb = MusicBrainzPlugin() pattern created a
second MusicBrainzPlugin instance outside the plugin registry. This
silently bypassed beetsplug/mbpseudo.py, which registers a replacement
instance for the musicbrainz plugin — meaning chroma-triggered lookups were
never routed through the pseudo-release-aware code. Routing through
get_metadata_source fixes this as a side effect: chroma now uses the
same shared instance as the rest of beets.

Alternative considered

The issue reporter's maintainer response suggested a richer approach:
extract cross-reference IDs (Spotify, Discogs, etc.) from the MusicBrainz
release response and route them to whichever metadata plugins are
enabled. That's a good direction, but (a) it only applies to
candidates — MusicBrainz recording responses don't carry cross-source
track IDs, so item_candidates needs the gating fix regardless; and
(b) it would still want get_metadata_source as its foundation.
This PR is intentionally scoped as the minimal gating fix; I'd be happy
to follow up with the cross-reference routing as a separate PR if the
maintainers want it.

Test plan

  • New unit tests in test/plugins/test_chroma.py covering all four
    cross-product cases: candidates / item_candidates × with /
    without the musicbrainz plugin loaded. The without musicbrainz
    cases directly pin the issue-6212 regression.
  • Existing chroma tests (chromasearch_*) still pass.
  • Full focused regression subset passes (144 tests across
    test/plugins/test_chroma.py, test/plugins/test_musicbrainz.py,
    test/test_metadata_plugins.py, test/autotag/test_match.py,
    test/autotag/test_distance.py).
  • ruff check + ruff format --check clean on changed files.
  • mypy beetsplug/chroma.py clean.
  • sphinx-lint docs/plugins/chroma.rst docs/changelog.rst clean.
  • Changelog entry added under Unreleased → Bug fixes.
  • Note added to docs/plugins/chroma.rst explaining that chroma
    requires musicbrainz to produce candidates.

Disclosure: this contribution was made while working on an open-source
assignment for EECS 481 (Software Engineering) at the University of
Michigan.

The chroma plugin uses Acoustid fingerprinting, which returns MusicBrainz
release and recording IDs. It then unconditionally resolved those IDs
through MusicBrainz even when the user had not enabled the musicbrainz
plugin, so MusicBrainz-sourced candidates still appeared during tagging
regardless of the user's plugin configuration.

Instead of instantiating ``MusicBrainzPlugin`` directly as a
``cached_property``, look up the loaded instance through
``metadata_plugins.get_metadata_source("musicbrainz")``. When that
returns ``None`` (the musicbrainz plugin is not loaded), short-circuit
both ``candidates`` and ``item_candidates`` to return empty so no
MusicBrainz-sourced candidates are produced. Acoustid fingerprinting and
the ``acoustid_id`` / ``acoustid_fingerprint`` item fields are
unaffected.

This also removes a latent bug: the previous direct instantiation
bypassed the plugin registry, so any plugin that swapped the musicbrainz
plugin at runtime (for example, ``mbpseudo``) was silently ignored on
the chroma path.

Fixes beetbox#6212
@AnonymousAAArdvark AnonymousAAArdvark requested a review from a team as a code owner April 10, 2026 23:26
@github-actions github-actions bot added chroma chroma plugin musicbrainz musicbrainz plugin labels Apr 10, 2026
@AnonymousAAArdvark
Copy link
Copy Markdown
Author

AnonymousAAArdvark commented Apr 10, 2026

Quick note on the CI state:

Check docs: real failure docstrfmt --preserve-adornments wanted my changelog entry wrapped slightly differently than I had it. Fixed in 14b26f0.
Run tests (windows-latest, 3.10–3.13): these failed at the choco install mp3gain step with an HTTP 499 from community.chocolatey.org/api/v2/ (example job log). Looks like a transient Chocolatey infrastructure hiccup.

@semohr semohr self-assigned this Apr 11, 2026
@amogus07
Copy link
Copy Markdown
Contributor

Bonus fix

The previous cached_property mb = MusicBrainzPlugin() pattern created a second MusicBrainzPlugin instance outside the plugin registry. This silently bypassed beetsplug/mbpseudo.py, which registers a replacement instance for the musicbrainz plugin — meaning chroma-triggered lookups were never routed through the pseudo-release-aware code. Routing through get_metadata_source fixes this as a side effect: chroma now uses the same shared instance as the rest of beets.

would this fix #6441?

@semohr
Copy link
Copy Markdown
Contributor

semohr commented Apr 11, 2026

As it is yes it would fix the issue with mbpseudo.

As I said in the other comments tho it feels a bit like a hacky and lazy way out. Musibrainz does give identifiers for other metadata sources which we could use to resolve to the proper (enabled) metadata plugin.

But maybe these are two different issues to tackle.

…rces

In response to review feedback on beetbox#6212, rework the fix so chroma does
not simply gate on the musicbrainz plugin being enabled. Instead:

1. Chroma always queries MusicBrainz to resolve acoustid release IDs
   (since that is the only identifier acoustid returns), preferring
   the loaded ``musicbrainz`` plugin instance from the metadata-source
   registry and falling back to a private, unregistered instance when
   the user has not enabled ``musicbrainz``.

2. The new ``extra_external_sources`` keyword argument on
   ``MusicBrainzPlugin.album_for_id`` / ``album_info`` lets chroma
   force-extract cross-reference IDs from ``url-relations`` for
   exactly the external sources whose metadata-source plugins are
   loaded (Discogs, Bandcamp, Spotify, Deezer, Tidal), without
   requiring the user to globally opt into ``musicbrainz.external_ids``.

3. For each resolved MusicBrainz ``AlbumInfo``, chroma yields the MB
   candidate (when the musicbrainz plugin is loaded) and dispatches
   to each loaded external metadata plugin via its ``album_for_id``
   when the corresponding ``{source}_album_id`` attribute is set on
   the ``AlbumInfo``. A user running ``chroma`` with ``spotify`` but
   without ``musicbrainz`` now receives Spotify-sourced candidates
   from acoustid matches instead of nothing.

``item_candidates`` (the singleton track path) still requires the
musicbrainz plugin because MusicBrainz recording responses do not
carry cross-source track identifiers; this is documented in
``docs/plugins/chroma.rst``.

Tests updated to cover all four cross-product cases:
no-compatible-sources, musicbrainz-only, external-only, and
musicbrainz+external.
@AnonymousAAArdvark
Copy link
Copy Markdown
Author

Thanks for the review @semohr, I've reworked the PR to do the cross-reference routing instead of gating. Latest commit bc956db:

What chroma does now:

  1. Always queries MusicBrainz to resolve the acoustid release ID (prefers the loaded musicbrainz plugin from the registry so mbpseudo still takes effect; falls back to a private, transient MusicBrainzPlugin instance when the user hasn't enabled musicbrainz).
  2. Passes a new extra_external_sources keyword argument to MusicBrainzPlugin.album_for_id / album_info so chroma can force-extract cross-reference IDs for exactly the external sources whose metadata plugins are loaded — without asking the user to globally opt into musicbrainz.external_ids.
  3. For each resolved MB AlbumInfo, yields the MB candidate (only when the musicbrainz plugin itself is loaded) and dispatches to each loaded external metadata plugin via album_for_id(info.{source}_album_id) for the Discogs / Bandcamp / Spotify / Deezer / Tidal cross-references in url-relations.

So:

  • chroma + spotify (no musicbrainz) → Spotify candidates for releases MB has a Spotify url-relation for. Addresses your AC/DC example directly.
  • chroma + musicbrainz → MB candidates, unchanged from today.
  • chroma + musicbrainz + spotify → both kinds of candidates.
  • chroma alone (no MB, no cross-ref targets) → empty, no network call made.

Test coverage is in test/plugins/test_chroma.py. Four classes covering all four cross-product cases (no compatible sources / MB only / external only / both), plus asserts that extra_external_sources is passed correctly and that external plugins are only called when their cross-ref ID is present on the MB AlbumInfo. Full suite (2234 tests) still passes locally; ruff / docstrfmt / sphinx-lint clean.

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 11, 2026

Codecov Report

❌ Patch coverage is 78.68852% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.80%. Comparing base (72b1118) to head (f0934e2).
⚠️ Report is 32 commits behind head on master.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
beetsplug/chroma.py 77.58% 10 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6522      +/-   ##
==========================================
+ Coverage   70.55%   70.80%   +0.25%     
==========================================
  Files         148      150       +2     
  Lines       18820    18965     +145     
  Branches     3066     3090      +24     
==========================================
+ Hits        13279    13429     +150     
+ Misses       4894     4877      -17     
- Partials      647      659      +12     
Files with missing lines Coverage Δ
beetsplug/mbpseudo.py 79.87% <100.00%> (ø)
beetsplug/chroma.py 49.83% <77.58%> (+13.29%) ⬆️

... and 15 files with indirect coverage changes

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

The previous commit added a new keyword-only ``extra_external_sources``
parameter to ``MusicBrainzPlugin.album_info`` so :doc:`plugins/chroma`
can force extraction of cross-reference IDs for enabled metadata
source plugins. ``MusicBrainzPseudoReleasePlugin.album_info`` overrides
that method and needs the matching signature (mypy --override) and
should forward the flag through both its own ``super().album_info``
calls so the cross-reference routing also works on releases that get
redirected to a pseudo release.
@semohr
Copy link
Copy Markdown
Contributor

semohr commented Apr 11, 2026

Neat, this is more or less how I imagined it. Thanks for the initiative 🙏

As the changes touch musicbrainz now I would like to ping @snejus. Adding a additional argument to album_for_id seems like a bad idea to me. Any ideas for a less intrusive workaround?

Im gona review this properly once we are okay with the high level decicion on how to integrate musicbrainz.

@cached_property
def mb(self) -> MusicBrainzPlugin:
return MusicBrainzPlugin()
def _musicbrainz_client(self) -> MusicBrainzPlugin:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make this a cached property. Should simplify this slightly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

chroma chroma plugin musicbrainz musicbrainz plugin

Projects

None yet

Development

Successfully merging this pull request may close these issues.

musicbrainz metadata source is used when the musicbrainz plugin is not activated

3 participants