chroma: gate candidates on the musicbrainz plugin being enabled#6522
chroma: gate candidates on the musicbrainz plugin being enabled#6522AnonymousAAArdvark wants to merge 5 commits intobeetbox:masterfrom
Conversation
The chroma plugin uses Acoustid fingerprinting, which returns MusicBrainz
release and recording IDs. It then unconditionally resolved those IDs
through MusicBrainz even when the user had not enabled the musicbrainz
plugin, so MusicBrainz-sourced candidates still appeared during tagging
regardless of the user's plugin configuration.
Instead of instantiating ``MusicBrainzPlugin`` directly as a
``cached_property``, look up the loaded instance through
``metadata_plugins.get_metadata_source("musicbrainz")``. When that
returns ``None`` (the musicbrainz plugin is not loaded), short-circuit
both ``candidates`` and ``item_candidates`` to return empty so no
MusicBrainz-sourced candidates are produced. Acoustid fingerprinting and
the ``acoustid_id`` / ``acoustid_fingerprint`` item fields are
unaffected.
This also removes a latent bug: the previous direct instantiation
bypassed the plugin registry, so any plugin that swapped the musicbrainz
plugin at runtime (for example, ``mbpseudo``) was silently ignored on
the chroma path.
Fixes beetbox#6212
|
Quick note on the CI state: Check docs: real failure docstrfmt --preserve-adornments wanted my changelog entry wrapped slightly differently than I had it. Fixed in 14b26f0. |
would this fix #6441? |
|
As it is yes it would fix the issue with mbpseudo. As I said in the other comments tho it feels a bit like a hacky and lazy way out. Musibrainz does give identifiers for other metadata sources which we could use to resolve to the proper (enabled) metadata plugin. But maybe these are two different issues to tackle. |
…rces In response to review feedback on beetbox#6212, rework the fix so chroma does not simply gate on the musicbrainz plugin being enabled. Instead: 1. Chroma always queries MusicBrainz to resolve acoustid release IDs (since that is the only identifier acoustid returns), preferring the loaded ``musicbrainz`` plugin instance from the metadata-source registry and falling back to a private, unregistered instance when the user has not enabled ``musicbrainz``. 2. The new ``extra_external_sources`` keyword argument on ``MusicBrainzPlugin.album_for_id`` / ``album_info`` lets chroma force-extract cross-reference IDs from ``url-relations`` for exactly the external sources whose metadata-source plugins are loaded (Discogs, Bandcamp, Spotify, Deezer, Tidal), without requiring the user to globally opt into ``musicbrainz.external_ids``. 3. For each resolved MusicBrainz ``AlbumInfo``, chroma yields the MB candidate (when the musicbrainz plugin is loaded) and dispatches to each loaded external metadata plugin via its ``album_for_id`` when the corresponding ``{source}_album_id`` attribute is set on the ``AlbumInfo``. A user running ``chroma`` with ``spotify`` but without ``musicbrainz`` now receives Spotify-sourced candidates from acoustid matches instead of nothing. ``item_candidates`` (the singleton track path) still requires the musicbrainz plugin because MusicBrainz recording responses do not carry cross-source track identifiers; this is documented in ``docs/plugins/chroma.rst``. Tests updated to cover all four cross-product cases: no-compatible-sources, musicbrainz-only, external-only, and musicbrainz+external.
|
Thanks for the review @semohr, I've reworked the PR to do the cross-reference routing instead of gating. Latest commit bc956db: What chroma does now:
So:
Test coverage is in test/plugins/test_chroma.py. Four classes covering all four cross-product cases (no compatible sources / MB only / external only / both), plus asserts that extra_external_sources is passed correctly and that external plugins are only called when their cross-ref ID is present on the MB AlbumInfo. Full suite (2234 tests) still passes locally; ruff / docstrfmt / sphinx-lint clean. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #6522 +/- ##
==========================================
+ Coverage 70.55% 70.80% +0.25%
==========================================
Files 148 150 +2
Lines 18820 18965 +145
Branches 3066 3090 +24
==========================================
+ Hits 13279 13429 +150
+ Misses 4894 4877 -17
- Partials 647 659 +12
🚀 New features to boost your workflow:
|
The previous commit added a new keyword-only ``extra_external_sources`` parameter to ``MusicBrainzPlugin.album_info`` so :doc:`plugins/chroma` can force extraction of cross-reference IDs for enabled metadata source plugins. ``MusicBrainzPseudoReleasePlugin.album_info`` overrides that method and needs the matching signature (mypy --override) and should forward the flag through both its own ``super().album_info`` calls so the cross-reference routing also works on releases that get redirected to a pseudo release.
|
Neat, this is more or less how I imagined it. Thanks for the initiative 🙏 As the changes touch musicbrainz now I would like to ping @snejus. Adding a additional argument to Im gona review this properly once we are okay with the high level decicion on how to integrate musicbrainz. |
| @cached_property | ||
| def mb(self) -> MusicBrainzPlugin: | ||
| return MusicBrainzPlugin() | ||
| def _musicbrainz_client(self) -> MusicBrainzPlugin: |
There was a problem hiding this comment.
We should make this a cached property. Should simplify this slightly.
Summary
Fixes #6212. The
chromaplugin used Acoustid fingerprinting to lookup MusicBrainz release and recording IDs, then unconditionally resolved
those IDs via MusicBrainz — even when the user had not enabled the
musicbrainzplugin. The result: users with (e.g.) onlychroma+spotifyenabled would still see MusicBrainz-sourcedalbum and track candidates appear in the autotagger, defeating their
choice to opt out of MusicBrainz as a metadata source.
This PR replaces the direct
MusicBrainzPlugin()instantiation with alookup through
metadata_plugins.get_metadata_source(\"musicbrainz\").When the musicbrainz plugin is not loaded, both
candidatesanditem_candidatesshort-circuit and return empty. Acoustid fingerprintingitself is unaffected — the
acoustid_idandacoustid_fingerprintitem fields are still populated exactly as before.
Note that the original issue thread mentions only the album-level
candidatespath, but the same bug exists initem_candidates(thesingleton/track path). Both are fixed.
Bonus fix
The previous
cached_property mb = MusicBrainzPlugin()pattern created asecond
MusicBrainzPlugininstance outside the plugin registry. Thissilently bypassed
beetsplug/mbpseudo.py, which registers a replacementinstance for the musicbrainz plugin — meaning chroma-triggered lookups were
never routed through the pseudo-release-aware code. Routing through
get_metadata_sourcefixes this as a side effect: chroma now uses thesame shared instance as the rest of beets.
Alternative considered
The issue reporter's maintainer response suggested a richer approach:
extract cross-reference IDs (Spotify, Discogs, etc.) from the MusicBrainz
release response and route them to whichever metadata plugins are
enabled. That's a good direction, but (a) it only applies to
candidates— MusicBrainz recording responses don't carry cross-sourcetrack IDs, so
item_candidatesneeds the gating fix regardless; and(b) it would still want
get_metadata_sourceas its foundation.This PR is intentionally scoped as the minimal gating fix; I'd be happy
to follow up with the cross-reference routing as a separate PR if the
maintainers want it.
Test plan
test/plugins/test_chroma.pycovering all fourcross-product cases:
candidates/item_candidates× with /without the musicbrainz plugin loaded. The
without musicbrainzcases directly pin the issue-6212 regression.
chromasearch_*) still pass.test/plugins/test_chroma.py,test/plugins/test_musicbrainz.py,test/test_metadata_plugins.py,test/autotag/test_match.py,test/autotag/test_distance.py).ruff check+ruff format --checkclean on changed files.mypy beetsplug/chroma.pyclean.sphinx-lint docs/plugins/chroma.rst docs/changelog.rstclean.docs/plugins/chroma.rstexplaining that chromarequires musicbrainz to produce candidates.
Disclosure: this contribution was made while working on an open-source
assignment for EECS 481 (Software Engineering) at the University of
Michigan.