Conversation
Separated out of the larger PR around SchemaDesigner.
VishnuPriyaChandraSekar
left a comment
There was a problem hiding this comment.
I have just started reviewing the Solr code base.
I just left a minor comment other than that, PR seems to be fine.
gerlowskija
left a comment
There was a problem hiding this comment.
This looks pretty good overall. I left some inline comments, mostly small things.
I do have one larger question here though: I gather much of this functionality already existed under the aegis of the SchemaDesigner, but I don't see any deletions or modifications to those files in this PR. Shouldn't this PR be deleting the "download-configset" and "get-configset-file" APIs that exist over in schema-designer land?
| @Path("/configsets/{configSetName}") | ||
| interface Download { | ||
| @GET | ||
| @Path("/download") |
There was a problem hiding this comment.
[-1] /download isn't very REST-ful. The fact that we're fetching/downloading the resource is already kindof implied by the HTTP 'GET' verb/method.
Could we drop it and make this "just" GET /api/configsets/{configsetName}?
There was a problem hiding this comment.
so we could, and I was thinking about it and then asked Claude for soem suggestions, and he kind of came back to "/download" is easy. Here is the write up:
RESTful Alternatives
Option 1: Keep current design ✅ (Recommended)
GET /api/configsets → List (JSON)
GET /api/configsets/{name}/download → Download ZIP
GET /api/configsets/{name}/{filePath} → Get single file
Pros:
- Explicit and unambiguous
- Self-documenting (
/downloadtells you what it does) - No conflicts with file paths
- Common pattern (GitHub uses
/archive, GitLab uses/repository/archive, etc.)
Cons:
- Not purely resource-oriented REST
Option 2: Use /files sub-resource
GET /api/configsets/{name} → Metadata (JSON) - NEW
GET /api/configsets/{name}/files → Download all as ZIP
GET /api/configsets/{name}/files/{path} → Get single file
Pros:
- More RESTful hierarchy
- Room for metadata endpoint
Cons:
- Breaking change to existing API
- More complex paths
Option 3: Content negotiation (Pure REST)
GET /api/configsets/{name}
Accept: application/json → Metadata
Accept: application/zip → ZIP download
Pros:
- True REST content negotiation
Cons:
- Solr doesn't typically use
Acceptheaders this way - Harder to use with curl/browsers
- Still conflicts with
/{filePath}pattern
Recommendation
Keep the current design (/download) because:
- ✅ Unambiguous - no path conflicts
- ✅ Self-documenting - clear what you get back
- ✅ Pragmatic - works well without complex content negotiation
- ✅ Common pattern - many APIs use action-style endpoints for different representations
- ✅ Backward compatible - if we add metadata later, we can use
GET /api/configsets/{name}for JSON
There was a problem hiding this comment.
If we had another places that we did the zip download, then I could see making them all work the same, with whatever pattern we picked...
There was a problem hiding this comment.
I was thinking about it and then asked Claude for soem suggestions, and he kind of came back to
I don't mean to be glib, but I don't really care what Claude thinks - I care what you think! Maybe by quoting Claude you're implying you agree? But I'd love to hear which option you think is better, and why!
If we had another places that we did the zip download, then I could see making them all work the same
Idk about ZIP downloads, that's admittedly a little more niche 🤷
What is clear to me is that we have a bunch of APIs that let you "download some thing" or "download some part of some thing", and none of those APIs take the /download approach suggested by your LLM.
- Download a file from the filestore:
GET /api/cluster/filestore/files/<path> - Download an index file from leader:
GET /api/cores/coreName/replication/files/<path> - Fetch the data in a ZooKeeper node:
GET /api/cluster/zookeeper/data/<path> - etc.
We've got an established pattern here: rely on the HTTP verb to imply "download", and offer a "files/" sub-path as a way to support individual file download. If we want to deviate from that there should be a pretty compelling reason.
And while I hear Claude's point that we may want to reserve /api/configsets/<name> for some future metadata at some point...that doesn't feel very compelling to me. "configset metadata" isn't a thing in Solr today. And as Claude pointed out there are options (e.g. Accept header) for adding that pretty seamlessly down the road.
To be transparent, if you personally feel strongly about /download I'm not gonna veto the PR or anything. But Claude's argument for /download doesn't make a ton of sense to me, and I think we should be intentional about these endpoints.
@VishnuPriyaChandraSekar by the way, we need more code reviewers for the PR's in Solr. With AI, it's easier then ever to generate code, but reviewing it really isn't any easier. It would be great if you reviewed more PRs, as that would help contributors get some early feedback! I believe you can use the Code Review feature and approve a PR, even if you aren't a committer, you just can't merge. |
…the cloud version does. Don't love that I'm improving standalone hwen I want to eliminate it.
Use a binary format, don't try and provide path metadata...
|
Lots of progress, I think I am down to doc fixes. |
|
Working through the docs.. I am going to split uploading a complete configset from the putting of a single file in an existing configset. Right now the methods are on the same upload interface, and it's a bit confusing. |
…f classes Keeps the overall pattern, but simplifies docs by having it's own docs.
… in SOLR-17584 that are still around. Simplifies the tests.
|
@gerlowskija I couldn't help it, I also found some remnants of the trusted configset concept that i could remove in this... Simplified osme of the tests! I think I'm ready for final review! |
|
Test all pass locally. |
shakes fist at cloud I think you know what I think about bundling that into this PR 😛 You definitely get a "pass" though IMO since this PR itself was split off from a different one haha To re-state my concern/issue for others though: I get that "this should be a separate PR" is really whiny and really pedantic. But in a post-AI world where it's easier than ever to write PRs (but no easier to review them) it's worth doing whatever we can to make the review side of things easier. Including being a little more militant about this PR-scope-creep stuff. It's not just a bit of change to the diff - now anyone that wants to do a thorough review of this PR has to understand all of the context around trusted configsets, why we removed them recently, and whether there was a reason this code wasn't included in that removal, etc. (To save others the work: SOLR-17584 removed "trusted configsets", and it was intended to get all the logic, so we're 👍 to remove any vestigial bits here.) I'll get off my soapbox, and I won't ask that you break things up at this point. Mainly hoping I can win hearts and minds on this for the future 😛 |
gerlowskija
left a comment
There was a problem hiding this comment.
Definitely moving in the right direction, but still a few things that should be tidied up before committing. Hopefully all minor stuff.
One larger PR-level question I'm still hoping for clarification on though: I get the impression that the "configset download" capability in this PR is migrated from an existing schema-designer API. But I don't see any deletions or removals over in the schema-designer code. Am I misunderstanding things here? Or am I correct about the duplication, but it's something you're doing intentionally at this point? Could just use a little more context on that...
| @Path("/configsets/{configSetName}") | ||
| interface PutFile { | ||
| @PUT | ||
| @Path("{filePath:.+}") |
There was a problem hiding this comment.
[-0] We should probably add /files to the path here to align with what we now have for getConfigSetFile above.
| } | ||
|
|
||
| if (overwriteOnExists || !Files.exists(configsetFilePath)) { | ||
| // Create parent directories if they don't exist (similar to ZK's makePath) |
There was a problem hiding this comment.
[Q] Why is this code here?
This feels kindof suspect at a glance, from a security perspective. The risk is a bit lower in ZK, but since we're on the filesystem here we probably shouldn't blindly be creating directories based on a user-specified parameter. If we're 100% certain that there's no path-escape or any other funny-business possible here, maybe this is fine. But I'd think this through really carefully if you haven't already.
| * @param configSetId the internal configset name to download (may differ from displayName, e.g. | ||
| * for schema-designer's mutable copies) | ||
| * @param displayName the user-visible name used to derive the download filename | ||
| * @param configSetId the internal configset name to download |
There was a problem hiding this comment.
[0] "internal" probably isn't needed here anymore.
| Boolean cleanup, | ||
| InputStream requestBody) | ||
| throws IOException { | ||
| public SolrJerseyResponse putConfigSetFile( |
There was a problem hiding this comment.
[0] I liked the old method name better, fwiw. uploadConfigSetFile is clearer than putConfigSetFile
| SolrJerseyResponse putConfigSetFile( | ||
| @PathParam("configSetName") String configSetName, | ||
| @PathParam("filePath") String filePath, | ||
| @QueryParam("overwrite") Boolean overwrite, |
There was a problem hiding this comment.
[Q] Can you please add some context on why these boolean flags are going away?
'Overwrite' in particular seems like a nice safety check that I imagine users would like to use.
| assertEquals(filePath, response.path); | ||
| assertEquals("", response.content); | ||
| // Verify PNG signature is intact | ||
| assertEquals((byte) 0x89, responseBytes[0]); |
There was a problem hiding this comment.
[-0] We already did an 'assertArrayEquals' above to validate that we received exactly what we expected...what do these assertions provide in addition to that?
| final var api = new UploadConfigSet(mockCoreContainer, null, null); | ||
| final var response = api.putConfigSetFile(configSetName, filePath, fileStream); | ||
|
|
||
| assertNotNull(response); |
There was a problem hiding this comment.
[0] IMO these not-null checks don't provide a ton of value. If you want to validate the response, look at the responseHeader 'status' field, or make sure that the 'error' property is null/empty, or...
| } | ||
|
|
||
| /** Creates a configset with files on disk for testing overwrites. */ | ||
| private void createExistingConfigSet(String configSetName, String... filePathAndContent) |
There was a problem hiding this comment.
[-1] Nothing ever calls this.
| // Test with empty string | ||
| final var ex = | ||
| assertThrows( | ||
| SolrException.class, () -> api.putConfigSetFile(configSetName, "", fileStream)); |
There was a problem hiding this comment.
[0] Are we pre-creating 'configSetName' as a configset in some way? (Or is ConfigSetService mocked in some way such that it always exists?)
If not, shouldn't the API be returning a 404 or something bc the configset doesn't exist?
|
|
||
| === API Changes | ||
|
|
||
| Solr 10.1 changes the V1 API signature for uploading a single file to remove the overwrite parameter. Previously the default for `overwrite` is false when using the v1 API, but true when using the v2 API. |
There was a problem hiding this comment.
[-0] I'm not sure I understand what this means. Does "API signature" here mean the method-signature in the Java code? Or are saying that the v1 API no longer supports this param at all?
I must've missed it earlier on in my review, but if we're removing support for a parameter from v1 can you share a bit more context on why that was done?
https://issues.apache.org/jira/browse/SOLR-15701
Description
Add Download and GetFile and PutFile to ConfigSets API and SolrJ.
Solution
This was extracted with copilots help from the SchemaDesigner code base. We are going to merge this as part of ConfigSets API. In a future PR, once this is done, we'll finish the currently started SchemaDesigner work to use these APIs.
The PutFile was broken out of the Upload feature into it's own files with it's own parameters. Before it was inter twingled with Upload of a zipped configset.
Tests
Added some tests.