Draft
Conversation
…r good; needs more auditing, and all the debugger/writeline/assert stuff removed
badrishc
reviewed
Apr 11, 2026
| /// (2) in Metadata, to track per-thread epoch table entries for each LightEpoch instance. | ||
| /// </summary> | ||
| [StructLayout(LayoutKind.Explicit, Size = MaxInstances * sizeof(int))] | ||
| [InlineArray(MaxInstances)] |
Collaborator
There was a problem hiding this comment.
If this is not proven to be necessary, we can retain the prior version. I initially did InlineArray but noted some inefficiencies which led to the current struct design.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Four separate fixes here, aimed at improving Vector Set migration test reliability.
1. Async all migration tasks
Rarely (but not so rarely it doesn't happen in CIs) migrations can deadlock on this
.Result.I suspect, but cannot prove, this is because we block an IOCP thread that would serve the read side of the request (where
MIGRATE ...would be executed) on the destination node. If I'm right, this means we only suffer this in tests where we host all the Garnet nodes in the same process.The fix is to punch async/await all the way down so we don't have to
.Resultthere.2. Handle
TransmitSlotfailuresThere are a number of places where we ignored the result of a
Task<bool>orTask<bool>[]in migrations. This is unlikely to cause test flakiness directly (they'd still fail if the task failed) but made debugging very painful.3. Pulled in a separate fix to
SessionParseStateI never this directly cause an issue, but it's a issue reported separately in the same general area of migration/replication.
4. Removed pointers from
LightEpoch.InstanceIndexBufferIf you'd asked me last week if the code in
InstanceIndexBuffer.GetRefis safe I'd have said yes. I would expect[ThreadStatic]structto be implicitly pinned. But, rarely, I have seen some evidence that it was relocated. Nothing definitive but out of an abundance of caution I've redone it in equivalent but safe-ish code.Safe-ish because I'm still doing a
Unsafe.Add, but we never have a pointer in a local so any movement betweenref field0andUnsafe.Addwill be correctly updated by the GC.TODOs
Other thoughts
Because it's Friday, writing this down for Monday:
CanOperateOnKeyandWaitForSlotToStabalizeare also risky