priority: prevent init timer restarting when a child transitions from CONNECTING to CONNECTING #8813
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #8813 +/- ##
==========================================
+ Coverage 83.25% 83.32% +0.06%
==========================================
Files 417 417
Lines 32978 32980 +2
==========================================
+ Hits 27457 27479 +22
+ Misses 4106 4090 -16
+ Partials 1415 1411 -4
    select {
    case <-initTimerStarted:
        t.Fatalf("Init timer restarted when subchannel moved from Ready to Idle")
    case <-sCtx.Done():
If `UpdateState` is asynchronous and happens after `sCtx.Done()`, this check can give false positives. This race can happen if the test environment is very slow.
UpdateState is not asynchronous. See here.
This is a pattern that we use very commonly in our tests to check for things that we expect not to happen. This is a tradeoff between reasonable test execution times and correctness/flakiness. I don't recall having any problems with this approach so far though. But if you have an idea for how we could do this better, I'd be happy to hear. Thanks.
        return
    }
    child.state = s
    curState := child.state

        child.stopInitTimer()
    case connectivity.Connecting:
        if !child.reportedTF {
            // Skip restarting the timer if the child was already in Connecting.
Would you mind documenting the whole `if`? Why are we not starting it in the `!reportedTF` case?
Done. It's a bit long now, but I had to read a bunch of stuff to get to the history behind the !reportedTF.
Fixes #8516

This PR ensures that the init timer is not restarted when a child policy transitions from CONNECTING to CONNECTING. As a result, we only ever wait for `DefaultPriorityInitTimeout`, which is set to `10s`, for a given child to become `Ready` before we attempt a failover to the next highest priority child.

RELEASE NOTES:
- xds/priority: Fixed a bug causing delayed failover to lower-priority clusters when a higher-priority cluster is stuck in the CONNECTING state.