Refactor RetryStrategy to be async#671
Conversation
This refactors retry strategies to be async. This is needed for when strategies need to internally wait or if they need to protect access to a shared resource.
ubaskota
left a comment
There was a problem hiding this comment.
Thanks for the PR. It looks good overall. I have some clarifying questions inline where I could add them, plus two more below that didn't fit in any specific part:
-
The PR description mentions long-poll operations as motivation, and notes that a strategy "may need to wait internally." The current implementations only do computation (backoff math, quota checks) and no I/O operations. What does "waiting internally" mean concretely here?
-
Should
designs/retries.mdfile be updated with the change? Adding it here since GitHub doesn't let me comment on unchanged files.
|
|
||
| @lru_cache | ||
| def _create_retry_strategy( | ||
| self, retry_mode: RetryStrategyType, max_attempts: int | None |
There was a problem hiding this comment.
_create_retry_strategy is synchronous with @lru_cache. Is this intentional? It's just creating and returning retry strategy today, but do you see a case where this would need to be async?
| """ | ||
| self._max_capacity = initial_capacity | ||
| self._available_capacity = initial_capacity | ||
| self._lock = threading.Lock() |
There was a problem hiding this comment.
Is leaving threading.Lock intentional? If so, do you think it's worth adding a short comment explaining the choice? And do you see a scenario where we may need to make this async?
This refactors RetryStrategy to be async. This is necessary because a strategy may need to use a synchronization primitive or may need to internally wait.
For example, AWS retry strategies may need to wait internally even if a retry would not be permitted if the operation is a long-poll operation.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.