Uncordon the node during failed updates
Today we cordon the node before we write updates to it. This means that if a file write fails (e.g. a directory cannot be created), we fail the update but the node stays cordoned. This causes a deadlock, because the node annotation for the desired config is no longer updated.

With the rollback added, deleting the erroneous machineconfig in question lets us auto-recover from failed writes, as we already do for failed reconciliation.

The side effect is that the node will flip between Ready and Ready,Unschedulable, since each node event we receive triggers another update attempt that goes through the full process.

Signed-off-by: Yu Qi Zhang <[email protected]>
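The rollback described above could be sketched roughly as follows. This is a minimal illustration and not the code from this commit: the `Node` type, `applyUpdate`, and `errSimulatedWrite` are hypothetical names invented for the example, and the cordon is modelled as a plain boolean rather than a call to the cluster API.

```go
package main

import (
	"errors"
	"fmt"
)

// Node is a hypothetical stand-in for a cluster node; only the
// scheduling flag matters for this sketch.
type Node struct {
	Name          string
	Unschedulable bool
}

// errSimulatedWrite mimics a file write failure such as a directory
// that could not be created.
var errSimulatedWrite = errors.New("failed to create directory")

// applyUpdate cordons the node, runs the write step, and uncordons the
// node again if that step fails, so a failed write does not leave the
// node stuck as unschedulable.
func applyUpdate(n *Node, write func() error) (err error) {
	n.Unschedulable = true // cordon before touching the node
	defer func() {
		if err != nil {
			n.Unschedulable = false // roll back the cordon on failure
		}
	}()
	if err = write(); err != nil {
		return fmt.Errorf("update of %s failed: %w", n.Name, err)
	}
	// Assumption of this sketch: on success the cordon is left in place
	// for whatever later step completes the update.
	return nil
}

func main() {
	n := &Node{Name: "worker-0"}
	err := applyUpdate(n, func() error { return errSimulatedWrite })
	fmt.Println("update failed:", err != nil, "still cordoned:", n.Unschedulable)
}
```

Because every retry runs the same cordon, write, and uncordon sequence, a node whose write keeps failing would alternate between the two scheduling states on each attempt, which matches the flipping behaviour noted in the message.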