Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retry direct unix package calls if observing EINTR #4637

Merged
merged 1 commit into from
Feb 25, 2025

Conversation

evanphx
Copy link
Contributor

@evanphx evanphx commented Feb 20, 2025

It appears this similar pattern is done elsewhere in libcontainer. Unsure if it needs to be do everywhere unix. is used directly or not.

This was observed during a buildkit run, with this output:

runc run failed: unable to start container process: error during container init: reading from parent failed: fetch packet length from socket: interrupted system call

@evanphx evanphx force-pushed the evanphx/b-eintr branch 3 times, most recently from 5cce86e to ce293b9 Compare February 21, 2025 04:18
@rata
Copy link
Member

rata commented Feb 21, 2025

@evanphx thanks for the PR to fix this! So, you saw this with runc 1.2, right? I guess we will need a backport also, in that case.

rata
rata previously approved these changes Feb 21, 2025
Copy link
Member

@rata rata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@evanphx can you also open another PR (or add it here, as you prefer) changing other occurences?

In utils/cmsg.go I see unhandled cases for unix.Recvmsg() and unix.Sendmsg(). The manpage says they can return EINTR, and as we are using those through the unix package (the same applies to the syscall package), we need to handle that.

@rata rata added the backport/1.2-todo A PR in main branch which needs to be backported to release-1.2 label Feb 21, 2025
@rata rata self-requested a review February 21, 2025 14:45
@rata rata dismissed their stale review February 21, 2025 14:46

I'll review again when the code is slightly simpler

@evanphx
Copy link
Contributor Author

evanphx commented Feb 21, 2025

@rata I can go through and add the other checks as well, no trouble.

@evanphx evanphx force-pushed the evanphx/b-eintr branch 2 times, most recently from adeef38 to 69f5417 Compare February 21, 2025 19:13
@evanphx
Copy link
Contributor Author

evanphx commented Feb 21, 2025

@rata There ya go, let me know how this looks! I added the guard to a call to unix.Read as the man page also says it returns EINTR.

Copy link
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@evanphx it looks like you forgot to update the commit subject when adding more cases.

Otherwise LGTM.

Copy link
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also,

  1. You can compare err directly here (as it's done in other places) but will need to add //nolint:errorlint // unix errors are bare annotation). I ever so slightly prefer a direct comparison as it should be faster, but I'm fine either way.

  2. While at it, we can take a look at proper error wrapping, too. It seems that in a few cases we may benefit in wrapping an error in an os.SyscallErr (either directly or using os.NewSyscallError).

@evanphx
Copy link
Contributor Author

evanphx commented Feb 21, 2025

@kolyshkin I'll have a look at those!

Retry Recvfrom, Sendmsg, Readmsg, and Read as they can return EINTR.

Signed-off-by: Evan Phoenix <[email protected]>
@evanphx
Copy link
Contributor Author

evanphx commented Feb 21, 2025

@rata Forgot to mention this earlier, yes, I'm seeing it with 1.2 so we need to backport.

Copy link
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@@ -40,7 +41,11 @@ func registerMemoryEventV2(cgDir, evName, cgEvName string) (<-chan struct{}, err

for {
n, err := unix.Read(fd, buffer[:])
Copy link
Member

@lifubang lifubang Feb 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, a small suggestion:
Maybe we can use os.File to read data from this fd.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, nice idea, but fd comes from:

        fd, err := unix.InotifyInit()
...
        evFd, err := unix.InotifyAddWatch(fd, filepath.Join(cgDir, evName), unix.IN_MODIFY)
....

And os.File closes the underlying fd when go detects is not used anymore. It might cause issues here, in the inotify stuff. So I'd not do it as part of this PR, we can explore in another one if this works or not IMHO.

Copy link
Member

@rata rata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks a lot!

@@ -40,7 +41,11 @@ func registerMemoryEventV2(cgDir, evName, cgEvName string) (<-chan struct{}, err

for {
n, err := unix.Read(fd, buffer[:])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, nice idea, but fd comes from:

        fd, err := unix.InotifyInit()
...
        evFd, err := unix.InotifyAddWatch(fd, filepath.Join(cgDir, evName), unix.IN_MODIFY)
....

And os.File closes the underlying fd when go detects is not used anymore. It might cause issues here, in the inotify stuff. So I'd not do it as part of this PR, we can explore in another one if this works or not IMHO.

@rata
Copy link
Member

rata commented Feb 24, 2025

@evanphx wanna do a backport and open a PR against the 1.2 release branch too? :)

@lifubang I guess you are ok merging this?

@lifubang lifubang changed the title Retry Recvfrom if observing EINTR Retry direct unix package calls if observing EINTR Feb 25, 2025
@lifubang lifubang merged commit 5f8abe5 into opencontainers:main Feb 25, 2025
34 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.2-todo A PR in main branch which needs to be backported to release-1.2
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants