gh-123832: Adjust socket.getaddrinfo docs for better POSIX compliance #126182

Merged
merged 4 commits into from
Nov 14, 2024

encukou (Member) commented Oct 30, 2024

issue tl;dr: POSIX technically allows getaddrinfo results with default family, proto & type to be unusable; and you want to specify one of those on all systems to filter out stuff like SOCK_RAW that your app can't handle anyway.


IMO, CPython should aim to follow POSIX in cases where all supported platforms behave a certain way, but some unsupported ones stick to the spec in some inconvenient way.

@gpshead, I'd be interested in your take on this.


This PR changes nothing for CPython-supported platforms, but hints at how to deal with platforms that stick to the letter of the spec.
It also marks socket.getaddrinfo as a wrapper around getaddrinfo(3); specifically, workarounds to make the function work consistently across platforms are out of scope in its code.

Include wording similar to POSIX's “by providing options and by limiting the returned information”, which IMO suggests that the hints limit the resulting list compared to the defaults, but can be interpreted differently. Details are added in a note.

Specifically say that this wraps the underlying C function. So, the details are in OS docs. The “full range of results” bit goes away.

Use AF_UNSPEC rather than zero for the family default, although I don't think a system where it's nonzero would be very usable.

Suggest setting proto and/or type (with examples, as the appropriate values aren't obvious). Say why you probably want to do that on all systems; mention the behavior on the “letter of the spec” systems.
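For illustration, a minimal sketch of what setting *type* and *proto* could look like in application code. The helper names `tcp_addresses` and `udp_addresses` are hypothetical and not part of the PR or of the `socket` module; only `socket.getaddrinfo` and the constants are real.

```python
import socket


def tcp_addresses(host, port):
    """Return getaddrinfo() results limited to TCP stream sockets.

    Hypothetical helper: passing type and proto filters out entries
    (for example SOCK_RAW) that a TCP client could not use anyway.
    """
    return socket.getaddrinfo(
        host,
        port,
        family=socket.AF_UNSPEC,   # accept both IPv4 and IPv6 results
        type=socket.SOCK_STREAM,
        proto=socket.IPPROTO_TCP,
    )


def udp_addresses(host, port):
    """Return getaddrinfo() results limited to UDP datagram sockets."""
    return socket.getaddrinfo(
        host,
        port,
        family=socket.AF_UNSPEC,
        type=socket.SOCK_DGRAM,
        proto=socket.IPPROTO_UDP,
    )


if __name__ == "__main__":
    # A numeric address avoids a DNS lookup in this demonstration.
    for family, type_, proto, canonname, sockaddr in tcp_addresses("127.0.0.1", 80):
        print(family, type_, proto, sockaddr)
```

Each result is a 5-tuple `(family, type, proto, canonname, sockaddr)`; with the hints set, every entry should be usable for the kind of socket the application intends to create.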

Suggest that the results should be tried in order, which is, AFAIK, best practice -- see RFC 6724 section 2, and its predecessor from 2003 (which are specific to IP, but indicate how people use this):

Well-behaved applications SHOULD iterate through the list of
addresses returned from getaddrinfo() until they find a working address.
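A rough sketch of that iteration in Python, assuming a TCP client. The function name `connect_in_order` and its `timeout` parameter are hypothetical, chosen only for this example; the standard library's `socket.create_connection` covers a similar use case.

```python
import socket


def connect_in_order(host, port, timeout=5.0):
    """Try each getaddrinfo() result in order; return the first connected socket.

    Hypothetical helper for illustration only.
    """
    last_error = None
    results = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM, proto=socket.IPPROTO_TCP
    )
    for family, type_, proto, _canonname, sockaddr in results:
        try:
            sock = socket.socket(family, type_, proto)
        except OSError as exc:
            # This address family or type may be unsupported locally.
            last_error = exc
            continue
        try:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
        except OSError as exc:
            # This address did not work; remember why and try the next one.
            last_error = exc
            sock.close()
            continue
        return sock
    if last_error is None:
        raise OSError(f"getaddrinfo returned no results for {host!r}")
    raise last_error
```

The caller owns the returned socket and is responsible for closing it. If every address fails, the error from the last attempt is raised.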


📚 Documentation preview 📚: https://cpython-previews--126182.org.readthedocs.build/

zyv commented Oct 30, 2024

I really like your changes now 👍 Thank you!

My only concern now is that the "meat" of the message is "hidden" in a note, but I guess there's nothing that can reasonably be done about that, apart from making type and proto required arguments, and forcing users to set them to None if they really just want the canonname... but everyone will probably agree that the ship has sailed on that one :-/ So probably this is as good as it gets!

willingc (Contributor) left a comment

Looking good and well written.

Doc/library/socket.rst (outdated, resolved)
encukou (Member, Author) commented Nov 4, 2024

Yeah, the ship has sailed, with Solaris catching up and AIX left behind. Sad, but it's where we are.

@gpshead, I intend to merge this next week. Let me know if you want to review & want more time.

In these cases, limiting the *type* and/or *proto* can help eliminate
unsuccessful or unusable connection attempts.

Some systems will, however, only return a single address.
Member

If we know typical details for "many systems" and "some systems", we should include a possible example with each, such as "(ex: most Linux configurations)" or "(ex: reported on Solaris and AIX configurations)". I'm wording those non-concretely as well, but suggest adding it just to give some context for people looking into behaviors as to when they may encounter the unexpected.

not a big deal without this, just a nice to have.

For your reference from #123832, "many" are basically mainstream (Linux, macOS, FreeBSD) and "some" are Illumos / Solaris / AIX. I haven't had a chance to get my hands on anything more exotic recently (HP-UX and VMS come to mind...).

I would also prefer to be more specific, but I thought the details were left out to avoid "finger pointing".

Doc/library/socket.rst (outdated, resolved)
@encukou encukou merged commit ff0ef0a into python:main Nov 14, 2024
25 checks passed
@encukou encukou deleted the socket-getaddrinfo-docs branch November 14, 2024 08:31
@encukou encukou added needs backport to 3.12 bug and security fixes needs backport to 3.13 bugs and security fixes labels Nov 14, 2024
@miss-islington-app

Thanks @encukou for the PR 🌮🎉.. I'm working now to backport this PR to: 3.12.
🐍🍒⛏🤖

@miss-islington-app

Thanks @encukou for the PR 🌮🎉.. I'm working now to backport this PR to: 3.13.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Nov 14, 2024
…mpliance (pythonGH-126182)

(cherry picked from commit ff0ef0a)

Co-authored-by: Petr Viktorin <[email protected]>
Co-authored-by: Carol Willing <[email protected]>
@bedevere-app

bedevere-app bot commented Nov 14, 2024

GH-126824 is a backport of this pull request to the 3.12 branch.

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Nov 14, 2024
…mpliance (pythonGH-126182)
@bedevere-app bedevere-app bot removed the needs backport to 3.12 bug and security fixes label Nov 14, 2024
@bedevere-app

bedevere-app bot commented Nov 14, 2024

GH-126825 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label Nov 14, 2024
encukou added a commit that referenced this pull request Nov 15, 2024
…ompliance (GH-126182) (GH-126825)
encukou added a commit that referenced this pull request Nov 15, 2024
…ompliance (GH-126182) (GH-126824)
Labels
docs Documentation in the Doc dir skip news
Projects
Status: Done
4 participants