netxlite: call getaddrinfo and handle platform-specific oddities #764

Merged: 13 commits merged into master from issue/2029-small on May 28, 2022

Conversation

bassosimone
Contributor

@bassosimone bassosimone commented May 27, 2022

This commit changes our system resolver to call getaddrinfo directly when CGO is enabled. This change allows us to:

  1. obtain the CNAME easily

  2. obtain the real getaddrinfo retval

  3. handle platform-specific oddities such as EAI_NODATA
    returned on Android devices

See ooni/probe#2029 and ooni/probe#2029 (comment) in particular.

See ooni/probe#2033 for documentation regarding the desire to see getaddrinfo's retval.

See ooni/probe#2118 for possible follow-up changes.

This commit changes our system resolver to call getaddrinfo
directly when CGO is enabled. This change allows us to:

1. obtain the CNAME easily

2. obtain the real getaddrinfo retval

See ooni/probe#2029

See ooni/probe#2033
@bassosimone bassosimone requested a review from hellais as a code owner May 27, 2022 15:56
1. remove tests that would not work on windows

2. add extra coverage for error mapping on windows
The main reason why I wanted to unify the implementations is
that I would like to be sure I test Android code.

Since Android is ~Linux, it makes sense to run Android
specific tests using Linux. (That seems at least slightly
better than not testing Android code.)

While there, figure out what is actually happening inside the
android codebase and devise a plan of action.

We're not done here yet.
We only need to know and use the GOOS inside Android code.

However, to properly test the code, we need toError to receive
the GOOS as an argument, so we can write specific tests for
Android that properly handle EAI_NODATA.

I could test only the Android-specific function that toError calls,
but I'd rather test that toError as a whole works as intended. A
future refactoring MAY make Linux code stop calling that specific
function. In such a case, with a less precise unit test, we would
never know we had broken the codebase.
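
The idea of passing the GOOS into toError can be illustrated with a minimal sketch. This is not the code merged in this pull request: the numeric codes, the error values other than `android_dns_cache_no_data`, the helper `noDataError`, and the mapping chosen for non-Android platforms are all hypothetical stand-ins. Real getaddrinfo codes come from `<netdb.h>` and differ across platforms.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for getaddrinfo return codes. The real values
// are defined by <netdb.h> and are platform-specific.
const (
	eaiSuccess = iota
	eaiNoName
	eaiNoData
	eaiFail
)

// Hypothetical error values. Only the android_dns_cache_no_data string
// is named in this pull request; the others are assumptions.
var (
	errNoSuchHost            = errors.New("dns_nxdomain_error")
	errNoAnswer              = errors.New("dns_no_answer")
	errAndroidDNSCacheNoData = errors.New("android_dns_cache_no_data")
	errUnknown               = errors.New("unknown_getaddrinfo_failure")
)

// toError maps a getaddrinfo return code to an error. The GOOS is an
// explicit argument so that Android behavior can be tested on Linux.
func toError(code int, goos string) error {
	switch code {
	case eaiSuccess:
		return nil
	case eaiNoName:
		return errNoSuchHost
	case eaiNoData:
		return noDataError(goos)
	default:
		return errUnknown
	}
}

// noDataError is the platform-specific helper: only Android maps
// EAI_NODATA to the dedicated error (assumed mapping elsewhere).
func noDataError(goos string) error {
	if goos == "android" {
		return errAndroidDNSCacheNoData
	}
	return errNoAnswer
}

func main() {
	for _, goos := range []string{"linux", "android"} {
		fmt.Printf("%s: %v\n", goos, toError(eaiNoData, goos))
	}
}
```

In a design like this, production callers would pass `runtime.GOOS`, while tests pass `"android"` explicitly and exercise toError end to end rather than only the helper.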
This change will allow us to identify the cases in which we cannot
say anything meaningful about the response from getaddrinfo.
@bassosimone bassosimone changed the title feat(netxlite): call getaddrinfo directly feat(netxlite): call getaddrinfo and handle platform-specific oddities May 28, 2022
@bassosimone bassosimone merged commit cf6dbe4 into master May 28, 2022
@bassosimone bassosimone deleted the issue/2029-small branch May 28, 2022 13:10
@bassosimone bassosimone changed the title feat(netxlite): call getaddrinfo and handle platform-specific oddities netxlite: call getaddrinfo and handle platform-specific oddities May 28, 2022
bassosimone added a commit that referenced this pull request May 29, 2022
After #764, the build for
CGO_ENABLED=0 has been broken for miniooni:

https://github.com/ooni/probe-cli/runs/6636995859?check_suite_focus=true

Likewise, it's not possible to run tests with CGO_ENABLED=0.

Additionally, @hellais previously raised a valid point in the review
of #698:

> Another issue we should consider is that, if I understand how
> this works correctly, depending on whether or not we have built
> with CGO_ENABLED=0 on or not, we are going to be measuring
> things in a different way (using our cgo inspired getaddrinfo
> implementation or using netgo). This might present issues when
> analyzing or interpreting the data.
>
> Do we perhaps want to add some field to the output data format that
> gives us an indication of which DNS resolution code was used to
> generate the metric?

This comment is relevant to the current commit because
#698 is the previous
iteration of #764.

So, while fixing the build and test issues, let us also distinguish
between the CGO_ENABLED=1 and CGO_ENABLED=0 cases.

Before this commit, OONI used "system" to indicate the case where
we were using net.DefaultResolver. This behavior dates back to the
Measurement Kit days. While it is true that ooni/probe-engine and
ooni/probe-cli could have been using netgo in the past when we
said "system" as the resolver, it also seems reasonable to continue
to use "system" to indicate getaddrinfo.

So, the choice here is basically to use "netgo" from now on to
indicate the cases in which we were built with CGO_ENABLED=0.

This change will need to be documented in ooni/spec along with
the introduction of the `android_dns_cache_no_data` error.
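
The naming choice above can be sketched as follows. This is only an illustration: `resolverName` and its `cgoEnabled` parameter are hypothetical, and in a real build the choice would more likely be made at compile time through build constraints than through a runtime flag.

```go
package main

import "fmt"

// resolverName returns the label recorded for the system resolver.
// The cgoEnabled parameter is a hypothetical stand-in for what the
// build would decide at compile time.
func resolverName(cgoEnabled bool) string {
	if cgoEnabled {
		// CGO_ENABLED=1: we call getaddrinfo.
		return "system"
	}
	// CGO_ENABLED=0: we fall back to the pure-Go resolver.
	return "netgo"
}

func main() {
	fmt.Println(resolverName(true))
	fmt.Println(resolverName(false))
}
```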
bassosimone added a commit that referenced this pull request May 30, 2022
After #764, the build for
CGO_ENABLED=0 has been broken for miniooni:

https://github.com/ooni/probe-cli/runs/6636995859?check_suite_focus=true

Likewise, it's not possible to run tests with CGO_ENABLED=0.

To make tests work with `CGO_ENABLED=0`, I needed to sacrifice some
unit tests run for the CGO case. It is not fully clear to me what was happening
here, but basically `getaddrinfo_cgo_test.go` was compiled with CGO
being disabled, even though the `//go:build cgo` build constraint was specified.

Additionally, @hellais previously raised a valid point in the review
of #698:

> Another issue we should consider is that, if I understand how
> this works correctly, depending on whether or not we have built
> with CGO_ENABLED=0 on or not, we are going to be measuring
> things in a different way (using our cgo inspired getaddrinfo
> implementation or using netgo). This might present issues when
> analyzing or interpreting the data.
>
> Do we perhaps want to add some field to the output data format that
> gives us an indication of which DNS resolution code was used to
> generate the metric?

This comment is relevant to the current commit because
#698 is the previous
iteration of #764.

So, while fixing the build and test issues, let us also distinguish
between the CGO_ENABLED=1 and CGO_ENABLED=0 cases.

Before this commit, OONI used "system" to indicate the case where
we were using net.DefaultResolver. This behavior dates back to the
Measurement Kit days. While it is true that ooni/probe-engine and
ooni/probe-cli could have been using netgo in the past when we
said "system" as the resolver, it also seems reasonable to continue
to use "system" to indicate getaddrinfo.

So, the choice here is basically to use "netgo" from now on to
indicate the cases in which we were built with CGO_ENABLED=0.

This change will need to be documented in ooni/spec along with
the introduction of the `android_dns_cache_no_data` error.

## Checklist

- [x] I have read the [contribution guidelines](https://github.com/ooni/probe-cli/blob/master/CONTRIBUTING.md)
- [x] reference issue for this pull request: ooni/probe#2029
- [x] if you changed anything related how experiments work and you need to reflect these changes in the ooni/spec repository, please link to the related ooni/spec pull request: ooni/spec#242