WIP: Draft PR for improving connectivity testing to capture resolved IPs and all attempts #228

amircybersec · 2024-05-02T17:24:00Z

Note: The code is still messy not ready for merge. I made this PR to discuss upcoming high-level changes to the testing approach and format for connectivity testing pursuant to #209 (comment)

I moved the loop that iterates over ip addresses outside of the intercept dialer. This way in each iteration I am generating a new intercept dialer that takes one of the resolved addresses and use that to continue the end-to-end test.

In this approach, I am allowing each attempt to run end-to-end. The test may fail earlier for example at connect attempt. In that case, it won't continue to the next stage. However each attempt could continue to the end and test for connect/read/write operations for each resolved IP.

We also need to decide if test for all resolved IPs or abort at success.

Format

I am proposing below format that we can use for both udp and tcp.

List of suggested format changes:

Each attempt should include a start-time and duration
The error captured for each attempt should be a ConnectivityError type that includes op, PosixError, Error fields.
I am uncertain about keeping selected_address in the main report. It is hard to tell which address this should be populated with.
I am uncertain what error field in the main report should be populated with. What if the test makes 3 attempts and all of them fail with different types of errors. Does 'error' then reflect? I guess when one of the attempts succeed error field should be nil.

TODO: The code is slow and I am planning to optimize it such that tests run in an independent treads in parallel.

TODO: Update the format in the code to match the proposal

{
    "resolver": "8.8.8.8:53",
    "proto": "udp",
    "transport": "ss://[email protected]:65496?prefix=",
    "endpoint": {
        "host": "pod2.exampledomain.com",
        "port": "65496"
    },
    "attempts": [
            {
                "address": {
                    "host": "2606:4700:3032::6815:xxxx",
                    "port": "65496"
                },
                "time": "2024-05-02T05:02:13Z",
                "duration_ms": 1000,
                "error": {
                    "op": "connect",
                    "msg": "i/o timeout"
                }
            },
            {
                "address": {
                    "host": "2606:4700:3032::ac43:xxxx",
                    "port": "65496"
                },
                "time": "2024-05-02T05:02:13Z",
                "duration_ms": 1000,
                "error": {
                    "op": "read",
                    "msg": "reset"
                }
            },
            {
                "address": {
                    "host": "172.67.138.xx",
                    "port": "65496"
                },
                "time": "2024-05-02T05:02:13Z",
                "duration_ms": 1000,
                "error": {
                    "op": "connect",
                    "msg": "i/o timeout"
                }
            },
            {
                "address": {
                    "host": "104.21.38.xx",
                    "port": "65496"
                },
                "time": "2024-05-02T05:02:13Z",
                "duration_ms": 1000,
                "error": {
                    "op": "connect",
                    "msg": "i/o timeout"
                }
            }
     ],
    // maybe we should remove this; what if both IPv4 and IPv6 addresses succeed, 
    // then which one is the selected address?
    "selected_address": { 
        "host": "104.21.38.199",
        "port": "65496"
    },
    // This would be the start-time of the main test and the duration for the whole test
    "time": "2024-05-02T05:02:13Z",
    "duration_ms": 20006,
    // maybe we should remove this one too or make it a summary of all errors in the all attempts. 
    // I am not sure what this field would reflect is we have two failed attempts. I guess this should be `nil`
   // if at least of the tests pass?
    "error": {
        "op": "connect",
        "msg": "all connect attempts failed. no more addresses to try"
    }
}

amircybersec · 2024-05-02T17:29:04Z

@fortuna I appreciate your feedback on the format and the proposed approach for testing. This is in reference to #209 (comment)

amircybersec · 2024-05-15T18:15:01Z

dns/resolver.go

@@ -268,6 +268,11 @@ func NewUDPResolver(pd transport.PacketDialer, resolverAddr string) Resolver {
 			return nil, &nestedError{ErrDial, err}
 		}
 		defer conn.Close()
+		// force close connection is context is done/cancelled.


@fortuna I have to add this here to close the connection for pending receives (in case of timeouts) when another attempt succeeds. I managed to do this with the following change. I am basically cancelling the passed context in connectivity.go and checking if context is done. This is the best I came up with but I am not sure if there's a more elegant way to accomplish this.

amircybersec · 2024-05-15T18:19:25Z

dns/resolver.go

@@ -391,3 +396,14 @@ func NewHTTPSResolver(sd transport.StreamDialer, resolverAddr string, url string
 		return &msg, nil
 	})
 }
+
+func wrappErrors(err1, err2 error) error {


@fortuna we need to use this function instead of errors.Join() since errors.Join() breaks the wrapped chain of errors. when we are unwrapping the errors with unwrapAll() in makeErrorRecord in test_connectivity.go (

outline-sdk/x/examples/test-connectivity/main.go

Line 124 in 07f86b4

record.Msg = unwrapAll(result.Err).Error()

), the errors don't get fully unwrapped and can leak unwanted error messages that can contain sensitive information.

amircybersec changed the title ~~WIP: Draft PR for improving connectivity testing to capture revolved IPs and all attempts~~ WIP: Draft PR for improving connectivity testing to capture resolved IPs and all attempts May 2, 2024

amircybersec commented May 15, 2024

View reviewed changes

fortuna and others added 23 commits July 30, 2024 14:37

Collect connection address

1920ee0

Try intercepting the transport dialer

a50d0eb

Pass wrapper instead

f98ba8b

Clean up

c5ab95c

Comment

f88036d

Collect connections

3d750e1

Fix

a1732d2

Timeout

053da66

To fix

e6f31d0

loop over intercept dialer

cad18a9

code cleanup

c3df0ed

changed connectivity result format

4fdf296

updated json report format

01a1c7e

making tcp tests run concurrently

91dae61

close clean up / remove mutex

7b47394

add ip type tag to address json

fb791b1

polish comments and code

1fe55fa

make udp tests concurrent

bb1090e

cancel pending read/writes

e4e4915

do not capture conn closed err

cb83523

improvement to concurrency

bdbbc5e

bug fix and code clean up

08e4381

migrating to new config design

60eed45

amircybersec force-pushed the amir-conn-test branch from d597f81 to 60eed45 Compare July 30, 2024 21:50

amircybersec added 2 commits July 30, 2024 15:53

removed commented code

c7832de

reverse wrappErrors in dns

dfd82c9

amircybersec added 2 commits August 9, 2024 15:58

fix wrap issue

b8134d8

change time to start_time

7318a7e

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WIP: Draft PR for improving connectivity testing to capture resolved IPs and all attempts #228

WIP: Draft PR for improving connectivity testing to capture resolved IPs and all attempts #228

amircybersec commented May 2, 2024 •

edited

Loading

amircybersec commented May 2, 2024

amircybersec May 15, 2024

amircybersec May 15, 2024 •

edited

Loading

WIP: Draft PR for improving connectivity testing to capture resolved IPs and all attempts #228

Are you sure you want to change the base?

WIP: Draft PR for improving connectivity testing to capture resolved IPs and all attempts #228

Conversation

amircybersec commented May 2, 2024 • edited Loading

Format

amircybersec commented May 2, 2024

amircybersec May 15, 2024

Choose a reason for hiding this comment

amircybersec May 15, 2024 • edited Loading

Choose a reason for hiding this comment

amircybersec commented May 2, 2024 •

edited

Loading

amircybersec May 15, 2024 •

edited

Loading