Support for CONNECT proxy in Cro::HTTP::Client #127

Open
jonathanstowe opened this issue Mar 3, 2021 · 11 comments

Comments

@jonathanstowe
Contributor

For proxied HTTPS requests, if the proxy doesn't have the CA certificate for the target server (irrespective of whether the client has it), a request made through the proxy with the same request method as a plain HTTP proxy request may fail. This is why the CONNECT method is sent to the proxy host for HTTPS. Most proxy implementations require this these days.

@jonathanstowe
Contributor Author

This may be related to #116

@jonathanstowe
Contributor Author

jonathanstowe commented Mar 3, 2021

For reference (largely my own): with a CONNECT proxy the client sends CONNECT <target-host>:<target-port> to the configured proxy, and when the proxy returns 200 the client sends the original request as if it were sending it directly to the target host.
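
The exchange described above can be sketched as two small helpers. This is only an illustration of the text that goes over the wire, not Cro code: the sub names `connect-request` and `connect-status` are made up for this sketch, and it assumes the proxy needs no authentication.

```raku
# Hypothetical helpers, not part of Cro: they only build and inspect
# the text exchanged with the proxy.

# Build the CONNECT request for a target host and port.
# HTTP/1.1 header lines end in CRLF and a blank line ends the header block.
sub connect-request(Str $host, Int $port --> Str) {
    my @lines =
        "CONNECT {$host}:{$port} HTTP/1.1",
        "Host: {$host}:{$port}",
        "";
    @lines.join("\r\n") ~ "\r\n";
}

# Extract the status code from the first line of the proxy's reply,
# or return Nil if the reply does not start with an HTTP status line.
sub connect-status(Str $reply) {
    $reply ~~ /^ 'HTTP/' \d '.' \d ' ' (\d ** 3)/ ?? +$0 !! Nil;
}

# The tunnel is usable only when the proxy answered 200.
my $reply = "HTTP/1.1 200 Connection established\r\n\r\n";
say "tunnel open" if connect-status($reply) == 200;
```

Anchoring the check to the status line means a "200" that happens to appear in some other header is not mistaken for success.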

@jonathanstowe
Contributor Author

I think what has to happen is that there needs to be a third type of connector that connects to the proxy, issues the CONNECT, and can then be used as a normal connection (possibly after upgrading to TLS). In the first instance I'm going to see if this can be achieved by sub-classing Cro::HTTP::Client, as I found this immediately before I was going to put something live (where it needed to use the proxy).

@jonathanstowe changed the title from "Support for CONNECT proxy" to "Support for CONNECT proxy in Cro::HTTP::Client" on Mar 4, 2021
@jonathanstowe
Contributor Author

Okay, I have a working PoC that's good enough for my current purposes:

use Cro::HTTP::Client;
use Cro::Connection;
use Cro::Message;
use Cro::Replyable;
use Cro::Sink;
use Cro::Source;
use Cro::TCP;
use Cro::Types;
use Cro::TCP::NoDelay;
use IO::Socket::Async::SSL;
use Cro::Uri::HTTP;

sub supports-alpn(--> Bool) is export { IO::Socket::Async::SSL.supports-alpn }

class Cro::ConnectProxy::Connector does Cro::Connector {

    has Bool $.secure = True;
    has Str  $.proxy is required;

    class Transform does Cro::Transform {
        has $!socket;

        submethod BUILD(:$!socket!) {}

        method consumes() { Cro::TCP::Message }
        method produces() { Cro::TCP::Message }

        method alpn-result() { $!socket.alpn-result }

        method transformer(Supply $incoming --> Supply) {
            supply {
                whenever $incoming {
                    whenever $!socket.write(.data) {}
                }
                whenever $!socket.Supply(:bin) -> $data {
                    emit Cro::TCP::Message.new(:$data);
                    LAST done;
                }
                CLOSE {
                    $!socket.close;
                }
            }
        }
    }

    method consumes() { Cro::TCP::Message }
    method produces() { Cro::TCP::Message }

    method connect(:$nodelay, *%options --> Promise) {
        my $host = %options<host>:delete // 'localhost';
        my $port = %options<port>:delete;
        my Cro::Uri $parsed-proxy-uri = Cro::Uri::HTTP.parse($!proxy);

        my $proxy-host = $parsed-proxy-uri.host;
        my $proxy-port = $parsed-proxy-uri.port;

        IO::Socket::Async.connect($proxy-host, $proxy-port)
            .then: {
                my $socket = .result;
                my $connect = qq:to/EOC/;
                CONNECT $host:$port HTTP/1.1
                Host: $host:$port
                User-Agent: Cro
                Proxy-Connection: Keep-Alive

                EOC

                await $socket.print:  $connect;
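                react {
                    whenever $socket.Supply -> $v {
                        # Check the status line only, and fail on a refusal
                        # rather than waiting forever for a 200.
                        if $v ~~ /^ 'HTTP/' \d '.' \d ' ' (\d ** 3)/ {
                            die "Proxy refused CONNECT: $0" unless $0 eq '200';
                            last;
                        }
                    }
                }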

                if $!secure {
                    $socket = await IO::Socket::Async::SSL.upgrade-client($socket, host => $host, |%options);
                }
                nodelay($socket) if $nodelay;
                Transform.new(:$socket)
            }
    }
}

class Cro::HTTP::ProxyClient is Cro::HTTP::Client {
    has $.connect-proxy;

    submethod BUILD(:$!connect-proxy) {
    }

    method choose-connector($secure) {
        if self && $!connect-proxy {
            Cro::ConnectProxy::Connector.new(:$secure, proxy => $!connect-proxy);
        }
        else {
            callsame;
        }
    }

}

my $client = Cro::HTTP::ProxyClient.new(connect-proxy => 'http://127.0.0.1:3128', content-type => 'application/json');

my $r =  await $client.post('https://somethingorother/api/product/summary', body => '{ "product_id" : [ 112345 ] }');

say await $r.body-text;

# vim: ft=raku

Obviously some work needs to be done to abstract the connect part, and I think the Connector wants to go somewhere other than in Cro::HTTP::Client, but that's basically it.

@jonathanstowe
Contributor Author

I think the connector probably wants to go in cro-core as Cro::TCP::ConnectProxy::Connector or something similar, because it may be useful in other places, with just a small change in Cro::HTTP::Client to use it when necessary.

@jonathanstowe
Contributor Author

And just in case anyone wants to reproduce or test this, I used squid as the proxy (which was the actual proxy software that gave me the problem in the first place). It should work well enough out of the box.

@CIAvash

CIAvash commented Mar 5, 2021

I tried @jonathanstowe's code and it works with my proxy too (#116), although I removed the code related to Cro::TCP::NoDelay because it doesn't exist in my lib path.

@jonathanstowe
Contributor Author

FWIW I'd be quite happy to make a PR for the above, but I'd probably want some guidance as to where it should go, as I'm not sure it belongs where I've put it in the PoC.

@jonathanstowe
Contributor Author

Has anyone had any thoughts about this? I'm kinda warnocked on what PR should be provided. (BTW the above code has been used in a production application since March without any problems.)

@Zer0-Tolerance

Zer0-Tolerance commented Aug 5, 2023

Is this issue related to Cro::HTTP::Client sending a plain HTTP request to an HTTPS website when using an HTTPS proxy?

[0] > use Cro::HTTP::Client
Nil
[1] > my $ua = Cro::HTTP::Client.new(:cookie-jar);
Cro::HTTP::Client.new(headers => [], cookie-jar => Cro::HTTP::Client::CookieJar.new, body-serializers => Any, add-body-serializers => Any, body-parsers => Any, add-body-parsers => Any, content-type => Any, follow => 5, http => Any, ca => Any, base-uri => Cro::Uri, tls => {}, push-promises => Bool::False, user-agent => "Cro", auth => {}, http-proxy => Cro::Uri, https-proxy => Cro::Uri, timeout-policy => Cro::Policy::Timeout)
[2] > my $r=await $ua.get: 'https://www.bt.com';
[TRACE(anon 1)] RequestSerializerExtension EMIT HTTP Request
  GET https://www.bt.com HTTP/1.1
  Host: www.bt.com
  User-agent: Cro
[TRACE(anon 1)] Cro::HTTP::RequestSerializer EMIT TCP Message
  47 45 54 20 68 74 74 70 73 3a 2f 2f 77 77 77 2e  GET https://www.
  62 74 2e 63 6f 6d 20 48 54 54 50 2f 31 2e 31 0d  bt.com HTTP/1.1.
  0a 48 6f 73 74 3a 20 77 77 77 2e 62 74 2e 63 6f  .Host: www.bt.co
  6d 0d 0a 55 73 65 72 2d 61 67 65 6e 74 3a 20 43  m..User-agent: C
  72 6f 0d 0a 0d 0a                                ro....

[TRACE(anon 1)] Cro::TCP::Connector EMIT TCP Message
  48 54 54 50 2f 31 2e 31 20 34 30 30 20 42 61 64  HTTP/1.1 400 Bad
  20 52 65 71 75 65 73 74 0d 0a 53 65 72 76 65 72   Request..Server
  3a 20 45 64 67 65 50 72 69 73 6d 53 53 4c 2f 31  : EdgePrismSSL/1
  2e 30 2e 34 2e 30 0d 0a 44 61 74 65 3a 20 53 61  .0.4.0..Date: Sa
  74 2c 20 30 35 20 41 75 67 20 32 30 32 33 20 31  t, 05 Aug 2023 1
  30 3a 31 33 3a 34 32 20 47 4d 54 0d 0a 43 6f 6e  0:13:42 GMT..Con
  74 65 6e 74 2d 54 79 70 65 3a 20 74 65 78 74 2f  tent-Type: text/
  68 74 6d 6c 0d 0a 43 6f 6e 74 65 6e 74 2d 4c 65  html..Content-Le
  6e 67 74 68 3a 20 32 35 35 0d 0a 43 6f 6e 6e 65  ngth: 255..Conne
  63 74 69 6f 6e 3a 20 63 6c 6f 73 65 0d 0a 0d 0a  ction: close....
  3c 68 74 6d 6c 3e 0d 0a 3c 68 65 61 64 3e 3c 74  <html>..<head><t
  69 74 6c 65 3e 34 30 30 20 54 68 65 20 70 6c 61  itle>400 The pla
  69 6e 20 48 54 54 50 20 72 65 71 75 65 73 74 20  in HTTP request
  77 61 73 20 73 65 6e 74 20 74 6f 20 48 54 54 50  was sent to HTTP
  53 20 70 6f 72 74 3c 2f 74 69 74 6c 65 3e 3c 2f  S port</title></
  68 65 61 64 3e 0d 0a 3c 62 6f 64 79 3e 0d 0a 3c  head>..<body>..<
  63 65 6e 74 65 72 3e 3c 68 31 3e 34 30 30 20 42  center><h1>400 B
  61 64 20 52 65 71 75 65 73 74 3c 2f 68 31 3e 3c  ad Request</h1><
  2f 63 65 6e 74 65 72 3e 0d 0a 3c 63 65 6e 74 65  /center>..<cente
  72 3e 54 68 65 20 70 6c 61 69 6e 20 48 54 54 50  r>The plain HTTP
  20 72 65 71 75 65 73 74 20 77 61 73 20 73 65 6e   request was sen
  74 20 74 6f 20 48 54 54 50 53 20 70 6f 72 74 3c  t to HTTPS port<
  2f 63 65 6e 74 65 72 3e 0d 0a 3c 68 72 3e 3c 63  /center>..<hr><c
  65 6e 74 65 72 3e 45 64 67 65 50 72 69 73 6d 53  enter>EdgePrismS
  53 4c 3c 2f 63 65 6e 74 65 72 3e 0d 0a 3c 2f 62  SL</center>..</b
  6f 64 79 3e 0d 0a 3c 2f 68 74 6d 6c 3e 0d 0a     ody>..</html>..

[TRACE(anon 1)] Cro::HTTP::ResponseParser EMIT HTTP Response
  HTTP/1.1 400 Bad Request
  Server: EdgePrismSSL/1.0.4.0
  Date: Sat, 05 Aug 2023 10:13:42 GMT
  Content-Type: text/html
  Content-Length: 255
  Connection: close
[TRACE(anon 1)] ResponseParserExtension EMIT HTTP Response
  HTTP/1.1 400 Bad Request
  Server: EdgePrismSSL/1.0.4.0
  Date: Sat, 05 Aug 2023 10:13:42 GMT
  Content-Type: text/html
  Content-Length: 255
  Connection: close
[TRACE(anon 1)] Cro::TCP::Connector DONE
An operation first awaited:
  in block <unit> at <unknown file> line 1

Died with the exception:
    Server responded with 400 Bad Request (GET http://127.0.0.1:8080)
      in block  at .rakubrew/versions/moar-2023.06/share/perl6/site/sources/3D7AE8DD442BE31392D93ECFBF0B6CACEC0825D6 (Cro::HTTP::Client) line 676

@jonathanstowe
Contributor Author

Yes, that's likely the same thing. Basically, with a "classic" HTTP proxy as implemented by Cro::HTTP::Client, the client makes a network connection to the proxy and then issues the (e.g.) GET with the fully qualified URI of the target resource; you can see it doing that in your trace.

With a CONNECT proxy connection, which most modern proxy software implements because it deals better with secure connections, the client makes the network connection to the proxy, then issues a CONNECT with the hostname (and port) of the target host, and when that is confirmed the client proceeds to issue the HTTP request as if it were connected directly to the target host.
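
To make the contrast between the two styles concrete, here is a rough sketch of the request lines the client writes in each case. The sub name and its parameters are invented for illustration and are not part of Cro; it assumes an https target and ignores headers entirely.

```raku
# Illustrative only: the request lines a client writes in each proxy style.
# The sub name and parameters are made up for this sketch.
sub proxy-request-lines(Str :$host!, Int :$port!, Str :$path = '/', Bool :$tunnel = False) {
    if $tunnel {
        # CONNECT style: ask for a tunnel first, then (after the proxy's 200
        # and, for HTTPS, the TLS handshake) send an ordinary request that
        # names only the path.
        return ("CONNECT {$host}:{$port} HTTP/1.1", "GET {$path} HTTP/1.1");
    }
    else {
        # Classic style: one request whose target is the absolute URI.
        # The port is left out here on the assumption that it is the default.
        return ("GET https://{$host}{$path} HTTP/1.1",);
    }
}

.say for proxy-request-lines(host => 'www.bt.com', port => 443);
.say for proxy-request-lines(host => 'www.bt.com', port => 443, :tunnel);
```

In the classic case the proxy has to act on the https URI itself, whereas in the CONNECT case it only relays bytes once the tunnel is open.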

The Connector hack above definitely works; I've been using it in production code for a year and a half. The only reason I've not made a PR with it is that I'm not entirely sure where it should be applied, so I probably need some guidance.
