Various fixes & new features v0.8.0-beta.2
- Bump tls-client to 1.7.2
- Detect & override encoding (#24, #31, #32). Works by encoding non-UTF-8 responses in Base64, then sending to Python. Requires faust-cchardet for encoding detection.
- Add support for auth http proxies & https (#19)
- Add support for socks5 (#28)
- `proxies` keyword deprecated in favor of `proxy`
- Support for Python 3.7-3.8 (#30)
- Added certificate pinning
- Added option to disable IPv6
- More descriptive ClientException errors
- Added proxy support for Firefox and Chromium
- Added `raise_exception` parameter in HTML parser's .find and .find_all
- Updated README for 0.8.0-beta.2
- (Hopefully) fixed a crash on import. Open ports are now found by Go's HTTP API rather than Python sockets.
daijro committed Feb 4, 2024
1 parent 4953bab commit 8e9b33e
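
The commit message says non-UTF-8 responses are encoded in Base64 before being sent to Python. The following is a minimal sketch of that idea using only the standard library; it is not taken from the hrequests source. The names `pack_body` and `unpack_body` and the JSON fields `body` and `is_base64` are assumptions made for illustration.

```python
import base64
import json


def pack_body(raw: bytes) -> str:
    """Serialize a response body as JSON, using Base64 when it is not valid UTF-8."""
    try:
        # Valid UTF-8 can travel through JSON as ordinary text.
        return json.dumps({"body": raw.decode("utf-8"), "is_base64": False})
    except UnicodeDecodeError:
        # Any other byte sequence is wrapped in Base64 so no bytes are lost.
        encoded = base64.b64encode(raw).decode("ascii")
        return json.dumps({"body": encoded, "is_base64": True})


def unpack_body(message: str) -> bytes:
    """Recover the original bytes on the receiving side."""
    payload = json.loads(message)
    if payload["is_base64"]:
        return base64.b64decode(payload["body"])
    return payload["body"].encode("utf-8")


# A Windows-1250 body is not valid UTF-8, so it takes the Base64 path.
raw = "Zażółć gęślą jaźń".encode("windows-1250")
assert unpack_body(pack_body(raw)) == raw
```

The receiver gets the exact bytes back and can then run encoding detection on them, which is the step the commit delegates to faust-cchardet.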
Showing 18 changed files with 495 additions and 268 deletions.
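
The last item in the commit message moves open-port discovery from Python sockets to Go. For context, a common way to find a free port from Python is to bind to port 0 and let the operating system choose. The sketch below shows that general technique only; it is an assumption about what "python sockets" refers to, not the code this commit removed, and `find_free_port` is a hypothetical name.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        # getsockname() returns (address, port) for AF_INET sockets.
        return sock.getsockname()[1]


port = find_free_port()
print(port)
```

The port is released as soon as the socket closes, so another process can claim it before the caller does.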
2 changes: 1 addition & 1 deletion .gitignore
@@ -169,4 +169,4 @@ cython_debug/
# VsCode

.vscode

.trunk
34 changes: 25 additions & 9 deletions README.md
@@ -44,6 +44,7 @@
- Human-like cursor movement and typing
- Chrome and Firefox extension support
- Full page screenshots
- Proxy support
- Headless and headful support
- No CORS restrictions

@@ -52,6 +53,7 @@
- High performance ✨
- Minimal dependence on the python standard libraries
- HTTP backend written in Go
- Automatic gzip & brotli decode
- Written with type safety
- 100% threadsafe ❤️

@@ -87,7 +89,7 @@ pip install -U hrequests

# Documentation

**Gitbook documentation is available [here](https://daijro.gitbook.io/hrequests/).**
**For the latest stable hrequests documentation, check the [Gitbook page](https://daijro.gitbook.io/hrequests/).**

1. [Simple Usage](https://github.com/daijro/hrequests#simple-usage)
2. [Sessions](https://github.com/daijro/hrequests#sessions)
@@ -127,7 +129,7 @@ Parameters:
history (bool, optional): Remember request history. Defaults to False.
verify (bool, optional): Verify the server's TLS certificate. Defaults to True.
timeout (float, optional): Timeout in seconds. Defaults to 30.
proxies (dict, optional): Dictionary of proxies. Defaults to None.
proxy (str, optional): Proxy URL. Defaults to None.
nohup (bool, optional): Run the request in the background. Defaults to False.
<Additionally includes all parameters from `hrequests.Session` if a session was not specified>
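
The hunk above replaces the `proxies` dictionary with a single `proxy` URL string. Callers holding a requests-style dictionary could collapse it to one URL along the following lines. This is a sketch under assumptions: `proxy_from_legacy_dict` is a hypothetical helper, not part of hrequests, and the accepted schemes are taken from the commit message (HTTP, HTTPS, SOCKS5).

```python
from typing import Dict, Optional
from urllib.parse import urlsplit

# Schemes named in the commit message; assumed here, not read from the library.
SUPPORTED_SCHEMES = {"http", "https", "socks5"}


def proxy_from_legacy_dict(proxies: Optional[Dict[str, str]]) -> Optional[str]:
    """Pick one proxy URL out of a requests-style {'http': ..., 'https': ...} dict."""
    if not proxies:
        return None
    # Prefer the HTTPS entry, then HTTP, then whatever else is present.
    url = proxies.get("https") or proxies.get("http") or next(iter(proxies.values()))
    parts = urlsplit(url)
    if parts.scheme not in SUPPORTED_SCHEMES or not parts.hostname:
        raise ValueError(f"unsupported proxy URL: {url!r}")
    return url


print(proxy_from_legacy_dict({"http": "http://user:pass@1.2.3.4:8080"}))
```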
@@ -164,8 +166,10 @@ Getting the response body:
```py
>>> resp.text: str
'<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta charset="UTF-8"><meta content="origin" name="referrer"><m...'
>>> resp.content: Union[bytes, str]
'<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta charset="UTF-8"><meta content="origin" name="referrer"><m...'
>>> resp.content: bytes
b'<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta charset="UTF-8"><meta content="origin" name="referrer"><m...'
>>> resp.encoding: str
'WINDOWS-1250'
```
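
With `resp.content` now typed as `bytes` and `resp.encoding` exposed, the text form of a body can be derived by decoding the bytes with the reported encoding. The sketch below shows that relationship with the standard library only. `decode_body` is a hypothetical helper, and the UTF-8 fallback is an assumption, not documented hrequests behavior.

```python
from typing import Optional


def decode_body(content: bytes, encoding: Optional[str]) -> str:
    """Decode raw bytes with the reported encoding, falling back to lossy UTF-8."""
    try:
        return content.decode(encoding or "utf-8")
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name or bytes that do not match it: keep what we can.
        return content.decode("utf-8", errors="replace")


original = "Příliš žluťoučký kůň"
content = original.encode("windows-1250")
assert decode_body(content, "WINDOWS-1250") == original
```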

Parse the response body as JSON:
@@ -204,7 +208,7 @@ Creating a new Chrome Session object:

```py
>>> session = hrequests.Session() # version randomized by default
>>> session = hrequests.Session('chrome', version=117)
>>> session = hrequests.Session('chrome', version=120)
```

<details>
@@ -218,9 +222,14 @@ Parameters:
headers (dict, optional): Dictionary of HTTP headers to send with the request. Default is generated from `browser` and `os`.
verify (bool, optional): Verify the server's TLS certificate. Defaults to True.
timeout (float, optional): Default timeout in seconds. Defaults to 30.
proxy (str, optional): Proxy URL. Defaults to None.
cookies (Union[RequestsCookieJar, dict, list], optional): Cookie Jar, or cookie list/dict to send. Defaults to None.
certificate_pinning (Dict[str, List[str]], optional): Certificate pinning. Defaults to None.
disable_ipv6 (bool, optional): Disable IPv6. Defaults to False.
detect_encoding (bool, optional): Detect encoding. Defaults to True.
ja3_string (str, optional): JA3 string. Defaults to None.
h2_settings (dict, optional): HTTP/2 settings. Defaults to None.
additional_decode (str, optional): Additional decode. Defaults to None.
additional_decode (str, optional): Decode response body with "gzip" or "br". Defaults to None.
pseudo_header_order (list, optional): Pseudo header order. Defaults to None.
priority_frames (list, optional): Priority frames. Defaults to None.
header_order (list, optional): Header order. Defaults to None.
@@ -247,9 +256,15 @@ Parameters:
os (Literal['win', 'mac', 'lin'], optional): OS to use in header. Default is randomized.
headers (dict, optional): Dictionary of HTTP headers to send with the request. Default is generated from `browser` and `os`.
verify (bool, optional): Verify the server's TLS certificate. Defaults to True.
timeout (float, optional): Default timeout in seconds. Defaults to 30.
proxy (str, optional): Proxy URL. Defaults to None.
cookies (Union[RequestsCookieJar, dict, list], optional): Cookie Jar, or cookie list/dict to send. Defaults to None.
certificate_pinning (Dict[str, List[str]], optional): Certificate pinning. Defaults to None.
disable_ipv6 (bool, optional): Disable IPv6. Defaults to False.
detect_encoding (bool, optional): Detect encoding. Defaults to True.
ja3_string (str, optional): JA3 string. Defaults to None.
h2_settings (dict, optional): HTTP/2 settings. Defaults to None.
additional_decode (str, optional): Additional decode. Defaults to None.
additional_decode (str, optional): Decode response body with "gzip" or "br". Defaults to None.
pseudo_header_order (list, optional): Pseudo header order. Defaults to None.
priority_frames (list, optional): Priority frames. Defaults to None.
header_order (list, optional): Header order. Defaults to None.
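
The reworded `additional_decode` description says the response body is decoded with "gzip" or "br". The sketch below shows what such a switch might look like for the gzip case, using only the standard library. `apply_additional_decode` is a hypothetical name and this is not the library's implementation, which runs in the Go backend. Brotli is left out because it needs a third-party package.

```python
import gzip
from typing import Optional


def apply_additional_decode(body: bytes, additional_decode: Optional[str] = None) -> bytes:
    """Decompress a body according to an `additional_decode`-style setting."""
    if additional_decode is None:
        return body
    if additional_decode == "gzip":
        return gzip.decompress(body)
    if additional_decode == "br":
        # Brotli is not in the standard library, so this sketch stops here.
        raise NotImplementedError("brotli decoding requires a third-party package")
    raise ValueError(f"unknown additional_decode value: {additional_decode!r}")


compressed = gzip.compress(b"hello")
assert apply_additional_decode(compressed, "gzip") == b"hello"
```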
@@ -387,7 +402,7 @@ Parameters:
history (bool, optional): Remember request history. Defaults to False.
verify (bool, optional): Verify the server's TLS certificate. Defaults to True.
timeout (float, optional): Timeout in seconds. Defaults to 30.
proxies (dict, optional): Dictionary of proxies. Defaults to None.
proxy (str, optional): Proxy URL. Defaults to None.
<Additionally includes all parameters from `hrequests.Session` if a session was not specified>

Returns:
@@ -579,6 +594,7 @@ Parameters:
clean: Whether or not to sanitize the found HTML of ``<script>`` and ``<style>``
containing: If specified, only return elements that contain the provided text.
first: Whether or not to return just the first result.
raise_exception: Raise an exception if no elements are found. Default is True.
_encoding: The encoding format.

Returns:
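
The new `raise_exception` parameter documented in this hunk controls whether a failed lookup raises or returns nothing. The sketch below shows that pattern on a plain list of strings. `find_first` and `NoElementFound` are hypothetical names and do not reflect the HTML parser's actual classes or signature. Only the default of `True` comes from the diff.

```python
from typing import List, Optional


class NoElementFound(LookupError):
    """Raised when a lookup matches nothing and raise_exception is True."""


def find_first(items: List[str], containing: str, raise_exception: bool = True) -> Optional[str]:
    """Return the first item containing the text, or raise / return None if absent."""
    for item in items:
        if containing in item:
            return item
    if raise_exception:
        raise NoElementFound(f"no item contains {containing!r}")
    return None


links = ["<a>Home</a>", "<a>About</a>"]
assert find_first(links, "About") == "<a>About</a>"
assert find_first(links, "Contact", raise_exception=False) is None
```

Passing `raise_exception=False` lets callers test for a missing element with a simple `is None` check instead of a `try`/`except`.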
@@ -684,7 +700,7 @@ Parameters:
headless (bool, optional): Whether to run the browser in headless mode. Defaults to True.
session (hrequests.session.TLSSession, optional): Session to use for headers, cookies, etc.
resp (hrequests.response.Response, optional): Response to update with cookies, headers, etc.
proxy_ip (str, optional): Proxy to use for the browser. Example: 123.123.123
proxy (str, optional): Proxy to use for the browser. Example: http://1.2.3.4:8080
mock_human (bool, optional): Whether to emulate human behavior. Defaults to False.
browser (Literal['firefox', 'chrome'], optional): Generate useragent headers for a specific browser
os (Literal['win', 'mac', 'lin'], optional): Generate headers for a specific OS
2 changes: 1 addition & 1 deletion bridge/VERSION
@@ -1 +1 @@
1.0
2.0
22 changes: 12 additions & 10 deletions bridge/go.mod
@@ -3,19 +3,21 @@ module hrequests_bridge
go 1.21.1

require (
github.com/bogdanfinn/fhttp v0.5.24
github.com/bogdanfinn/tls-client v1.6.1
github.com/bogdanfinn/fhttp v0.5.27
github.com/bogdanfinn/tls-client v1.7.2
github.com/goccy/go-json v0.10.2
github.com/google/uuid v1.3.1
github.com/google/uuid v1.6.0
)

require (
github.com/andybalholm/brotli v1.0.4 // indirect
github.com/bogdanfinn/utls v1.5.16 // indirect
github.com/klauspost/compress v1.15.12 // indirect
github.com/andybalholm/brotli v1.1.0 // indirect
github.com/bogdanfinn/utls v1.6.1 // indirect
github.com/cloudflare/circl v1.3.7 // indirect
github.com/klauspost/compress v1.17.5 // indirect
github.com/quic-go/quic-go v0.41.0 // indirect
github.com/tam7t/hpkp v0.0.0-20160821193359-2b70b4024ed5 // indirect
golang.org/x/crypto v0.1.0 // indirect
golang.org/x/net v0.7.0 // indirect
golang.org/x/sys v0.5.0 // indirect
golang.org/x/text v0.7.0 // indirect
golang.org/x/crypto v0.18.0 // indirect
golang.org/x/net v0.20.0 // indirect
golang.org/x/sys v0.16.0 // indirect
golang.org/x/text v0.14.0 // indirect
)
60 changes: 40 additions & 20 deletions bridge/go.sum
@@ -1,24 +1,44 @@
github.com/andybalholm/brotli v1.0.4 h1:V7DdXeJtZscaqfNuAdSRuRFzuiKlHSC/Zh3zl9qY3JY=
github.com/andybalholm/brotli v1.0.4/go.mod h1:fO7iG3H7G2nSZ7m0zPUDn85XEX2GTukHGRSepvi9Eig=
github.com/bogdanfinn/fhttp v0.5.24 h1:OlyBKjvJp6a3TotN3wuj4mQHHRbfK7QUMrzCPOZGhRc=
github.com/bogdanfinn/fhttp v0.5.24/go.mod h1:brqi5woc5eSCVHdKYBV8aZLbO7HGqpwyDLeXW+fT18I=
github.com/bogdanfinn/tls-client v1.6.1 h1:GTIqQssFoIvLaDf4btoYRzDhUzudLqYD4axvfUCXl3I=
github.com/bogdanfinn/tls-client v1.6.1/go.mod h1:FtwQ3DndVZ0xAOO704v4iNAgbHOcEc5kPk9tjICTNQ0=
github.com/bogdanfinn/utls v1.5.16 h1:NhhWkegEcYETBMj9nvgO4lwvc6NcLH+znrXzO3gnw4M=
github.com/bogdanfinn/utls v1.5.16/go.mod h1:mHeRCi69cUiEyVBkKONB1cAbLjRcZnlJbGzttmiuK4o=
github.com/andybalholm/brotli v1.1.0 h1:eLKJA0d02Lf0mVpIDgYnqXcUn0GqVmEFny3VuID1U3M=
github.com/andybalholm/brotli v1.1.0/go.mod h1:sms7XGricyQI9K10gOSf56VKKWS4oLer58Q+mhRPtnY=
github.com/bogdanfinn/fhttp v0.5.27 h1:+glR3k8v5nxfUSk7+J3M246zEQ2yadhS0vLq1utK71A=
github.com/bogdanfinn/fhttp v0.5.27/go.mod h1:oJiYPG3jQTKzk/VFmogH8jxjH5yiv2rrOH48Xso2lrE=
github.com/bogdanfinn/tls-client v1.7.2 h1:vpL5qBYUfT9ueygEf1yLfymrXyUEZQatL25amfqGV8M=
github.com/bogdanfinn/tls-client v1.7.2/go.mod h1:pOGa2euqTbEkGNqE5idx5jKKfs9ytlyn3fwEw8RSP+g=
github.com/bogdanfinn/utls v1.6.1 h1:dKDYAcXEyFFJ3GaWaN89DEyjyRraD1qb4osdEK89ass=
github.com/bogdanfinn/utls v1.6.1/go.mod h1:VXIbRZaiY/wHZc6Hu+DZ4O2CgTzjhjCg/Ou3V4r/39Y=
github.com/cloudflare/circl v1.3.7 h1:qlCDlTPz2n9fu58M0Nh1J/JzcFpfgkFHHX3O35r5vcU=
github.com/cloudflare/circl v1.3.7/go.mod h1:sRTcRWXGLrKw6yIGJ+l7amYJFfAXbZG0kBSc8r4zxgA=
github.com/go-logr/logr v1.2.4 h1:g01GSCwiDw2xSZfjJ2/T9M+S6pFdcNtFYsp+Y43HYDQ=
github.com/go-logr/logr v1.2.4/go.mod h1:jdQByPbusPIv2/zmleS9BjJVeZ6kBagPoEUsqbVz/1A=
github.com/go-task/slim-sprig v0.0.0-20230315185526-52ccab3ef572 h1:tfuBGBXKqDEevZMzYi5KSi8KkcZtzBcTgAUUtapy0OI=
github.com/go-task/slim-sprig v0.0.0-20230315185526-52ccab3ef572/go.mod h1:9Pwr4B2jHnOSGXyyzV8ROjYa2ojvAY6HCGYYfMoC3Ls=
github.com/goccy/go-json v0.10.2 h1:CrxCmQqYDkv1z7lO7Wbh2HN93uovUHgrECaO5ZrCXAU=
github.com/goccy/go-json v0.10.2/go.mod h1:6MelG93GURQebXPDq3khkgXZkazVtN9CRI+MGFi0w8I=
github.com/google/uuid v1.3.1 h1:KjJaJ9iWZ3jOFZIf1Lqf4laDRCasjl0BCmnEGxkdLb4=
github.com/google/uuid v1.3.1/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/klauspost/compress v1.15.12 h1:YClS/PImqYbn+UILDnqxQCZ3RehC9N318SU3kElDUEM=
github.com/klauspost/compress v1.15.12/go.mod h1:QPwzmACJjUTFsnSHH934V6woptycfrDDJnH7hvFVbGM=
github.com/google/go-cmp v0.5.9 h1:O2Tfq5qg4qc4AmwVlvv0oLiVAGB7enBSJ2x2DqQFi38=
github.com/google/go-cmp v0.5.9/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/google/pprof v0.0.0-20210407192527-94a9f03dee38 h1:yAJXTCF9TqKcTiHJAE8dj7HMvPfh66eeA2JYW7eFpSE=
github.com/google/pprof v0.0.0-20210407192527-94a9f03dee38/go.mod h1:kpwsk12EmLew5upagYY7GY0pfYCcupk39gWOCRROcvE=
github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/klauspost/compress v1.17.5 h1:d4vBd+7CHydUqpFBgUEKkSdtSugf9YFmSkvUYPquI5E=
github.com/klauspost/compress v1.17.5/go.mod h1:/dCuZOvVtNoHsyb+cuJD3itjs3NbnF6KH9zAO4BDxPM=
github.com/onsi/ginkgo/v2 v2.9.5 h1:+6Hr4uxzP4XIUyAkg61dWBw8lb/gc4/X5luuxN/EC+Q=
github.com/onsi/ginkgo/v2 v2.9.5/go.mod h1:tvAoo1QUJwNEU2ITftXTpR7R1RbCzoZUOs3RonqW57k=
github.com/onsi/gomega v1.27.6 h1:ENqfyGeS5AX/rlXDd/ETokDz93u0YufY1Pgxuy/PvWE=
github.com/onsi/gomega v1.27.6/go.mod h1:PIQNjfQwkP3aQAH7lf7j87O/5FiNr+ZR8+ipb+qQlhg=
github.com/quic-go/quic-go v0.41.0 h1:aD8MmHfgqTURWNJy48IYFg2OnxwHT3JL7ahGs73lb4k=
github.com/quic-go/quic-go v0.41.0/go.mod h1:qCkNjqczPEvgsOnxZ0eCD14lv+B2LHlFAB++CNOh9hA=
github.com/tam7t/hpkp v0.0.0-20160821193359-2b70b4024ed5 h1:YqAladjX7xpA6BM04leXMWAEjS0mTZ5kUU9KRBriQJc=
github.com/tam7t/hpkp v0.0.0-20160821193359-2b70b4024ed5/go.mod h1:2JjD2zLQYH5HO74y5+aE3remJQvl6q4Sn6aWA2wD1Ng=
golang.org/x/crypto v0.1.0 h1:MDRAIl0xIo9Io2xV565hzXHw3zVseKrJKodhohM5CjU=
golang.org/x/crypto v0.1.0/go.mod h1:RecgLatLF4+eUMCP1PoPZQb+cVrJcOPbHkTkbkB9sbw=
golang.org/x/net v0.7.0 h1:rJrUqqhjsgNp7KqAIc25s9pZnjU7TUcSY7HcVZjdn1g=
golang.org/x/net v0.7.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs=
golang.org/x/sys v0.5.0 h1:MUK/U/4lj1t1oPg0HfuXDN/Z1wv31ZJ/YcPiGccS4DU=
golang.org/x/sys v0.5.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/text v0.7.0 h1:4BRB4x83lYWy72KwLD/qYDuTu7q9PjSagHvijDw7cLo=
golang.org/x/text v0.7.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8=
golang.org/x/crypto v0.18.0 h1:PGVlW0xEltQnzFZ55hkuX5+KLyrMYhHld1YHO4AKcdc=
golang.org/x/crypto v0.18.0/go.mod h1:R0j02AL6hcrfOiy9T4ZYp/rcWeMxM3L6QYxlOuEG1mg=
golang.org/x/net v0.20.0 h1:aCL9BSgETF1k+blQaYUBx9hJ9LOGP3gAVemcZlf1Kpo=
golang.org/x/net v0.20.0/go.mod h1:z8BVo6PvndSri0LbOE3hAn0apkU+1YvI6E70E9jsnvY=
golang.org/x/sys v0.16.0 h1:xWw16ngr6ZMtmxDyKyIgsE93KNKz5HKmMa3b8ALHidU=
golang.org/x/sys v0.16.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/text v0.14.0 h1:ScX5w1eTa3QqT8oi6+ziP7dTV1S2+ALU0bI+0zXKWiQ=
golang.org/x/text v0.14.0/go.mod h1:18ZOQIKpY8NJVqYksKHtTdi31H5itFRjB5/qKTNYzSU=
golang.org/x/tools v0.9.1 h1:8WMNJAz3zrtPmnYC7ISf5dEn3MT0gY7jBJfw27yrrLo=
golang.org/x/tools v0.9.1/go.mod h1:owI94Op576fPu3cIGQeHs3joujW/2Oc6MtlxbF5dfNc=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=