Emscripten support #3330

Draft
wants to merge 42 commits into
base: master

joemarshall
Contributor

@joemarshall joemarshall commented Oct 2, 2024

Summary

I opened a discussion about this ages ago but there's been no input, so I've written it anyway (I was contracted to do the work, so I might as well contribute it upstream). This PR adds support for running on Emscripten / WebAssembly platforms, where all network connections go via the browser.

Currently in progress, but tests pass locally, so I've opened this to check the CI changes. I still need to update the docs.

Checklist

  • [x] I understand that this PR may be closed in case there was no previous discussion. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.

@tomchristie
Member

This is really interesting, thanks... ☺️

I've taken a bit of a look at the ecosystem here, tho am going to need a bit more orientation... Would it make sense to document an example of how to write an HTML page that includes a Python REPL with httpx imported and available?

@zanieb
Contributor

zanieb commented Oct 4, 2024

Cool.

Related python/steering-council#256

@joemarshall
Contributor Author

@tomchristie I added some docs, including a page that is a live demo, along with instructions for hosting it yourself. If you clone this PR and then run scripts/build and scripts/docs you should be able to see the emscripten port working in Chrome (on the advanced/emscripten page).

If this gets merged I can contribute this to the main pyodide distribution. Once that is done it would mean that import httpx would just work in pyodide environments.

@tomchristie
Member

Okay, really interesting... I've had a bit of a play around with this tho could do with walking through from the basics, if you're able to spend the time...

I'd like to start by getting to the point that I can add a custom transport to httpx in the pyodide console...

Here's my starting steps...

Open up https://pyodide.org/en/latest/console.html

Install httpx. It's not built-in, okay that's expected. It does load with micropip, which makes sense since it's pure python. Oddly ssl needs to be imported first(?). After that it can be imported just fine. 👍

Welcome to the Pyodide 0.27.0.dev0 terminal emulator 🐍
Python 3.12.1 (main, Oct  7 2024 14:46:27) on WebAssembly/Emscripten
Type "help", "copyright", "credits" or "license" for more information.
>>> import httpx
Traceback (most recent call last):
  File "<console>", line 1, in <module>
ModuleNotFoundError: No module named 'httpx'
>>> import micropip
>>> await micropip.install('httpx')
>>> import httpx
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/lib/python3.12/site-packages/httpx/__init__.py", line 2, in <module>
    from ._api import *
  File "/lib/python3.12/site-packages/httpx/_api.py", line 6, in <module>
    from ._client import Client
  File "/lib/python3.12/site-packages/httpx/_client.py", line 12, in <module>
    from ._auth import Auth, BasicAuth, FunctionAuth
  File "/lib/python3.12/site-packages/httpx/_auth.py", line 12, in <module>
    from ._models import Cookies, Request, Response
  File "/lib/python3.12/site-packages/httpx/_models.py", line 11, in <module>
    from ._content import ByteStream, UnattachedStream, encode_request, encode_response
  File "/lib/python3.12/site-packages/httpx/_content.py", line 17, in <module>
    from ._multipart import MultipartStream
  File "/lib/python3.12/site-packages/httpx/_multipart.py", line 8, in <module>
    from ._types import (
  File "/lib/python3.12/site-packages/httpx/_types.py", line 5, in <module>
    import ssl
ModuleNotFoundError: No module named 'ssl'
>>> import ssl
>>> import httpx

The models work as expected...

>>> httpx.Request('GET', 'https://www.example.com')
<Request('GET', 'https://www.example.com/')>

Tho we can't send requests, again, as expected...

>>> httpx.get('https://www.example.com')
Traceback (most recent call last):
...
httpx.ConnectError: [Errno 23] Host is unreachable

Okay, so next step, I need to figure out how to send a JS request/response in this console, so I can then implement a transport class using that. Let's try XMLHttpRequest?...

>>> import js
>>> js_xhr = js.XMLHttpRequest.new()
>>> js_xhr.open('GET', 'http://www.example.com/', False)
>>> js_xhr.send()
Traceback (most recent call last):
  File "<console>", line 1, in <module>
pyodide.ffi.JsException: NetworkError: Failed to execute 'send' on 'XMLHttpRequest': Failed to load 'http://www.example.com/'.

Okay, how about using fetch?...

>>> await js.fetch('https://www.example.com/')
Traceback (most recent call last):
  File "<console>", line 1, in <module>
pyodide.ffi.JsException: TypeError: Failed to fetch

That's where I'm currently stuck... what am I missing in order to make a simple-as-possible XMLHttpRequest or fetch work?

@hoodmane

hoodmane commented Oct 8, 2024

I think you're running afoul of cross-origin resource sharing (CORS) restrictions here. Try fetching console.html so that it's a same-origin request, or fetch from a CDN or anything else that sets access-control-allow-origin: * as a response header.
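To make the rule concrete, here is a rough sketch in plain Python of the check a browser applies to a simple, credential-free request. It is a simplification for illustration only (no preflight, no credentials handling), and the function names `origin_of` and `response_is_readable` are made up for this example.

```python
from urllib.parse import urlsplit

# Default ports, so that "https://host" and "https://host:443" compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url):
    """Return the (scheme, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def response_is_readable(page_url, target_url, response_headers):
    """Simplified CORS check for a simple request without credentials."""
    # Same-origin requests need no CORS headers at all.
    if origin_of(page_url) == origin_of(target_url):
        return True
    # Header names are case-insensitive, so normalise before looking up.
    lowered = {name.lower(): value for name, value in response_headers.items()}
    allowed = lowered.get("access-control-allow-origin")
    if allowed is None:
        return False
    allowed = allowed.strip()
    # "*" allows any origin; otherwise the value must match the page's origin.
    return allowed == "*" or origin_of(allowed) == origin_of(page_url)
```

Under this sketch, the console page fetching itself passes, a third-party site that sends no access-control-allow-origin header fails, and a CDN that sends `*` passes.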

@hoodmane

hoodmane commented Oct 8, 2024

>>> from js import fetch
>>> resp = await fetch("console.html")
>>> text = await resp.text()
>>> print(text[:100])
<!doctype html>
<html>
  <head>
    <meta charset="UTF-8" />
    <meta
      http-equiv="origin-tria

@tomchristie
Member

Ah yep, okay...

>>> r = await js.fetch('https://cdn.jsdelivr.net/pyodide/v0.23.4/full/repodata.json')
>>> t = await r.text()
>>> t[:100]
'{"info": {"arch": "wasm32", "platform": "emscripten_3_1_32", "version": "0.23.4", "python": "3.11.2"'

@tomchristie
Member

tomchristie commented Oct 8, 2024

Okay, well this is neat.

Open the pyodide console, then...

>>> import micropip, ssl, js
>>> await micropip.install('httpx')
>>> import httpx
>>> class JSTransport(httpx.AsyncBaseTransport):
    async def handle_async_request(self, request):
        url = str(request.url)
        options = {
            'method': request.method,
            'headers': dict(request.headers),
            'body': await request.aread(),
        }
        fetch_response = await js.fetch(url, options)
        status_code = fetch_response.status
        headers = dict(fetch_response.headers)
        buffer = await fetch_response.arrayBuffer()
        content = buffer.to_bytes()
        return httpx.Response(status_code=status_code, headers=headers, content=content)
>>> client = httpx.AsyncClient(transport=JSTransport())
>>> r = await client.get('https://cdn.jsdelivr.net/pyodide/v0.23.4/full/repodata.json')
>>> r
<Response [200 OK]>
>>> r.json()
{'info': {'arch': 'wasm32', 'platform': 'emscripten_3_1_32', 'version': '0.23.4', 'python': '3.11.2'}, 'packages': {'asciitree': {'name': 'asciitree', 'version': '0.3.3', ...
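One detail worth pinning down before turning this into a real transport: the Fetch API rejects a request body on GET and HEAD, so the options probably want building conditionally. Here's a minimal sketch of that step in plain Python. The helper name `build_fetch_options` is hypothetical and not part of httpx or Pyodide.

```python
def build_fetch_options(method, headers, body):
    """Build a fetch() options mapping, leaving out the body where fetch forbids it."""
    method = method.upper()
    options = {"method": method, "headers": dict(headers)}
    # fetch() raises a TypeError if a GET or HEAD request carries a body,
    # so only attach a body for other methods, and only when it is non-empty.
    if method not in ("GET", "HEAD") and body:
        options["body"] = body
    # Assumption: inside Pyodide this dict would still need converting to a
    # real JS object before being passed to js.fetch, for example with
    # pyodide.ffi.to_js(options, dict_converter=js.Object.fromEntries).
    return options
```

Inside `handle_async_request` this would be called as `build_fetch_options(request.method, request.headers, await request.aread())`.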

@tomchristie
Member

tomchristie commented Oct 9, 2024

Dealing with this incrementally, here are some isolated PRs that I think we should address first…

  • Refactor the import of httpcore so that it’s only loaded if HTTPTransport/AsyncHTTPTransport is instantiated.
  • Refactor the import of certifi in _config.py so it’s only loaded if SSLContext is instantiated.
  • Refactor imports of ssl so that it’s only loaded if SSLContext is instantiated, or is behind a TYPE_CHECKING guard.

(If anyone’s up for tackling these, they currently ought to be against the version-1.0 branch, until that’s merged.)
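For anyone picking these up, the general pattern is roughly as follows. This is a sketch using only the standard library, not the actual httpx code, and `create_ssl_context` is just an illustrative name: the import is visible to type checkers via TYPE_CHECKING, but only executed at runtime when a context is actually built.

```python
import typing

if typing.TYPE_CHECKING:
    # Seen by type checkers only; this branch never runs at runtime,
    # so importing this module does not require ssl to be available.
    import ssl


def create_ssl_context(verify: bool = True) -> "ssl.SSLContext":
    # Deferred import: ssl is only loaded when a context is actually requested.
    import ssl

    context = ssl.create_default_context()
    if not verify:
        # check_hostname must be disabled before verify_mode can be CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

The return annotation is a string so that it is never evaluated at import time. The same shape would apply to certifi and httpcore, with the import moved into the constructor that needs it.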

@tomchristie
Member

Thanks again for your work here @joemarshall.
Here's where I think we're at on this...

  • Lazy load certifi & httpcore
  • Import ssl under typechecking branches.
  • Consider introducing JSFetchTransport().

@joemarshall
Contributor Author

@tomchristie I've now put in the PR that makes import ssl optional (#3385).

I updated this PR so it follows on from that PR.

How this PR works now: it moves _transports.default into _transports.httpcore, which defines [Async]HTTPCoreTransport, and adds an extra module, _transports.jsfetch, which defines [Async]JavascriptFetchTransport. Then _transports/__init__.py adds an HTTPTransport alias that points to whichever HTTP backend is in use (i.e. httpcore by default, JS fetch on emscripten).
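As a rough illustration of the aliasing idea (not the code in the PR), the selection could look something like this. The two classes are empty stand-ins, `select_transport_class` is a hypothetical helper, and checking `sys.platform` is an assumption about how the platform might be detected.

```python
import sys


class HTTPCoreTransport:
    """Stand-in for the httpcore-backed transport."""

    backend = "httpcore"


class JavascriptFetchTransport:
    """Stand-in for the JS fetch-backed transport."""

    backend = "jsfetch"


def select_transport_class(platform: str = sys.platform):
    # Python reports sys.platform == "emscripten" when running under
    # Emscripten (e.g. in Pyodide); everything else uses httpcore.
    if platform == "emscripten":
        return JavascriptFetchTransport
    return HTTPCoreTransport


# The public name resolves to whichever backend fits the current platform.
HTTPTransport = select_transport_class()
```

With an alias like this, code that constructs `HTTPTransport()` keeps working unchanged on either platform.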
