Async support #1672

sla-te · 2023-09-07T12:22:52Z

Thank you for upgrading the lib to support v8.x.

Maybe this is a good time to ask for official async support, as described in https://elasticsearch-py.readthedocs.io/en/v8.9.0/async.html, since it is clearly a feature that many users will want (and it already has quite a history of requests).

I found #1435 which was closed in favor of #1480, which is apparently already 2 years old.

That said, we have been using the following "hack" to get async support in the past (using v7.4.1):

import asyncio

from elasticsearch import AsyncElasticsearch
from elasticsearch_dsl import Search

config = {...: ...}
indexname = ...
es = AsyncElasticsearch(**config)


async def do_es_async_work():
    s = Search(index=indexname).filter(...).filter(...)
    r = await s.execute().to_dict()
    return r

print(asyncio.run(do_es_async_work()))

With the newly released version (v8.9.0), this sadly now throws:

RuntimeWarning: coroutine 'AsyncElasticsearch.search' was never awaited

Looking at the diff between v7.4.1 and v8.9.0, I was able to identify the root of the problem at

elasticsearch-dsl-py/elasticsearch_dsl/search.py

Line 710 in 3073976

self._response = self._response_class(

It seems that wrapping the result of the es.search call in a response object is what broke this hack: when AsyncElasticsearch is used as the connection, body only exists after es.search has been awaited. With the new structure, I do not see a way to build a new "hack" that passes the async call straight through to AsyncElasticsearch, short of using something like loop.run_in_executor. That would add considerable overhead, since every call would occupy a worker thread, which rather defeats the advantage of using async in this scenario, if I understand this correctly.
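For context, the executor fallback mentioned above could look roughly like the sketch below. The names blocking_search and search_in_thread are hypothetical stand-ins and not part of elasticsearch-dsl. Passing None as the executor uses the event loop's default thread pool, so threads are reused across calls, but each in-flight call still holds one worker thread while it blocks.

```python
import asyncio
import time


def blocking_search(query):
    # Hypothetical stand-in for a synchronous call such as Search.execute().
    time.sleep(0.01)
    return {"query": query, "hits": []}


async def search_in_thread(query):
    # Offload the blocking call to the loop's default thread pool executor.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_search, query)


async def main():
    # The three calls run concurrently, each on its own worker thread.
    return await asyncio.gather(*(search_in_thread(q) for q in ("a", "b", "c")))


results = asyncio.run(main())
```

This keeps the event loop responsive, but concurrency is capped by the size of the thread pool rather than by the number of open connections, which is the drawback described above.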


pquentin · 2023-09-07T13:17:03Z

Thank you for trying the new version and thank you for the heads up about async. I'm not yet ready to work on full async support, but what do you think about this totally unsupported hack?

import asyncio

from elasticsearch import AsyncElasticsearch
from elasticsearch_dsl import Search, connections


config = {...: ...}
indexname = ...
es = AsyncElasticsearch(**config)


class AsyncSearch(Search):
    async def execute_async(self, ignore_cache=False):
        if ignore_cache or not hasattr(self, "_response"):
            es = connections.get_connection(self._using)

            async_response = await es.search(
                index=self._index, body=self.to_dict(), **self._params
            )
            self._response = self._response_class(self, async_response.body)
        return self._response


async def do_es_async_work():
    s = AsyncSearch(using=es, index=indexname)
    r = (await s.execute_async()).to_dict()
    return r


print(asyncio.run(do_es_async_work()))

sla-te · 2023-09-07T14:04:59Z

Thank you for the quick reply. Yes, that works of course, but it would also require refactoring our entire code base (across multiple repositories), which sometimes calls execute asynchronously and sometimes synchronously (and yes, I know that's hacky :) ).

pquentin · 2023-09-08T06:29:16Z

Interesting. Can you please show me how you are using this today, with async and sync examples?

pquentin · 2023-09-08T09:54:12Z

I have been thinking about this quite a lot because it could help with the design of the async support. Using body when adding support for Elasticsearch 8 was not an obvious decision, and I'm considering returning the full ObjectApiResponse object, which behaves like a dict, in the next version.

That said, another totally unsupported hack with the current version is:

import asyncio

from elasticsearch import AsyncElasticsearch, Elasticsearch
from elasticsearch_dsl import Search, connections


config = {...: ...}
indexname = ...
async_es = AsyncElasticsearch(**config)
es = Elasticsearch(**config)


class AsyncSearch(Search):
    async def execute(self, ignore_cache=False):
        if ignore_cache or not hasattr(self, "_response"):
            es = connections.get_connection(self._using)

            if isinstance(es, AsyncElasticsearch):
                response = await es.search(
                    index=self._index, body=self.to_dict(), **self._params
                )
            else:
                response = es.search(
                    index=self._index, body=self.to_dict(), **self._params
                )
            self._response = self._response_class(self, response.body)
        return self._response


async def do_work(using):
    s = AsyncSearch(using=using, index=indexname)
    r = (await s.execute()).to_dict()
    print(r["hits"]["total"])


# async
asyncio.run(do_work(using=async_es))

# sync
def run_secretly_sync_async_fn(async_fn, *args, **kwargs):
    coro = async_fn(*args, **kwargs)
    try:
        coro.send(None)
    except StopIteration as exc:
        return exc.value
    else:
        raise RuntimeError("you lied, this async function is not secretly synchronous")


run_secretly_sync_async_fn(do_work, using=es)

It uses two tricks:

- one to await or not, depending on the Elasticsearch client instance;
- one to run async functions like do_work or execute in a sync context, taking advantage of the fact that they are "secretly synchronous" (source).
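Both tricks can be tried in isolation, without an Elasticsearch cluster. The sketch below is only an illustration: FakeSyncClient, FakeAsyncClient and run_secretly_sync are made-up names, and the fake clients return plain dicts instead of real responses.

```python
import asyncio


class FakeSyncClient:
    # Hypothetical stand-in for Elasticsearch.
    def search(self, **kwargs):
        return {"hits": {"total": 1}}


class FakeAsyncClient:
    # Hypothetical stand-in for AsyncElasticsearch.
    async def search(self, **kwargs):
        await asyncio.sleep(0)  # a real suspension point
        return {"hits": {"total": 2}}


async def execute(client):
    # Trick 1: await only when the client is the async one.
    if isinstance(client, FakeAsyncClient):
        result = await client.search(index="demo")
    else:
        result = client.search(index="demo")
    return result["hits"]["total"]


def run_secretly_sync(async_fn, *args, **kwargs):
    # Trick 2: drive the coroutine by hand. If it finishes without ever
    # suspending, StopIteration carries its return value.
    coro = async_fn(*args, **kwargs)
    try:
        coro.send(None)
    except StopIteration as exc:
        return exc.value
    else:
        coro.close()  # avoid a "never awaited" style warning on cleanup
        raise RuntimeError("coroutine suspended; it is not secretly synchronous")


sync_total = run_secretly_sync(execute, FakeSyncClient())
async_total = asyncio.run(execute(FakeAsyncClient()))
```

Calling run_secretly_sync with the async client raises RuntimeError, because the coroutine really does suspend at the first await.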

sla-te · 2023-09-12T10:08:42Z

Hey, sorry for the late reply, here is what I think:

The workaround for the current version is cool and would actually work (temporarily), nice tricks!

When adding async support, I would not follow our rather hacky approach of making a function/method awaitable by passing through the connection object. Instead, I would define a separate module that implements new classes with async support, analogous to how playwright-python handles such a scenario structurally (https://github.com/microsoft/playwright-python/tree/main/playwright). Yes, that will require users (including us) to refactor their code, but attempting to hack sync and async together would simply create too much potential for bugs.

Apart from the above, I would separate out all logic that is independent of async/sync and bundle it in a top-level module, to be used by both the async and the sync module. I would think of a structure like this:

Going with urllib3's idea of using composition and passing in a kind of base class/backend that defines the core logic is certainly a valid approach, but if you ask me, it makes the entire thing more complex internally, and unnecessarily so.
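A minimal sketch of the split described above, written as a single file for brevity; in practice the three classes would live in a shared module, a sync module and an async module. All names here (SearchBase, EchoClient, AsyncEchoClient) are assumptions for illustration and do not reflect how elasticsearch-dsl is actually organized. Unlike the real Search, this filter mutates the object in place instead of returning a copy.

```python
import asyncio


class SearchBase:
    """Logic that does not depend on sync vs. async: building the request."""

    def __init__(self, client, index):
        self._client = client
        self._index = index
        self._filters = []

    def filter(self, **term):
        # Simplified: mutates in place and returns self for chaining.
        self._filters.append({"term": term})
        return self

    def to_dict(self):
        return {"query": {"bool": {"filter": list(self._filters)}}}


class Search(SearchBase):
    # Sync module: only the I/O call lives here.
    def execute(self):
        return self._client.search(index=self._index, body=self.to_dict())


class AsyncSearch(SearchBase):
    # Async module: same logic, awaited I/O.
    async def execute(self):
        return await self._client.search(index=self._index, body=self.to_dict())


class EchoClient:
    # Hypothetical sync client that echoes the request back.
    def search(self, index, body):
        return {"index": index, "body": body}


class AsyncEchoClient:
    # Hypothetical async client that echoes the request back.
    async def search(self, index, body):
        return {"index": index, "body": body}


sync_result = Search(EchoClient(), "demo").filter(status="open").execute()
async_result = asyncio.run(
    AsyncSearch(AsyncEchoClient(), "demo").filter(status="open").execute()
)
```

With this layout, a caller picks sync or async by choosing the class, and never by inspecting the connection object at run time.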

miguelgrinberg · 2024-03-25T17:19:17Z

Also #1355 #1624

miguelgrinberg · 2024-04-03T10:07:46Z

Done in #1714, will be part of the 8.13 release.

miguelgrinberg added the Category: Enhancement label Mar 25, 2024

miguelgrinberg mentioned this issue Mar 25, 2024

Add support of Async calls #1624

Closed

miguelgrinberg mentioned this issue Mar 25, 2024

Add support for async I/O #1355

Closed

2 tasks

pquentin mentioned this issue Apr 3, 2024

async support #1714

Merged

16 tasks

miguelgrinberg closed this as completed Apr 3, 2024
