-
Notifications
You must be signed in to change notification settings - Fork 803
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integration of official async ES client into DSL library #921
Comments
Team, do you have any suggestions please? |
Any help is appreciated |
I'm not sure that your changes would be compatible with Python2. I think a good example to look at is the Python ZMQ bindings; they are structured where all asyncio-related components exist in, and are only importable from, a separate sub-module called See PyZMQ asyncio docs and PyZMQ asyncio code; basically any non-asyncio component has an asyncio-specific counterpart in the So instead of I'm not sure that PyZMQ approach would work here, but I think it's worth investigating it to achieve asyncio support in |
@jaddison Thanks for your response. Let me check on that and get back. 👍 |
@jaddison Thanks for the approach, contemplating on how I arrived at this solution, I did not take Py2 into account. I was just thinking between sync and async users and make it compatible. In fact I was trying my best to make the signature exactly the same, but async def, syntax would not allow that. Anyway, packaging it better is a better though, let me vet it further. Thanks for checking this back. |
@vaidsu - note that I'm not a representative from Elastic, I'm just a user like yourself. So please don't take my comments as the way to get this support merged in. 🙂 I would like to see asyncio support in the library too, but it's up to the Elastic team to give royal assent or not. |
@jaddison still I have at least one person voting an interest 😄 I will try my best to see how we can fit into the import issues. |
I think that the best way to do this would be to actually have a module as part of https://github.com/elastic/elasticsearch-py-async (something like If there are any hooks or changes we need to make in this repo for it to work more cleanly we totally can, but having the code live in the |
@honzakral Thanks for your input,
The changes are pertaining to only few places, Search -> AsyncSearch, FacetedSearch -> AsyncFacetedSearch, Document -> AsyncDocument.
Thanks for your time, please let me know. |
I would be very open to changing the implementation of
The problem with that approach is that it wouldn't allow the code to use syntax and features that is only available in 3.x, thus greatly limiting what could be used and done. A separate library or a module in
I see no reason why this would have to be a plugin - it should be perfectly possible for this module to live outside of the library codebase and extend and wrap the functionality. As you mentioned before it is only several API calls ( Overall I see there is demand for this functionality so I propose a following scenario: Let us start with a set of simple helper methods as a proof-of-concept - simple funtctions like Once we have that code we can have another round of discussion about where should the code live - on its own, in Unfortunately I have neither bandwidth nor expertise to work on this; I am, however, available for any questions or feedback if someone chooses to take this on and remain open to include the resulting code in any of the official libraries if that is what would work best. Does that make sense? |
Also /cc @fxdgear since he is currently maintaining |
@honzakral I can work on that, since I am currently using that aspect of the code integration from my fork at work. And the core heart of our systems are purely async, so we cannot afford sync whatsoever. That said, really appreciate taking your time and provide valuable comments to move forward. Taking your example, You want to start with Current What I have done is., while the changes are very minimal, I abstracted the async methods into its own class called OR
Or should we provide full control to users to take care of save/delete low level APIs on their own? Sure, I think this is doable, only concern is when it comes to async Vs. sync due to its syntax differences, even having an abstract interface may be difficult. Former option seems feasible, how do you want me to proceed to give a try. [EDIT] We may still need to balance out the Python3 and 2 differences in localizing in the same place. Thanks 👍 |
Is there anything that I can do to help make this a reality? Glad to see your progress on this already @vaidsu ! |
@Chaosca whatever code I have posted in the fork is what I am using internally in the internal infrastructure. Due to immediate stake holder MVP demos I did not complete the proper separation of concerns in the code. I might close in the weeks to come. Do you want this feature urgently? Or do you have any ideas to quickly upgrade my fork? |
I'm not sure what is all needed to make it complete. However it's for sure something that I would like to use on a project that I'm working on. Does your changes also use the elasticsearch-async-py wrapper for things that the DSL can't do? For example I use the DSL |
@Chaosca sorry for delay. The idea is to use the same library, but when it comes to async you use the async APIs with async/await. # SYNC
# HelloWorld is the DocType
hw = HelloWorld(
hello='foo',
timestamp=time.time())
hw.save()
hw.update(hello="anotherworld", timestamp=int(time.time() * 1000)) And when it comes to async, the same is reused (by using Picking a snippet from the internal documentation template I wrote with the fork. # ASYNC
async def async_fn(future):
print('Executing faceted search')
# Now lets try a faceted search
for i in ['some', 'cool']:
hw = HelloWorld(hello=i, timestamp=int(time.time() * 1000))
await hw.async_save(using=es, refresh=True)
await hw.async_refresh(using=es)
adv = await AdvancedSearch().execute()
for hello, count, selected in adv.facets.hello:
print('Got {0} for {1} counts'.format(hello, count))
future.set_result('type3') Two main differences to note:
I was given nice options from folks to modify the current state of implementation to slightly different version to attempt to merge. I was also given an option to use as a separate library, but that is difficult considering that most audience come here. Here is the current state. I am sad to not being able to spend much immediately to close this right away, but at least I have the code running internally. Am I clear? Please let me know. |
Thanks for the info @vaidsu Is there a way to access the async es wrapper after initializing the DSL with from elasticsearch_dsl.connections import connections
es = connections.create_connection(["hosts"]) # Way to create the connection with the async wrapper?
hw = HelloWorld(
hello='foo',
timestamp=time.time())
await hw.async_save()
await hw.async_update(hello="anotherworld", timestamp=int(time.time() * 1000))
# Async update?
await es.update(body={
"script": {
...
}
}, index='myindex', doc_type='hello_world', id=hw.meta.id) Since there isn't a way to run update scripts in the DSL right now there are some points where the underlying API has to be used. Seems like your implementation idea is well thought out though and I hope it can be merged in the near future :) |
Now that Python2 is imminently reaching EOL, would a Python3 solution be accepted? If not, what would need to be changed from @vaidsu's solution to make it acceptable? Thanks! ~Timothy |
Closing this as a duplicate of #1355 |
Hi team,
Great effort on contributing and maintaining a great layer on top of ES access.
While there are some discussions around the async integration into the official DSL library like #704 and #556 I did not find any conclusion on that. I did my research and also found there isn't anything existing. While there are some prototypes and suggestions like here I did not see an implementation.
I was presuming, it could be due to two reasons: Either its really not there and no body has raised a PR yet, (or) this could be a trivial implementation so most users took care by overriding the parent classes.
In any case, I needed the integration of the official async python library into DSL. I was experimenting many ways, and something simple yet required to duplicate was to create
async_xyz
classes on top of the low level APIs that were accessing thees
client functions and eventually not awaiting the coroutines from async ES client.That said, here is a very basic working/and tested implementation in my fork master...vaidsu:master
Please consider this as a basic prototype implementation and I just started writing tests. I am not sure, if this is good to go, or not. Then if yes, need to understand how far I need to write tests, I am just planning to make all the sync tests runnable on async calls.
Please let me know. If you like the thought, but have suggestions, also feel free to share -- I am open to contribute back.
Thanks
The text was updated successfully, but these errors were encountered: