-
Notifications
You must be signed in to change notification settings - Fork 236
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow using tokio's AsyncBufRead [Rebased] #314
Conversation
Looks good to me. After we merge, it would probably make sense to extract the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome!
I have a few small questions but I am willing to merge it so we can iterate on it.
I don't really like all the code duplication and I believe there should be a way to be more DRY.
EDIT: could you run a cargo fmt
?
src/utils.rs
Outdated
} | ||
} | ||
pub fn write_cow_string(f: &mut Formatter<'_>, s: &[u8]) -> Result { | ||
write_byte_string(f, s)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nope! good eyes, reverted.
tests/unit_tests.rs
Outdated
loop { | ||
let event = match reader.read_event(&mut buf)? { | ||
Eof => break, | ||
Start(elem) => { | ||
let mut attrs = elem.attributes().collect::<Result<Vec<_>>>()?; | ||
let mut attrs = elem.attributes().collect::<Result<Vec<_>>>().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not keeping the ?
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reverted!
Ran the formatter, couple small cleanups. |
What is this waiting for? |
There is a missing file, @itsgreggreg any idea? |
It's a path joining error, If you look at the error the first separator in the path is a windows style https://github.com/tafia/quick-xml/runs/3644904506?check_suite_focus=true#step:7:5 I don't know what's causing that. |
whelp, here's why https://doc.rust-lang.org/std/macro.include_bytes.html
|
I'm a bit surprised, I have used forward slash in windows quite a lot at work and I never had this issue, I understand the other way may not work (escaping characters) but it doesn't seem to be the case here. |
Upon closer inspection it seems that through the rebase I missed some changes to xmlrs_reader_tests.rs that removed that path but the file itself was still removed. I went through the file and hopefully made sure it was up to date. My bad for rushing to a conclusion when really I should have noticed it was working before. Interesting that the rust docs say paths in include_str and friends have to be OS specific but unix paths work on windows. I wonder if windows paths work on *nix. |
I'm currently building a crate for an XML-based protocol based on this PR's source branch as I need pull-based async XML deserialization for my crate. It'd be really helpful if this could be merged so I can publish my work to crates.io. Is there anything I could help with to get this merged? |
You could use |
@notgull sadly the specifics of the protocol I'm dealing with require me to read from a network socket until I have read one full XML element. As my network socket should be async, I need pull-based XML parsing from an |
What protocol is this? |
It's the Greenbone Management Protocol. A vendor-specific protocol that, as far as I can tell, only has an official Python implementation, which also does nothing more than use a pull-based XML parser to read all received data until one full XML element has been read. Yes, I know it's a bit stupid but I can't really change it. The protocol really just sends XML data over a TLS-encrypted TCP stream without any kind of header or footer with the connection being kept open for further requests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there anything I could help with to get this merged?
@999eagle, Yes, you can. I'd like to see the following changes:
- investigate the possibility to depend on the crate with only
async
trait, without async runtime. Runtime is a quite big thing and I would prefer not to have it because we don't need it. It is fine to have it if there is no other way, but this should be proved - (required) eliminate code duplication either by introducing a trait and different implementations for the sync and async readers, or by using macros
- I prefer to have a
read_event_async
(borrowed version) /read_event_into_async
(copying into provided buffer version) instead of a separate reader, because I want to introduce a dedicated reader for namespace support and do not want to have 4 readers (non-ns/ns + sync/async). Async abilities seems does not needed a dedicated reader. I'm open for discussion, although - I prefer to put all async examples/tests to their own files
- tests for internal functions similar to that should be added:
Lines 2591 to 2599 in 0febc2b
/// Tests for reader that generates events that borrow from the provided buffer mod buffered { check!(&mut Vec::new()); } /// Tests for reader that generates events that borrow from the input mod borrowed { check!(()); }
Preferably if it could be added to that group of tests
async
support definitely won't be in 0.24.0 release, because I plan to release it when I finish with namespaces support which could be soon. 0.25.0 is an option.
Probably it is better to rewrite async
support from scratch using this PR as inspiration
@@ -33,6 +36,7 @@ default = [] | |||
encoding = ["encoding_rs"] | |||
serialize = ["serde"] | |||
escape-html = [] | |||
asynchronous = ["tokio", "async-recursion"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to name it async
@Mingun a few thoughts on your points:
What do you think about those ideas? And if I was to start that work, should it be as a new PR or as a change to this one? I definitely agree with the last two points that async tests should be separate and that more tests are required. |
I've learned a bit about current async support in Rust and it seems that there is no a single crate with just the
Actually, Lines 464 to 465 in 0febc2b
Actually, I had already planned to remove that bound from another places but doubted would that will provide a better ergonomic, but now I have arguments. Perhaps I will make a separate PR with this (and some other) change (#412). Sync and async readers have the same set of fields, and it is safe to read one event via sync interface and other via async interface -- this would not break anything. So all that we need are a separate // sync methods
impl<R: BufRead> Reader<R> { /*...*/ }
// async methods
impl<R: AsyncBufRead> Reader<R> { /*...*/ } We even can have a native implementation for each async library, using different So I think, the separate reader is not necessary. For comparison's sake, namespaced reader have additional fields, so a separate
Yes, it makes sense to split reading from parsing where possible. Small code duplications are allowed or may be hidden behind macros.
I think it is better to start a new PR -- some parts of code was changed significantly since this PR was created; I also prefer to have a clean history without fixup commits |
Okay, I've started working on this (see the linked PR #417 which I've opened to allow for discussion). For now, I had to use an Additionally, separating implementations based on trait bounds (that is, |
Closing the old PRs in favor of #417 |
This is a rebase of this PR #233 on top of the current master.
I don't know if there was a better way to do this and my intention is not to step on anyone's work but rather keep it moving along. I'm happy to take this PR down if you want to update yours @endor or honestly whatever works for people. This PR is a proper rebase of the endor/async on top of tafia/master so line by line credit is still in tact except for the few places I had to fix rebase conflicts.
@endor, @tafia happy to help with whatever y'all need help with here, lemme know!