Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow a user to stay within BStr / BString types #117

Closed
epage opened this issue Jul 11, 2022 · 7 comments
Closed

Allow a user to stay within BStr / BString types #117

epage opened this issue Jul 11, 2022 · 7 comments
Labels
question Further information is requested

Comments

@epage
Copy link

epage commented Jul 11, 2022

In considering feedback on my use of bstr for 1.0 (see #40), one annoyance that stood out to me is having to convert things back to a BStr when doing transformations.

In the ideal world, the output type would match the input type.

@epage epage mentioned this issue Jul 11, 2022
@epage
Copy link
Author

epage commented Jul 11, 2022

From #40 (comment):

Right yeah. That's what bstr 0.1 was: https://docs.rs/bstr/0.1.4/bstr/

@epage
Copy link
Author

epage commented Jul 11, 2022

From #40 (comment):

@epage See #5 and #8 for more details on why bstr 0.1 -> 0.2 went from concrete BString/BStr types to extension traits on Vec<u8>/[u8]. It is slightly sad that if you have a BStr and call an extension trait method that returns a subslice of it that you get a &[u8] back instead. I'm not sure there is any simple machinery that would let me fix that while maintaining the existing extension trait API on [u8].

Happy to answer more questions on that but would prefer opening a new issue for it if you have more questions.

@epage
Copy link
Author

epage commented Jul 12, 2022

Looking over #5, it seems the main reasons for the switch are:

  • More universal interop since existing APIs work with &[u8] / Vec<u8>
  • Encouraging use of &[u8] / Vec<u8> in public APIs
  • Reduce boiler plate in forwarding calls to the underlying types

@epage
Copy link
Author

epage commented Jul 12, 2022

I suspect we could make the trait reference Self and a borrowed version of Self to allow the u8 version to work with more u8s and the BStr version to work with BStr. This will require a lot of boilerplate but gets us into a middle ground between 0.1 and 0.2. I'd be willing to prototype this if there is interest.

However, I'm assuming this won't be worth it but wanted to mentioned this as one last possibility before ending the conversation. If there isn't interest, feel free to close this issue.

The one reason it might be worth doing is dependent on the question of what this crate would look like merged into core and alloc and if we wanted to move this crate more into that direction now rather than later.

CC @joshtriplett since you have expressed interest in bstr going into the stdlib in #40 (comment)

@BurntSushi
Copy link
Owner

BurntSushi commented Jul 12, 2022

Right. I had used bstr 0.1 in anger and the explict BStr/BString types were just really annoying. The constant converting back-and-forth was super painful and led to a lot of noise in the code. The extension traits have been a lot nicer to use in practice in my own experience.

@epage If you're referencing Self everywhere, then I think that also requires infecting almost every iterator type with a generic type, which is annoying. And note that some iterators (like graphemes()) specifically yield &str and not &[u8] nor &BStr.

In terms of whether this is worth exploring or not, I'm not sure. bstr 0.2 has been sitting around for a long time and folks are (rightfully) looking for a 1.0 release that is stable. I think that is ultimately my priority. Rejiggering the entire API to make BStr a little more convenient would likely mean that I either can't do a 1.0 release now (instead it would be a 0.3), or I would have to do a 2.0 release much sooner than I hoped. So I will say I don't have a ton of appetite for this.

I'm not quite sure bstr is ready to make the jump to core to be honest. I don't know exactly how many people are using it, but it feels like the answer is probably "not that many." Maybe a less resistant path might be to just start adding more "text-like" methods on [u8] and see where that goes. Substring search would be a great start, for example.

The issue of whether to bring the BString/BStr types themselves into std is a thorny one. It is "yet another string type," for example. However, they are really one of the key motivating reasons for why I authored bstr in the first place. I was just sick and tired of the default Debug impl on &[u8]. It is of course the correct Debug impl, but given where I work, I almost always look at &[u8] as a string rather than just some bytes. And so, I want to print it as a string.

@epage
Copy link
Author

epage commented Jul 12, 2022

Fully understandable

The issue of whether to bring the BString/BStr types themselves into std is a thorny one. It is "yet another string type," for example.

Agreed and have been curious what Josh has had on his mind for this (whether he expects to loosen OsStr enought to be a BStr or something else).

@BurntSushi
Copy link
Owner

I'm going to close this out based on my reasoning above. I think finding a middle ground here will ultimately lead to a much more complex API with small benefits. I acknowledge they might be non-zero, but I think most uses of bstr are probably just going to fully stay in &[u8] territory with occasional (if any) conversions to BStr.

@BurntSushi BurntSushi closed this as not planned Won't fix, can't repro, duplicate, stale Sep 2, 2022
@BurntSushi BurntSushi added the question Further information is requested label Sep 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants