Capitalize package name #5

tkelman · 2016-01-11T20:27:35Z

If you plan on registering this. Otherwise it will sort at the very end. The only other non-capitalized package name is kNN, and that is deprecated and doesn't have any tags.
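
For illustration, a minimal Python sketch of the sorting point, assuming the listing uses a plain case-sensitive (code point) sort. Apart from iconv and kNN, the package names below are only placeholders for the example:

```python
# Placeholder package list; only "iconv" and "kNN" come from the discussion.
names = ["iconv", "DataFrames", "kNN", "Zlib", "ArgParse"]

# Case-sensitive sort: every uppercase letter orders before any lowercase one,
# so lowercase-initial names end up at the very end.
case_sensitive = sorted(names)

# Case-insensitive sort: names are compared after lowercasing.
case_insensitive = sorted(names, key=str.lower)

print(case_sensitive)    # ['ArgParse', 'DataFrames', 'Zlib', 'iconv', 'kNN']
print(case_insensitive)  # ['ArgParse', 'DataFrames', 'iconv', 'kNN', 'Zlib']
```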

nalimilan · 2016-01-11T21:30:04Z

Yeah, I wasn't sure about that. Sorting isn't a real issue, but consistency is. The name of the interface is really iconv, so... What else would you suggest, Iconv or IConv?

nalimilan · 2016-01-11T21:30:50Z

We could also imagine giving it a more evocative name, like StringEncodings, StringEncoders...

tkelman · 2016-01-11T22:35:58Z

Case sensitive scripts might also miss the package - potentially PkgEval or others.

If the I stands for something I guess IConv would be better than Iconv, but a more descriptive name without referring to the library name jargon would work too.

ScottPJones · 2016-01-12T14:23:04Z

Either StringEncodings or StringEncoders sounds fine to me, much more general, and if you end up using something besides the iconv libraries to implement the encoding/decoding in the future, or for different platforms, would be more accurate.

nalimilan · 2016-01-12T14:52:22Z

The question is: if at some point we want to write a pure-Julia implementation, will we switch the package to use it, or create another package exporting the same API?

BTW, @ScottPJones, when you feel like writing something in Julia, I've discovered this MIT-licensed Node.js package which sounds like a good base to start from: https://github.com/ashtuchkin/iconv-lite/

ScottPJones · 2016-01-12T16:13:18Z

I already have some structures/code in Julia that I'll start benchmarking against iconv & ICU, right now just for all the 8-bit mappings, as well as some ideas on efficiently doing some different mb <-> Unicode codecs to handle the rest. I'd love to have Julia get to the point where its character set conversions are consistent across all platforms, along with all of the nice features that Python 3.5 supports.

ScottPJones · 2016-01-14T12:51:41Z

@nalimilan So as not to continue off-topic here, I've created another issue to discuss performance / using pure Julia, #8
The link you sent for the Node.js package has some useful bits, like the method of representing the tables as JSON, those at least can be used as input to produce compact binary tables to load for handling multi-byte character sets.
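
As a rough illustration of turning a JSON table into a compact binary one, here is a Python sketch. The JSON layout (byte value as key, Unicode code point as value, unlisted bytes mapping to themselves) is a made-up assumption for this example and is not the format iconv-lite actually uses. It covers only an 8-bit table with code points in the BMP; multi-byte character sets would need a larger structure.

```python
import json
import struct

# Hypothetical JSON layout: byte value -> Unicode code point, listing only
# bytes that do not map to themselves. 0xA4 -> U+20AC (euro sign) is the
# mapping used by ISO-8859-15; the table is otherwise a toy.
TABLE_JSON = '{"164": 8364}'


def build_table(table_json: str) -> bytes:
    """Pack a 256-entry mapping into 512 bytes (little-endian uint16 each)."""
    overrides = {int(k): v for k, v in json.loads(table_json).items()}
    return struct.pack("<256H", *(overrides.get(b, b) for b in range(256)))


def decode_8bit(data: bytes, table: bytes) -> str:
    """Decode single-byte data by looking each byte up in the packed table."""
    codepoints = struct.unpack("<256H", table)
    return "".join(chr(codepoints[b]) for b in data)


table = build_table(TABLE_JSON)
print(len(table))                      # 512
print(decode_8bit(b"5 \xa4", table))   # 5 €
```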

ScottPJones · 2016-01-20T00:42:27Z

@nalimilan Would you be up for a PR to change the name from iconv.jl to StringEncoders.jl?
I think, besides Tony's considerations of not wanting all lower case, that it really deserves to be a nice generic package.

nalimilan · 2016-01-20T09:32:55Z

So, StringEncodings or StringEncoders? I would think "encoding" is the name most people will be looking for.

ScottPJones · 2016-01-20T14:07:10Z

I was leaning towards StringEncoders for this, because I'd seen in the past, if you have a module or package Foobars, it contains a type (or function) Foobar (and this has StringEncoder & StringDecoder).
Also, I think StringEncodings might be a more suitable name for a package/module with a parameterized StringEncoding type.
(I'd like to make one with traits to handle different encodings: little-endian, big-endian, and/or native-endian vs. opposite-endian, 8-bit, 16-bit or 32-bit codeunits, linear indexed vs. not, Unicode or not, etc.)

nalimilan · 2016-01-21T10:22:46Z

> I was leaning towards StringEncoders for this, because I'd seen in the past, if you have a module or package Foobars, it contains a type (or function) Foobar (and this has StringEncoder & StringDecoder).

Yeah, but then we could name the package either StringEncoders or StringDecoders. The former isn't great because in many cases people will simply be looking for a way of reading a text file, and won't think about encoding anything.

> Also, I think StringEncodings might be a more suitable name for a package/module with a parameterized StringEncoding type.
> (I'd like to make one with traits to handle different encodings (little-endian, big-endian, and/or native-endian vs. opposite-endian, 8-bit, 16-bit or 32-bit codeunits, linear indexed vs. not, Unicode or not, etc.)

If such a type proves useful, why wouldn't it live in this package instead of elsewhere? Couldn't it be used to speed up conversions?

ScottPJones · 2016-01-21T13:26:00Z

StringDecoders would also be fine, maybe even StringConvert, which works both ways?
StringEncodings would be useful as the basis for a new, more efficient parameterized String
type, such as in https://github.com/quinnj/Strings.jl, so I think it would be best to be separate,
at least for now.

ScottPJones · 2016-01-21T13:30:07Z

Make that StringConverters maybe instead.

nalimilan · 2016-01-21T21:58:45Z

Sorry, but try typing "string converter" in your preferred search engine, and compare to "string encoding". The latter clearly better reflects the goal of the package.

If you create a package for encoded strings, why not call it EncodedStrings.jl? It would logically complement StringEncodings.jl.

ScottPJones · 2016-01-21T23:23:21Z

OK, fine, I'll run up a PR to change this to StringEncodings.jl if that's your favorite.
I'm thinking it deserves to not be hidden under the "iconv" name.

nalimilan · 2016-01-22T09:49:07Z

No worries, I've just done the rename. Now we need to decide what's needed before we tag a release.

ScottPJones · 2016-01-22T12:51:44Z

Great!
The thing that I feel might be needed (but could be added as an enhancement later) is an API for specifying the different strategies for handling invalid input, as we'd discussed before.
I think it is critical that it be done in such a way as to preserve optimal performance, so I'd recommend against keywords or using symbols to select.
I think an extra positional argument, that might take: a character, a string, or a type, might work best.
Default behaviour would be as now, to raise an exception.
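
To make the shape of that proposal concrete, here is a Python sketch of an extra positional argument that accepts either a replacement string or a strategy type, with raising as the default. The names `decode`, `Raise` and `Skip` are hypothetical and are not this package's API, and Python's built-in codecs stand in for iconv. The performance argument about keywords and symbols is specific to Julia's dispatch and is not reproduced here.

```python
class Raise:
    """Hypothetical strategy type: raise on invalid input (the default)."""


class Skip:
    """Hypothetical strategy type: silently drop invalid bytes."""


def decode(data: bytes, encoding: str, on_invalid=Raise) -> str:
    # Default behaviour: strict decoding, which raises UnicodeDecodeError.
    if on_invalid is Raise:
        return data.decode(encoding)
    if on_invalid is Skip:
        replacement = ""
    elif isinstance(on_invalid, str):
        replacement = on_invalid  # a single character or a longer string
    else:
        raise TypeError("on_invalid must be a str, Raise or Skip")

    out = []
    pos = 0
    while pos < len(data):
        try:
            out.append(data[pos:].decode(encoding))
            break
        except UnicodeDecodeError as err:
            # err.start/err.end are offsets into the slice data[pos:].
            out.append(data[pos:pos + err.start].decode(encoding))
            out.append(replacement)
            pos += err.end
    return "".join(out)


print(decode(b"ab\xffcd", "utf-8", "?"))   # ab?cd
print(decode(b"ab\xffcd", "utf-8", Skip))  # abcd
```

The retry loop re-decodes the remaining input after each error, so it is only meant to show the calling convention, not an efficient implementation; it also assumes a stateless encoding such as UTF-8.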

nalimilan · 2016-01-22T13:39:22Z

This is definitely an improvement that can be added later, without breaking anything. I only wonder whether there are things that would need to be done immediately. For example, more efficient versions of encode/decode which do not create StringEncoder/StringDecoder objects just to destroy them one second later.

ScottPJones · 2016-01-22T13:47:43Z

I had been wondering about speeding things up by using a Dict to cache those. What do you think?

nalimilan closed this as completed Jan 22, 2016
