Emoji data for Scribe apps #14

Closed
2 tasks done
andrewtavis opened this issue Aug 18, 2022 · 41 comments
Labels: data, feature, good first issue, help wanted

Comments

@andrewtavis
Member

Terms

Description

Now that Scribe-iOS is adding autosuggest in #194 and autocomplete in #188, other additions for these features could be considered, including emojis. This feature would add Python scripts that build arrays of words and the emojis that can represent them. When one of these words is entered, the emojis could then serve as autosuggestions or completions (depending on whether the user has pressed space or not, respectively).

These scripts would use the emoji package to get the names of emojis as words in the given languages, and nltk for stemming (not lemmatizing) words, e.g. to derive smile from smiling. Other language packages could be leveraged to derive adjective forms like smiley, and to map words like types of trees (aspen, birch, etc.) to "tree" so they can also trigger emojis (likely another issue).
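A minimal sketch of the idea above, assuming a small hand-written emoji sample and a toy suffix stripper standing in for a real stemmer such as one from nltk. The names EMOJI_NAMES, toy_stem, build_trigger_map and suggest are hypothetical and not part of Scribe-Data:

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical sample data: emoji -> words that describe it.
EMOJI_NAMES: Dict[str, List[str]] = {
    "😀": ["smiling", "face"],
    "🌳": ["tree"],
}


def toy_stem(word: str) -> str:
    """Rough stand-in for a real stemmer: strips one common English suffix."""
    word = word.lower()
    for suffix in ("ing", "es", "s", "e"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_trigger_map(
    emoji_names: Dict[str, List[str]], stem: Callable[[str], str]
) -> Dict[str, List[str]]:
    """Map each stemmed word to the emojis it could trigger."""
    triggers: Dict[str, List[str]] = defaultdict(list)
    for emoji, words in emoji_names.items():
        for word in words:
            key = stem(word)
            if emoji not in triggers[key]:
                triggers[key].append(emoji)
    return dict(triggers)


def suggest(
    typed_word: str, triggers: Dict[str, List[str]], stem: Callable[[str], str]
) -> List[str]:
    """Stem the typed word the same way and look up candidate emojis."""
    return triggers.get(stem(typed_word), [])
```

Because both the stored names and the typed word go through the same stemming function, "smile" and "smiling" end up on the same key. Swapping toy_stem for a proper stemmer should only require passing a different callable.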

Contribution

I'd be happy to work with someone on this once the features in Scribe-iOS are finished :)

The corresponding Scribe-iOS issue for this is #51, which would need to be worked on in the foreseeable future for this issue to make sense.

@andrewtavis andrewtavis added feature New feature or request help wanted Extra attention is needed good first issue Good for newcomers blocked Another issue is blocking data Relates to data or Wikidata labels Aug 18, 2022
@andrewtavis
Member Author

Note that the final version of this would be included in update_data.py so that changes to the scripts would be reflected in the apps.

@andrewtavis
Member Author

andrewtavis commented Aug 18, 2022

Adding some easter eggs into this feature could also be a nice touch. These might be best as just autosuggestions and not completions. Some ideas are:

  • Wikidata: 💚 and 📁
  • Wikipedia: ❤️ and 📚
  • Wikimedia: 🌐 and 📖
  • Scribe: 💙 and ✏️

@wkyoshida
Member

Hey @andrewtavis - I'd love to help out with this!

I presume that this is unblocked now with the completion of scribe-org/Scribe-iOS#194 and scribe-org/Scribe-iOS#188 - would that be correct?

Looking into it, I had some initial thoughts (mostly regarding the source for emoji data):

  • Maybe I overlooked something, but would emoji be able to really only provide Scribe the most (or two most) common "short name"[1] perhaps?
  • In its README, emoji links to EmojiTerra, which could be an interesting resource in that it has both "short names" and "keywords"[2] for emojis in multiple languages. I'm thinking that "short names" and "keywords" could be leveraged as the "trigger words" [3].
    • ✔️ With data for multiple languages, EmojiTerra could be useful for implementation across languages.
    • ❌ However, after some browsing, my feeling was that it does lack perhaps in the quantity of "keywords" per emoji. For instance, 🏃 could perhaps also be associated with the word "exercise".
  • I did come across emojilib though, which has an interesting emoji-en-US.json with a larger quantity of "keywords".
    • ✔️ The larger amount of associated "keywords" could translate into more "trigger words" that could be used.
    • ✔️ The project looks like it has a neat contribution process to add/edit/delete "keywords".
    • ❌ The project appears to only have US English presently. Perhaps the Scribe translation data could be used to get the equivalent in other languages, but literal translations are likely not ideal across-the-board.

With all that said, I'd love to continue discussions on this. Curious on further thoughts!

P.S. I do like the idea of using nltk for working with word stems 👍


Clarification on some terminology above:

  • [1] short name: (or shortcode) a shorthand qualifier used to insert an emoji in supported applications, e.g. in GitHub, the short name :+1: renders to 👍
  • [2] keywords: words that have association with the emoji, e.g. the 🏃 emoji (:running:) also has the keyword "marathon" associated with it in EmojiTerra
  • [3] trigger words: words that would prompt an emoji as an option for the auto-suggestion or auto-completion Scribe functionality

@andrewtavis andrewtavis removed the blocked Another issue is blocking label Oct 14, 2022
@andrewtavis
Member Author

Hey @andrewtavis - I'd love to help out with this!

Hey @wkyoshida, it would be great to get your help with this, and as always your research has already helped! 😊 Thanks for your efforts so far - especially all the formatting that makes it quite easy to follow 🙏

I presume that this is unblocked now with the completion of scribe-org/Scribe-iOS#194 and scribe-org/Scribe-iOS#188 - would that be correct?

This is now unblocked, correct 😊

  • Maybe I overlooked something, but would emoji be able to really only provide Scribe the most (or two most) common "short name"[1] perhaps?

Would be something we'd need to investigate, but there could be a way to do this 👍

  • ❌ However, after some browsing, my feeling was that it does lack perhaps in the quantity of "keywords" per emoji. For instance, 🏃 could perhaps also be associated with the word "exercise".

I agree with the positive point, and would say that for now let's not worry too much about the edge cases in the planning phase :) There could be a way to only use keywords for very popular emojis or to subset in a logical way as we don't need autosuggestions for all emojis necessarily. With Scribe the big thing is to keep single edge cases that we're dealing with to a minimum. There might even be some metadata on emojis on Wikidata that we could use 💡🤔 Big thing is that keywords are going to be very necessary for some, as your example of :+1: for 👍 indicates, as that would really only be useful if it's linked to good and the language equivalents.

  • ❌ The project appears to only have US English presently. Perhaps the Scribe translation data could be used to get the equivalent in other languages, but literal translations are likely not ideal across-the-board.

I'd say let's limit ourselves to emoji and NLTK for now so that we're not overcomplicating it all :) :) This would definitely be something to look into if multilingual support grows or when Scribe adds the sought after English keyboard. I'd also sadly suggest that we not use the translations here for anything too serious. The whole point of those now is to get the functionality in, as it's very much beta. I'm in talks with folks/the community at Wikidata to try to get translations in, and when that happens we'll be able to really make use of them. As of now they're machine translations from 🤗 Hugging Face, and with that have their problems :)

Where to from here? 😊 I'm away for the weekend, just FYI, but if you wanted to start exploring the generation of an emojis.json file you'd be welcome. I'd also be happy to set up a base file as well, but you'd also be welcome to add gen_emoji_triggers.py to src/scribe_data/load. To me it makes sense that this functions like autosuggest where we'd work on functions that are loaded into Jupyter right now, as seeing the outputs is going to be very important :)

@andrewtavis
Member Author

Let me know what your preferred involvement level on this would be, @wkyoshida 😊 Am slamming another project atm, but would be happy to set up some base codes for this in the coming days :)

@wkyoshida
Member

Hey @andrewtavis! I meant to reply earlier, but wanted to focus on the keyboard variants discussion first as that was higher priority. I can definitely help with the development for this though!

Resuming:

Would be something we'd need to investigate, but there could be a way to do this 👍

For some added clarification, my initial point, perhaps, was more so cause the data file that the emoji project makes use of appears to be limited to only 1 "short name" per human language (some emojis have 2 for English). Browsing the file, to me it felt a little limiting if just going off of emoji, since, as a user, some words that I'd expect would trigger an emoji, would not. On the other hand though, if we look to really just get an MVP out with this issue, then I think that it'd probably be fine. Trigger word improvements could happen over time.

There might even be some metadata on emojis on Wikidata that we could use 💡🤔

👍 One reason why EmojiTerra piqued my interest was the good amount of data that it has, including for multiple languages. It also has more "keywords", which I thought could make it a little less limiting than only going off of emoji. With that said, one thought that I had was that Scribe could try reaching out to the EmojiTerra team even (their contact) and asking if they'd be interested in helping populate Wikidata. No idea if they would, but just a thought.

I'd also sadly suggest that we not use the translations here for anything too serious.

Completely agree. My earlier note was added more as a potential path, but I too think that the Scribe translation data shouldn't be used for this. As you pointed out, it is still very much in beta.

would be happy to set up some base codes for this in the coming days :)

Sounds great! That'd be welcomed. And again, I can definitely take it from there and help with development 👍 🚀

@andrewtavis
Member Author

Great to talk to you about all this, @wkyoshida 😊

On the other hand though, if we look to really just get an MVP out with this issue, then I think that it'd probably be fine. Trigger word improvements could happen over time.

Let's start down this path, and maybe we'll discover that there are ways of expanding it that are easier :) This tends to be how things work around here, as we planned to just do autocomplete, and then it snowballed into that with autosuggest and everything else that v2.0 had ☃️😅

With that said, one thought that I had was that Scribe could try reaching out to the EmojiTerra team even (their contact) and asking if they'd be interested in helping populate Wikidata. No idea if they would, but just a thought.

The detail you put into all this is much appreciated 🤝 Looking into EmojiTerra makes sense :) Let's definitely consider this as the main source for the extra meanings, and reaching out to them about a collaboration or working with Wikidata would also be great! Is there an effective way that we can access the data on EmojiTerra, or would the hope be that a migration to Wikidata happens? I can reach out to the Wikidata community about emojis too as they might be willing to help with a WIP query :)

Sounds great! That'd be welcomed.

I'll work on the base file for you this weekend! Hope your week is going well so far 😊

@wkyoshida
Member

This tends to be how things work around here, as we planned to just do autocomplete, and then it snowballed into that with autosuggest and everything else that v2.0 had ☃️😅

😆 Yeah - alright, I think that sounds fair. Let's go with an MVP, and improvements can then happen over time.

Is there an effective way that we can access the data on EmojiTerra, or would the hope be that a migration to Wikidata happens?

That would be the gap, I think. I'm not sure I saw an easy way to access either the data or the source code for EmojiTerra. I was hoping a dump to Wikidata could be done, so Scribe could maintain its "powered-by-Wikimedia" status throughout the data it uses. As a last resort, of course, Scribe could always just scrape EmojiTerra, though I don't think that would be the ideal path.

I can reach out to the Wikidata community about emojis too as they might be willing to help with a WIP query :)

That would be awesome! 🙌

Hope your week is going well so far 😊

Oh for sure - thanks! Hope yours is as well ✌️

@wkyoshida
Member

Hey @andrewtavis, given the discussions had in scribe-org/Scribe-iOS#89 (mainly referring to the idea of using a DB on the back-end), should development for this issue hold off a bit? If the data structure/storage changes, I'm thinking whatever scripts get created would have to be modified anyways (which could likely be true for all other data extraction scripts). As this issue isn't high priority though, I was thinking it could perhaps wait (at least until it's determined if the data structure will change). Curious what your thoughts are on this as well.

However, I am also thinking that Scribe could still follow through with trying to contact the EmojiTerra team - to at least get a little ahead. It could happen that by the time Scribe is ready to go develop on this issue, the data is already in Wikidata! 🙌 One can dream! 😆 If you'd like to delegate it off, I could give it a shot to contact EmojiTerra. It could also make sense to have yourself do it though as the main Scribe dev and representative. Just let me know!

@andrewtavis
Member Author

I think it does make sense to hold off on this for now, but reaching out to them could still be beneficial in the short term :) Do you want to reach out to me in my email that's in my profile about that? We can draft an email and I can CC you in it? I think it would make sense that it comes from me, but I'd be happy to loop you in 😊

Btw: new feedback from one of my coworkers who tried Scribe today is that he'd like QWERTY for all keyboards, including German. Looks like it'd make sense to have that option available for all keyboards :) :)

@andrewtavis
Member Author

@wkyoshida, scribe-org/Scribe-iOS#241 is the issue opened by a coworker about the keyboard layouts :) Well done predicting the importance of this 👏😊

@wkyoshida
Member

wkyoshida commented Oct 24, 2022

Do you want to reach out to me in my email that's in my profile about that?

Sounds great, @andrewtavis! I sent you an email just now. Let's draft something up 👍

Btw: new feedback from one of my coworkers who tried Scribe today is that he'd like QWERTY for all keyboards, including German. Looks like it'd make sense to have that option available for all keyboards :) :)

🚀 I've always just sucked it up with whatever layout there was 😆, but if Scribe is able to provide the flexibility that comes with customization, I think that absolutely helps with those working in multiple languages.

One interesting thing though is how this might play out with Scribe-Desktop. Reason being that, with Scribe-iOS and Scribe-Android, changing the keyboard layout can more easily be done since the keyboards are virtual. Scribe-Desktop, likely in most cases, will be dealing with physical keyboards. The bindings can be changed of course, but the physical printed characters, not so much. I still think Scribe should provide the customization options, but the need or demand for it might not be the same as in Scribe-iOS or Scribe-Android. Little sidetrack, but just a thought I had.


Edit: To clarify, I think the ability to customize Scribe-Desktop keyboards to all use the same layout would be beneficial though, as the coworker already exemplified this in scribe-org/Scribe-iOS#241:

I usually like to have all the keyboards on my phone in the same QWERTY layout.

I'm thinking this would be even more so necessary in Scribe-Desktop, because the physical layout can't change. The "need or demand" I referred to earlier was more to the ability to customize to have different layouts per Scribe-Desktop keyboard. Made an edit above as well to wording from "could" to "should", because I think Scribe would have to consider the physical constraints of Scribe-Desktop.

@andrewtavis
Member Author

Sounds great, @andrewtavis! I sent you an email just now. Let's draft something up 👍

Will write back later in the week, @wkyoshida 😊🚀

I still think Scribe should provide the customization options, but the need or demand for it might not be the same as in Scribe-iOS or Scribe-Android. Little sidetrack, but just a thought I had.

Glad that you're keeping Scribe-Desktop in mind! Yes it'll definitely be more about keyboard shortcuts, so this won't be as much of an issue at first, but then for that the speed of it all will be important as it's easy enough to have Google Translate open or go to a website, so we need to give those instantaneous results and keep people in their workflow 💪

Edit: To clarify, I think the ability to customize Scribe-Desktop keyboards to all use the same layout would be beneficial though

You bring up a good point here. Some people are using their keyboards with keys that are not exactly mapped to their keyboards - specifically our users working in a foreign country 😊 Wouldn't we be reading in the characters that the user is typing rather than the keys that they're pressing though? Not sure 🤔 Your thoughts on this would be very welcome :)

@wkyoshida
Member

Will write back later in the week

Sounds good!


I also replied to the points made here about Scribe-Desktop over on an issue on that repo instead, this discussion. Wanted to make sure we didn't derail too much from the discussion here about the emoji data 😅

@andrewtavis
Member Author

andrewtavis commented Oct 28, 2022

Checking in with you here as I said via email, @wkyoshida, as it makes sense to keep it in GitHub if we won't reach out to EmojiTerra. Looking into Unicode's and EmojiTerra's sources a bit more, I came across Unicode CLDR, and specifically their annotations and cldr-json files.

That looks to be everything we'd want, but then we'll need to check licensing and figure out if there's a good endpoint for all this 😊 Makes sense that there's someplace in code that all this lives :) :)

Let me know what your thoughts are on using the above!

@wkyoshida
Member

Ah-ha! Alright - great find, @andrewtavis 🙌

Eventually I was able to get there also 😆 but I found Unicode's stuff perhaps not as easy to navigate/browse through imo (not sure if you felt the same). Had to move around subdomains under unicode.org to find things.

Anyways - I think those make sense! There does appear to be the ICU, which has official C/C++ and Java libraries that could be used to work with the CLDR data. There is also a list of related wrapper projects for other languages that Unicode links to as well. We could look into those.

@andrewtavis
Member Author

Ya the Unicode stuff was very hard to get through, @wkyoshida 😅 I was at one point trying to put /de/ into parts of the web domain in order to try and find something German 🤣

The ICU looks like a good path going forward :) We can maybe look first into pyicu to see if it can work. I see from your profile that you have C/C++ and Java experience, but maybe trying to keep it all under one language right now would be best?

@wkyoshida
Member

I was at one point trying to put /de/ into parts of the web domain in order to try and find something German 🤣

😆

maybe trying to keep it all under one language right now would be best?

Oh yes! I would agree. I didn't mean to sound like I was leaning for the C/C++ or Java ones, if it did 😆 Only mentioned them, since they appear to be the officially supported ones by Unicode. I was thinking that the Python wrapper would make sense as well. It does look like it is actively maintained, and according to this statement, I would hope pretty on-par with whatever is available in the official libraries? Keeping with Python would make sense as that's what the other scripts in Scribe-Data already use.

@andrewtavis
Member Author

So we have ourselves a decision on this! 😊 Fantastic. Let's read through the docs, and I'll make us a notebook to work from tomorrow :)

@wkyoshida
Member

Hey @andrewtavis, real quick just to clarify something for me.

Also sharing some context from our quick email chat just so it's here for others to understand:

I reached out to a contact within the Wikidata community and he agreed that the Wikidata support for emojis in other languages is pretty weak, but then he suggested we go directly from Unicode in this case as they have the endpoints for people to use.

From the above, was it suggested that Scribe could use Unicode directly for the emoji data or that Scribe could use Unicode to fill in the gaps in Wikidata (so Scribe is then able to later use the data from Wikidata also)? Mostly asking, because I'm thinking that, with the former, implementing something now could still run into that earlier point that we discussed of potentially having to modify the data extraction scripts later. That would be due to a possible change in the data solution. The question then, I guess, could be: Is the emoji functionality a priority enough today that Scribe is okay with perhaps dealing with data extraction rework later? Or is there enough reason to first more concretely discuss the long-term strategy for the data solution's back-end? We can have discussions in scribe-org/Scribe-iOS#89 as well, wherever the topic makes the most sense. For transparency, I don't mean to sound definitively opposed to working on emoji data atm 😆 more so just trying to better understand in terms of priority and roadmap.

@andrewtavis
Member Author

The suggestion was that we just go directly from Unicode. This would be a mass upload, and this is something that the Wikidata community is very much against. We'd need to be willing to administer the data that was added at the very least, but even then it likely wouldn't get their blessing as the content to admin ratio would be skewed by this.

I'd say we still do something now, and we can switch it over to Wikidata later when there's more support. Or maybe we just won't. There's no necessity that we have to use Wikimedia for everything :) With there being such a universal solution for working with Unicode data, it could be years until it's properly implemented in Wikidata.

Thanks for checking on this! Will get to that script for us to explore a bit in the next few days 😊

@wkyoshida
Member

Sounds good!

Gotchu - that makes sense to me 👍
One other point that I was also more wondering in regards to:

Or is there enough reason to first more concretely discuss the long-term strategy for the data solution's back-end?

is on the Scribe data-hosting side - related to the discussion we were having on using GitHub vs Toolforge vs something else, etc. I was more referring, I guess, to the idea of potentially leveraging a DB on the back-end side also as opposed to JSONs in the Scribe-Data repo as the centralized location where Scribe data is hosted. Are you thinking of already going forward with getting the emoji data before that is decided? I'm thinking that would be fine; just wanted to throw the following in for consideration as well.

For any future data extraction work - be it emoji data, new keyboard data, translation data, etc. - could it make sense to prioritize figuring out the centralized data-hosting first, since moving to a DB could impact how Scribe handles, stores, and provisions data?

As I understand it, the prioritized Scribe work is:

  1. main priorities today:
    • iOS SQLite
    • app menu
  2. data download via GitHub

After that, however, I see:

  • the rest of the -priority- work, includes new keyboards and their needed data
  • look into possible data download via a non-GitHub option (perhaps Toolforge)
    • I'm more wondering, I guess, where this investigation/brainstorm fits into the roadmap, since how Scribe data is handled could affect how new keyboard data for the above point and also emoji data for this issue are handled

I might also have some of the work prioritization misunderstood; so please keep me honest! 😆 If getting in the emoji feature and/or new keyboards is priority over investigating a possible non-GitHub data-hosting option, let me know.

@andrewtavis
Member Author

andrewtavis commented Nov 5, 2022

Hey @wkyoshida 👋😊 I'm looking into this a bit more now that #21 is all done. Basic thing I can say is that pyicu doesn't seem to be what we want. We're looking for the CLDR data specifically, and that package doesn't have access by the looks of it 🤔

Looking at it, if we want the fastest/easiest way to do this, it might be to simply download the files we need from unicode-org/cldr-json and write the appropriate Python script to invert the JSONs into word -> emoji pairs. As of now it doesn't look like there's a good Python tool, but then I might be missing it.

What do you think of this quicker solution? :)
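The inversion step could look roughly like the sketch below. It assumes the annotation files in unicode-org/cldr-json nest an emoji -> {"default": keywords, "tts": name} mapping under annotations.annotations, which should be checked against the actual files. The inline sample and the function name invert_annotations are made up for illustration:

```python
import json
from collections import defaultdict
from typing import Dict, List

# Small inline sample in the shape the annotations.json files are assumed to have.
SAMPLE_ANNOTATIONS_JSON = """
{
  "annotations": {
    "identity": {"language": "en"},
    "annotations": {
      "😀": {"default": ["face", "grin"], "tts": ["grinning face"]},
      "👍": {"default": ["good", "thumb", "up"], "tts": ["thumbs up"]},
      "👌": {"default": ["good", "OK"], "tts": ["OK hand"]}
    }
  }
}
"""


def invert_annotations(raw_json: str) -> Dict[str, List[str]]:
    """Turn emoji -> keywords into lowercase word -> list of emojis."""
    data = json.loads(raw_json)
    emoji_to_info = data["annotations"]["annotations"]

    word_to_emojis: Dict[str, List[str]] = defaultdict(list)
    for emoji, info in emoji_to_info.items():
        for keyword in info.get("default", []):
            word = keyword.lower()
            if " " in word:
                # Only single words can act as triggers while typing.
                continue
            if emoji not in word_to_emojis[word]:
                word_to_emojis[word].append(emoji)
    return dict(word_to_emojis)
```

For the sample above, "good" would map to both 👍 and 👌 in file order, so some later ordering step would still be needed to decide which one is suggested first.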

@wkyoshida
Member

This would definitely be a long term goal, but I think it makes sense that the interim is that we're sending a file with the data to be updated. Happy to be proved wrong and jump right into direct SQLite updates though 😊

I'm thinking that a scenario that could call for Scribe to jump more aggressively into SQLite is if downloads actually get too large. As we've discussed, an idea could be that only the diff gets downloaded and Scribe leverages something like a last_updated field on the data. However, I guess a last_updated field could also just be added to the JSON files as well; so this might not be justification enough to jump right in. The trick, I'm thinking, will be more dependent actually on how the data extraction could keep track of what data is new/changed and on the existence of a (possibly-Toolforge) data hosting back-end to do the calculation of these data diffs.
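As a rough sketch of the last_updated idea, assuming each record carries an ISO date string and the client remembers the date of its last download (SERVER_DATA and entries_changed_since are hypothetical names, and nothing here reflects how Scribe actually stores data):

```python
from datetime import date
from typing import Dict

# Hypothetical hosted data: trigger word -> record with emojis and an ISO date.
SERVER_DATA: Dict[str, dict] = {
    "good": {"emojis": ["👍"], "last_updated": "2022-11-01"},
    "tree": {"emojis": ["🌳"], "last_updated": "2022-10-15"},
    "run": {"emojis": ["🏃"], "last_updated": "2022-11-05"},
}


def entries_changed_since(data: Dict[str, dict], client_last_updated: str) -> Dict[str, dict]:
    """Return only the records updated after the client's last download date."""
    cutoff = date.fromisoformat(client_last_updated)
    return {
        word: record
        for word, record in data.items()
        if date.fromisoformat(record["last_updated"]) > cutoff
    }
```

A client that last updated on 2022-10-31 would only receive the "good" and "run" records here. The same filter works whether the records live in JSON files or in a database table.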

Basic thing I can say is that pyicu doesn't seem to be what we want. We're looking for the CLDR data specifically, and that package doesn't have access by the looks of it 🤔

Yeah.. I took a look as well and wasn't really finding what Scribe needs either, which is unfortunate. I'm also thinking that perhaps just downloading the unicode-org/cldr-json files might be the path-forward here. I have a potential idea on how to do it; I'll open up a PR shortly for it.

@andrewtavis
Member Author

I'm thinking that a scenario that could call for Scribe to jump more aggressively into SQLite is if downloads actually get too large. As we've discussed, an idea could be that only the diff gets downloaded and Scribe leverages something like a last_updated field on the data.

Would be 100% fine with this :) Big thing is that I think it makes sense that we never have a direct connection to Wikidata, but instead are keeping the prepared data somewhere and accessing it. Specifically what I mean is that we're giving people pre-prepared packs, not that one person has the most up to date version of the data whenever they do the update. This is how it works for apps like Open Street Maps wrappers, so I think we should follow this example 😊

I have a potential idea on how to do it; I'll open up a PR shortly for it.

Great! Looking forward! 🚀🚀 Anything code based that allows us to update the JSONs we have locally within a workflow would be very very welcome 😊

@andrewtavis
Member Author

andrewtavis commented Nov 5, 2022

Also, @wkyoshida, 3c111c9 added a roadmap to the readme (I also added it to iOS). I think it makes sense to include it :) Let me know if you have any feedback on it 😊 Main question I have as far as version IDs are concerned is whether adding the ability to choose the base language the user translates from is a major release? As of now I have it as v2.4.0, but then we'll doubtless be doing lots of other features once the menu's done, and even that by itself is maybe 3.0 due to all the refactoring 🤔

wkyoshida added a commit to wkyoshida/Scribe-Data that referenced this issue Nov 6, 2022
@wkyoshida
Member

3c111c9 added a roadmap to the readme (I also added it to iOS). I think it makes sense to include it :)

I agree! 🙌
I'm thinking it'll undoubtedly be helpful for contributors and users to understand the direction the project is taking 👍
Just a quick thought, but do you have any thoughts on using the Projects tab of the Scribe organization? Can't say I've used it a ton before, but I was thinking it could be organized to hold the aggregate roadmap with issues and PRs from all repos together - which could make sense since work on different repos might happen concurrently. The new roadmap section in the READMEs could also potentially just link to the project (might help with not having to update all READMEs whenever something changes).

Main question I have as far as version IDs are concerned is whether adding the ability to choose the base language the user translates from is a major release? As of now I have it as v2.4.0, but then we'll doubtless be doing lots of other features once the menu's done, and even that by itself is maybe 3.0 due to all the refactoring 🤔

That's a good thought - hmm. I'm thinking that it potentially could as well actually.

It can also get tricky thinking what version number features will end up falling under imo 😆 If the organization Projects tab ends up getting leveraged, would you have any thoughts on organizing/assigning the issues with some sort of priority value? I'm thinking that it could still show a general roadmap for all things Scribe, while also not boxing-in features into version numbers prematurely. Versioning can get assigned as features completed, I think, especially since misc. features can also get completed here and there.

As far as what is on the roadmap though - I think it makes sense 👍

@andrewtavis
Member Author

Just a quick thought, but do you have any thoughts on using the Projects tab of the Scribe organization?

We have projects for iOS, but as of now I haven't really found them useful... I think that as the team grows the organization wide ones would definitely be the way to go though 🚀 I'm also happy to implement them now if you'd find them helpful, and agree that referencing a project for a point on the roadmap would be more descriptive :)

That's a good thought - hmm. I'm thinking that it potentially could as well actually.

I'll update it to v3.0.0 right now, but let's definitely keep in mind that this is a WIP presentation of what we want to work on 😊 Happy to expand on it later as we implement projects and other organization tools :)

andrewtavis added a commit that referenced this issue Nov 6, 2022
@wkyoshida
Member

Sounds good, @andrewtavis 👍 🙌

Regarding the Projects tab, I did notice before too that there were some in the iOS repo. I guess I was more thinking of revisiting the idea of using Projects because:

  • As already mentioned, the organization-level one could represent the aggregate roadmap more easily
  • There is new functionality. The projects in Scribe-iOS appear to be using the Classic features; so I wasn't sure if you've already explored the functionality with the new Projects and saw any potential there.
    • It was mainly this actually, I think. There seem to be a lot more features with the new version that make it easier to organize issues.

I don't think there's a dire need for it - only thought of this because an issues board, in my mind, seemed easier to reference and work with than a README section.

@andrewtavis
Member Author

This does make sense, @wkyoshida 😊 I haven't really looked into the features of the new projects, and referencing that would be easier than constantly needing to update the readme based on how things change :) At least writing it we have a better idea of how to structure the projects going forward 🤓

Will remove the projects from iOS as a first step, and then we can go from there. You'd suggest that all the projects be at the organization level, right? Makes sense to me that we have one place for organizing the various elements, as there doubtless are ways to filter the projects to see what you're interested in :)

@wkyoshida
Member

You'd suggest that all the projects be at the organization level, right?

Yea ✌️ Also, since I'm thinking that related work across repos can be under the same groupings too - one can easily see they go together. Like with the below where there is both Scribe-iOS and Scribe-Data work:

  • Scribe-iOS v3.0.0 and Scribe-Data v3.0.0
    • Expand translation data and add English keyboard
    • Main issues: Scribe-iOS #7 and Scribe-Data #23

@andrewtavis
Member Author

Sounds great 😊 Give me a few days and I'll try to get them up and running 🚀

wkyoshida added a commit to wkyoshida/Scribe-Data that referenced this issue Nov 8, 2022
wkyoshida added a commit to wkyoshida/Scribe-Data that referenced this issue Nov 8, 2022
wkyoshida added a commit to wkyoshida/Scribe-Data that referenced this issue Nov 9, 2022
wkyoshida added a commit to wkyoshida/Scribe-Data that referenced this issue Nov 9, 2022
andrewtavis added a commit that referenced this issue Nov 21, 2022
@andrewtavis
Member Author

Bringing the discussion back in here, @wkyoshida :) What should we look to do now? It looks like we might be able to work on the basis of scribe-org/Scribe-iOS#51 right now, as the main focus here is getting the order down given popularity?

Plan is to definitely include all of this in v2.2.0, btw 😊 Really looking forward to adding such a key feature!

@wkyoshida
Member

Yep! I'm definitely thinking that getting the popularity worked in will be enough to give Scribe a good starting base for the emoji feature. I'm planning on looking into that today, and hopefully we can get that this week. Working in nltk and some other enhancements can happen later, I think.

@andrewtavis
Member Author

Looking forward to the PR, @wkyoshida! I’ll upload the datasets to iOS after and then start working on adding the references to them. We definitely need to have the SQLite feature finished to release this as we can’t add more data without it at this point, but having this done will make v2.2.0 all the closer :)

I think that referencing NLTK can definitely wait till we have something out that we can test, and then iterate from there 😊 Adding a popularity score into the output would be enough so that we can order it on the Swift side, as I figure ordering the results here would leave us open to it not being maintained when it’s added to the DB.
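
As a rough sketch of the idea above (not the actual Scribe-Data output format): each word could map to emoji entries carrying a popularity rank, and the consumer sorts by that rank when it reads the data. The `emoji` and `rank` field names, the sample values, and the `top_emojis` helper are all assumptions made for illustration, with a lower rank meaning a more popular emoji.

```python
import json

# Hypothetical output shape: word -> list of emoji entries with a "rank" field.
# The field names and values here are illustrative assumptions only.
emoji_keywords = {
    "smile": [
        {"emoji": "😄", "rank": 12},
        {"emoji": "🙂", "rank": 40},
        {"emoji": "😊", "rank": 3},
    ],
}


def top_emojis(word, data, limit=3):
    """Return up to `limit` emojis for `word`, most popular (lowest rank) first."""
    entries = data.get(word, [])
    ordered = sorted(entries, key=lambda entry: entry["rank"])
    return [entry["emoji"] for entry in ordered[:limit]]


# The stored order does not matter: the rank survives serialization,
# so the consumer can re-sort after loading the data.
serialized = json.dumps(emoji_keywords, ensure_ascii=False)
restored = json.loads(serialized)
suggestions = top_emojis("smile", restored)
```

Keeping the rank in the data rather than relying on list order means the ordering can still be recovered if the storage layer does not preserve insertion order.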

andrewtavis added a commit that referenced this issue Nov 23, 2022
@andrewtavis
Member Author

@wkyoshida, I believe that ef8b1e5 is the final touch of adding the output path to Scribe-iOS 😊 Will push the data to that repo now, and then we'll be set to start work on scribe-org/Scribe-iOS#51 🚀

Let me know if you can think of anything else that's needed for the MVP for this 🙃 Really great work as always!

@wkyoshida
Member

Adding a popularity score into the output would be enough so that we can order it on the Swift side, as I figure ordering the results here would leave us open to it not being maintained when it’s added to the DB.

Some thoughts around this are coming to mind now. I'll add them in #26 though

Let me know if you can think of anything else that's needed for the MVP for this

I'm thinking of still adding some functionality here and there, e.g. #28, nltk, etc., but I think we could be pretty good for the MVP! Or close (could depend on the additional thoughts I mentioned above)!
