Feat: add composition information #27

Merged
merged 10 commits into from
Jan 22, 2025

Conversation

klown
Contributor

@klown klown commented Dec 19, 2024

This captures and stores the composition information for every Bliss symbol in the authorized vocabulary. The composition information is available in ./data/bliss_symbol_explanations.json in case someone wants access to it.

  1. This uses a TypeScript script and npx vite-node to execute it.
  2. The code removes the composingIds arrays from bliss_symbol_explanations.json since they are often different from the new composition array.
  3. Three of the BCI AV IDs do not have any composition information defined for them. The webapp lists them, and the information is available in BlissSymbolComposition.md within the ./docs folder. Is there a better place for this information?
  4. Also, there are slightly over 1000 BCI AV IDs where the composition information is identical to the symbol's BCI AV ID. In that case, no composition array is stored with the symbol. These are also listed by the webapp and in BlissSymbolComposition.md.
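
The update rules in the list above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the `SymbolRecord` shape, the `recordCompositions` function, and the assumption that a composition is a plain array of numeric IDs are all hypothetical, and the real JSON records and script may be organized differently.

```typescript
// Hypothetical record shape; the real entries in
// bliss_symbol_explanations.json may have different fields.
interface SymbolRecord {
  id: number;
  description: string;
  composingIds?: number[]; // old array, removed by the update
  composition?: number[];  // new array, stored only when informative
}

interface UpdateReport {
  missing: number[];   // IDs with no composition information at all
  identical: number[]; // IDs whose composition is just the ID itself
}

// `compositions` stands in for whatever lookup supplies the composition
// of each symbol (assumed here to be a map from ID to an array of IDs).
function recordCompositions(
  symbols: SymbolRecord[],
  compositions: Map<number, number[]>
): UpdateReport {
  const report: UpdateReport = { missing: [], identical: [] };
  for (const symbol of symbols) {
    // The old composingIds array is dropped in every case.
    delete symbol.composingIds;
    const composition = compositions.get(symbol.id);
    if (composition === undefined || composition.length === 0) {
      report.missing.push(symbol.id);
    } else if (composition.length === 1 && composition[0] === symbol.id) {
      // Composition equals the symbol's own ID: nothing is stored.
      report.identical.push(symbol.id);
    } else {
      symbol.composition = [...composition];
    }
  }
  return report;
}
```

The returned report corresponds to the two lists mentioned above (IDs with no composition information and IDs whose composition is identical to the ID), which could then be written to a document such as BlissSymbolComposition.md.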

@klown
Contributor Author

klown commented Dec 19, 2024

Hi @cindyli : This is almost ready for review, but see my comments in the main description above. I've left it as a draft PR. If you have any use for the composition information, however, it's available as described above.

@cindyli
Contributor

cindyli commented Dec 20, 2024

@klown, I agree using a script to update ./data/bliss_symbol_explanations.json is a good idea.

Since this script is likely to be run only once to update the file, and the composition information is not expected to change frequently, manual updates might be enough for future changes. When the composition information for the three currently missing symbols becomes available, a manual update might be an easy route too. With that in mind, here are some thoughts:

  1. Change this webapp script to a node.js script and save it in the ./scripts folder.
  2. Write documentation explaining the purpose of the script, its usage, and its one-time nature. It should only be used again if there is a big update to the composition information.
  3. Remove ./data/BciAvCompositionAnalysis.tsv from the repository. Instead, include instructions in the documentation on how to generate this intermediate file from its source spreadsheet, for example, saving the spreadsheet as a .tsv file.

Let me know what you think. Thanks.

@klown
Contributor Author

klown commented Jan 8, 2025

> @klown, I agree using a script to update ./data/bliss_symbol_explanations.json is a good idea.
> ...

The latest version of the code uses npx vite-node to run the command line script ./scripts/createAndRecordCompositions.ts. There is documentation near the top of the script that explains what it is for, how to run it, and that it is meant to be a one-time execution.

> 1. Remove ./data/BciAvCompositionAnalysis.tsv from the repository. Instead, include instructions in the documentation on how to generate this intermediate file from its source spreadsheet, for example, saving the spreadsheet as a .tsv file.

I agree with the removal but for different reasons. With respect to the composition information in the spreadsheet, it sometimes differs from the Blissary. Like the palette rendering code, the script here uses only the Blissary for determining the composition of the symbols. Note that Russell has updated the original spreadsheet on the shared drive following a discussion with him and Hannes. That means that BciAvCompositionAnalysis.tsv is out of date and that is another potential reason to get rid of it.

In fact, I'm thinking about deleting all of the tsv material at this point -- the spreadsheet and the associated scripts. The only thing stopping me is that the spreadsheet was used to categorize the symbols as Bliss-characters vs. Bliss-words. That information is available from the Blissary, but I need to talk with Hannes to see if we can access it programmatically.

With respect to words vs characters, I have discovered additional inconsistencies between the spreadsheet and the Blissary. One set is small and it could be handled by editing bliss_symbol_explanations.json by hand. The other group is larger. In both cases, it looks like the Blissary is correct.

The small set of inconsistencies involves four indicators. These are listed as Bliss-words in the spreadsheet, but as Bliss-characters by the Blissary:

  • 28043 indicator_(continuous_form)
  • 28044 indicator_(plural,definite)
  • 28045 indicator_(thing,definite)
  • 28046 indicator_(thing,plural,definite)

There are a lot of other indicators that have the same general form, but are listed as Bliss-characters in both the spreadsheet and by the Blissary. For example, 24677 indicator_(present_action) is a character in both and it's as complicated as 28045 indicator_(thing,definite). I think the Blissary is correct here.

The larger set is a number of symbols that are marked as characters in the spreadsheet, but as words in the Blissary. I'm still looking through them, but I think the Blissary is right. For example, "12355 aid" is composed of "help+indicator_thing", but is marked as a character in the spreadsheet. Compare that with "14705 help(to)", which is classified as a word in both places but is constructed in a way that is very similar to "12355 aid". Here they are, one above the other:

  • 12355 aid = help+indicator_thing (character in the spreadsheet; word in the Blissary)
  • 14705 help(to) = help+indicator_action (word in both cases)
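
A comparison like the one described above could be automated along these lines. This is a rough sketch under stated assumptions, not part of the PR: `BlissKind`, `Mismatch`, and `findMismatches` are hypothetical names, and it assumes the two classifications have already been loaded into maps keyed by BCI AV ID (how to read them from the Blissary programmatically is still an open question here).

```typescript
type BlissKind = "character" | "word";

interface Mismatch {
  id: number;
  spreadsheet: BlissKind;
  blissary: BlissKind;
}

// Report every ID that both sources classify, but classify differently.
function findMismatches(
  spreadsheet: Map<number, BlissKind>,
  blissary: Map<number, BlissKind>
): Mismatch[] {
  const mismatches: Mismatch[] = [];
  spreadsheet.forEach((kind, id) => {
    const other = blissary.get(id);
    if (other !== undefined && other !== kind) {
      mismatches.push({ id, spreadsheet: kind, blissary: other });
    }
  });
  return mismatches;
}

// Classifications taken from the examples in this comment.
const fromSpreadsheet = new Map<number, BlissKind>([
  [12355, "character"], // aid
  [14705, "word"],      // help(to)
  [24677, "character"], // indicator_(present_action)
  [28045, "word"],      // indicator_(thing,definite)
]);
const fromBlissary = new Map<number, BlissKind>([
  [12355, "word"],
  [14705, "word"],
  [24677, "character"],
  [28045, "character"],
]);

const exampleMismatches = findMismatches(fromSpreadsheet, fromBlissary);
```

With the sample data, only 12355 and 28045 would be reported, since the other two IDs are classified the same way in both sources.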

klown added 3 commits January 10, 2025 16:39
- 28043 indicator_(continuous_form)
- 28044 indicator_(plural,definite)
- 28045 indicator_(thing,definite)
- 28046 indicator_(thing,plural,definite)

Note: this agrees with the Blissary and the Unicode draft, but is
at odds with respect to BCI.
This was done after consultations with Hannes and Russell.  The changes
are consistent with respect to the Unicode draft.
@klown klown marked this pull request as ready for review January 14, 2025 21:16
@klown klown requested a review from cindyli January 14, 2025 21:16
@cindyli cindyli merged commit b3b116f into inclusive-design:main Jan 22, 2025
3 checks passed
2 participants