TICCL-anahash: Add the wordforms to the foci file #22

Open
kosloot opened this issue Jul 4, 2018 · 1 comment

kosloot commented Jul 4, 2018

@martinreynaert suggested:
It would be handier, too, if the focus file would also list the actual word forms included, for easy reference.

This can easily be implemented, but it will of course affect all tools that use the foci file (currently only indexer and indexerNT).

@kosloot kosloot self-assigned this Jul 4, 2018
@kosloot kosloot changed the title TICL-anahash: Add the wordforms to the foci file TICCL-anahash: Add the wordforms to the foci file Jul 4, 2018

kosloot commented Jul 5, 2018

@martinreynaert
Since several word forms can share the same anagram/foci value, this would mean that we get some kind of mapping:
foci-value#word1,word2,word3...
This makes loading a bit slower, because each line has to be split, but alas.
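
To make the loading cost concrete, here is a minimal sketch of reading such a line. It is written in Python for illustration only and is not the ticcltools implementation. The function names `parse_foci_line` and `load_foci` are hypothetical, and it assumes the foci value is a decimal integer, that `#` separates the value from the words, and that the words themselves contain no commas.

```python
from typing import Dict, Iterable, List, Tuple


def parse_foci_line(line: str) -> Tuple[int, List[str]]:
    """Split one 'value#word1,word2,...' line into (value, [words]).

    Assumption: a line holding only a value (no '#', as in a foci file
    without word forms) yields an empty word list.
    """
    # Split on the first '#' only; everything after it is the word list.
    value_text, _, words_text = line.rstrip("\n").partition("#")
    words = [w for w in words_text.split(",") if w]
    return int(value_text), words


def load_foci(lines: Iterable[str]) -> Dict[int, List[str]]:
    """Build a mapping from foci value to its word forms, skipping blank lines."""
    foci: Dict[int, List[str]] = {}
    for line in lines:
        if not line.strip():
            continue
        value, words = parse_foci_line(line)
        # Merge word lists if the same value appears on more than one line.
        foci.setdefault(value, []).extend(words)
    return foci
```

The extra work per line is one split on `#` and one split on `,`. A tool that only needs the values could stop after the first split and ignore the rest of the line.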

You write:
This last would also enable TICCL-LDcalc to focus only on this (probably) single version of
the possible anagrams associated with a particular anagram value, rather than processing
them all.

At the moment, LDcalc doesn't use the foci file (but indexer and indexerNT can).
I see the advantage of only handling anagrams that are 'real', but I don't know how you want to use it in LDcalc.
