Support StarDict format! #21

Open
huzheng001 opened this issue Feb 26, 2019 · 27 comments

Comments

@huzheng001

Could you make every dictionary here have a StarDict format version?
Then in http://download.huzheng.org I can create a link to your freedict sourceforge project!

@huzheng001
Author

@kenden

kenden commented Jun 6, 2019

It seems to me that it should be the other way around: StarDict should support a standard format. But...
what does the StarDict format look like?

@humenda
Member

humenda commented Jun 6, 2019 via email

@bansp
Member

bansp commented Jun 6, 2019

I'm not sure what you mean, Sebastian, when you say that the TEI "is less useful for "displaying" dictionaries to a user" -- it's as useful as any XML, and then some more, given that it has seen a lot of applications and therefore some ready-made display solutions exist. We've just never cared much about displaying TEI dictionaries as such, because TEI was treated as a storage format for conversion into (initially) DICT and other formats. For TEI, I once whipped up some sketchy CSS, just to demonstrate to students at some seminars that it can easily be done.

Perhaps we should revisit the issue of displaying the source TEI in the browser?

As for the initial message in this thread -- I think we used to have some export to StarDict or its predecessor, but it may have vanished in the recesses of the past. The idea is just as plausible as having export to DICT (maybe even more plausible nowadays). But, frankly, I would expect a minimal gesture on the part of the StarDict developer of providing us with a spec of the format, rather than sending us on an Easter egg hunt somewhere... :-)

@humenda
Member

humenda commented Jun 9, 2019 via email

@bansp
Member

bansp commented Jun 10, 2019

@piotr

I'm not sure what you mean, Sebastian, when you say that the TEI "is less useful for "displaying" dictionaries to a user" -- it's as useful as any XML, and then some more, given that it has seen a lot of applications and therefore some ready-made display solutions exist.

One of the jpn-* dictionaries is above 100M in size. It is comprehensive, yet not complex. Loading it even in Vim takes a long time, and I'd argue that using it with a CSS style sheet in a browser would not necessarily be a good user experience. Apart from that, TEI was not designed to allow fast indexing, whereas StarDict (and the predecessor "dict") care about such details. Viewing a dictionary is nice, but searching the contents is much more important. Maybe I'm overlooking something?

I'd say you've seen clearly through my unclear post. I went from a point I wished to make (that XML in general, and TEI in particular, is not a hindrance to visualisation) to mentioning CSS as if it were a solution (and you then correctly pointed out the size of some databases).

Perhaps we should revisit the issue of displaying the source TEI in the browser?

For my part, I'm not interested in this.

Again, my bad, because in the context, this can readily mean "by using CSS", and that is clearly only workable for small databases, and even then non-optimal (XSLT in the browser would probably be slightly better, because it would at least let you copy everything, unlike CSS). So let me restate:

Perhaps we should look at ways to visualise the source TEI without pre-conversion into text-based formats?

A possible path could be to base our TEI on the so-called TEI Lex0, a format intended as a pivot for "retrodigitized" dictionaries that is currently being pushed onto a standardization track, and then see which TEI Lex0-enabled tools can be used for the display.

Conversion of Freedict TEI towards a TEI Lex0 version should not be overly complex -- I think most stuff would remain as is, with some tweaks in attribute values and such. But, naturally, it would be good to wait until Lex0 gets frozen, so it's not a project for right now.

As for the initial message in this thread -- I think we used to have some export to StarDict or its predecessor, but it may have vanished in the recesses of the past. The idea is just as plausible as having export to DICT (maybe even more plausible nowadays).

Who is the StarDict author these days? There is starDict and StarDict-4 and there is a GitHub mirror. I am not sure whom to contact.

My understanding was that the author is the original poster in this thread.

@huzheng001
Author

huzheng001 commented Nov 26, 2019 via email

@humenda
Member

humenda commented Dec 1, 2019

So I tagged this as help wanted because we currently lack manpower to implement this. @huzheng001 Can you look into this?

@ilius
Contributor

ilius commented Sep 1, 2020

You can use PyGlossary to convert FreeDict (TEI) or slob files to StarDict.
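
As a rough illustration of that suggestion, the sketch below drives the PyGlossary command line from Python. It assumes a `pyglossary` executable is on the PATH and that the format names `FreeDict` and `Stardict` are accepted by `--read-format` and `--write-format`; check `pyglossary --help` for your installed version. The file names and both function names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_pyglossary_command(tei_path: Path, out_dir: Path) -> list[str]:
    """Build the command line for a TEI -> StarDict conversion.

    The output is named after the input, with an .ifo extension;
    PyGlossary is expected to write the matching .idx and .dict files.
    """
    ifo_path = out_dir / (tei_path.stem + ".ifo")
    return [
        "pyglossary",
        str(tei_path),
        str(ifo_path),
        "--read-format=FreeDict",   # assumed format name, see lead-in
        "--write-format=Stardict",  # assumed format name, see lead-in
    ]


def convert(tei_path: Path, out_dir: Path) -> None:
    """Run the conversion, failing loudly if PyGlossary is not installed."""
    if shutil.which("pyglossary") is None:
        raise RuntimeError("pyglossary executable not found on PATH")
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_pyglossary_command(tei_path, out_dir), check=True)
```

Calling `convert(Path("eng-deu.tei"), Path("build/stardict"))` would then be expected to leave `eng-deu.ifo` and its companion files in `build/stardict`.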

@humenda
Member

humenda commented Sep 2, 2020 via email

@ilius
Contributor

ilius commented Sep 2, 2020

Yes, for Android there is GoldenDict Free, which contains ads,
and GoldenDict, which is a paid app that I haven't used.
The only open source app I found is QDict, but it has a few problems. For example, it doesn't have a dark mode, and internal links (to another word) do not work (that should be an easy fix).

@ilius
Contributor

ilius commented Sep 2, 2020

For desktop there is GoldenDict and StarDict.

@ilius
Contributor

ilius commented Sep 2, 2020

There is also sdcv, which is a simple command-line application (for Linux / Unix; it works in Termux on Android too).
But it can't render HTML (generally used for rich text in StarDict glossaries).

@huzheng001
Author

huzheng001 commented Feb 18, 2023 via email

@humenda
Member

humenda commented Feb 18, 2023 via email

@ilius
Contributor

ilius commented Feb 18, 2023

Would you be willing to help us integrate it into our build system?

Sure.
Where are your build config files?

@humenda
Member

humenda commented Feb 18, 2023 via email

@ilius
Contributor

ilius commented Feb 18, 2023

Looks like you convert tei to dictd (.index), then convert that to StarDict with dictd2dic.

I think we need to convert .tei directly to StarDict.
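
To make the idea of a direct conversion concrete, here is a minimal sketch, not the FreeDict build system's or PyGlossary's actual code. It assumes FreeDict-style entries (headword in `entry/form/orth`, translations in `cit/quote`), plain-text definitions (`sametypesequence=m`), 32-bit index offsets, and no compression. The names `read_entries`, `stardict_sort_key` and `write_stardict` are hypothetical.

```python
import struct
import xml.etree.ElementTree as ET
from pathlib import Path

TEI_URI = "http://www.tei-c.org/ns/1.0"
TEI_NS = {"tei": TEI_URI}


def read_entries(tei_path):
    """Yield (headword, definition) pairs, streaming to keep memory low."""
    for _, elem in ET.iterparse(tei_path, events=("end",)):
        if elem.tag != f"{{{TEI_URI}}}entry":
            continue
        orth = elem.find("tei:form/tei:orth", TEI_NS)
        quotes = [q.text.strip()
                  for q in elem.iterfind(".//tei:cit/tei:quote", TEI_NS)
                  if q.text]
        if orth is not None and orth.text and quotes:
            yield orth.text.strip(), "; ".join(quotes)
        elem.clear()  # release the finished entry's children


def stardict_sort_key(word):
    """Order as StarDict expects: ASCII case-insensitive, then bytewise."""
    raw = word.encode("utf-8")
    return (raw.lower(), raw)  # bytes.lower() only touches ASCII letters


def write_stardict(entries, out_dir, name):
    """Write uncompressed <name>.ifo, <name>.idx and <name>.dict files."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    merged = {}
    for word, definition in entries:
        merged.setdefault(word, []).append(definition)

    idx = bytearray()
    data = bytearray()
    for word in sorted(merged, key=stardict_sort_key):
        body = "\n".join(merged[word]).encode("utf-8")
        # Index record: NUL-terminated headword, then big-endian
        # 32-bit offset and size of the definition in the .dict file.
        idx += word.encode("utf-8") + b"\x00"
        idx += struct.pack(">II", len(data), len(body))
        data += body

    ifo_lines = [
        "StarDict's dict ifo file",
        "version=2.4.2",
        f"bookname={name}",
        f"wordcount={len(merged)}",
        f"idxfilesize={len(idx)}",
        "sametypesequence=m",
        "",
    ]
    (out_dir / f"{name}.idx").write_bytes(bytes(idx))
    (out_dir / f"{name}.dict").write_bytes(bytes(data))
    (out_dir / f"{name}.ifo").write_bytes("\n".join(ifo_lines).encode("utf-8"))
```

A call such as `write_stardict(read_entries("eng-deu.tei"), "build", "eng-deu")` would produce the three files. A real converter would also need to handle grammar information, multiple senses, overly long headwords, and dictzip compression, all of which this sketch ignores.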

@humenda
Member

humenda commented Feb 18, 2023 via email

@karlb
Member

karlb commented Feb 18, 2023

PyGlossary can already be used to convert the TEI files generated by WikDict (many of which are part of FreeDict) to arbitrary formats. I've used it to create StarDict and Kobo dictionaries for the last few years. But I'm not sure how well it copes with the other TEI files.

I personally find dealing with PyGlossary a lot easier (both executing it and changing its code) than with the XSLT files, so if there is an easy way to go directly from all our TEI files to PyGlossary's different output formats, I would be in favor of doing that.

@humenda
Member

humenda commented Feb 18, 2023 via email

@sdasda7777

sdasda7777 commented Jan 19, 2024

Hi, I noticed that the StarDict files downloadable from the FreeDict website don't include an .ifo file. That is not great, because this file should provide machine-readable metadata such as the dictionary's version, name, language, etc., and it is particularly bad because koreader refuses to read dictionaries without an .ifo file. Could you please make sure the .ifo file is generated as well?

(I'm not completely sure this is the correct repository to report this to, if there is a better one, please do tell.)
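
For readers unfamiliar with the file in question: an .ifo file is a small text file with a fixed first line followed by `key=value` pairs. The sketch below parses one and lists index files that lack a companion .ifo. It is an illustration under assumptions, not koreader's or StarDict's own check: the required keys follow the StarDict format description as I understand it, and the function names and sample values are hypothetical.

```python
from pathlib import Path

IFO_MAGIC = "StarDict's dict ifo file"
# Keys assumed to be mandatory for a plain dictionary .ifo file.
REQUIRED_KEYS = ("version", "bookname", "wordcount", "idxfilesize")


def parse_ifo(text: str) -> dict[str, str]:
    """Parse .ifo contents into a dict, validating the basics."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != IFO_MAGIC:
        raise ValueError("missing StarDict .ifo magic line")
    info = {}
    for line in lines[1:]:
        key, sep, value = line.partition("=")
        if sep:  # skip blank or malformed lines
            info[key.strip()] = value.strip()
    missing = [key for key in REQUIRED_KEYS if key not in info]
    if missing:
        raise ValueError("missing required keys: " + ", ".join(missing))
    return info


def find_indexes_without_ifo(directory: Path) -> list[Path]:
    """Return uncompressed .idx files that have no matching .ifo file."""
    return sorted(
        idx for idx in directory.glob("*.idx")
        if not idx.with_suffix(".ifo").exists()
    )


# Hypothetical sample contents, for illustration only.
SAMPLE_IFO = """StarDict's dict ifo file
version=2.4.2
bookname=Example English-German
wordcount=2
idxfilesize=20
sametypesequence=m
"""

if __name__ == "__main__":
    print(parse_ifo(SAMPLE_IFO)["bookname"])
```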

@humenda
Member

humenda commented Jan 19, 2024 via email

@sdasda7777

Thanks for the reply. As long as it doesn't get forgotten and gets fixed eventually, that sounds good.

IMO the stardict files are not yet advertised on the download page, are they?

I believe I searched for StarDict dictionaries and the FreeDict website was returned. It doesn't mention StarDict per se, but it does mention GoldenDict and .dict.dz files, so I'm not surprised it was found.

@humenda
Member

humenda commented Jan 19, 2024 via email

@sdasda7777

Ah, I see, sorry for the confusion. I can see that the .ifo file is present, and it does work on my reader!

@huzheng001
Author

huzheng001 commented Apr 25, 2024 via email
