Few thousand documents written (1995 to 1999) using TamNet.ttf from the old irdu.nus.sg archives [Advise Needed] #15
Comments
Hi there- Interesting topic; I'll post to my Twitter. Perhaps NUS can announce a bug bounty and have some engineers take a look. In the past what has helped is the following:
I'm sure someone can crack this problem with sufficient effort and motivation. |
@ashbeats I can explore this. Please share the font's .ttf file and a few sample documents. |
HTML / JS script should help. Please try. |
Hi, thank you for responding. The documents have just been restored to the original website: Another archived site holds a bit more information: The fonts are available for download here: GPT-4 also had this to add...
|
Found the character map of the tamilnet.ttf file here:
https://fontsdata.com/76760/tamilnet.htm
I'm exploring how to use that table for Unicode conversion.
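For anyone picking this up, here is a minimal sketch of what a table-driven conversion might look like. The byte values in `LEGACY_TO_UNICODE` are placeholders invented for illustration, not the real TamNet.ttf mapping; the actual table would have to be filled in from the character map above. It also assumes that the font, like other glyph-based legacy Tamil encodings, stores the vowel signs ெ, ே and ை before the consonant (visual order), whereas Unicode stores them after it. That assumption needs checking against the sample documents.

```python
import unicodedata

# HYPOTHETICAL byte -> Unicode table. These byte values are placeholders,
# NOT the real TamNet.ttf mapping; fill them in from the font's character map.
LEGACY_TO_UNICODE = {
    0xE1: "\u0B95",  # placeholder byte for KA
    0xE2: "\u0BAE",  # placeholder byte for MA
    0xA1: "\u0BBE",  # placeholder byte for vowel sign AA
    0xA6: "\u0BC6",  # placeholder byte for vowel sign E
}

# Vowel signs that are drawn to the left of the consonant. Assumption: the
# legacy text stores them BEFORE the consonant, so they must be moved after it.
PREFIX_SIGNS = {"\u0BC6", "\u0BC7", "\u0BC8"}


def convert(data: bytes) -> str:
    """Convert legacy font-encoded bytes to Unicode using the table above."""
    out = []
    pending = None  # a prefix vowel sign waiting for its consonant
    for b in data:
        if b < 0x80:
            ch = chr(b)  # ASCII passes through (the font is bilingual)
        else:
            ch = LEGACY_TO_UNICODE.get(b, "\uFFFD")  # mark unmapped bytes
        if ch in PREFIX_SIGNS:
            if pending is not None:
                out.append(pending)  # stray sign: emit it as-is
            pending = ch
            continue
        out.append(ch)
        if pending is not None:
            out.append(pending)  # place the sign after the character
            pending = None
    if pending is not None:
        out.append(pending)
    # NFC merges two-part vowels, e.g. sign E + sign AA -> sign O.
    return unicodedata.normalize("NFC", "".join(out))
```

Unmapped bytes come out as U+FFFD, which makes it easy to spot the gaps in the table while it is still being built.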
|
Thanks, folks. @tshrinivasan, if you find a fix, please post a PR to open-tamil as well. |
Working with udhayam.in udhayan to get the mapping for this font.
Will update here on the progress soon.
|
@ashbeats - do you still need this feature? Did you make any progress? |
@arcturusannamalai I do, but the project is on hold. |
Hi,
My name is John, and I have been attempting to convert a few thousand documents and articles that were written in the TamNet.ttf bilingual font, which was released in 1995. The original authors are no longer around, and I have been trying to find information about the keyboard mapping so that I can write a converter, or find an existing one, to convert the text to standard Unicode.
Do you know of a converter? I tried your Open-Tamil lib, but it failed to recognise the text or the conversions were not fully accurate.
I understand that the encoding is also questionable, as the documents were moved between various formats over the years, such as ANSI. So I have preserved the originals, and have been inspecting them in binary and comparing them with the same texts written in the TamNet99 and Murasu formats.
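As a rough illustration of that kind of binary inspection, a byte histogram over the non-ASCII range can show which code points a document actually uses, and two histograms of the same text in different encodings can then be compared side by side. This is only a sketch; the function name is made up for the example.

```python
from collections import Counter


def high_byte_histogram(data: bytes) -> list[tuple[int, int]]:
    """Return (byte value, count) pairs for bytes >= 0x80, most frequent first."""
    counts = Counter(b for b in data if b >= 0x80)
    return counts.most_common()


# Example with made-up bytes; in practice, read a preserved original with
# open(path, "rb").read() so that no text decoding is applied.
sample = bytes([0x41, 0xE1, 0xE1, 0xA6])
for value, count in high_byte_histogram(sample):
    print(f"0x{value:02X}: {count}")
```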
Based on Google searches and papers, the closest match seems to be TamNet99; however, there may be edge cases that elude me.
Any insights or direction would be most appreciated.
Best Regards,
John