Virtual synth driver which can automatically recognise and switch between certain languages/synths #279
Comment 1 by jteh (in reply to comment description) on 2009-02-20 23:20
If the web site is designed well, they should. However, as we all know, not all web sites are designed well. :) Nevertheless, I believe that this is the better option for languages based on the Latin alphabet; e.g. English, German, Italian, etc. Also, I think you will find that more sites are starting to honour the language attribute. What sort of problems have you seen in other screen readers?
Note that punctuation characters are different in many languages and are used differently. Splitting based on detected characters (e.g. Russian and Hebrew characters are obviously quite different to English, etc.) may be better. However, you then have to decide which language to use when you return to a Latin alphabetic language. I am changing this ticket to restrict its scope to option 2 only, as option 1 will require a completely different implementation.
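Splitting based on detected character ranges, as suggested above, can be sketched roughly as follows. This is a minimal illustration, not NVDA code: the two example Unicode blocks and the fallback behaviour are assumptions.

```python
def detect_script(text):
    # Count characters per script using Unicode block ranges.
    # Cyrillic and Hebrew are shown as examples; real rules would need
    # many more ranges and a policy for ambiguous or empty text.
    counts = {"cyrillic": 0, "hebrew": 0, "latin": 0}
    for ch in text:
        cp = ord(ch)
        if 0x0400 <= cp <= 0x04FF:
            counts["cyrillic"] += 1
        elif 0x0590 <= cp <= 0x05FF:
            counts["hebrew"] += 1
        elif ch.isascii() and ch.isalpha():
            counts["latin"] += 1
    # When counts tie or nothing matches, some script is still returned,
    # which is exactly the "which language do you use when you return to
    # a Latin alphabetic language" ambiguity described above.
    return max(counts, key=counts.get)
```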
Comment 2 by aleksey_s (in reply to comment 1) on 2009-02-21 14:18
And what about forums, where a user can include quotes in other languages, etc.? I can imagine a huge number of cases where this will break.
Thanks to Olga Yakovleva (author of the festival synthDriver), I played with a Python library (http://personal.unizd.hr/~dcavar/LID/) which shows very promising results. I guess if we rewrite it in C++, there will be no performance degradation. Note that this would be a unique feature; as far as I know, no other screen reader offers it.
This is what I described. The language info provided by web pages is rarely good enough.
I do not feel I fully understood what you mean. :-)
Take a look at the library I talked about; it really impresses me.
Comment 3 by OlgaYakovleva (in reply to comment 2) on 2009-02-22 12:38
Comment 4 by jteh on 2009-05-01 06:59
Comment by aleksey_s on 2009-05-04 10:05
Comment 6 by Bernd on 2010-08-21 11:59
Is somebody working on this ticket, since ticket 312 has been closed?
Comment 7 by aleksey_s (in reply to comment 6) on 2010-08-26 06:58
I made a synthDriver which recognizes the Cyrillic and Latin alphabets and uses a defined synthesizer for each. It is possible to specify different settings (such as rate, voice, variant) for each language separately. It preloads both synthesizers at load time, so there is no lag when reading multilanguage texts. One known limitation is that you can't use the same synth for two languages with different settings; e.g. you can't use one sapi5 voice for Russian and another sapi5 voice for English. But you can freely use espeak for one and sapi5 for the other.
Attachment multilang.ini added by aleksey_s on 2010-08-26 07:02
Comment 8 by m11chen on 2011-04-25 08:07
I am wondering if this multilang.py synthDriver can be used with the current 2011.1.1 release? Also, would it be possible to make modifications to allow automatic synthesizer switching for Chinese, and if so, how should I make such modifications? I am guessing that language detection should be much easier for English-Chinese content than for other mixed Latin-based language content. Thanks
Attachment multilang.py added by aleksey_s on 2011-04-25 08:22
Comment 9 by aleksey_s (in reply to comment 8) on 2011-04-25 08:28
Yes, see the updated attachment. Note, however, that due to the latest changes in the main branch, it is broken for snapshots.
Indeed. You should probably create a regular expression or check Unicode character boundaries. Subclass Language and define an alphabetRegex class attribute, or reimplement recognize entirely if a regular expression check is not enough for you. Add the new language to the languagesList module variable.
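The subclassing recipe described above might look something like the sketch below. The Language base class here is a minimal stand-in, since the real one lives in the multilang.py attachment, and the single CJK block used for the regular expression is a simplification.

```python
import re

class Language:
    """Minimal stand-in for multilang.py's Language base class."""
    alphabetRegex = None

    @classmethod
    def recognize(cls, text):
        # Default detection: does the text contain this alphabet?
        return bool(cls.alphabetRegex and cls.alphabetRegex.search(text))

class ChineseLanguage(Language):
    # CJK Unified Ideographs block; enough for a demonstration, but
    # real Chinese detection would need to cover additional blocks.
    alphabetRegex = re.compile(u"[\u4e00-\u9fff]")

# The driver is described as reading languages from this module variable.
languagesList = [ChineseLanguage]
```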
Comment 10 by m11chen on 2011-04-25 11:19
I tried to modify the multilang.py file with the following contents:

class ChineseLanguage(Language):

But as soon as I put in the new ChineseLanguage class, the virtual synth driver no longer appears in the list of available synthesizers. Sorry, I do not have very much programming experience. For the following section:

#with detection order

does this mean that only two languages should be specified at a time? Or is it because I do not have the NewFon synthesizer installed and the script does not get compiled correctly? Thanks for any help.
Comment 11 by m11chen on 2011-06-05 17:15
Just wanted to find out if this will become a development priority anytime soon: allowing NVDA to be configured with two or more synthesizers to speak two different languages, for example Chinese and English, where the two languages have very different pronunciation tables, making most voices incapable of speaking both fluently.
Comment 13 by TimothyLee on 2011-06-17 02:29
The VirtualSynthDriver class refactors code from your multilang.py file to allow better reuse. VirtualSynthDriver supports the new speech API in trunk, and treats changes in rate, pitch, inflection and volume as relative adjustments to all synths under its management. The multilang-cjk speech synth inherits from VirtualSynthDriver and performs the actual language detection. There is no need for a separate configuration file, because settings for the physical synths can be adjusted via the Settings dialog box. It would be excellent if you could review the VirtualSynthDriver class and have it added to trunk for use by other virtual synthesizers.
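The "relative adjustment" behaviour described above can be illustrated as follows. The class name, the 0-100 setting range, and the property-based API are assumptions for the sketch, not the actual trunk interface.

```python
class RelativeSettingsSynth:
    """Propagates a rate change to all managed synths as a delta,
    so each synth keeps its own calibrated baseline."""

    def __init__(self, synths):
        self.synths = synths
        self._rate = 50  # the virtual synth's own rate (0-100)

    @property
    def rate(self):
        return self._rate

    @rate.setter
    def rate(self, value):
        delta = value - self._rate
        self._rate = value
        for synth in self.synths:
            # Clamp so a large delta cannot push a synth out of range.
            synth.rate = max(0, min(100, synth.rate + delta))
```

The same pattern would apply to pitch, inflection and volume; the point is that raising the virtual rate by 10 raises every managed synth by 10 from wherever it was, rather than forcing them all to the same absolute value.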
Comment 14 by m11chen (in reply to comment 13) on 2011-06-17 04:39
Thanks for the great work. I am testing this right now with the latest source code, and I have found a few problems.

ERROR - synthDriverHandler.getSynthList (12:19:29)

This is just an error sound when NVDA loads, and as far as I can tell it doesn't affect the screen reader's operation. Another thing I found is that if the text to be spoken contains numbers in the middle, the text after the number does not get spoken. For example: "hello world 123 and this is the text after the number." This happens regardless of whether NVDA's default language is set to Traditional Chinese or English. I am running English Windows 7 64-bit, and I haven't tested on a Chinese version of Windows yet.

Everything else is great, except for a minor lag when navigating the menus with the cursor, but I suppose this is because two synthesizers are loaded, and I'm not sure how much it can be improved. Previously I have been using eSpeak in English at 100% rate, which is about as responsive as NVDA can be, so it would be pretty hard to match with the newly added overhead. Also, would it be possible to make the speaking of numbers configurable, so they can be spoken in either English or Chinese?

Again, thanks for the great work. I am sure many Chinese users of NVDA will be very excited to hear about this long-sought enhancement.
Comment 15 by TimothyLee on 2011-06-18 02:54
As for the problem with numbers, it is actually related to a bug we found in the espeak.dll callback: apparently the callback is sometimes not invoked at all. We're in the process of debugging espeak right now, and hopefully will have a fix for upstream. Thanks for the suggestion about an option to read numbers in English. We'll try to get that implemented and post an updated version of multilang-cjk.py here for further testing.
Attachment multilang-cjk.py added by TimothyLee on 2011-06-18 03:34
Comment 16 by m11chen on 2011-06-19 05:09
Another problem I found with the CJK Dual-voice Virtual Synthesizer is that when a sentence contains both Chinese and English text, and the English text precedes the Chinese text, the speech output of the two languages blurs together at the junction. This is especially obvious when the text is short. For example, 今天天氣很好hello world is spoken without overlap, compared to hello world今天天氣很好. Also, Say All seems to get stuck in some situations. I have noticed this when reading emails in Windows Mail/Outlook Express: if the email contains previous conversations, the attached previous messages do not read continuously with Say All.
Comment 17 by m11chen on 2011-06-19 05:18
Comment 18 by m11chen on 2011-06-19 11:14
Would it be possible for the CJK Dual-voice Virtual Synthesizer to be configured with two separate SAPI5 voices, one for English and one for Chinese?
Comment 19 by TimothyLee on 2011-06-20 00:55
It is currently not possible to configure a physical synthesizer with two different voices because of the way the synthesizers are implemented. Sorry.
Comment 20 by m11chen (in reply to comment 19) on 2011-06-20 14:07
I have been playing around with all kinds of combinations of primary and English synthesizers, and I find that the issue with English text not being read after a string of numbers is not limited to eSpeak as the English synthesizer. Specifically, I have tested with SAPI4, SAPI5, Festival, and Pico.
Comment 21 by m11chen on 2011-06-24 03:03
Comment 22 by TimothyLee (in reply to comment 17) on 2011-06-24 04:39
I've compiled espeak.dll using the latest source code from SVN, and the Say All problem has gone away.
Comment 23 by m11chen (in reply to comment 22) on 2011-06-24 04:52
How can I obtain this new DLL? Is it available for download on the eSpeak SourceForge site, or could you provide an attachment? Thanks.
Attachment espeak-svn20110524.zip added by TimothyLee on 2011-06-24 07:36
Comment 24 by pvagner (in reply to comment 23) on 2011-06-24 08:09
I've also compiled one: http://sk.nvda-community.org/download/espeak-1.45.30-win_dll.zip
Comment 25 by m11chen on 2011-07-04 03:44
Attachment virtualSynthDriver.py added by TimothyLee on 2011-07-07 08:47
Comment 26 by TimothyLee on 2011-07-07 08:48
Comment 27 by m11chen on 2011-07-08 05:05
Thanks for the Say All update. I can confirm that it is fixed with the latest eSpeak DLL. How might we fix the overlapping of English and Chinese speech?
Comment 28 by m11chen on 2011-07-13 11:41
Comment 29 by m11chen on 2011-10-14 07:35
Will this be fixed for the new beta release? Thanks
Comment 30 by m11chen on 2012-12-21 15:07
I have been trying to get this to work with the new input composition feature, but for some reason it doesn't work right now. I think it has to do with the locale setting when two synthesizers are specified. Currently I have the primary synthesizer set to SAPI5 and the English synthesizer set to SAPI4. Could anyone else using this give me a hint about where to start on a fix? Thanks
Attachment multilang-cjk.patch added by TimothyLee on 2013-05-21 08:30
Comment 31 by TimothyLee on 2013-05-21 08:33
In the patch set dated 2013-05-21, I've included patches for audiologic, espeak, sapi4 and sapi5. Please test them and comment. Thanks!
Comment 32 by vgjh2005 on 2014-08-10 16:55
Comment 33 by taghavi on 2015-02-02 06:55
Blocked by #4877
I removed the blocked label, since the speech refactor is now part of NVDA and this opens up new possibilities.
@ehollig could you please add the attachments for this issue? The last three attachments should suffice here, since they seem to be the most updated ones.
Hey @Adriani90, sorry for the delay. Here are some of the files. Let me know if these are not the ones you want.
Reported by aleksey_s on 2009-02-20 19:56
A user often works with multilingual information, and it is no secret that a single synthesizer can rarely speak well enough in more than one language. So NVDA could have a feature to automatically recognize the language and switch the synthesizer/voice to the one preferred for that particular language. I see two ways this can be implemented:
1. Rely on the language information that accessibility APIs provide for each portion of text.
2. Detect the language from the text itself.
I myself do not like option 1, because I have had bad experience with such an implementation in JAWS. Also, I am not sure assistive APIs provide language info for each portion of text. And text is often very mixed, and I am even less sure AT will split such text into portions and provide appropriate language info for each.
So for me, option 2 is more suitable, and I will try to explain my view of how it can be done in NVDA.
We can implement a virtual synthesizer; let's call it "Auto language". It will split the input text at punctuation characters and build a queue of phrases. For each phrase, the language will be detected by applying some hard-coded rules, and the portion will then be sent to the configured synthesizer. This virtual synthesizer also needs to know when the real synthesizer has finished a portion, so it will append a blank portion with a defined "index" and then wait in a different thread until this index becomes active. Then it will send the next portion, and so on.
To avoid the overhead of switching synthesizers, the "Auto language" synth can initialize all required synthesizers when it initializes itself.
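A rough sketch of that split-detect-queue flow, assuming hypothetical synth objects that invoke a callback when the end-of-phrase index is reached. All names here are invented for illustration, and a real driver would pump the queue in a worker thread rather than blocking in speak().

```python
import queue
import re
import threading

class AutoLanguageSynth:
    """Sketch of the proposed "Auto language" virtual synthesizer."""

    def __init__(self, synths):
        self.synths = synths            # mapping: language code -> real synth
        self.phrases = queue.Queue()
        self.index_reached = threading.Event()

    def detect_language(self, phrase):
        # Hard-coded rule for illustration: Cyrillic -> "ru", else "en".
        if any(0x0400 <= ord(c) <= 0x04FF for c in phrase):
            return "ru"
        return "en"

    def speak(self, text):
        # 1. Split input at punctuation characters and queue the phrases.
        for phrase in re.split(r"[.,;:!?]+", text):
            if phrase.strip():
                self.phrases.put(phrase.strip())
        # 2. Send each phrase to the synth configured for its language,
        #    waiting for the end-of-phrase index before sending the next.
        while not self.phrases.empty():
            phrase = self.phrases.get()
            synth = self.synths[self.detect_language(phrase)]
            self.index_reached.clear()
            synth.speak(phrase, on_index=self.index_reached.set)
            self.index_reached.wait()
```

Preloading, as suggested above, would simply mean constructing every entry of the synths mapping up front instead of lazily.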
Requires
Problems
It is difficult to write rules for most East European languages.
Blocked by #312