Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

While reading text, play a sound for spelling errors instead of saying "spelling error". #10474

Closed
wants to merge 1 commit into from

Conversation

jcsteh
Copy link
Contributor

@jcsteh jcsteh commented Nov 7, 2019

Link to issue number:

Fixes #4233 .

Summary of the issue:

We already use a sound to indicate spelling errors while typing. However, while reading text, we currently report "spelling error". It is more efficient to use a sound instead, since the sound can be played in parallel and is also not conflated with the text itself.

Description of how this pull request fixes the issue:

  1. Update getFormatFieldSpeech to produce a WaveFileCommand instead of "spelling error".
  2. Tweak speakTextInfo so that when reporting a single character, it reports initial fields for a non-empty sequence, regardless of whether the sequence contains text strings. Without this change, when moving by character into a spelling error (with no other initial fields), the sound was not played because the sequence didn't contain any text strings.

Testing performed:

  1. Opened this test case in Firefox:
    data:text/html,<textarea>This is some textt. Each sentence contanes a single spelling error. Coming up with theese errors is quite amusing.
  2. Focused the textarea.
  3. Read through the textarea by line, say all and word. Heard the buzz sound alongside each spelling error.

Known issues with pull request:

  1. This currently completely replaces the "spelling error" text. I guess some users might not like this. However, we don't currently have a mechanism for configuring sounds instead of spoken messages. We do have the "Play sound for spelling errors while typing" option, but that lives in Keyboard Settings and it's probably odd to have that setting control this functionality.
  2. I'm not quite sure why speakTextInfo was testing for strings in the initial fields sequence. I can't think of a reason for this now. This was implemented 8 years ago in 82c4bdc. Testing for a non-empty sequence should be sufficient.

Change log entry:

New features:
- While reading text, NVDA now plays a sound to indicate spelling errors instead of saying "spelling error".

…g "spelling error".

1. Update getFormatFieldSpeech to produce a WaveFileCommand instead of "spelling error".
2. Tweak speakTextInfo so that when reporting a single character, it reports initial fields for a non-empty sequence, regardless of whether the sequence contains text strings.
    Without this change, when moving by character into a spelling error (with no other initial fields), the sound was not played because the sequence didn't contain any text strings.
@jcsteh jcsteh requested a review from michaelDCurran November 7, 2019 04:52
@LeonarddeR
Copy link
Collaborator

LeonarddeR commented Nov 7, 2019 via email

@jcsteh
Copy link
Contributor Author

jcsteh commented Nov 7, 2019

Have you considered to make it configurable

I mentioned that consideration above. I worry that we don't have a good framework for configuring sounds vs messages for these things, so we'll end up with special case config options/checks scattered throughout. However:

just like with indentation reporting?

I actually forgot about this one; as you say, there's already precedent for this, including the UI. I'll follow that pattern.

Also,may be I'm missing something, but could you describe the 'out of spelling error' behavior

This hasn't changed. It doesn't really make sense to say "out of {buzz}", so we still say "out of spelling error". While admittedly inconsistent, you only get this when moving by character or word, and I don't think there's a better alternative.

as well as how grammar is covered here?

Grammar isn't changed here. Even though the sound is called textError, I noticed we don't report grammar errors while typing. I guess it makes less sense while typing, though, since grammar errors are usually broader than a word. What do you think?

@MarcoZehe
Copy link
Contributor

Also,may be I'm missing something, but could you describe the 'out of spelling error' behavior

This hasn't changed. It doesn't really make sense to say "out of {buzz}", so we still say "out of spelling error". While admittedly inconsistent, you only get this when moving by character or word, and I don't think there's a better alternative.

A different sound maybe, like a lower buzz?

@LeonarddeR
Copy link
Collaborator

I mentioned that consideration above. I worry that we don't have a good framework for configuring sounds vs messages for these things, so we'll end up with special case config options/checks scattered throughout.

I share that concern. May be @feerrenrut has some valuable ideas here?

as well as how grammar is covered here?

Grammar isn't changed here. Even though the sound is called textError, I noticed we don't report grammar errors while typing. I guess it makes less sense while typing, though, since grammar errors are usually broader than a word. What do you think?

I think not covering grammar errors makes the uX a bit weird. We're talking about spelling errors now, but I'm pretty sure this also covers grammar errors. Now if one chooses to only play a sound for spelling errors, either grammar errors might be silenced or still reported as grammar errors with speech.

May be:

  • Play the sound for both spelling and grammar
  • Rename spelling errors in the gui to spelling and grammar errors

@LeonarddeR
Copy link
Collaborator

The following just popped into my mind.

getControlFieldSpeech and getFormatFieldSPeech currently add a lot of strings to a speech sequence. It is therefore not possible to process them later on.

Something like this:

  1. Add a base class: speech.commands.ProxyCommand. Also add speech.commands.TextErrorCommand, which subclasses ProxyCommand
  2. A ProxyCommand can behave as a dynamic command, holding an inner command that is either a string when we're talking about reporting spelling errors with text, and a WaveCommand when we expect SPelling errors to be reported by a wave command.
  3. When speech encounters a ProxyCommand, it should stay in the sequence. The speech manager should be responsible for getting the inner command from the proxy.

In any case, it makes getControlFieldSpeech less complex.

@jcsteh
Copy link
Contributor Author

jcsteh commented Nov 7, 2019 via email

@LeonarddeR
Copy link
Collaborator

why do we need to process this later on? What benefit does that bring?

I could think of NVDA Remote, where the controlling machine wants spelling errors to be reported with sounds whereas the controlled machine uses speech to report them. However, while I can see the benefit of that, it could also cause major confusion about expected behavior.

Taking this to its logical end, many things would eventually become speech commands (emphasis, links, etc.). So sure, we've taken the decision making out of the speech functions, but we still have the code to determine whether to insert those objects, plus now we have hundreds of little command classes.

What I actually consider a problem in the current situation where all these control fields are exposed as strings in a sequence, is that it is still very difficult for add-ons to modify the behavior. Imagine an add-on like audio themes, for example. What if audio themes wants to play a sound for a certain role? I assume it will still have to monkeypatch getPropertiesSpeech at some point.

If contro/format fields are somehow distinguishable as speech commands, or getPropertiesSPeech could use a command that adds some metadata to it, adding customisability to add-ons is as simple as allowing them to filter the speech sequence.

I realize now that this is kind of a different subject, may be.

For #9937, I did something like this:

class _TextChunk(str):
+	"""str subclass to distinguish normal text from field text when processing text info speech."""

@Brian1Gaff
Copy link

Brian1Gaff commented Nov 7, 2019 via email

@jcsteh
Copy link
Contributor Author

jcsteh commented Nov 7, 2019

What I actually consider a problem in the current situation where all these control fields are exposed as strings in a sequence, is that it is still very difficult for add-ons to modify the behavior.

That's fair.

Imagine an add-on like audio themes, for example. What if audio themes wants to play a sound for a certain role? I assume it will still have to monkeypatch getPropertiesSpeech at some point.

I thought about this in the past. getPropertiesSpeech isn't so bad because you could have a filter called at the start of the function which takes the input dict and an output sequence, removes stuff it consumes (e.g. role) and throws stuff into the output sequence (e.g. a sound for the role). Since role isn't present in the dict any more, the normal presentation won't occur. This doesn't work for getFormatFieldSpeech, though, because it can do both entry (spelling error) and exit (out of spelling error). Still, I think I'd prefer to see filters at specific points like this rather than overloading the SpeechCommand processing stage, though there are valid arguments either way.

I do think this is out of scope here, though. :)

@jcsteh jcsteh removed the request for review from michaelDCurran November 7, 2019 10:06
@jcsteh
Copy link
Contributor Author

jcsteh commented Nov 7, 2019

Removing review request until I update this to be configurable. It seems I can't change this to a draft PR now. :(

@feerrenrut
Copy link
Contributor

I do think this is out of scope here, though. :)

I agree. Although this may not be the best place to have it, this discussion is quite valuable. Sorry for making your PR noisey Jamie!

I have been imagining the SpeechSequences to eventually contain ONLY commands of different types. We might add a plainText speech type to start to replace str values, these could have a single "source" eg "name", "role", "description", "NVDA Speech UI". In my opinion this would offer the greatest flexibility with presentation. I actually think perhaps speech commands should be able to "wrap" other speech sequences, This can help with the "into x" and "out of x" complexity. Essentially we would be making the structure a tree rather than a flat sequence.

Some features I would like to use this to enable:

  • Customisable / themeable speech ording and verbosity.
  • More semantics in speech viewer. This would make accessibility testing more pleasant, at a glance being able to know speech is coming from role vs description vs somewhere else

@jcsteh
Copy link
Contributor Author

jcsteh commented Nov 7, 2019 via email

@feerrenrut
Copy link
Contributor

For example, if a link was a nested sequence, the sequence would later have to
be processed to figure out whether to report entry/exit anyway. At some
point, it has to be collapsed down to a linear series of actions.

Yes, I'm thinking that collapsing nodes of the tree at a presentation layer allows for more control and flexibility for how to speak things. Tying the "speech UI", (ie whether to announce "link", speak links at all, beep, or change tone) to the contents of the link has some nice benefits.

Also, whether you report entry/exit often depends heavily on context; e.g. moving by character.

I was mostly thinking about solving the problem of ensuring symmetry, but of course there are situations where asymmetry might be a feature.

@feerrenrut
Copy link
Contributor

@jcsteh Indicated that this should be changed to a draft with #10474 (comment)

Converting this into a draft until the remaining work is completed.

@feerrenrut feerrenrut marked this pull request as draft June 16, 2020 16:15
@codeofdusk
Copy link
Contributor

Any updates on this PR?

@LeonarddeR
Copy link
Collaborator

@jcsteh Happy to take this one if you'd like.

@jcsteh
Copy link
Contributor Author

jcsteh commented Aug 29, 2023

You're welcome to if you'd like, but I also don't want to earn a reputation for expecting others to clean up my half-baked pull requests. :) I probably need to get better at not submitting them at all until they're more complete.

@Adriani90
Copy link
Collaborator

@LeonarddeR are you still planning to take this one?
#7599 has been merged a while ago, so this might have introduced a possibility to separate what is reported by speech and what is reported by tones. Also this is already possible for capital letter, they can be indicated either by tones or speech which can be set in the voice settings category. So maybe we can make use of the same concept here as well?

One further concern we should be aware of:
If capitals are indicated by tones, and we introduce a spelling error tone while reading character by character, this might make things more complicated to distinguish.
So how about introducing tones for spelling errors only when reading by all chunks of text but not character by character?
This would make sure that we don't interfere with capitals tones and the navigation character by character still reports "spelling error" instead of playing a tone.

@LeonarddeR
Copy link
Collaborator

I'm afraid I don't have time currently.

@jcsteh
Copy link
Contributor Author

jcsteh commented Mar 25, 2024

This PR already uses speech refactor (#7599). It uses the spelling error buzz sound already heard after typing a word with a spelling error. However, I never got around to adding configurability to this PR, and #7599 doesn't provide any generic UI for that.

@seanbudd
Copy link
Member

@jcsteh - should we close this abandoned?

@jcsteh jcsteh added the stale PR is stale/old - NV Access has missed reviewing this or this is an abandoned draft label Jun 17, 2024
@jcsteh jcsteh closed this Jun 17, 2024
@seanbudd seanbudd added the Abandoned requested reports or updates are missing since more than 1 year, author or users are not available. label Jun 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Abandoned requested reports or updates are missing since more than 1 year, author or users are not available. feature stale PR is stale/old - NV Access has missed reviewing this or this is an abandoned draft
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Provision of indication options for reporting spelling errors.
8 participants