[Feature Request] Autocorrect with accent #24864

Open
1 of 4 tasks
Gabriele-tomai00 opened this issue Jan 25, 2025 · 5 comments

Comments

@Gabriele-tomai00

Feature Request Type

  • Core functionality
  • Add-on hardware support (eg. audio, RGB, OLED screen, etc.)
  • Alteration (enhancement/optimization) of existing feature(s)
  • New behavior

Description

Hi, I use the standard keyboard layout on my OS.
Is there any way to use Autocorrect to type letters with grave and acute accents?
For example:

c'e -> c'è
perche -> perché
cosi -> così

(I'm Italian.) Thank you

@myst729
Contributor

myst729 commented Feb 12, 2025

Switch to a smarter input method.

@fretep
Contributor

fretep commented Feb 22, 2025

> Switch to a smarter input method.

Not exactly what is being requested.

I've not done much with Unicode or accents, so I may be off base here.

The Autocorrect documentation specifically calls out Unicode characters as not supported in the replacement. The default implementation uses send_string_P to apply the correction, which does not support Unicode.

However, it may be possible to override apply_autocorrect and send the correction through one of the ways QMK offers for sending Unicode. That might work for the replacement text.

I believe the typo matching only supports 8-bit basic keycodes, so Unicode would be a problem there too. From your examples, you aren't using accents or Unicode on the typo side of the dictionary, so this may not be a problem for you or for this request. I suspect this limitation exists to keep memory usage down and to reduce processing on each keypress.

@Gabriele-tomai00
Author

Yes, exactly, thank you for your answer. I also guess that Unicode characters are not allowed in order to avoid slowdowns.
But I still don't understand whether there is a possible solution or not, even just something to try.

Thank you

@fretep
Contributor

fretep commented Feb 23, 2025

You might get more people jumping in to help by asking over in r/qmk/, r/olkb/ or on the QMK Discord. I suspect this is something that should be possible in userspace without changes to core.

@getreuer, have any thoughts on this request?

@getreuer
Contributor

Yes, Autocorrect is restricted to recognizing typos made of the letters A-Z and the single quote '. This probably makes it of limited use in most languages besides English, unfortunately.

The crux of the challenge is that typing in a non-English language is usually done by configuring the host computer's layout to US International or another language-specific layout. With the host configured to a non-QWERTY layout, the usual KC_-prefixed keycodes no longer have their usual meanings, since the OS remaps them. This complicates a feature like Autocorrect, which wants to process the words being typed in the keyboard firmware, before that remapping occurs. (Side note: it is also technically possible to type non-English letters through QMK's Unicode input feature, but this is less practical, so I'll ignore that here.)

Supporting such use in Autocorrect is possible, though tricky. Here is an outline for how it might be done.

In the Python utility autocorrect_data.py, which generates the typo dictionary data, the main change needed is to revise the dict variable TYPO_CHARS, which represents how letters (Python Unicode strings) are mapped to numeric QMK keycodes.

  • A new command line argument is needed to say which layout is configured on the host computer, and the TYPO_CHARS definition needs to be expanded to represent the conversion for that layout to keycodes. For every layout, it needs the mapping from Unicode letter to keycode. Essentially, the definitions under keymap_extras need to be ported to Python, which should hopefully be possible to automate.
  • In addition to the keycodes, it is significant for some layouts whether AltGr (Right Alt mod) is held. And maybe other mods too besides that? So TYPO_CHARS needs to represent this as well.
  • Autocorrect stores the typos in a trie, with letters serialized as 8-bit keycodes. This serialization needs to be revised to also represent AltGr.
  • Autocorrect currently serializes the correction as an ASCII string, to be passed to send_string_P(). But since the correction is no longer ASCII, a revised serialization is needed. Say, maybe a null-terminated sequence of 16-bit keycodes to be passed to tap_code16() (though certainly, this would increase flash memory use).
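The generator-side changes above could be sketched roughly as follows. This is only an illustration, not the actual autocorrect_data.py code: LAYOUT_CHARS, base_letters and encode_correction are hypothetical names, the little-endian byte order is an assumption, and the Italian entries reflect one reading of keymap_italian.h that should be verified before use.

```python
import struct

# Basic keycodes (USB HID usage IDs, as used by QMK's KC_* names).
KC_A = 0x04
KC_MINS = 0x2D
KC_EQL = 0x2E
KC_LBRC = 0x2F
KC_QUOT = 0x34

# Modifier bits of QMK 16-bit keycodes.
QK_LSFT = 0x0200
QK_RALT = 0x1400  # AltGr; unused below, shown for layouts that need it


def base_letters():
    """Map 'a'..'z' to KC_A..KC_Z."""
    return {chr(c): c - ord('a') + KC_A for c in range(ord('a'), ord('z') + 1)}


# Hypothetical per-layout tables: Unicode letter -> 16-bit keycode.
# The 'italian' entries are assumptions based on keymap_italian.h.
LAYOUT_CHARS = {
    'us': {**base_letters(), "'": KC_QUOT},
    'italian': {
        **base_letters(),
        "'": KC_MINS,
        'è': KC_LBRC,
        'é': QK_LSFT | KC_LBRC,
        'ì': KC_EQL,
    },
}


def encode_correction(text, layout):
    """Serialize a correction as null-terminated 16-bit keycodes.

    Little-endian order is an assumption of this sketch.
    """
    table = LAYOUT_CHARS[layout]
    try:
        codes = [table[ch] for ch in text]
    except KeyError as err:
        raise ValueError(f'no keycode for {err.args[0]!r} in layout {layout!r}')
    return struct.pack(f'<{len(codes) + 1}H', *codes, 0)
```

A layout name chosen on the command line would select the table, for example encode_correction('così', 'italian'). Each letter now costs two bytes instead of one, which is the flash increase mentioned above.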

In the firmware in process_autocorrect.c, which compares key events to the typo dictionary:

  • Autocorrect's event handling currently does not consider mods for the most part. This needs revision to remember for which keys AltGr was held, and to consider AltGr when querying the typo dictionary for a match.
  • How corrections are sent needs revision, corresponding to the revised serialization in the earlier bullet. As @fretep said, send_string_P() would not work.
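The matching side can be simulated in Python to show the idea; this is not firmware code. It assumes, hypothetically, that every keycode appearing in a typo is below 0x80 so the top bit of each stored byte is free to record AltGr. TypoMatcher, pack_key and ALTGR_FLAG are made-up names, a plain dict stands in for the trie, and Autocorrect's word-boundary handling is left out.

```python
from collections import deque

# Hypothetical encoding: top bit of the stored byte marks "AltGr was held".
ALTGR_FLAG = 0x80


def pack_key(keycode, altgr_held=False):
    """Combine a basic keycode (< 0x80) with the AltGr state in one byte."""
    if not 0 <= keycode < 0x80:
        raise ValueError('keycode does not fit in 7 bits')
    return keycode | (ALTGR_FLAG if altgr_held else 0)


class TypoMatcher:
    """Remembers recent keys with their AltGr state and looks for typos."""

    def __init__(self, typos, max_len=16):
        # typos: dict mapping a tuple of packed keys to a correction.
        self.typos = typos
        self.buffer = deque(maxlen=max_len)

    def on_key(self, keycode, altgr_held=False):
        """Record one key press; return a correction if a typo just ended."""
        self.buffer.append(pack_key(keycode, altgr_held))
        recent = tuple(self.buffer)
        # Try the longest suffix first, as a trie walk would.
        for n in range(len(recent), 0, -1):
            correction = self.typos.get(recent[-n:])
            if correction is not None:
                self.buffer.clear()
                return correction
        return None
```

With this encoding the same physical key matches differently depending on whether AltGr was held when it was pressed, which is the behavior the first bullet asks for.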

Apologies that there is not a shorter path. I hope this outline helps if a brave soul does pursue this.

4 participants