[Feature Request] Autocorrect with accent #24864

Gabriele-tomai00 · 2025-01-25T07:55:11Z

Feature Request Type

Core functionality
Add-on hardware support (eg. audio, RGB, OLED screen, etc.)
Alteration (enhancement/optimization) of existing feature(s)
New behavior

Description

Hi, I have the standard keyboard layout set on my OS.
Is there any way to use Autocorrect to type grave and acute accents?
For example:

c'e -> c'è
perche -> perché
cosi -> così

(I'm Italian.) Thank you.

myst729 · 2025-02-12T03:54:05Z

Switch to a smarter input method.

fretep · 2025-02-22T22:43:02Z

> Switch to a smarter input method.

Not exactly what is being requested.

I've not done much with Unicode or accents, so I may be off base here.

The Autocorrect documentation specifically calls out Unicode characters as not supported in the replacement. The default implementation uses send_string_P to apply the correction, which does not support Unicode.

However, it may be possible to override apply_autocorrect and send the replacement text through one of QMK's ways of sending Unicode. I have not tried this, so it may or may not work.

I believe the typo matching only supports 8-bit basic keycodes, so Unicode will be a problem there too. From your examples, you aren't using accents or Unicode on the typo side of the dictionary, so this may not be a problem for you or this request. I suspect this limitation is there to keep memory usage down and to reduce processing on each keypress.

Gabriele-tomai00 · 2025-02-23T11:17:06Z

Yes, exactly, thank you for your answer. I also guess that Unicode characters are not allowed in order to avoid slowdowns.
But I still don't understand whether there is a possible solution or not, even just something to try.

Thank you

fretep · 2025-02-23T20:34:59Z

You might get more people jumping in to help by asking over in r/qmk/, r/olkb/ or on the QMK Discord. I suspect this is something that should be possible in userspace without changes to core.

@getreuer, have any thoughts on this request?

getreuer · 2025-02-24T08:31:01Z

Yes, Autocorrect is restricted to recognizing typos of letters A–Z and single quote '. This probably makes it of limited use in most languages besides English, unfortunately.

The crux of the challenge is that typing in a non-English language is usually done by configuring the host computer's layout to US International or another language-specific layout. The host being configured to a non-QWERTY layout means that the usual KC_-prefixed keycodes don't have their usual meanings, since the OS remaps them to different meanings. This complicates a feature like Autocorrect, which wants to process the words being typed in the keyboard firmware before that remapping occurs. (Side note: it is also technically possible to type non-English letters through QMK's Unicode input feature, but this is less practical, so I'll ignore that here.)

Supporting such use in Autocorrect is possible, though tricky. Here is an outline for how it might be done.

In the Python utility autocorrect_data.py, which generates the typo dictionary data, the main change needed is to revise the dict variable TYPO_CHARS, which represents how letters (Python Unicode strings) are mapped to numeric QMK keycodes.

- A new command line argument is needed to say which layout is configured on the host computer, and the TYPO_CHARS definition needs to be expanded to represent the conversion from that layout to keycodes. For every layout, it needs the mapping from Unicode letter to keycode. Essentially, the definitions under keymap_extras need to be ported to Python, which should hopefully be possible to automate.
- In addition to the keycodes, it is significant for some layouts whether AltGr (Right Alt mod) is held, and possibly other mods as well. So TYPO_CHARS needs to represent this too.
- Autocorrect stores the typos in a trie, with letters serialized as 8-bit keycodes. This serialization needs to be revised to also represent AltGr.
- Autocorrect currently serializes the correction as an ASCII string, to be passed to send_string_P(). But since the correction is no longer ASCII, a revised serialization is needed. One option is a null-terminated sequence of 16-bit keycodes to be passed to tap_code16(), though this would certainly increase flash memory use.
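
To make the bullets above more concrete, here is a rough Python sketch of what a layout-aware character table and a 16-bit correction serialization could look like. It is only an illustration under assumptions, not code from autocorrect_data.py: the names LAYOUT_CHARS, encode_keys, and serialize_correction are hypothetical, the Italian key positions follow my reading of the Italian definitions under keymap_extras, and the little-endian null-terminated byte format is just one possible choice. Mods are folded into the 16-bit value the same way QMK's S() and ALGR() keycodes do.

```python
import struct

# Basic keycodes (USB HID usage IDs, matching QMK's KC_ names).
KC_A, KC_E = 0x04, 0x08
KC_MINS, KC_EQL, KC_LBRC = 0x2D, 0x2E, 0x2F
KC_NUHS, KC_SCLN, KC_QUOT = 0x32, 0x33, 0x34

# Modifier bits as they appear in QMK 16-bit keycodes.
QK_LSFT = 0x0200  # Left Shift, as in S(kc).
QK_RALT = 0x1400  # Right Alt (AltGr), as in ALGR(kc).


def _letters():
    """Maps 'a'..'z' to KC_A..KC_Z (same positions on US and Italian)."""
    return {chr(ord('a') + i): KC_A + i for i in range(26)}


# Hypothetical per-layout tables: character -> 16-bit keycode with mods.
LAYOUT_CHARS = {
    'us': {**_letters(), "'": KC_QUOT},
    'italian': {
        **_letters(),
        "'": KC_MINS,            # Apostrophe sits on the US minus key.
        'è': KC_LBRC,
        'é': QK_LSFT | KC_LBRC,  # Shift + è key.
        'ì': KC_EQL,
        'ò': KC_SCLN,
        'à': KC_QUOT,
        'ù': KC_NUHS,
        '€': QK_RALT | KC_E,     # AltGr + E.
    },
}


def encode_keys(text, layout):
    """Converts text to the 16-bit keycodes that type it on `layout`."""
    table = LAYOUT_CHARS[layout]
    try:
        return [table[ch] for ch in text]
    except KeyError as err:
        raise ValueError(
            f'{err.args[0]!r} cannot be typed on layout {layout!r}') from None


def serialize_correction(text, layout):
    """Packs a correction as null-terminated little-endian 16-bit keycodes."""
    codes = encode_keys(text, layout) + [0]
    return struct.pack(f'<{len(codes)}H', *codes)
```

With this table, the typo "c'e" on an Italian host becomes the keycodes 0x06, 0x2D, 0x08, and its correction "c'è" becomes 0x06, 0x2D, 0x2F. Note that even the apostrophe on the typo side depends on the host layout, and that each correction letter now costs two bytes instead of one.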

In the firmware in process_autocorrect.c, which compares key events to the typo dictionary:

- Autocorrect's event handling currently does not consider mods for the most part. This needs revision to remember for which keys AltGr was held, and to consider AltGr when querying the typo dictionary for a match.
- How corrections are sent needs revision, corresponding to the revised serialization in the earlier bullet. As @fretep said, send_string_P() would not work.
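
The firmware-side logic described in these two points can be modeled in a few lines of Python. This is a toy model for reasoning about the behavior, not QMK code and not a proposed implementation: AutocorrectModel is a hypothetical name, a plain dict scan stands in for the trie, word-boundary handling is ignored, and it assumes the key that completes the typo has not yet reached the host, so one fewer backspace than the typo length is needed. In real firmware the returned keycodes would be tapped one by one, for instance with tap_code16().

```python
KC_BSPC = 0x2A    # Backspace (USB HID usage ID).
QK_RALT = 0x1400  # AltGr bits in a QMK 16-bit keycode.


class AutocorrectModel:
    """Toy model of typo matching that remembers AltGr per key."""

    def __init__(self, typos):
        # typos: dict mapping a tuple of 16-bit typo keycodes to a list of
        # 16-bit correction keycodes.
        self.typos = typos
        self.max_len = max(len(typo) for typo in typos)
        self.buffer = []

    def process(self, keycode, altgr_held=False):
        """Records a key press; returns keycodes to tap, or None."""
        # Store the AltGr state together with the basic keycode.
        self.buffer.append(keycode | (QK_RALT if altgr_held else 0))
        # Keep only as many recent keys as the longest typo needs.
        del self.buffer[:-self.max_len]
        for typo, correction in self.typos.items():
            if tuple(self.buffer[-len(typo):]) == typo:
                self.buffer.clear()
                # Erase the typo keys already sent, then type the fix.
                return [KC_BSPC] * (len(typo) - 1) + list(correction)
        return None


# "cosi" -> "così", with ì on the US equals key (Italian host layout).
model = AutocorrectModel({(0x06, 0x12, 0x16, 0x0C): [0x06, 0x12, 0x16, 0x2E]})
for kc in (0x06, 0x12, 0x16):
    model.process(kc)
fix = model.process(0x0C)
```

Here fix holds three backspaces followed by the four keycodes of "così". Had the last key been pressed with AltGr held, the stored entry would differ from the typo entry and nothing would match.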

Apologies that there is not a shorter path. I hope this outline helps if a brave soul does pursue this.

Gabriele-tomai00 added enhancement help wanted labels Jan 25, 2025
