Working with 10+MB strings #34

Open
sunny-chung opened this issue Nov 18, 2024 · 0 comments

Environment:

ktreesitter 0.23.0
Kotlin 1.9
JDK 17

Description:

ktreesitter takes 20 seconds to do the initial parse of a 13 MB string. Everything else, such as incremental updates and AST queries, works fine and is fast. This is the code I am using for the initial parse:

        ast = parser.parse { byte, point ->
            if (byte in 0u until text.length.toUInt()) {
                // return exactly one character per callback invocation
                s.substring(byte.toInt()..byte.toInt()).let {
                    val codePoints = it.codePoints().toArray()
                    if (codePoints.size > 1 || codePoints.first() > 255) {
                        "X" // replace a multi-byte char with a single-byte placeholder
                    } else {
                        it
                    }
                }
            } else {
                "" // end of input; contrary to the docs, returning null here crashes
            }
        }
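
For illustration, the callback above hands back one character per invocation, so a 13 MB input means roughly 13 million callback round trips. A chunked variant might look like the sketch below. It assumes that the callback is allowed to return more than one character per call (the underlying tree-sitter C read callback permits arbitrary chunk sizes, but whether ktreesitter behaves the same way is not confirmed here). `sanitizedChunk` and `CHUNK_SIZE` are hypothetical names, not part of ktreesitter, and the `> 255` threshold simply mirrors the code above.

```kotlin
// Hypothetical helper, not part of ktreesitter.
// Assumption: the parse callback may return a multi-character chunk.
const val CHUNK_SIZE = 64 * 1024

/**
 * Returns up to [chunkSize] characters of [text] starting at [start],
 * replacing every char whose code is above 255 with 'X', as the
 * per-character callback does. Returns "" when [start] is out of range.
 */
fun sanitizedChunk(text: CharSequence, start: Int, chunkSize: Int = CHUNK_SIZE): String {
    if (start < 0 || start >= text.length) return ""
    // Clamp the chunk so it never runs past the end of the text.
    val length = minOf(chunkSize, text.length - start)
    val sb = StringBuilder(length)
    for (i in start until start + length) {
        val c = text[i]
        sb.append(if (c.code > 255) 'X' else c)
    }
    return sb.toString()
}
```

Under those assumptions it would be wired in as `parser.parse { byte, _ -> sanitizedChunk(text, byte.toInt()) }`. Because the helper takes a `CharSequence`, a rope that implements `CharSequence` could be passed in directly without first flattening it to a `String`.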

I also tried passing the string directly, as below. It did not speed things up to an acceptable level, and it ran into trouble with multi-byte characters.

ast = parser.parse(singleByteCharSequence)

My use case is to load and edit a 13 MB JSON document with syntax highlighting. If this works, I will feed in even larger JSON data.

Is there anything that could be done, or any workaround? Not sure if it helps, but I am using a rope data structure on the JVM side.
