`json_encode` can use SIMD #17672

mvorisek · 2025-02-02T15:08:43Z

Description

The current json_encode implementation [1] iterates on every string character.

I belive there is a potential to utilize SIMD to copy multiple characters to the output string as long as they are not to-be-escaped.

Benchmark: https://3v4l.org/XeJTn/rfc#vgit.master

This can improve performace on applications that do a lot of JSON encoding.

[1] https://github.com/php/php-src/blob/php-8.4.3/ext/json/json_encoder.c#L411

The text was updated successfully, but these errors were encountered:

cmb69 · 2025-02-02T15:27:12Z

There is already https://pecl.php.net/package/simdjson.

mvorisek · 2025-02-02T15:44:50Z

That lib is for decoding JSON. This feature request is about encoding.

cmb69 · 2025-02-02T16:16:07Z

That lib is for decoding JSON. This feature request is about encoding.

Oh, right.

nielsdos · 2025-02-02T18:08:53Z

I played a bit with this, and have this patch: https://gist.github.com/nielsdos/3b42ffaa4476bb5cb7eb498935072312 (somewhat dirty, would need some more work)

Without patch:

json_encode: 88.3 ms
serialize: 2.38 ms

Naive SSE2:

json_encode: 17.53 ms
serialize: 2.34 ms

Semi-naive SSE4.2:

json_encode: 8.88 ms
serialize: 2.37 ms

So 5x speedup on SSE2 and 9.9x speedup on SSE4.2. Not quite 20x (which would not be possible with SSE anyway because those vectors are 16 byte elements and you also have to account for the overhead); but still some win.
Maybe there's some obvious trick I'm missing in my patch especially regarding SSE2. Note also that this could maybe cause a slowdown when you have a lot of special characters next to each other that need escaping. Maybe there are also some articles about fast json encoding.

EDIT: by getting rid of the tail variable it's down to 6.94ms for SSE4.2

nielsdos · 2025-02-02T20:34:26Z

After further simplifying my patch (getting rid of tail handling and simplifying the range check)
I arrive at:

SSE4.2:

json_encode: 6.81 ms
serialize: 2.36 ms

SSE2:

json_encode: 14.87 ms
serialize: 2.34 ms

nielsdos · 2025-02-02T20:39:52Z

I also did a test with an adverserial input. The absolute worst case is a very long string containing only escape characters (or all non-printable ASCII). Because in that case, we have no benefit from the SIMD but do have its overhead.
Both the SSE2 & SSE4.2 have a comparable slowdown compared to not using SIMD in that case: 3.25 ± 0.06 times slower...

Test script for adverserial input:

$test = str_repeat('"', 64 * 100);
for ($i = 0; $i < 10000; $i++) {
    json_encode($test);
}

Ran using hyperfine.

nielsdos · 2025-02-02T21:09:48Z

BTW

I belive there is a potential to utilize SIMD to copy multiple characters to the output string as long as they are not to-be-escaped.

The copying itself happens via memcpy, which is optimized like crazy by the C library already. Still, this takes a lot of time as well but we can't really go around that unless we start doing some very advanced stuff like supporting rope strings everywhere, and even then we get the performance penalty once we need to materialize the rope.

cmb69 · 2025-02-02T21:41:30Z

By the way, I still believe there is some general potential to improve the performance of smart_str(ing) by increasing by a factor (instead of increasing by a summand). I might be wrong, though.

nielsdos · 2025-02-02T21:46:35Z

Wouldn't affect this a lot here because we preallocate a buffer equal to the input length (well unless there's a lot of escaping, then it does matter).
But indeed, in general, using a factor should lead to a lower amortized time complexity. And ideally a factor lower than the golden ratio (e.g. something like 1.5) to have better memory chunk reusage.

We do increase per page though so the effect may only happen on very large string builds:

php-src/Zend/zend_smart_str.c

Lines 27 to 35 in 6ae1209

    
           #define SMART_STR_NEW_LEN(len) \ 
        
           	(ZEND_MM_ALIGNED_SIZE_EX(len + SMART_STR_OVERHEAD, SMART_STR_PAGE) - SMART_STR_OVERHEAD) 
        
           ZEND_API void ZEND_FASTCALL smart_str_erealloc(smart_str *str, size_t len) 
        
           { 
        
           	if (UNEXPECTED(!str->s)) { 
        
           		str->a = len <= SMART_STR_START_LEN 
        
           				? SMART_STR_START_LEN 
        
           				: SMART_STR_NEW_LEN(len);

mvorisek · 2025-02-02T21:50:06Z

The speedup you nailed is awesome 😍 and will greatly improve apps that serve large blobs in JSON. Thank you!

slowdown

If I understand the impl. well, the computed "mask" is currently discarded once "to-be-escaped" character is found. What about reusing the computed knowledge for the remaining characters?

nielsdos · 2025-02-02T21:52:48Z

Indeed. What happens now is that we just move the "pos" to the first character that must be escaped.
In theory, we know based on the set bits in the "mask" which characters need escaping. How to leverage this efficiently is something I'd have to think about more sometime in the upcoming week.

nielsdos · 2025-02-03T21:34:42Z

By reusing the mask information I got the overhead of the worse case down to about 2.07x.
Development PR here if anyone's interested: nielsdos#120, still quite in a WIP state.

mvorisek added Feature Status: Needs Triage labels Feb 2, 2025

nielsdos linked a pull request Feb 8, 2025 that will close this issue

Optimize JSON string encoding #17734

Open

nielsdos added Status: Verified and removed Status: Needs Triage labels Feb 8, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`json_encode` can use SIMD #17672

`json_encode` can use SIMD #17672

mvorisek commented Feb 2, 2025 •

edited

Loading

cmb69 commented Feb 2, 2025

mvorisek commented Feb 2, 2025 •

edited

Loading

cmb69 commented Feb 2, 2025

nielsdos commented Feb 2, 2025 •

edited

Loading

nielsdos commented Feb 2, 2025

nielsdos commented Feb 2, 2025

nielsdos commented Feb 2, 2025

cmb69 commented Feb 2, 2025

nielsdos commented Feb 2, 2025

mvorisek commented Feb 2, 2025

nielsdos commented Feb 2, 2025 •

edited

Loading

nielsdos commented Feb 3, 2025

json_encode can use SIMD #17672

json_encode can use SIMD #17672

Comments

mvorisek commented Feb 2, 2025 • edited Loading

Description

cmb69 commented Feb 2, 2025

mvorisek commented Feb 2, 2025 • edited Loading

cmb69 commented Feb 2, 2025

nielsdos commented Feb 2, 2025 • edited Loading

nielsdos commented Feb 2, 2025

nielsdos commented Feb 2, 2025

nielsdos commented Feb 2, 2025

cmb69 commented Feb 2, 2025

nielsdos commented Feb 2, 2025

mvorisek commented Feb 2, 2025

nielsdos commented Feb 2, 2025 • edited Loading

nielsdos commented Feb 3, 2025

`json_encode` can use SIMD #17672

`json_encode` can use SIMD #17672

mvorisek commented Feb 2, 2025 •

edited

Loading

mvorisek commented Feb 2, 2025 •

edited

Loading

nielsdos commented Feb 2, 2025 •

edited

Loading

nielsdos commented Feb 2, 2025 •

edited

Loading