Add fluentd benchmark #243

k0kubun · 2023-07-26T22:13:09Z

This PR adds a new benchmark fluentd. Fluentd is a log collector that Shopify currently uses for services like Core and SFR. When Core calls Rails.logger.info("foo=bar"), our Fluentd config parses it into {"foo":"bar"} and forwards the log to Splunk, for example.

The benchmark code is inspired by the [JA]How fast really is Ruby 3.x? talk in RubyKaigi 2022. According to him, a Fluentd committer, their parser creates a Ruby object(s) for each log, so the parser is a very runtime-intensive part. Benchmarking the forwarding feature could be challenging in a single-process script, so this benchmarks only the parser. Instead of his synthetic ltsv.rb that is shown later in his talk, this PR uses the actual Fluent::Plugin::LabeledTSVParser, while parsing the same thing as his data.ltsv.

Since he said "CRuby performance is very important for Fluentd" in the talk, I thought it'd be interesting to include it as a Ruby benchmark.

Current result

ruby: ruby 3.3.0dev (2023-07-26T17:58:15Z master 37160be439) [x86_64-linux]
yjit: ruby 3.3.0dev (2023-07-26T17:58:15Z master 37160be439) +YJIT [x86_64-linux]

-------  ---------  ----------  ---------  ----------  ------------  ---------
bench    ruby (ms)  stddev (%)  yjit (ms)  stddev (%)  yjit 1st itr  ruby/yjit
fluentd  1528.4     0.2         1384.9     0.1         1.10          1.10
-------  ---------  ----------  ---------  ----------  ------------  ---------

benchmarks/fluentd/benchmark.rb

maximecb · 2023-07-27T15:42:09Z

Looks good. What are the stats like currently?

k0kubun · 2023-07-27T16:49:17Z

Here's YJIT stats on Ruby master:

***YJIT: Printing YJIT statistics on exit***
method call exit reasons:
    iseq_missing_optional_kw:        974 (40.6%)
                   block_arg:        816 (34.0%)
                    kw_splat:        436 (18.2%)
       cfunc_ruby_array_varg:         67 ( 2.8%)
           klass_megamorphic:         66 ( 2.8%)
                 send_getter:         26 ( 1.1%)
                 interrupted:         10 ( 0.4%)
                 not_fixnums:          2 ( 0.1%)
invokeblock exit reasons:
      proc:        749 (87.1%)
    symbol:        111 (12.9%)
invokesuper exit reasons:
    block:         91 (100.0%)
leave exit reasons:
    interp_return:  3,026,100 (100.0%)
     se_interrupt:          6 ( 0.0%)
getblockparamproxy exit reasons:
    (all relevant counters are zero)
getinstancevariable exit reasons:
    (all relevant counters are zero)
setinstancevariable exit reasons:
    (all relevant counters are zero)
definedivar exit reasons:
    (all relevant counters are zero)
opt_aref exit reasons:
    (all relevant counters are zero)
opt_aref_with exit reasons:
    (all relevant counters are zero)
expandarray exit reasons:
    (all relevant counters are zero)
opt_getinlinecache exit reasons:
    (all relevant counters are zero)
invalidation reasons:
    constant_state_bump:         80 (50.3%)
          method_lookup:         51 (32.1%)
       constant_ic_fill:         28 (17.6%)
num_send:                  6,110,292
num_send_known_class:          8,959 ( 0.1%)
num_send_polymorphic:          8,807 ( 0.1%)
num_send_dynamic:              3,478 ( 0.1%)
num_send_x86_rel32:            3,318
num_send_x86_reg:                  1
iseq_stack_too_large:              0
iseq_too_long:                     0
temp_reg_opnd:                17,400
temp_mem_opnd:                12,628
temp_spill:                   11,015
bindings_allocations:              0
bindings_set:                      0
compiled_iseq_entry:             198
compiled_iseq_count:             245
compiled_blockid_count:        2,492
compiled_block_count:          3,241
versions_per_block:            1.301
compiled_branch_count:         5,569
block_next_count:              2,963
defer_count:                   1,046
defer_empty_count:               201
branch_insn_count:               282
branch_known_count:               48 (17.0%)
freed_iseq_count:                 30
invalidation_count:              159
constant_state_bumps:              0
get_ivar_max_depth:                0
inline_code_size:            328,820
outlined_code_size:          309,413
code_region_size:            684,032
freed_code_size:                   0
live_context_size:           141,897
live_context_count:            4,893
live_page_count:                  42
freed_page_count:                  0
code_gc_count:                     0
num_gc_obj_refs:               2,393
object_shape_count:              713
side_exit_count:                 735
total_exit_count:          3,026,835
total_insns_count:        76,170,760
vm_insns_count:              469,899
yjit_insns_count:         75,701,596
ratio_in_yjit:                 99.4%
avg_len_in_yjit:                25.0
Top-11 most frequent exit ops (100.0% of exits):
             setblockparam:        366 (49.8%)
             setlocal_WC_0:        114 (15.5%)
               invokesuper:         91 (12.4%)
    opt_send_without_block:         73 ( 9.9%)
                      once:         48 ( 6.5%)
      opt_getconstant_path:         27 ( 3.7%)
                     leave:          6 ( 0.8%)
                checkmatch:          5 ( 0.7%)
          opt_regexpmatch2:          2 ( 0.3%)
                    opt_eq:          2 ( 0.3%)
                      send:          1 ( 0.1%)

maximecb · 2023-07-27T17:00:07Z

Interesting. I'm wondering if we get so little speedup despite such a high percentage because we spend too much time in C, maybe regexp parsing? Otherwise maybe a lot of usage of keyword arguments?

k0kubun · 2023-07-27T17:14:04Z

Parsing time values is probably done by something written in C, but the main LTSV parsing logic is written in Ruby with operations like String#split, Hash#[]=, and local/instance variable accesses. So this particular parser should be a bit more interesting benchmark for YJIT than some other parsers that solely rely on a regexp. We'd need to look at profiling results to see why the speedup is limited despite that.

Add fluentd benchmark

6c85a07

maximecb reviewed Jul 27, 2023

View reviewed changes

benchmarks/fluentd/benchmark.rb Outdated Show resolved Hide resolved

Add more variants to error messages

38d4d0f

k0kubun force-pushed the fluentd branch from 513d5d8 to 38d4d0f Compare July 27, 2023 16:43

maximecb approved these changes Jul 27, 2023

View reviewed changes

k0kubun merged commit 4cc2446 into Shopify:main Jul 27, 2023

k0kubun deleted the fluentd branch July 27, 2023 17:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add fluentd benchmark #243

Add fluentd benchmark #243

k0kubun commented Jul 26, 2023

maximecb commented Jul 27, 2023

k0kubun commented Jul 27, 2023

maximecb commented Jul 27, 2023

k0kubun commented Jul 27, 2023 •

edited

Loading

Add fluentd benchmark #243

Add fluentd benchmark #243

Conversation

k0kubun commented Jul 26, 2023

Current result

maximecb commented Jul 27, 2023

k0kubun commented Jul 27, 2023

maximecb commented Jul 27, 2023

k0kubun commented Jul 27, 2023 • edited Loading

k0kubun commented Jul 27, 2023 •

edited

Loading