Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fluentd benchmark #243

Merged
merged 2 commits into from
Jul 27, 2023
Merged

Add fluentd benchmark #243

merged 2 commits into from
Jul 27, 2023

Conversation

k0kubun
Copy link
Member

@k0kubun k0kubun commented Jul 26, 2023

This PR adds a new benchmark fluentd. Fluentd is a log collector that Shopify currently uses for services like Core and SFR. When Core calls Rails.logger.info("foo=bar"), our Fluentd config parses it into {"foo":"bar"} and forwards the log to Splunk, for example.

The benchmark code is inspired by the [JA]How fast really is Ruby 3.x? talk in RubyKaigi 2022. According to him, a Fluentd committer, their parser creates a Ruby object(s) for each log, so the parser is a very runtime-intensive part. Benchmarking the forwarding feature could be challenging in a single-process script, so this benchmarks only the parser. Instead of his synthetic ltsv.rb that is shown later in his talk, this PR uses the actual Fluent::Plugin::LabeledTSVParser, while parsing the same thing as his data.ltsv.

Since he said "CRuby performance is very important for Fluentd" in the talk, I thought it'd be interesting to include it as a Ruby benchmark.

Current result

ruby: ruby 3.3.0dev (2023-07-26T17:58:15Z master 37160be439) [x86_64-linux]
yjit: ruby 3.3.0dev (2023-07-26T17:58:15Z master 37160be439) +YJIT [x86_64-linux]

-------  ---------  ----------  ---------  ----------  ------------  ---------
bench    ruby (ms)  stddev (%)  yjit (ms)  stddev (%)  yjit 1st itr  ruby/yjit
fluentd  1528.4     0.2         1384.9     0.1         1.10          1.10
-------  ---------  ----------  ---------  ----------  ------------  ---------

@maximecb
Copy link
Contributor

Looks good. What are the stats like currently?

@k0kubun
Copy link
Member Author

k0kubun commented Jul 27, 2023

Here's YJIT stats on Ruby master:

***YJIT: Printing YJIT statistics on exit***
method call exit reasons:
    iseq_missing_optional_kw:        974 (40.6%)
                   block_arg:        816 (34.0%)
                    kw_splat:        436 (18.2%)
       cfunc_ruby_array_varg:         67 ( 2.8%)
           klass_megamorphic:         66 ( 2.8%)
                 send_getter:         26 ( 1.1%)
                 interrupted:         10 ( 0.4%)
                 not_fixnums:          2 ( 0.1%)
invokeblock exit reasons:
      proc:        749 (87.1%)
    symbol:        111 (12.9%)
invokesuper exit reasons:
    block:         91 (100.0%)
leave exit reasons:
    interp_return:  3,026,100 (100.0%)
     se_interrupt:          6 ( 0.0%)
getblockparamproxy exit reasons:
    (all relevant counters are zero)
getinstancevariable exit reasons:
    (all relevant counters are zero)
setinstancevariable exit reasons:
    (all relevant counters are zero)
definedivar exit reasons:
    (all relevant counters are zero)
opt_aref exit reasons:
    (all relevant counters are zero)
opt_aref_with exit reasons:
    (all relevant counters are zero)
expandarray exit reasons:
    (all relevant counters are zero)
opt_getinlinecache exit reasons:
    (all relevant counters are zero)
invalidation reasons:
    constant_state_bump:         80 (50.3%)
          method_lookup:         51 (32.1%)
       constant_ic_fill:         28 (17.6%)
num_send:                  6,110,292
num_send_known_class:          8,959 ( 0.1%)
num_send_polymorphic:          8,807 ( 0.1%)
num_send_dynamic:              3,478 ( 0.1%)
num_send_x86_rel32:            3,318
num_send_x86_reg:                  1
iseq_stack_too_large:              0
iseq_too_long:                     0
temp_reg_opnd:                17,400
temp_mem_opnd:                12,628
temp_spill:                   11,015
bindings_allocations:              0
bindings_set:                      0
compiled_iseq_entry:             198
compiled_iseq_count:             245
compiled_blockid_count:        2,492
compiled_block_count:          3,241
versions_per_block:            1.301
compiled_branch_count:         5,569
block_next_count:              2,963
defer_count:                   1,046
defer_empty_count:               201
branch_insn_count:               282
branch_known_count:               48 (17.0%)
freed_iseq_count:                 30
invalidation_count:              159
constant_state_bumps:              0
get_ivar_max_depth:                0
inline_code_size:            328,820
outlined_code_size:          309,413
code_region_size:            684,032
freed_code_size:                   0
live_context_size:           141,897
live_context_count:            4,893
live_page_count:                  42
freed_page_count:                  0
code_gc_count:                     0
num_gc_obj_refs:               2,393
object_shape_count:              713
side_exit_count:                 735
total_exit_count:          3,026,835
total_insns_count:        76,170,760
vm_insns_count:              469,899
yjit_insns_count:         75,701,596
ratio_in_yjit:                 99.4%
avg_len_in_yjit:                25.0
Top-11 most frequent exit ops (100.0% of exits):
             setblockparam:        366 (49.8%)
             setlocal_WC_0:        114 (15.5%)
               invokesuper:         91 (12.4%)
    opt_send_without_block:         73 ( 9.9%)
                      once:         48 ( 6.5%)
      opt_getconstant_path:         27 ( 3.7%)
                     leave:          6 ( 0.8%)
                checkmatch:          5 ( 0.7%)
          opt_regexpmatch2:          2 ( 0.3%)
                    opt_eq:          2 ( 0.3%)
                      send:          1 ( 0.1%)

@maximecb
Copy link
Contributor

Interesting. I'm wondering if we get so little speedup despite such a high percentage because we spend too much time in C, maybe regexp parsing? Otherwise maybe a lot of usage of keyword arguments?

@k0kubun
Copy link
Member Author

k0kubun commented Jul 27, 2023

Parsing time values is probably done by something written in C, but the main LTSV parsing logic is written in Ruby with operations like String#split, Hash#[]=, and local/instance variable accesses. So this particular parser should be a bit more interesting benchmark for YJIT than some other parsers that solely rely on a regexp. We'd need to look at profiling results to see why the speedup is limited despite that.

@k0kubun k0kubun merged commit 4cc2446 into Shopify:main Jul 27, 2023
@k0kubun k0kubun deleted the fluentd branch July 27, 2023 17:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants