Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Environments where Sentry is not enabled still instrument Rails Application #2399

Closed
AwolDes opened this issue Sep 11, 2024 · 5 comments
Closed
Assignees

Comments

@AwolDes
Copy link

AwolDes commented Sep 11, 2024

Issue Description

We merged Sentry into our staging environment the other week. We use both DelayedJob and ActiveJob APIs. In some cases we will queue a job like Class.delay.method_name(arg, arg_02) and in others we will queue using the ActiveJob syntax like SomeJob.perform_later

A few unexpected things happened:

  • Our staging environment was unable to queue jobs due to this ambiguous error:
    Image

  • Sentry injects args to jobs - we ha a test fail with this error, due to the test asserting the log line was only including the 1 argument (123456789)

unexpected invocation: #<AnyInstance:RemoveExcessRecacheJobsJob>.log_info("Dedup jobs: Deleting 2 jobs for a Timesheet with source Timesheet#method and arguments: [123456789, {:sentry=>{\"sentry-trace\"=>\"xxx-xxx\", \"baggage\"=>\"sentry-trace_id=xxx,sentry-environment=test,sentry-release=xxx,sentry-public_key=xxx\"}}]")
  • Even though the Sentry config only listed staging in the enabled_environments it seems that Sentry was still instrumenting in production and not emitting events. Is this the expected behaviour?

We know this because when we reverted including Sentry, a of of jobs began to fail with ArgumentErrors - see below screenshot. Existing jobs that had been queued were being ran with extra arguments. This is the key problem we're interested in learning more about, as it unexpectedly caused 10s of thousands of jobs to error.

Image

Thanks for any further clarification you're able to provide!

System details:

Rails version: Edge - the latest Rails master commit
Ruby version: 3.3.4
OS: Ubuntu 20.04
sentry-delayed_job: 5.19.0
sentry-rails: 5.19.0
sentry-ruby: 5.19.0

Reproduction Steps

TBD - we are attempting to reproduce in dev, currently this has only been reproduced in a staging environment.

Expected Behavior

If an environment is not listed in enabled_environments - Sentry should not even be instrumenting that environment. Reason being is it becomes unsafe to roll out the addition of Sentry via staging, because regardless production will still be instrumented.

Actual Behavior

It seems no matter what the list is in enabled_environments the application is instrumented, and simply does not emit the events to Sentry.

Ruby Version

3.3.4

SDK Version

5.19.0

Integration and Its Version

Rails & DelayedJob

Sentry Config

# typed: false
# frozen_string_literal: true

require "sentry-ruby"

Sentry.init do |config|
  config.dsn = "OUR_DSN"
  config.breadcrumbs_logger = [:active_support_logger, :http_logger]

  # In tests, calling Instance.info.commit_hash will override expected mock data
  release_hash = Rails.env.test? ? nil : Instance.info.commit_hash

  config.release = "project-#{release_hash}" || "NoCommitHash-#{Time.zone.now.to_i}"
  config.include_local_variables = true
  config.enabled_environments = ["staging"]
  config.environment = Rails.env
  # Appends to defaults Sentry::Configuration::IGNORE_DEFAULT
  config.excluded_exceptions += [
    "CGI::Session::CookieStore::TamperedWithCookie",
    "ActionController::InvalidAuthenticityToken",
    "ActionDispatch::Cookies::CookieOverflow",
    "ActionController::RoutingError",
    "ActionController::UnknownFormat",
    "ActionController::UnknownHttpMethod",
    "ActionDispatch::Http::Parameters::ParseError",
    "ActionDispatch::Http::MimeNegotiation::InvalidType",
    "ActionController::MissingExactTemplate",
    "JSON::ParserError",
    "SignalException"
  ]

  # config.traces_sample_rate = Rails.env.production? ? 0.001 : 0
  # Base config from https://docs.sentry.io/platforms/ruby/configuration/options/#traces-sampler
  config.traces_sampler = lambda do |sampling_context|
    return 0 unless Rails.env.staging?

    # if this is the continuation of a trace, just use that decision (rate controlled by the caller)
    next sampling_context[:parent_sampled] unless sampling_context[:parent_sampled].nil?

    # the sampling context also has the full rack environment if you want to check the path directly
    rack_env = sampling_context[:env]
    return 0.0 if /healthcheck/.match?(rack_env["PATH_INFO"])

    # transaction_context is the transaction object in hash form
    # keep in mind that sampling happens right after the transaction is initialized
    # for example, at the beginning of the request
    transaction_context = sampling_context[:transaction_context]

    # transaction_context helps you sample transactions with more sophistication
    # for example, you can provide different sample rates based on the operation or name
    op = transaction_context[:op]
    transaction_name = transaction_context[:name]

    case op
    when /http/
      # for Rails applications, transaction_name would be the request's path (env["PATH_INFO"]) instead of "Controller#action"
      case transaction_name
      when /healthcheck/
        0.0
      # API traffic is much large than other traffic
      when /api/
        0.001
      else
        0.01
      end
    when /delayed_job/
      0.0001 # we queue a lot of jobs
    else
      0.0 # ignore all other transactions
    end
  end
end
@sl0thentr0py
Copy link
Member

sl0thentr0py commented Sep 11, 2024

@AwolDes sorry that you faced problems.

So there are several things going on here.

enabled_environments

If an environment is not listed in enabled_environments - Sentry should not even be instrumenting that environment. Reason being is it becomes unsafe to roll out the addition of Sentry via staging, because regardless production will still be instrumented.

We rely on patches to the underlying gems, enabled_environments only controls whether events are sent upstream to the DSN in that env or not. If you want to completely disable sentry instrumentation currently, the only way is to not include the corresponding gem, like sentry-delayed_job in that environment.

Bug in sentry-delayed_job

It seems from your report that the problem is with DelayedJob and not ActiveJob. Can you confirm that ActiveJob works normally?

If yes, there is something going wrong with our DelayedJob integration. For that, can you show me how your DelayedJob job class looks like (including the base class, if any)?

@sl0thentr0py
Copy link
Member

this is very likely because delayed_job doesn't support keyword arguments, see
collectiveidea/delayed_job#1134
https://blog.valerauko.net/2021/12/04/patching-delayed_job-for-ruby-3/

@AwolDes
Copy link
Author

AwolDes commented Sep 11, 2024

Thank you for the clarification about how enabled_environments works. Is it possible to update the documentation to clarify this option in the config?

this is very likely because delayed_job doesn't support keyword arguments, see collectiveidea/delayed_job#1134 https://blog.valerauko.net/2021/12/04/patching-delayed_job-for-ruby-3/

Thanks @sl0thentr0py - I reckon this is on the money. We have a lint rule to prevent new additions of .delay calls, but old ones still exist. I'll give this monkey patch a try over the next week & see if it resolves the issue. Thank you for your help.

@AwolDes
Copy link
Author

AwolDes commented Sep 19, 2024

@sl0thentr0py here is an updated sentry.rb config - spot the bug! The config I copied from the docs broke as soon as the sentry-delayed_job gem was included. Removing these lines from the config resolved the problem

# typed: false
# frozen_string_literal: true

require "sentry-ruby"

Sentry.init do |config|
  config.dsn = "OUR_DSN"
  config.breadcrumbs_logger = [:active_support_logger, :http_logger]

  # In tests, calling Instance.info.commit_hash will override expected mock data
  release_hash = Rails.env.test? ? nil : Instance.info.commit_hash

  config.release = "project-#{release_hash}" || "NoCommitHash-#{Time.zone.now.to_i}"
  config.include_local_variables = true
  config.enabled_environments = ["staging"]
  config.environment = Rails.env
  # Appends to defaults Sentry::Configuration::IGNORE_DEFAULT
  config.excluded_exceptions += [
    "CGI::Session::CookieStore::TamperedWithCookie",
    "ActionController::InvalidAuthenticityToken",
    "ActionDispatch::Cookies::CookieOverflow",
    "ActionController::RoutingError",
    "ActionController::UnknownFormat",
    "ActionController::UnknownHttpMethod",
    "ActionDispatch::Http::Parameters::ParseError",
    "ActionDispatch::Http::MimeNegotiation::InvalidType",
    "ActionController::MissingExactTemplate",
    "JSON::ParserError",
    "SignalException"
  ]

  # config.traces_sample_rate = Rails.env.production? ? 0.001 : 0
  # Base config from https://docs.sentry.io/platforms/ruby/configuration/options/#traces-sampler
  config.traces_sampler = lambda do |sampling_context|
    return 0 unless Rails.env.staging?

    # if this is the continuation of a trace, just use that decision (rate controlled by the caller)
    next sampling_context[:parent_sampled] unless sampling_context[:parent_sampled].nil?

    # the sampling context also has the full rack environment if you want to check the path directl
+  # These lines break in the context of delayed job 
-    rack_env = sampling_context[:env]
-    return 0.0 if /healthcheck/.match?(rack_env["PATH_INFO"])

    # transaction_context is the transaction object in hash form
    # keep in mind that sampling happens right after the transaction is initialized
    # for example, at the beginning of the request
    transaction_context = sampling_context[:transaction_context]

    # transaction_context helps you sample transactions with more sophistication
    # for example, you can provide different sample rates based on the operation or name
    op = transaction_context[:op]
    transaction_name = transaction_context[:name]

    case op
    when /http/
      # for Rails applications, transaction_name would be the request's path (env["PATH_INFO"]) instead of "Controller#action"
      case transaction_name
      when /healthcheck/
        0.0
      # API traffic is much large than other traffic
      when /api/
        0.001
      else
        0.01
      end
    when /delayed_job/
      0.0001 # we queue a lot of jobs
    else
      0.0 # ignore all other transactions
    end
  end
end

@getsantry getsantry bot moved this from Waiting for: Community to Waiting for: Product Owner in GitHub Issues with 👀 3 Sep 19, 2024
@sl0thentr0py
Copy link
Member

ah hmmm I see, I will fix that code to make it safe, thx!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

No branches or pull requests

3 participants