Intermittent errors & CI failures after upgrading to Rails 7.1
Are you seeing your CI hang, or random error messages like:
message type 0x43 arrived from server while idle
in your test log?
It could all be related to a missing default setting in your Rails
config.
Introduction
As part of an upgrade to Rails 7.1 we started seeing various CI failures, all of
which involved Postgres becoming unavailable, the test run timing out, or
seemingly random rspec errors.
Only one particular spec file was affected, but we could not immediately
isolate the culprit.
Investigation
I started by disabling all initializers for rspec and reducing the DB pool size,
and kept narrowing things down until I ended up having to disable activejob. It
turns out that ActiveJob uses :async as its default queue adapter in test mode,
which requires an active DB connection so that it can cache the DB schema.
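To make the difference between the two adapters concrete, here is a minimal plain-Ruby sketch. It is not ActiveJob's implementation: FakeAsyncAdapter and FakeTestAdapter are hypothetical stand-ins. It only illustrates the general idea that an in-process async adapter runs the job on a background thread, outside the thread running the spec, whereas a test-style adapter merely records the job.

```ruby
# Hypothetical stand-ins for illustration only, not the real ActiveJob adapters.
class FakeAsyncAdapter
  def initialize
    @threads = []
  end

  # Runs the job on a background thread, outside the caller's thread.
  def enqueue(job)
    @threads << Thread.new { job.call }
  end

  # Blocks until every background job has finished.
  def wait
    @threads.each(&:join)
  end
end

class FakeTestAdapter
  attr_reader :enqueued_jobs

  def initialize
    @enqueued_jobs = []
  end

  # Only records the job; nothing runs unless the test asks for it.
  def enqueue(job)
    @enqueued_jobs << job
  end
end

main_thread = Thread.current

async_ran_on = []
async = FakeAsyncAdapter.new
async.enqueue(-> { async_ran_on << Thread.current })
async.wait

test_runs = 0
recorded = FakeTestAdapter.new
recorded.enqueue(-> { test_runs += 1 })

puts "async job ran off the main thread: #{async_ran_on.first != main_thread}"
puts "test adapter recorded #{recorded.enqueued_jobs.size} job(s) and ran #{test_runs}"
```

In a real application, work done on that background thread would need its own DB connection, which is consistent with the connection trouble described above.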
OK, but what was calling activejob, given that we only make use of sidekiq in
the application? Well, it turns out that activestorage depends on activejob,
which in turn led to the following backtrace:
/ruby/3.1.4/lib/ruby/gems/3.1.0/gems/activestorage-7.1.2/app/jobs/active_storage/analyze_job.rb:5
The solution
The solution was to switch the ActiveJob queue adapter to :test in the test
environment config file (config/environments/test.rb). Adding the configuration
below immediately resolved all of our CI issues:
config.active_job.queue_adapter = :test
This was hinted at in this Rails issue.
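For placement, the adapter setting goes inside the configure block of the test environment file. The sketch below assumes the standard generated layout of that file; the comments and the placeholder for other settings are illustrative, not our actual config.

```ruby
# config/environments/test.rb
Rails.application.configure do
  # ... other test settings left unchanged ...

  # Record enqueued jobs instead of running them on background threads.
  config.active_job.queue_adapter = :test
end
```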