[Discussion] Has anyone tired using Postgres with multiple databases holding multiple tenants/schemas? #206

JonMcCarthy7 · 2022-08-25T17:36:05Z

I'd like to hear from people who have been able to successfully accomplish this type of architecture using this gem. I'm currently stuck with connection errors from constantly switching database connections. I think the route I need to take is better managing the connection_handler/connection_pools that connect to each database. Has anyone ran into something similar? Thanks

SkyM · 2022-08-26T22:12:49Z

@JonMcCarthy7 We have been using this in production for ~5 years. At this point we have >100 tenant databases with 1-5gb of data per database. I honestly consider this architecture decision one of the worst in my career. By all metrics we have 'successfully' accomplished this architecture, however it is turning into a major issue as we look forward to scaling and re-factoring our codebase. Backing out of this style multi-tenant database is a HUGE problem. Be advised.

For your specific issue, if you find yourself switching between multiple tenants per request or per lifecycle of a thread, you probably need to reconsider how you are separating data. You should definitely be looking at expanding thread pools and minimizing the amount of switching per request that you are performing.

JonMcCarthy7 · 2022-08-26T22:27:08Z

Hey @SkyM. Thanks for the response. Are you saying you did the 1 Tenant per DB approach? Sorry if I wasn't clear, in my case I'm trying to set up multiple tenants on each database/shard.

SkyM · 2022-08-26T22:41:45Z

We used MySQL and did one tenant per database with a master database to handle shared data.

SkyM · 2022-08-26T22:45:51Z

For context, you may want to read through this discussion on the original library: influitive/apartment#636

oleksii-leonov · 2022-08-27T00:30:36Z

@JonMcCarthy7

We have been using the apartment gem for at least 5 years, and absolutely happy with it.

We using PostgreSQL with a separate schema for each tenant.
Our product is a B2B SaaS, designed for large companies in a specific industry. So, we have thousands of clients (not millions).
We have a significant amount of business logic and ~250 tables in each tenant, some data stored as JSONB.
Tenants could be quite large (hundreds of gigabytes).
We need strict data isolation between tenants.
It's extremely rare we need to collect data from different tenants.
We do switch to the proper tenant once per request. We never do switch inside requests. If we need to show some aggregated data, we use a background job to collect it first and then keep it in the shared schema.

If you have the same use case, this setup with apartment gem and PostgreSQL with a separate schema for each tenant will work like a charm:

It's much easier to migrate 1000 relatively small schemas (with millions of records in tables) than one DB/schema with tables with billions of records.
It's almost impossible to accidentally show one client's data to other clients. In the case of a classic setup with one DB/schema and hundreds of models with the company_id column — it's easy to accidentally miss scope.
It's easy to restore one particular client's schema from DB backup in case of some emergency (if the client accidentally deleted some critical data from its account) without touching other clients' data.

But there are use cases when this setup would not be beneficial:

If you expect to have millions of clients. Separate schemas are pretty expensive for DB (you need much more space and experience in tuning your DB for such load).
If you need to mix data between different clients easily.
If your business logic is relatively small and straightforward and it's possible to control scope.

oleksii-leonov · 2022-08-29T22:19:23Z

@JonMcCarthy7

Sorry, I realized we use multiple DBs with tenants differently than you asked.

In our case, we have multiple DB servers and use separate schemas for each tenant. But we keep all the tenants on each of DBs.

This means we have schema "test-client-1" on all DB servers. But "DB server 1" has tables dedicated to one business domain and "DB server 2" has tables dedicated to another business domain. Each worker keeps a connection to all DBs and switches between tenants with our custom switch method (set search_path for all connected DBs). So, we don't sharding tenants between different DB servers.

tjhanley · 2022-11-12T18:48:10Z

@aleksejleonov

This means we have schema "test-client-1" on all DB servers. But "DB server 1" has tables dedicated to one business domain and "DB server 2" has tables dedicated to another business domain. Each worker keeps a connection to all DBs and switches between tenants with our custom switch method (set search_path for all connected DBs). So, we don't sharding tenants between different DB servers.

I am experimenting with multiple DB Hosts [#208]. I am experiencing a lot of issues. You mention a custom switch method with search_path. When I run the stock switch Apartment doesn't seem to find the tenant's schema on the secondary host.

irb(main):007:0> Apartment.tenants_with_config
=>
{"tenant_1"=>{"adapter"=>"postgresql", "host"=>"localhost", "port"=>5432, "encoding"=>"unicode", "database"=>"development"},
 "tenant_2"=>
  {"host"=>"some-host-ip",
   "adapter"=>"postgresql",
   "database"=>"tenant2_db",
   "username"=>"postgres",
   "port"=>5432,
   "encoding"=>"unicode",
   "password"=>"[redacted]"}}

irb(main):006:0> Apartment::Tenant.switch('tenant_2')
/Users/tom/.rbenv/versions/3.0.4/lib/ruby/gems/3.0.0/gems/ros-apartment-2.11.0/lib/apartment/adapters/postgresql_adapter.rb:139:in `raise_schema_connect_to_new': Could not set search path to schemas, they may be invalid: "tenant_2" "tenant_1". (Apartment::TenantNotFound)
Original error: ActiveRecord::StatementInvalid: Could not find schema tenant_2
/Users/tom/.rbenv/versions/3.0.4/lib/ruby/gems/3.0.0/gems/ros-apartment-2.11.0/lib/apartment/adapters/postgresql_adapter.rb:75:in `connect_to_new': Could not find schema tenant_2 (ActiveRecord::StatementInvalid)

The odd thing is if I issue a Apartment::Tenant.drop('tenant_2') command from the console the schema drops successfully.

irb(main):008:0> Apartment::Tenant.drop('tenant_2')
(90.3ms)  DROP SCHEMA "tenant_2" CASCADE
=> #<Thread::ConditionVariable:0x000000010baf2510>

oleksii-leonov · 2022-11-14T04:08:52Z

@tjhanley

class ApplicationRecord < ActiveRecord::Base
  primary_abstract_class
end

class Db1Base < ApplicationRecord
  self.abstract_class = true
  establish_connection(:db1)
end

class Db2Base < ApplicationRecord
  self.abstract_class = true
  establish_connection(:db2)
end

class ModelMain < ApplicationRecord; end
class Model1 < Db1Base; end
class Model2 < Db2Base; end

module SwitchTenant
  def self.reset
    switch!(Apartment.default_tenant)
  end

  def self.switch!(tenant = nil)
    if tenant.nil?
      reset
    else
      # Main DB
      Apartment::Tenant.switch!(tenant)

      # DB1
      Db1Base.connection.schema_search_path = tenant
      Db1Base.connection.clear_query_cache

      # DB2
      Db2Base.connection.schema_search_path = tenant
      Db2Base.connection.clear_query_cache
    end
  end

  def self.switch(tenant = nil)
    switch!(tenant)
    yield
  ensure
    reset
  end
end

> SwitchTenant.switch!('tenant1')
# Now you will get all the data from the "tenant1" schema in each DB.
> ModelMain.all
> Model1.all
> Model2.all

mnovelo added the stale label Oct 29, 2024

mnovelo closed this as completed Oct 29, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Discussion] Has anyone tired using Postgres with multiple databases holding multiple tenants/schemas? #206

[Discussion] Has anyone tired using Postgres with multiple databases holding multiple tenants/schemas? #206

JonMcCarthy7 commented Aug 25, 2022

SkyM commented Aug 26, 2022

JonMcCarthy7 commented Aug 26, 2022 •

edited

Loading

SkyM commented Aug 26, 2022

SkyM commented Aug 26, 2022

oleksii-leonov commented Aug 27, 2022 •

edited

Loading

oleksii-leonov commented Aug 29, 2022

tjhanley commented Nov 12, 2022

oleksii-leonov commented Nov 14, 2022 •

edited

Loading

[Discussion] Has anyone tired using Postgres with multiple databases holding multiple tenants/schemas? #206

[Discussion] Has anyone tired using Postgres with multiple databases holding multiple tenants/schemas? #206

Comments

JonMcCarthy7 commented Aug 25, 2022

SkyM commented Aug 26, 2022

JonMcCarthy7 commented Aug 26, 2022 • edited Loading

SkyM commented Aug 26, 2022

SkyM commented Aug 26, 2022

oleksii-leonov commented Aug 27, 2022 • edited Loading

oleksii-leonov commented Aug 29, 2022

tjhanley commented Nov 12, 2022

oleksii-leonov commented Nov 14, 2022 • edited Loading

JonMcCarthy7 commented Aug 26, 2022 •

edited

Loading

oleksii-leonov commented Aug 27, 2022 •

edited

Loading

oleksii-leonov commented Nov 14, 2022 •

edited

Loading