GitLab Cells Development Guidelines

For background on GitLab Cells, refer to the design document.

Available Cells / Organization schemas

Below are available schemas related to Cells and Organizations:

Schema | Description
gitlab_main (deprecated) | Being replaced with gitlab_main_cell for the purpose of building the Cells architecture.
gitlab_main_cell | To be renamed to gitlab_main_org. Use for all tables in the main: database that belong to an Organization. For example, projects and groups.
gitlab_main_cell_setting | All tables in the main: database related to cell settings. For example, application_settings.
gitlab_main_clusterwide (deprecated) | All tables in the main: database where all rows, or a subset of rows, need to be present across the cluster in the Cells architecture. For example, plans. For the Cells 1.0 architecture, there are no real clusterwide tables because each cell will have its own database. In effect, these tables will still be stored locally in each cell.
gitlab_main_cell_local | For tables in the main: database that are related to features that are distinct for each cell. For example, zoekt_nodes or shards. These cell-local tables should not have any foreign key references from or to organization tables.
gitlab_ci | Use for all tables in the ci: database that belong to an Organization. For example, ci_pipelines and ci_builds.
gitlab_ci_cell_local | For tables in the ci: database that are related to features that are distinct for each cell. For example, instance_type_ci_runners or ci_cost_settings. These cell-local tables should not have any foreign key references from or to organization tables.
gitlab_main_user | Schema for all User-related tables, for example users and emails. Most user functionality is at the organization level and should use gitlab_main_cell instead (for example, commenting on an issue). For user functionality that is not at the organization level, use this schema. Tables in this schema must strictly belong to a user.
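
The rule that cell-local tables must not have foreign keys from or to organization tables can be illustrated with a small check. This is a minimal sketch in plain Ruby, not GitLab's actual tooling: the TABLE_SCHEMAS map, the schema groupings, and the disallowed_foreign_keys helper are hypothetical names introduced for illustration, and the foreign key pairs are made up.

```ruby
# Hypothetical table => schema assignments, using examples from the list above.
TABLE_SCHEMAS = {
  'projects' => :gitlab_main_cell,
  'groups' => :gitlab_main_cell,
  'zoekt_nodes' => :gitlab_main_cell_local,
  'shards' => :gitlab_main_cell_local
}.freeze

# Assumed grouping of schemas for the purpose of this sketch.
ORGANIZATION_SCHEMAS = %i[gitlab_main_cell gitlab_ci].freeze
CELL_LOCAL_SCHEMAS = %i[gitlab_main_cell_local gitlab_ci_cell_local].freeze

# foreign_keys is a list of [from_table, to_table] pairs.
# Returns the pairs that link a cell-local table with an organization table.
def disallowed_foreign_keys(foreign_keys, schemas = TABLE_SCHEMAS)
  foreign_keys.select do |from, to|
    pair = [schemas[from], schemas[to]]
    pair.any? { |schema| CELL_LOCAL_SCHEMAS.include?(schema) } &&
      pair.any? { |schema| ORGANIZATION_SCHEMAS.include?(schema) }
  end
end

# Made-up foreign keys: the first crosses the boundary, the second does not.
disallowed_foreign_keys([%w[zoekt_nodes projects], %w[projects groups]])
```

In this sketch only the zoekt_nodes to projects pair is reported, because it is the only one that connects a cell-local table with an organization table.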

Most tables will require a sharding key to be defined.

To understand how existing tables are classified, you can use this dashboard.

After a schema has been assigned, the merge request pipeline might fail due to one or more of the following reasons, which can be rectified by following the linked guidelines:

What schema to choose if the feature can be cluster-wide?

The gitlab_main_clusterwide schema is now deprecated. We will ask teams to update tables from gitlab_main_clusterwide to gitlab_main_cell as required. This requires adding sharding keys to these tables, and may require additional changes to related features to scope them to the Organizational level.

Clusterwide features are heavily discouraged, and there are no plans to perform any cluster-wide synchronization.

Choose a different schema from the list of available GitLab schemas instead. We expect most tables to use the gitlab_main_cell schema, especially if the table is related to projects or namespaces. Another alternative is the gitlab_main_cell_local schema.

Consult with the Tenant Scale group: If you believe you require a clusterwide feature, seek design input from the Tenant Scale group. Here are some considerations to think about:

  • Can the feature be scoped per Organization (or lower) instead?
  • The related feature must work on multiple cells, not just the legacy cell.
  • How would the related feature scale across many Organizations and Cells?
  • How will data be stored?
  • How will organizations reference the data consistently? Can you use globally unique identifiers?
  • Does the data need to be consistent across different cells?
  • Do not use database tables to store static data.

Static data

Problem: A database table is used to store static data. However, the primary key is not static because it uses an auto-incrementing sequence. This means the primary key is not globally consistent.

References to this inconsistent primary key cause problems because the same ID can point to different rows on different cells or organizations.

Example: The plans table on a given Cell has the following data:

 id |             name             |              title
----+------------------------------+----------------------------------
  1 | default                      | Default
  2 | bronze                       | Bronze
  3 | silver                       | Silver
  5 | gold                         | Gold
  7 | ultimate_trial               | Ultimate Trial
  8 | premium_trial                | Premium Trial
  9 | opensource                   | Opensource
  4 | premium                      | Premium
  6 | ultimate                     | Ultimate
 10 | ultimate_trial_paid_customer | Ultimate Trial for Paid Customer
(10 rows)

On another cell, the plans table has different ids for the same names:

 id |             name             |            title
----+------------------------------+------------------------------
  1 | default                      | Default
  2 | bronze                       | Bronze
  3 | silver                       | Silver
  4 | premium                      | Premium
  5 | gold                         | Gold
  6 | ultimate                     | Ultimate
  7 | ultimate_trial               | Ultimate Trial
  8 | ultimate_trial_paid_customer | Ultimate Trial Paid Customer
  9 | premium_trial                | Premium Trial
 10 | opensource                   | Opensource

This plans.id column is then used as a reference in the hosted_plan_id column of the gitlab_subscriptions table.
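
To make the clash concrete, the two example tables above can be compared directly. The following is a minimal sketch in plain Ruby: the constants CELL_A_PLANS and CELL_B_PLANS and the conflicting_plan_ids helper are hypothetical names, and the hashes simply restate the id and name columns shown above.

```ruby
# plans.id => plans.name on the first example cell.
CELL_A_PLANS = {
  1 => 'default', 2 => 'bronze', 3 => 'silver', 4 => 'premium', 5 => 'gold',
  6 => 'ultimate', 7 => 'ultimate_trial', 8 => 'premium_trial',
  9 => 'opensource', 10 => 'ultimate_trial_paid_customer'
}.freeze

# plans.id => plans.name on the second example cell.
CELL_B_PLANS = {
  1 => 'default', 2 => 'bronze', 3 => 'silver', 4 => 'premium', 5 => 'gold',
  6 => 'ultimate', 7 => 'ultimate_trial', 8 => 'ultimate_trial_paid_customer',
  9 => 'premium_trial', 10 => 'opensource'
}.freeze

# Returns the ids whose plan name differs between the two cells.
def conflicting_plan_ids(cell_a, cell_b)
  (cell_a.keys | cell_b.keys).sort.reject { |id| cell_a[id] == cell_b[id] }
end

conflicting_plan_ids(CELL_A_PLANS, CELL_B_PLANS) # => [8, 9, 10]
```

A hosted_plan_id of 8 would therefore mean premium_trial on the first cell and ultimate_trial_paid_customer on the second.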

Solution: Use globally unique references, not a database sequence. If possible, hard-code static data in application code, instead of using the database.

In this case, the plans table can be dropped, and replaced with a fixed model:

class Plan
  include ActiveModel::Model
  include ActiveModel::Attributes
  include ActiveRecord::FixedItemsModel::Model

  ITEMS = [
    { id: 1, name: 'default', title: 'Default' },
    { id: 2, name: 'bronze', title: 'Bronze' },
    { id: 3, name: 'silver', title: 'Silver' },
    { id: 4, name: 'premium', title: 'Premium' },
    { id: 5, name: 'gold', title: 'Gold' },
    { id: 6, name: 'ultimate', title: 'Ultimate' },
    { id: 7, name: 'ultimate_trial', title: 'Ultimate Trial' },
    { id: 8, name: 'ultimate_trial_paid_customer', title: 'Ultimate Trial Paid Customer' },
    { id: 9, name: 'premium_trial', title: 'Premium Trial' },
    { id: 10, name: 'opensource', title: 'Opensource' }
  ].freeze

  attribute :id, :integer
  attribute :name, :string
  attribute :title, :string
end

The hosted_plan_id column will also be updated to refer to the fixed model's id value.
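
One way to picture that update is to translate each cell-local plans.id into the fixed id by going through the plan name, which is the only value that is stable across cells. The sketch below is plain Ruby and does not use GitLab's fixed-items module: the FixedPlan class, its lookup methods, and the fixed_plan_id helper are hypothetical stand-ins, and the name-based remapping is an assumption about how such an update could be done, not a description of the actual migration.

```ruby
# Plain-Ruby stand-in for a fixed-items model (hypothetical, for illustration).
class FixedPlan
  attr_reader :id, :name, :title

  # A subset of the fixed items shown above.
  ITEMS = [
    { id: 1, name: 'default', title: 'Default' },
    { id: 4, name: 'premium', title: 'Premium' },
    { id: 9, name: 'premium_trial', title: 'Premium Trial' }
  ].freeze

  def initialize(id:, name:, title:)
    @id = id
    @name = name
    @title = title
  end

  # Builds the in-memory records once and reuses them.
  def self.all
    @all ||= ITEMS.map { |attrs| new(**attrs) }.freeze
  end

  def self.find(id)
    all.find { |plan| plan.id == id }
  end

  def self.find_by_name(name)
    all.find { |plan| plan.name == name }
  end
end

# cell_plans maps a cell-local plans.id to its name.
# Returns the fixed id for that name, or nil if the name is unknown.
def fixed_plan_id(cell_local_id, cell_plans)
  name = cell_plans.fetch(cell_local_id)
  FixedPlan.find_by_name(name)&.id
end

# On the first example cell, id 8 is premium_trial.
cell_plans = { 4 => 'premium', 8 => 'premium_trial' }
fixed_plan_id(8, cell_plans) # => 9
```

Because the lookup never touches a database sequence, every cell resolves the same name to the same id.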

Examples of hard-coding static data include:

Cells Routing

Coming soon: a guide on how to route requests to your organization's cell.