Bundle URIs
Tier: Free, Premium, Ultimate
Offering: GitLab Self-Managed
Version history
- Introduced in GitLab 17.0 with a flag named gitaly_bundle_uri. Disabled by default.
Gitaly supports Git bundle URIs. Bundle URIs are locations where Git can download one or more bundles to bootstrap the object database before fetching the remaining objects from a remote. Bundle URIs are built into the Git protocol.
Using Bundle URIs can:
- Speed up clones and fetches for users with a poor network connection to the GitLab server. The bundles can be stored on a CDN, making them available around the world.
- Reduce the load on servers that run CI/CD jobs. If CI/CD jobs can pre-load bundles from somewhere else, the remaining work to incrementally fetch missing objects and references creates a lot less load on the server.
Prerequisites
The prerequisites for using bundle URI depend on whether you clone in a CI/CD job or locally in a terminal.
Cloning in CI/CD jobs
To prepare to use bundle URI in CI/CD jobs:
- Set the helper image used by GitLab Runner to a version that runs:
  - Git version 2.49.0 or later.
  - GitLab Runner helper version 18.0 or later.
  This step is required because bundle URI is a mechanism that aims to reduce the load on the Git server during a git clone. When a CI/CD pipeline runs, the git client that initiates the git clone command is GitLab Runner, and the git process runs inside the helper image. Make sure to select an image that corresponds to the operating system distribution and the architecture you use for your GitLab runners.
  You can verify that the image satisfies the requirements by running these commands:
    docker run -it <image:tag>
    $ git version
    $ gitlab-runner-helper -v
  We rely on the operating system distribution's package manager to manage the Git version in the gitlab-runner-helper image. Therefore, some of the latest available images might still not run Git 2.49. If you do not find an image that meets the requirements, use gitlab-runner-helper as a base image for your own custom-built image. You can host your custom-built image in the GitLab container registry.
- Configure your GitLab Runner instances to use the selected image by updating your config.toml file:
    [[runners]]
      (...)
      executor = "docker"
      [runners.docker]
        (...)
        helper_image = "image:tag" ## <-- put the image name and tag here
  For more details, see information on the helper image.
- Restart the runners for the new configuration to take effect.
- Enable the FF_USE_GIT_NATIVE_CLONE GitLab Runner feature flag in your .gitlab-ci.yml file by setting it to true:
    variables:
      FF_USE_GIT_NATIVE_CLONE: "true"
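The Git version requirement from the first step can also be checked programmatically, for example in a script that validates candidate helper images. The following is a minimal sketch, assuming the output of git version has the usual "git version X.Y.Z" form. The function names are hypothetical and not part of GitLab Runner.

```python
import re

# Minimum Git version stated in the prerequisites above.
MIN_GIT_VERSION = (2, 49, 0)


def parse_git_version(output: str) -> tuple[int, int, int]:
    """Extract (major, minor, patch) from the output of `git version`."""
    match = re.search(r"git version (\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        raise ValueError(f"unrecognized git version output: {output!r}")
    # A missing patch component is treated as 0.
    major, minor, patch = match.groups(default="0")
    return int(major), int(minor), int(patch)


def meets_git_requirement(output: str) -> bool:
    """Return True if the reported Git version is 2.49.0 or later."""
    return parse_git_version(output) >= MIN_GIT_VERSION
```

You could pass this function the text printed by git version inside the helper image. Vendor suffixes after the patch number are ignored.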
Cloning locally in your terminal
To prepare to use bundle URI for cloning locally in your terminal, enable bundle-uri in your local Git configuration:
git config --global transfer.bundleuri true
Server configuration
You must configure where the bundles are stored. Gitaly supports the following storage services:
- Google Cloud Storage
- AWS S3 (or compatible)
- Azure Blob Storage
- Local file storage (not recommended)
Configure Azure Blob storage
How you configure Azure Blob storage for Bundle URI depends on the type of installation you have. For self-compiled installations, you must set the AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_KEY environment variables outside of GitLab.
Linux package (Omnibus)
Edit /etc/gitlab/gitlab.rb and configure the bundle_uri.go_cloud_url:
gitaly['env'] = {
  'AZURE_STORAGE_ACCOUNT' => 'azure_storage_account',
  'AZURE_STORAGE_KEY' => 'azure_storage_key' # or 'AZURE_STORAGE_SAS_TOKEN'
}
gitaly['configuration'] = {
  bundle_uri: {
    go_cloud_url: 'azblob://<bucket>'
  }
}
Self-compiled (source)
Edit /home/git/gitaly/config.toml and configure go_cloud_url:
[bundle_uri]
go_cloud_url = "azblob://<bucket>"
Configure Google Cloud storage
Google Cloud Storage (GCS) authenticates using Application Default Credentials. Set up Application Default Credentials on each Gitaly server using either:
- The gcloud auth application-default login command.
- The GOOGLE_APPLICATION_CREDENTIALS environment variable. For self-compiled installations, set the environment variable outside of GitLab.
For more information, see Application Default Credentials.
The destination bucket is configured using the go_cloud_url option.
Linux package (Omnibus)
Edit /etc/gitlab/gitlab.rb and configure the go_cloud_url:
gitaly['env'] = {
  'GOOGLE_APPLICATION_CREDENTIALS' => '/path/to/service.json'
}
gitaly['configuration'] = {
  bundle_uri: {
    go_cloud_url: 'gs://<bucket>'
  }
}
Self-compiled (source)
Edit /home/git/gitaly/config.toml and configure go_cloud_url:
[bundle_uri]
go_cloud_url = "gs://<bucket>"
Configure S3 storage
To configure S3 storage authentication:
- If you authenticate with the AWS CLI, you can use the default AWS session.
- Otherwise, you can use the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables. For self-compiled installations, set the environment variables outside of GitLab.
For more information, see AWS Session documentation.
The destination bucket and region are configured using the go_cloud_url option.
Linux package (Omnibus)
Edit /etc/gitlab/gitlab.rb and configure the go_cloud_url:
gitaly['env'] = {
  'AWS_ACCESS_KEY_ID' => 'aws_access_key_id',
  'AWS_SECRET_ACCESS_KEY' => 'aws_secret_access_key'
}
gitaly['configuration'] = {
  bundle_uri: {
    go_cloud_url: 's3://<bucket>?region=us-west-1'
  }
}
Self-compiled (source)
Edit /home/git/gitaly/config.toml and configure go_cloud_url:
[bundle_uri]
go_cloud_url = "s3://<bucket>?region=us-west-1"
Configure S3-compatible servers
Version history
- awssdk parameter introduced in GitLab 15.10.
- use_path_style and disable_https parameters introduced in GitLab 17.4.
S3-compatible servers such as MinIO are configured similarly to S3 with the addition of the endpoint parameter.
The following parameters are supported:
- region: The AWS region.
- endpoint: The endpoint URL.
- disableSSL: Set to true to disable SSL. Available for GitLab 17.4.0 and earlier. For GitLab versions after 17.4.0, use disable_https.
- disable_https: Set to true to disable HTTPS in the endpoint options.
- s3ForcePathStyle: Set to true to force path-style URLs for S3 objects. Unavailable in GitLab versions 17.4.0 to 17.4.3. In those versions, use use_path_style instead.
- use_path_style: Set to true to enable path-style S3 URLs (https://<host>/<bucket> instead of https://<bucket>.<host>).
- awssdk: Force a particular version of AWS SDK. Set to v1 to force AWS SDK v1 or v2 to force AWS SDK v2. If:
  - Set to v1, you must use disableSSL instead of disable_https.
  - Not set, defaults to v2.
use_path_style was introduced when the Go Cloud Development Kit dependency was updated from v0.38.0 to v0.39.0, which switched from AWS SDK v1 to v2. However, the s3ForcePathStyle parameter was restored in GitLab 17.4.4 after the gocloud.dev maintainers added backward compatibility support. For more information, see issue 6489.
disable_https was introduced in the Go Cloud Development Kit v0.40.0 (AWS SDK v2).
awssdk was introduced in the Go Cloud Development Kit v0.24.0.
Linux package (Omnibus)
Edit /etc/gitlab/gitlab.rb and configure the go_cloud_url:
gitaly['env'] = {
  'AWS_ACCESS_KEY_ID' => 'minio_access_key_id',
  'AWS_SECRET_ACCESS_KEY' => 'minio_secret_access_key'
}
gitaly['configuration'] = {
  bundle_uri: {
    go_cloud_url: 's3://<bucket>?region=minio&endpoint=my.minio.local:8080&disable_https=true&use_path_style=true'
  }
}
Self-compiled (source)
Edit /home/git/gitaly/config.toml and configure go_cloud_url:
[bundle_uri]
go_cloud_url = "s3://<bucket>?region=minio&endpoint=my.minio.local:8080&disable_https=true&use_path_style=true"
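Because the go_cloud_url value is an ordinary URL with query parameters, it can be assembled and checked with standard URL tooling before it goes into the configuration. The following is a minimal sketch, assuming the newer parameter names (disable_https and use_path_style) apply to your GitLab version. The function name and the bucket name my-bucket are hypothetical.

```python
from urllib.parse import urlencode


def build_s3_go_cloud_url(bucket, region, endpoint=None,
                          disable_https=False, use_path_style=False):
    """Assemble an s3:// go_cloud_url from the parameters described above."""
    params = {"region": region}
    if endpoint:
        params["endpoint"] = endpoint
    if disable_https:
        params["disable_https"] = "true"
    if use_path_style:
        params["use_path_style"] = "true"
    # Keep ':' unescaped so a host:port endpoint stays readable.
    return f"s3://{bucket}?{urlencode(params, safe=':')}"


if __name__ == "__main__":
    print(build_s3_go_cloud_url(
        "my-bucket", "minio",
        endpoint="my.minio.local:8080",
        disable_https=True,
        use_path_style=True,
    ))
```

With only a bucket and a region, the helper produces the plain AWS S3 form. The optional arguments add the extra parameters used for S3-compatible servers.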
Generating bundles
After Gitaly is configured, Gitaly can generate bundles either manually or automatically.
Manual generation
This command generates the bundle and stores it on the configured storage service.
sudo -u git -- /opt/gitlab/embedded/bin/gitaly bundle-uri \
--config=<config-file> \
--storage=<storage-name> \
--repository=<relative-path>
Gitaly does not automatically refresh the generated bundle. When you want to generate a more recent version of a bundle, you must run the command again.
You can schedule this command with a tool like cron(8).
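If you refresh bundles for several repositories on a schedule, a small wrapper script can call the command once per repository. The following is a minimal sketch that reuses only the command and flags shown above. The function names are hypothetical, and the configuration file path, storage name, and relative paths you pass in depend on your installation.

```python
import subprocess

# Path of the Gitaly binary as used in the manual generation command above.
GITALY_BIN = "/opt/gitlab/embedded/bin/gitaly"


def bundle_uri_command(config_file, storage, relative_path):
    """Build the argument list for generating one repository's bundle."""
    return [
        "sudo", "-u", "git", "--",
        GITALY_BIN, "bundle-uri",
        f"--config={config_file}",
        f"--storage={storage}",
        f"--repository={relative_path}",
    ]


def refresh_bundles(config_file, storage, relative_paths):
    """Regenerate bundles one repository at a time, stopping on the first failure."""
    for relative_path in relative_paths:
        subprocess.run(
            bundle_uri_command(config_file, storage, relative_path),
            check=True,
        )
```

A cron entry could then run a script that calls refresh_bundles with the list of repositories you want to keep fresh.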
Automatic generation
Version history
- Introduced in GitLab 18.0 with a flag named gitaly_bundle_generation. Disabled by default.
The availability of this feature is controlled by a feature flag. For more information, see the history.
Gitaly can generate bundles automatically by determining if it is handling frequent clones for the same repository.
The current heuristic keeps track of the number of times a git fetch request is issued for each repository. If the number of requests reaches a certain threshold in a given interval, Gitaly automatically generates a bundle.
Gitaly also keeps track of the last time it generated a bundle for a repository. When the threshold and interval indicate that a bundle should be regenerated, Gitaly checks when the last bundle was generated for that repository. Gitaly generates a new bundle only if the existing bundle is older than the maxBundleAge configuration, in which case the old bundle is overwritten. There can only be one bundle per repository in cloud storage.
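The heuristic described above can be illustrated with a small model. The following sketch is not Gitaly's implementation. It assumes a sliding window of fetch timestamps per repository, and the class name, field names, and units (seconds) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BundleTracker:
    threshold: int         # fetches required within the interval
    interval: float        # length of the counting window, in seconds
    max_bundle_age: float  # minimum age before a bundle is replaced, in seconds
    fetch_times: dict = field(default_factory=dict)
    last_generated: dict = field(default_factory=dict)

    def record_fetch(self, repo: str, now: float) -> bool:
        """Record a fetch and return True if a bundle should be generated now."""
        # Keep only the fetches that fall inside the current interval.
        recent = [t for t in self.fetch_times.get(repo, [])
                  if now - t < self.interval]
        recent.append(now)
        self.fetch_times[repo] = recent
        if len(recent) < self.threshold:
            return False
        # Skip generation while the existing bundle is still fresh enough.
        last = self.last_generated.get(repo)
        if last is not None and now - last <= self.max_bundle_age:
            return False
        # The new bundle overwrites the old one: one bundle per repository.
        self.last_generated[repo] = now
        return True
```

In this model, a repository that is fetched often triggers one bundle, and further fetches do nothing until that bundle is older than max_bundle_age.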
Bundle URI example
The following comparison shows the difference between cloning gitlab.com/gitlab-org/gitlab.git with and without using bundle URI:
- When not using a Bundle URI, there were 5,271,177 objects received from the GitLab server.
- When using a Bundle URI, there were 1,322,255 objects received from the GitLab server.
This reduction means GitLab needs to pack together fewer objects (in this comparison, roughly a quarter of the number of objects) because the client first downloaded the bundle from the storage server.
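The "roughly a quarter" figure follows directly from the two object counts. A short calculation, using only the numbers quoted above:

```python
objects_without_bundle = 5_271_177
objects_with_bundle = 1_322_255

# Share of objects the GitLab server still has to send when a bundle is used.
ratio = objects_with_bundle / objects_without_bundle
reduction = 1 - ratio

print(f"{ratio:.1%} of the objects, a {reduction:.1%} reduction")
```

The server sends about 25.1% of the objects, which is a reduction of about 74.9%.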
Securing bundles
The bundles are made accessible to the client using signed URLs. A signed URL is
a URL that provides limited permissions and time to make a request. To see if
your storage service supports signed URLs, see the documentation of your storage
service.
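To illustrate the general idea of a signed URL, the following sketch signs a URL together with an expiry time by using an HMAC and verifies it later. This is a generic illustration only. It is not the signing scheme of any particular storage service or of Gitaly, and the function names, query parameter names, and example host are hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse


def _signature(base_url: str, expires_at: int, secret: bytes) -> str:
    """Compute an HMAC-SHA256 over the URL and its expiry time."""
    payload = f"{base_url}\n{expires_at}".encode()
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def sign_url(base_url: str, secret: bytes, expires_at: int) -> str:
    """Return base_url with an expiry time and a signature appended."""
    query = urlencode({
        "expires": expires_at,
        "signature": _signature(base_url, expires_at, secret),
    })
    return f"{base_url}?{query}"


def verify_url(signed_url: str, secret: bytes, now: int) -> bool:
    """Accept the URL only if it is unexpired and the signature matches."""
    parsed = urlparse(signed_url)
    query = parse_qs(parsed.query)
    try:
        expires_at = int(query["expires"][0])
        signature = query["signature"][0]
    except (KeyError, ValueError):
        return False
    base_url = f"{parsed.scheme}://{parsed.netloc}{parsed.path}"
    expected = _signature(base_url, expires_at, secret)
    return now <= expires_at and hmac.compare_digest(signature, expected)
```

The signature covers both the path and the expiry time, so a client cannot extend the validity period or point the URL at a different object without invalidating it.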