# Using a managed object storage service (S3 or GCS)

By default, Sourcegraph will use a `sourcegraph/blobstore` server bundled with the instance to temporarily store [code graph indexes](../../code-navigation/precise-code-navigation) uploaded by users as well as the results of [search jobs](../../code-search/types/search-jobs).

You can instead configure your instance to store this data in an S3 or GCS bucket. Doing so may reduce your hosting costs, as persistent volumes are often more expensive than the equivalent storage space in an object storage service.

<Callout type="info">
	Starting in [Sourcegraph 7.2](https://sourcegraph.com/changelog/releases/7.2), new instances can configure only the
	[Sourcegraph bucket](#sourcegraph-bucket), and Sourcegraph will use that
	single bucket for all features. If a separate bucket is needed for Code Graph
	Indexes or Search Job Results, that can still be configured, but we recommend
	using one bucket.
</Callout>

## Sourcegraph bucket

<Callout type="warning">
	Starting in [Sourcegraph 7.2](https://sourcegraph.com/changelog/releases/7.2), self-hosted Sourcegraph instances using S3 or
	GCS object storage should provision an additional bucket for shared
	Sourcegraph uploads. Sourcegraph
	currently reports a warning when this bucket is not present, and the bucket
	will become required for new features in a future release. No action is
	required if you are using the default `sourcegraph/blobstore`.
</Callout>

The Sourcegraph bucket is intended to be the single bucket for new Sourcegraph features. Instead of creating one bucket per feature, new features store objects under namespaced key prefixes within this bucket.

Existing buckets for code graph indexes and search jobs remain in use. This change ensures future features can be enabled without requiring a new bucket for each feature.

### Using S3 for the Sourcegraph bucket

Set the following environment variables to target an S3 bucket for shared Sourcegraph uploads.

- `SOURCEGRAPH_UPLOAD_BACKEND=S3`
- `SOURCEGRAPH_UPLOAD_BUCKET=<my bucket name>`
- `SOURCEGRAPH_UPLOAD_AWS_ENDPOINT=https://s3.us-east-1.amazonaws.com`
- `SOURCEGRAPH_UPLOAD_AWS_ACCESS_KEY_ID=<your access key>`
- `SOURCEGRAPH_UPLOAD_AWS_SECRET_ACCESS_KEY=<your secret key>`
- `SOURCEGRAPH_UPLOAD_AWS_SESSION_TOKEN=<your session token>` (optional)
- `SOURCEGRAPH_UPLOAD_AWS_USE_EC2_ROLE_CREDENTIALS=true` (optional; set to use the EC2 metadata API instead of static credentials)
- `SOURCEGRAPH_UPLOAD_AWS_USE_PATH_STYLE=false` (optional)
- `SOURCEGRAPH_UPLOAD_AWS_REGION=<bucket region>`
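As a sketch, the variables above might be set in a shell-sourced environment file like the following. The bucket name, keys, and region are placeholders; substitute your own values.

```shell
# Hypothetical values -- replace with your own bucket, keys, and region.
export SOURCEGRAPH_UPLOAD_BACKEND=S3
export SOURCEGRAPH_UPLOAD_BUCKET=my-sourcegraph-uploads
export SOURCEGRAPH_UPLOAD_AWS_REGION=us-east-1
export SOURCEGRAPH_UPLOAD_AWS_ENDPOINT=https://s3.us-east-1.amazonaws.com
export SOURCEGRAPH_UPLOAD_AWS_ACCESS_KEY_ID=AKIAEXAMPLEKEY
export SOURCEGRAPH_UPLOAD_AWS_SECRET_ACCESS_KEY=exampleSecretKey
```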

### Using GCS for the Sourcegraph bucket

Set the following environment variables to target a GCS bucket for shared Sourcegraph uploads.

- `SOURCEGRAPH_UPLOAD_BACKEND=GCS`
- `SOURCEGRAPH_UPLOAD_BUCKET=<my bucket name>`
- `SOURCEGRAPH_UPLOAD_GCP_PROJECT_ID=<my project id>`
- `SOURCEGRAPH_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE=</path/to/file>` (optional)
- `SOURCEGRAPH_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE_CONTENT=<{"my": "content"}>` (optional)

If you are running on GKE with Workload Identity, or otherwise relying on
Application Default Credentials, you can omit the GCS credentials file
variables.
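For example, a minimal GCS configuration with a service account key file might look like the following sketch. The bucket, project ID, and key path are placeholders; as noted above, the credentials file variable can be omitted entirely when relying on Application Default Credentials.

```shell
# Hypothetical values -- replace with your own bucket and project.
export SOURCEGRAPH_UPLOAD_BACKEND=GCS
export SOURCEGRAPH_UPLOAD_BUCKET=my-sourcegraph-uploads
export SOURCEGRAPH_UPLOAD_GCP_PROJECT_ID=my-gcp-project
# Path to a service account key; omit when using Application Default Credentials.
export SOURCEGRAPH_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE=/etc/sourcegraph/gcs-key.json
```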

### Automatically provision the Sourcegraph bucket

Most deployments should provision this bucket directly in their cloud provider and leave automatic provisioning disabled. If you would like to allow your Sourcegraph instance to manage the target bucket configuration, set the following environment variable:

<Callout type="note">
	This requires additional bucket-management permissions from your configured
	storage vendor (AWS or GCP).
</Callout>

- `SOURCEGRAPH_UPLOAD_MANAGE_BUCKET=true`

## Code Graph Indexes

To target a managed object storage service for storing [code graph index uploads](../../code-navigation/precise-code-navigation), you will need to set a handful of environment variables for configuration and authentication to the target service.

<Callout type="info">
	Starting in [Sourcegraph 7.2](https://sourcegraph.com/changelog/releases/7.2), new instances can configure only the
	[Sourcegraph bucket](#sourcegraph-bucket), and Sourcegraph will use that
	single bucket for all features. If a separate bucket is needed for Code Graph
	Indexes, that can still be configured, but we recommend using one bucket.
</Callout>

- If you are running a `sourcegraph/server` deployment, set the environment variables on the server container
- If you are running via Docker Compose or Kubernetes, set the environment variables on the `frontend`, `worker`, and `precise-code-intel-worker` containers
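For a single-container `sourcegraph/server` deployment, for instance, the variables can be passed with `docker run -e` flags. This is a sketch only: the bucket name and region are placeholders, and the image tag plus the rest of the flags should match your existing `docker run` invocation.

```shell
# Sketch: pass object storage settings to the server container.
# Merge these -e flags into your existing docker run command.
docker run \
  -e PRECISE_CODE_INTEL_UPLOAD_BACKEND=S3 \
  -e PRECISE_CODE_INTEL_UPLOAD_BUCKET=my-codeintel-uploads \
  -e PRECISE_CODE_INTEL_UPLOAD_AWS_REGION=us-east-1 \
  -e PRECISE_CODE_INTEL_UPLOAD_AWS_ENDPOINT=https://s3.us-east-1.amazonaws.com \
  sourcegraph/server
```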

### Using S3 for the Code Graph Indexes bucket

To target an S3 bucket you've already provisioned, set the following environment variables. Authentication can be done through [an access key and secret key pair](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys) (with an optional session token), or via the EC2 metadata API.

<Callout type="warning">
	Never commit AWS access keys to Git. Consider using a secrets management
	service offered by your cloud provider.
</Callout>

- `PRECISE_CODE_INTEL_UPLOAD_BACKEND=S3`
- `PRECISE_CODE_INTEL_UPLOAD_BUCKET=<my bucket name>`
- `PRECISE_CODE_INTEL_UPLOAD_AWS_ENDPOINT=https://s3.us-east-1.amazonaws.com`
- `PRECISE_CODE_INTEL_UPLOAD_AWS_ACCESS_KEY_ID=<your access key>`
- `PRECISE_CODE_INTEL_UPLOAD_AWS_SECRET_ACCESS_KEY=<your secret key>`
- `PRECISE_CODE_INTEL_UPLOAD_AWS_SESSION_TOKEN=<your session token>` (optional)
- `PRECISE_CODE_INTEL_UPLOAD_AWS_USE_EC2_ROLE_CREDENTIALS=true` (optional; set to use the EC2 metadata API instead of static credentials)
- `PRECISE_CODE_INTEL_UPLOAD_AWS_REGION=<bucket region>`

<Callout type="note">
	If a non-default region is supplied, ensure that the subdomain of the
	endpoint URL (the `AWS_ENDPOINT` value) matches the target region.
</Callout>
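For example, a bucket in `eu-west-2` would pair the region and endpoint like this:

```shell
# The region subdomain in the endpoint must match AWS_REGION.
export PRECISE_CODE_INTEL_UPLOAD_AWS_REGION=eu-west-2
export PRECISE_CODE_INTEL_UPLOAD_AWS_ENDPOINT=https://s3.eu-west-2.amazonaws.com
```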

<Callout type="tip">
	You don't need to set the `PRECISE_CODE_INTEL_UPLOAD_AWS_ACCESS_KEY_ID`
	environment variable when using
	`PRECISE_CODE_INTEL_UPLOAD_AWS_USE_EC2_ROLE_CREDENTIALS=true` because role
	credentials will be automatically resolved. Attach the IAM role to the EC2
	instances hosting the `frontend`, `worker`, and `precise-code-intel-worker`
	containers in a multi-node environment.
</Callout>
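With role credentials, the configuration reduces to something like the following sketch (the bucket name is a placeholder):

```shell
# No static keys: credentials are resolved from the attached EC2 instance role.
export PRECISE_CODE_INTEL_UPLOAD_BACKEND=S3
export PRECISE_CODE_INTEL_UPLOAD_BUCKET=my-codeintel-uploads
export PRECISE_CODE_INTEL_UPLOAD_AWS_REGION=us-east-1
export PRECISE_CODE_INTEL_UPLOAD_AWS_USE_EC2_ROLE_CREDENTIALS=true
```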

### Using GCS for the Code Graph Indexes bucket

To target a GCS bucket you've already provisioned, set the following environment variables.

- `PRECISE_CODE_INTEL_UPLOAD_BACKEND=GCS`
- `PRECISE_CODE_INTEL_UPLOAD_BUCKET=<my bucket name>`
- `PRECISE_CODE_INTEL_UPLOAD_GCP_PROJECT_ID=<my project id>`
- `PRECISE_CODE_INTEL_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE=</path/to/file>` (optional)
- `PRECISE_CODE_INTEL_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE_CONTENT=<{"my": "content"}>` (optional)

If you are running on GKE with Workload Identity, or otherwise relying on
Application Default Credentials, you can omit the GCS credentials file
variables.

### Automatically provision the Code Graph Indexes bucket

If you would like to allow your Sourcegraph instance to manage the creation and lifecycle configuration of the target bucket, set the following environment variables:

<Callout type="note">
	This requires additional bucket-management permissions from your configured
	storage vendor (AWS or GCP).
</Callout>

- `PRECISE_CODE_INTEL_UPLOAD_MANAGE_BUCKET=true`
- `PRECISE_CODE_INTEL_UPLOAD_TTL=168h` (default)
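As an example, assuming the TTL accepts an hours-based duration as in the `168h` default (7 days), extending retention to 30 days might look like:

```shell
export PRECISE_CODE_INTEL_UPLOAD_MANAGE_BUCKET=true
# 720h = 30 days, up from the 168h (7 day) default.
export PRECISE_CODE_INTEL_UPLOAD_TTL=720h
```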

## Search Job Results

To target a third-party managed object storage service for storing [search job results](../../code-search/types/search-jobs), you must set a handful of environment variables for configuration and authentication to the target service.

<Callout type="info">
	Starting in [Sourcegraph 7.2](https://sourcegraph.com/changelog/releases/7.2), new instances can configure only the
	[Sourcegraph bucket](#sourcegraph-bucket), and Sourcegraph will use that
	single bucket for all features. If a separate bucket is needed for Search Job
	Results, that can still be configured, but we recommend using one bucket.
</Callout>

- If you are running a `sourcegraph/server` deployment, set the environment variables on the server container
- If you are running via Docker Compose or Kubernetes, set the environment variables on the `frontend` and `worker` containers
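In a Kubernetes deployment, for instance, the variables could be applied with `kubectl set env`. This sketch assumes deployments named `frontend` and `worker`; the bucket name and region are placeholders, and your deployment names may differ.

```shell
# Sketch: apply the settings to both deployments that need them.
kubectl set env deployment/frontend deployment/worker \
  SEARCH_JOBS_UPLOAD_BACKEND=S3 \
  SEARCH_JOBS_UPLOAD_BUCKET=my-search-job-results \
  SEARCH_JOBS_UPLOAD_AWS_REGION=us-east-1
```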

### Using S3 for the Search Job Results bucket

Set the following environment variables to target an S3 bucket you've already provisioned. Authentication can be done through [an access key and secret key pair](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys) (with an optional session token), or via the EC2 metadata API.

<Callout type="warning">
	Never commit AWS access keys to Git. Consider using a secrets management
	service offered by your cloud provider.
</Callout>

- `SEARCH_JOBS_UPLOAD_BACKEND=S3`
- `SEARCH_JOBS_UPLOAD_BUCKET=<my bucket name>`
- `SEARCH_JOBS_UPLOAD_AWS_ENDPOINT=https://s3.us-east-1.amazonaws.com`
- `SEARCH_JOBS_UPLOAD_AWS_ACCESS_KEY_ID=<your access key>`
- `SEARCH_JOBS_UPLOAD_AWS_SECRET_ACCESS_KEY=<your secret key>`
- `SEARCH_JOBS_UPLOAD_AWS_SESSION_TOKEN=<your session token>` (optional)
- `SEARCH_JOBS_UPLOAD_AWS_USE_EC2_ROLE_CREDENTIALS=true` (optional; set to use the EC2 metadata API instead of static credentials)
- `SEARCH_JOBS_UPLOAD_AWS_REGION=<bucket region>`

<Callout type="note">
	If a non-default region is supplied, ensure that the subdomain of the
	endpoint URL (the `AWS_ENDPOINT` value) matches the target region.
</Callout>

<Callout type="tip">
	You don't need to set the `SEARCH_JOBS_UPLOAD_AWS_ACCESS_KEY_ID` environment
	variable when using `SEARCH_JOBS_UPLOAD_AWS_USE_EC2_ROLE_CREDENTIALS=true`
	because role credentials will be automatically resolved.
</Callout>

### Using GCS for the Search Job Results bucket

Set the following environment variables to target a GCS bucket you've already provisioned.

- `SEARCH_JOBS_UPLOAD_BACKEND=GCS`
- `SEARCH_JOBS_UPLOAD_BUCKET=<my bucket name>`
- `SEARCH_JOBS_UPLOAD_GCP_PROJECT_ID=<my project id>`
- `SEARCH_JOBS_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE=</path/to/file>` (optional)
- `SEARCH_JOBS_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE_CONTENT=<{"my": "content"}>` (optional)

If you are running on GKE with Workload Identity, or otherwise relying on
Application Default Credentials, you can omit the GCS credentials file
variables.

### Automatically provision the Search Job Results bucket

If you would like to allow your Sourcegraph instance to manage the creation and lifecycle configuration of the target bucket, set the following environment variables:

<Callout type="note">
	This requires additional bucket-management permissions from your configured
	storage vendor (AWS or GCP).
</Callout>

- `SEARCH_JOBS_UPLOAD_MANAGE_BUCKET=true`
