
Getting Started

Cloning the Repository

Start by cloning the repository and switching to the dev/ilt branch.

git clone
cd common-voice
git checkout dev/ilt


Docker Environment Variables

Common Voice reads the environment variables used by its services from a file. Create it by copying the example file:

cp .env-local-docker.example .env-local-docker

The default values are fine, but feel free to change them.

Using S3 Bucket

- CV_S3_CONFIG='{"endpoint": "http://s3proxy:80", "accessKeyId": "local-identity", "secretAccessKey": "local-credential", "s3ForcePathStyle": true}'
+ CV_S3_CONFIG='{"endpoint": "", "accessKeyId": "my_access_key_id", "secretAccessKey": "my_secret_access_key", "s3ForcePathStyle": true}'
+ BUCKET_LOCATION="ca-central-1"
+ CV_CLIP_BUCKET_NAME=smallteamtest
+ CV_DATASET_BUCKET_NAME=smallteamtest
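
Because CV_S3_CONFIG is a JSON string packed into a single line, a stray space or typo in a key name is easy to miss. The following is a minimal sketch of a sanity check that confirms the four keys used in this guide are present. It is not part of Common Voice; the value shown is the local default from above, and the `missing` variable is a name introduced only for this illustration.

```shell
# Sketch: check that CV_S3_CONFIG contains the keys this guide uses.
# The value is the local default from this guide; replace it with your own.
CV_S3_CONFIG='{"endpoint": "http://s3proxy:80", "accessKeyId": "local-identity", "secretAccessKey": "local-credential", "s3ForcePathStyle": true}'

missing=""
for key in endpoint accessKeyId secretAccessKey s3ForcePathStyle; do
  case "$CV_S3_CONFIG" in
    *"\"$key\":"*) ;;                  # the key appears as "key":
    *) missing="$missing $key" ;;      # remember keys that were not found
  esac
done

if [ -z "$missing" ]; then
  echo "CV_S3_CONFIG: all expected keys present"
else
  echo "CV_S3_CONFIG: missing keys:$missing" >&2
fi
```

This only checks that the key names appear in the string. It does not validate the JSON syntax or the credentials themselves.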

Configuring Auth0

+ CV_AUTH0_CLIENT_ID="client_id"
+ CV_AUTH0_CLIENT_SECRET="client_secret"

Changing Audio Format

We need higher-quality audio recordings, so set the transcode codec to uncompressed 16-bit PCM:

+ CV_TRANSCODE_CODEC='pcm_s16le'

Changing the Sentence Directory

Importing the utterances for every language can take an extremely long time. Instead, you can place a subset, or a different list of language utterances, in a new directory and use that directory as the source of utterances by setting:

+ CV_SENTENCES_FOLDER="server/data.ilt"
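
One way to build such a subset is to copy only the languages you need into the new directory. The sketch below assumes that the default sentence folder is server/data and that it holds one subdirectory per locale. Neither is stated in this guide, so check your checkout first. The function name `make_sentence_subset` and the locale codes in the example call are hypothetical.

```shell
# Sketch: copy selected locale subdirectories into a smaller sentence folder.
# ASSUMPTION: the source folder contains one subdirectory per locale.
make_sentence_subset() {
  src="$1"
  dest="$2"
  shift 2
  mkdir -p "$dest"
  for locale in "$@"; do
    if [ -d "$src/$locale" ]; then
      cp -R "$src/$locale" "$dest/"
    else
      echo "skipping $locale: $src/$locale not found" >&2
    fi
  done
}

# Example (placeholder locales):
#   make_sentence_subset server/data server/data.ilt en fr
```

Locales that do not exist in the source folder are skipped with a warning rather than aborting the copy.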

Tasks Environment Variables

We've added a new service that performs backups. It needs credentials to access the Nextcloud server.

cp .env-tasks.example .env-tasks

which looks like this


You need to head to and create a token which is your password. Then populate the WEBDAV_LOGIN & WEBDAV_PASSWORD.
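
Before starting the services, it can help to confirm that both WebDAV variables are actually filled in. The following is a minimal sketch, assuming .env-tasks uses plain NAME=value lines (its contents are not shown here). The helper `check_env_vars` is a hypothetical name, not something shipped with the repository.

```shell
# Sketch: report which of the given variables have a non-empty value in an env file.
# ASSUMPTION: the file uses NAME=value lines, optionally with quoted values.
check_env_vars() {
  env_file="$1"
  shift
  unset_count=0
  for name in "$@"; do
    # Match NAME= followed by an optional quote and at least one real character.
    if grep -Eq "^${name}=[\"']?[^\"'[:space:]]" "$env_file" 2>/dev/null; then
      echo "$name is set"
    else
      echo "$name is missing or empty in $env_file" >&2
      unset_count=$((unset_count + 1))
    fi
  done
  # Succeed only when every variable was found with a value.
  [ "$unset_count" -eq 0 ]
}

# Example:
#   check_env_vars .env-tasks WEBDAV_LOGIN WEBDAV_PASSWORD
```

The function returns a non-zero status when any variable is missing or empty, so it can gate a start-up script.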


Finally, start the services. Note that we specify a user/group: otherwise the stack can write files and directories that are owned by root, and you won't be able to delete them later.

CURRENT_UID=$(id -u):$(id -g) docker-compose --project-name "common-voice" up
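
The CURRENT_UID prefix in the command above is built from two calls to `id`, which print the numeric user ID and group ID of the account running the command. As a small sketch, you can inspect the value on its own before starting the stack. How the compose file consumes CURRENT_UID is not shown in this guide, so treat that part as an assumption.

```shell
# Sketch: show the user:group pair that would be passed to docker-compose.
CURRENT_UID="$(id -u):$(id -g)"
echo "CURRENT_UID=$CURRENT_UID"
```

If the first number printed is 0, you are running as root, and files written by the stack will be owned by root.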