
A brass medallion with an image of a traditional Cornish Piskie. Photo by chrisinplymouth.
I really like using GitLab CI/CD. Every time I push a commit to the cloud repo a magical Cornish piskie does the hard work for me (it must be magic, right?).
For example, I use GitLab CI/CD to publish the blog you are reading.
But how do you push new content created in the CI pipeline back to the project repo? It’s easier than you might think.
In this post I’m going to assume you understand the basics of Git and GitLab CI/CD (I’m just going to say CI from now on). You should also note that this is written and tested on Docker containers executing on the GitLab cloud. If you use local runners you may need to use a different approach.
Usually when you build some content inside a CI runner it’s listed as one or more artifacts for use in later test or deployment jobs and there is no further need to preserve them in the git repo.
However, sometimes you do want to add or update content in the actual project repo (i.e. you want to commit changes generated in the pipeline back to the repo). Examples include release notes, project README files, or images used in the documentation.
The catch is that the CI pipeline is not configured with write access to the project repo. A further complication is that the repo is cloned with a detached HEAD (HEAD refers to a commit, not a branch).
Your mileage may vary of course depending on your specific requirements, but at the very least this should give a starting point.
Setup
Before we can start pushing changes from our CI jobs some setup is required:
- Create a project access token (PAT). Follow this process to create a PAT, making sure it has the write_repository scope and the developer role. Note that from GitLab 16.0 all PATs will have an expiry date, so they will need to be refreshed regularly. Copy the PAT immediately as you cannot recover it later.
- Now use the value of the PAT to create a CI Project Variable. When creating the variable make sure “mask” is selected. Masking the PAT makes sure the value is not accidentally exposed and reduces any security risk. In this example I have assumed the variable is called ACCESS_TOKEN.
- Make sure that the developer role can commit to any protected branches that will be the target of your git-push during the CI job ($CI_DEFAULT_BRANCH, which is main, in this example).
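If the variable from the second step is missing (or the token has expired and been removed), the push later in the job will fail with a fairly unhelpful authentication error. One way to fail early is a small guard like the sketch below. The helper name require_var is my own invention for illustration, not something GitLab provides, and the placeholder value exists only so the snippet runs outside a pipeline.

```shell
#!/bin/sh
# Hypothetical helper: print an error and return 1 when the named
# variable is unset or empty, return 0 otherwise.
require_var() {
    # Indirect lookup of the variable whose name is in $1.
    eval "_require_var_value=\${$1:-}"
    if [ -z "$_require_var_value" ]; then
        echo "Missing CI variable: $1" >&2
        return 1
    fi
}

# Placeholder so the demo runs locally; in a real job the value would
# come from the masked CI variable.
DEMO_TOKEN="placeholder"
require_var DEMO_TOKEN && echo "DEMO_TOKEN is set"
```

In a job script this could be used as require_var ACCESS_TOKEN || exit 1 before any Git commands run.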
Making file changes
The first thing you need in your CI script is a content creation job. We’ll do something trivial, i.e. append the current date and time to the repo README.
makechanges:
  image: busybox
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH ||
          $CI_MERGE_REQUEST_TARGET_BRANCH_NAME == $CI_DEFAULT_BRANCH
  artifacts:
    paths: # You must list all files you want committed to the project repo
      - README.md
    expire_in: 5 minutes # They don't need to hang around long
  script:
    - printf "$(date) \n" >> README.md
The important thing to note is that you have to explicitly pass the modified files as artifacts to the next job. Depending on your project that may not be ideal and we’ll come back to alternatives later.
Doing the actual commit
We need a Docker image with Git installed. The alpine/git image on Docker Hub is a handy prebuilt option, but its ENTRYPOINT needs to be overridden.
updaterepo:
  image:
    name: alpine/git:latest
    entrypoint: [""] # Alpine/git image needs entrypoint to be overwritten
Before doing any real work it’s worth checking if there are any changes that need to be committed:
script:
- '[ -z "$(git status --porcelain)" ] && exit 0'
(If there are no untracked files and no modified files then git status --porcelain prints an empty string, and the script exits.)
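To see that check in isolation, here is a minimal sketch you can run locally, assuming Git is installed. It creates a throwaway repo in a temporary directory; the function name has_changes and the demo file content are just illustrations, not part of the pipeline.

```shell
#!/bin/sh
# Succeeds (exit status 0) when the work tree in directory $1 has
# modified or untracked files, i.e. when the porcelain output is non-empty.
has_changes() {
    [ -n "$(git -C "$1" status --porcelain)" ]
}

demo_repo=$(mktemp -d)
git init --quiet "$demo_repo"

# A freshly initialised repo has nothing to commit.
if has_changes "$demo_repo"; then before=dirty; else before=clean; fi

# An untracked file is enough to make the porcelain output non-empty.
echo "generated content" > "$demo_repo/README.md"
if has_changes "$demo_repo"; then after=dirty; else after=clean; fi

echo "before: $before, after: $after"
rm -rf "$demo_repo"
```

The CI job uses the inverse test (-z) so that it can stop early when there is nothing to do.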
Once we know we need to commit something the first “trick” is to give the origin remote a URL that provides write access to our project repo. On GitLab remotes can use HTTP Basic Authentication, where a username and password are included in the URL. Such a URL looks like this:
https://username:password@host/path/
As GitLab provides all the values needed to form the URL we don’t have to hard code any part of the URL into the script.
Further, in GitLab the username can be any string (it must be at least one character long); I use the value of CI_PROJECT_NAME just to be tidy.
The password should be the PAT, contained in the CI variable ACCESS_TOKEN (see above). Setting the remote becomes:
- git remote set-url origin
  https://${CI_PROJECT_NAME}:${ACCESS_TOKEN}@${CI_SERVER_HOST}/${CI_PROJECT_NAMESPACE}/${CI_PROJECT_NAME}.git
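If you want to convince yourself how the pieces of that URL fit together, the following sketch builds it outside of a pipeline. All four values are made-up placeholders (in a real job GitLab sets the CI_* variables and your project settings supply ACCESS_TOKEN), and the shell variable names repo_path and push_url are mine.

```shell
#!/bin/sh
# Placeholder values for illustration only.
CI_SERVER_HOST="gitlab.example.com"
CI_PROJECT_NAMESPACE="mygroup"
CI_PROJECT_NAME="myproject"
ACCESS_TOKEN="not-a-real-token"

# host/namespace/project.git, shared by the real and the printable URL.
repo_path="${CI_SERVER_HOST}/${CI_PROJECT_NAMESPACE}/${CI_PROJECT_NAME}.git"

# username:password@host/path, with the token in the password position.
push_url="https://${CI_PROJECT_NAME}:${ACCESS_TOKEN}@${repo_path}"

# Print a version with the token hidden, so nothing secret is echoed.
echo "push URL: https://${CI_PROJECT_NAME}:[hidden]@${repo_path}"
```

Because the token ends up inside the URL, it is worth not echoing push_url itself in a job, even with masking switched on.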
We should also make sure the user email and name are correct:
- git config user.email "${GITLAB_USER_EMAIL}"
- git config user.name "${GITLAB_USER_NAME}"
This sets up the identity of the user running the CI pipeline. If you want to be extra fancy you can set it to the identity of the commit author instead (they are probably the same person most of the time):
- git config user.email "$(echo "${CI_COMMIT_AUTHOR}" | sed -Ee 's/^[^<]+<([^>]*)>$/\1/')"
- git config user.name "${CI_COMMIT_AUTHOR% <*}"
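CI_COMMIT_AUTHOR holds the author in the form Name <email>, so both parts can also be pulled out with plain shell parameter expansion and no sed. This is a sketch with an invented author value; the variable names author_name and author_email are mine.

```shell
#!/bin/sh
# Invented example value in the "Name <email>" form.
CI_COMMIT_AUTHOR="Ada Lovelace <ada@example.com>"

# Remove the shortest suffix matching " <...", leaving only the name.
author_name="${CI_COMMIT_AUTHOR% <*}"

# Remove everything up to and including the last "<" ...
author_email="${CI_COMMIT_AUTHOR##*<}"
# ... then remove the trailing ">".
author_email="${author_email%>}"

echo "name=$author_name"
echo "email=$author_email"
```

Note the space before the < in the first expansion: without it the name keeps a trailing space.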
Now we can add and commit any changes.
- git add . # Any new changes, i.e. any artifacts from previous jobs
- git commit --no-verify --message "Commit via a GitLab CI job"
Note that I use --no-verify to skip hook scripts, as running hooks inside a CI job usually doesn’t make sense. Of course by default hook scripts are not replicated when a repo is cloned, but I like to be sure, as technically it would be possible to set up hooks in a previous part of the CI job (I might explain how in a future post).
The result of that is:
$ git commit --no-verify --message "Commit via a GitLab CI job"
[detached HEAD f54fe80] Update changed content via a GitLab CI job
1 file changed, 1 insertion(+), 1 deletion(-)
If you look at the job log above you’ll notice the commit results in a detached HEAD. If you don’t know what that is, don’t worry about it for now; it’s the result of the way GitLab clones the repo, viz:
Initialized empty Git repository in /builds/alecthegeek/git-in-gitlab-ci/.git/
Created fresh repository.
Checking out 65ff8728 as detached HEAD (ref is main)...
When we push we need to make sure that the new local detached commit (represented by HEAD) is applied to the tip of the default branch in the repo with HEAD:$CI_DEFAULT_BRANCH (but change the destination branch if needed by your project).
- git push --push-option=ci.skip --no-verify
  origin HEAD:$CI_DEFAULT_BRANCH
The other important thing to notice is --push-option=ci.skip, which stops the CI pipeline running again when we update the project repo via the CI pipeline job. And finally we are skipping any hook scripts again (--no-verify).
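If the HEAD:branch form is new to you, this local sketch shows the same idea without GitLab, assuming Git is installed. A bare repo in a temporary directory plays the part of the project repo, a second repo plays the CI clone, and the identity, file contents and the helper g are all invented for the demo. I have left out --push-option=ci.skip because a plain local bare repo does not accept push options by default.

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)

# A bare repo stands in for the GitLab project; "work" stands in for the CI clone.
git init --quiet --bare "$demo/remote.git"
git init --quiet "$demo/work"
g() { git -C "$demo/work" "$@"; }   # run git inside the work repo

g config user.email "ci-demo@example.com"
g config user.name "CI Demo"
g remote add origin "$demo/remote.git"

# Seed the remote with a first commit on main.
echo "first line" > "$demo/work/README.md"
g add .
g commit --quiet --no-verify --message "Initial commit"
g push --quiet origin HEAD:refs/heads/main

# Mimic the runner: check out the commit itself rather than a branch.
g checkout --quiet --detach

# Mimic the job: change a file, commit on the detached HEAD, push to main.
echo "generated line" >> "$demo/work/README.md"
g add .
g commit --quiet --no-verify --message "Commit via a GitLab CI job"
g push --quiet --no-verify origin HEAD:main

# Count the commits that reached main in the stand-in remote.
remote_commits=$(git --git-dir="$demo/remote.git" rev-list --count main)
echo "commits on remote main: $remote_commits"
rm -rf "$demo"
```

The second push works even though no local branch points at the new commit, because HEAD:main names the source commit and the destination branch explicitly.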
Other Ways to Organise the Pipeline
The solution above is neat and tidy, and if it’s all your pipeline does then the approach works well. You can just include the job script (for the second job) from another repo (for example https://gitlab.com/alecthegeek/git-in-gitlab-ci/-/blob/main/.gitlab-ci-updaterepo.yml) – making sure the CI variable is named correctly.
However your CI pipeline may be generating many other artifacts that you don’t want committed. Instead they are used during subsequent test and deploy jobs, and so still need to be listed as artifacts.
The simplest approach is to just add the Git logic into the single job that’s creating your content, but you do need to make sure that Git is available in the image used to run the job (because you are running Git in the script section). This is certainly an option, but you will need to manage this additional image.
As GitLab supports running Docker in Docker, there is a third option.
In this approach the job is run in the official docker:git image, with the Docker in Docker service enabled.
This means that any other image, including the image needed to make file updates, can be run in the same job. The Git working directory is shared between the two containers.
No artifacts are created, because it all happens in one job; and no additional images need to be managed.
Here is an example:
include:
  - project: alecthegeek/git-in-gitlab-ci
    file: .gitlab-ci-git-commit.yml

makechanges:
  image: docker:git
  services:
    - docker:dind
  variables:
    RUN_IMAGE: busybox
    FF_NETWORK_PER_BUILD: "true"
    RUN_CMD: 'docker run --rm --mount "type=bind,src=$(pwd),dst=$(pwd)" --workdir "$(pwd)" --network=host ${RUN_IMAGE}'
  before_script:
    - !reference [.git_commit_and_push, script]
  script:
    # Make a change by running a docker image
    - ${RUN_CMD} printf "$(date) \n" >> README.md
    # Commit and push the change using function defined in include file
    - git_commit_and_push
Finally
- Avoid unnecessary commits to the repo by judicious use of a rules section so that the job only runs as needed. As this will be very specific to your project I have only a simple example:
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH || $CI_MERGE_REQUEST_TARGET_BRANCH_NAME == $CI_DEFAULT_BRANCH
To make your life easier I have set up a repo with all this content and an example job. Feel free to clone