Version Control

Overview

Version control systems track changes to files over time, allowing you to recall specific versions, compare changes, and collaborate with others. Git is the dominant version control system, using a distributed model where every user has a complete copy of the repository history.

For system administrators, version control serves several purposes: storing Ansible playbooks and configuration files, tracking infrastructure changes, rolling back problematic changes, and documenting what changed, when, and why. Platforms such as GitLab add issue tracking, CI/CD pipelines, and code review workflows on top of Git.

How It Works

Git Fundamentals

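Because Git is distributed, every clone of a repository contains the complete history. There is no single point of failure, and you can commit, branch, and browse history while offline; only operations that talk to a remote (such as push and pull) need a network connection.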

The basic Git workflow:

  1. Modify files in your working directory
  2. Stage changes with git add (selecting what goes into the next commit)
  3. Commit the staged changes with git commit (creating a permanent snapshot)
  4. Push commits to a remote repository with git push

git init                              # Initialise a new repository
git add .                             # Stage all changes
git commit -m "First commit"          # Commit with a message
git remote add origin <url>           # Link to a remote repository
git push -u origin master             # Push and set the upstream (use main if that is your default branch)

Initial Configuration

Before making commits, configure your identity:

git config --global user.name "Your Name"
git config --global user.email "your@email.com"

This information is embedded in every commit and provides an audit trail of who made each change.
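To see how the configured identity ends up in a commit, the following sketch works in a throwaway repository and sets the identity per repository (without --global) so that existing settings are not touched. The name, email address, and file name are placeholders chosen for illustration.

```shell
set -eu
repo=$(mktemp -d)                 # throwaway directory for the demo
cd "$repo"
git init -q

# Repository-local identity: omitting --global keeps your real settings untouched.
git config user.name "Demo Admin"
git config user.email "demo.admin@example.com"

echo "ntp_server: pool.ntp.org" > vars.yml
git add vars.yml
git commit -q -m "Add NTP server variable"

# Print the author recorded in the newest commit.
author=$(git log -1 --format='%an <%ae>')
echo "$author"
```

Repository-local settings override global ones, which is useful when one machine is used with several identities.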

Key Concepts

Working directory → Staging area (index) → Local repository → Remote repository

  • The working directory is where you edit files.
  • The staging area holds changes you intend to commit. git add moves changes here.
  • A commit is an immutable snapshot of the project as it was staged, identified by a SHA-1 hash and stored in the local repository.
  • The remote (e.g. origin) is a server-hosted copy of the repository.
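The movement of a file through these areas can be observed with git status --short. This is a minimal sketch in a temporary repository; the file name and identity are placeholders.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo Admin"
git config user.email "demo.admin@example.com"

echo "server: web01" > hosts.yml
s1=$(git status --short)      # file exists only in the working directory (untracked)
echo "after edit:   $s1"

git add hosts.yml
s2=$(git status --short)      # file is now in the staging area
echo "after add:    $s2"

git commit -q -m "Add hosts file"
s3=$(git status --short)      # nothing left to report: the snapshot is in the local repository
echo "after commit: $s3"
```

The two status columns distinguish the staging area (left) from the working directory (right), so `??` marks an untracked file and `A ` a newly staged one.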

Branches

Branches allow parallel lines of development. The default branch is typically master or main. Create and switch branches with:

git checkout -b feature-branch     # Create and switch to a new branch
git checkout master                # Switch back to master
git merge feature-branch           # Merge changes into current branch

Branches are lightweight in Git — they are simply pointers to commits, so creating them is nearly instant.
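The pointer behaviour can be checked with git rev-parse, which prints the commit a branch name resolves to. The sketch below assumes a temporary repository with placeholder file names; it reads the name of the default branch instead of assuming master or main.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo Admin"
git config user.email "demo.admin@example.com"

echo "v1" > site.yml
git add site.yml
git commit -q -m "Initial playbook"

base=$(git symbolic-ref --short HEAD)   # name of the default branch (master or main)
before=$(git rev-parse "$base")

# A new branch starts out pointing at the same commit as the branch it was created from.
git checkout -q -b feature-branch
same=$(git rev-parse feature-branch)

# Committing moves only the current branch pointer forward.
echo "v2" >> site.yml
git commit -q -am "Extend playbook"
after=$(git rev-parse feature-branch)

echo "$base:          $(git rev-parse "$base")"
echo "feature-branch: $after"
```

After the second commit the default branch still resolves to the first commit, while feature-branch has advanced.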

Remote Repositories and Platforms

Git repositories are typically hosted on platforms that add collaboration features:

  • GitLab (self-hosted or gitlab.com) — used in this course via gitlab.cs.ut.ee. Provides issue tracking, CI/CD pipelines, merge requests, and code review.
  • GitHub (github.com) — the largest public hosting platform.
  • Gitea, Bitbucket — other alternatives.

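Authentication to remote repositories uses either HTTPS (a username plus a password or access token; many platforms, including GitHub, now require a token rather than the account password) or SSH keys (the same key-based authentication used for server access).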
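The choice between HTTPS and SSH shows up in the remote URL. The following sketch switches an existing remote from one form to the other; the host gitlab.example.com and the path group/infra.git are hypothetical placeholders, and nothing is contacted over the network.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

# HTTPS form: Git prompts for a username and a password or token when it connects.
git remote add origin "https://gitlab.example.com/group/infra.git"

# SSH form: Git authenticates with your SSH key instead.
git remote set-url origin "git@gitlab.example.com:group/infra.git"

url=$(git remote get-url origin)
echo "$url"
```

Changing the URL does not affect the repository contents or history, only how later fetches and pushes authenticate.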

Git for Infrastructure as Code

In system administration, Git stores Ansible playbooks, configuration templates, and infrastructure definitions. This provides:

  • A backup of your automation — if a VM is destroyed, the playbook can rebuild it.
  • A history of every change — git log shows what changed, when, and why.
  • Collaboration — multiple administrators can work on the same infrastructure code with merge-based workflows.
  • Exam preparation — a well-written Ansible repository can automate the entire course exam setup.

The typical Ansible repository structure stored in Git:

ansible/
├── inventory/
│   └── hosts
├── roles/
│   ├── etais/
│   │   └── tasks/main.yml
│   ├── dns/
│   │   └── tasks/main.yml
│   └── web/
│       └── tasks/main.yml
└── playbook.yml
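A layout like the one above could be scaffolded and placed under version control roughly as follows. This is a sketch only: the file contents are empty placeholders, and the identity is set per repository for the demo.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo Admin"
git config user.email "demo.admin@example.com"

mkdir -p ansible/inventory
: > ansible/inventory/hosts                      # empty placeholder inventory

for role in etais dns web; do
  mkdir -p "ansible/roles/$role/tasks"
  # Placeholder task file; real tasks would go here.
  printf '%s\n' '---' "# tasks for the $role role" > "ansible/roles/$role/tasks/main.yml"
done

printf '%s\n' '---' '# top-level playbook (placeholder)' > ansible/playbook.yml

git add ansible
git commit -q -m "Add Ansible repository skeleton"

# List what Git now tracks.
git ls-files
```

Git tracks files rather than directories, so each directory needs at least one file in it to appear in the repository.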

Common Git Commands Reference

git status                  # Show modified/staged/untracked files
git diff                    # Show unstaged changes
git log --oneline           # Compact commit history
git pull                    # Fetch and merge remote changes
git clone <url>             # Copy a remote repository locally
git stash                   # Temporarily shelve uncommitted changes
git stash pop               # Restore stashed changes

Key Terminology

Repository (repo)
A directory tracked by Git, containing the complete history of all files.
Commit
An immutable snapshot of changes, identified by a SHA-1 hash. Includes a message, author, and timestamp.
Branch
A movable pointer to a commit. Allows parallel development without affecting the main line.
Remote
A reference to a copy of the repository hosted elsewhere (e.g. origin pointing to GitLab).
Clone
Creating a complete local copy of a remote repository, including all history.
Merge
Combining changes from one branch into another.
Merge Conflict
When Git cannot automatically combine changes because the same lines were modified differently in two branches. Requires manual resolution.
Pull Request / Merge Request
A platform feature (GitHub / GitLab) for proposing and reviewing changes before merging them into the main branch.
.gitignore
A file listing patterns of files and directories that Git should not track (e.g. compiled binaries, secrets, editor temp files).
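As an illustration of .gitignore, the sketch below writes a few patterns and uses git check-ignore to confirm that a file is excluded. The pattern list and the file name vault_password.txt are assumptions made for the example, not a recommended standard set.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

# One pattern per line: Ansible retry files, a secrets file, editor swap files.
printf '%s\n' '*.retry' 'vault_password.txt' '*.swp' > .gitignore

echo "s3cret" > vault_password.txt
echo "---" > playbook.yml

# check-ignore exits with status 0 when the path matches an ignore pattern.
if git check-ignore -q vault_password.txt; then ignored=yes; else ignored=no; fi
echo "vault_password.txt ignored: $ignored"

# "git add ." skips ignored files, so only the other two files are staged.
git add .
tracked=$(git ls-files)
echo "$tracked"
```

Ignore patterns only affect untracked files; a file that has already been committed stays tracked until it is removed from the index.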

Why It Matters

  • Disaster recovery: If a VM is corrupted or accidentally destroyed, a Git-tracked Ansible playbook lets you rebuild the entire environment from scratch.
  • Audit trail: Every change is recorded with who made it, when, and a description of why. This is essential for debugging and compliance.
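  • Collaboration at scale: Multiple administrators can work on infrastructure simultaneously. Git's branching and merging model keeps their work isolated until it is merged, surfaces conflicting edits at merge time instead of silently overwriting them, and enables code review before changes are applied.
  • Reproducibility: Checking out a specific commit restores the playbooks and configuration exactly as they were at that point in time, which is the starting point for rebuilding the matching infrastructure state.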
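Recalling an earlier version and rolling back a bad change might look like the sketch below. The file app.conf and its values are invented for the demo; git show reads a file as it was in a given commit, and git revert adds a new commit that undoes an earlier one without rewriting history.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo Admin"
git config user.email "demo.admin@example.com"

echo "max_clients: 100" > app.conf
git add app.conf
git commit -q -m "Set max_clients to 100"
good=$(git rev-parse HEAD)            # remember the known-good commit

echo "max_clients: 100000" > app.conf
git commit -q -am "Raise max_clients"

# Read the file as of the known-good commit without touching the working directory.
old=$(git show "$good:app.conf")

# Undo the latest commit by recording a new commit that reverses it.
git revert --no-edit HEAD > /dev/null
now=$(cat app.conf)
count=$(git rev-list --count HEAD)

echo "known-good: $old"
echo "current:    $now"
echo "commits:    $count"
```

Because the revert is itself a commit, the history shows both the problematic change and its rollback, which preserves the audit trail.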

Common Pitfalls

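  • Committing secrets: Never commit passwords, private keys, API tokens, or database credentials. Use .gitignore to exclude sensitive files and use Ansible Vault or environment variables for secrets. A secret that has been committed stays in the history even after the file is deleted, so treat it as leaked and rotate it.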
  • Large binary files: Git is designed for text files. Committing large binaries (disk images, database dumps) bloats the repository permanently. Use .gitignore or Git LFS for large files.
  • Forgetting to push: Commits exist only in your local repository until pushed. If your only copy of the repository lives on a VM and that VM is destroyed, unpushed commits are lost.
  • Not committing regularly: Commit after completing each logical unit of work (e.g. after each lab). Large, infrequent commits make it harder to identify when a problem was introduced.
  • Merge conflicts from whitespace/formatting: Inconsistent indentation in YAML files (common in Ansible) causes unnecessary merge conflicts. Agree on a style and stick to it.
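For the "forgetting to push" pitfall, one way to list commits that exist only locally is git log with --branches --not --remotes. The sketch below stands in for a real server with a local bare repository; all paths, file names, and messages are placeholders.

```shell
set -eu
work=$(mktemp -d)
git init -q --bare "$work/remote.git"     # stand-in for a hosted remote
git init -q "$work/local"
cd "$work/local"
git config user.name "Demo Admin"
git config user.email "demo.admin@example.com"
git remote add origin "$work/remote.git"

echo "lab 1 notes" > notes.txt
git add notes.txt
git commit -q -m "First lab"
branch=$(git symbolic-ref --short HEAD)   # default branch name (master or main)
git push -q -u origin "$branch"

echo "lab 2 notes" >> notes.txt
git commit -q -am "Second lab"

# Commits reachable from local branches but not from any remote-tracking branch.
unpushed=$(git log --oneline --branches --not --remotes)
echo "unpushed: $unpushed"

git push -q
remaining=$(git log --oneline --branches --not --remotes)
echo "after push: ${remaining:-none}"
```

Running the same check before shutting down or rebuilding a VM is a cheap way to confirm that nothing is stranded locally.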

Further Reading