Learning Git will allow you to:
Different developers use different tools to work with Git.
This lesson focuses on using Git through a terminal via a CLI. This is because if we can use Git via a CLI, we can use it both interactively as we work and in CI/CD pipelines.
Other Git tools include:
Git commits & branches can be naturally visualized, making visual tools popular and useful.
Install Git here - you can then use the Git CLI:
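One way to confirm the installation worked is to ask the CLI for its version. This is a minimal check; the exact version string printed will differ from machine to machine.

```shell
# Prints something like "git version X.Y.Z" if Git is installed and on the PATH.
git --version
```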
If you aren’t comfortable using a terminal or CLIs, work through the lesson on the Bash Shell first.
The commands below are what you need to get started:
Other Git operations need to happen before we run these commands (like git init or git clone) - but the four commands above will get you through the day.
Git is designed to protect your code but isn’t completely foolproof. Some Git commands can cause permanent data loss if used incorrectly.
The most dangerous Git commands are:
git reset --hard - discards all uncommitted changes & resets to a specific commit,
git clean -f - permanently deletes all untracked files,
git push --force - overwrites remote history (can cause problems for other developers),
git rebase - rewrites commit history (can cause conflicts for other developers),
git checkout - can discard uncommitted changes when switching branches.
Most of these you will not need in daily work. If in doubt, copy the Git repository folder so you have a local backup - or better, push to a remote repository before running dangerous commands.
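The folder-copy backup described above could look like the sketch below. The project name my-project and the file notes.txt are hypothetical, and everything runs in a throwaway temporary directory rather than a real repository.

```shell
# Throwaway project standing in for a real repository (hypothetical names).
workdir="$(mktemp -d)"
cd "$workdir"
mkdir my-project
cd my-project
git init --quiet
echo "draft notes" > notes.txt   # untracked work we want to protect

# Copy the whole folder, including the hidden .git directory,
# before running anything destructive.
cd "$workdir"
cp -r my-project my-project-backup

# The destructive command now only affects the original folder:
# notes.txt is deleted here but survives in my-project-backup.
cd my-project
git clean -f
```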
Remote repositories (like those on GitHub) are extremely safe:
Your local Git repository can lose work in a few ways:
git reset --hard or git checkout,
You can keep your work safe by:
git reset --hard),
These practices mean that even if you do lose work locally, you’ll only ever lose a small amount of recent changes.
Git’s main function is version control of files. Developers write code that is stored in text files.
Version control gives developers a history of their work, by providing the changes made to a given file.
Version control also allows switching between different versions of a codebase.
Git works by keeping track of every change made to a project.
Every time a change is made and saved in Git, it is recorded in the project’s history. This means that you can go back and see exactly what changes were made when.
This keep-everything approach means that anything you commit to a repository will be there forever. This is important to remember when working with secrets (like AWS keys) or with large datasets.
A local repository is created on a developer’s computer using git init, and is contained in a folder called .git.
It contains a copy of the entire project commit history, including all the commits and branches. A local repository can be used for version control and collaboration even when working offline.
A remote repository is a copy of the local repository that is stored on a remote server, such as GitHub.
The remote allows developers to share their work.
A local repository is connected to a remote using git remote add, and commits are then sent to and fetched from the remote using git push and git pull.
The git init command is used to initialize a new repository in the current directory:
It creates a directory .git that contains data for the new Git repository:
You don’t need to understand or look at these files - the most important thing to know is that the .git folder is where Git will store the entire history of your Git project:
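As a minimal sketch, initializing a repository in an empty throwaway directory and listing the new .git folder could look like this. The temporary directory is an assumption made for the example; in practice you would run git init inside your project folder.

```shell
# Work in a throwaway directory so no real project is touched.
cd "$(mktemp -d)"

# Create the repository; this adds a hidden .git folder here.
git init

# Show what Git created (HEAD, objects, refs and so on).
ls -A .git
```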
The git status command allows you to check the status of your repository:
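For example, in a freshly initialized repository (a throwaway directory is assumed here), git status reports that there is nothing to commit yet:

```shell
cd "$(mktemp -d)"
git init --quiet

# Reports the current branch and that there are no changes to commit.
git status
```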
You can delete a Git repo by:
This can be useful when you get a Git repo into a bad state (it happens - for example if you checked in secrets) and just want to start again.
Be careful though - if you remove this folder (and you don’t have a remote copy on a service like GitHub) then your entire project history will be lost.
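One way to do this, sketched in a throwaway directory with a hypothetical README.md, is to remove the .git folder. Your working files stay in place; only Git's history and metadata are deleted.

```shell
cd "$(mktemp -d)"
git init --quiet
echo "# Demo" > README.md

# Deleting .git removes the repository and all of its history.
# README.md itself is left untouched.
rm -rf .git
```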
A repository (or repo) holds all the files and metadata associated with a codebase, including the codebase’s commit history and branches.
A repository is created using git init. A repository can be either local or remote.
Status will show you what files are staged or unstaged and tracked versus untracked.
The commit is the atomic unit of Git.
Git joins changes from multiple files into a single unit - a commit. These commits are snapshots of your project at different points in time.
A Git commit is a snapshot of an entire codebase at one point in time.
A commit has a unique hash identifier - a string like d6a583a419797104d985ab8aaa471a153cd24d2f.
The hash uniquely identifies a commit.
The difference between one commit and another is known as a diff.
When developers are reviewing the commits of others, they often only look at the diff between one commit and another.
Commits are created using git commit and include a message - a short piece of text that describes the changes made in the commit.
Let’s create a new Git repository with git init, and create a file README.md:
If we now check what is going on, Git tells us that there is an untracked file:
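The two steps above might look like the following sketch. The temporary directory and the README contents are assumptions made for the example.

```shell
# Start from an empty throwaway directory.
cd "$(mktemp -d)"
git init
echo "# My project" > README.md

# Git lists README.md under untracked files.
git status
```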
We can add this file to the repository, which makes the file tracked and staged:
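A minimal sketch of staging the file, again in a throwaway directory with placeholder README contents:

```shell
cd "$(mktemp -d)"
git init --quiet
echo "# My project" > README.md

# Stage the file; it is now tracked and listed under changes to be committed.
git add README.md
git status
```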
We can then commit this file, which turns the staged changes into committed changes:
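Sketched end to end, the commit step could look like this. The user name, email and commit message are placeholders; Git needs an identity configured before it can create a commit.

```shell
cd "$(mktemp -d)"
git init --quiet
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"   # placeholder identity
echo "# My project" > README.md
git add README.md

# Turn the staged changes into a commit with a short message.
git commit -m "Add README"
```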
We now have this commit in our history, which we can see through git log:
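As a sketch, with the same placeholder identity and message as assumptions, viewing the history after one commit could look like this:

```shell
cd "$(mktemp -d)"
git init --quiet
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"   # placeholder identity
echo "# My project" > README.md
git add README.md
git commit --quiet -m "Add README"

# Shows the single commit: its hash, author, date and message.
git log
```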
These changes to our Git repository live only on our local machine.
Let’s simulate some more work by changing our README.md file.
Git now tells us that we have changes to tracked files that are unstaged:
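A sketch of this situation, rebuilt from scratch in a throwaway directory (the identity and file contents are placeholders):

```shell
cd "$(mktemp -d)"
git init --quiet
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"   # placeholder identity
echo "# My project" > README.md
git add README.md
git commit --quiet -m "Add README"

# Change a file that Git already tracks.
echo "More detail about the project." >> README.md

# README.md is listed as modified under changes not staged for commit.
git status
```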
Let’s add a new file - this file main.py is untracked by Git:
We can add both of these changes into a single commit with git add ., which adds all changes in the current directory and subdirectories:
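The sketch below recreates both changes and stages them together. The contents of main.py and the identity are placeholders.

```shell
cd "$(mktemp -d)"
git init --quiet
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"   # placeholder identity
echo "# My project" > README.md
git add README.md
git commit --quiet -m "Add README"

# One change to a tracked file and one brand new file.
echo "More detail about the project." >> README.md
echo 'print("hello")' > main.py

# Stage everything in this directory and below, then inspect the result.
git add .
git status
```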
We now have two changes staged for commit - one change to a tracked file README.md and a new, previously untracked file main.py.
We can create the Git commit using git commit -m 'message':
git log now shows our two commits:
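Put together, the second commit and the resulting history might look like this sketch. The commit messages, file contents and identity are placeholders.

```shell
cd "$(mktemp -d)"
git init --quiet
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"   # placeholder identity
echo "# My project" > README.md
git add README.md
git commit --quiet -m "Add README"

echo "More detail about the project." >> README.md
echo 'print("hello")' > main.py
git add .

# Record both staged changes as a single commit.
git commit -m 'Expand README and add main.py'

# The history now contains two commits, newest first.
git log
```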
There are a few ways to add files to a Git commit:
git add README.md - stages the changes in a single file README.md (and starts tracking it if it is new),
git add . - stages all changes, including new files, in the current directory and its subdirectories,
git add -u - adds changes in all tracked files (untracked files are ignored),
git add * - stages everything the shell matches with * in the current directory (hidden files in that directory, such as .gitignore, are skipped).
Commits are organized in a linear sequence which allows developers to see the entire history of changes made to the project. This linear sequence is the commit history.
The entire commit history is stored in the project’s repository and can be viewed using git log.
The git log command displays a list of all the commits made to the repository.
Each entry shown by git log includes the commit’s SHA-1 checksum, the author’s name and email, the date and time of the commit, and the commit message.
A useful git log command is:
git log --pretty=fuller --abbrev-commit --stat -n 5
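To see what these flags do, the sketch below builds a small two-commit history in a throwaway directory (placeholder identity, files and messages) and runs the command against it.

```shell
cd "$(mktemp -d)"
git init --quiet
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"   # placeholder identity
echo "# My project" > README.md
git add README.md
git commit --quiet -m "Add README"
echo 'print("hello")' > main.py
git add main.py
git commit --quiet -m "Add main.py"

# --pretty=fuller  shows both author and committer details
# --abbrev-commit  shortens the commit hashes
# --stat           lists the files changed in each commit
# -n 5             limits the output to the five most recent commits
git log --pretty=fuller --abbrev-commit --stat -n 5
```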
GitHub is a web-based platform that hosts Git repositories and adds collaboration features on top of Git.
Git is the version control system that tracks changes in your code. GitHub is a service that hosts Git repositories and makes it easier to collaborate with others.
GitHub acts as a central hub where developers can share their code, contribute to others’ projects, and collaborate on software development.
GitHub is not the only platform that developers use to work with Git repositories - services like Azure DevOps or GitLab offer similar functionality to GitHub.
So far we have only created a Git repository locally - it only exists on our local machine.
To put a Git repo onto GitHub, we need to do a few things:
After logging in to GitHub, you’ll find a '+' button on the upper right side where you can add a new repository.
You’ll be directed to a new page where you’ll be asked to fill out some information:
Almost always it’s best to initialize empty repositories on GitHub.
Next, you need to link your local repository to the remote repository on GitHub and push your commits to it. To do this, use the following commands:
Now that you have a repository on GitHub, you can push your local repo up to GitHub by adding it as a remote repository called origin:
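The sketch below shows the shape of these commands. So that it can run without a GitHub account, a local bare repository stands in for the empty GitHub repository; this is an assumption of the example, and with GitHub you would pass the URL of your repository to git remote add instead. The identity and file are placeholders.

```shell
# A local bare repository plays the role of the empty GitHub repository.
remote="$(mktemp -d)/demo.git"
git init --quiet --bare "$remote"

# A local repository with one commit.
cd "$(mktemp -d)"
git init --quiet
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"   # placeholder identity
echo "# My project" > README.md
git add README.md
git commit --quiet -m "Add README"
git branch -M master   # make sure the branch is called master, as in this lesson

# Register the remote under the name origin, then push to it.
git remote add origin "$remote"
git push -u origin master
```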
git push -u origin master pushes your commits to the ‘master’ branch of the ‘origin’ repository. The -u flag sets that remote branch as the upstream of your local branch, so later git push and git pull commands can be run without arguments.
Now your local Git repository is connected & backed up to your GitHub repository, enabling version control.
Other developers can now clone and work on it separately, enabling collaboration.
A branch is a copy of the codebase that can be worked on independently.
Branches allow you to work on multiple features or bug fixes in parallel without affecting the main development branch.
A branch is given a human readable name like amazing-new-feature or fix-the-bug.
A branch is created using the git branch command, and you switch between branches using the git checkout command.
When a branch is created, it is based on the current state of the codebase, and it includes all the commits up to that point.
Any new commits made while on that branch will be added to that branch, creating a separate branch history.
Once the work on a branch is finished, it can be merged back into the main codebase using git merge or git pull; git push then shares the result with a remote. This allows developers to incorporate their work into any other branch.
The ability to work on multiple branches allows developers to work on features or bug fixes in separate versions of the same codebase, without affecting other branches of the codebase.
Similar to commit messages, consistency around branch naming can be useful.
For example, prefixing with feature/ or bug/ or a GitHub issue number can help others understand what a branch is used for.
By default Git starts on the master branch - the branch that is automatically created when you create a Git repository:
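A quick way to check which branch you are on is git branch, sketched below in a throwaway repository with a placeholder identity. The name printed may be master or main, depending on your Git version and its init.defaultBranch setting.

```shell
cd "$(mktemp -d)"
git init --quiet
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"   # placeholder identity
echo "# My project" > README.md
git add README.md
git commit --quiet -m "Add README"

# Lists local branches; the current branch is marked with an asterisk.
git branch
```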
It’s also common for the default branch to be called main.
We can create a new branch using git branch:
We can then switch to this branch with git checkout:
This new branch is at the same state as master.
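The branch steps above can be sketched as follows. The branch name tech/requirements is the one used in this lesson; the throwaway directory, identity and file are assumptions of the example.

```shell
cd "$(mktemp -d)"
git init --quiet
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"   # placeholder identity
echo "# My project" > README.md
git add README.md
git commit --quiet -m "Add README"
git branch -M master   # make sure the default branch is called master

# Create the branch, then switch to it.
git branch tech/requirements
git checkout tech/requirements

# Both branch names point at the same commit at this stage.
git rev-parse master tech/requirements
```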
We can add commits to a branch using our git add and git commit workflow:
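For illustration, the sketch below commits a hypothetical requirements.txt on the tech/requirements branch. The file name, its contents and the identity are assumptions, not taken from the lesson.

```shell
cd "$(mktemp -d)"
git init --quiet
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"   # placeholder identity
echo "# My project" > README.md
git add README.md
git commit --quiet -m "Add README"
git branch -M master
git branch tech/requirements
git checkout --quiet tech/requirements

# Add and commit a file while on the new branch.
echo "requests" > requirements.txt   # hypothetical file
git add requirements.txt
git commit -m "Add requirements"

# The new commit exists on tech/requirements but not on master.
git log --oneline master..tech/requirements
```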
We now have two branches master and tech/requirements.
We can look at the difference between these branches using git diff:
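A sketch of comparing the two branches, using the same hypothetical requirements.txt and placeholder identity as assumptions:

```shell
cd "$(mktemp -d)"
git init --quiet
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"   # placeholder identity
echo "# My project" > README.md
git add README.md
git commit --quiet -m "Add README"
git branch -M master
git branch tech/requirements
git checkout --quiet tech/requirements
echo "requests" > requirements.txt   # hypothetical file
git add requirements.txt
git commit --quiet -m "Add requirements"

# Show what tech/requirements has that master does not.
git diff master tech/requirements
```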
Diffs can be quite large – many developers will view the diff between branches on a tool like GitHub or using git difftool.
We can bring our changes from tech/requirements into our master branch using git pull:
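One form this can take is sketched below: switch to master, then pull the branch from the local repository. The setup (throwaway directory, placeholder identity, hypothetical requirements.txt) is an assumption of the example.

```shell
cd "$(mktemp -d)"
git init --quiet
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"   # placeholder identity
echo "# My project" > README.md
git add README.md
git commit --quiet -m "Add README"
git branch -M master
git branch tech/requirements
git checkout --quiet tech/requirements
echo "requests" > requirements.txt   # hypothetical file
git add requirements.txt
git commit --quiet -m "Add requirements"

# Switch back to master and bring in the commits from tech/requirements.
git checkout --quiet master
git pull . tech/requirements
```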
git pull allows specifying the repository to pull from (in the command above it is . - the local repository itself).
It’s also possible to use git merge:
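The equivalent merge could be sketched like this, with the same assumed setup (throwaway directory, placeholder identity, hypothetical requirements.txt):

```shell
cd "$(mktemp -d)"
git init --quiet
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"   # placeholder identity
echo "# My project" > README.md
git add README.md
git commit --quiet -m "Add README"
git branch -M master
git branch tech/requirements
git checkout --quiet tech/requirements
echo "requests" > requirements.txt   # hypothetical file
git add requirements.txt
git commit --quiet -m "Add requirements"

# Merge the finished branch into master.
git checkout --quiet master
git merge tech/requirements
```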
Currently we have our two branches locally – we can push these branches up to our remote repository origin on GitHub:
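As a sketch, pushing both branches could look like the following. A local bare repository again stands in for the GitHub remote so the example can run offline; with GitHub, origin would already point at your repository's URL. The identity and files are placeholders.

```shell
# A local bare repository plays the role of the GitHub remote.
remote="$(mktemp -d)/demo.git"
git init --quiet --bare "$remote"

cd "$(mktemp -d)"
git init --quiet
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"   # placeholder identity
echo "# My project" > README.md
git add README.md
git commit --quiet -m "Add README"
git branch -M master
git branch tech/requirements
git remote add origin "$remote"

# Push both local branches to the remote in one command.
git push origin master tech/requirements

# List the branches that now exist on the remote.
git ls-remote --heads origin
```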