Introduction to Git and GitHub
Naming conventions
Repository names should be following the following conventions:
1) General repositories: all lowercase with a - in
between words e.g. find-validation-platform-aws-hosting, 2)
Repositories for R packages or shiny applications: all
lowercase with no delimiter in between e.g. findcompositeindicator.
Main terminology
Git: version control system that allows users work
collaboratively on a codebase workflow to track changes and manage
different versions of the files created during a codebase project. Git
can be installed on a local machine and be used via the command
line.
GitHub: a web-based platform that uses a graphical
interface for Git that allows users to share, store, and collaborate
remotely in public or private projects.
GitHub Desktop: Windows/macOS graphical interface
for Git that allows users to perform Git commands without the command
line interface.
Repository: a collection of files, folders, and
version history managed by Git. A repository usually contains:
The project files (of any type) and folders
A hidden folder called “.git” that contains the metadata of the
changes made in the codebase project
All the commits, which represent a snapshot of the project files
that were in the repository at a specific point in time. The commit
objects also include a description of the changes made to the
files.
All the branches, which represent multiple versions of the
codebase project and allow working in multiple features by different
users at the same time.
Initializing a repository: Create a new local Git
repository or initialize an existing directory.
Staging: Prepare the files to be committed to the
repository.
Committing: Commit the changes to the repository
(snapshot of the changes).
Pushing: Push the committed local files to the
online repository. Updates the remote repository with the latest
changes.
Pulling: Fetch the files of a remote repository to a
local repository. Updates the local repository with the latest
changes.
Cloning: Clone a remote repository to a local
machine.
Branching: Create new/secondary versions of the
repository. This allows working on new features of the codebase project
without affecting the files that are in the main branch of the
repository.
Merging: Merge the changes from one branch into
another (usually to the main branch).
Getting Started with GitHub
The first step to using GitHub is to create an account. Go to https://github.com and click
on the “Sign up” button in the top right corner. Enter all the
information required (i.e., email, username, password…) and click on
“Create account”.
Connecting to Git and GitHub Desktop
As mentioned in the terminology, Git is a version control system that
allows tracking and managing all changes made to codebase projects. To
do so, you can use either Git, which is based on the command lines from
the terminal, GitHub Desktop, which is a graphical, and more
user-friendly, interface or RStudio which is a graphical interface for R
that also permits to perform all the Git and GitHub operations directly
on this interface. The use of one or another depends on which one you
feel more comfortable with, but all are capable of performing the same
tasks.
The tutorial shows screenshots of the whole process following as main
example GitHub Desktop because it is more user friendly and easy to
visualize. But it also includes the commands needed to perform the same
tasks in Git and RStudio. Please note, screenshots are taken using dark
mode, you may see white backgrounds depending on your configuration.
GitHub Desktop
You can download and install GitHub Desktop from https://desktop.github.com/. Once installed, open the
software, and add your GitHub credentials (email and password) to
connect to GitHub.
Git
Access https://git-scm.com/ to download and install Git. After
installation, run the following commands to connect to your GitHub
account:
git config --global user.name "your name"
git config --global user.email "your.email@finddx.org"
Please note, you may be asked for your GitHub password when pushing
or pulling changes from a remote repository.
RStudio
If you are an R user you can also do everything directly in the
RStudio interface. To do so, you need to first setup Git and GitHub as
is mentioned above. Assuming you have R and Rstudio installed, you will
need to install (install.packages()) and load (library()) the “usethis”
and “gitcreds” libraries.
In Rstudio, go to “Tools”, “Global options” and then “Git/SVN” and
make sure that the option version control interface for RStudio projects
is enabled and that there is an executable path, usually is something
like this:
“C:/Users/YOURUSERNAME/AppData/Local/Programs/Git/bin/git.exe. Click OK
to return to RStudio.

To connect RStudio to Git, run the following commands in the Terminal
(not the Console) using the user name and email you already setup in Git
and GitHub:
git config --global user.name "Your name"
git config --global user.email youremail@finddx.org
Or you can also use the following command to run it directly from the
Console:
usethis::use_git_config(user.name="Your name", user.email="youremail@finddx.org")
To avoid using a password each time you want to connect to git you
can use:
usethis::use_git_config(scope="user", credential.helper="store")
Additionally, you will need to create a token in GitHub to connect to
this platform. To do so, you can either type
usethis::create_github_token()
to be redirected to GItHub
and create a token for the connection or go directly to https://github.com/settings/tokens and click on
“Generate new token (classic)”, in any case you will be launching the
same page, and you will need to configure basic information about the
permissions you want to grant to this connection. For this example, we
are just granting permissions to do basic operations in the repository
(commit, pull, push, etc.), but you could potentially grant additional
admins permissions.

Once you have obtained the token, use the function
gitcreds::gitcreds_set(url=“https://github.com”), and specify github as the
repository page. After running the command, you will be asked in the
Console in RStudio to provide the token, simply pasted in the
corresponding area and this will setup the connection.
Creating a new repository (GitHub)
It is possible to create a new repository from either GitHub, GitHub
Desktop, Git, or RStudio (). In the steps below we are going to see how
to do it using GitHub only, see “Creating a repository (GitHub Desktop,
Git, and RStudio)” to do it using the those approaches.
GitHub
To create a new repository on GitHub, click on the “New” button on
the GitHub dashboard or go to the GitHub page of the organization you
want to create a new repository, and on the “Overview” tab you should
see a “New” button.
Once you click on new you should see the below image, where you need
to give a name for the repository you are creating and an optional
description. It is also possible to choose whether to make the
repository public or private. Public repositories are visible to anyone,
while private repositories can only be accessed by authorized
collaborators. After this simply click on “Create repository”.

And you should see now an empty repository.

Cloning the repository
To be able to work on a repository that is available in GitHub, it is
necessary to first clone the repository to a local machine. This will
create a local copy of all the files that are available online, so an
user can make any local modifications of the files and code, and once
the necessary modifications are made, push them into the GitHub remote
repository.
To clone an empty repository, copy the URL that would appear on the
new repository page.

Or if you want to clone a repository with existing files, click on
the green “Code” button on the repository page, and then copy the
URL.

GitHub Desktop
In GitHub Desktop simply go to “File” then “Clone repository” and
paste the clipboard (GitHub repository address) in the “URL” tab that
will appear in the popup window. You should also select the location
(local path) where you want to store the repository and click on
“Clone”. You can now go to the local path you chose, and you will see a
folder with the name of the repository you cloned.

Git
Open the terminal and navigate to the directory where you want to
store the project (cd /path to your directory/
). Then,
run:
git clone <paste the copied URL here>
For this example, would look something like this:
git clone <https://github.com/finddx/githubTutorial.git>
RStudio
To clone a repository, click on “File”, “New Project”, “Version
Control” and then “Git” copy and paste the address of the repository,
and automatically will popup the directory name, so you just need to
select the folder where you want to store the repository locally. Once
you click on “Create Project” a new tab called “Git” will appear in the
top-right panel.

Creating a repository (GitHub Desktop and Git)
As mentioned above, it is also possible to first create a local
repository and then connect it to GitHub. The main difference between
first creating the remote repository in GitHub with creating it directly
on a local machine is that, unlike the first option, it is now no longer
necessary to clone the repository because it is already in the local
machine, but now it is necessary to connect the local repository with a
GitHub account to create the remote repository.
GitHub Desktop
In GitHub Desktop you can create a repository simply by clicking on
“File” and “New repository”, you will see a popup window like the one
below to add basic information about the repository such as name, the
location of the local repository, etc.

After creating a local repository you will see that you are now asked
to publish the repository to GitHub.

Click on “Publish repository” to link the repository to a GitHub
account.

Git
To create a new Git repository, open the terminal (or command prompt)
and navigate to the location where the repository will be locally
created:
cd /path to your directory/
Then create a new directory for the repository (for this example, it
is called “githubTutorial”):
mkdir githubTutorial
And navigate into the new directory to initialize a new Git
repository:
cd githubTutorial
git init
Now it will be necessary to create a repository in GitHub (follow the
steps from “Creating a new repository (GitHub)” until “Cloning a
repository”) to connect it to the local repository. Please make sure
that the names of the repositories in both Git and GitHub are the same,
including lower- and upper-case letters.
Once the GitHub account is created will be necessary to copy the URL
of the GitHub repository to Git and run the command below (again, in
this example we are using a repository called githubTutorial that is
stored in the FIND GitHub account):
git remote add origin https://github.com/finddx/githubTutorial.git
You should then verify that the “origin” remote has been added
successfully:
git remote -v
If so, you should see something like this:
origin https://github.com/your-username/githubTutorial.git (fetch)
origin https://github.com/your-username/githubTutorial.git (push)
And then simply push the local repository to GitHub:
git push -u origin main
Note, depending on the GitHub repository, “main” could also be called
“master” as the default branch.
Rstudio
In RStudio, you can simply type:
usethis::use_github()
You can also specify parameters for the repository such as the
organization, and its visibility:
usethis::use_github(organization=”Organisation name”, visibility=”private”)
Making changes to the repository
Once there is a local copy of a repository in a local machine, it is
now possible to start making changes to the code/files of the project.
To do so, simply open the script or file and do the modifications you
need to do. Once the work is finished save the new changes in the local
repository you are working on. You can also create/add new files to the
repository if you need to.
Committing changes
Once the changes are made you need to stage and commit the changes so
this can be moved to the GitHub repository too. The staging process
refers to adding/preparing the files that will be committed to the
repository. The commit stage refers to taking a “snapshot” of the
changes that were made in the repository, so we can keep track of any
modifications.
GitHub Desktop
In GitHub Desktop when there is no modifications the screen will look
like the image below, indicating there are “No local changes”, you can
also observe that the left area indicates “0 changed files”.

However, if we add a file or modify some pieces of the script, those
will be highlighted in the screen (green for all the information we add,
and red for the information we remove). For this example, we created a
new R project called github_tutorial that holds a hello function that
simply prints “Hello” + the name of the person. If we save this file, we
will observe that everything is in green because there are now new files
that did not exist before.

You can see that on the left area (image above) everything is
automatically selected, this indicates that all the files are now in the
staging phase, you can uncheck any boxes in case you do not want to
stage (and commit) all the files.
We then can proceed to commit the file. First, write a message (in
present tense) on the left-bottom area indicating the modifications that
were made, and an optional description to detail everything you did, so
all the users can better understand the changes that were made. In this
example due to it is the first commit, we can put something like
“Initial commit”. After this, click on “Commit to main” to commit the
changes from the local repository to the remote repository in
GitHub.

Git
To move all the files to stage mode in Git simply run:
git add .
Where “.” refers to selecting all the files, but you could also add
specific files. For example, if you just want to move a file called
“test.R” you could run this command:
git add test.R
After this, you could run git status just to verify that the files
are in stage mode.
Once the desired files are in stage mode, you need to commit them. To
do so, run the below command, note that you need to add a message (in
present tense) to the commit indicating the changes that were made, so
all the users that work on the project can relate to what the new
changes refer to.
git commit -m "Add your commit message here"
RStudio
In RStudio, files that have been modified will appear in the
top-right area. In this area is possible to simply select the files to
stage them and then click on “Commit”.

Once you click on “Commit”, a new window will popup allowing to add
the commit message, and from here simply commit the files and a small
pop up message will appear indicating the modifications made. As you may
observe new modifications are highlighted in green while old version of
the files are in red. The non-highlighted data indicates that they were
not modifications.

Pushing changes to GitHub
Once files have been committed it is necessary to push them to
GitHub, so all the new modifications can be visible to the users that
have access to the repository.
GitHub Desktop
Once you commit the changes in GitHub Desktop, a new message will
appear asking to push the changes to the remote repository, simply click
on “Publish branch” to push the files to the remote repository. You can
also observe that now it says “0 changed files”, as all of them are now
committed.

If you now go to the GitHub repository, you will see the new
files/changes and all the information about the commit (i.e., commit
message, date of the commit, etc.).

Git
In Git, to push the local changes into the GitHub repository simply
run the following command:
git push origin main
RStudio
In RStudio, once files have been committed simply click on “Push” to
move the files to the remote repository, a small popup will appear
indicating the details of this action.

Please note, you can do all these operations in the pop up window
that appeared after clicking on commit, or can be done directly in the
“Git” tab (top-right area) in RStudio.
Pulling changes from GitHub
It is common that when working on a collaborative repository,
multiple users work and modify the files constantly. So, to keep track
of the changes made by other users (or a file that is updated
automatically), you need to pull those changes from the remote
repository to the local machine you are working on before making a new
modification.
For this example, assume there is another user working on the same
repository who modified the hello function and added a welcome message
after the hello sentence. To do that, the user first of course cloned
the repository locally, and then committed and pushed the files to the
remote repository, so now we can see in GitHub the changes that this
person made.

GitHub Desktop
Whenever there are new changes in the GitHub repository, GitHub
desktop would indicate there are committed changes that are not in the
local machine (if you are sure there are new changes, but do not see any
message in GitHub desktop simply click on “Fetch” in the ribbon area to
update the connection). To get the newest changes from the remote
repository to the local one, click on “Pull origin” and you will see
that the new modifications/files are now added to the local
repository.

Git
In Git, to pull changes simply run:
git pull origin main
RStudio
To pull the files simply click on “Pull” either in the top-right
corner or in the Git popup window, a new message will appear indicating
the files that changed, once you return to Rstudio main interface you
will observe that the script has bee updated.
Working with branches
Because GitHub is a collaborative platform, it may be cases when more
than one person is working on the same file at the same time or when the
new modifications need to be first validated by a third person to ensure
everything works as expected. In those cases, it is necessary to push
the new files and modifications into an independent “branch”, which
basically is a clone of the main repository, plus the new modifications
that were made in the local machine. In that way, if someone modifies
something, the new modifications are in an independent branch and the
software running is not affected until is fully tested and ensure
modifications from other users are not overwritten until they have been
tested or approved.
In GitHub, you can create an empty branch by clicking on the “main”
branch filed and then a small window will pop up so you can put the name
of the new branch.

GitHub Desktop
In GitHub Desktop you can simply go to “Current branch” in the upper
ribbon area and click on “New branch”, from there put the name of the
new branch and click on “Create branch”.

To create a new branch and add directly the new modifications, you
need of course to have new changes that have not been committed yet.
Let’s say we change the “Hello Human Welcome to FIND” sentence, so we
include some exclamation signs and change the “W” to lower case, which
will lead to the following message “Hello Human! Welcome to FIND!”. As
you can see in the image below, GitHub will then highlight in red the
pieces of code that will no longer exist and in green the pieces that
were added. Pieces or files that were not changed will not be
highlighted.

Then, you can of course commit those new changes to the remote
repository, but before that, we need to create a new branch, so these
new modifications do not overwrite anything from the main branch. In
this example, we created a branch called “example-1”. To do that click
on main (see image below) and type the name of the new branch and click
on “Create new branch”.

Select “Bring my changes to -name of the new branch-” and click on
“Switch branch”

Now you will see that the current branch is set to the new branch
name, and then you can commit and push the changes to the remote
repository as we already saw in the corresponding section. If you now go
to the GitHub repository you will see that the new branch is now
live.

Git
In Git, there are two ways to create a new branch:
git branch \<branch-name\>
creates a new branch,
but it does not switch to it.
git checkout -b \<branch-name\>
creates a new
branch and switches to it.
But if you have already modified some files and want to switch and
move those uncommitted changes directly to a new branch, run the
following command:
git switch -c \<branch-name\> -m
RStudio
In RStudio, click on the icon highlighted in the image below a type
the name for the new branch, you can select “Sync branch with remote” to
add the branch automatically in the remote repository.

Merging branches
GitHub
Once you decided the information from the new branch is without
issues and ready to be part of the main code, you can go to the GitHub
repository and merge it to the “main” or “master” branch. Usually this
is done by a third user, but it can also be made by the same user who
created the new changes if the settings of the repository allow it. To
merge a new branch with the main branch, go to the remote repository and
you will see a message indicating “Compare & pull request” and
simply click on it.

A new window will appear with several pieces of information (see
image below). It will indicate whether those branches are “Able to
merge” (i.e., there are no conflicts) and you will also be able to write
a message (if any) about the merger process. Before clicking on “create
pull request”, if you scroll down you will see that GitHub is comparing
the modifications made as we previously saw when we created this new
branch in GitHub Desktop (new additions are in green and deleted
information is in red).

After clicking on “Create pull request”, a new window will appear
validating this merge and if no issues arises, it would indicate there
are not conflicts and you will be free to merge (“Merge pull request”
button and then confirm merge).

Once the merging is finished, you will be asked if you want to delete
the temporal branch, simply click on “Delete branch”.

Note that GitHub keeps track of any change, so if you click on the
main branch field you can access all closed (and open) branches and
restore them if necessary.
Merging a branch locally
GitHub Desktop
To merge a branch locally, go to the branch (Current branch tab)
where you want to merge another branch (usually main or master). Then
click again in the “Current branch” tab and click on “Choose a branch to
merge into -branch name-” .

Select the branch you want to merge into the current branch and click
on “Create a merge commit”. Then, you can simply push the commit to the
remote repository.

Git
In Git, it will be necessary to checkout the branch where you want to
merge another branch (usually main or master) using the command
git checkout -branch name(main or master)-
, and run the
merge command with the name of the branch you want to merge:
git merge -branch name to merge-
, for this example, would
be: git merge example-1
. Then, you will need to stage,
commit, and push to the remote repository.
RStudio
Rstudio does not allow to easily merge branches directly in the
platform. However, this can be achieved directly in GitHub or through
the git4r library.
Dealing with conflicts
Collaborative project involves sometimes different users working on
the same piece of code and coming up with different approaches for the
same solution. This will of course create conflict because we would have
two (or more) branches with different versions of the same file.
Sometimes, we can integrate the different solutions as part of the
main project or use some parts of each solution to create a new one, but
in most cases it will be necessary to choose only one solution and
remove the others. Whatever the case, we need to take decisions and use
just the part of the code we want to integrate into our main
project.
GitHub Desktop
When there is a conflict, GitHub will immediately highlight these
issues with the conflict markers: <<<<<<<HEAD
“conflict code branch 1” ======= “conflict code branch 2”
>>>>>> along with the name of the commit where the
issue is present (see image below).

What you need to do is to open the file where there is a conflict;
here for example: <<<<<<< HEAD print(paste("Hello",
name,"!", "welcome to FIND!")) ======= print(paste("Hello",name))
>>>>>>> parent of 11a1c61 (Add welcome sentence to
hello_function.R) }, and to solve it, we simply keep the piece of code
we want to retain: print(paste("Hello", name,"!", "welcome to FIND!"))
and remove the part of the script that we want to remove:
<<<<<<< HEAD ======= print(paste("Hello",name))
>>>>>>> parent of 11a1c61 (Add welcome sentence to
hello_function.R) }.

Then you will need to commit the new changes and push them to the
remote repository as we already saw.
Git
In Git, you will need to identify the files that have conflict using
git status
, and as was mentioned for GitHub Desktop, open
the conflicted file to look for the conflict markers
(<<<<<<<HEAD “conflict code branch 1” =======
“conflict code branch 2” >>>>>>). Edit the file,
keeping the piece of code you want to retain, and then simply stage,
commit and push the changes to the remote repository.
Git
RStudio will highlight (in green) the pieces of code that have a
conflict ( <<<<<<< HEAD conflict code branch 1
======= conflict code branch2 >>>>>>>), and you
will need to manually open that file and keep part of the script you
want to preserve. COnflict markers also need to be removed manually.
Once, resolved you simply need to commit and push the files to the
remote repository.

Restoring a file on a specific commit
(checkout)
As mentioned throughout this tutorial, it is always possible to go
back to a previous version of a commit in case the new version of the
code does not work as expected or simply because you want to check a
previous version for any specific reasons.
GitHub Desktop
Go to the left sidebar and click on the “History” tab, right click on
the commit you want to restore and click on “Revert changes in commit”.
As you may see, it is also possible to create a new branch from a
specific commit (Create branch from commit).

Git
In git, you would need to first find the “hash” (id) of the commit
you want to restore, you can do this using:
git log \-- file-name
Then you need to use
git checkout commit-hash -- filename
to restore the file to
the state on that specific commit.
Note that the “checkout” command is also used to switch between
branches in Git (see Working with branches of this tutorial)
RStudio
In RStudio open the Git window and select “History” in the ribbon
area. You will see all the commits history and some related metadata.
Click on the commit you want to restore and open the file, then you can
simply click on “Save as” to overwrite the file to the old version
selected. Please note that you can also click on the “Revert” button,
but this will only retrieve the previous non-committed file.
Tagging users and creating issues in GitHub
As we already saw, other users can work on a specific project and
make modifications to the repository. But to have a better control of
this we can add or restrict the access to specific users, create issues
so users can create or fix a specific piece of code, or tag them to
notify about certain issues they need to check.
To add/restrict access, go to the GitHub repository, and then to the
settings tab (please note, to do this you will need to have admin
rights). Go to members and teams, and search for the GitHub username you
want to add. You can assign specific roles to the users such as Admin,
Write, and Maintain. It is also possible to add Teams or remove
members.

It is also possible to create a new issue, so we can have better
control over what each user is doing. To do that, go to the issues tab
and simply click on “New issue”. In this area, it is possible to add a
title for the issue, write a message, but also to assign specific users
to the user so they can be notified of specific task they need to work
on.

Once the issue is resolved (and the user committed and pushed the
changes to the remote repository), the user can simply click go to the
issue and click on close issue, so it is no longer listed as open. But
remember, because Git keeps track of all the changes, the issue can be
re-open if necessary.
Final Recap
The image below provides a general recap of how the Git process
workload. The first step is to move (add) the files from the local
working area to the staging phase, so we can commit and push the files
from the local repository to the remote repository, and in the opposite
way, we need to pull the files from the remote repository to the local
repository to keep all the local files up to date. Checkout allows
switching between repository versions or branches.

Image source: https://github.com/szalam/gitwintut