Week 2 [Mon, Aug 18th] - Topics

Detailed Table of Contents



Guidance for the item(s) below:

Given this is a first course in SE, tradition demands that we start by defining the subject. However, let's not spend a lot of time going through lengthy/formal definitions of SE. Instead, let's look at an extract from the very first chapter of a very famous SE book, with the aim of providing some inspiration, but also an appreciation of the challenges ahead.

[W2.1] SE: Intro

W2.1a

Software Engineering → Introduction → Pros and cons

Software engineering: Software Engineering is the application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software" -- IEEE Standard Glossary of Software Engineering Terminology

The following description of the Joys of the Programming Craft was taken (and emphasis added) from Chapter 1 of the famous book The Mythical Man-Month, by Frederick P. Brooks.

Why is programming fun? What delights may its practitioner expect as his reward?

First is the sheer joy of making things. As the child delights in his mud pie, so the adult enjoys building things, especially things of his own design. I think this delight must be an image of God's delight in making things, a delight shown in the distinctness and newness of each leaf and each snowflake.

Second is the pleasure of making things that are useful to other people. Deep within, you want others to use your work and to find it helpful. In this respect the programming system is not essentially different from the child's first clay pencil holder "for Daddy's office."

Third is the fascination of fashioning complex puzzle-like objects of interlocking moving parts and watching them work in subtle cycles, playing out the consequences of principles built in from the beginning. The programmed computer has all the fascination of the pinball machine or the jukebox mechanism, carried to the ultimate.

Fourth is the joy of always learning, which springs from the nonrepeating nature of the task. In one way or another the problem is ever new, and its solver learns something: sometimes practical, sometimes theoretical, and sometimes both.

Finally, there is the delight of working in such a tractable medium. The programmer, like the poet, works only slightly removed from pure thought-stuff. He builds his castles in the air, from air, creating by the exertion of the imagination. Few media of creation are so flexible, so easy to polish and rework, so readily capable of realizing grand conceptual structures....

Yet the program construct, unlike the poet's words, is real in the sense that it moves and works, producing visible outputs separate from the construct itself. It prints results, draws pictures, produces sounds, moves arms. The magic of myth and legend has come true in our time. One types the correct incantation on a keyboard, and a display screen comes to life, showing things that never were nor could be.

Programming then is fun because it gratifies creative longings built deep within us and delights sensibilities you have in common with all men.

Not all is delight, however, and knowing the inherent woes makes it easier to bear them when they appear.

First, one must perform perfectly. The computer resembles the magic of legend in this respect, too. If one character, one pause, of the incantation is not strictly in proper form, the magic doesn't work. Human beings are not accustomed to being perfect, and few areas of human activity demand it. Adjusting to the requirement for perfection is, I think, the most difficult part of learning to program.

Next, other people set one's objectives, provide one's resources, and furnish one's information. One rarely controls the circumstances of his work, or even its goal. In management terms, one's authority is not sufficient for his responsibility. It seems that in all fields, however, the jobs where things get done never have formal authority commensurate with responsibility. In practice, actual (as opposed to formal) authority is acquired from the very momentum of accomplishment.

The dependence upon others has a particular case that is especially painful for the system programmer. He depends upon other people's programs. These are often maldesigned, poorly implemented, incompletely delivered (no source code or test cases), and poorly documented. So he must spend hours studying and fixing things that in an ideal world would be complete, available, and usable.

The next woe is that designing grand concepts is fun; finding nitty little bugs is just work. With any creative activity come dreary hours of tedious, painstaking labor, and programming is no exception.

Next, one finds that debugging has a linear convergence, or worse, where one somehow expects a quadratic sort of approach to the end. So testing drags on and on, the last difficult bugs taking more time to find than the first.

The last woe, and sometimes the last straw, is that the product over which one has labored so long appears to be obsolete upon (or before) completion. Already colleagues and competitors are in hot pursuit of new and better ideas. Already the displacement of one's thought-child is not only conceived, but scheduled.

This always seems worse than it really is. The new and better product is generally not available when one completes his own; it is only talked about. It, too, will require months of development. The real tiger is never a match for the paper one, unless actual use is wanted. Then the virtues of reality have a satisfaction all their own.

Of course the technological base on which one builds is always advancing. As soon as one freezes a design, it becomes obsolete in terms of its concepts. But implementation of real products demands phasing and quantizing. The obsolescence of an implementation must be measured against other existing implementations, not against unrealized concepts. The challenge and the mission are to find real solutions to real problems on actual schedules with available resources.

This then is programming, both a tar pit in which many efforts have floundered and a creative activity with joys and woes all its own. For many, the joys far outweigh the woes....


Exercises:

SE vs Civil Engineering


Software vs Bridges


Coding as a manufacturing activity


List pros and cons of SE


Which one of these is not included in Brook’s list of ‘Woes of the Craft’?




Guidance for the item(s) below:

Now, let's switch our focus to the project management aspect of SE.

Broadly speaking, there are two approaches to doing a software project. Those two approaches are also highly relevant to the way this course is run, and how it is different from most SE courses elsewhere.

Let's learn about those two approaches early so that we can better understand how this course works.

[W2.2] SDLC Process Models: Basics

W2.2a

Project Management → SDLC Process Models → Introduction → What

Software development goes through different stages such as requirements, analysis, design, implementation and testing. These stages are collectively known as the software development lifecycle (SDLC). There are several approaches, known as software development lifecycle models (also called software process models), that describe different ways to go through the SDLC. Each process model prescribes a 'roadmap' for the software developers to manage the development effort. The roadmap describes the aims of the development stages, the outcome of each stage, and the workflow i.e., the relationship between stages.


W2.2b

Project Management → SDLC Process Models → Introduction → Sequential models

The sequential model, also called the waterfall model, views software development as a linear process, in which the project is seen as progressing through the development stages. The name waterfall stems from how the model is drawn to look like a waterfall (see below).

When one stage of the process is completed, it produces some artifacts to be used in the next stage. For example, the requirements stage produces a comprehensive list of requirements, to be used in the design phase.

A strict sequential model project moves only in the forward direction i.e., each stage is completed before starting the next. For example, once the requirements stage is over, there is no provision for revising the requirements later.

This model can work well for a project that produces software to solve a well-understood problem, in which case the requirements can remain stable and the effort can be estimated accurately. Furthermore, as each stage has a well-defined outcome, it is easy to track the progress of the project because one can gauge the project progress by monitoring which stage the project is in.

However, real-world projects often tackle problems that are not well-understood at the beginning, making them unsuitable for this model. For example, target users of a software product may not be able to state their requirements accurately at the start of the project, if they have not used a similar product before.


W2.2c

Project Management → SDLC Process Models → Introduction → Iterative models

The iterative model advocates producing the software by going through several iterations. Each of the iterations could potentially go through all the stages of the SDLC, from requirements gathering to deployment.

Each iteration produces a new version of the product, building upon the version produced in the previous iteration. Feedback from each iteration is factored into the subsequent iterations. For example, if an implementation task took longer than expected, the effort estimate for a similar tasks in future iterations can be adjusted accordingly. Similarly, if a feature introduced in the current iteration was not well-received by target users, it can be removed or tweaked in the next iteration.

The iterative model can be done in breadth-first or depth-first approach.

  • In the breadth-first approach, an iteration evolves all major components and all functionality areas in parallel i.e., most features and most will be updated in each iteration, producing a working product at the end of each iteration.
  • In the depth-first approach, an iteration focuses on fleshing out only some components or some functionality area. Accordingly, early depth-first iterations might not produce a working product.

Taking a Minesweeper game as an example,

  • breadth-first iterations will deliver a fully playable version early. These early versions may have primitive functionality, for example, a rudimentary text based UI, fixed board size, limited minefield layouts, etc. These functionalities (and corresponding components) will then be improved in later releases.
  • an early depth-first iteration could deliver the full user interface (UI) but with no game logic at all. Alternatively, an early iteration could focus on just the logic for generating initial layouts of the minefield. Neither will be a playable version of the game but both can be used to collect early feedback (about the UI, and the initial minefield layouts, respectively) which can then be used to guide later iterations.

A project can be done as a mixture of breadth-first and depth-first iterations i.e., an iteration can contain some breadth-first work as well as some depth-first work, or, some iterations can be breadth-first while others are depth-first.



Guidance for the item(s) below:

Next, let's get started learning Java. First, a bit about the language itself.

[W2.3] Java: Intro

W2.3a

C++ to Java → About this chapter

This book chapter assumes you are familiar with basic C++ programming. It provides a crash course to help you migrate from C++ to Java.

This chapter borrows heavily from the excellent book ThinkJava by Allen Downey and Chris Mayfield. As required by the terms of reuse of that book, this chapter is released under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License and not under the MIT license as the rest of this book.

Some conventions used in this chapter:

icon marks the description of an aspect of Java that works mostly similar to C++

icon marks the description of an aspect of Java that is distinctly different from C++

Other resources used:


W2.3b

C++ to Java → The Java World → What is Java?

Java was conceived by James Gosling and his team at Sun Microsystems in 1991.

Java is directly related to both C and C++. Java inherits its syntax from C. Its object model is adapted from C++. --Java: A Beginner’s Guide, by Oracle

Fun fact: The language was initially called Oak after an oak tree that stood outside Gosling's office. Later the project went by the name Green and was finally renamed Java, from Java coffee. --Wikipedia

Oracle became the owner of Java in 2010, when it acquired Sun Microsystems.

Java has remained the most popular language in the world for several years now (as at July 2018), according to the TIOBE index.


W2.3c

C++ to Java → The Java World → How Java works

Java is both and . Instead of translating programs directly into machine language, the Java compiler generates byte code. Byte code is portable, so it is possible to compile a Java program on one machine, transfer the byte code to another machine, and run the byte code on the other machine. That’s why Java is considered a platform independent technology, aka WORA (Write Once Run Anywhere). The interpreter that runs byte code is called a “Java Virtual Machine” (JVM).

Java technology is both a programming language and a platform. The Java programming language is a high-level object-oriented language that has a particular syntax and style. A Java platform is a particular environment in which Java programming language applications run. --Oracle


W2.3d

C++ to Java → The Java World → Java editions

According to the Official Java documentation, there are four platforms of the Java programming language:

  • Java Platform, Standard Edition (Java SE): Contains the core functionality of the Java programming language.

  • Java Platform, Enterprise Edition (Java EE): For developing and running large-scale enterprise applications. Built on top of Java SE.

  • Java Platform, Micro Edition (Java ME): For Java programming language applications meant for small devices, like mobile phones. A subset of Java SE.

  • JavaFX: For creating applications with graphical user interfaces. Can work with the other three above.

This book chapter uses the Java SE edition unless stated otherwise.



Guidance for the item(s) below:

As with any language, the first step is to install the language on your computer. After that, you write a simple HelloWorld program, and get it running.

Java 17 can be downloaded from here.

[W2.4] Java: HelloWorld

W2.4a

C++ to Java → Getting Started → Installation

To run Java programs, you only need to have a recent version of the Java Runtime Environment (JRE) installed in your device.

If you want to develop applications for Java, download and install a recent version of the Java Development Kit (JDK), which includes the JRE as well as additional resources needed to develop Java applications.


W2.4b

C++ to Java → Getting Started → HelloWorld

In Java, the HelloWorld program looks like this:

public class HelloWorld {

    public static void main(String[] args) {
        // generate some simple output
        System.out.println("Hello, World!");
    }
}

For reference, the equivalent C++ code is given below:

#include <iostream>
using namespace std;

int main() {
    // generate some simple output
    cout << "Hello, World!";
    return 0;
}

This HelloWorld Java program defines one method named main: public static void main(String[] args)

System.out.println() displays a given text on the screen.

Some similarities:

  • Java programs consists of statements, grouped , which are then grouped into classes.
  • Java is “case-sensitive”, which means SYSTEM is different from System.
  • public is an access modifier that indicates the method is accessible from outside this class. Similarly, private access modifier indicates that a method/attribute is not accessible outside the class.
  • static indicates this method is defined as a class-level member. Do not worry if you don’t know what that means. It will be explained later.
  • void indicates that the method does not return anything.
  • The name and format of the main method is special as it is the method that Java executes when you run a Java program.
  • A class is a collection of methods. This program defines a class named HelloWorld.
  • Java uses squiggly braces ({ and }) to group things together.
  • The line starting with // is a comment. You can use // for single line comments and /* ... */ for multi-line comments in Java code.

Some differences:

  • Java use the term method instead of function. In particular, Java doesn’t have stand-alone functions. Every method should belong to a class. The main method will not work unless it is inside the HelloWorld class.
  • A Java class definition does not end with a semicolon, but most Java statements do.
  • In most cases (i.e., there are exceptions), the name of the class has to match the name of the file it is in, so this class has to be in a file named HelloWorld.java.
  • There is no need for the HelloWorld code to have something like #include <iostream>. The library files needed by the HelloWorld code is available by default without having to "include" them explicitly.
  • There is no need to return 0 at the end of the main method to indicate the execution was successful. It is considered as a successful execution unless an error is signalled specifically.

W2.4c

C++ to Java → Getting Started → Compiling a program

To compile the HelloWorld program, open a command console, navigate to the folder containing the file, and run the following command.

>_ javac HelloWorld.java

If the compilation is successful, you should see a file HelloWorld.class. That file contains the byte code for your program. If the compilation is unsuccessful, you will be notified of the compile-time errors.

Notes:

  • javac is the java compiler that you get when you install the JDK.
  • For the above command to work, your console program should be able to find the javac executable (e.g., In Windows, the location of the javac.exe should be in the PATH system variable).
    This page shows how to set PATH in different OS'es.

W2.4d

C++ to Java → Getting Started → Running a program

To run the HelloWorld program, in a command console, run the following command from the folder containing HelloWorld.class file.

>_ java HelloWorld

Notes:

  • java in the command above refers to the Java interpreter installed on your computer.
  • Similar to javac, your console should be able to find the java executable.

When you run a Java program, you can encounter a . These errors are also called "exceptions" because they usually indicate that something exceptional (and bad) has happened. When a run-time error occurs, the interpreter displays an error message that explains what happened and where.

For example, modify the HelloWorld code to include the following line, compile it again, and run it.

System.out.println(5/0);

You should get a message like this:

Exception in thread "main" java.lang.ArithmeticException: / by zero
    at Hello.main(Hello.java:5)

Integrated Development Environments (IDEs) can automate the intermediate step of compiling. They usually have a Run button which compiles the code first and then runs it.

Example IDEs:

  • IntelliJ IDEA
  • Eclipse
  • NetBeans

Exercises:

[Exercise] Run HelloWorld

A. Install Java on your computer, if you haven't done so already. Java 17 is recommended.

Java installation guides for Windows/Mac/Linux can be found in this se-edu/guide.

Ensure the installation is correct, as follows.

  1. Open a command prompt.
  2. Run the command java -version. Confirm that the output indicates that you have Java 17. Sample output:
    C:\Users\john>java -version
    java version "17._._" ...
    ...
    ...
    
  3. Run the command javac and ensure it results in a help message. If it outputs an error message such as javac is not recognized as internal or external command, you need to configure the PATH variable of your OS so that the OS know where your javac executable is.

B. Write, compile and run a small Java program (e.g., a HelloWorld program) on your computer. You can use any code editor to write the program but use the command prompt to compile and run the program. Here are suggested steps:

  1. Open notepad (or any other text editor)
  2. Copy-paste this code into the editor.
    public class HelloWorld {
    
        public static void main(String[] args) {
            System.out.println("Hello, World!");
        }
    }
    
  3. Save the file in an empty folder as HelloWorld.java (ensure there the case match exactly and there is no .txt at the end).
  4. Open a command prompt, and navigate to the folder where you saved the file.
    c:> cd my-java-code
    c:\my-java-code>
    
  5. Run the command javac HelloWorld.java. Ensure there are no compile errors. Note how a file called HelloWorld.class has been created in that folder.
    c:\my-java-code>javac HelloWorld.java
    
  6. Run the command java HelloWorld (no .java at the end). The output should be Hello, World!.
    c:\my-java-code>java HelloWorld
    Hello, World!
    

C. Modify the code to print something else, save, compile, and run the program again.


[Exercise] ByeWorld

Modify the code below to print "Bye World!".

class Main {
  public static void main(String[] args) {
    // add code here
  }
}



Guidance for the item(s) below:

Now that you know how to write the simplest of Java programs, you can move onto learning about basic language concepts, starting with data types.

At the end of the next section, there is an exercise ([Key Exercise] ByeWorld) that you need to submit on Coursemology. From this point onwards, programming exercises marked as [Key Exercise] need to be submitted on Coursemology.

[W2.5] Java: Data Types

W2.5a

C++ to Java → Data Types → Primitive data types

Java has a number of primitive data types, as given below:

  • byte: an integer in the range -128 to 127 (inclusive).
  • short: an integer in the range -32,768 to 32,767 (inclusive).
  • int: an integer in the range -231 to 231-1.
  • long: An integer in the range -263 to 263-1.
  • float: a single-precision 32-bit IEEE 754 floating point. This data type should never be used for precise values, such as currency. For that, you will need to use the java.math.BigDecimal class instead.
  • double: a double-precision 64-bit IEEE 754 floating point. For decimal values, this data type is generally the default choice. This data type should never be used for precise values, such as currency.
  • boolean: has only two possible values: true and false.
  • char: The char data type is a single 16-bit Unicode character. It has a minimum value of '\u0000' (or 0) and a maximum value of '\uffff' (or 65,535 inclusive).
The String type (a peek)

Java has a built-in type called String to represent strings. While String is not a primitive type, strings are used often. String values are demarcated by enclosing in a pair of double quotes (e.g., "Hello"). You can use the + operator to concatenate strings (e.g., "Hello " + "!").

You’ll learn more about strings in a later section.


W2.5b

C++ to Java → Data Types → Variables

Java is a statically-typed language in that variables have a fixed type. Here are some examples of declaring variables and assigning values to them.

int x;
x = 5;
int hour = 11;
boolean isCorrect = true;
char capitalC = 'C';
byte b = 100;
short s = 10000;
int i = 100000;

You can use any name starting with a letter, underscore, or $ as a variable name but you cannot use Java keywords as variables names. You can display the value of a variable using System.out.print or System.out.println (the latter goes to the next line after printing). To output multiple values on the same line, it’s common to use several print statements followed by println at the end.

int hour = 11;
int minute = 59;
System.out.print("The current time is ");
System.out.print(hour);
System.out.print(":");
System.out.print(minute);
System.out.println("."); //use println here to complete the line
System.out.println("done");

The current time is 11:59.
done

Use the keyword final to indicate that the variable value, once assigned, should not be allowed to change later i.e., act like a ‘constant’. By convention, names for constants are all uppercase, with the underscore character (_) between words.

final double CM_PER_INCH = 2.54;

W2.5c

C++ to Java → Data Types → Operators

Java supports the usual arithmetic operators, given below.

Operator Description Examples
+ Additive operator 2 + 3 5
- Subtraction operator 4 - 1 3
* Multiplication operator 2 * 3 6
/ Division operator 5 / 2 2 but 5.0 / 2 2.5
% Remainder operator 5 % 2 1

The following program uses some operators as part of an expression hour * 60 + minute:

int hour = 11;
int minute = 59;
System.out.print("Number of minutes since midnight: ");
System.out.println(hour * 60 + minute);

Number of minutes since midnight: 719

When an expression has multiple operators, normal operator precedence rules apply. Furthermore, you can use parentheses to specify a precise precedence.

Examples:

  • 4 * 5 - 1 19 (* has higher precedence than -)
  • 4 * (5 - 1) 16 (parentheses ( ) have higher precedence than *)

Java does not allow .

The unary operators require only one operand; they perform various operations such as incrementing/decrementing a value by one, negating an expression, or inverting the value of a boolean.-- Java Tutorial

Operator Description -- Java Tutorial example
+ Unary plus operator; indicates positive value
(numbers are positive without this, however)
x = 5; y = +x y is 5
- Unary minus operator; negates an expression x = 5; y = -x y is -5
++ Increment operator; increments a value by 1 i = 5; i++ i is 6
-- Decrement operator; decrements a value by 1 i = 5; i-- i is 4
! Logical complement operator; inverts the value of a boolean foo = true; bar = !foo bar is false

Relational operators are used to check conditions like whether two values are equal, or whether one is greater than the other. The following expressions show how they are used:

Operator Description example true example false
x == y x is equal to y 5 == 5 5 == 6
x != y x is not equal to y 5 != 6 5 != 5
x > y x is greater than y 7 > 6 5 > 6
x < y x is less than y 5 < 6 7 < 6
x >= y x is greater than or equal to y 5 >= 5 4 >= 5
x <= y x is less than or equal to y 4 <= 5 6 <= 5

The result of a relational operator is a boolean value.

Java has three conditional operators that are used to operate on boolean values.

Operator Description example true example false
&& and true && true true true && false false
|| or true || false true false || false false
! not not false not true

Resources:

W2.5d

C++ to Java → Data Types → Arrays

Arrays are indicated using square brackets ([]). To create the array itself, you have to use the new operator. Here are some example array declarations:

int[] counts;
counts = new int[4]; // create an int array of size 4

int size = 5;
double[] values;
values = new double[size]; //use a variable for the size

double[] prices = new double[size]; // declare and create at the same time
Alternatively, you can use the shortcut syntax to create and initialize an array:
int[] values = {1, 2, 3, 4, 5, 6};

int[] anArray = {
    100, 200, 300,
    400, 500, 600,
    700, 800, 900, 1000
};

-- Java Tutorial

The [] operator selects elements from an array. Array elements .

int[] counts = new int[4];

System.out.println("The first element is " + counts[0]);

counts[0] = 7; // set the element at index 0 to be 7
counts[1] = counts[0] * 2;
counts[2]++; // increment value at index 2

A Java array is aware of its size. A Java array prevents a programmer from indexing the array out of bounds. If the index is negative or not present in the array, the result is an error named ArrayIndexOutOfBoundsException.

int[] scores = new int[4];
System.out.println(scores.length) // prints 4
scores[5] = 0; // causes an exception

4
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 5
    at Main.main(Main.java:6)

It is also possible to create arrays of more than one dimension:

String[][] names = {
    {"Mr. ", "Mrs. ", "Ms. "},
    {"Smith", "Jones"}
};

System.out.println(names[0][0] + names[1][0]); // Mr. Smith
System.out.println(names[0][2] + names[1][1]); // Ms. Jones

-- Java Tutorial

Passing arguments to a program

The args parameter of the main method is an array of Strings containing command line arguments supplied (if any) when running the program.

public class Foo{
    public static void main(String[] args) {
        System.out.println(args[0]);
    }
}

You can run this program (after compiling it first) from the command line by typing:

>_ java Foo abc

abc


Resources:

Exercises:

[Key Exercise] Compare names

Write a Java program that takes two command line arguments and prints true or false to indicate if the two arguments have the same value. Follow the sample output given below.

class Main {
  public static void main(String[] args) {
      // add your code here
  }
}

>_ java Main adam eve

Words given: adam, eve
They are the same: false

>_ java Main eve eve

Words given: eve, eve
They are the same: true

Use the following technique to compare two Strings(i.e., don't use ==). Reason: to be covered in a later topic.

String x = "foo";
boolean isSame = x.equals("bar") // false
isSame = x.equals("foo") // true

Hint


partial solution





Guidance for the item(s) below:

Next up is learning how to control the flow of a Java program.

[W2.6] Java: Control Flow

W2.6a

C++ to Java → Control Flow → Branching

if-else statements

Java supports the usual forms of if statements:

if (x > 0) {
    System.out.println("x is positive");
}
if (x % 2 == 0) {
    System.out.println("x is even");
} else {
    System.out.println("x is odd");
}
if (x > 0) {
    System.out.println("x is positive");
} else if (x < 0) {
    System.out.println("x is negative");
} else {
    System.out.println("x is zero");
}
if (x == 0) {
    System.out.println("x is zero");
} else {
    if (x > 0) {
        System.out.println("x is positive");
    } else {
        System.out.println("x is negative");
    }
}

The braces are optional (but recommended) for branches that have only one statement. So we could have written the previous example this way ( Bad):

if (x % 2 == 0)
    System.out.println("x is even");
else
    System.out.println("x is odd");
switch statements

The switch statement can have a number of possible execution paths. A switch works with the byte, short, char, and int primitive data types. It also works with enums, String.

Here is an example (adapted from -- Java Tutorial):

public class SwitchDemo {
    public static void main(String[] args) {

        int month = 8;
        String monthString;
        switch (month) {
        case 1:  monthString = "January";
            break;
        case 2:  monthString = "February";
            break;
        case 3:  monthString = "March";
            break;
        case 4:  monthString = "April";
            break;
        case 5:  monthString = "May";
            break;
        case 6:  monthString = "June";
            break;
        case 7:  monthString = "July";
            break;
        case 8:  monthString = "August";
            break;
        case 9:  monthString = "September";
            break;
        case 10: monthString = "October";
            break;
        case 11: monthString = "November";
            break;
        case 12: monthString = "December";
            break;
        default: monthString = "Invalid month";
            break;
        }
        System.out.println(monthString);
    }
}

August


Exercises:

Greeter


[Key Exercise] Grade Helper

Write a Java program that takes a letter grade e.g., A+ as a command line argument and prints the CAP (aka GPA) value for that grade.

Use a switch statement in your code.

A+ A A- B+ B B- C Else
5.0 5.0 4.5 4.0 3.5 3.0 2.5 0

Follow the sample output given below.

>_ java GradeHelper B CAP for grade B is 3.5

You can assume that the input is always in the correct format i.e., no need to handle invalid input cases.

Hint




Resources:

W2.6b

C++ to Java → Control Flow → Methods

Defining methods

Here’s an example of adding more methods to a class:

public class PrintTwice {

    public static void printTwice(String s) {
        System.out.println(s);
        System.out.println(s);
    }

    public static void main(String[] args) {
        String sentence = “Polly likes crackers”
        printTwice(sentence);

    }
}

Polly likes crackers
Polly likes crackers

By convention, method names should be named in the camelCase format.

Similar to the main method, the printTwice method is public (i.e., it can be invoked from other classes) static and void.

Parameters

A method can specify parameters. The printTwice method above specifies a parameter of String type. The main method passes the argument "Polly likes crackers" to that parameter.

The value provided as an argument must have the same type as the parameter. Sometimes Java can convert an argument from one type to another automatically. For example, if the method requires a double, you can invoke it with an int argument 5 and Java will automatically convert the argument to the equivalent value of type double 5.0.

Because a variable declared inside a method only exists inside that method, such variables are called local variables. That applies to parameters of a method too. For example, In the code above, s cannot be used inside main because it is a parameter of the printTwice method and can only be used inside that method. If you try to use s inside main, you’ll get a compiler error. Similarly, inside printTwice there is no such thing as sentence. That variable belongs to main.

return statements

The return statement allows you to terminate a method before you reach the end of it:

public static void printLogarithm(double x) {
    if (x <= 0.0) {
        System.out.println("Error: x must be positive.");
        return;
    }
    double result = Math.log(x);
    System.out.println("The log of x is " + result);
}

It can be used to return a value from a method too:

public class AreaCalculator{

    public static double calculateArea(double radius) {
        double result = 3.14 * radius * radius;
        return result;
    }

    public static void main(String[] args) {
        double area = calculateArea(12.5);
        System.out.println(area);
    }
}
Overloading

Java methods can be overloaded. If two methods do the same thing, it is natural to give them the same name. Having more than one method with the same name is called overloading, and it is legal in Java as long as each version has a different method signature (the signature of the method is the method name and ordered list of parameter types) . For example, the following overloading of the method calculateArea is allowed because the method signatures are different (i.e., calculateArea(double) vs calculateArea(double, double)).

public static double calculateArea(double radius) {
    //...
}

public static double calculateArea(double height, double width) {
    //...
}
Recursion

Methods can be recursive. Here is an example in which the nLines method calls itself recursively:

public static void nLines(int n) {
    if (n > 0) {
        System.out.println();
        nLines(n - 1);
    }
}

Resources:

Exercises:

[Key Exercise] getGradeCap Method

Add the following method to the class given below.

  • public static double getGradeCap(String grade): Returns the CAP (aka GPA) value of the given grade. The mapping from grades to CAP is given below.
A+ A A- B+ B B- C Else
5.0 5.0 4.5 4.0 3.5 3.0 2.5 0

Do not change the code of the main method!

public class Main {

    // ADD YOUR CODE HERE

    public static void main(String[] args) {
        System.out.println("A+: " + getGradeCap("A+"));
        System.out.println("B : " + getGradeCap("B"));
    }
}

A+: 5.0
B : 3.5

Hint




W2.6c

C++ to Java → Control Flow → Loops

Java has while and for constructs for looping.

while loops

Here is an example while loop:

public static void countdown(int n) {
    while (n > 0) {
        System.out.println(n);
        n = n - 1;
    }
    System.out.println("Blastoff!");
}
for loops

for loops have the form:

for (initializer; condition; update) {
    statement(s);
}

Here is an example:

public static void printTable(int rows) {
    for (int i = 1; i <= rows; i = i + 1) {
        printRow(i, rows);
    }
}
do-while loops

The while and for statements are pretest loops; that is, they test the condition first and at the beginning of each pass through the loop. Java also provides a posttest loop: the do-while statement. This type of loop is useful when you need to run the body of the loop at least once.

Here is an example (from -- Java Tutorial):

class DoWhileDemo {
    public static void main(String[] args){
        int count = 1;
        do {
            System.out.println("Count is: " + count);
            count++;
        } while (count < 11);
    }
}
break and continue

A break statement exits the current loop.

Here is an example (from -- Java Tutorial):

class Main {
    public static void main(String[] args) {
        int[] numbers = new int[] { 1, 2, 3, 0, 4, 5, 0 };
        for (int i = 0; i < numbers.length; i++) {
            if (numbers[i] == 0) {
                break;
            }
            System.out.print(numbers[i]);
        }
    }
}

123

[Try the above code on Repl.it]

A continue statement skips the remainder of the current iteration and moves to the next iteration of the loop.

Here is an example (from -- Java Tutorial):

public static void main(String[] args) {
    int[] numbers = new int[] { 1, 2, 3, 0, 4, 5, 0 };
    for (int i = 0; i < numbers.length; i++) {
        if (numbers[i] == 0) {
            continue;
        }
        System.out.print(numbers[i]);
    }
}

12345

[Try the above code on Repl.it]

Enhanced for loops

Since traversing arrays is so common, Java provides an alternative for-loop syntax that makes the code more compact. For example, consider a for loop that displays the elements of an array on separate lines:

for (int i = 0; i < values.length; i++) {
    int value = values[i];
    System.out.println(value);
}

We could rewrite the loop like this:

for (int value : values) {
    System.out.println(value);
}

This statement is called an enhanced for loop. You can read it as, “for each value in values”. Notice how the single line for (int value : values) replaces the first two lines of the standard for loop.


Resources:

Exercises:

[Key Exercise] getMultipleGradeCaps Method

Add the following method to the class given below.

  • public static double[] getMultipleGradeCaps(String[] grades): Returns the CAP (aka GPA) values of the given grades. e.g., if the input was the array ["A+", "B"], the method returns [5.0, 3.5]. The mapping from grades to CAP is given below.
A+ A A- B+ B B- C Else
5.0 5.0 4.5 4.0 3.5 3.0 2.5 0

Do not change the code of the main method!

public class Main {

    // ADD YOUR CODE HERE

    public static double getGradeCap(String grade) {
        double cap = 0;
        switch (grade) {
        case "A+":
        case "A":
            cap = 5.0;
            break;
        case "A-":
            cap = 4.5;
            break;
        case "B+":
            cap = 4.0;
            break;
        case "B":
            cap = 3.5;
            break;
        case "B-":
            cap = 3.0;
            break;
        case "C":
            cap = 2.5;
            break;
        default:
        }
        return cap;
    }

    public static void main(String[] args) {
        String[] grades = new String[]{"A+", "A", "A-"};
        double[] caps = getMultipleGradeCaps(grades);
        for (int i = 0; i < grades.length; i++) {
            System.out.println(grades[i] + ":" + caps[i]);
        }
    }
}

A+:5.0
A:5.0
A-:4.5

Hint





[W2.7] RCS: Getting Started with Git and GitHub

W2.7a

Git Learning Trail → Tour 1: Recording the History of a Folder

Tour 1: Recording the History of a Folder

Target Usage: To use Git to systematically record the history of a folder in your own computer. More specifically, to use Git to save a snapshot of the folder at specific points of time.

Motivation: Recording the history of files in a folder (e.g, code files of a software project, case notes, files related to an article/book that you are authoring) can be useful in case you need to refer to past versions.

Lesson plan:

Before learning about Git, let us first understand what revision control is.

   T1L1. Introduction to Revision Control covers that part.

Before you start learning Git, you need to install some tools on your computer.

   T1L2. Preparing to Use Git covers that part.

To be able to save snapshots of a folder using Git, you must first put the folder under Git's control by initialising a Git repository in that folder.

   T1L3. Putting a Folder Under Git's Control covers that part.

To save a snapshot, you start by specifying what to include in it, also called staging.

   T1L4. Specifying What to Include in a Snapshot covers that part.

After staging, you can now proceed to save the snapshot, aka creating a commit.

   T1L5. Saving a Snapshot covers that part.

It is useful to be able to visualise the commits timeline, aka the revision graph.

   T1L6. Examining the Revision History covers that part.

T1L1. Introduction to Revision Control


Before learning about Git, let us first understand what revision control is.

This lesson covers that part.

Given below is a general introduction to revision control, adapted from bryan-mercurial-guide:

Revision control is the process of managing multiple versions of a piece of information. In its simplest form, this is something that many people do by hand: every time you modify a file, save it under a new name that contains a number, each one higher than the number of the preceding version.

Manually managing multiple versions of even a single file is an error-prone task, though, so software tools to help automate this process have long been available. The earliest automated revision control tools were intended to help a single user to manage revisions of a single file. Over the past few decades, the scope of revision control tools has expanded greatly; they now manage multiple files, and help multiple people to work together. The best modern revision control tools have no problem coping with thousands of people working together on projects that consist of hundreds of thousands of files.

There are a number of reasons why you or your team might want to use an automated revision control tool for a project.

  • It will track the history and evolution of your project, so you don't have to. For every change, you'll have a log of who made it; why they made it; when they made it; and what the change was.
  • It makes it easier for you to collaborate when you're working with other people. For example, when people more or less simultaneously make potentially incompatible changes, the software will help you to identify and resolve those conflicts.
  • It can help you to recover from mistakes. If you make a change that later turns out to be an error, you can revert to an earlier version of one or more files. In fact, a good revision control tool will even help you to efficiently figure out exactly when a problem was introduced.
  • It will help you to work simultaneously on, and manage the drift between, multiple versions of your project.

Most of these reasons are equally valid, at least in theory, whether you're working on a project by yourself, or with a hundred other people.

A revision is the state of a piece of information at a specific point in time, resulting from changes made to it e.g., if you modify the code and save the file, you have a new revision (or a new version) of that file. Some seem to use this term interchangeably with version while others seem to distinguish the two -- here, let us treat them as the same, for simplicity.
Revision Control Software (RCS) are the software tools that automate the process of Revision Control i.e., managing revisions of software . RCS are also known as Version Control Software (VCS), and by a few other names.

Git is the most widely used RCS today. Other RCS tools include Mercurial, Subversion (SVN), Perforce, CVS (Concurrent Versions System), Bazaar, TFS (Team Foundation Server), and Clearcase.

Github is a web-based project hosting platform for projects using Git for revision control. Other similar services include GitLab, BitBucket, and SourceForge.


T1L2. Preparing to Use Git


Before you start learning Git, you need to install some tools on your computer.

This lesson covers that part.

Installing Git

Git is a free and open source software used for revision control. To use Git, you need to install Git on your computer.

PREPARATION: Install Git

Download the Git installer from the official Git website.
Run the installer and make sure to select the option to install Git Bash when prompted.

Screenshots given below provide some guidance on the dialogs you might encounter when installing Git. In other cases, go with the default option.






When running Git commands, we recommend Windows users to use the Git Bash terminal that comes with Git. To open Git Bash terminal, hit the key and type git-bash.

It may be possible that the installation didn't add a shortcut to the Start Menu. You can navigate to the directory where git-bash.exe is (most likely C:\Program Files\Git\git-bash.exe), double click git-bash.exe to open Git Bash.
You can also right-click it and choose Pin to Start or Pin to taskbar.

SIDEBAR: Git Bash Terminal

Git Bash is a terminal application that lets you use Git from the command line on Windows. Since Git was originally developed for Unix-like systems (like Linux and macOS), Windows does not come with a native shell that supports all the commands and utilities commonly used with Git.

Git Bash provides a Unix-like command-line environment on Windows. It includes:

  • A Bash shell (Bash stands for Bourne Again SHell), which is a widely used command-line interpreter on Linux and macOS.
  • Common Unix tools and commands (like ls, cat, ssh, etc.) that are useful when working with Git and scripting.

When copy-pasting text onto a Git Bash terminal, you will not be able to use the familiar Ctrl+V key combo to paste. Instead, right-click on the terminal and use the Paste menu option.


Install homebrew if you don't already have it, and then, run brew install git


Use your Linux distribution's package manager to install Git. Examples:

  • Debian/Ubuntu, run sudo apt-get update and then sudo apt-get install git.

  • Fedora: run sudo dnf update and then sudo dnf install git.


Verify Git is installed, by running the following command in a terminal.

git --version
git version 2._._

The output should display the version number.

On Windows, you might need to close and open the terminal again for it to recognise changes done elsewhere in the computer (e.g., newly-installed software, changes to system variables, etc.).


Configuring user.name and user.email

Git needs to know who you are to record changes properly. When you save a snapshot of your work in Git, it records your name and email as the author of that change. This ensures everyone working on the project can see who made which changes. Accordingly, you should set the config settings user.name and user.email before you start Git for revision control.

PREPARATION: Set user.name and user.email

To set the two config settings, run the following commands in your terminal window:

git config --global user.name "<your-name>"
git config --global user.email "<your-email@example.com>"

To check if they are set as intended, you can use the following two commands:

git config --global user.name
git config --global user.email

Interacting with Git: CLI vs GUI

Git is fundamentally a command-line tool. You primarily interact with it through its by typing commands. This gives you full control over its features and helps you understand what’s really happening under the hood.

clients for Git also exist, such as Sourcetree, GitKraken, and the built-in Git support in editors like Intellij IDEA and VS Code. These tools provide a more visual way to perform some Git operations.

If you're new to Git, it's best to learn the CLI first. The CLI is universal, always available (even on servers), and helps you build a solid understanding of Git’s concepts. You can use GUI clients as a supplement — for example, to visualise complex history structures.

Mastering the CLI gives you confidence and flexibility, while GUI tools can serve as helpful companions.

PREPARATION: [Optional] Install a GUI client

Optionally, you can install a Git GUI client. e.g., Sourcetree (installation instructions).

Our Git lessons show how to perform Git operations in Git CLI, and in Sourcetree -- the latter just to illustrate how Git GUIs work. It is perfectly fine for you to learn the CLI only.


[image credit: https://www.sourcetreeapp.com]



T1L3. Putting a Folder Under Git's Control


To be able to save snapshots of a folder using Git, you must first put the folder under Git's control by initialising a Git repository in that folder.

This lesson covers that part.

Normally, we use Git to manage a revision history of a specific folder, which gives us the ability to revision-control any file in that folder and its subfolders.

To put a folder under the control of Git, we initialise a repository (short name: repo) in that folder. This way, we can initialise repos in different folders, to revision-control different clusters of files independently of each other e.g., files belonging to different projects.

You can follow the hands-on practical below to learn how to initialise a repo in a folder.

What is this? HANDS-ON panels contain hands-on activities you can do as you learn Git. If you are new to Git, we strongly recommend that you do them yourself (even if they appear straightforward), as hands-on usage will help you internalise the concepts and operations better.

HANDS-ON: Initialise a git repo in a folder

1 First, choose a folder. The folder may or may not have any files in it already. For this practical, let us create a folder named things for this purpose.

cd my-projects
mkdir things

2 Then cd into it.

cd things

3 Run the git status command to check the status of the folder.

git status
fatal: not a git repository (or any of the parent directories): .git

Don't panic. The error message is expected. It confirms that the folder currently does not have a Git repo.

4 Now, initialise a repository in that folder.

Use the command git init which should initialise the repo.

git init
Initialized empty Git repository in things/.git/

The output might also contain a hint about a name for an initial branch (e.g., hint: Using 'master' as the name for the initial branch ...). You can ignore that for now.

Note how the output mentions the repo being created in things/.git/ (not things/). More on that later.


  • Windows: Click FileClone/New… → Click on + Create button on the top menu bar.

    Enter the location of the directory and click Create.

  • Mac: New...Create Local Repository (or Create New Repository) → Click ... button to select the folder location for the repository → click the Create button.


done!

Initialising a repo results in two things:

  • First, Git now recognises this folder as a Git repository, which means it can now help you track the version history of files inside this folder.
HANDS-ON: Verifying a folder is a Git repo

To confirm, you can run the git status command. It should respond with something like the following:

git status
On branch master

No commits yet

nothing to commit (create/copy files and use "git add" to track)

Don't worry if you don't understand the output (we will learn about them later); what matters is that it no longer gives an error message as it did before.

done!

  • Second, Git created a hidden subfolder named .git inside the things folder. This folder will be used by Git to store metadata about this repository.

A Git-controlled folder is divided into two main parts:

  1. The repository – stored in the hidden .git subfolder, which contains all the metadata and history.
  2. The working directory – everything else in that folder, where you create and edit files.

What is this? DETOUR panels contain related directions you can optionally explore. We recommend that you only skim them the first time you are going through a tour (i.e., just to know what each detour covers); you can revisit them later, to deepen your knowledge further, or when you encounter a use case related to the concepts covered by the detour.

DETOUR: Undoing a Repo Initialisation

When Git initialises a repo in a folder, it does not touch any files in the folder, other than create the .git folder its contents. So, reversing the operation is as simple as deleting the newly-created .git folder.

git status #run this to confirm a repo exists

rm -rf .git  #delete the .git folder

git status #this should give an error, as the repo no longer exists


T1L4. Specifying What to Include in a Snapshot


To save a snapshot, you start by specifying what to include in it, also called staging.

This lesson covers that part.

Git considers new files that you add to the working directory as 'untracked' i.e., Git is aware of them, but they are not yet under Git's control. The same applies to files that existed in the working folder at the time you initialised the repo.

A Git repo has an internal space called the staging area which it uses to build the next snapshot. Another name for the staging area is the index.

We can stage an untracked file to tell Git that we want its current version to be included in the next snapshot. Once you stage an untracked file, it becomes 'tracked' (i.e., under Git's control). A staged file can be unstaged to indicate that we no longer want it to be included in the next snapshot.

In the example below, you can see how staging files change the status of the repo as you go from (a) to (c).

Working Directory
.git Folder

staging area

[empty]

other metadata ...


├─ fruits.txt (untracked!)
└─ colours.txt (untracked!)


(a) State of the repo, just after initialisation, and creating two files. Both are untracked.
Working Directory
.git Folder

staging area

└─ fruits.txt

other metadata ...


├─ fruits.txt (tracked)
└─ colours.txt (untracked!)


(b) State after staging fruits.txt.
Working Directory
.git Folder

staging area

├─ fruits.txt
└─ colours.txt

other metadata ...


├─ fruits.txt (tracked)
└─ colours.txt (tracked)


(c) State after staging colours.txt.
HANDS-ON: Adding untracked files

1 First, add a file (e.g., fruits.txt) to the things folder.

Here is an easy way to do that with a single terminal command.

echo -e "apples\nbananas\ncherries" > fruits.txt
things/fruits.txt
apples
bananas
cherries

Windows users: Use the git-bash terminal to run the above command (and all commands given in these lessons). Some of them might not work in other terminals such as the PowerShell.

To see the content of the file, yoy can use the cat command:

cat fruits.txt

2 Stage the new file.

2.1 Check the status of the folder using the git status command.

git status
On branch master

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)

  fruits.txt
nothing added to commit but untracked files present (use "git add" to track)

2.2 Use the git add <file(s)> command to stage the file.

git add fruits.txt

You can replace the add with stage (e.g., git stage fruits.txt) and the result is the same (they are synonyms).

Windows users: When using the echo command to write to text files from Git Bash, you might see a warning LF will be replaced by CRLF the next time Git touches it when Git interacts with such a file. This warning is caused by the way line endings are handled differently by Git and Windows. You can simply ignore it, or suppress it in future by running the following command:

git config --global core.safecrlf false

2.3 Check the status again. You can see the file is no longer 'untracked'.

git status
On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

      new file:   fruits.txt

As before, don't worry if you don't understand the content of the output (we'll unpack it in a later lesson). The point to note is that the file is no longer listed as 'untracked'.


2.1 Note how the file is shown as ‘unstaged’. The question mark icon indicates the file is untracked.

If the newly-added file does not show up in Sourcetree UI, refresh the UI (: F5
| +R)

Sourcetree screenshots/instructions: vs

Note that Sourcetree UI can vary slightly between Windows and Mac versions. Some of the screenshots given in our lessons are from the Windows version while some are from the Mac version.

In som cases, we have specified how they differ.
In other cases, you may need to adapt if the given screenshots/instructions are slightly different from what you are seeing in your Sourcetree.

2.2 Stage the file:

Select the fruits.txt and click on the Stage Selected button.

Staging can be done using tick boxes or the ... menu in front of the file.

2.3 Note how the file is staged now i.e., fruits.txt appears in the Staged files panel now.

If Sourcetree shows a \ No newline at the end of the file message below the staged lines (i.e., below the cherries line in the above screenshot), that is because you did not hit enter after entering the last line of the file (hence, Git is not sure if that line is complete). To rectify, move the cursor to the end of the last line in that file and hit enter (like you are adding a blank line below it). This new change will now appear as an 'unstaged' change. Stage it as well.


done!

If you modify a staged file, it goes into the 'modified' state i.e., the file contains modifications that are not present in the copy that is waiting (in the staging area) to be included in the next snapshot. If you wish to include these new changes in the next snapshot, you need to stage the file again, which will overwrite the copy of the file that was previously in the staging area.
The example below shows how the status of a file changes when it is modified after it was staged.

Working Directory
.git Folder

staging area

names.txt
Alice

other metadata ...


names.txt
Alice

(a) The file names.txt is staged. The copy in the staging area is an exact match to the one in the working directory.
Working Directory
.git Folder

staging area

names.txt
Alice

other metadata ...


names.txt (modified)
Alice
Bob

(b) State after adding a line to the file. Git indicates it as 'modified' because it now differs from the version in the staged area.
Working Directory
.git Folder

staging area

names.txt
Alice
Bob

other metadata ...


names.txt
Alice
Bob

(c) After staging the file again, the staging area is updated with the latest copy of the file, and it is no longer marked as 'modified'.
HANDS-ON: Re-staging 'modified' files

1 First, add another line to fruits.txt, to make it 'modified'.

Here is a way to do that with a single terminal command.

echo "dragon fruits" >> fruits.txt
things/fruits.txt
apples
bananas
cherries
dragon fruits

2 Now, verify that Git sees that file as 'modified'.

Use the git status command to check the status of the working directory.

$ git status
On branch master

No commits yet

Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file:   fruits.txt

Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified:   fruits.txt

Note how fruits.txt now appears twice, once as new file: ... (representing the version of the file we staged earlier, which had only three lines) and once as modified: ... (representing the latest version of the file which now has a fourth line).


Note how fruits.txt appears in the Staged files panel as well as 'Unstaged files'.


3 Stage the file again, the same way you added/staged it earlier.

4 Verify that Git no longer sees it as 'modified', similar to step 2.

done!

Staging applies regardless of whether a file is currently tracked.

  • Staging a tracked file will both begin tracking the file and include it in the next snapshot.
  • Staging an already tracked file will simply mark its current changes for inclusion in the next commit.

Git also supports fine-grained selective staging i.e., staging only specific changes within a file while leaving other changes to the same file unstaged. This will be covered in a later lesson.

Git does not track empty folders. It tracks only folders that contain tracked files.
You can test this by adding an empty subfolder inside the things folder (e.g., things/more-things) and checking if it shows up as 'untracked' (it will not). If you add a file to that folder (e.g., things/more-things/food.txt) and then staged that file (e.g., git add more-things/food.txt), the folder will now be included in the next snapshot.

PRO-TIP: Applying a Git command to multiple files in one go

When a Git command expects a list of files or paths as a parameter (as the git add command does), there are several notations you can use to specify them. Given below are some of them, with example using the git add command:

Specify multiple files, separated by spaces:

git add f1.txt f2.txt data/lists/f3.txt  # stages the specified three files

Use a glob pattern:

git add *.txt  # stages all .txt files in the current directory

Use . to indicate 'all in the current directory and subdirectories':

git add .  # stages all files in current directory and its subdirectories

Specific directory, to indicate 'this directory and its subdirectories':

git add path/to/dir  # stages all files in path/to/dir and its subdirectories

DETOUR: Staging File Deletions

When you delete a tracked file from your working directory, Git doesn’t automatically assume you want that change to be part of your next commit. To tell Git you intend to record a file deletion in the repository’s history, you need to stage the deletion explicitly.

When you stage a deleted file, you’re adding the removal of the file to the staging area, just like you’d stage a modified or newly created file. After staging, the next commit will reflect that the file was removed from the project.

To delete a file and stage the deletion in one go, you can use the git rm <file-name(s)> command. It removes the file from the working directory and stages the deletion at the same time.

git rm data/list.txt plan.txt

If you’ve already deleted the file manually (for example, using rm or deleting it in your file explorer), you can still stage the deletion using the git add <file-name(s)> command. Even though the file no longer exists, git add records its deletion into the staging area.

git add data/list.txt

Staging a file deletion is done similar to staging other changes.



DETOUR: Unstaging Changes

You can unstage a staged file, which simply removes it from the staging area but keeps the changes in your working directory. This is useful if you later realise that you don’t actually want to include a staged file in the next commit — perhaps you staged it by mistake, or you want to include that change in a later commit.

  • To unstage a file you added or modified, run git restore --staged <file(s)>. This command removes the file from the staging area, leaving your working directory untouched.

    git restore --staged plan.txt budget.txt data/list.txt
    

    If your repo does not have any commits yet, git restore --staged will fail with the error fatal: could not resolve HEAD.
    The remedy is to use git reset <file(s)> instead.

    git reset plan.txt
    

    In fact, git reset is an alternative way of unstaging files, and it works regardless of whether you have any commits.

    Wait. Then why does git restore --staged exists at all, given it is more verbose and doesn't even work in some special cases?
    Answer: It is still considered the "modern" way of unstaging files (it was introduced more recently), because it is more intuitive and purpose-specific -- whereas the git reset serves multiple purposes and, if used incorrectly, can cause unintended consequences.

    The restore command can accept multiple files/paths as input, which means you can use the notation for specifying multiple files. For example, to unstage all changes you've staged, you can use the git restore --staged .

  • To unstage a file deletion (staged using git rm), use the same command as above. It will unstage the deletion and restore the file in the staging area.
    If you also deleted the file from your working directory, you may need to recover it separately with git restore <file-name(s)>

    git restore data/list.txt data/plan.txt
    

To unstage a file, locate the file among the staged files section, click the ... in front the file, and choose Unstage file:




T1L5. Saving a Snapshot


After staging, you can now proceed to save the snapshot, aka creating a commit.

This lesson covers that part.

Saving a snapshot is called committing and a saved snapshot is called a commit.

A Git commit is a full snapshot of your working directory based on the files you have staged, more precisely, a record of the exact state of all files in the staging area (index) at that moment -- even the files that have not changed since the last commit. This is in contrast to other revision control software that only store the in a commit. Consequently, a Git commit has all the information it needs to recreate the snapshot of the working directory at the time the commit was created.
A commit also includes metadata such as the author, date, and an optional commit message describing the change.

A Git commit is a snapshot of all tracked files, not simply a delta of what changed since the last commit.

HANDS-ON: Creating your first commit

Assuming you have previously staged changes to the fruits.txt, go ahead and create a commit.

1 First, let us do a sanity check using the git status command.

git status
On branch master

No commits yet

Changes to be committed:
(use "git rm --cached <file>..." to unstage)
  new file:   fruits.txt

2 Now, create a commit using the commit command. The -m switch is used to specify the commit message.

git commit -m "Add fruits.txt"
[master (root-commit) d5f91de] Add fruits.txt
 1 file changed, 5 insertions(+)
 create mode 100644 fruits.txt

3 Verify the staging area is empty using the git status command again.

git status
On branch master
nothing to commit, working tree clean

Note how the output says nothing to commit which means the staging area is now empty.


Click the Commit button, enter a commit message (e.g. add fruits.txt) into the text box, and click Commit.


done!

Related DETOUR: Updating the Last Commit

Git allows you to amend the most recent commit. This is useful when you realise there’s something you’d like to change — e.g., fix a typo in the commit message, or to exclude some unintended change from the commit.

That aspect is covered in a detour in the lesson T5L3. Reorganising Commits.


Related DETOUR: Resetting Uncommitted Changes

At times, you might need to get rid of uncommitted changes so that you have a fresh start to the next commit.

That aspect is covered in a detour in the lesson T4L5. Rewriting History to Start Over.


Related DETOUR: Undoing/Deleting Recent Commits

How do you undo or delete the last few commits if you realise they were incorrect, unnecessary, or done too soon?

That aspect is covered in a detour in the lesson T4L5. Rewriting History to Start Over.



T1L6. Examining the Revision History


It is useful to be able to visualise the commits timeline, aka the revision graph.

This lesson covers that part.

Git commits form a timeline, as each corresponds to a point in time when you asked Git to take a snapshot of your working directory. Each commit links to at least one previous commit, forming a structure that we can traverse.
A timeline of commits is called a branch. By default, Git names the initial branch master -- though many now use main instead. You'll learn more about branches in future lessons. For now, just be aware that the commits you create in a new repo will be on a branch called master (or main) by default.

gitGraph
    %%{init: { 'theme': 'default', 'gitGraph': {'mainBranchName': 'master (or main)'}} }%%
    commit id: "Add fruits.txt"
    commit id: "Update fruits.txt"
    commit id: "Add colours.txt"
    commit id: "..."

Git can show you the list of commits in the Git history.

HANDS-ON: Viewing the list of commits

Preparation Use the things repo you created in a previous lesson. Alternatively, you can use the commands given below to create such a repo from scratch.

mkdir things  # create a folder for the repo
cd things
git init
echo -e "apples\nbananas\ncherries" > fruits.txt
git add fruits.txt
git commit -m "Add fruits.txt"

You can copy-paste a list of commands (such as commands given above), including any comments, to the terminal. After that, hit enter to run them in sequence.

1 View the list of commits, which should show just the one commit you created just now.

You can use the git log command to see the commit history.

git log
commit ... (HEAD -> master)
Author: ... <...@...>
Date:   ...

Add fruits.txt

Use the Q key to exit the output screen of the git log command.

Note how the output has some details about the commit you just created. You can ignore most of it for now, but notice it also shows the commit message you provided.


Expand the BRANCHES menu and click on the master to view the history graph, which contains only one node at the moment, representing the commit you just added. For now, ignore the label master attached to the commit.


2 Create a few more commits (i.e., a few rounds of add/edit files → stage → commit), and observe how the list of commits grows.

Here is an example list of bash commands to add two commits while observing the list of commits

echo "figs" >> fruits.txt  # add another line to fruits.txt
git add fruits.txt  # stage the updated file
git commit -m "Insert figs into fruits.txt"  # commit the changes
git log  # check commits list

echo "a file for colours" >> colours.txt  # add a colours.txt file
echo "a file for shapes" >> shapes.txt  # add a shapes.txt file
git add colours.txt shapes.txt  # stage both files in one go
git commit -m "Add colours.txt, shapes.txt"  # commit the changes
git log  # check commits list

The output of the final git log should be something like this:

commit ... (HEAD -> master)
Author: ... <...@...>
Date:   ...

    Add colours.txt, shapes.txt

commit ...
Author: ... <...@...>
Date:   ...

    Insert figs into fruits.txt

commit ...
Author: ... <...@...>
Date:   ...

    Add fruits.txt

SIDEBAR: Working with the 'less' pager

Some Git commands — such as git log— may show their output through a pager. A pager is a program that lets you view long text one screen at a time, so you don’t miss anything that scrolls off the top. For example, the output of git log command will temporarily hide the current content of the terminal, and enter the pager view that shows output one screen at a time. When you exit the pager, the git log output will disappear from view, and the previous content of the terminal will reappear.

command 1
output 1

git log


commit f761ea63738a...
Author: ... <...@...>
Date:   Sat ...

    Add colours.txt

By default, Git uses a pager called less. Given below are some useful commands you can use inside the less pager.

Command Description
q Quit less and return to the terminal
or j Move down one line
or k Move up one line
Space Move down one screen
b Move up one screen
G Go to the end of the content
g Go to the beginning of the content
/pattern Search forward for pattern (e.g., /fix)
n Repeat the last search (forward)
N Repeat the last search (backward)
h Show help screen with all less commands

If you’d rather see the output directly, without using a pager, you can add the --no-pager flag to the command e.g.,

git --no-pager log

It is possible to ask Git to not use less at all, use a different pager, or fine-tune how less is used. For example, you can reduce Git's use of the pager (recommended), using the following command:

git config --global core.pager "less -FRX"

Explanation:

  • -F : Quit if the output fits on one screen (don’t show pager unnecessarily)
  • -R : Show raw control characters (for coloured Git output)
  • -X : Keep content visible after quitting the pager (so output stays on the terminal)

To see the list of commits, click on the History item (listed under the WORKSPACE section) on the menu on the right edge of Sourcetree.

After adding two more commits, the list of commits should look something like this:


done!

The Git data model consists of two types of entities: objects and refs (short for references). In this lesson, you will encounter examples of both.

A Git revision graph is a visualisation of a repo's revision history, consisting of one or more branches. First, let us learn to work with simpler revision graphs consisting of one branch, such as the one given below.

C3
|
C2
|
C1

  • Nodes in the revision graph represent commits. A commit is one of four main types of Git objects. For completeness, the other three are:
    • blob (short for binary large object): stores the contents of a file
    • tree: represents a directory and records the hierarchy of its contents by referencing blobs and other trees
    • tag (specifically, annotated tag): a label-like object that can store additional metadata and point to a specific commit
  • A commit is identified by its SHA value. A SHA (Secure Hash Algorithm) value is a unique identifier generated by Git to represent each commit. It is produced by using SHA-1 (i.e., one of the algorithms in the SHA family of cryptographic hash functions) on the entire content of the commit. It's a 40-character hexadecimal string (e.g., f761ea63738a67258628e9e54095b88ea67d95e2) that acts like a fingerprint, ensuring that every commit can be referenced unambiguously. That is, every commit has a unique SHA-1 hash value.
  • A commit is a full snapshot of the working directory, constructed based on the previous commit, and the changes staged. That means each commit (except the initial commit) is based on a another 'parent' commit. Some commits can have multiple parent commits -- we’ll cover that later.

Given every commit has a unique hash, the commit hash values you see in our examples will be different from the hash values of your own commits, for example, when following our hands-on practicals.

Edges in the revision graph represent links between a commit and its parent commit(s). In some revision graph visualisations, you might see arrows (instead of lines) showing how each commit points to its parent commit.

C3
C2
C1

Git uses refs to name and keep track of various points in a repository’s history. These refs are essentially 'named-pointers' that can serve as bookmarks to reach a certain point in the revision graph using the ref name.

C3 masterHEAD
|
C2
|
C1

In the revision graph above, there are two refs master and  HEAD.

  • master is a branch ref. A branch ref points to the latest commit on a branch. In this visualisation, the commit shown alongside the ref is the one it points to i.e., C3.
    When you create a new commit, the branch ref of the branch moves to the new commit.
  • HEAD is a special ref that typically points to the current branch and moves along with that branch ref. In this example, it is pointing to the master branch.
    In certain cases, the HEAD may point directly to a specific commit instead of a branch. This situation is called a "detached HEAD", which will be covered in a later lesson.
HANDS-ON: View the revision graph

Target Use Git features to examine the revision graph of a simple repo.

Preparation Use a repo with just a few commits and only one branch.

1 First, use a simple git log to view the list of commits.

git log
commit f761ea63738a... (HEAD -> master)
Author: ... <...@...>
Date:   Sat ...

    Add colours.txt, shapes.txt

commit 2bedace69990...
Author: ... <...@...>
Date:   Sat ...

    Add figs to fruits.txt

commit d5f91de5f0b5...
Author: ... <...@...>
Date:   Fri ...

    Add fruits.txt

Given below the visual representation of the same revision graph. As you can see, the log output shows the refs slightly differently, but it is not hard to see what they mean.

C3 masterHEADAdd colours.txt, shapes.txt
|
C2Add figs to fruits.txt
|
C1Add fruits.txt

2 Use the --oneline flag to get a more concise view. Note how the commit SHA has been truncated to first seven characters (first seven characters of a commit SHA is enough for Git to identify a commit).

git log --oneline
f761ea6 (HEAD -> master, origin/master) Add colours.txt, shapes.txt
2bedace Add figs to fruits.txt
d5f91de Add fruits.txt

3 The --graph flag makes the result closer to a graphical revision graph. Note the * that indicates a node in a revision graph.

git log --oneline --graph
* f761ea6 (HEAD -> master, origin/master) Add colours.txt, shapes.txt
* 2bedace Add figs to fruits.txt
* d5f91de Add fruits.txt

The --graph option is more useful when examining a more complicated revision graph consisting of multiple parallel branches.


Click the History to see the revision graph.

  • In some versions of Sourcetree, the HEAD ref may not be shown -- it is implied that the HEAD ref is pointing to the same commit the currently active branch ref is pointing.


done!


At this point: You should now be able to initialise a Git repository in a folder and commit snapshots of its files at times of your choice. So far, you did not learn how to actually make use of those snapshots (other than to show a list of them) -- we will do that in later tours.

What's next: Tour 2: Backing up a Repo on the Cloud


W2.7b

Git Learning Trail → Tour 2: Backing up a Repo on the Cloud

Tour 2: Backing up a Repo on the Cloud

Target Usage: To back up a Git repository on a cloud-based Git service such as GitHub.

Motivation: One (of several) benefits of maintaining a copy of a repo on a cloud server: it acts as a safety net (e.g., against the folder becoming inaccessible due to a hardware fault).

Lesson plan:

To back up your Git repo on the cloud, you’ll need to use a remote repository service, such as GitHub.

   T2L1. Remote Repositories covers that part.

To use GitHub, you need to sign up for an account, and configure related tools/settings first.

   T2L2. Preparing to use GitHub covers that part.

The first step of backing up a local repo on GitHub: create an empty repository on GitHub.

   T2L3. Creating a Repo on GitHub covers that part.

The second step of backing up a local repo on GitHub: link the local repo with the remote repo on GitHub.

   T2L4. Linking a Local Repo With a Remote Repo covers that part.

The third step of backing up a local repo on GitHub: push a copy of the local repo to the remote repo.

   T2L5. Updating the Remote Repo covers that part.

Git allows you to specify which files should be omitted from revision control.

   T2L6. Omitting Files from Revision Control covers that part.

T2L1. Remote Repositories


To back up your Git repo on the cloud, you’ll need to use a remote repository service, such as GitHub.

This lesson covers that part.

A repo you have on your computer is called a local repo. A remote repo is a repo hosted on a remote computer and allows remote access. Some use cases for remote repositories:

  • as a backup of your local repo
  • as an intermediary repo to work on the same files from multiple computers
  • for sharing the revision history of a codebase among team members of a multi-person project

It is possible to set up a Git remote repo on your own server, but an easier option is to use a remote repo hosting service such as GitHub.


T2L2. Preparing to use GitHub


To use GitHub, you need to sign up for an account, and configure related tools/settings first.

This lesson covers that part.

GitHub is a web-based service that hosts Git repositories and adds collaboration features on top of Git. Two other similar platforms are GitLab and Bitbucket. While Git manages version control locally, such platforms provide additional features such as shared access to repositories, issue tracking, code reviews, and permission controls. They are widely used in software development projects, for both open-source software (OSS) and closed-source software projects.

On GitHub, a Git repo can be put in one of two spaces:

  • A GitHub user account represents an individual user. It is created when you sign up for GitHub and includes a username, profile page, and personal settings. With a user account, you can create your own repositories, contribute to others’ projects, and manage collaboration settings for any repositories you own.
  • A GitHub organisation (org for short) is a shared account used by a group such as a team, company, or open-source project. Organisations can own repositories and manage access to them through teams, roles, and permissions. Organisations are especially useful when managing repositories with shared ownership or when working at scale.

Every GitHub user must have a user account, even if they primarily work within an organisation.

PREPARATION: Create a GitHub account

Create a personal GitHub account as described in GitHub Docs → Creating an account on GitHub, if you don't have one yet.

Choose a sensible GitHub username as you are likely to use it for years to come in professional contexts e.g., in job applications.

[Optional, but recommended] Set up your GitHub profile, as explained in GitHub Docs → Setting up your profile.


Before you can interact with GitHub from your local Git client, you need to set up authentication. In the past, you could simply enter your GitHub username and password, but GitHub no longer accepts passwords for Git operations. Instead, you’ll use a more secure method — such as a Personal Access Token (PAT) or SSH keys — to prove your identity.

A Personal Access Token (PAT) is essentially a long, random string that acts like a password, but it can be scoped to specific permissions (e.g., read-only or full access) and revoked at any time. This makes it more secure and flexible than a traditional password.

Git supports two main protocols for communicating with GitHub: HTTPS and SSH.

  • With HTTPS, you connect over the web and authenticate using your GitHub username and a Personal Access Token.
  • With SSH, you connect using a cryptographic key pair you generate on your machine. Once you add your public key to your GitHub account, GitHub recognises your machine and lets you authenticate without typing anything further.

PREPARATION: Set up authentication with GitHub

Set up your computer's GitHub authentication, as described in the se-edu guide Setting up GitHub Authentication.


GitHub associates a commit to a user based on the email address in the commit metadata. When you push a commit, GitHub checks if the email matches a verified email on a GitHub account. If it does, the commit is shown as authored by that user. If the email doesn’t match any account, the commit is still accepted but won’t be linked to any profile.

GitHub provides a no-reply email (e.g., 12345678+username@users.noreply.github.com) that you can use as your Git user.email to hide your real email while still associating commits with your GitHub account.

PREPARATION: [Optional] Configure user.email to use the no-reply email from GitHub

If you prefer not to include your real email address in commits, you can do the following:

  1. Find your no-reply email provided by GitHub: Navigate to the email settings of your GitHub account and select the option to Keep my email address private. The no-reply address will then be displayed, typically in the format ID+USERNAME@users.noreply.github.com.

  2. Update your user.email with that email address e.g.,

    git config --global user.email "12345678+username@users.noreply.github.com"
    

GitHub offers its own clients to make working with GitHub more convenient.

  • The GitHub Desktop app provides a GUI for performing GitHub operations from your desktop, without needing to visit the GitHub web UI.
  • The GitHub CLI (gh) brings GitHub-specific commands to your terminal, letting you perform operations on GitHub from your command line.

T2L3. Creating a Repo on GitHub


The first step of backing up a local repo on GitHub: create an empty repository on GitHub.

This lesson covers that part.

You can create a remote repository based on an existing local repository, to serve as a remote copy of your local repo. For example, suppose you created a local repo and worked with it for a while, but now you want to upload it onto GitHub. The first step is to create an empty repository on GitHub.

HANDS-ON: Creating an empty remote repo

1 Login to your GitHub account and choose to create a new repo.

2 In the next screen, provide a name for your repo. Refer the screenshot below on some guidance on how to provide the required information.

Click Create repository button to create the new repository.

If you enable any of the three Add _____ options shown above, GitHub will not only create a repo, but will also initialise it with some initial content. That is not what we want here. To create an empty remote repo, keep those options disabled.

3 Note the URL of the repo. It will be of the form
https://github.com/{your_user_name}/{repo_name}.git.
e.g., https://github.com/johndoe/foobar.git (note the .git at the end)

done!


T2L4. Linking a Local Repo With a Remote Repo


The second step of backing up a local repo on GitHub: link the local repo with the remote repo on GitHub.

This lesson covers that part.

A Git remote is a reference to a repository hosted elsewhere, usually on a server like GitHub, GitLab, or Bitbucket. It allows your local Git repo to communicate with another remote copy — for example, to upload locally-created commits that are missing in the remote copy.

By adding a remote, you are informing the local repo details of a remote repo it can communicate with, for example, where the repo exists and what name to use to refer to the remote.

The URL you use to connect to a remote repo depends on the protocol — HTTPS or SSH:

  • HTTPS URLs use the standard web protocol and start with https://github.com/ (for GitHub users). e.g.,
    https://github.com/username/repo-name.git
    
  • SSH URLs use the secure shell protocol and start with git@github.com:. e.g.,
    git@github.com:username/repo-name.git
    

A Git repo can have multiple remotes. You simply need to specify different names for each remote (e.g., upstream, central, production, other-backup ...).

HANDS-ON: Add a remote to a repo

Add the empty remote repo you created on GitHub as a remote of a local repo you have.

1 In a terminal, navigate to the folder containing the local repo things you created earlier.

2 List the current list of remotes using the git remote -v command, for a sanity check. No output is expected if there are no remotes yet.

3 Add a new remote repo using the git remote add <remote-name> <remote-url> command.
i.e., if using HTTPS, git remote add origin https://github.com/{YOUR-GITHUB-USERNAME}/things.git

git remote add origin https://github.com/JohnDoe/things.git  # using HTTPS
git remote add origin git@github.com:JohnDoe/things.git  # using SSH

4 List the remotes again to verify the new remote was added.

git remote -v
origin  https://github.com/johndoe/things.git (fetch)
origin  https://github.com/johndoe/things.git (push)

The same remote will be listed twice, to show that you can do two operations (fetch and push) using this remote. You can ignore that for now. The important thing is the remote you added is being listed.


1 Open the local repo in Sourcetree.

2 Open the dialog for adding a remote, as follows:

Choose RepositoryRepository Settings menu option.
Choose RepositoryRepository Settings... → Choose Remotes tab.

3 Add a new remote to the repo with the following values.

  • Remote name: the name you want to assign to the remote repo i.e., origin
  • URL/path: the URL of your remote repo
    i.e., https://github.com/{YOUR-GITHUB-USERNAME}/things.git
  • Username: your GitHub username

4 Verify the remote was added by going to RepositoryRepository Settings again.


5 Add another remote, to verify that a repo can have multiple remotes. You can use any name (e.g., backup and any URL for this).

done!

DETOUR: Managing Details of a Remote

To change the URL of a remote (e.g., origin), use git remote set-url <remote-name> <new-url> e.g.,

git remote set-url origin https://github.com/user/repo.git

To rename a remote, use git remote rename <old-name> <new-name> e.g.,

git remote rename origin upstream

To delete a remote from your Git repository, use git remote remove <remote-name> e.g.,

git remote remove origin

To check the current remotes and their URLs, use:

git remote -v


T2L5. Updating the Remote Repo


The third step of backing up a local repo on GitHub: push a copy of the local repo to the remote repo.

This lesson covers that part.

You can push content of one repository to another, usually from your local repo to a remote repo. Pushing transfers recorded Git history (such as past commits), but it does not transfer unstaged changes or untracked files.

  • To push, you need to have to the remote repo.
  • Pushing is performed one branch at a time; you must specify which branch you want to push.

You can configure Git to track a pairing between a local branch and a remote branch, so in future you can push from the same local branch to the corresponding remote branch without needing to specify them again. For example, you can set your local master branch to track the master branch on the remote repo origin i.e., local master branch will track the branch origin/master.

C3 masterHEAD origin/master
|
C2
|
C1

In the revision graph above, you see a new type of ref ( origin/master). This is a remote-tracking branch ref that represents the state of a corresponding branch in a remote repository (if you previously set up the branch to 'track' a remote branch). In this example, the master branch in the remote origin is also at the commit C3 (which means you have not created new commits after you pushed to the remote).

If you now create a new commit C4, the state of the revision graph will be as follows:

C4 masterHEAD
|
C3 origin/master
|
C2
|
C1

Explanation: When you create C4, the current branch master moves to C4, and HEAD moves along with it. However, the master branch in the remote origin remains at C3 (because you have not pushed C4 yet). That is, the remote-tracking branch origin/master is one commit behind the local branch master (or, the local branch is one commit ahead). The origin/master ref will move to C4 only after you push your local branch to the remote again.

HANDS-ON: Pushing a local repo to an empty remote repo

Preparation Use a local repo that is connected to an empty remote repo e.g., the things repo from previous hands-on practicals:

1 Push the master branch to the remote. Also instruct Git to track this branch pair.

Use the git push -u <remote-repo-name> <local-branch-name> to push the commits to a remote repository.

git push -u origin master

Explanation:

  • push: the Git sub-command that pushes the current local repo content to a remote repo
  • origin: name of the remote
  • master: branch to push
  • -u (or --set-upstream): the flag that tells Git to track that this local master is tracking origin/master branch

Click the Push button on the buttons ribbon at the top.

Sourcetree top menu

In the next dialog, ensure the settings are as follows, ensure the Track option is selected, and click the Push button on the dialog.

push to empty remote

2 Observe the remote-tracking branch origin/master is now pointing at the same commit as the master branch.

Use the git log --oneline --graph to see the revision graph.

* f761ea6 (HEAD -> master, origin/master) Add colours.txt, shapes.txt
* 2bedace Add figs to fruits.txt
* d5f91de Add fruits.txt

Click the History to see the revision graph.

  • In some versions of Sourcetree, the HEAD ref may not be shown -- it is implied that the HEAD ref is pointing to the same commit the currently active branch ref is pointing.
  • If the remote-tracking branch ref (e.g., origin/master) is not showing up, you may need to enable the Show Remote Branches option.


done!

The push command can be used repeatedly to send further updates to another repo e.g., to update the remote with commits you created since you pushed the first time.

HANDS-ON: Pushing to send further updates to a repo

Target Add a few more commits to the same local repo, and push those commits to the remote repo.

1 Commit some changes in your local repo.

Use the git commit command to create commits, as you did before.

Optionally, you can run the git status command, which should confirm that your local branch is 'ahead' by one commit (i.e., the local branch has commits that are not present in the corresponding branch in the remote repo).

git status
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
  (use "git push" to publish your local commits)

nothing to commit, working tree clean

You can also use the git log --oneline --graph command to see where the branch refs are. Note how the remote-tracking branch origin/master is one commit behind the local master.

e60deae (HEAD -> master) Update fruits list
f761ea6 (origin/master) Add colours.txt, shapes.txt
2bedace Add figs to fruits.txt
d5f91de Add fruits.txt

Create commits as you did before.

Before pushing the new commit, Sourcetree will indicate that your local branch is 'ahead' by one commit (i.e., the local branch has one new commit that is not in the corresponding branch in the remote repo).


2 Push the new commits to your fork on GitHub.

To push the newer commit(s) to the remote, any of the following commands should work:

  • git push origin master
  • git push origin
    (due to tracking you set up earlier, Git will assume you are pushing the master branch)
  • git push
    (due to tracking, Git will assume you are pushing to the remote origin and to the branch master i.e., origin/master)

After pushing, the revision graph should look something like the following (note how both local and remote-tracking branch refs are pointing to the same commit again).

e60deae (HEAD -> master, origin/master) Update fruits list
f761ea6 Add colours.txt, shapes.txt
2bedace Add figs to fruits.txt
d5f91de Add fruits.txt

To push, click the Push button on the top buttons ribbon, ensure the settings are as follows in the next dialog, and click the Push button on the dialog.

After pushing the new commit to the remote, the remote-tracking branch ref should move to the new commit:


done!

Note that you can push between two repos only if those repos have a shared history among them (i.e., one should have been created by copying the other).

DETOUR: Pushing to Multiple Repos

You can push to any number of repos, as long as the target repos and your repo have a shared history.

  1. Add the GitHub repo URL as a remote while giving a suitable name (e.g., upstream, central, production, backup ...), if you haven't done so already.
  2. Push to the target repo -- remember to select the correct target repo when you do.

e.g., git push backup master





T2L6. Omitting Files from Revision Control


Git allows you to specify which files should be omitted from revision control.

This lesson covers that part.

You can specify which files Git should ignore from revision control. While you can always omit files from revision control simply by not staging them, having an 'ignore-list' is more convenient, especially if there are files inside the working folder that are not suitable for revision control (e.g., temporary log files) or files you want to prevent from accidentally including in a commit (files containing confidential information).

A repo-specific ignore-list of files can be specified in a .gitignore file, stored in the root of the repo folder.

The .gitignore file itself can be either revision controlled or ignored.

  • To version control it (the more common choice – which allows you to track how the .gitignore file changes over time), simply commit it as you would commit any other file.
  • To ignore it, simply add its name to the .gitignore file itself.

The .gitignore file supports file patterns e.g., adding temp/*.tmp to the .gitignore file prevents Git from tracking any .tmp files in the temp directory.

SIDEBAR: .gitignore File Syntax

  • Blank lines: Ignored and can be used for spacing.

  • Comments: Begin with # (lines starting with # are ignored).

     # This is a comment
    
  • Write the name or pattern of files/directories to ignore.

    log.txt          # Ignores a file named log.txt
    
  • Wildcards:

    • * matches any number of characters, except / (i.e., for matching a string within a single directory level):
      abc/*.tmp     # Ignores all .tmp files in abc directory
      
    • ** matches any number of characters (including /)
      **/foo.tmp    # Ignores all foo.tmp files in any directory
      
    • ? matches a single character
      config?.yml   # Ignores config1.yml, configA.yml, etc.
      
    • [abc] matches a single character (a, b, or c)
      file[123].txt # Ignores file1.txt, file2.txt, file3.txt
      
  • Directories:

    • Add a trailing / to match directories.
      logs/         # Ignores the logs directory
      
    • Patterns without / match files/folders recursively.
      *.bak         # Ignores all .bak files anywhere
      
    • Patterns with / are relative to the .gitignore location.
      /secret.txt   # Only ignores secret.txt in the root directory
      
  • Negation: Use ! at the start of a line to not ignore something.

    *.log           # Ignores all .log files
    !important.log  # Except important.log
    

Example:

.gitignore
# Ignore all log files
*.log

# Ignore node_modules folder
node_modules/

# Don’t ignore main.log
!main.log
HANDS-ON: Adding a file to the ignore-list

1 Add a file into your repo's working folder that you presumably do not want to revision-control e.g., a file named temp.txt. Observe how Git has detected the new file.
Add a few other files with .tmp extension.

2 Configure Git to ignore those files:

Create a file named .gitignore in the working directory root and add the text temp.txt into it.

echo "temp.txt" >> .gitignore
.gitignore
temp.txt

Observe how temp.txt is no longer detected as 'untracked' by running the git status command (but now it will detect the .gitignore file as 'untracked'.

Update the .gitignore file as follows:

.gitignore
temp.txt
*.tmp

Observe how .tmp files are no longer detected as 'untracked' by running the git status command.


The file should be currently listed under Unstaged files. Right-click it and choose Ignore.... Choose Ignore exact filename(s) and click OK.
Also take note of other options available e.g., Ignore all files with this extension etc. They may be useful in future.

Note how the temp.text is no longer listed under Unstaged files. Observe that a file named .gitignore has been created in the working directory root and has the following line in it. This new file is now listed under Unstaged files.

.gitignore
temp.txt

Right-click on any of the .tmp files you added, and choose Ignore... as you did previously. This time, choose the option Ignore files with this extension.

Note how .temp files are no longer shown as unstaged files, and the .gitignore file has been updated as given below:

.gitignore
temp.txt
*.tmp

3 Optionally, stage and commit the .gitignore file.

done!

Files recommended to be omitted from version control

  • Binary files generated when building your project e.g., *.class, *.jar, *.exe
    Reasons:
    1. no need to version control these files as they can be generated again from the source code
    2. Revision control systems are optimized for tracking text-based files, not binary files.
  • Temporary files e.g., log files generated while testing the product
  • Local files i.e., files specific to your own computer e.g., local settings of your IDE (.idea/)
  • Sensitive content i.e., files containing sensitive/personal information e.g., credential files, personal identification data (especially if there is a possibility of those files getting leaked via the revision control system).

DETOUR: Ignoring Previously-Tracked Files

Adding a file to the .gitignore file is not enough if the file was already being tracked by Git in previous commits. In such cases, you need to do both of the following:

  1. Untrack the file (i.e., remove the file from the staging area and stop tracking it in future), using the git rm --cached <file(s)> command.
    git rm --cached data/ic.txt
    
  2. Add it to the .gitignore file, as usual.


At this point: You should now be able to create a copy of your repo on GitHub, and keep it updated as you add more commits to your local repo. If something goes wrong with your local repo (e.g., disk crash), you can now recover the repo using the remote repo (this tour did not cover how exactly you can do that -- it will be covered in a future tour).

What's next: Tour 3: Working Off a Remote Repo


W2.7c

Git Learning Trail → Tour 3: Working Off a Remote Repo

Tour 3: Working Off a Remote Repo

Target Usage: To work with an existing remote repository.

Motivation: Often, you will need to start with an existing remote repository. In such cases, you may have to create your own copies of that repository, and keep those copies updated when more changes appear in the remote repository.

Lesson plan:

GitHub allows you to create a remote copy of another remote repo, called forking.

   T3L1. Duplicating a Remote Repo on the Cloud covers that part.

The next step is to create a local copy of the remote repo, by cloning the remote repo.

   T3L2. Creating a Local Copy of a Repo covers that part.

When there are new changes in the remote, you need to pull those changes down to your local repo.

   T3L3. Downloading Data Into a Local Repo covers that part.

T3L1. Duplicating a Remote Repo on the Cloud


GitHub allows you to create a remote copy of another remote repo, called forking.

This lesson covers that part.

A fork is a copy of a remote repository created on the same hosting service such as GitHub, GitLab, or Bitbucket. On GitHub, you can fork a repository from another user or organisation into your own space (i.e., your user account or an organisation you have sufficient access to). Forking is particularly useful if you want to experiment with a repo but don’t have write permissions to the original -- you can fork it and work on your own remote copy without affecting the original repository.

HANDS-ON: Forking a repo on GitHub

Preparation Create a GitHub account if you don't have one yet.

1 Go to the GitHub repo you want to fork e.g., samplerepo-things

2 Click on the button in the top-right corner. In the next step,

  • choose to fork to your own account or to another GitHub organization that you are an admin of.
  • un-tick the [ ] Copy the master branch only option, so that you get copies of other branches (if any) in the repo.

done!

Forking is not a Git feature, but a feature provided by hosted Git services like GitHub, GitLab, or Bitbucket.

GitHub does not allow you to fork the same repo more than once to the same destination. If you want to re-fork, you need to delete the previous fork.


T3L2. Creating a Local Copy of a Repo


The next step is to create a local copy of the remote repo, by cloning the remote repo.

This lesson covers that part.

You can clone a repository to create a full copy of it on your computer. This copy includes the entire revision history, branches, and files of the original, so it behaves just like the original repository. For example, you can clone a repository from a hosting service like GitHub to your computer, giving you a complete local version to work with.

Cloning a repo automatically creates a remote named origin which points to the repo you cloned from.

The repo you cloned from is often referred to as the upstream repo.

HANDS-ON: Cloning a remote repo

1 Clone the remote repo to your computer. For example, you can clone the samplerepo-things repo, or the fork you created from it in a previous lesson.

Note that the URL of the GitHub project is different from the URL you need to clone a repo in that GitHub project. e.g.

https://github.com/se-edu/samplerepo-things  # GitHub project URL
https://github.com/se-edu/samplerepo-things.git # the repo URL

You can use the git clone <repository-url> [directory-name] command to clone a repo.

  • <repository-url>: The URL of the remote repository you want to copy.
  • [directory-name] (optional): The name of the folder where you want the repository to be cloned. If you omit this, Git will create a folder with the same name as the repository.
git clone https://github.com/se-edu/samplerepo-things.git  # if using HTTPS
git clone git@github.com:se-edu/samplerepo-things.git  # if using SSH

git clone https://github.com/foo/bar.git my-bar-copy  # also specifies a dir to use

For exact steps for cloning a repo from GitHub, refer to this GitHub document.


FileClone / New ... and provide the URL of the repo and the destination directory.


FileNew ... → Choose as shown below → Provide the URL of the repo and the destination directory in the next dialog.



2 Verify the clone has a remote named origin pointing to the upstream repo.

Use the git remote -v command that you learned earlier.


Choose RepositoryRepository Settings menu option.


done!


T3L3. Downloading Data Into a Local Repo


When there are new changes in the remote, you need to pull those changes down to your local repo.

This lesson covers that part.

There are two steps to bringing over changes from a remote repository into a local repository: fetch and merge.

  • Fetch is the act of downloading the latest changes from the remote repository, but without applying them to your current branch yet. It updates metadata in your repo so that it knows what has changed in the remote repo, but your own local branch remains untouched.
  • Merge is what you do after fetching, to actually incorporate the fetched changes into your local branch. It combines your local branch with the changes from the corresponding branch from the remote repo.
HANDS-ON: Fetch and merge from a remote

1 Clone the repo se-edu/samplerepo-finances. It has 3 commits. Your clone now has a remote origin pointing to the remote repo you cloned from.

2 Change the remote origin to point to samplerepo-finances-2. This remote repo is a copy of the one you cloned, but it has two extra commits.

git remote set-url origin https://github.com/se-edu/samplerepo-finances-2.git

Go to RepositoryRepository settings ... to update remotes.


3 Verify the local repo is unaware of the extra commits in the remote.

git status
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean

The revision graph should look like the below:

If it looks like the below, it is possible that Sourcetree is auto-fetching data from the repo periodically.


4 Fetch from the new remote.

Use the git fetch <remote> command to fetch changes from a remote. If the <remote> is not specified, the default remote origin will be used.

git fetch origin
remote: Enumerating objects: 8, done.
... # more output ...
   afbe966..cc6a151  master     -> origin/master
 * [new tag]         beta       -> beta

Click on the Fetch button on the top menu:

Sourcetree top menu

5 Verify the fetch worked i.e., the local repo is now aware of the two missing commits. Also observe how the local branch ref of the master branch, the staging area, and the working directory remain unchanged after the fetch.

Use the git status command to confirm the repo now knows that it is behind the remote repo.

git status
On branch master
Your branch is behind 'origin/master' by 2 commits, and can be fast-forwarded.
  (use "git pull" to update your local branch)

nothing to commit, working tree clean

Now, the revision graph should look something like the below. Note how the origin/master ref is now two commits ahead of the master ref.


6 Merge the fetched changes.

Use the git merge <remote-tracking-branch> command to merge the fetched changes. Check the status and the revision graph to verify that the branch tip has now moved by two more commits.

git merge origin/master

git status
git log --oneline --decorate

To merge the fetched changes, right-click on the latest commit on origin/remote branch and choose Merge.

In the next dialog, choose as follows:

The final result should be something like the below (same as the repo state before we started this hands-on practical):


Note that merging the fetched changes can get complicated if there are multiple branches or the commits in the local repo conflict with commits in the remote repo. We will address them when we learn more about Git branches, in a later lesson.

done!

Pull is a shortcut that combines fetch and merge — it fetches the latest changes from the remote and immediately merges them into your current branch. In practice, Git users typically use the pull instead of the fetch-then-merge.

pull = fetch + merge

HANDS-ON: Pull from a remote

1 Similar to the previous hands-on practical, clone the repo se-edu/samplerepo-finances (to a new location).
Change the remote origin to point to samplerepo-finances-2.

2 Pull the newer commits from the remote, instead of a fetch-then-merge.

Use the git pull <remote> <branch> command to pull changes.

git pull origin master

The following works too. If the <remote> and <branch> are not specified, Git will pull to the current branch from the remote branch it is tracking.

git pull

Click on the Pull button on the top menu:

Sourcetree top menu

In the next dialog, choose as follows:

3 Verify the outcome is same as the fetch + merge steps you did in the previous hands-on practical.

done!

You can pull from any number of remote repos, provided the repos involved have a shared history. This can be useful when the upstream repo you forked from has some new commits that you wish to bring over to your copies of the repo (i.e., your fork and your local repo).

HANDS-ON: Sync your repos with the upstream repo

Preparation Fork se-edu/samplerepo-finances to your GitHub account.
Clone your fork to your computer.
Now, let's pretend that there are some new commits in upstream repo that you would like to bring over to your fork, and your local repo. Here are the steps:

1 Add the upstream repo se-edu/samplerepo-finances as remote named upstream in your local repo.

2 Pull from the upstream repo. If there are new commits (in this case, there will be none), those will come over to your local repo. For example:

git pull upstream master

3 Push to your fork. Any new commits you pulled from the upstream repo will now appear in your fork as well. For example:

git push origin master

The method given above is the more 'standard' method of synchronising a fork with the upstream repo. In addition, platforms such as GitHub can provide other ways (example: GitHub's Sync fork feature).

4 For good measure, let's pull from another repo.

  • Add the upstream repo se-edu/samplerepo-finances-2 as remote named other-upstream in your local repo.
  • Pull from it to your local repo; this will bring some new commits.
  • Now, you can push those new commits to your fork.
git remote add other-upstream https://github.com/se-edu/samplerepo-finances-2.git
git pull other-upstream master
git push origin master

done!

DETOUR: Pulling from Multiple Remotes

You can pull from any number of repos, provided the repos involved have a shared history.

  1. Add the GitHub repo URL as a remote while giving a suitable name (e.g., upstream, central, production, backup ...), if you haven't done so already.
  2. Pull (or fetch) from the remote repo -- remember to select the correct remote repo when you do.

e.g., git pull backup master


Similar to before, but remember to choose the intended remote to pull from.




At this point: Now you can create your own remote and local copies of any repo on GitHub, and update your copy when there are new changes in the upstream repo.

What's next: Tour 4: Using the Revision History of a Repo



Guidance for the item(s) below:

As you are likely to be using an IDE for the iP, let's learn at least enough about IDEs to get you started using one.

[W2.8] IDEs: Basic Features

W2.8a

Implementation → IDEs → What

Professional software engineers often write code using Integrated Development Environments (IDEs). IDEs support most development-related work within the same tool (hence, the term integrated).

An IDE generally consists of:

  • A source code editor that includes features such as syntax coloring, auto-completion, easy code navigation, error highlighting, and code-snippet generation.
  • A compiler and/or an interpreter (together with other build automation support) that facilitates the compilation/linking/running/deployment of a program.
  • A debugger that allows the developer to execute the program one step at a time to observe the run-time behavior in order to locate bugs.
  • Other tools that aid various aspects of coding e.g., support for automated testing, drag-and-drop construction of UI components, version management support, simulation of the target runtime platform, modeling support, AI-assisted coding help, collaborative coding with others.

Examples of popular IDEs:

  • Java: Eclipse, IntelliJ IDEA, NetBeans
  • C#, C++: Visual Studio
  • Swift: XCode
  • Python: PyCharm
  • Multiple languages: VS Code

Some web-based IDEs have appeared in recent times too e.g., Amazon's Cloud9 IDE.

Some experienced developers, in particular those with a UNIX background, prefer lightweight yet powerful text editors with scripting capabilities (e.g., Emacs) over heavier IDEs.


Exercises:

Which of these are features available in IDEs?



W2.8b

Tools → IntelliJ IDEA → Project setup

Refer to these se-edu guides:


W2.8c

Tools → IntelliJ IDEA → Code navigation


Guidance for the item(s) below:

As you know, one of the objectives of the iP is to raise the quality of your code. We'll be learning about various ways to improve the code quality in the next few weeks, starting with coding standards.

[W2.9] Code Quality: Coding Standards

W2.9a

Implementation → Code Quality → Introduction → What

Always code as if the person who ends up maintaining your code will be a violent psychopath who knows where you live. -- Martin Golding

Production code needs to be of high quality. Given how the world is becoming increasingly dependent on software, poor quality code is something no one can afford to tolerate.


W2.9b

Implementation → Code Quality → Style → Introduction

One essential way to improve code quality is to follow a consistent style. That is why software engineers usually follow a strict coding standard (aka style guide).

The aim of a coding standard is to make the entire codebase look like it was written by one person. A coding standard is usually specific to a programming language and specifies guidelines such as the locations of opening and closing braces, indentation styles and naming styles (e.g., whether to use Hungarian style, Pascal casing, Camel casing, etc.). It is important that the whole team/company uses the same coding standard and that the standard is generally not inconsistent with typical industry practices. If a company's coding standard is very different from what is typically used in the industry, new recruits will take longer to get used to the company's coding style.

IDEs can help to enforce some parts of a coding standard e.g., indentation rules.


Exercises:

What is the recommended approach regarding coding standards?