-
Data Structures
Definition:
- A data structure consists of
- A base storage method (e.g. an array, a list, a tree, a …)
- One or more algorithms that are used to access or modify that data
Data Abstraction
Data abstraction is a key concept for data structures.
Example - Dice
Here is a program demonstrating a dice data structure. Dice is a useful example because we all understand what a set of dice are and how they work. You can roll/spin them to produce a total value and you can get the face up values of the dice being rolled.
Simple procedural approach - the data in the program is simply stored as two variables in the program, which calculates values for these variables using a method that generates random numbers.
public static void main(String[] args) { int die1, die2, total; // randInt()user generated method to // return a random number between 1 and 6 die1 = randInt( ); die2 = randInt( ); total = die1 + die2; System.out.println("first die: " + die1); System.out.println("second die: " + die2); System.out.println(); System.out.println("total for spin is: " + total); }
Object oriented approach - defines a class called Dice that encapsulates two individual dice, and that has methods to spin the dice and to return the values of the first die and the second die. The roll method encapsulates the calculation of the dice values, so that the calculation is the responsibility of the Dice object, not the responsibility of the program – the program doesn't need to know how that calculation is done or how the dice values are stored. The Dice class is an example of a data structure that abstracts, or hides the details of, the data and the way the data is manipulated.
public static void main(String[] args) { System.out.println("using class approach"); int die1, die2, total; Dice dice = new Dice(); // instantiate the dice total = dice.roll( ); // roll the dice die1 = dice.die1(); // get die1 value die2 = dice.die2( ); // get die1 value System.out.println("first die: " + die1); System.out.println("second die: " + die2); System.out.println( ); System.out.println("total is: "+ total); }
- We haven't looked inside the Dice class to see how the code is written, and we don't need to here. All we need to know in order to use the Dice class in our program is what it does. Dice is a data structure that stores data and provides operations that acts on the data.
- Data - two integers, each between 1 and 6
- Operations - one operation to roll the dice and return the total value, and operations to return the value of each die
The Dice class could be written to provide further operations. For example it could return a text representation of the result.
More complex data
The Dice class is a very simple data structure. The data and the operation(s) are so simple that there is little, if any, advantage over the procedural approach – the procedural program is itself simple. What if the data were more complex?
- Example - deck of cards
- Data – 4 suites of 13 cards each
- Operations – deal, cut, shuffle, and so on
Writing a program to implement a card game would become quite complicated, much simpler if these data and operations are encapsulated in a Deck data structure.
Example - air traffic control system
Very complex data and operations - need to develop a model, requires high level of software engineering expertise and expert knowledge of the problem domain
-
Abstract Data Type (ADT)
ADTs give us a tool to use to control complexity
An ADT is a collection of data and operations
- Abstract – does not say anything about the details of how it actually works
- ADT defines the data characteristics but DOES NOT include method of representation
- ADT defines the operations that may be applied to the data but DOES NOT include how these operations are implemented
The idea of the ADT was proposed by Barbara Liskov at MIT in a paper published in 1974.
Definition of ADT
- An Abstract Data Type (ADT) is :
- a well-specified collection of data
- known as the characteristics, or data state
- and a set of operations that can be performed upon the data
- known as behaviours or operations
The set of operations that can be performed is known as the interface
Implementation
In order to have a data structure that can actually be used you need to have an implementation of the ADT that actually stores the data and provides working operations. In Java, operations are implemented as methods. There may be many possible ways of implementing the same ADT. In the Dice example, the two die values could be stored as two integer fields in the Dice class, or perhaps as an array of integers as a field in the class. The details of the spin operation would be slightly different in each case. The program doesn't need to know anything about these details, though, in order to use the Dice.
- This is a bit like driving a car - you don’t need a mechanic’s understanding of what’s under a car’s hood in order to drive it.
- What’s the car’s interface?
- What’s the implementation?
When working with an ADT, the program that uses it need not know about the implementation of the ADT. It interacts with the ADT using only public methods. This is an example of reusable code – the same ADT implementation can be used in many different programs.
ADTs facilitate storage, organization, and processing of information
- ADT Design Process:
- Identify “kinds” or “types” of Information
- Encapsulate information1.1 and provide a public Interface consisting of methods
- Hide information and method code in the private implementation
Language-independent ADTs
- Important to note that an ADT is a formal description, not code; independent of any programming language. Why is code independence a good idea? It promotes design by contract:
- A contract in business typically specifies responsibilities of suppliers and clients explicitly, so they can be enforced by law, if necessary
- A contract in software specifies the responsibility of the code that performs an operation and the code that calls it. What do you think enforces a software contract?
- In either case, the contract doesn't specify the details of:
- How the supplier will actually create the product or service, or
- How the code implements the operation
This is not a Java programming course; however you will use Java to help you to learn about data structures, in a couple of ways:
Note that the Java Collections Framework provides data structures that implement many common ADTs, and you will learn to make use of these during this module.
You will be expected to use Java in the practical activities to implement your own version of ADTs
-
Interfaces
- As you have learned, an ADT’s specification describes:
- what data can be stored (the characteristics of the ADT),
- and how it can be used (the operations),
but not how it is implemented or represented in the program. ADTs do not specify how the data structure will be implemented - there may be many ways. The data specification has been decoupled from the implementation.
This means that software development can be less complex (fewer details to consider). Also, software development is more flexible (the actual structure is not fixed) – easier to change!
Object-oriented programming (OOP) emphasizes data abstraction, and provides a basis for defining ADTs and creating data structures to implement them. Note that data structures are not dependent on OOP – you can also create them in non-OO languages.
A Java interface is a way to specify but not implement an ADT. An interface can specify the names, parameters and return types of the ADT methods without specifying how the methods carry out their operations and without specifying how the data is internally represented
Interface Definition
FORM:
public interface interfaceName { abstract method headings constant declarations }
EXAMPLE:
public interface Payable { double calcSalary(); boolean salaried(); static final double DEDUCTIONS = 25.5; }
This example specifies two methods, each of which takes no parameters and has the specified return type. Note that constants can also be defined in the interface, but not instance variables. In this example, the constant DEDUCTIONS is accessible in classes that implement the interface
-
Contracts and Interfaces
A java interface is a contract between the interface designer and the programmer of a class that implements the interface. The programmer MUST code the methods that carry out the operations identified in the interface. The screenshot below shows a class definition for a class SayHello that implements an interface IMustBeHere. There is a problem with this class, though, and the compiler is reporting an error. The interface specifies that a method mustBeHere must be present in an implementing class, but SayHello does not yet provide an implementation for this method, hence the error message. To fulfil the contract specified by the interface, SayHello must implement a method mustBeHere with a signature that matches that specified in the interface ( it must return a Boolean value and take no parameters).
public interface IMustBeHere { boolean mustBeHere(); }
There can be many different implementations of an interface:
public class SayBye implements IMustBeHere { ...
A class may implement more than one interface — their names are separated by commas:
public interface ICanSleepIfIWantTo { Boolean canSleep(); } public class SayHello implements IMustBeHere, ICanSleepIfIWantTo { ...
As long as the implementation satisfies the contract, the programmer may implement it as he or she chooses.
- In addition to implementing all data fields and methods in the interface, the programmer may add:
- data fields not in the implementation
- methods not in the implementation
- constructors (an interface cannot contain constructors because it cannot be instantiated)
Here is a complete SayHello class that implements the interfaces IMustBeHere and ICanSleepIfIWantTo:
public class SayHello implements IMustBeHere, ICanSleepIfIWantTo { public Boolean mustBeHere() { return true; } public Boolean canSleep(){ return false; } public void sayHello() { if (mustBeHere() && !canSleep()) System.out.println(“Hello DSA Students!”); } }
-
The Dice ADT
Now let’s look at how the Dice class referred to earlier can be implemented as an ADT. The Dice ADT represents two simple dice that can be rolled to get a random total that lies between 2 and 12. The value of each die can also be retrieved after rolling.
Characteristics (data state):
Represents a pair of 6-sided dice, each of which can have a value between 1 & 6.
- Operations:
- roll( )
- Precondition: none.
- Postcondition: A random value between 1 and 6 is stored for each of the 2 dies.
- Returns: The sum of the two die values, lying between 2 and 12.
- die1( )
- Precondition: Dice must exist and has been rolled at least once
- Postcondition: Die #1 will have value in range 1 & 6
- Returns: The value of die number 1 lying between 1 & 6
- die2( )
- Precondition: Dice must exist and has been rolled at least once
- Postcondition: Die #2 will have value in range 1 & 6
- Returns: The value of die number 2 lying between 1 & 6
ADT interface - IDice
public interface IDice { int roll(); int die1(); int die2(); }
ADT Class version 1 – Dice implements IDice
public class DiceV1 implements IDice { public int roll() { return die1() + die2(); } public int die1() { return randInt(); } public int die2() { return randInt(); } // private implementation of randInt() ... } // end of Class
ADT Class V2 – Dice implements IDice
It is often possible to implement an ADT in more than one way, as long as it provides the required operations to implement the interface. Here is a variation on the implementation of the Dice ADT:
public class DiceV2 implements IDice { public int roll() { die1Value = randInt(); // roll die #1 die2Value = randInt(); // roll die #2 return die1Value + die2Value; } public int die1() { return die1Value; } public int die2() { return die2Value; } // private section private int die1Value; private int die2Value; // private implementation of randInt() ... } // end of Class
Declaring a Variable of an Interface Type
While you cannot instantiate an interface, you can declare a variable that has an interface type
/* interface type */ IDice dice1 = new DiceV1(); IDice dice2 = new DiceV2();
This is useful when dealing with polymorphic code, so that the code that uses a Dice object does not have to specify which implementation of the ADT will be used at runtime, but simply relies on the fact that the ADT will fulfil the contract specified by the ADT interface.
-
Abstract Classes
Abstract Class
An interface can declare methods but does not provide an implementation of the methods. In an interface, all the methods are abstract – i.e. they do not include any code that implements any action. An interface also cannot have any data fields or constructors.
- An abstract class, on the other hand, can have:
- abstract methods
- data-fields
- concrete implementations of methods
- constructors that initialize data-fields
However, like an interface, it still cannot itself be instantiated
Useful for situations where you want to provide implementations of some behaviour that will be common to a set of classes that will otherwise implement their own behaviour.
Here is an example of a simple Abstract Class that includes a data field, a concrete method and an abstract method. The data field could be initialised by a constructor, although that is not illustrated in this example.
public abstract class Animal { String name; void eat(String food) { System.out.println("Ate some " + food); } abstract void talk(); }
This can be extended by concrete classes that inherit the eat method and implement the talk method:
class Dog extends Animal { void talk( ) { System.out.println("Woof"); } } class Cat extends Animal { void talk( ) { System.out.println("Miaow"); } }
Like interfaces, abstract classes enable polymorphism. This example creates a collection of animals. The type of the collection objects is declared as Animal, but the actual object types are Dog and Cat.
public class AbstractDemo { public static void main(String[] args) { ArrayList<Animal> animals = new ArrayList<Animal>(); animals.add(new Dog()); animals.add(new Cat()); for(Animal a: animals) { a.eat("nice food"); a.talk(); } } }
-
The Dice ADT Using An Abstract Class
You can base the Dice implementation on an abstract class instead of an interface. Here is the abstract class. It represents the same data characteristics and operations that we defined previously for the Dice ADT. Note that it has data fields representing the values of each die, and it implements the die1 and die2 operations as getter methods for these fields. It does not, however, implement the roll method, so in order to create and use a dice you need to have a concrete class that extends this abstract class and provides an implementation of roll.
public abstract class ADice { public int die1Value; public int die2Value; public int die1(){ return die1Value; } public int die2(){ return die2Value; } public abstract int roll(); }
Here is another version of the Dice ADT, this time as a class that extends the ADice abstract class. Note that because it does so, it only needs to provide a concrete implementation of roll, which as in the previous versions, makes use of a private method to calculate a random integer, and this private method is used in the roll method.
public class DiceV3 extends ADice { public int roll() { die1Value = randInt(); // roll die #1 die2Value = randInt(); // roll die #2 return die1Value + die2Value; } // private implementation of randInt() ... } // end of Class
The following code creates and tests instances of each of the three versions of the Dice ADT. Note that they are all used in exactly the same way.
public class DiceDemo1 { public static void main(String[] args) { System.out.println("Using DiceV1"); int total1; IDice dice1 = new DiceV1(); total1 = dice1.roll(); System.out.println("first die: " + dice1.die1()); System.out.println("second die: " + dice1.die2()); System.out.println("total is: " + total1); System.out.println("Using DiceV2"); int total2; IDice dice2 = new DiceV2(); total1 = dice2.roll(); System.out.println("first die: " + dice2.die1()); System.out.println("second die: " + dice2.die2()); System.out.println("total is: " + total1); System.out.println("Using DiceV3"); int total3; ADice dice3 = new DiceV3(); total3 = dice3.roll(); System.out.println("first die: " + dice3.die1()); System.out.println("second die: " + dice3.die2()); System.out.println("total is: " + total3); } }
-
Interface and Abstract Class
Note that the ADice abstract class actually fulfils the contract specified by the IDice interface. It provides all the operations (die1, die2, roll) in the interface. This means that we can make ADice implement IDice. This works, even though the implementation of roll in ADice is abstract.
public abstract class ADice implements IDice {
- This combination of an interface and an abstract class is quite commonly used when designing implementations of ADTs. In this approach:
- The interface specifies what operations the ADT must provide – it is the contract
- The abstract class provides some operations that will be common to some or all implementations of the ADT, so that these do not need to be written in every class that implements it.
With this modification, the test class can be simplified, using polymorphism, as all the versions of the Dice ADT now implement the same interface.
public class DiceDemo2 { public static void main(String[] args) { ArrayList<IDice> allDice = new ArrayList<IDice>(); allDice.add(new DiceV1()); allDice.add(new DiceV2()); allDice.add(new DiceV3()); for (IDice dice : allDice) { int total; total = dice.roll(); System.out.println("first die: " + dice.die1()); System.out.println("second die: " + dice.die2()); System.out.println("total is: " + total + "\n"); } } }
-
Summary of Features of Classes, Abstract Classes, and Interfaces
Property Class Abstract Class Interface Instances of this can be created Yes No No This can define instance variables and methods Yes Yes No This can define constants Yes Yes Yes The number of these a class can extend 0 or 1 0 or 1 0 The number of these a class can implement 0 0 Any number This can extend another class Yes Yes No This can declare abstract methods No Yes Yes Values of this type can be declared Yes Yes Yes -
Representing Time
It is a common requirement in programs to be able to deal with information that involves time. Example: in online systems you often want to keep track of the amount of time a user spends online doing something, or waiting, and so on. Time is relatively complicated as the same point in time can be expressed in a number of different ways.
Clock and calendar representations
So how can we represent time? Clocks show time in hours, minutes, and some clocks show seconds also. Some clocks show 12hr time, some show 24hr time, and some give a choice between these.
12 hr – 8:31:30
24 hr – 20:31:30
Wed, 02 December 2015, 20:31:30
A simpler representation
The easiest way for a program to handle time, as an integer representing the number of seconds or minutes since a defined point in time, can be hard for the user to work with. Imagine a hospital receptionist saw a prompt like this each time a patient’s arrival time needs to be recorded:
Enter the patient arrival time as the number of minutes since midnight:
However, if time is represented as an integer (whole number) then many operations become easy to perform.
Arrival time = 1234 mins
Seen by doctor at = 1266 minsWaiting time = Time seen by doctor – Arrival time = 32 mins
-
A Program That Uses Time
Here is a simple Java program that records patient visits and calculates the average waiting time that patients have before seeing a doctor.
public static void main(String[] args) { int numberOfVisits = 0; int totalWaitingTime = 0; Scanner in = new Scanner(System.in); String answer; do { int arrivalTime, seenByDoctor, waitingTime; numberOfVisits++; System.out.print("Enter arrival time:"); arrivalTime = in.nextInt(); in.nextLine(); System.out.print("Enter time seen by doctor:"); seenByDoctor = in.nextInt(); in.nextLine(); // assume that subtracting one time from another yields the // difference in minutes as an int waitingTime = seenByDoctor - arrivalTime; totalWaitingTime += waitingTime; System.out.print("Done? Enter 'Q' to quit, anything else to continue: "); answer = in.nextLine(); } while (!answer.equals("Q")); System.out.println("Number of visits: " + numberOfVisits); System.out.println("Total waiting time: " + totalWaitingTime + "minutes"); System.out.println("Average waiting time: " + totalWaitingTime/numberOfVisits + " minutes"); }
An example of a user’s interaction with this program looks like this:
Enter arrival time: 037 Enter time seen by doctor: 0045 Done? Enter 'Q' to quit, anything else to continue: . Enter arrival time: 0123 Enter time seen by doctor: 0145 Done? Enter 'Q' to quit, anything else to continue: Q Number of visits: 2 Total waiting time: 30 minutes. Average wait is 15 minutes.
Although this program is simple to write and easy to understand, it requires the user to transform the times into minutes before entering them.
This is a BAD idea.
Users should be able to enter data in the form they are most comfortable with. For time this might be the form HH:MM (AM/PM). If the program is written to allow this then the user’s interaction with the program becomes more convenient.
Enter arrival time: 12:17PM Enter time seen by doctor: 12:25PM Done? Enter 'Q' to quit, anything else to continue: . Enter arrival time: 1:10PM Enter time seen by doctor: 2:02PM Done? Enter 'Q' to quit, anything else to continue: Q Number of visits: 2 Total waiting time: 60 minutes. Average wait is 30 minutes.
-
Time and Program Design
The program designer’s goal is to meet both the needs of user and of programmers.
We want our program to work in a manner that our users would expect and user input to be entered in the most natural way possible. We also want to make it convenient for the programmer to implement the required functionality and for this to work in an efficient way.
Designing a Time ADT
One way to handle Time would be to provide a Time ADT and a suitable implementation of this. Note that the Java API does provide a number of classes to represent dates and times (a useful set of these are in the package java.time) and all the complexity that these concepts involve, but here we will look at our own, somewhat simpler, representation of Time, as an example of developing a useful ADT.
- The definition of our Time ADT might include properties to hold hour, minute, seconds and am/pm values:
- probably an integer for hour, an integer for minutes and an integer for seconds
- and possibly use an enumerated type for AM/PM or just true/false
The definition could also include operations for reading data as a user might input it and using this to set the properties of a Time object, and for subtracting them to get the difference in minutes. In fact, there could be many more operations – can you think of any others that might be useful.
Using the Time ADT
The users would not have to know anything about the actual class definition. They just enter time values in the form they are used to. The actual implementation of the Time ADT is hidden from them. As you saw previously, this is the idea of abstraction
Back to the design
- Our design goal is to come up with a Time ADT that is:
- A good abstraction
- the concept is familiar to the user
- Safe
- automatically checks for bad input
- Modifiable
- changes made to the ADT won’t hurt client code
- Reusable
- many apps requiring Time can use it
- What are the
- Characteristics of ‘time’
- Operations associated with ‘time’
-
Simple Time ADT Characteristics
- Our Time ADT will be relatively simple. It will represent only hours and minutes, not seconds, and does not contain any information about the date. It will have the following characteristics:
- A Time consists of some number of hours and minutes, and is either before noon (AM) or after noon (PM)
- Twelve Noon is 12:00 PM and Twelve Midnight is 12:00 AM
- All Times are assumed to fall on the same day
- It will have operations to:
- Read a string and parse it, and to return an indication of whether this was successful
- Return the difference in minutes between a Time object and another Time object
- Return the values of each of the hours and minutes components, and whether AM or PM
Simple Time ADT – interface
public interface ITime { int getHour(); int getMinute(); boolean isAm(); boolean readTime(String inputTime); int subtractTime(ITime t); }
-
Simple Time ADT Implementation
The methods getHour, getMinute and isAm will simply return the values of each of the hours and minutes components of the time, and whether the time is AM or PM. The other methods are a bit more complicated, and can be specified as follows:
boolean readTime(String input)
Precondition: input is a string representing a point in time
Postconditions: Leading whitespace characters are ignored; readTime attempts to parse the input as a time in the format <HH>:<MM><A>, where <HH> is an integer between 1 and 12, <MM> is an integer between 0 and 59, and <A> is either “AM” or “PM”. If a properly formatted time can be read, currentTime holds read in values; otherwise, method returns arbitrary time.
Returns: If can parse input as valid time; returns true otherwise, returns false
int subtractTime(Time t)
Precondition: t is a well-defined time.
Postcondition: None.
Returns: The difference, in minutes, between this time and t. If this time occurs before t, this difference is negative.
We need to create the data structure class that will implement the Time ADT. This class will implement the ITime interface. We need to implement storage for the date and operations which act on the data in the form it is stored. One way to store the data that we need is to use three fields to store the hour, minute and AM/PM values.
public class Time implements ITime { private int hour; private int minute; private boolean am;
The methods getHour, getMinute and isAm will be simple getter methods for these fields.
-
Simple Time ADT Operations
- The readTime method has quite a lot of work to do. It has to:
- Split the input into parts (tokens)
- Convert the hour/minute data to integers
- Check that these integers are within range
- Check that the AM/PM value is valid
- Use the obtained values to set the field values
- Return true if all input was valid, false otherwise
Here's a possible implementation of this:
public boolean readTime(String inputTime) { StringTokenizer st = new StringTokenizer(inputTime, ": "); try { while (st.hasMoreElements()) { hour = Integer.parseInt((String) st.nextElement()); minute = Integer.parseInt((String) st.nextElement()); if ((0 > getHour()) || (getHour() > 12) || (0 > getMinute()) || (getMinute() > 59)) { invalidate(); // private method to set default values return false; } String ampm = (String) st.nextElement(); if (ampm.equals("AM")) { am = true; } else if (ampm.equals("PM")) { am = false; } else { invalidate(); return false; } } } catch (NumberFormatException e) { invalidate(); return false; } return true; }
- The subtractTime method also has some work to do. It needs to:
- Convert the hours value into a 24 hour clock value
- Get the 24 hour clock value from the Time object to be subtracted
- Calculate the total difference in minutes, taking the hours and minutes into account
Here is a possible implementation of this:
public int subtractTime(Time t) { int hour24 = hour; if (!isAm()) { hour24 += 12; } Time ta = (Time) t; int taHour24 = ta.getHour(); if (!ta.isAm()) { taHour24 += 12; } return (hour24 - ta.getHour()) * 60 + (this.minute - ta.getMinute()); }
Simple Time ADT – alternative implementation
This is not the only way that the Time class could be implemented. For example, we could have chosen a different way to store the data. In this alternative version, there is only one field, containing the time data as a number of minutes since midnight. Note that the getter methods in this case have to do a little bit more work:
public class TimeV2 implements ITime { private int minutes; public int getHour() { return getMinutes() / 60; } public int getMinute() { return getMinutes() % 60; } public boolean isAm() { return getHour()<12; }
The implementations of readTime and subtractTime will also be somewhat different. These are not shown here, but how do you think they would work?
-
Encapsulation and Information Hiding
Note that a program that uses your Time class (the client program) does not need to know how either the data storage or the operations are implemented. It just needs to know that the operations are available. The ADT class encapsulates the data and hides the information about the implementation from client code.
- Public
- Declares just those operations and properties the client will need
- Member functions (methods)
- Private
- Definitions of the ADT representation
- Not accessible to client code
- Data members
- There may be good reasons for implementing an ADT in a particular way:
- May be more convenient for the programmer writing the ADT code
- May give better performance
Which version of the Time class do you think is "better"?
Client program
Here is a version of the patient waiting time program you saw previously, now written as a client program that makes use of our Time ADT class. Only the do-while loop is shown here as the rest of the code is unchanged. Note that this version supports the way that we want users to be able to input times.
do { ITime arrivalTime = new Time(); ITime seenByDoctor = new Time(); int waitingTime; numberOfVisits++; System.out.print("Enter arrival time:"); arrivalTime.readTime(in.nextLine()); System.out.print("Enter time seen by doctor:"); seenByDoctor.readTime(in.nextLine()); waitingTime = seenByDoctor.subtractTime(arrivalTime); totalWaitingTime += waitingTime; System.out.print("Done? Enter 'Q' to quit, anything else to continue: "); answer = in.nextLine(); } while (!answer.equals("Q"));
Code reuse and Java
A good implementation of an ADT like Time should be reusable. Any program that imports your source class and has access to the implementation can use the Time ADT
- Java terminology:
- Abstraction
- interface /abstract class / class
- Information hiding
- Encapsulation
- Code reuse
These are the key terms used so far. You should know what each of them means and why it is important and where they can be used
-
Algorithms
An algorithm is a plan, a set of step-by-step instructions to solve a problem.
If you can tie shoelaces, make a cup of tea, get dressed or prepare a meal then you already know how to follow an algorithm.
In an algorithm, each instruction is identified and the order in which they should be carried out is planned. Algorithms are often used as a starting point for creating a computer program, and they are sometimes written as a flowchart / pseudocode / structure chart / UML
If we want to tell a computer to do something, we have to write a computer program that will tell the computer, step-by-step, exactly what we want it to do and how we want it to do it.
This step-by-step program will need planning, and to do this we use an algorithm.
Computers are only as good as the algorithms they are given. If you give a computer a poor algorithm, you will get a poor result – hence the phrase:
‘Garbage in, garbage out’
Algorithms are used for many different things including calculations, data processing and automation.
Representing an algorithm
- Two simple ways that algorithms can be represented –
- Pseudocode
- Flowcharts
Most programs are developed using programming languages. These languages have specific syntax that must be used so that the program will run properly. Pseudocode is not a programming language; it is a simple way of describing a set of instructions that does not have to use specific syntax.
Pseudo Code (resembling) (programming language) - Writing in pseudocode is similar to writing in a programming language.
- Each step of the algorithm is written on a line of its own in sequence
- No real rules as it does not need to be interpreted by a computer
- In the presentations you’ll see a mix of pseudocode and Java implementations
- Pseudocode examples often conform closely to Java syntax, but typically contain one or more lines of higher-level description.
- Java implementations have actually been compiled and tested. You can tell which examples represent Java in the presentations as these are shown with coloured syntax highlighting
Pseudocode is used to describe an algorithm which is then implemented using Java.
In pseudocode, INPUT asks a question. OUTPUT prints a message on screen.
A simple program could be created to take a temperature value and print a message to indicate whether this is above or below freezing. This program represented in pseudocode might look like this:
OUTPUT 'Please enter the temperature' INPUT user inputs the temperature STORE the user's input in the temperature variable IF temperature < 0 THEN OUTPUT 'Below freezing!' ELSE OUTPUT 'Above freezing!'
-
Flowcharts
A flowchart is a diagram that represents a set of instructions.
Flowcharts normally use standard symbols to represent the different instructions. There are few real rules about the level of detail needed in a flowchart. Sometimes flowcharts are broken down into many steps to provide a lot of detail about exactly what is happening. Sometimes they are simplified so that a number of steps occur in just one step.
Flowchart Symbols Name Symbol Usage Start or Stop Beginning and end points in a sequence Process An instruction or command Decision A decision, yes or no Input or Output Input – data received by the computer, Output – data sent by the computer Connector Connects the symbols, showing the direction of flow -
Complexity of Algorithms
We are often interested in how complex an algorithm is – in other words, how long it will take to run
Here is a simple algorithm, in pseudocode, that counts the number of people, N, in a room:
LET N = 0 FOR each person in room SET N = N+1
Q: How long will this take?
A: Depends how big N isHere is another algorithm which also counts the number of people, N, but does it by by counting pairs of people:
LET N = 0 FOR each pair of people in room SET N = N+2
Q: How long will this take?
A: Depends how big N is
Q: Will it take as long as the first version?
A: No
Q: Does it give the correct answer?
A: Not always? Why not? -
Evaluating Solutions
Before an algorithm can be programmed, it is important to make sure that it properly satisfies the problem, and that it does so efficiently.
Think back to previous counting pairs example, although quicker to get the answer it wouldn’t be the right answer all the time
What would you prefer: code that runs quickly or code that gives you the right answer ?
Evaluation is the process that allows us to make sure our solution does the job it has been designed to do. Then we can think about how it could be improved.
- Once written, an algorithm should be checked to make sure it:
- is easily understood – is it fully decomposed?
- is complete – does it solve every aspect of the problem?
- is efficient – does it solve the problem, making best use of the available resources (eg as quickly as possible/using least space)?
- meets any design criteria we have been given
If an algorithm meets these four criteria it is likely to work well. The algorithm can then be programmed. Failure to evaluate can make it difficult to write a program.
Evaluation helps to make sure that as few difficulties as possible are faced when programming the solution.
Dry runs
One of the best ways to test a solution is to perform what’s known as a ‘dry run’. With pen and paper, work through the algorithm and trace a path through it. A dry run (or a practice run) is a testing process where the effects of a possible failure are intentionally mitigated.
Dry runs are also used with completed programs.
Programmers use dry runs to help find errors in their program code.
For example, we have looked at a simple algorithm was created to ask for a temperature value and report whether this is below or above freezing. You could try out this algorithm – give it a dry run. Try two temperatures, 1 and -1. When using temperature 1, where does the algorithm go? Does it give the right output? If you use temperature -1, does it take you down a different path? Does it still give the correct output?
If the dry run doesn’t give the right answer, there is something wrong that needs fixing. Recording the path through the algorithm will help show where the error occurs.
Always have test cases for every boundary in your code.
Test Instruction Output temperature Test 1 Output 'Please enter the temperature' Please enter the temperature User inputs the temperature Store the user's input in the temperature variable -1 Check whether temperature < 0 Output message Below freezing Stop Test 2 Output 'Please enter the temperature' Please enter the temperature User inputs the temperature Store the user's input in the temperature variable 1 Check whether temperature < 0 Output message Above freezing Stop Test 3 Output 'Please enter the temperature' Please enter the temperature User inputs the temperature Store the user's input in the temperature variable 0 Check whether temperature < 0 Output message Above freezing Stop Always test around the Boundary values of all conditions . In the example dry run above, we chose a value of temperature just above 0, one just below, and 0 itself. Is the output what you would expect for a temperature of 0? Does this suggest an improvement that could be made to the design of the algorithm.
-
Why are Data Structures and Algorithms Important
Programmers need an understanding of how to assess costs and benefits to be able to adapt to new design challenges.
Data structures provide a way to organise the information that programs need to store to allow them to perform their required functions. Programmers need to know how to select the best data structure for a particular application, and to implement it if it doesn't already exist.
Algorithms provide ways for programs to perform tasks. Programmers need to know how to design an algorithm to accomplish a particular task, which may include making use of well-known algorithms for common operations.
Data structures and algorithms are often studied together as they are closely related. The operations of a data structure implement algorithms, and the way these work depend on the way the data structure stores information. Knowledge of data structure and algorithms can allow you to improve the performance and efficiency of your programs.
- This requires an understanding of
- The principles of algorithm analysis
- How to weigh up tradeoffs
- e.g. it is quite common to reduce time requirements at the expense of an increase in space requirements, or vice versa – common in many phases of software design and implementation
- When and when not to reinvent common practice
- e.g. use of certain well known algorithms or design patterns