• Introduction

    A tree is a hierarchical structure with its data (nodes) arranged according to some ordering property.

    Searching linearly in a list is O(n) and can be expensive for a large data collection. However a binary search for sorted collection is of the order O(log2 n) providing improved performance for large collections of data, for example, finding an item from billion items arranged in sorted order requires at most only 30 comparisons. Similarly, binary search tree (BST) organizes the data such that most operations are O(log2 n) on average.

    This week we will:
    Introduce the Tree ADT and its usefulness for solving problems
    Learn about tree insertion and deletion strategies
    Learn about tree traversals
    Understand the difference between binary trees, binary search trees and general trees

    Trees can have different types:
    General trees – each node can have any number of children
    Binary trees – each node has at most two children
    Binary search tree (BST)– the values are ordered
  • Definitions

    A tree is a collection of nodes.
    the collection can be empty; otherwise,
    it consists of a special node, called the root, and zero or more subtrees

    Tree Example 1

    Node - an element of a tree
    Root node - first node in a tree structure
    level - The root node is at level 0. The children of root node are at level 1 and so on.
    Parent node - a node which is predecessor of another node.
    Child node - A node which is a descendant of another node. A node can have only one parent and children of the same parent are called siblings.
    Leaf - Nodes without children are called leaves or external nodes. Nodes that have one or more children are known as internal nodes.
    Subtree - Every node of a tree is the root of a subtree.
    An edge is a connection between two nodes
    A path between two nodes is defined as a sequence of nodes from one node to the other.. The length of this path is the number of edges on the path.
    Depth of a node is the length of the unique path from the root to that node. . Thus, the root is at depth 0.
    Height of a node is the length of the longest path from that node to a leaf. Height of tree is equal to the height of the root node. All leaves are at height 0.
    The depth of a tree is equal to the depth of the deepest leaf; this is always equal to the height of the tree.
    Diagram of a tree structure is highly recursive. Therefore, as you would expect, the algorithms that manipulate trees are also recursive in nature.
  • ADT Operations

    create(root)		// add root node to tree
    isEmpty( )		// check for empty tree
    insert (root, node)	// inserts new node
    delete (root, node)	//remove a node
    traverse(root)		// visit all nodes in tree
    retrieve(root, node)	// returns pointer to a subtree
    search (root, node)	//find a node
    

  • Binary Tree

    A binary tree is an ordered tree with the following properties:
    every node has at most two children
    each child node is either a left child or a right child
    a left child precedes a right child in the order of children nodes

    A tree is ordered if there is some meaningful linear order among the children of each node.

    An example of a binary tree (i.e. one in which each node has no more than two children)

    Tree Example 2

    This tree also has the property that each node is greater than either of its children.

    Consider the hierarchical ordering for the following tree. In this tree, each node is greater than all of the nodes in its left subtree and less than all of the nodes in its right subtree. This is called its ordering property and defines a BST.

    Tree Example 3

    Trees representing arithmetic expressions

    For example: (A*B)+(C-(D*E))

    Tree Example 4

    Reverse Polish Notation

    Recall that expressions are easily evaluated using a stack if they are available in postfix form: AB*CDE*-+

    Trees can be traversed in such a way as to produce the postfix form, e.g. a post order traversal yields the postfix form of the expression.

    Decision trees

    Decision trees can be used to represent a number of different outcomes that result from answering a series of yes-no questions

    Decision Tree Example

    Huffman trees

    Huffman codes compress data and are used in JPEG still image compression. A Huffman tree is a binary tree used to represent Huffman codes for text/characters that might appear in a message /file.

    Huffman Code Example

    (For more details see Section 16.3, Introduction to Algorithms)

  • Binary Tree Representation

    A binary tree could be represented using:
    1. Linked representation
    2. Arrays representation

    Linked Representation

    A tree node could be represented with a data and pointer fields for left and right subtree.

    Linked Representation Example

    For example, consider the following tree

    Tree Example 5

    Tradeoffs

    1. Faster updates (insertions and deletions are O(log2n) )
    2. Extra memory for pointers

    Array representation e.g. Depth (d)=3

    1. Allocate an array of size 2(d + 1) -1
    2. Store root in location 1
    3. For node in location n,
    store left child in location 2n,
    right child in location 2n+1


    1 +
    2 *
    3 -
    4 A
    5 B
    6 C
    7 *
    8
    9
    10
    11
    12
    13
    14 D
    15 E

    Tradeoffs

    1. Fast access (given a node, its children and parent can be found very quickly)
    2. Slow updates (insertions and deletions require physical reordering)
    3. Wasted space (partially filled trees)
  • Traversing Binary Tree

    A tree traversal visits each node of a tree exactly once.

    Modes of traversal

    When visiting a node we have three choices
    1. Process the data
    2. Process nodes in the left subtree
    3. Process nodes in the right subtree

    Example arithmetic expression tree

    Tree Example 6

    Steps for a Preorder traversal of a binary tree

    1. Process root
    2. Do a preorder traversal of the nodes in the root's left subtree (recursively)
    3. Do a preorder traversal of the nodes in the root's right subtree (recursively)

    In processing a node if we simply write out the data contained in the node, then a preorder traversal of tree T produces +*AB-CD

    Steps for an Inorder traversal of a binary tree

    1. Recursively visit left subtree
    2. Process root
    3. Recursively visit right sub-tree

    An inorder traversal of T produces (A*B)+ (C-D) (the infix form)

    Steps for a Postorder traversal of a binary tree

    1. Recursively visit left subtree
    2. Recursively visit right subtree
    3. Process Root

    A postorder traversal of T produces          AB*CD-+ (the postfix form)

    These traversals can be done on any binary tree, not just expression trees.
  • Inserting into a BST

    In order to insert a new item, we first need to determine the appropriate location where the new item can be added while preserving the ordering property.

    Consider the following tree

    Tree Example 7

    Inorder traversal of this tree gives 19, 21, 23, 25, 49, 77, 83, that is, the inorder traversal visits the nodes in ascending order.

    To find the location to insert a new value, we start at root, and then either go left, if new value is less than root, otherwise go right.

    Example: Adding 10 to the previous tree.

    1. Compare 10 to 25 (the root).
    Since 10 < 25, add 10 to the left subtree of 25
    2. Compare 10 to 21 (the root of the left subtree).
    Since 10 < 21, add 10 to the left subtree of 21
    3. Compare 10 to 19.
    Since 10 < 19, add 10 to the left subtree of 19
    4. As the left subtree is NULL, add 10 at this location. Left pointer of 19 will point to this new node.

    Efficiency of adding to a binary tree

    The efficiency of adding a value to a binary tree is related to the depth of tree. A tree with only one node, that is, the root node is considered to have a depth 0. For the tree considered above, a total of 3 (depth + 1) comparisons were performed.

    Clearly the number of comparisons is determined by the shape/depth of the tree.

    Differing tree configurations

    For example, consider the tree below containing the same data but in a different arrangement.

    Tree Example 8

    The number of comparisons to add 10 would be 5 (depth + 1)

    If the tree is full, or balanced then the depth would be minimum and so would be the comparisons.

    A tree of height h is a full tree if all leaves are at level h. A balanced binary tree is one in which every node above level depth-1 has exactly two children.

  • Deleting from a BST

    Deleting from a tree

    The tree after deletion must preserve the ordering property, i.e., it must remain a BST. Consider the following tree.

    Tree Example 9

    Deletion Strategies

    There are three cases to be considered:

    1. Node to be deleted has no children, that is, a leaf
    2. Node to be deleted has either a left or right subtree
    3. Node to be deleted has both a left and right subtree

    Case 1: Node to be deleted has no children, that is, a leaf
    Example: deleting node 83

    Set the right subtree of 77 to null. The resulting tree is given below.

    Tree Example 10

    Case 2: Node to be deleted has either a left or right subtree
    Example: deleting node 19

    To delete 19, we replace 19 by its left subtree (node 10). The resulting tree is given below.

    Tree Example 11

    Case 3: Node to be deleted has both a left and right subtree
    Example: deleting node 25

    To delete 25, we find its inorder successor (49), that is, the smallest node in the right subtree and copy its value to node 25. This ensures that the ordering property is preserved. The resulting tree is given below.

    Tree Example 12

  • General Trees

    So far we have been studying binary trees, general trees however can have more than two children for each node.

    Tree Example 13

    If we convert a general tree to a binary tree then we can apply what we have learned for binary trees to it.

    Converting a general tree to a binary tree

    This could be achieved by:

    1. Retain edge (connection) from a parent node to only its left-most child at next level
    2. Insert an edge from left-most child (selected above) to connect all siblings from left to right at the same level

    Imagine the nodes have a layout

    General Tree to Binary Tree Example

    Binary version of a general tree

    The general tree shown above can be converted to the following binary version.

    General Tree to Binary Tree Example 2

    This can be represented as

    Tree Example 14

    Preorder Traversal

    (Visit Root, Left, Right)
    A B G H C D E M V F

    Processing order

    1. Process root

    2. Recursively traverse the left-most subtree

    3. Recursively traverse the next subtree to the right

    4. Repeat step 3 until there are no more subtrees

    Postorder Traversal

    (Left, Right, Visit Root)
    H, G, V, M, F, E, D, C, B, A

    Inorder Traversal

    (Left, Visit Root, Right)

    This is left as an exercise.
  • Another Example of General Tree Traversal

    Sample tree to illustrate tree traversal

    Tree Example 15

    Tree after four nodes visited in preorder traversal

    Tree Example 16

    Tree after left subtree visited using preorder traversal

    Tree Example 17

    Tree after completed preorder traversal

    Tree Example 17

    Tree visited using inorder traversal

    Tree Example 18

    Tree visited using postorder traversal

    Tree Example 19