Week 4: Trees! (including Binary Search Trees and Tries)

Core Trees

  • Trees can naturally represent data in the real world eg family tree, decision tree etc.
  • Organisation of tree can define type of tree:
  • Root most important -> heap
  • Organised by char frequency -> Huffman Tree
  • Organised by node ordering -> search trees

Core: Defining Trees

  • Immutable properties of tree:
  • Must have only 1 parent node (node with no other parents).
  • Child are any nodes that aren't the parent.
  • Leaf nodes are any nodes that don't have children.
  • No cycles in a tree.
  • Each node can only have one (or less if parent) parents.

Core: Binary Trees

  • Generic Tree: a parent can have any number of children.
  • Binary Tree: a parent can have at most 2 children.
  • Each nodes need: a value, a parent, a left child (could be null) and a right child (could be null).
  • Example code:

    public class BinaryTree { TreeNode root; }

    private class TreeNode { private E value; private TreeNode parent; private TreeNode left; private TreeNode right;

    public TreeNode(E val, TreeNode<E> par) {
      this.value = val;
      this.parent = par;
      this.left = null;
      this.right = null;
    }
    
    public TreeNode<E> addLeftChild(E val) {
      this.left = new TreeNode<E>(val, this);
      return this.left;
    }
    

    }

Core: Pre-Order Traversals

  • Pre-order traversal (aka depth first search):
  • Visit yourself
  • Visit all of your left subtree.
  • Visit all of your right subtree.
      private void preOrder(TreeNode<E> node) {
        if (node != null) {
          node.visit();
          preOrder(node.getLeftChild());
          preOrder(node.getRightChild());
        }
      }
      public void preOrder() {
        preOrder(root);
      }
    

Core: Post-Order, In-Order and Level-Order Traversals

  • Post-order traversal:
  • Visit your left subtree.
  • Visit your right subtree.
  • Visit yourself.

  • In-order traversal:

  • Visit left subtree.
  • Visit yourself.
  • Visit your right subtree.

  • Level-order traversal (breadth-first traversal):

  • Keep a "to visit" list.
  • When you visit a node, place its child on the list.
  • Pull a node off the list, and place its child on the list and so on.
      public class BinaryTree<E> {
        TreeNode<E> root;
    
        public void levelOrder() {
          Queue< TreeNode<E> > q = new LinkedList< TreeNode<E> >();
          q.add(root);
          while(!q.isEmpty()) {
            TreeNode<E> curr = q.remove();
            if (curr != null) {
              curr.visit();
              q.add(curr.getLeftChild());
              q.add(curr.getRightChild());
            }
          }
        }
      }
    

Core: Introduction to Binary Search Trees

  • Binary search
  • Requires sorted array of items.
  • Start at middle, if middle is more than what you want, go for right-half, otherwise left.
  • Slow to insert into array.
  • Binary search tree
  • Get O(log n) search
  • Get (O(1)) insertion.
  • Max 2 children per node (same as binary tree).
  • Left subtrees must be less than parent.
  • Right subtree must be greater than parent.
  • Searching in a BST: start at root. Are you at the node? If not, throw away one half of the tree and start again.

Run Time Analysis of BSTs

Core: Performance of BSTs and Balancing, Part 1

  • BSTs structure depend on insertion order.
  • Algorithm for isDictionaryWord(String wordToFind)
  • Start at root.
  • Compare word to current node.
    • If null, return false.
    • If less than current node, search left subtree.
    • If greater, search right subtree.
    • If equal, return true.
  • Best case: O(1) - word is at root node.
  • Worst case: O(n) - all nodes are less than root node, so you have to iterate through every node.
  • Note: the data structure can be setup such that this situation doesn't happen.

Core: Performance of BSTs and Balancing, Part 2

  • Consider maximum distance until leaf: unbalanced trees may have really long paths on one side.
  • Enter: Balanced BSTs.
  • Balanced BSTs: height on one side should be no more than 1 more than the other side.
  • abs(left height - right height) <= 1
  • height = log(n)
  • When using a BST, worst case search is O(log n).

Tries

Core: Introduction to Tries

  • Takes advantage of the structure of the keys it stores.
  • Comes from word "reTRIEval".
  • Tries can have more than 2 children at each node (depends on size of the key alphabet).
  • Find word by walking through the tree, a character at a time:
  • Finding hey:
    1. Start at root. Look for path for h. Follow node.
    2. Find path for e at next node.
    3. Find path for y at next node.
    4. Next node either has hey or it's missing from trie.

Core: Performance of Tries

  • Run time of finding that a word is not in the dictionary in balanced BST: O(log n)
  • Run time of the same in Trie depends on length of word you're looking up.

Core: Implementing a Trie

  • Node keeps track of whether it stores a word and what the word is:

    class TrieNode { boolean isWord; HashMap children; String text; }

  • Actual Trie structure should store the root node and some methods to help with insertion, deletion etc.

  • Autocomplete algorithm:
  • Start at root node.
  • Perform a level-order (aka breadth first search) tree traversal down tree until you find node or not.