Challenge 1: Augmented AVL Trees
- 5 minsIn order to follow along with this problem and my solution, the reader should be familiar with both Binary Search Trees and AVL Trees.
Problem
Consider an abstract data type that consists of a set of integers, \(S\), on which the following operations can be performed:
-
Add(\(i\)) : Adds the integer \(i\) to \(S\). If this integer is already in \(S\), then \(S\) does not change
-
Average(\(t\)) : Returns the average of all elements of \(S\) that are less than or equal to the integer \(t\). If all the elements of \(S\) are greater than \(t\), then return 0.
Describe how to implement this abstract data type using an augmented AVL tree.
Each operation should run in \(O(log_2(n))\) worst-case time, where \(n = |S|\).
My Solution
Consider a standard, self-balancing AVL BST, with each node \(x\) modified to store the following values:
-
Let each integer \(i \in S\) serve as the key for node \(x\) in the AVL tree.
-
Let \(x\text{.count}\) equal the number of nodes in the subtree rooted at \(x\) whose keys are less than or equal to \(x \text{.key}\) (\(x\) included).
-
Let \(x \text{.keysum}\) equal the sum of the keys of the nodes in the subtree rooted at \(x\) that are less than or equal to \(x \text{.key}\) (\(x\) included).
Note that for a node \(z\) with no children, we have \(z \text{.count} = 1\) and \(z \text{.keysum} = z \text{.key}\).
With these modifications in place, the two specified operations can be performed in \(O(log_2(n))\) worst-case time, in the following ways.
Add(i):
-
Perform BST search on the AVL tree to discern whether \(i\) is already contained in the set. Continue only if \(i\) is not already contained in the set.
-
Create node \(x\) with key \(i\).
-
Insert \(x\) as in the same manner as Binary Search Tree insert, and after insertion, go up all of \(x\)’s ancestor nodes, and for each ancestor node whose key is greater than or equal to \(x \text{.key}\), add \(x \text{.key}\) to the keysum of the ancestor and increment the ancestor’s count attribute by 1.
-
Do AVL rebalancing rotations, except:
- Whenever left subtree rooted at \(y\) needs to be disconnected from node \(x\) during rebalancing rotations, perform subroutine disconnect_left_subtree(\(x\)).
- Whenever subtree rooted at \(y\) must become the new left child of node \(z\) during rebalancing rotations, perform subroutine connect_left_subtree(y, z).
Note that the while loops in both connect_left_subtree and disconnect_left_subtree iterate at most once up the height of the AVL tree, which for an AVL tree with \(n\) nodes is \(\in O(log_2(n))\).
Because this is the same as the standard upper bound on the worst case running time of the unmodified Add(i) AVL operation, we can conclude that the upper bound on the worst case running time of the modifiedAdd(i) AVL operation is also \(O(log_2(n))\).
Average(t):
-
Find the node \(x\) in the AVL BST whose key is the largest integer in \(S\) that is less than or equal to \(t\). Return 0 if no keys in \(S\) are less than \(t\).
-
Begin tallying a cumulitive keysum and count by including \(x\)’s keysum and count.
-
Go up the ancestor nodes of \(x\) until the tree’s root is reached, and for each ancestor node whose key is less than or equal to \(x \text{.key}\), add that ancestor’s keysum and count to the cumulitive keysum and count.
-
When the AVL tree’s root is reached, divide keysum by count and return the result as the average.
Average\((t) \in O(log_2(n))\) because the number of constant time operations is limited by the number of nodes between \(x\) and the root of the tree, which for an AVL tree with \(n\) nodes is limited above by \(\text{floor}(log_2(n))\).