Top k Largest Element and kth Largest Element Problem
This article analysis the top k largest/smallest element problem and the kth largest element problem. One of the method is based on QuickSelect and we will discuss the essential of this method then use it to solve extended problems.
All the algorithms to find top k elements introduced in this article also works for the kth largest problem. Therefore, we only focus on the top k largest problem from now. The problem of finding the kth largest/smallest number in a list or array is also called selection algorithm in computer science; such a number is called the kth order statistic. There are some statistic analysis on CLRS textbook so please refer to the textbook for more discussion.
1. The classic problem: find top k Largest elements
Question: Write an efficient program for printing k largest elements in an array. Elements in array can be in any order.
For example, if given array is [1, 23, 12, 9, 30, 2, 50] and you are asked for the largest 3 elements i.e., k = 3 then your program should print 50, 30 and 23.
Method 1: Nested loop (Bubble k times)
Loop k times. For the ith time, the ith largest element is selected. Such process is similar to Bubble Sort. Some other sorting algorithms like Selection Sort can also be modified to get the k largest elements.
Procedure:
- Modify Bubble Sort to run the outer loop at most k times.
- Print the last k elements of the array obtained in step 1.
Time Complexity: O(nk)
Method 2: Sort the Array
A very simple solution is to sort the given array using a O(nlogn) sorting algorithm like Merge Sort, Heap Sort, etc and return the element at index k-1 in the sorted array.
Procedure
- Sort the elements in descending order in O(n logn)
- Print the first k numbers of the sorted array O(k).
Time complexity: O(nlogn)
Method 3: Max Heap
- Build a Max Heap tree in O(n)
- Use Extract Max k times to get k maximum elements from the Max Heap O(klogn)
Time complexity: O(n + klogn)
Method 4: Min Heap
A better solution than Method 3. We only need to maintain k candidates as the largest elements. Every time when we see a number n that is larger than the smallest element m among the candidates, we remove m and insert n. m should be in top of the heap so we will quickly get it and remove when necessary.
Procedure
- Build a Min Heap MH of the first k elements (arr[0] to arr[k-1]) of the given array. Time complexity: O(k)
- For each element, after the kth element (arr[k] to arr[n-1]), compare it with root of MH.
a) If the element is greater than the root then make it root and call heapify for MH
b) Else ignore it. Time complexity: O((n-k)*logk) - Finally, MH has k largest elements and root of the MH is the kth largest element.
Time Complexity: O(k + (n-k)Logk) without sorted output. If sorted output is needed then O(k + (n-k)Logk + kLogk)
Method 5: QuickSelect (Oder Statistics)
- Use order statistic algorithm to find the kth largest element. Please see the topic selection in worst-case linear time O(n)
- Use QuickSort Partition algorithm to partition around the kth largest number O(n).
- Sort the k-1 elements (elements greater than the kth largest element) O(kLogk). This step is needed only if sorted output is required.
Time complexity: O(n) if we don’t need the sorted output, otherwise O(n+kLogk)
Metohd 6: Blum-Floyd-Pratt-Rivest-Tarjan Algorithm
It is also worth mentioning that the Blum-Floyd-Pratt-Rivest-Tarjan algorithm that has a guaranteed O(n) running time.
2. Follow up problems
The essential idea of QuickSelect
In QuickSelect, the partition procedure group a list (ranging from indices left to right) into two parts, those less than a certain element, and those greater than or equal to the element. After each partition, the unnecessary parts of the array is discarded. Therefore the performance of the algorithm is improved. We may also use this idea in other scenarios.
For example, in Leetcode 230. Kth Smallest Element in a BST, we may also use similar idea. Here is the solution:
1 | public int kthSmallest(TreeNode root, int k) { |
Kth element in 2-Dimension
See Leetcode 378. Kth Smallest Element in a Sorted Matrix. In this problem we want to find the kth smallest element in a matrix where each of the rows and columns are sorted in ascending order.
3. The kth Largest Element Problem
All of the above methods can also be used to find the kth largest (or smallest) element.
Example code using QuickSelect:
LeetCode 215. Kth Largest Element in an Array
Find the kth largest element in an unsorted array. Note that it is the kth largest element in the sorted order, not the kth distinct element.
For example,
Given [3,2,1,5,6,4] and k = 2, return 5.
Note:
You may assume k is always valid, 1 ≤ k ≤ array’s length.
1 | public class Solution { |
In the above code, Hoare partition is used. We may also use Lomuto partition by replacing the partition function using the following code:
1 | private int partition(int[] nums, int left, int right) { |
Reference
https://en.wikipedia.org/wiki/Selection_algorithm
http://www.geeksforgeeks.org/kth-smallestlargest-element-unsorted-array/
http://www.ardendertat.com/2011/05/30/my-favorite-interview-question/