Notation:
O(x) = worst-case running time
Ω(x) = best-case running time
Θ(x) = best and worst case are the same
Page numbers refer to the Preiss textbook Data Structures and Algorithms with Object-Oriented Design Patterns in Java.
This page was created with some references to Paul's spiffy sorting algorithms page, which can be found here. Most of the images (scans of the textbook, except the code samples) were gratefully taken from that site.
| Sorting Algorithm | Page | Implementation Summary | Comments | Type | Stable? | Asymptotic Complexities |
|---|---|---|---|---|---|---|
| Straight Insertion | 495 | On each pass the current item is inserted into the sorted section of the list. Starting at the last position of the sorted section, it moves backwards until it finds the proper place for the current item; the item is inserted there, and every item after it is shuffled up one position to accommodate it. For this reason, if the list is already sorted the sort is O(n), because every element is already in its sorted position. If, however, the list is sorted in reverse, it takes O(n²) time, as each insertion must search through the entire sorted section of the list and shuffle every other element along. (Sketch after the table.) | Good for nearly sorted lists, very bad for out-of-order lists, due to the shuffling. | Insertion | Yes | Best Case: O(n). Worst Case: O(n²) |
| Binary Insertion Sort | 497 | This is an extension of Straight Insertion as above; however, instead of doing a linear search for the correct position each time, it does a binary search, which is O(log n) instead of O(n). The only problem is that it always has to do the binary search, even if the item is already in its correct position, which brings the cost of the best case up to O(n log n). Due to the possibility of having to shuffle all other elements along the list on each pass, the worst-case running time remains O(n²). (Sketch after the table.) | This is better than Straight Insertion if the comparisons are costly, because even though it always has to do log n comparisons, that generally works out to be fewer than a linear search. | Insertion | Yes | Best Case: O(n log n). Worst Case: O(n²) |
| Bubble Sort | 499 | On each pass over the data, adjacent elements are compared and switched if they are out of order, e.g. e1 with e2, then e2 with e3, and so on. This means that on each pass, the largest element left unsorted is "bubbled" to its rightful place at the end of the array. However, because all adjacent out-of-order pairs are swapped along the way, the algorithm could be finished sooner. Preiss claims it will always take O(n²) time because his version keeps sorting even if the list is already in order; as we can see, his algorithm doesn't recognise that. Someone with a bit more knowledge than Preiss will obviously see that you can end the algorithm when a pass makes no swaps, thereby making the best case O(n) (when it is already sorted) with the worst case still O(n²). (Sketch of the improved version after the table.) | In general I believe this is better than Insertion Sort, because it has a good chance of being sorted in much less than O(n²) time, unless you are a blind Preiss follower. | Exchange | Yes | NOTE: Preiss uses a bad algorithm and claims that best and worst case are both O(n²). Using a little bit of insight, we can see that the following is correct for a better bubble sort algorithm (does Peake agree with this?): Best Case: O(n). Worst Case: O(n²) |
| Quicksort | 501 | I strongly recommend looking at the diagram for this one. The code is also useful and provided below (the selectPivot method is included, even though it probably won't help your understanding anyway; a simple sketch also follows the table). Quicksort operates along these lines: first a pivot is selected and removed from the list (hidden at the end). Then the elements are partitioned into two sections, one less than the pivot and one greater, by exchanging values. The pivot is then restored in the middle, and those two sections are recursively quicksorted. | A complicated but effective sorting algorithm. | Exchange | No | Best Case: O(n log n). Worst Case: O(n²). Refer to page 506 for more information about these values. Note: Preiss on page 524 says that the worst case is O(n log n), contradicting page 506, but I believe it is O(n²), as per page 506. |
| Straight Selection Sorting | 511 | This one, although not very efficient, is very simple. Basically, it makes n linear passes over the list, and on each pass it selects the largest value and swaps it with the last unsorted element. (Sketch after the table.) | A very simple algorithm to code, and a very simple one to explain, but a little slow. Note that you can do this using the smallest value instead, swapping it with the first unsorted element. | Selection | No | Unlike Bubble Sort this one is truly Θ(n²), where best case and worst case are the same, because even if the list is sorted, the same number of selections must still be performed. |
| Heap Sort | 513 | This uses a similar idea to Straight Selection Sorting, except that instead of using a linear search for the maximum, a heap is constructed, so the maximum can easily be removed (and the heap reformed) in log n time. This means you do n passes, each doing a log n remove-maximum, so the algorithm always runs in Θ(n log n) time; the original order of the list makes no difference. (Sketch after the table.) | This utilises just about the only good use of heaps: finding the maximum element of a max heap (or the minimum of a min heap). In every way as good as Straight Selection Sort, but faster. | Selection | No | Best and Worst Case: Θ(n log n). |
| 2-Way Merge Sort | 519 | It is fairly simple to take two sorted lists and combine them into another sorted list, simply by comparing the heads of each list and removing the smaller to join the new sorted list. As you may guess, this is an O(n) operation. With 2-way merge sorting, we apply this method to a single unsorted list. In brief, the algorithm recursively splits up the array until it is fragmented into single-element arrays. Each of those single elements is then merged with its pair, those pairs are merged with their pairs, and so on, until the entire list is united in sorted order. Note that if there is ever an odd number of elements, an extra operation is added: the leftover element is added to one of the pairs, so that that particular pair has one more element than most of the others; this has no effect on the actual sorting. (Sketch after the table.) | Now isn't this much easier to understand than Heap Sort? It's really quite intuitive. This one is best explained with the aid of the diagram, and if you haven't already, you should look at it. | Merge | Yes | Best and Worst Case: Θ(n log n) |
| Bucket Sort | 526 | Bucket sort initially creates a "counts" array whose size is the size of the range of all possible values for the data we are sorting; e.g. if all of the values could be between 1 and 100, the array would have 100 elements. Two passes are then done on the list. The first tallies up the occurrences of each number into the "counts" array: for each index of the array, the data it contains signifies the number of times that number occurred in the list. The second and final pass goes through the counts array, regenerating the list in sorted form. So if there were 3 instances of 1, 0 of 2, and 1 of 3, the sorted list would be recreated as 1,1,1,3. The diagram will most likely remove all shadows of doubt in your minds. (Sketch after the table.) | This suffers a limitation that Radix doesn't: if the possible range of your numbers is very high, you would need too many "buckets" and it would be impractical. The other limitation that Radix doesn't have is that stability is not maintained. It does, however, outperform Radix Sort if the possible range is very small. | Distribution | No | Best and Worst Case: Θ(m + n), where m is the number of possible values. Obviously this is O(n) for most values of m, so long as m isn't too large. The reason these distribution sorts break the O(n log n) barrier is that no comparisons are performed! |
| Radix Sort | 528 | This is an extremely spiffy implementation of the bucket sort algorithm. This time, several bucket-like sorts are performed (one for each digit), but instead of the counts array representing the range of all possible values for the data, it represents all of the possible values for each individual digit, which in decimal numbering is only 10. First a bucket sort is performed using only the least significant digit to sort by, then another using the next least significant digit, and so on, until you have done a number of bucket sorts equal to the number of digits in your biggest number. Because each bucket sort has only 10 buckets (the counts array is of size 10), this is always an O(n) sorting algorithm for a fixed number of digits! See below for a Radix example and a sketch. On each of the adapted bucket sorts, the counts array stores the number of occurrences of each digit; the offsets are then created from the counts, and the sorted array is regenerated using the offsets and the original data. | This is the god of sorting algorithms. It will sort the largest list, with the biggest numbers, and has a guaranteed O(n) time complexity. And it ain't very complex to understand or implement. | Distribution | Yes | Best and Worst Case: Θ(n). Bloody awesome! |
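
To make the table concrete, here are some minimal Java sketches of my own; they are not Preiss's code, and the method names are just illustrative. First, Straight Insertion: each pass searches backwards through the sorted section, shuffling items up one place until the current item's proper position is found.

```java
// Straight insertion sort (a sketch, not Preiss's code). Each pass inserts
// a[p] into the sorted section a[0..p-1], shuffling larger items up one
// position. Already-sorted input does no shuffling at all: the O(n) best case.
public static void insertionSort(int[] a) {
    for (int p = 1; p < a.length; p++) {
        int current = a[p];
        int i = p - 1;
        while (i >= 0 && a[i] > current) { // search backwards for the proper place
            a[i + 1] = a[i];               // shuffle this item up one position
            i--;
        }
        a[i + 1] = current;                // insert into the gap
    }
}
```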
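Binary Insertion is the same loop with the linear search swapped for a binary search; note that the shuffling is unchanged, which is why the worst case stays O(n²). A sketch under the same assumptions:

```java
// Binary insertion sort (a sketch). A binary search finds the insertion
// point in O(log n) comparisons; the shuffling still costs up to O(n) per
// pass, so the worst case remains O(n^2).
public static void binaryInsertionSort(int[] a) {
    for (int p = 1; p < a.length; p++) {
        int current = a[p];
        int lo = 0, hi = p;                      // search the sorted section a[0..p-1]
        while (lo < hi) {
            int mid = (lo + hi) / 2;
            if (a[mid] <= current) lo = mid + 1; // insert after equal keys: stable
            else hi = mid;
        }
        for (int i = p; i > lo; i--) {           // shuffle up to make room
            a[i] = a[i - 1];
        }
        a[lo] = current;
    }
}
```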
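The "better bubble sort" argument above is easy to demonstrate: keep a flag, and stop as soon as a pass makes no swaps. A sketch:

```java
// Bubble sort with the early-exit check (a sketch of the improved version,
// not Preiss's). If a full pass makes no swaps, the array is already sorted
// and we stop, giving the O(n) best case; the worst case remains O(n^2).
public static void bubbleSort(int[] a) {
    for (int end = a.length - 1; end > 0; end--) {
        boolean swapped = false;
        for (int i = 0; i < end; i++) {
            if (a[i] > a[i + 1]) {   // adjacent pair out of order: switch them
                int tmp = a[i];
                a[i] = a[i + 1];
                a[i + 1] = tmp;
                swapped = true;
            }
        }
        if (!swapped) {
            return;                  // no swaps on this pass: the list is sorted
        }
    }
}
```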
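A minimal quicksort sketch (not Preiss's AbstractQuickSorter; the pivot selection here is simply the middle element rather than a median-of-three selectPivot):

```java
// Quicksort (a sketch). The pivot is hidden at the end of the range, the
// rest is partitioned into "less than" and "greater than" sections by
// exchanging values, the pivot is restored in the middle, and the two
// sections are recursively quicksorted.
public static void quickSort(int[] a, int lo, int hi) {
    if (lo >= hi) return;
    swap(a, (lo + hi) / 2, hi);     // simple pivot choice: the middle element
    int pivot = a[hi];
    int left = lo;                  // boundary of the "less than" section
    for (int i = lo; i < hi; i++) {
        if (a[i] < pivot) {
            swap(a, i, left++);     // grow the "less than" section
        }
    }
    swap(a, left, hi);              // restore the pivot in the middle
    quickSort(a, lo, left - 1);
    quickSort(a, left + 1, hi);
}

private static void swap(int[] a, int i, int j) {
    int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
}
```

Call it as quickSort(a, 0, a.length - 1).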
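Straight Selection, as a sketch: n passes, each a linear search for the maximum followed by one swap, regardless of the input order, hence Θ(n²) in all cases.

```java
// Straight selection sort (a sketch). Each pass linearly scans for the
// largest unsorted value and swaps it with the last unsorted element.
public static void selectionSort(int[] a) {
    for (int end = a.length - 1; end > 0; end--) {
        int maxIndex = 0;
        for (int i = 1; i <= end; i++) {
            if (a[i] > a[maxIndex]) maxIndex = i; // linear search for the max
        }
        int tmp = a[maxIndex];                    // swap into its final place
        a[maxIndex] = a[end];
        a[end] = tmp;
    }
}
```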
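Heap Sort, sketched here with java.util.PriorityQueue standing in for the book's heap (Preiss builds the heap in place inside the array instead; this version is only meant to show the n remove-maximum operations, each costing O(log n)):

```java
import java.util.Collections;
import java.util.PriorityQueue;

// Heap sort via a max-heap (a sketch, not the in-place textbook version).
public static void heapSort(int[] a) {
    PriorityQueue<Integer> heap = new PriorityQueue<>(Collections.reverseOrder());
    for (int x : a) heap.add(x);        // build the max heap
    for (int i = a.length - 1; i >= 0; i--) {
        a[i] = heap.poll();             // remove-maximum fills the array from the back
    }
}
```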
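2-Way Merge Sort, as a sketch; the first while loop is the O(n) merge of two sorted lists described above:

```java
// 2-way merge sort (a sketch) over the range a[lo..hi-1]. The array is
// recursively split, then the two sorted halves are merged by repeatedly
// taking the smaller head.
public static void mergeSort(int[] a, int lo, int hi) {
    if (hi - lo < 2) return;        // 0 or 1 elements: already sorted
    int mid = (lo + hi) / 2;        // an odd-length range just makes one
    mergeSort(a, lo, mid);          // half a single element longer
    mergeSort(a, mid, hi);
    int[] merged = new int[hi - lo];
    int i = lo, j = mid, k = 0;
    while (i < mid && j < hi) {     // compare the heads of each list
        merged[k++] = (a[i] <= a[j]) ? a[i++] : a[j++]; // <= keeps it stable
    }
    while (i < mid) merged[k++] = a[i++];
    while (j < hi) merged[k++] = a[j++];
    System.arraycopy(merged, 0, a, lo, merged.length);
}
```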
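Bucket Sort, as a sketch for ints in the range [0, m); the two passes (tally, then regenerate) are exactly the ones described in the table:

```java
// Bucket sort (a sketch) for ints in [0, m), where m is the range size.
public static void bucketSort(int[] a, int m) {
    int[] counts = new int[m];
    for (int x : a) {               // first pass: tally occurrences
        counts[x]++;
    }
    int j = 0;                      // second pass: regenerate the list in order
    for (int v = 0; v < m; v++) {
        for (int c = 0; c < counts[v]; c++) {
            a[j++] = v;
        }
    }
}
```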
Radix Sort Example:
First Pass:
Data: 67 50 70 25 93 47 21
Buckets on first pass (least significant digit):
index 0 1 2 3 4 5 6 7 8 9
count 2 1 0 1 0 1 0 2 0 0
offset 0 2 3 3 4 4 5 5 7 7
Data after first pass:
50 70 21 93 25 67 47
That data is created by doing a single pass over the unsorted data, using the offsets to work out where each item belongs.
For example, it looks at the first one 67, then at the offsets for the digit
7, and inserts it into the 5th position. The offset at 7 is then incremented,
so that the next value encountered which has a least significant digit of 7
is placed into the 6th position. Continuing the example, the number 50 would
then be looked at, and inserted into the 0th position, its offset incremented,
so that the next value which is 70 would be inserted into the 1st position,
and so on until the end of the list.
As you can see, this data is sorted by its least significant digit.
Second Pass:
Data after first pass
50 70 21 93 25 67 47
Buckets on second pass (most significant digit):
index 0 1 2 3 4 5 6 7 8 9
count 0 0 2 0 1 1 1 1 0 1
offset 0 0 0 2 2 3 4 5 6 6
Data after second pass (sorted)
21 25 47 50 67 70 93
Look at this diagram for another example, noting that the "offsets" array is unnecessary there.
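
Here is a minimal Java sketch of the radix sort just traced (my own code, not Preiss's): one adapted bucket sort per decimal digit, each building counts, turning them into offsets, and regenerating the data.

```java
// LSD radix sort on non-negative decimal ints (a sketch matching the
// worked example above: counts -> offsets -> regenerate, one pass per digit).
public static void radixSort(int[] a) {
    int max = 0;
    for (int x : a) max = Math.max(max, x);
    int[] out = new int[a.length];
    for (int div = 1; max / div > 0; div *= 10) {      // one pass per digit
        int[] counts = new int[10];
        for (int x : a) counts[(x / div) % 10]++;      // tally each digit
        int[] offsets = new int[10];                   // running prefix sums
        for (int d = 1; d < 10; d++) {
            offsets[d] = offsets[d - 1] + counts[d - 1];
        }
        for (int x : a) {
            out[offsets[(x / div) % 10]++] = x;        // place item, bump its offset
        }
        System.arraycopy(out, 0, a, 0, a.length);      // feed into the next pass
    }
}
```

Scanning the data in order and bumping each offset as it is used is what keeps the sort stable between passes.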
Images:
Straight Insertion - Figure 15.2
Bubble Sorting - Figure 15.3
Quick Sorting - Figure 15.4
AbstractQuickSorter code - Program 15.7
MedianOfThreeQuickSorter class, selectPivot method - Program 15.9
Straight Selection Sorting - Figure 15.5
Building a heap - Figure 15.7
Heap Sorting - Figure 15.8
Two-way merge sorting - Figure 15.10
Bucket Sorting - Figure 15.12
Radix Sorting - Figure 15.13