Tuesday, 13 May 2014

What's Up With Insertion Sorts?!

Insertion sorts are a little bit like bubble sorts and a lot like selection sorts. In selection sorts you had a growing sublist of elements which were chosen from the entire list and thus set in stone in their current positions. In insertion sorts you still have a growing sublist but the places of its elements are not set in stone. Instead of scanning the whole list for the next smallest element to place in the sorted sublist, the insertion sort only cares about the next element outside the sublist (we'll call this the 'current' element). It gets that element, puts it to one side (a temporary storage) and then looks at the sublist. Going from right-to-left, if an element is larger than the current element it gets shifted to the right and as soon as an element which is larger than current element shows up, you can insert the current element into the sublist. So the sublist will always be sorted but it won't be in its final form until the very end when all the elements of the list have been considered.


We'll use list=[4,2,5,9,1,9,7] as an example.
Clearly the first element list[0] is already a sorted list if it's the only one in it so we can start by looking at list[1], and we want to stop looking once we've looked at all the elements.

for i <-- 1 to length(list) - 1 //it's length(list)-1 because length(list) is 7 but the largest index is 6 because the                                             //list is zero-based
    temp <-- list[i]
    j <-- i-1
    while j>=0 and list[j] > temp // here we're going right-to-left from the current element to the beginning of                                                   //the sublist
            list[j+1] <-- list[j] // the element gets shifted to the right because it's bigger

     list[j+1] <-- temp // the current element gets inserted to the right of the element in the sublist which is                                         //smaller or equal to it (which is the reason the while loop broke)

the red indicates the sorted sublist, and each line is a new iteration.
4  2  5  9  1  9  7
2  4  5  9  1  9  7
2  4  5  9  1  9  7
2  4  5  9  1  9  7
1  2  4  5  9  9  7
1  2  4  5  9  9  7 // the 9's haven't been swapped here
1  2  4  5  7  9  9 // alright this is the final form

One major setback of the insertion sort is requiring a heap of numbers to shift in order for one small number to be inserted to the far left of the sublist like for example when 1 was moved from an index of 4 to 0, the 2, 4, 5, and 9 all ended up being shifted to the right by 1.

Notice that if the element to the left of the empty slot is equal to the current element, the current element gets placed in the empty slot and so never overtakes its equals. This means that unlike the selection sort, the insertion sort is stable. It's also incremental because all it does is take new elements outside the sublist and adds them to it meaning if you're done sorting and a new element gets put onto the end of your list the algorithm doesn't lose any efficiency in sorting the new list because that's what it's been doing the whole time anyway. You can refer to the post on the selection sort for an explanation of stability and incremental...ness (wait nope it's incrementality all good). In conclusion, the insertion sort does a lot of comparisons within the growing sublist whilst not caring much about the rest of the list. It also doesn't need to do any swaps (unless you want to get rid of the temporary storage and make the algorithm work in-place) and instead just slides and inserts elements. This is in comparison to the selection sort who cares a lot about the unsorted list and doesn't need to worry at all about the sublist because it's absolutely sorted in terms of the whole list.

No comments:

Post a Comment