Monday, 12 May 2014

What's Up With Selection Sorts?!

Selection sorts are my favourite sorts because they're intuitive and I think most people when they sort a list are performing selection sorts without realising it. The idea is to scan through the list and get the smallest number then bring that to the front, then you look at the list starting from the index of 1 (you started at an index of 0) and bring the smallest number to that index. You repeat the process until you get to the end of the list. Now we don't want to bother making a new list to store the sorted elements in which means we'll need to be swapping elements within the list itself. This just means that the value (call it X) at the current index which shouldn't be there will be replaced by a value (Y) that should be there and X just goes into a different place where it probably shouldn't be anyway, but it will be put in the right place eventually.

If we use the list: 5  3  8  4  8  9  2 where list[0]=5
set 'index' to 0 (referring the the current index of the 5)
set indexOfMin to 0
starting at 0 and ending at the end of the list, compare each element to list[minimum] and if the value is smaller, set minimum to the index of that element.
swap minimum and index
increment index by 1

here's what it will look like: (the in-brackets element is the one being compared and the underlined number is the smallest number yet to be put in place, also keep an eye on the 8's)
[5] 3  8  4  8  9  2
 2 [3] 8  4  8  9  5
 2  3 [8] 4  8  9  5
 2  3  4 [88  9  5
 2  3  4  5 [8] 9  8
 2  3  4  5  8 [9] 8
 2  3  4  5  8  8 [9]

That was very efficient, but I want you to notice something: the order of the 8's changed because the purple 8 was swapped with the 5 which meant it ended up to the right of the blue 8. When two elements are equal but their order can be changed by a sorting algorithm, that algorithm is considered unstable. That doesn't mean it's particularly bad but it can be detrimental in some cases. For example if you're in iTunes (which sucks by the way and I have no idea why I continue to use it) and you've sorted your songs by song and then sort them by name, it means that for each block of songs belonging to a particular album, the names are going to still be sorted because their order with respect to eachother wasn't compromised. According to the sorting algorithm, all those songs belonging to the same album were equal but the algorithm didn't change their order amongst themselves. It would be nice to have an algorithm which treats the blue and purple 8s as equal (which the selection sort does) but does not change their relative positions for no good reason.

Another disadvantage of selection sorts is that they are not incremental. This means that if you were to add another element onto the end of the list after the algorithm is already done sorting the elements, the algorithm wouldn't easily handle sort that number into the list. Let's see what happens if we add a 1 to the end of the list we ended up with above:

[2] 3  4  5  8  8  9  1
 1 [3] 4  5  8  8  9  2
 1  2 [4] 5  8  8  9  3
 1  2  3 [5] 8  8  9  4
...and so on until every number gets shifted to the right by 1.

The only way for a selection sort to work well is for it to start off with an unsorted list and sort the whole thing without needing to consider changes down the line because the sorted part that grows from the left of the list is not supposed to change, it is supposed to be set in stone. Next up, insertion sorts.

No comments:

Post a Comment