Well, it's that time again. I'm glad to see that you've all survived the broiling maelstrom that was March. I would love to make small-talk about the weather, but I'm driven by a purpose today.
This week we've been focusing on sorting and efficiency. Sorting is an interesting problem because it has a huge number of solutions despite its apparent simplicity. The simple algorithms that we learned about in 108 (bubble sort, selection sort, insertion sort) can perform decently well when the list is close to sorted (insertion sort and bubble sort especially), but their running time grows quadratically, O(n²), in the average and worst cases. This is because these algorithms perform the same redundant operations over and over again. Bubble sort, for instance, works by comparing adjacent elements and swapping them if they are in the wrong order. Even on small lists, this means the algorithm will make several passes and repeat the same comparisons several times. The larger the list, the more redundant comparisons are made and the worse the performance.
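To make that redundancy concrete, here's a minimal bubble sort sketch in Python (the early-exit `swapped` flag is my own small addition, not part of the basic algorithm):

```python
def bubble_sort(items):
    """Repeatedly compare adjacent elements and swap them if out of order."""
    n = len(items)
    for i in range(n - 1):
        swapped = False
        # After each pass the largest unsorted element has "bubbled" to the end,
        # but every earlier pair gets compared yet again -- the redundancy at issue.
        for j in range(n - 1 - i):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                swapped = True
        if not swapped:  # a full pass with no swaps means the list is sorted
            break
    return items
```

On a nearly sorted list the early exit fires quickly, which is why bubble sort looks fine there; on a shuffled list the nested loops do roughly n²/2 comparisons.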
More enlightened algorithms such as quicksort, heap sort, and merge sort use different methods to avoid this quadratic growth when sorting large lists. Merge sort, for example, uses a divide-and-conquer approach, wherein it starts from very small (one-element) sorted sublists that are then merged. This is one method of circumventing redundant comparisons. The result is a good O(n log n) time complexity for both the average and worst case. Quicksort has the same complexity in the average case, but something interesting happens in the worst case. Quicksort operates by choosing pivot values and putting smaller values to the left of the pivot and greater values to the right. In many cases this proves to be faster than other O(n log n) algorithms, but when the pivot is chosen naively (say, always the first element), a reverse-ordered list makes quicksort perform very poorly. In that case almost every element will be a pivot at some point and be compared to every other element, resulting in a poor O(n²) complexity.
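Here's a rough merge sort sketch in Python, just to illustrate the divide-and-conquer idea described above (this is the standard textbook top-down version, not any particular library's implementation):

```python
def merge_sort(items):
    """Split the list down to one-element sublists, then merge sorted halves."""
    if len(items) <= 1:  # a one-element (or empty) list is already sorted
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    # Merge step: every comparison settles one output slot, so none is wasted.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of these two
    merged.extend(right[j:])  # still has elements left
    return merged
```

There are log n levels of splitting and each level does at most n comparisons during merging, which is where the O(n log n) bound comes from.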
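And a quicksort sketch with a deliberately naive first-element pivot, the choice that degrades to O(n²) on reverse-ordered (or already sorted) input:

```python
def quicksort(items):
    """Partition around a pivot, then recursively sort each side."""
    if len(items) <= 1:
        return items
    pivot = items[0]  # naive choice: fine on average, bad on sorted/reversed lists
    smaller = [x for x in items[1:] if x < pivot]
    greater = [x for x in items[1:] if x >= pivot]
    # On a reverse-ordered list, every remaining element lands in `smaller`,
    # so each level shrinks the problem by only one element:
    # n levels of up to n comparisons, hence O(n^2).
    return quicksort(smaller) + [pivot] + quicksort(greater)
```

Picking the pivot randomly (or as the median of three samples) avoids this pathological case in practice, which is why real implementations don't use the first element.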
Anyway,
this whole efficiency thing has made me reconsider some of my programming
habits. While my “if it hurts, it works” way of doing things has been effective
enough thus far, I imagine in the future I will have to focus more on how my program works, rather than just
acknowledging that it works. So far I’ve been satisfied with achieving the
right results, but I know that in the real world the efficiency with which my
program operates will determine my degree of success. Or perhaps improvements
in processing power and memory will be so great by the completion of my degree
that efficient code will be obsolete, an artifact of a more austere past. Perhaps we will even have code that writes itself… sloppy, inefficient programs making sloppier, more inefficient programs. It's truly a utopian dream.