Saturday 29 March 2014

Week 11 - Sorting and Efficiency

Before we get into sorting, let's do a quick recap of what I said last week about runtime.  In CSC148 we only care about O, not so much about Ω or Θ like we did in CSC165.  O is an upper bound on how an algorithm's runtime grows with the size of its input in the worst case.  Formally, as we learned in CSC165, the definition of O is as follows (apologies if this brings up some scarring memories):

(Taken from the CSC165 notes from last semester)
Essentially the above states that a function g is in O(f) when there exist a constant c and a breakpoint B so that for every input size n greater than that breakpoint, f(n) multiplied by the constant is at least g(n).  In other words, past the breakpoint, a scaled copy of f sits above g.

The most common runtimes are as follows:

  • O(c) where the runtime is some constant positive number regardless of input size, rare like shiny pokemon; example: pop()
  • O(cn) runtime is some constant factor times the length of the input
  • O(cn^2) can also be any constant exponent, often involves nested loops
  • O(clogn) like binary search, which divides the input in half each time
  • O(cn!) depends on the factorial of the input size...ew
  • O(cn^n) ...double ew.

Fact: my favourite algorithm for sorting is quite possibly the most efficient one you could get.  Bogo sort.  To understand what bogo sort is, picture an ordinary card deck with 52 cards.  Now picture yourself just tossing the entire deck into the air and praying that the cards land in sorted order. Did they? No? Ok, try again...and again and again.  Essentially, this is bogo sort.  Wow. Such efficiency. Much fast. Wow.
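If you're curious what that looks like in code, here's a minimal sketch (mine, obviously not something from lecture):

    import random

    def is_sorted(L):
        """Return True iff L is in non-decreasing order."""
        return all(L[i] <= L[i + 1] for i in range(len(L) - 1))

    def bogo_sort(L):
        """'Sort' L in place by shuffling it until it happens to be sorted.
        Expected runtime is around O(n * n!), so... don't."""
        while not is_sorted(L):
            random.shuffle(L)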
Anywho since you'll probably be left sorting with bogo sort until the universe implodes, people have come up with much better sorting methods.

So in CSC108, we talked about three (pretty shitty inefficient) sorting algorithms: bubble sort, selection sort, and insertion sort.

Note that the code discussed below is Danny's implementation of the sorting algorithms from lab 10; the snippets I've included are my own rough sketches of the same ideas, not his exact code.

Here's bubble sort:
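(Or rather, a rough sketch of it from me, not Danny's exact lab code - I've assumed an early-exit check so the best case below really works out to O(n).)

    def bubble_sort(L):
        """Sort L in place: repeatedly 'bubble' the largest remaining
        item to the end of the unsorted section."""
        j = len(L) - 1
        while j > 0:
            swapped = False
            for i in range(j):
                if L[i] > L[i + 1]:           # left item bigger? swap the pair
                    L[i], L[i + 1] = L[i + 1], L[i]
                    swapped = True
            if not swapped:                   # no swaps means already sorted
                break
            j -= 1                            # the largest item is now at index j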

The idea of bubble sort is pretty simple - you go down the list index by index and swap adjacent items whenever the item on the 'left' is greater than the one on the 'right', so eventually the greatest value 'bubbles' its way to the end of the list (I really don't understand how this movement can be described as 'bubbling').  To visualize, here's a gif:
Just looking at the code/gif it's pretty easy to see this sorting algorithm is super inefficient - the while loop runs once for every element in the list, since j starts at len(L)-1 and ticks down, but within the while loop there's a for loop that also iterates over the (unsorted part of the) list (and performs some constant number of steps per item, which we won't worry about).  So bubble sort iterates over the list once for every pass over the list, giving it O(n^2) in its worst and average case.  In bubble sort's best case, where the list is already sorted, it's O(n): one pass with no swaps, after which it can stop early.

Selection sort:
Selection sort is harder to explain than bubble sort.  Basically, as you go down the list you maintain two sections - the unsorted part and the sorted part, with a 'wall' between them.  On each pass, you select the lowest value in the unsorted section, which is guaranteed to be at least as big as everything on the sorted side, and place it at the end of the sorted side (just past the wall).
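Something along these lines (a sketch of mine, not the lab code itself - the lab's find_min apparently used a list comprehension, but either way it walks the rest of the list):

    def find_min(L, i):
        """Return the index of the smallest item in L[i:]."""
        smallest = i
        for j in range(i + 1, len(L)):
            if L[j] < L[smallest]:
                smallest = j
        return smallest

    def selection_sort(L):
        """Sort L in place: grow a sorted section at the front by
        repeatedly moving the minimum of the unsorted section to
        just past the 'wall'."""
        i = 0
        while i < len(L):
            m = find_min(L, i)
            L[i], L[m] = L[m], L[i]   # move the minimum to the end of the sorted side
            i += 1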
 
Selection sort, like bubble sort, uses a while loop that terminates after we've iterated over the entire list.  However, there is no for loop within the while loop per se.  Instead, the helper function find_min is called within the selection_sort function, and find_min (via a list comprehension in Danny's version) iterates over each element of the unsorted part of the list, so once again selection sort has O(n^2) in its worst case.

Insertion sort:
Personally, I think that insertion sort is the most intuitive out of all the sorts we learned in CSC108.  Once again, there are two sections, the unsorted side and the sorted side.  As you 'move the wall down', you take the next item in the unsorted side and insert it into the sorted side so that everything before it is smaller and everything after it (on the sorted side) is larger.  Unlike selection sort, insertion sort doesn't know ahead of time where the item will end up, nor does it guarantee anything about the items still sitting in the unsorted side.
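A sketch of the idea (mine, not the lab's actual code, though the names match the ones the analysis below refers to):

    def insert(L, i):
        """Shift L[i] left until it sits in its proper place within
        the sorted section L[:i+1]."""
        while i > 0 and L[i - 1] > L[i]:
            L[i - 1], L[i] = L[i], L[i - 1]
            i -= 1

    def insertion_sort_1(L):
        """Sort L in place by inserting each item, one at a time,
        into the sorted section growing at the front."""
        i = 1
        while i < len(L):
            insert(L, i)   # place L[i] into the sorted section L[:i]
            i += 1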

Insertion sort also uses a while loop in the same way that bubble sort and selection sort do, and like selection sort, it uses a helper function.  The helper function insert, called in each iteration of the while loop in insertion_sort_1, may have to walk back over every item in the sorted section, resulting in O(n^2) in the worst case.

Quick sort:
Quick sort was the first sorting algorithm we learned in CSC148.  Essentially, it chooses a pivot, figures out where that pivot belongs relative to the rest of the list, and then repeats the process on each partition.  The pivot splits the list into two parts - the items less than the pivot and the items greater than the pivot.
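Here's a sketch with the same overall structure (mine, not the actual _quicksort_1/_partition_1 from the lab): a partition helper that uses L[0] as the pivot, and a recursive function that partitions and then recurses on both sides.  Danny's partition works in place with a while loop, which this sketch doesn't bother with, but the cost per call is the same O(n).

    def _partition_1(L):
        """Rearrange L around the pivot L[0] and return the pivot's final
        index: everything smaller comes before it, everything else after."""
        pivot = L[0]
        smaller = [x for x in L[1:] if x < pivot]
        larger = [x for x in L[1:] if x >= pivot]
        L[:] = smaller + [pivot] + larger
        return len(smaller)

    def _quicksort_1(L):
        """Sort L in place: partition around L[0], then recursively
        sort the two sides of the pivot."""
        if len(L) > 1:
            i = _partition_1(L)
            left, right = L[:i], L[i + 1:]
            _quicksort_1(left)
            _quicksort_1(right)
            L[:] = left + [L[i]] + right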


It's kind of hard to take the code at face value and determine its O, so I'll talk about it in a more general sense.  The function recursively splits the list (roughly) in half, so the recursion goes about logn levels deep (with log being base 2, so the number of times you can divide a number in half....for more info on logs, see your grade 10 math notes).  At each level of recursion, the _partition_1 helper function is called; its while loop touches every element of the sublist it's given, so across a whole level the partitioning adds up to O(n) work.  With O(logn) levels each costing O(n), quick sort has a runtime of O(nlogn) in its best (and average) case.

Let's take a moment to think about the _partition_1 function.  Like the docstring says, it rearranges L around the pivot L[0].  But what if L[0] is a really bad pivot?  In other words, what if the pivot at L[0] creates two very imbalanced partitions, ie where there's one item or even no items < the pivot, and thousands of items > the pivot?  (You could say this could be fixed by picking the pivot at random with the random module, but the chance of choosing a really bad pivot is still there, since...well, it's random.)  A bad pivot leaves almost all n items on one side, so instead of being split in half the list is split into 'almost its entirety and a little bit', making the recursion in _quicksort_1 go O(n) levels deep instead of O(logn).  So in the worst case, quick sort has a runtime of O(n^2)...ouch...let's just....move on *fades into the distance*

Finally, merge sort:


Like quick sort, merge sort also recursively divides the list in half.  It keeps dividing until the pieces are trivially sorted (length 0 or 1), and then merges pairs of sorted halves back together: at each step of the merge it compares the front items of the two halves and takes the smaller one, so the merged result is itself sorted.
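A sketch with the same shape as _mergesort_1/_merge_1 (again mine, not the lecture code):

    def _merge_1(left, right):
        """Merge two already-sorted lists into one sorted list by
        repeatedly taking the smaller of the two front items."""
        result = []
        i, j = 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                result.append(left[i])
                i += 1
            else:
                result.append(right[j])
                j += 1
        return result + left[i:] + right[j:]   # tack on whatever is left over

    def _mergesort_1(L):
        """Return a sorted copy of L using merge sort."""
        if len(L) < 2:
            return L[:]
        mid = len(L) // 2
        return _merge_1(_mergesort_1(L[:mid]), _mergesort_1(L[mid:]))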

Here, _mergesort_1 recursively divides the list in half, so the recursion is O(logn) levels deep.  Unlike quick sort, the split point in merge sort is always exactly the midpoint of the list, so there's no chance of that worst-case O(n) recursion depth.  The _merge_1 helper function has a while loop that walks over the elements of both half lists.  Even though it handles two lists, it only advances one index at a time, taking whichever front item is smaller.  The two lists will either be the same length or one will have one element more than the other, but for very large lists (which is the case we care about in runtime analysis) this hardly makes a difference.  Hence the helper function is O(n), making the overall runtime for merge sort O(nlogn).  So really, I personally don't understand the point of quick sort if merge sort has the same runtime and a better worst case runtime...

Saturday 22 March 2014

Week 10 - Sorting Big-O

Ah. Big O.  The nightmare from CSC165 returns. What a familiar concept.  At least we're not doing proofs.

Anywho, the big O of an algorithm is an upper bound on its runtime in the worst case scenario.  There are a lot of different big Os out there, but some of the most common ones are (in increasing horribleness) constant, logarithmic, linear, quadratic, cubic, and exponential (ouch).  These are all expressed in terms of the size of the structure being processed by the algorithm (which at our level, I think, is most commonly a list).

We rarely (I think) get constant runtime.  Operations like assignment statements and prints have constant runtime, but they aren't really algorithms.  The simplest constant-runtime algorithm I can think of is pop() for a list: no matter the length of the list, it always goes to list[-1] and performs the same operation on it.  However, if we were to 'pop' the first item of the list, the runtime would be cn (where c is an arbitrary constant and n is the length of the list), since after deleting the first item we'd have to move every following item down an index (ouch).
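For example, with some throwaway list L:

    L = list(range(10))

    L.pop()     # O(c): removes the last item; nothing else has to move
    L.pop(0)    # O(cn): removes the first item, so every remaining item
                # has to shift down one index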

Let's look at binary search trees.  What's the big O of searching through one?

Since we know that at each node, everything in the left subtree is less than the node's value and everything in the right subtree is greater, each time we reach a node we check whether the item we're searching for is less than, greater than, or equal to that value.  If it is less than or greater than, we automatically eliminate one whole subtree - roughly half of the remaining items, assuming the tree is reasonably balanced.  Hence, each comparison cuts the number of remaining checks (and hence the runtime) in half.  This is O(logn) (note that this log is base 2).
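In code, a search like that looks roughly like this (assuming nodes with value, left, and right attributes, and None for a missing child):

    def bst_search(node, item):
        """Return True iff item is somewhere in the BST rooted at node.
        Each comparison throws away one whole subtree."""
        if node is None:
            return False
        if item == node.value:
            return True
        if item < node.value:
            return bst_search(node.left, item)
        return bst_search(node.right, item)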

Also, recall from CSC165 that since big O is an upper bound, O(logn) is in O(n) which is O(n^2) and so on. (yay subsets).

In CSC108 we learned three sorting algorithms - bubble sort, selection sort, and insertion sort.  All of which have pretty terrible efficiency.  Even Obama thinks so.

So far in this course, we've covered two new sorting algorithms: quick sort and merge sort.

There are three basic steps of Quick sort:

  1. Choose a pivot - this is the fastest step; a random index is usually used
  2. Partition the list into two parts: less than the pivot and greater than the pivot
  3. Repeat steps 1 and 2 recursively on each part until the list is sorted

As we did in CSC108, I feel like I should include a video of folkdance to demonstrate quicksort...so enjoy?




How do we evaluate quick sort's performance?:

  • How many times do we choose the pivot?
  • How many steps each time we choose a pivot?

Quick sort has a performance of nlogn - on average, the pivot splits the list roughly in half, so there are about logn levels of splitting, and at each level we do about n work partitioning.  However, in the worst case (when the pivots keep landing at one end), the runtime could be n^2.

What about merge sort?
...Well...since we ran out of time in lecture and Danny's implementation didn't work...I think I'll leave that for next week.

Saturday 15 March 2014

Week 9 - Binary Search Trees

Ah, BSTs. BSTs are great.  They're already organized so there are fewer cases to code for (yes I'm lazy, don't judge me), and searching them has a runtime of O(logn) (at least while they stay reasonably balanced), how nice.

What exactly are BSTs? They're basically binary trees from all the way back in week 6 with one more pre-condition - at each node, every value in the left subtree is less than the node's value, and every value in the right subtree is greater.  Let's look at some visual examples of BSTs, again with my horrible drawings:


In the above, it may be surprising to note that I is indeed a binary search tree since it obeys the pre-condition that the left child must be less than the root and the right child must be greater than the root.  II is not a BST since 2 > 1.  III is also not a BST since each element in a binary search tree must be unique.  IV is a typical BST as we would expect.  V is also a BST since it follows the precondition.

The actual python implementation for BSTs is the same as normal binary trees, but we add the above stated pre-condition.
Note that the BST is actually a wrapper class around BSTNode, so any operations made for BSTs need to be written as module level functions.  (See the insert function example from lecture.)
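To give the flavour, here's roughly what BSTNode and a module-level insert look like (my own sketch, not the lecture code, and I've left out the BST wrapper itself):

    class BSTNode:
        """A node in a binary search tree: everything in the left subtree
        is smaller than value, everything in the right subtree is larger."""

        def __init__(self, value, left=None, right=None):
            self.value = value
            self.left = left
            self.right = right

    def insert(node, value):
        """Return the BST that results from inserting value into the
        (possibly empty, i.e. None) tree rooted at node."""
        if node is None:
            return BSTNode(value)
        if value < node.value:
            node.left = insert(node.left, value)
        elif value > node.value:
            node.right = insert(node.right, value)
        return node   # equal values are ignored, since BST items are unique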

Saturday 8 March 2014

Week 8 - Linked Lists

Linked lists are a recursive data structure by nature - a linked list is either empty (represented by None) or a node holding a cargo value plus a reference to the rest of the list, which is itself a linked list (and so on).

On a higher level, linked lists are easy to visualize - picture a train where each carriage is attached to the next one by a link, and each carriage holds some cargo.  The carriages can be 'infinite' in number, since each one can keep linking to a new carriage (ok fine, that'll never happen in the real world, but it's relatively easy to visualize...right?)

Anywho, let's look at Danny's implementation.  The comments are basically me trying to explain what's happening to myself/you:
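Rather than paste his code, here's a sketch of my understanding of the node class, with a prepend that works by copying the current front node (the copy is the 'star' in the diagram mentioned below):

    class LinkedListNode:
        """One node of a linked list: a cargo value plus a link to the
        rest of the list (another node, or None for the empty list)."""

        def __init__(self, cargo=None, link=None):
            self.cargo = cargo
            self.link = link

        def prepend(self, cargo):
            """Add cargo to the front of this list, in place: copy the
            current front node, then overwrite self with the new cargo
            and point its link at the copy."""
            copy = LinkedListNode(self.cargo, self.link)   # the copy (the 'star')
            self.cargo = cargo
            self.link = copy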

Just to make it clearer what's happening in prepend, see the below (badly) drawn diagram.  Note that the star denotes a copy.
Once again, see the below for a clearer (I hope) visual explanation.


Let's look at the contains method to view how Linked Lists are used with recursion, once again my remarks are in the comments:
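It goes something like this (my paraphrase, written as a method that would live inside the LinkedListNode class sketched above):

    def __contains__(self, value):
        """Return True iff value is somewhere in this linked list:
        check this node's cargo, then hand the rest of the list the
        same question."""
        if self.cargo == value:
            return True
        if self.link is None:        # hit the end without finding it
            return False
        return value in self.link    # 'in' re-calls __contains__ on the next node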

In lecture we talked about a wrapper class for linked lists, which I personally didn't really see the point of, but basically ...well, it wraps the entire linked list into one object that has two attributes - the linked list itself, and the length/size/number of nodes in it....moving on then.

But I got really confused by the reverse function we discussed, so let me dissect it - once again the comments are my remarks:
Edit:: OH I SEE WHY THE WRAPPER CLASS IS USEFUL!

See the below code and comments:
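Roughly (again my own sketch building on the node class from before, not Danny's actual implementation):

    class LinkedList:
        """Wrapper object holding a reference to the front node plus
        the number of nodes -- the 'two attributes' mentioned above."""

        def __init__(self):
            self.front = None    # first LinkedListNode, or None if empty
            self.size = 0

        def prepend(self, cargo):
            """Add cargo to the front of the wrapped list."""
            self.front = LinkedListNode(cargo, self.front)
            self.size += 1

        def reverse(self):
            """Reverse the list in place by flipping every node's link
            so it points at the node that used to come before it."""
            previous = None
            current = self.front
            while current is not None:
                nxt = current.link        # remember the rest of the list
                current.link = previous   # point this node backwards
                previous = current
                current = nxt
            self.front = previous         # the old last node is the new front

The wrapper earns its keep here: reverse just re-points self.front at the end (and size never needs recomputing), instead of every caller having to juggle the new front node themselves.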


Friday 28 February 2014

Week 7 - Recursion....wait what, again?

This week's assigned slog topic is recursion.  I'm half tempted to just refer you to different sections of week 4, but I'll re-iterate and expand on what I've already written anyway.

To start, let's review; recursion is a tool used in computer science where a function calls itself during its execution, breaking a problem down into smaller versions of the same problem.  (Many of the recursive functions we've written take nested, list-like ADTs as input, but that's not a requirement - factorial below takes a plain number.)

I previously likened recursion to the chain rule in differentiation through a pretty complicated example, but now I realize there's a simpler one.  Consider the concept of factorials in math (!).  When you take the factorial of a number, you multiply together all the integers from 1 up to that number.  This can be implemented in the following python function:
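Something along these lines:

    def factorial(n):
        """Return n! by multiplying 1 * 2 * ... * n with a plain loop."""
        result = 1
        for i in range(1, n + 1):
            result *= i
        return result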
The above code is easy enough to read/understand; however, the factorial function can also be implemented using recursion, like so:
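Roughly (with 0! = 1 as the base case):

    def factorial(n):
        """Return n! recursively: 0! is 1, and n! is n * (n-1)!."""
        if n == 0:
            return 1
        return n * factorial(n - 1)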
Let's trace it - if you input the value n = 3, the function checks whether the value is 0; since 3 != 0, it goes to the last line, 3 * factorial(3-1).  The factorial function here calls itself, checking whether 3-1 = 0 (it doesn't), so it proceeds to the last line again to give 3 * 2 * factorial(2-1), which then repeats, giving 3 * 2 * 1 * factorial(1-1).  This time, however, when 1-1 is passed in as n, it does indeed equal 0, so the function returns 1.  Hence the final output for factorial(3) turns out to be 3 * 2 * 1 * 1 = 6.

List comprehensions are used a lot when we deal with recursive functions, since lists can be nested.  Consider the following problem - we want to find the max of the items in a (possibly nested) list.  How would we go about this?

List comprehensions make the code for this problem relatively short:
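Something like this (my version):

    def nested_max(L):
        """Return the largest number anywhere in a possibly-nested
        list of numbers."""
        return max([nested_max(x) if isinstance(x, list) else x for x in L])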

Even though the concept of recursion is simple enough to understand, its implementation can get quite complicated depending on the data structure you're working with - more complicated ADTs make it even worse.  However, I find the trick to tracing recursion is to not think too much...if that makes any sense.




Tuesday 25 February 2014

Week 6 - I See Trees...of Green....Red Roses Too....

This week we covered Trees in lecture; more specifically, binary trees, and honestly I was completely lost during lecture.  Hopefully by reading up on Trees more as I write this slog, the class will become more clear to me.

Lets start with basic definitions.  This'll be easier with a diagram, so consider the following:
The above tree is a binary tree since every node (represented by the circles) has at most two child nodes.  The top node (here, labelled 'A') is called the root, and the nodes that do not have any children are leaves. However, since we're in code we don't really have nodes with nothing branching from them - a node's children can either be other nodes or None.  If a node has two None children, then it is a leaf.

Trees are recursive data structures; they're either:
  • Empty (represented by None) or
  • A node that contains an object plus references to other trees (its children), each of which is itself a tree
So now we have the 'graphical' representation of trees, but what do they look like in actual code?  Luckily for me this was shown in-class, so consider the code taken from Monday's lecture:
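The gist of it (a sketch, not the verbatim lecture code):

    class Tree:
        """A tree node: a value plus a list of child Trees.  A node
        whose children list is empty is a leaf."""

        def __init__(self, value=None, children=None):
            self.value = value
            # default to an empty list of children; copy the list passed in
            # so each Tree owns its own children list
            self.children = children[:] if children is not None else []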

As you can see from the code, Trees are a lot like the nested lists we examined in Week 4 (as well as Lab #4).  Note that if a node has no children, self.children is just an empty list - that's what marks it as a leaf (rather than storing None in place of each missing child, the way the binary tree version does).



Saturday 8 February 2014

Week 5 - Scopes and Namespaces

First things first.  Changes to assignments should not be emailed by the profs after one has finished the assignment.  Just...no.  Needless to say, all the time spent in lecture discussing the TOAH model and assignment 1 in general would've been better spent last week (personally/for my group), but it was a good confirmation that our group hadn't gone in the completely wrong direction.  But I digress; at least there will be less to write about in this week's slog.

What new material we did cover in lecture revolved around how Python handles scopes and namespaces.

But before we begin, what is a name?  Remember from CSC108 that a Python name refers to a memory address which contains an object - a value, a function, a class, etc.  A namespace is ...well, a space that maps a bunch of names to the objects they refer to.

Consider a Python module/file (insert_name_here.py).  Each module has its own global namespace, so all the names used at the top level of the module live in the same space.  Because of this, one name can only refer to one object at a time within that namespace, unless things are nested (but more on that later).  Even though each individual module has its own global namespace, you can import different modules into your program, and those modules can share a name that refers to different objects.  For example, if I have the file stack.py with a function is_empty(), and queue.py with its own is_empty(), they can both be called separately by importing both modules (import stack and import queue) and then typing stack.is_empty() and queue.is_empty().  However, note that if you import the names directly (from stack import is_empty, then from queue import is_empty), the second import shadows the first, and defining your own is_empty() in the program shadows the imported one too - the prefixed forms stack.is_empty() and queue.is_empty() are the safe way to keep them all accessible.
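In code, the difference looks something like this (assuming stack.py and queue.py are my own modules, each defining an is_empty() function):

    import stack
    import queue

    stack.is_empty()   # the prefix says which module's namespace to look in
    queue.is_empty()

    # importing the names directly is where the clobbering happens:
    from stack import is_empty
    from queue import is_empty   # this one now shadows stack's is_empty
    is_empty()                    # calls queue's version only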

Scopes are regions of a program from which a namespace can be accessed without a prefix (unlike the above, where the prefix stack. referred to the module name).  At any time, there are several different scopes in play - the scope of the current function, the module's global scope, and the Python built-ins.  If one function isn't nested inside another, it cannot access the other function's local names.  For a simple example, consider the following:
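The example looked something like this (the exact bodies don't matter - the point is the reused names):

    def scream(s):
        """Return a LOUD version of s -- s and new_s here are local to scream."""
        new_s = s.upper() + '!!!'
        return new_s

    def shush(s):
        """Return a quiet version of s -- this s and new_s are a totally
        separate pair of local names."""
        new_s = s.lower() + '...'
        return new_s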
Even though the functions scream and shush have the same variable names s and new_s, the two functions still work when they are called in the shell without interfering with each other.

However, what if the functions were nested?  Consider the following example from class:
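The example is essentially the scope_test demo from the official Python tutorial, which goes like this:

    def scope_test():
        def do_local():
            spam = 'local spam'

        def do_nonlocal():
            nonlocal spam
            spam = 'nonlocal spam'

        def do_global():
            global spam
            spam = 'global spam'

        spam = 'test spam'
        do_local()
        print('After local assignment:', spam)      # test spam
        do_nonlocal()
        print('After nonlocal assignment:', spam)   # nonlocal spam
        do_global()
        print('After global assignment:', spam)     # nonlocal spam

    scope_test()
    print('In global scope:', spam)                 # global spam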

First, Python registers that the functions do_local, do_nonlocal, and do_global exist, then executes spam = 'test spam'.

The function do_local creates a spam that is local, meaning it only exists in the scope of do_local and not in any other function, not even scope_test.  When do_local is called, its spam is thrown away as soon as the function finishes executing, so the first print still shows 'test spam'.

In do_nonlocal, the nonlocal statement makes the spam that do_nonlocal uses the spam from the enclosing function scope_test, which had set spam = 'test spam'.  The next line then assigns spam = 'nonlocal spam', which changes the spam in the enclosing scope as well, so the second print shows 'nonlocal spam'.

When do_global is called, there is no name spam in the global namespace yet, so the global statement creates a global spam and assigns it the string 'global spam'.  However, the print that follows is inside scope_test, so it still sees scope_test's local spam, which from the previous step is 'nonlocal spam'.

Finally, after scope_test returns, the last print runs in the global scope.  Python looks for spam there, finds the one that do_global created, and prints 'global spam'.

So in summary, python always looks for names in different scopes in the following order:
  1. The innermost scope - local variables
  2. Encompassing scopes - nonlocal variables
  3. The global scope - global variables
  4. Built-ins 
Namespaces can, in my opinion, get quite confusing, but since we haven't encountered any really complicated code that involves switching between many layers of nonlocal scopes, they haven't been too difficult to use.

...But I'm sure we'll be dealing with really confusing scenarios on the term test.  Which is in less than two weeks.