Study Topic: Algo: Basics


This is a Study Leave topic.

What Is It?

Basic structures and concepts

Why Study It?

Fundamental to Algos in general

Toggl code

Toggl code is ALGO-BASICS

Deliverables

Gone through the "CLRS" (Cormen, Leiserson, Rivest, Stein) "Introduction to Algorithms" book and written up (sparse) notes.

WriteUp

Algorithm Approaches

Dynamic Programming

In dynamic programming we split a problem into smaller subproblems, solve each subproblem once, and combine the results.

Key elements:

  • optimal substructure (optimal solutions of the problem contain optimal solutions to subproblems)
  • overlapping subproblems (the subproblems should not explode in number but instead should get revisited repeatedly as we recurse)

One can take either a top-down approach (with memoization to handle the overlapping subproblems) or a bottom-up approach. Typically the asymptotic runtimes are the same, but bottom-up has smaller constant factors due to the lower number of procedure calls. Top-down may win if it does not have to visit the whole space of subproblems.
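
A minimal sketch of the two styles, using Fibonacci numbers purely as an illustration:

  from functools import lru_cache

  @lru_cache(maxsize=None)          # memoization handles the overlapping subproblems
  def fib_top_down(n):
      if n < 2:
          return n
      return fib_top_down(n - 1) + fib_top_down(n - 2)

  def fib_bottom_up(n):
      prev, curr = 0, 1             # solutions to the two smallest subproblems
      for _ in range(n):
          prev, curr = curr, prev + curr
      return prev

  assert fib_top_down(30) == fib_bottom_up(30) == 832040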

Greedy Algorithms

A greedy algorithm will always make the best local choice at any given stage.

Key elements:

  • optimal substructure (optimal solutions of the problem contain optimal solutions to subproblems)
  • greedy-choice property (we don't have to consider subproblems when making our choice at any stage).

Beneath every greedy algorithm there is generally a more cumbersome dynamic programming solution; the reverse is generally not true. For instance "fractional knapsack" can be solved with a greedy approach, while "0-1 knapsack" requires dynamic programming.
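
A sketch of the fractional-knapsack greedy choice (sort by value density and take greedily; the numbers in the usage example are illustrative):

  def fractional_knapsack(items, capacity):
      # items: (value, weight) pairs; take greedily by value/weight density
      best = 0.0
      for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
          take = min(weight, capacity)   # take as much of the densest item as fits
          best += value * take / weight
          capacity -= take
          if capacity == 0:
              break
      return best

  assert fractional_knapsack([(60, 10), (100, 20), (120, 30)], 50) == 240.0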

Matroids

A matroid is an ordered pair M = (S, I) such that:

  • S is finite
  • I is a nonempty set of subsets of S (called the independent subsets of S) that is hereditary (closed under taking subsets)
  • If A, B ∈ I and |A| < |B| then ∃ x ∈ (B - A) such that A ∪ {x} ∈ I (the exchange property)

We say a matroid M = (S, I) is weighted if there is a weight function that assigns a strictly positive weight w(x) to each x ∈ S. In the context of greedy algorithms there is also a more general structure along these lines, the greedoid.

Many problems for which greedy is optimal can be formulated in terms of finding a maximum-weight independent subset in a weighted matroid - such a subset is termed optimal, and the algorithm for finding one is simple:

  • start with A = empty set
  • sort M.S into decreasing order by weight
  • for each x ∈ M.S in the above order
    • if A ∪ {x} ∈ M.I
      • A = A ∪ {x}
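
The same algorithm in Python (a sketch: is_independent stands in for whatever independence test the particular matroid defines):

  def greedy_matroid(S, weight, is_independent):
      # Scan elements in decreasing weight order, keeping each one
      # that preserves independence of the set built so far.
      A = set()
      for x in sorted(S, key=weight, reverse=True):
          if is_independent(A | {x}):
              A.add(x)
      return A

  # Uniform matroid of rank 2: any set of at most 2 elements is independent.
  assert greedy_matroid([1, 5, 3], lambda x: x, lambda A: len(A) <= 2) == {5, 3}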

Amortized Analysis

In amortized analysis we assess the average performance of each operation in the worst case. We do this by working across a sequence of operations rather than looking at each in isolation. The main approaches:

  • aggregate analysis: we sum over a series of operations and assign the average to each individual operation (sketched after this list)
  • accounting method: we assign different charges (amortized costs) to different operations. The aim is to maintain a positive "credit balance" at all times, by having some earlier operations assigned more than their actual cost.
  • potential method: we associate a potential function with properties of the structure as a whole, aiming to build potential that can then be spent on future operations.
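
As a small illustration of aggregate analysis, the classic table-doubling argument, sketched with an explicit copy count:

  def doubling_append_cost(n):
      # Count element copies made while appending n items to an array
      # that doubles its capacity whenever it fills up.
      copies, size, capacity = 0, 0, 1
      for _ in range(n):
          if size == capacity:
              copies += size       # copy everything into the bigger array
              capacity *= 2
          size += 1
          copies += 1              # write the new element itself
      return copies

  # Total cost stays under 3n, so the amortized cost per append is O(1).
  assert doubling_append_cost(1000) < 3 * 1000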

Trees

Practical trees keep their height bounded/small in the face of arbitrary insertions and deletions.

  • AVL Tree (1962) - first self-balancing binary tree
  • Red-Black Tree (1972) - also self-balancing
    • A node is either red or black; the root and all NIL leaves are black. A red node's children must be black. For any node, all paths down to NIL leaves must contain the same number of black nodes.
    • Red-Black is less rigidly balanced than AVL, so better insert/delete but can be worse on lookup
    • C#'s SortedDictionary is implemented as a red-black tree
  • AA Tree is a simpler variant of Red-Black
  • Scapegoat Tree doesn't store extra per-node balancing information, which can improve locality
  • Splay Tree offers self-optimisation of access: frequently-accessed nodes move closer to the root
  • B-Tree allows many child nodes: there is a minimum degree t, and every node other than the root must have between t-1 and 2t-1 keys (internal nodes between t and 2t children). These are good when accessing each node is expensive, for instance in disk-based index trees. Tree height is purposely kept very shallow, only increasing if the root node must be split.
  • van Emde Boas Tree offers great performance (O(lg lg n)) on all standard operations: search, insert, delete, max, min, successor, predecessor. Requires keys to be integers in range [0, n-1] with no duplicates: the structure recurses on subtrees of square-root size, halving the bit representation of keys at each level.

Heaps

  • Fibonacci Heap - compared to a binary heap, offers better insertion, union, and key-decrease, but is substantially more complicated to program. The root list and each node's child list are circular, doubly linked lists. Heap consolidation occurs on min-extraction.

Hashes

  • C# hash (e.g. Dictionary) starts off with size 3, then takes next prime after 2*n, repacking each time. Tops out just below 500,000.
  • When dealing with collisions, can either use separate chaining (use internal data structure) or open addressing (use the array).

Disjoint Sets

Disjoint Set Forests allow efficient representation of a collection of disjoint sets of elements, supporting the following operations:

  • Finding which set a given element is in
  • Uniting two sets
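
A minimal sketch of a disjoint-set forest with the two standard heuristics, union by rank and path compression:

  class DisjointSet:
      def __init__(self, n):
          self.parent = list(range(n))
          self.rank = [0] * n

      def find(self, x):
          # Path compression: point every node on the path at the root.
          if self.parent[x] != x:
              self.parent[x] = self.find(self.parent[x])
          return self.parent[x]

      def union(self, x, y):
          rx, ry = self.find(x), self.find(y)
          if rx == ry:
              return
          if self.rank[rx] < self.rank[ry]:   # union by rank: shallow under deep
              rx, ry = ry, rx
          self.parent[ry] = rx
          if self.rank[rx] == self.rank[ry]:
              self.rank[rx] += 1

  ds = DisjointSet(5)
  ds.union(0, 1)
  ds.union(1, 2)
  assert ds.find(0) == ds.find(2) and ds.find(0) != ds.find(3)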

Graph Algorithms

Write v.d and v.p for the distance and predecessor of v, and w(u, v) for the weight of edge (u, v).

Basic Search

Vertices are initialised to white, coloured gray while in the queue, and black when finished.

  • Breadth-first: Maintain a queue of vertices to search. Can calculate shortest-path distances by edge count (sketched below).
  • Depth-first: Allows structure characterisation (cycle detection, topological sorting, determining strongly connected components)
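
A sketch of breadth-first search computing v.d and v.p over an adjacency-list graph:

  from collections import deque

  def bfs(adj, s):
      # adj maps each vertex to a list of neighbours.
      dist = {s: 0}             # v.d: edge count from s
      pred = {s: None}          # v.p: predecessor on a shortest path
      queue = deque([s])        # gray vertices; anything not in dist is white
      while queue:
          u = queue.popleft()   # u turns black once its neighbours are scanned
          for v in adj[u]:
              if v not in dist:
                  dist[v] = dist[u] + 1
                  pred[v] = u
                  queue.append(v)
      return dist, pred

  dist, _ = bfs({0: [1, 2], 1: [3], 2: [3], 3: []}, 0)
  assert dist[3] == 2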

Minimum Spanning Trees

Greedy strategies work well: build up a MST by adding "safe edges". There are two common algorithms for this:

  • Kruskal: each selection connects two distinct components. O(E lg V)
  • Prim: each selection builds out from the current tree. O(E lg V), or potentially O(E + V lg V) with Fibonacci Heaps
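
A sketch of Prim's algorithm with a binary heap, using lazy deletion (stale heap entries are skipped on pop); assumes a connected graph:

  import heapq

  def prim_weight(adj, s=0):
      # adj maps u to a list of (v, weight) pairs (undirected graph);
      # returns the total weight of a minimum spanning tree.
      in_tree, total, heap = set(), 0, [(0, s)]
      while heap:
          w, u = heapq.heappop(heap)
          if u in in_tree:
              continue          # stale entry: u was already reached more cheaply
          in_tree.add(u)
          total += w
          for v, wv in adj[u]:
              if v not in in_tree:
                  heapq.heappush(heap, (wv, v))
      return total

  g = {0: [(1, 1), (2, 4)], 1: [(0, 1), (2, 2)], 2: [(0, 4), (1, 2)]}
  assert prim_weight(g) == 3    # edges (0,1) and (1,2)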

Single-Source Shortest Paths

WLOG we can assume shortest paths contain no cycles (a reachable negative cycle makes the distance -inf, a positive cycle would never lie on a shortest path, and a zero cycle can be removed).

A key concept is relaxation: can you improve the path to v by going through edge (u, v)? If so, update v.d and v.p.

  • Bellman-Ford: General case, allowing negative weights. Make |V| - 1 passes, each pass relaxing each edge of the graph once.
  • Dijkstra: requires nonnegative edge weights. At each step, take the closest vertex not already in the set, add it, and relax all its edges (sketched below).
  • Dag Shortest Path: Dag only. Topologically sort vertices, then go through them in turn, relaxing every edge of each.
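
A sketch of Dijkstra using Python's heapq, again with lazy deletion:

  import heapq

  def dijkstra(adj, s):
      # adj maps u to a list of (v, weight) pairs; weights must be nonnegative.
      dist, done, heap = {s: 0}, set(), [(0, s)]
      while heap:
          d, u = heapq.heappop(heap)
          if u in done:
              continue
          done.add(u)
          for v, w in adj[u]:
              nd = d + w        # relaxation: try to improve v.d via edge (u, v)
              if nd < dist.get(v, float("inf")):
                  dist[v] = nd
                  heapq.heappush(heap, (nd, v))
      return dist

  g = {0: [(1, 5), (2, 1)], 1: [], 2: [(1, 2)]}
  assert dijkstra(g, 0)[1] == 3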

Difference Constraints

An LP problem where all constraints are of the form x(i) - x(j) <= b(k) can be recast as a weighted graph with an edge of weight w(v(j), v(i)) = b(k) per constraint; solving for shortest-path weights then solves the original problem.

All-Pairs Shortest Paths

While you can run Dijkstra / Bellman-Ford on each starting vertex, better solutions exist:

  • Floyd-Warshall: best for dense graphs, O(V^3) (sketched below)
  • Johnson's Algorithm: better for sparse graphs, O(V^2 lg V + VE). Dijkstra from every vertex is also a possibility if there are no negative weights.
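
A sketch of Floyd-Warshall on an adjacency matrix, with INF marking missing edges:

  def floyd_warshall(w):
      # w: n x n matrix of edge weights, w[i][i] == 0, INF where no edge.
      n = len(w)
      d = [row[:] for row in w]
      for k in range(n):        # allow vertex k as an intermediate
          for i in range(n):
              for j in range(n):
                  if d[i][k] + d[k][j] < d[i][j]:
                      d[i][j] = d[i][k] + d[k][j]
      return d

  INF = float("inf")
  d = floyd_warshall([[0, 3, INF], [INF, 0, 1], [1, INF, 0]])
  assert d[0][2] == 4           # 0 -> 1 -> 2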

Flow Networks

A flow network G = (V, E) is a directed graph where each edge (u, v) ∈ E has a nonnegative capacity c(u, v). WLOG assume edges go one way only (at most) between any vertex pair. We distinguish source s and sink t vertices; every vertex v lies on some path s -> v -> t. A flow in G is a function f: V x V -> R with the following properties:

  • capacity constraint: ∀ u, v ∈ V we require 0 <= f(u, v) <= c(u, v)
  • flow conservation: ∀ u ∈ V - {s, t} we require the sum over x of f(x, u) to equal the sum over y of f(u, y): flow in = flow out

Simple transformations handle cases of

  • edges going both ways between a vertex pair
  • multiple sources and/or sinks

The Ford-Fulkerson method is an approach to maximising total flow between s and t. A central concept is the residual network Gf for a network G and flow f - the capacities of Gf represent how we can change flow in G, whether by augmenting flow or by channelling some existing flow back. The aim is to repeatedly augment flow until there is no augmenting path left in Gf. The max-flow min-cut theorem then tells us we have found a maximum flow.
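
A sketch of the Edmonds-Karp instantiation of Ford-Fulkerson, where BFS finds each augmenting path in the residual network:

  from collections import deque

  def max_flow(cap, s, t):
      # cap: dict of dicts of residual capacities, mutated in place.
      total = 0
      while True:
          pred = {s: None}      # BFS for an augmenting path in the residual network
          queue = deque([s])
          while queue and t not in pred:
              u = queue.popleft()
              for v, c in cap[u].items():
                  if c > 0 and v not in pred:
                      pred[v] = u
                      queue.append(v)
          if t not in pred:
              return total      # no augmenting path left: the flow is maximum
          bottleneck, v = float("inf"), t
          while pred[v] is not None:
              bottleneck = min(bottleneck, cap[pred[v]][v])
              v = pred[v]
          v = t
          while pred[v] is not None:   # augment: push flow, open residual edges
              u = pred[v]
              cap[u][v] -= bottleneck
              cap[v][u] = cap[v].get(u, 0) + bottleneck
              v = u
          total += bottleneck

  cap = {"s": {"a": 3, "b": 2}, "a": {"t": 2}, "b": {"t": 3}, "t": {}}
  assert max_flow(cap, "s", "t") == 4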

Some combinatorial problems can be re-cast as maximum-flow problems: for instance, finding the maximal matching in a bipartite graph.

There are also push-relabel and relabel-to-front algorithms that I haven't gone over in any detail.

Multithreading

It's useful to work with the notions of "work" and "span":

  • work: total time to execute on one processor
  • span: the time along the longest serial chain of dependent operations (the critical path)

So execution time is lower-bounded both by the span and by work/#processors.

Matrix multiplication is an example where parallelism can really help: span down from Θ(n^3) to Θ((lg n)^2).

Sorting offers another: merge sort's span comes down from Θ(n lg n) to Θ((lg n)^3).

Polynomials and FFT

The FFT allows transition between coefficient-based and value-based polynomial representations in O(n lg n). Because value-based representations can be multiplied in O(n), this gives us polynomial multiplication in O(n lg n).
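
A sketch of the recursive radix-2 FFT and the resulting O(n lg n) polynomial multiplication (coefficients are stored low-order first, and sizes are padded to a power of two):

  import cmath

  def fft(a, invert=False):
      # len(a) must be a power of two.
      n = len(a)
      if n == 1:
          return a[:]
      even, odd = fft(a[0::2], invert), fft(a[1::2], invert)
      sign = 1 if invert else -1
      out = [0] * n
      for k in range(n // 2):
          w = cmath.exp(sign * 2j * cmath.pi * k / n)   # complex root of unity
          out[k] = even[k] + w * odd[k]
          out[k + n // 2] = even[k] - w * odd[k]
      return out

  def poly_mul(p, q):
      n = 1
      while n < len(p) + len(q) - 1:
          n *= 2
      fp = fft(p + [0] * (n - len(p)))      # coefficient -> value representation
      fq = fft(q + [0] * (n - len(q)))
      r = fft([x * y for x, y in zip(fp, fq)], invert=True)   # O(n) pointwise multiply
      return [round(x.real / n) for x in r[: len(p) + len(q) - 1]]

  # (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
  assert poly_mul([1, 2], [3, 4]) == [3, 10, 8]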

The CLRS book discusses DFT/FFT: The discrete Fourier transform (DFT) is a value-based representation using complex roots of unity. Highly efficient DFT implementations have received a huge amount of attention over the years. One good one is the Fastest Fourier Transform in the West (FFTW), which uses a number of techniques, including host machine targeted unrolled code at the lowest level.

String Matching

For a text of length n and a pattern of length m:

  • Naive, with matching time O((n - m + 1)m)
  • Rabin-Karp, which uses hashing to pre-select candidates, so is generally far better than the naive approach even though it shares the same worst case
  • Finite automaton construction, which has matching time O(n) but can have extensive setup time of O(m * #characters)
  • Knuth-Morris-Pratt, which has matching time O(n) but a much-reduced O(m) setup time, by cleverly deferring the work required to construct the automaton transitions (sketched below)
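
A sketch of the KMP prefix function and matcher:

  def prefix_function(p):
      # pi[i] = length of the longest proper prefix of p[:i+1]
      # that is also a suffix of it.
      pi, k = [0] * len(p), 0
      for i in range(1, len(p)):
          while k > 0 and p[k] != p[i]:
              k = pi[k - 1]     # fall back to the next-shorter border
          if p[k] == p[i]:
              k += 1
          pi[i] = k
      return pi

  def kmp_search(text, pattern):
      pi, k, hits = prefix_function(pattern), 0, []
      for i, c in enumerate(text):
          while k > 0 and pattern[k] != c:
              k = pi[k - 1]
          if pattern[k] == c:
              k += 1
          if k == len(pattern):   # full match ending at index i
              hits.append(i - k + 1)
              k = pi[k - 1]
      return hits               # total work is O(n + m)

  assert kmp_search("ababcabab", "abab") == [0, 5]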

NP-Completeness

Main concepts:

  • P: solvable in polynomial time
  • NP: verifiable in polynomial time
  • NPC: "NP-complete", in NP and as hard as any problem in NP

We know P ⊆ NP, but it is an open question whether P is a proper subset of NP. However, so many problems are NP-complete (mutually reducible to each other in polynomial time) that most researchers believe P != NP.

There is also the notion of pseudo-polynomial time, where the running time is polynomial in the numeric value of the input but exponential in its length - the number of bits required to represent it. (The 0-1 knapsack DP below is an example: O(nW) is polynomial in the value W but exponential in the number of bits of W.)

Plenty of other gradations too. Note NP-completeness refers to decision problems ("can we beat X?") rather than optimization problems ("how low can we go?").

Other Areas Not Covered

Because I've done them in previous lives:

  • Matrix Operations (eg equation solving)
  • Linear Programming (inc simplex algorithm)
  • Number-Theoretic (factoring, prime testing, RSA encryption)
  • Computational Geometry (inc convex hull finding, closest point-pairs)
  • Approximating Approaches

Tests

Classic Test Algos

Word Reversal In Place

First reverse the whole string, then re-reverse each word.
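
A sketch in Python, using a list of characters since Python strings are immutable (in C this would be genuinely in place):

  def reverse_words(chars):
      def rev(lo, hi):          # reverse chars[lo..hi] in place
          while lo < hi:
              chars[lo], chars[hi] = chars[hi], chars[lo]
              lo, hi = lo + 1, hi - 1
      rev(0, len(chars) - 1)    # first reverse the whole string
      start = 0
      for i in range(len(chars) + 1):   # then re-reverse each word
          if i == len(chars) or chars[i] == " ":
              rev(start, i - 1)
              start = i + 1

  s = list("hello world")
  reverse_words(s)
  assert "".join(s) == "world hello"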

Remove Duplicates From n-element Array With Elements in [0, n-1]

Sweep through the array; at each i, repeatedly swap v[i] to index v[i] (its "home") until v[i] is home, or its home already holds that value (a duplicate).
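
One concrete reading of this, as a sketch:

  def distinct_values(v):
      # Elements of v lie in [0, n-1]. Each swap puts one value at its
      # home index, so the sweep does at most n swaps in total.
      for i in range(len(v)):
          while v[i] != i and v[v[i]] != v[i]:
              j = v[i]
              v[i], v[j] = v[j], v[i]
      return sorted(x for i, x in enumerate(v) if x == i)

  assert distinct_values([3, 1, 3, 0]) == [0, 1, 3]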

Simple DP Algorithms

These both rely on the idea of holding the best answer we've got so far in a sweep through the array.

Longest Increasing Subsequence

  • Keep track of bestLen and bestEnd: initialise bestLen = 1 and bestEnd = 0
  • Keep track of best-ending-at[i] and prev[i]: initialise best-ending-at[0] = 1 and prev[0] = -1
  • Go through i = 1,...,n-1
    • Initialise best-ending-at[i] = 1 and prev[i] = -1
    • Go through j in [0, i-1] and see if v[j] < v[i] with b-e-a[j] + 1 > b-e-a[i].
      • If so then update b-e-a[i] with b-e-a[j] + 1, and set prev[i] = j
    • If b-e-a[i] is best after the loop, then update bestLen with b-e-a[i] and bestEnd with i
  • Can then use prev to reconstruct subsequence if you want

There's an O(n lg n) version with binary searches:

  • Let S[pos] be defined as the smallest integer that ends an increasing subsequence of length pos.
  • Now iterate through every integer X of the input and do the following:
    • If X > the last element in S, append X to the end of S. This essentially means we have found a new longest increasing subsequence.
    • Otherwise find the earliest element in S that is >= X and change it to X. Because S is sorted at all times, that element can be found by binary search in O(lg n).
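
As code with Python's bisect module (bisect_left gives strictly increasing subsequences; note S is generally not itself a valid subsequence of the input - only its length matters):

  from bisect import bisect_left

  def lis_length(v):
      S = []                            # S[k] = smallest value ending an increasing
                                        # subsequence of length k + 1 seen so far
      for x in v:
          i = bisect_left(S, x)         # earliest element >= x
          if i == len(S):
              S.append(x)               # x extends the longest subsequence found
          else:
              S[i] = x                  # x is a smaller tail for length i + 1
      return len(S)

  assert lis_length([3, 1, 4, 1, 5, 9, 2, 6]) == 4   # e.g. 1, 4, 5, 9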

Biggest Partial Sum

  • Keep track of best-ending-here and best-so-far. Initialise both to 0.
  • Go through i = 0,...,n-1
    • best-ending-here = max(0, best-ending-here + v[i])
    • best-so-far = max(best-so-far, best-ending-here)
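
As code (with an all-negative array this returns 0, the empty sum, matching the initialisation above):

  def max_subarray_sum(v):
      best_ending_here = best_so_far = 0
      for x in v:
          # either extend the current run or restart from the empty subarray
          best_ending_here = max(0, best_ending_here + x)
          best_so_far = max(best_so_far, best_ending_here)
      return best_so_far

  assert max_subarray_sum([-2, 1, -3, 4, -1, 2, 1, -5, 4]) == 6   # 4 - 1 + 2 + 1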

(0,1)-Knapsack

  • We have n items, capacity W, item weights w[i] and values v[i].
  • We want m[n, W]. Can recurse on:
    • m[0, *] = 0
    • if w[i] <= j then m[i,j] = max[ m[i-1, j], m[i-1, j-w[i]] + v[i] ]
    • else m[i,j] = m[i-1, j]
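
A bottom-up sketch of this recursion (items are indexed from 1 to match m[i, j] above; the answer is m[n, W]):

  def knapsack(weights, values, W):
      n = len(weights)
      # m[i][j] = best value using the first i items with capacity j
      m = [[0] * (W + 1) for _ in range(n + 1)]
      for i in range(1, n + 1):
          w, v = weights[i - 1], values[i - 1]
          for j in range(W + 1):
              if w <= j:
                  # skip item i, or take it and fill the remaining capacity
                  m[i][j] = max(m[i - 1][j], m[i - 1][j - w] + v)
              else:
                  m[i][j] = m[i - 1][j]
      return m[n][W]

  assert knapsack([1, 3, 4, 5], [1, 4, 5, 7], 7) == 9   # take weights 3 and 4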