Study Topic: Topological Data Analysis

From Matt Morris Wiki
Jump to navigation Jump to search

This is a Study Leave topic.

What Is It?

Topological Data Analysis offers some promise in analysing point clouds for structure. The main tack is to connect the points into simplical complexes based on proximity, and work out topological invariants such as the Betti number - deciding which points are linked up to form the simplexes is parameterised by the proximity threshold from smallest separation to largest. The result is a "barcode"/"fingerprint" that allows people to pick out topological characteristics that are present over a range of the parameterisation. This packs a lot of information about the structure of the data into a single object.

Why Study It?

I've got a head start here because I've studied a fair bit of algebraic topology.

Toggl code

Toggl code is TOPIC-TDA

Deliverables

Max 7 hours of time. Start with an hour reading "Topology for Computing". Write a view up on this page of what the main concepts are, and how likely it is that further study will be worthwhile.

Concepts

Category Theory

A category C consists of:

  • a collection Ob(C) of objects
  • sets Mor(X,Y) of morphisms for each X,Y in Ob(C), including an identity morphism in Mor(X,X) for each X
  • for each triple X,Y,Z in Ob(C),a composition of morphisms function *: Mor(X,Y) x Mor(Y,Z) -> Mor(X,Z), s.t. f*1 = 1*f = f, and (f*g)*h = f*(g*h)

A (covariant) functor F from category C to category D assigns to each X in C an F(X) in D, and to each f in Mor(X,Y) a morphism F(f) in Mor(F(X), F(Y), s.t. F(1) = 1 and F(f*g) = F(f)*F(g)

Simplical Complexes

To construct topological invariants on a space, simplical complexes are key. Here are the main ideas:

  • Simplexes are points. lines, triangles, tetrahedra and their higher-dimensional analogues.
  • A face of a given simplex is a simplex on some subset of the original simplexes' vertices (the subset may be empty, and may have all the original vertices).
  • A coface is the dual of a face: A is a face of B iff B is a coface of A
  • A proper face/coface is where the set of vertices differs between the two simplexes are concerned
  • A simplical complex is a collection of simplexes that is closed under taking any face of any simplex, and on taking the intersection of any two simplexes.
  • A simplex is principal in a simplical complex if it the complex contains no proper coffees (there is nothing "bigger")
  • The dimension of a simplical complex is the greatest dimension of any member simplex. Here points are 0-dim, lines 1-dim, and so on.
  • A subcomplex L of a simplical complex K is one where L is a simplical complex and all its simplexes are also in K.
  • Cl(L), the closure of a set of simplexes L in complex K, is the smallest subcomplex containing everything in L
  • St(L), the star of a set of simplexes L in complex K, is the union of all cofaces of simplexes in L
    • That is, all simplexes in the complex that have all of the vertices of some simplex in L
  • Lk(L), the link of a set of simplexes L in complex K is the boundary of its star: Lk(L) = Cl(St(L)) - St(Cl(L) - {0})
    • That is, all simplexes in the complex that link star-simplexes to the rest of the complex
  • A triangulation of a topological space is a simplical complex that is homeomorphic to it.
  • A filtration of a simplical complex is a nested sequence of subcomplexes: K, with its filtration, is a filtered complex.
  • A filtration ordering of a simplical complex is a full ordering of its simplexes such that each prefix of the ordering is a sub complex.

You only really need the vertices and not a geometrical realisation. The collection, for a given simplical complex K of all sets of vertices that span a simplex is called its vertex scheme and forms an abstract simplical complex - essentially K shorn of its geometric imterpretation.

Group Theory

Groups

A group is a set G and binary operation *, such that:

  • * is associative
  • there's an identity element e in G
  • for each a in G, there's an inverse a' such that a * a' = a' * a = e

We are mainly interested in Abelian groups, where * is commutative. Groups can have subgroups, subsets of the original G that satisfy the group requirements in themselves.

Cosets and Homomorphisms

Let H be a subgroup of G: then you can partition G into cosets: aH = {ah: h in H} is the left coset and Ha = {ha | h in H} is the right coset. For Abelian groups, these two things are equal. Note for a given G and H, every left coset and every right coset will have the same size (the same size as H).

A map M of groups G to G' is a homomorphism' if M(ab) = M(a)M(b) for all a, b in G. You always have at least the trivial homeomorphism for any 2 groups, where all g in G get taken to e', the identity element for G'. homomorphisms preserve identities, inverse relationships and subgroups (in both directions).

The set of elements in G mapping to e' in G' is called the kernel (ker M) of the homeomorphism. M is 1-1 iff "ker M' - {e}. A 1-1 homomorphism is a monomorphism. An onto homomorphism is an epimorphism. A 1-1 and onto homomorphism is an isomorphism: isomorphisms define an equivalence relationship across groups.

A normal subgroup H of a group G is one where all the left & right cosets coincide. This is true for all H if G is Abelian, and it is also true where H is the kernel of a homeomorphism.

If H is a normal subset of G then the cosets of H form a group G/H under the binary operation (aH)(bH) = (ab)H. This is the factor group or quotient group of G modulo H. The elements in the same coset of H are said to be congruent modulo H. As an example, with group G and homomorphism M, the factor group G / (ker M) is naturally isomorphic to M(G). Write H= ker M. Then U: G/(H) -> M(G) given by U(gH) = M(g) is an isomorphism.

Fundamental Theorem of Finitely Generated Abelian Groups

A cyclic group is generated by taking successive powers of a single element. You can take direct products of groups. The smallest subgroup of G containing all elements in A is the subgroup generated by A. If that subgroup equals G then A generates G and the elements of A are the generators of G. If A is finite then G is finitely generated.

The fundamental theorem of finitely generated Abelian groups states that every finitely generated Abelian group is isomorphic to a direct product of cyclic groups, with each cyclic group being either Z (integers + addition) or Z(n) (0,1,...,n-1 and addition modulo n) with the n's being of form p^r with p prime.

The number of factors of type Z (the non-cyclic ones) are the Betti number beta(G) of G, and the subscripts of the finite cyclic factors are the torsion coefficients of G.

For example take Z6: this has {0,3} as a subgroup. Since Z6 is Abeliam, {0.3} is normal, so we can factor Z6 into the cosets {0,3},{1,4}, {2,5}. The group formed by these cosets is iso to Z3. So Z6 / {0,3) is iso to Z3. But also {0,3} is iso to Z2. So we have Z6/Z2 is iso to Z3.

Free Abelian Groups

So we have finitely generated Abelian groups being iso to a product of infinite and finite cyclic groups. We can characterise infinite factors using the notion of free Abelian groups.

An Abelian group G is a free Abelian group if it has a nonempty generating set X (the basis for the group) that obeys one of these two (equivalent) conditions:

  • Each nonzero element a in G can be written a = sum(n(i) * x(i)) for nonzero n(i) in Z, and distinct x(i) in X
  • X generates G and "sum(n(i) * x(i)) = 0 for n(i) in Z, and distinct x(i) in X" iff all n(i) = 0.

If G is a nonzero free Abelian group with a basis of r elements, then G is iso to ZxZx...xZ for r factors. So the number of factors is fixed: we call this the rank of G.

Subgroups of free Abelian groups: A subgroup K of a free Abelian group G (of finite rank n) is also a free Abelian group, with rank s <= n. Also there exists a basis { x(i) } for G and { d(i) } in Z+ such that { d(i)*x(i) } is a basis for K. G/K is also finitely generated, and its rank is n-s.

From all this, we can rewrite the fundamental theorem of finitely generated Abelian groups to say that all such groups G can be factored into a free Abelian group H and the product of finite cyclic groups T, G = H * T. Then G/T is iso to H, which is iso to pow(z, beta) where beta is the Betti number of G. T is often called the torsion subgroup of G and it contains all generators with finite orders.

Rings, Fields, Integral Domains

A ring <R, +, *> is a set R with two binary operations +, * (which we can call addition and multiplication) such that:

  • <R, +> is an Abelian group
  • * is associative
  • For a,b,c in R the left distributive law a(b+c) = ab + bc, and the right distributive law (a+b)c = ac+bc, both hold

A ring R with multiplicative identity 1 (x*1 = 1*x = x for all x in R) is a ring with unity.

We have the following concepts lining up between groups and rings:

Groups Rings
Abelian commutative
subgroup subring
normal ideal
cyclic principal

A field F is a commutative ring with unity and multiplicative inverses.

An integral domain D is a commutative rind with unity such that for all nonzero a,b in D, ab != 0.

An element u of an integral domain D is a unit of D if U has a multiplicative inverse in D. A nonzero element p of D that is not a unit of D is an irreducible of D if any factorisation p=ab in D requires either a or b to be a unit.

Fields and integral domains are very closely related:

  • every field is an integral domain
  • every finite integral domain is a field.

Examples:

  • Z is not a field (no multiplicative inverses) but it is an integral domain
  • Q and R are fields (and so also integral domains)
  • If p is prime, Zp is an integral domain (and so also a field, being finite)
  • If p is not prime, Zp is not an integral domain (as it has nonzero elements dividing zero)
  • If R is a commutative ring with unity, then the set of all polynominals f(t) = {sum( a(i) * pow(t,i ) ) with a(i) in R and t the indeterminate} form a commutative ring R[t] with unity.

An integral domain D is a principal ideal domain (PID) if every ideal in D is a principal ideal. This is the ring analogue to cyclic Abelian groups (all of whose subgroups are normal and cyclic):

Examples:

  • R, Q, Z, Zp (p prime) are all PIDs
  • Usually R[t] is not a PID for arbitrary rings R
  • If R is a field, R[t] is a PID.

Modules, Vector Spaces and Gradings

If R ia a ring, a (left) R-module is an Abelian group M together with a multiplication operation (r in R) * (m in M) so that for all a,b in M and all r,s in S we have:

  • (r * a) in M
  • r * (a + b) = r * a + r * b
  • (r + s) * a = r * a + s * a
  • (rs) * a = r(s * a)

In this context, people often call M an R-module. If R is a ring with unity, and 1 * a = a for all a in M, M is a unitary R-module. M is cyclic if there exists a in M such that M = { r * a | r in R } - so it can be generated by the action of R on a single M-element.

Let F be a field and V be an Abelian group. Then a vector space over R is a unitary F-module, where V is the associated Abelian group. The elements of F are scalars and the elements of V are vectors. We often refer to V as the vector space.

A graded ring <R, +, *> has a direct sum decomposition of Abelian groups R(i): similarly a graded module M over a graded ring R has a direct sum decomposition into modules. We write the decompositions as R(i) and M(i) with i in Z. We say a graded ring (or module) is non-negatively graded if R(i) (or M(i)) is zero for all i < 0.

Example:

  • R[t], the polynominal ring on R, can be graded non-negatively by taking the standard grading pow(t,n)*R[t] for n >= 0.

The Structure Theorem: if D is a PID, every finitely generated D-module is iso to a direct sum of cyclic D-modules. Similarly every graded module M over a graded PID D decomposes into a direct sum of graded modules.

Topology

Basics

  • A topology on a set X is a subset T (the open sets) of pow(2,X) that is closed under finite intersection and arbitrary union, and which contains O and X.
  • The closed sets are then X - S (S in T)
  • Sets may be both open and closed. For instance O is both open and closed by definition.
  • The pair <X, T> is then a topological space.
  • The interior of a set A in X is the union of all open sets contained in A.
  • The closure of a set A in X is the intersection of all closed sets containing A.
  • The boundary of a set A is its closure minus its interior.
  • A neighbourhood of x in X is any A in X such that x is in A's interior.
  • A basis of neighbourhoods at x in X is a collection of neighbourhoods of x so every neighbourhood of x contains one of the basis neighbourhoods.
  • A metric or distance function d: X x X -> R is a function satisfying:
    • positivity: For all x, y in X, d(x, y) >= 0
    • nondegeneracy: d(x, y) = 0 => x = y
    • symmetry: For all x, y in X, d(x, y) = d(y, x)
    • triangle inequality: For all x, y, z in X, d(x, y) + d(y, z) >= d(x, z)
  • The open ball B(x, r) with centre x and radius r > 0 w.r.t. metric d is defined as B(x, r) = {y | d(x, y) < r}
  • A set X with metric d is a metric space with the metric topology of d, and the set of open balls defined using d serve as basis neighbourhoods.
  • The product' of two topological spaces consists of the Cartesian product of their sets, along with the product topology that consists of the Cartesian product of their open sets

Homeomorphisms

  • A homeomorphism f: X -> Y is a 1-1 onto function such that both f & f-inverse are continuous. We say X is homeomorphic to Y, and that X and Y have the same topological type.
  • Homeomorphisms partition the class of topological spaces into equivalence classes of homeomorphic spaces.
  • One fundamental problem in topology is classifying these classes.

Manifolds

Intuitively, a n-dimensional manifold is a topological space that locally looks like Rn:

  • A chart at p in X is a function f: U -> Rd, where U in X is an open set containing p and f is a homeomorphism onto an open subset of Rd.
  • The dimension of the chart f is d.
  • The coordinate functions of the chart are x(i) = u(i).f : U -> R, where u(i) : Rd -> R are the standard coordinates on Rd.
  • A topological space X is Hausdorff if, for every x, y in X with x != y, there are neighbourhoods U, V of x, y respectively, such that their intersection is empty.
  • A topological space X is separable if it has a countable basis of neighbourhoods.
  • A (topological) d-manifold is a seperable Hausdorff space X where every x in X has a d-dimensional chart (so every x in X has a neighbourhood homeomorphic to Rd)
  • The manifold is said to have dimension d.
    • It's a d-manifold with boundary if every x in X has a neighbourhood homeomorphic to Rd or to the half-space Hd = {x in Rd | x(1) >= 0}.
    • The boundary is then the set of points with neighbourhood homeomorphic to Hd.
  • The boundary of a d-manifold with boundary is a (d-1)-manifold without boundary.

Compactness:

  • A covering of a subset A of X is a collection of subsets of X that contain A in their union.
  • An open covering is a covering consisting of open sets.
  • A subcovering of a given covering is one that only contains sets from the original covering.
  • A subset A of X is compact if every open covering of A has a finite subcovering.

Smoothness:

  • Let U, V in Rd be open. Then a function f: U -> R is smooth (C-infinity, continuous of order infinity) if f has partial derivatives of all orders and types.
  • A function phi: U -> Rd is a smooth map if all its components e(i).phi : U -> R are smooth.
  • Two charts g: U -> Rd and h: U -> Re are smooth-related if d = e and either U and V have no intersection, or g.(inv h) and h.(inv g) are smooth maps.
  • A smooth atlas is one where every pair of charts is smooth-related.
  • A chart is admissible to a smooth atlas if it is smooth-related to every chart in that atlas.
  • A smooth manifold is a manifold together with all the admissible charts of a smooth atlas.

Orientability:

  • A pair of charts x(i) and y(i) is consistently oriented if the Jacobian determinant det( d(x(i))/d(y(j)) ) is positive wherever defined.
  • A manifold is orientable if there exists an atlas s.t. every pair of charts in the atlas is consistently oriented.
  • Such an atlas is consistently oriented and determines an orientation on the manifold.
  • If a manifold is not orientable, it is nonorientable.

Embedding:

  • An embedding f: X -> Y is a map whose restriction to its image f(X) is a homeomorphism.
  • A manifold exists independently of any embedding: a sphere does not need to sit in R3 to be a sphere.

Homology and Homology

Homotopy

  • A deformation retraction of a space X onto a subspace A is a family of maps f(t) : X -> A, t in [0,1], such that f(0) is the identity map, f(1)(X) = A, and f(t)|A is the identity map for all t
  • The family should be continuous, in the sense that the associated map X x [0,1] -> A, (x, t) -> f(t)(x), is continuous.
  • A homotopy is a family of maps f(t): X -> Y, t in [0,1], such that the associated map F: X x [0,1] -> Y given by F(x,t) = f(t)(x) is continuous.
  • Then f(0), f(1): X -> Y are homotopic via the homotopy f(t).
  • A map f: X -> Y is a homotopy equivalence if there is a map g: Y -> X such that f.g is homotopic to 1Y and g.f. is homotopic to 1X.
  • Then X and Y are homotopy equivalent and have the same homotopy type.
  • Homeomorphic spaces are also homotopy equivalent (homeomorphy is stronger than homotopy).

Invariants

  • A topological invariant maps the same object to spaces of the same topological type. Invariants may assign the same object to spaces of different topological types, but a good invariant will have enough differentiating power to be useful.
  • If K is a simplical complex, the Euler characteristic xi(K) is sum( s in K-{0} : pow(-1, dim s ). Any triangulation of a space M will get the same value, which defines the Euler characteristic for M.

Basic 2-manifolds

We have

  • S2: sphere
  • T2: torus (a square with opposite sides identified)
  • K2: Klein bottle (a square with opposite sides identified, one pair with opposing directions)
  • RP2: Projective place (a square with opposite sides identified, both pairs with opposing directions)

We can triangulate S2 with a tetrahedron, so xi(S2) = 4 - 6 + 4 = 2. Triangulations for the other manifolds give xi(T2) = xi(K2) = 0 and xi(RP2) = 1.

Connected Sums

We form the connected sum of two n-manifolds M1 and M2, M1 # M2 by cutting out n-dimensional closed disks and gluing the manifolds together on the boundary of the disks.

Since in triangulation terms this is removing a triangle, we have xi(M1 # M2) = xi(M1) + xi(M2) - 2.

The connected sum of g torii is called a surface with genus g:

  • xi(g T2) = 2 - 2g
  • xi(g RP2) = 2 - g

Classifying 2-manifolds

Two closed compact surfaces M1 and M2 are homeomorphic iff

  • xi(M1) = xi(M2)
  • either M1 and M2 are both orientable or they are both non-orientable

Sadly, this only works for 2-manifolds! Both homeomorphism and homotopy are undecidable for n-manifolds, n >= 4. 3-manifolds are an area of active research.

Fundamental Group, Homotopy Groups

  • A path in X is a continuous map f: [0,1] -> X
  • Write [f] for the equivalence class of f under homotopy
  • Given paths f,g then the product path f.g is a path that traverses both f and g (double the traversal speed to achieve this). This product respects homotopy classes.
  • A loop is a path f with f(0) = f(1)
  • The fundamental group pi1(X, x0) of X and xo in X has the homotopy classes of loops in X based at x0 as elements, and [f][g] = [f.g] as the binary operation
  • Example: pi1(T2) = Z x Z (one loop around the hole, one loop around the circumference)

The fundamental group is one of a series of homotopy groups pi-n(X) for a space X. The higher dimensional homotopy groups extend the notion of a loop to n-dimensional cycles, and capture the homotopy classes of those. We won't be using homotopy groups to differentiate between spaces because:

  • The definition is inherently non combinatorial - it depends on smooth maps and the topology of the space
  • The higher dimensional homotopy groups are very complicated and hard to compute - in particular you can't get them from a simplical decomposition
  • Homotopy groups can lead to an infinite description of a space

We'd rather have something computable and combinatorial. Homology is a good way to go for this.

Homology Groups

  • The k-th chain group of a simplical complex K is written <Ck(K), +>: it's the free Abelian group on the oriented k-simplices, where [s] = -[t] if s = t and s,t have different orientations.
  • An element of Ck(K) is a k-chain, written (sum: n(q)[s(q)], where n(q) in Z and s(q) in K)
  • The boundary homomorphism d(k): C(k)(K) -> C(k-1)(K) is defined as d(k)(s) = sum( pow(-1,i) * [v(0) v(1) ... ~v(i) ... v(n) ] ), where ~v(i) indicates that vertex is deleted
  • Examples:
    • d1[a b] = b - a
    • d2[a b c] = [b c] - [a c] + [a b] = [b c] + [c a] + [a b]
    • d3[a b c d] = [b c d] - [a c d] + [a b d] - [a b c]
  • The boundary of a boundary is zero: d(k-1)d(k) = 0 for all k.
  • The boundary homomorphism gives rise to a chain complex on K: 0 -> C(n) -> C(n-1) -> ... -> C(1) -> C(0) -> 0
  • Im(d(k+1)) and ker(d(k)) are free Abelian normal subgroups of C(k). im(d(k+1)) is a normal subgroup of ker(d(k)).
  • The k-th cycle group is Z(k) = ker(d(k)). An element of Z(k) is a k-cycle.
  • The k-th boundary group is B(k) = im(d(k+1)). An element of B(k) is a k-boundary.
  • We also call boundaries bounding cycles, and cycles not in B(k) non-bounding cycles.

The k-th homology group H(k) = Z(k) / B(k) = ker(d(k)) / im(d(k+1))

If z1 = z2 + B(k), with z1, z2 in Z(k) then we say z1 and z2 are homologous.

Homology groups are finitely generated Abelian (as they are factor groups of two free Abelian groups). So the fundamental theorem of groups applies, and we can describe homology groups through their Betti numbers and torsion subgroups.

The k-th Betti number beta(k) of a simplical complex K is the rank of the free part of H(k).

We can get some intuition on what this means by using the Alexander duality: beta(0) is the number of components, beta(1) is tunnels, beta(2) is voids.

2-manifold homology groups:

2-manifold H0 H1 H2
sphere Z {0} Z
torus Z Z x Z Z
projective plane Z Z2 {0}
Klein bottle Z Z x Z2 {0}

Note that invariance is hard to prove for simplical homology, as one cannot arrive at common refinements of any two triangulations in general. Instead one needs to define a more general theory, singular homology (which turns out to be equivalent to simplical homology).

Once one has established the invariance of homology, the invariance of the Euler characteristic can be arrived at: we formulate xi in terms of the rank of the chain complex members

Homology with Arbitrary Coefficients

  • we can use Zp instead of Z as the coefficients: gives same Betti numbers (with a Zp-product instead of a Z-product), and torsions using the gcd with p
  • we can use a field instead of Z: gives same Betti numbers, and torsion groups disappear
  • work with Z2 if all we want is the Betti numbers, as that's computationally easiest

Morse Theory

Morse Theory is geometric - it identifies points where the geometry is changing and relates them via a complex. Let's assume we have a smooth compact 2-manifold M emedded in R3 for the following.

Tangent Spaces:

  • A tangent vector v-p to R3 consists of two points of R3: its vector part, v and its point of application, p.
  • A tangent vector v-p is tangent to M at p if v is the velocity of some curve in M.
  • The set of all tangent vectors to M at p is the tangent plane of M at p' and is denoted T-p(M).
    • If phi is a path in M s.t. phi(u0,v0) = p then a tangent vector v is tangent to M iff v can be written as a linear combination of d(phi)/du and d(phi)/dv.
  • A vector field or flow on V is a function that assigns a vector v-p in T-p(M) to each point p in M.
  • For v-p in T-p(M) and h:M->R then the derivative v-p[h] of h w.r.t. v-p is the common value of (d/dt)(h.y)(0) for all curves y in M with initial velocity v-p
  • The differential d(h-p) of h:M->R at p in M is a linear function on T-p(M) such that d(h-p)(v-p) = v-p[h] for all tangent vectors v-p in T-p(M)
    • View the differential as converting vector fields into real-valued functions

Critical points and Morse functions: Given a surface M and function h:M->R, we are interested in the geometry h gives the manifold. We are particularly interested in points p where h is stationary:

  • A point p in M is critical for map h:M->R if d(h-p) is the zero map: otherwise p is regular.
  • Let x,y be a patch on M at p.The Hessian H(p) of h:M->R is then
d2h/dx2(p) d2h/dydx(p)
d2h/dxdy(p) d2h/dy2(p)
    • this gives the Hessian in terms of the basis (d/dx(p), d/dy(p)) for T-p(M)
  • A critical point p in M is nondegenerate if the Hessian is nonsingular at p, det[H(p)] != 0. Otherwise it is singular.
  • A smooth map h:M->R is a Morse function if all its critical points are nondegenerate.
    • Any twice differentiable function h may be unfolded to a Morse function
    • Some Morse function definitions require the values of h are distinct across critical points: not here though

Classifying 2-manifold Critical Points:

  • Morse lemma: you can choose local coords x,y at a critical point p in M so a Morse function h is of form h(x,y) = +/-x*x +/- y*y.
    • +, +: minimum, -,-: maximum, other two are saddle points
  • The index i(p) of h at critical point p is the number of minuses (equal to # of -ve eigenvalues of H(p))
    • 0: minimum, 1: saddle, 2: maximum

Gradients and Integral Lines:

  • Let y be any curve passing through p, tangent to v-p in T-p(M): the gradient div-h of a Morse function h is (dy/dt).div-h = d(h.y)/dt.
    • We can always choose orthonormal coords x,y: then we have div-h = ( dh/dx(p), dh/dy(p) )
  • The gradient of a Morse function h is a vector field on M. We integrate this field to decompose M into regions of uniform flow.
  • An integral line y:R->M is a maximal path whose tangent vectors agree with the gradient: d/ds(p(s)) = div-h(p(s)) for all s in R
    • We call org p = lim(s->-inf) p(s) the origin and dest p = lim(s->+inf) p(s) the destination of the path p.
    • Each integral line is open at both ends, and the limits at both ends exist (as M is compact)
    • Note that a critical point is an integral line by itself (since d-h == 0 so p(s) will be constant at the point).
  • Properties of integral lines:
    • Two integral lines are either disjoint or the same
    • The integral lines cover all of M (every point has a line through it)
    • The limits org p and dest p are critical points of h

Stabie and Unstable Manifolds:

  • The stable manifold S(p) and the unstable manifold U(p) of a critical point p are defined as
    • S(p) = {p} U {y in M | y in image(path), dest path = p}
    • U(p) = {p} U {y in M | y in image(path), org path = p}
  • Both sets of manifolds decompose M into open cells: spaces homeomorphic to Rn for some n
  • The stable manifold S(p) of a critical point p with index i = i(p) is an open cell of dimension S(p) = i
  • The unstable manifolds of h are the stable manifolds of -h [as div-(minus h) = minus div-h]
    • So the two types of manifold have the same structural properties
  • Stable manifolds are pairwise disjoint and partition M into open cells: the cells form a complex, as the boundary of every cell S(a) is a union of lower dimensional cells.
  • We can view a cellular complex as a generalisation of a simplical complex
  • Unstable manifolds also decompose M into a dual to the stable manifold complex.
    • For a, b in M, dim(S(a)) = 2 - dim(U(a)), and S(a) is a face of S(b) iff U(b) is a face of U(a).
  • For 2-manifolds:
    • Saddle points will have single lines (1-manifolds) running through them: to minima for S, from maxima for U.
    • Maxima are just critical points for U, but have a 2-manifold cell for S.
    • Minima are just critical points for S, but have a 2-manifold cell for U.

Morse-Smale Complex:

  • A Morse function is a Morse-Smale function if the stable and unstable manifolds intersect only transversally
    • In two dimensions, this means stable and unstable 1-manifolds cross only at point intersections.
    • That crossing point is necessarily a saddle
  • We intersect the stable and unstable manifolds to get the Morse-Smale Complex
  • Call connected components of sets U(p) ^ S(q) for all critical points p,q in M Morse-Smale cells
    • We refer to cells of dimension 0,1,2 as vertices, arcs and regions respectively
  • The collection of Morse-Smale cells form a complex: the Morse-Smale complex.
  • For a 2-complex:
    • each vertex is a critical point
    • each arc is half the stable or unstable 1-manifold of a saddle
    • each region is a component of the intersection of a maximum's stable 2-manifold and a minimum's unstable 2-manifold

Example:

  • h(x,y) = sin(x) + sin(y)
  • Here there is a tiling of square cells

New Results

Persistence

Recall:

  • A filtration of a simplical complex is a nested sequence of subcomplexes: K, with its filtration, is a filtered complex.
  • A filtration ordering of a simplical complex is a full ordering of its simplexes such that each prefix of the ordering is a sub complex.

Let K(l) be a filtration of a space X: let Z(k,l) = Z(k)(K(l)) and B(k,l) = B(k)(K(l)) be the kth cycle and boundary groups of K(l). The the kth homology group of K(l) is H(k,l) = Z(k,l) / B(k,l). The kith Betti number beta(k,l) of K(l) is the rank of H(k,l).

The kith Betti numbers describe the topology of a growing simplical complex by a sequence of integers.The question is how to pull stable information out of the topological noise. The answer is the notion of persistence - that a significant topological attribute will persist in being a feature of the growing complex. So persistence is defined in terms of filtrations, which become the primary input to algorithms designed to establish persistent attributes.

Homology factors out boundary cycles. Informally, we want to capture non-bonding cycles with long lives: so we look for cycles that will continue to be non-bounding for at least the next p complexes in the filtration.

This gives us persistent homology: If K(l) is a filtration, the p-persistent k'th homology group of K(l) is H(k,l,p) = Z(k,i) / (Z(k,l) ^ B(k,l+p)). The p-persistent k'th Betti number beta(k,l,p) is the rank of H(k,l,p).

As a simplex fills up, cycles will become bounding and so homology group rank will decrease. Let z be a non-bounding k-cycle that's created at time i by simplex s, and let z' ~ z be a homologous k-cycle that is turned into a boundary at time j by simplex t. Then we define the persistence of z and of its homology class [z], as j - i - 1. s is the creator, and t is the destroyer, of [z]. A creator is called a positive simplex, and a destroyer is called a negative simplex. If a cycle class does not have a destroyer, its persistence is infinite.

Frequently a function rho: S(K) -> R will attach numbers to the simplices: these can be used to indice a filtration K(r) = { s in K | rho(s) <= r }. Then for every real w >= 0, we say the w-persistent kith homology group of K(r) is H(k,r,w) = Z(k,r) / (Z(k,r) ^ B(k, r+w)). The w-persistent k'th Betti number beta(k,r,w) of K(r) is the rank of H(k,r,w). The persistence of a k-cycle created at time r and destroyed at time r' is r' - r.

Persistence Algorithms

We work in a 3D torsion-free space. We construct a filtration adding exactly one simplex at every time step. We can calculate Betti numbers beta-k by saying beta-k = # (positive k-simplexes) - # (negative (k+1)-simplexes). This is just a different way of writing rank(H(k)) = rank(Z(k)) - rank(B(K)).

Recall Z are cycles, B are boundaries - so we are looking for cycles that have not been "filled in" by being a boundary of a higher-level simplex that is also in the complex. This means single points, or triangles with no interior simplex, or tetrahedra with no interior simplex, etc.

We need to find the pairs of simplexes responsible for the creation and destruction of cycles - once we have these, working out the Betti numbers is trivial.