(203a) The Euler Characteristic: A General Topological Descriptor for Complex Data | AIChE

(203a) The Euler Characteristic: A General Topological Descriptor for Complex Data


Smith, A. - Presenter, University of Wisconsin - Madison
Zavala, V. M., University of Wisconsin-Madison
Datasets are mathematical objects (e.g., point clouds, matrices, graphs, images, field/functions) that have shape [1]. Characterizing the shape (geometrical features) of these objects is important but is not always straightforward. Popular tools from statistics, linear algebra, and signal processing (e.g., moments, correlation functions, singular value decomposition, convolutions, Fourier analysis) do not directly characterize geometrical features of data objects; instead, such tools are used to characterize other types of features (e.g., variance and frequency content).

Topology is a branch of mathematics that provides powerful tools to characterize the shape of data objects. One such tool is the so-called Euler characteristic (EC); the EC, originally used for the characterization of polyhedra [2], is now broadly used in scientific areas such as random fields [3, 4, 5], cosmology [6, 7, 8], material science [9, 10, 11, 12], thermodynamics [13, 14, 15], and neuroscience [16, 17]. To the best of our knowledge, the EC has seen limited applications in engineering; moreover, a fact that is often overlooked in the literature is that the EC provides a general descriptor of different types of topological spaces (this enables the characterization of a wide range of data). This generality arises from the fact that (i) one can use transformations to map a data object into another type of object and (ii) the EC has fundamental connections with statistics, field theory, linear algebra, and graph theory.

Topological descriptors such as the EC offer advantages over statistical descriptors [18]. For instance, statistical descriptors such as Moran’s I or correlation matrices do not directly capture the spatial structure of the data (thus limiting the ability to characterize geometrical features) [19, 20]. High-order statistical descriptors such as correlation functions are also limited at capturing spatial and morphological features of the data (especially if the data object is irregular) [20]. However, there exists well-developed theory that connects the EC to the geometry of random fields [4, 5]. Such work establishes that the EC encodes the information of simple statistical descriptors such as means and variances and of more complex descriptors (e.g., space-time covariances) [3]. These connections between topology and statistics are powerful and provide a mechanism to understand the emergence of topological features from physical behavior (e.g., diffusion phenomena). The EC also connects with concepts from linear algebra and graph theory; for example, a matrix or an image can be represented as a weighted graph and the geometry of the graph can be quantified using the EC (a graph is a 2D polyhedron).

In this talk, we will focus on the EC and its application as a descriptor that characterizes geometrical features of data. This characterization is accomplished by performing a decomposition of a data object into a set of independent topological bases which is summarized in the form of what is called an EC curve. We briefly discuss the mathematics of the EC and how it can be used to characterize diverse data objects (e.g. graphs and images/fields). We then shift our focus to the application of these concepts to tackle diverse problems arising in science and engineering; in particular, we discuss how the EC can be used in process monitoring by analyzing correlation structures. We also apply the EC in the analysis of both 2D spatial and 3D spatio-temporal fields; these data objects are derived from reaction-diffusion partial differential equations (PDEs), micrographs of liquid crystals, and flow cytometry. We show that the EC effectively reduces complex datasets and that this reduction facilitates tasks such as visualization, regression, classification, and clustering.

[1] Alexander D Smith, Paweł Dłotko, and Victor M Zavala. Topological data analysis: Concepts, computation, and applications in chemical engineering. Computers & Chemical Engineering, page 107202, 2020

[2] Leonhard Euler. Elementa doctrinae solidorum. Novi commentarii academiae scientiarum Petropolitanae, pages 109–140, 1758.

[3] Robert J Adler. Some new random field tools for spatial analysis. Stochastic Environmental Research and Risk Assessment, 22(6):809–822, 2008

[4] Robert J Adler. The geometry of random fields. SIAM, 2010.

[5] Robert J Adler and Jonathan E Taylor. Random fields and geometry. Springer Science & Business Media, 2009.

[6] Jens Schmalzing and Krzysztof M Gorski. Minkowski functionals used in the morphological ´ analysis of cosmic microwave background anisotropy maps. Monthly Notices of the Royal Astronomical Society, 297(2):355–365, 1998.

[7] Pratyush Pranav, Rien Van de Weygaert, Gert Vegter, Bernard JT Jones, Robert J Adler, Job Feldbrugge, Changbom Park, Thomas Buchert, and Michael Kerber. Topology and geometry of Gaussian random fields i: on Betti numbers, Euler characteristic, and Minkowski functionals. Monthly Notices of the Royal Astronomical Society, 485(3):4167–4208, 2019.

[8] M Kerscher, K Mecke, J Schmalzing, C Beisbart, Th Buchert, and H Wagner. Morphological fluctuations of large-scale structure: The PSCZ survey. Astronomy & Astrophysics, 373(1):1–11, 2001.

[9] Klaus R Mecke and Dietrich Stoyan. Statistical physics and spatial statistics: the art of analyzing and modeling spatial structures and pattern formation, volume 554. Springer Science & Business Media, 2000.

[10] Christoph H Arns, Mark A Knackstedt, and KR Mecke. Reconstructing complex materials via effective grain shapes. Physical Review Letters, 91(21):215506, 2003.

[11] Ruth Charney and Michael Davis. The Euler characteristic of a nonpositively curved, piecewise Euclidean manifold. Pacific Journal of Mathematics, 171(1):117–137, 1995.

[12] Christian Scholz, Frank Wirner, Jan Gotz, Ulrich Rude, Gerd E Schroder-Turk, Klaus Mecke, and Clemens Bechinger. Permeability of porous materials determined from the Euler characteristic. Physical review letters, 109(26):264504, 2012.

[13] KR Mecke. Morphological thermodynamics of composite media. Fluid Phase Equilibria, 150:591– 598, 1998.

[14] Hamid Hosseinzade Khanamiri, Carl Fredrik Berg, Per Arne Slotte, Steffen Schluter, and Ole ¨ Torsæter. Description of free energy for immiscible two-fluid flow in porous media by integral geometry and thermodynamics. Water Resources Research, 54(11):9045–9059, 2018.

[15] Hendrik Hansen-Goos, Roland Roth, Klaus Mecke, and S Dietrich. Solvation of proteins: linking thermodynamics to geometry. Physical review letters, 99(12):128101, 2007.

[16] James M Kilner, Stefan J Kiebel, and Karl J Friston. Applications of random field theory to electrophysiology. Neuroscience letters, 374(3):174–178, 2005.

[17] Hyekyoung Lee, Moo K Chung, Hyejin Kang, Bung-Nyun Kim, and Dong Soo Lee. Discriminative persistent homology of brain networks. In 2011 IEEE international symposium on biomedical imaging: from nano to macro, pages 841–844. IEEE, 2011.

[18] Eitan Richardson and Michael Werman. Efficient classification using the Euler characteristic. Pattern Recognition Letters, 49:99–106, 2014.

[19] Svetlana Lazebnik, Cordelia Schmid, and Jean Ponce. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), volume 2, pages 2169–2178. IEEE, 2006.

[20] Hubert Mantz, Karin Jacobs, and Klaus Mecke. Utilizing Minkowski functionals for image analysis: a marching square algorithm. Journal of Statistical Mechanics: Theory and Experiment, 2008(12):P12015, 2008.