Below is the MADDD seminar program for Spring Quarter 2021:
Organizer for Spring 2021: Stefan C. Schonsheck (Email: [email protected]).
Due to the COVID-19 pandemic, we have decided to hold the seminar fully online. The seminar runs every Tuesday, 4:10-5:00pm (unless otherwise specified), via Zoom: https://ucdavis.zoom.us/j/98654026454
Schedule:
3/30: What are we reading I (WAWR) [4:10pm]
Abstract: This quarter we will try something a bit new in our series to encourage participation and collaboration in the era of remote meetings. Once a month, we will ask the question, "What are we reading?". Participants will present a brief summary (<5 minutes) of a paper or project they are interested in or are currently reading. Single slides, short whiteboard 'chalk talks', and/or hand-waving explanations are all acceptable. Our goal is both to spread general knowledge and to help participants find collaborators with aligned interests. Students enrolled in this class for credit will be required to speak at at least two of the three meetings but are encouraged to speak at each.
4/6: Causal Inference in the Light of Drug Repurposing for COVID-19 with Prof. Caroline Uhler [1:00pm]
Abstract: Massive data collection holds the promise of a better understanding of complex phenomena and ultimately, of better decisions. An exciting opportunity in this regard stems from the growing availability of perturbation/intervention data (drugs, knockouts, overexpression, etc.) in biology. In order to obtain mechanistic insights from such data, a major challenge is the development of a framework that integrates observational and interventional data and allows predicting the effect of yet unseen interventions or transporting the effect of interventions observed in one context to another. I will present a framework for causal structure discovery based on such data and highlight the role of overparameterized autoencoders. We end by demonstrating how these ideas can be applied for drug repurposing in the current SARS-CoV-2 crisis.
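To make the distinction between observational and interventional (perturbation) data concrete, here is a minimal Python sketch of a two-variable linear structural causal model; the model, its coefficients, and the interventions are purely illustrative assumptions and are not from the talk, which addresses the much harder problem of discovering causal structure from such data.

```python
import numpy as np

# Toy linear structural causal model X -> Y. Interventions reveal the causal
# direction, which observational correlation alone cannot; the model and its
# coefficients are illustrative and are not taken from the talk.
rng = np.random.default_rng(0)

def sample(n=100_000, do_x=None, do_y=None):
    """Draw observational samples, or samples under do(X = do_x) / do(Y = do_y)."""
    x = rng.normal(size=n) if do_x is None else np.full(n, float(do_x))
    y = 2.0 * x + rng.normal(size=n) if do_y is None else np.full(n, float(do_y))
    return x, y

_, y_do_x = sample(do_x=1.0)    # intervening on the cause shifts the effect
x_do_y, _ = sample(do_y=1.0)    # intervening on the effect leaves the cause alone
print(round(y_do_x.mean(), 2), round(x_do_y.mean(), 2))   # approx. 2.0 and 0.0
```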
4/13: Deep Networks from First Principles with Prof. Yi Ma [4:10pm]
Abstract: In this talk, we offer an entirely “white box” interpretation of deep (convolution) networks from the perspective of data compression (and group invariance). In particular, we show how modern deep layered architectures, linear (convolution) operators and nonlinear activations, and even all parameters can be derived from the principle of maximizing rate reduction (with group invariance). All layers, operators, and parameters of the network are explicitly constructed via forward propagation, instead of learned via back propagation. All components of the so-obtained network, called ReduNet, have precise optimization, geometric, and statistical interpretations. There are also several nice surprises from this principled approach: it reveals a fundamental tradeoff between invariance and sparsity for class separability; it reveals a fundamental connection between deep networks and the Fourier transform for group invariance, namely the computational advantage in the spectral domain (why spiking neurons?); and it clarifies the mathematical roles of forward propagation (optimization) and backward propagation (variation). In particular, the so-obtained ReduNet is amenable to fine-tuning via both forward and backward (stochastic) propagation, both optimizing the same objective. This is joint work with students Yaodong Yu, Ryan Chan, and Haozhi Qi of Berkeley, Dr. Chong You, now at Google Research, and Professor John Wright of Columbia University.
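For readers unfamiliar with the rate-reduction objective behind ReduNet, the NumPy sketch below computes the coding rate R(Z, eps) = 1/2 logdet(I + d/(n eps^2) Z Z^T) and the rate reduction Delta R (whole-data rate minus the class-conditional average). The value of eps and the toy data are illustrative assumptions, and this is only the objective, not the ReduNet construction itself.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """R(Z, eps) = 1/2 * logdet(I + d/(n * eps^2) * Z @ Z.T) for Z of shape (d, n)."""
    d, n = Z.shape
    return 0.5 * np.linalg.slogdet(np.eye(d) + (d / (n * eps ** 2)) * Z @ Z.T)[1]

def rate_reduction(Z, labels, eps=0.5):
    """Delta R = R(Z) - sum_j (n_j / n) * R(Z_j): the rate of the whole data
    minus the average rate of the individual classes. Larger values mean the
    classes jointly span a large volume while each class stays compressed."""
    n = Z.shape[1]
    between = coding_rate(Z, eps)
    within = sum((np.sum(labels == c) / n) * coding_rate(Z[:, labels == c], eps)
                 for c in np.unique(labels))
    return between - within

# Toy usage: two well-separated classes, features normalized to the unit sphere.
rng = np.random.default_rng(0)
Z = np.concatenate([rng.normal([3.0, 0.0], 0.1, (50, 2)),
                    rng.normal([0.0, 3.0], 0.1, (50, 2))]).T
Z = Z / np.linalg.norm(Z, axis=0, keepdims=True)     # unit-norm columns
labels = np.array([0] * 50 + [1] * 50)
print(round(rate_reduction(Z, labels), 3))
```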
4/20: Consistency of Archetypal Analysis with Prof. Braxton Osting [12:10pm]
Abstract: Archetypal analysis is an unsupervised learning method that uses a convex polytope to summarize multivariate data. For fixed k, the method finds a convex polytope with k vertices, called archetype points, such that the polytope is contained in the convex hull of the data and the mean squared distance between the data and the polytope is minimal. In this paper, we prove a consistency result that shows if the data is independently sampled from a probability measure with bounded support, then the archetype points converge to a solution of the continuum version of the problem, of which we identify and establish several properties. We also obtain the convergence rate of the optimal objective values under appropriate assumptions on the distribution. If the data is independently sampled from a distribution with unbounded support, we also prove a consistency result for a modified method that penalizes the dispersion of the archetype points. Our analysis is supported by detailed computational experiments of the archetype points for data sampled from the uniform distribution in a disk, the normal distribution, an annular distribution, and a Gaussian mixture model.
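As a rough illustration of the finite-sample problem described above, here is a small NumPy sketch that fits k archetype points by alternating projected-gradient steps on ||X - BAX||^2 with the rows of A and B constrained to the probability simplex. The optimizer, step sizes, and toy data are illustrative choices, not the algorithm analyzed in the talk.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of a 1-D array onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1.0), 0.0)

def archetypal_analysis(X, k, n_iter=500, lr_A=1e-3, lr_B=0.1, seed=0):
    """Crude alternating projected-gradient sketch of archetypal analysis.

    X is (n, d). Archetypes Z = A @ X with rows of A on the simplex, so Z
    stays in the convex hull of the data; each point is approximated by
    B @ Z with rows of B on the simplex. Objective: ||X - B @ A @ X||_F^2.
    Step sizes are untuned; this is not the solver analyzed in the paper.
    """
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    A = rng.random((k, n)); A /= A.sum(axis=1, keepdims=True)
    B = rng.random((n, k)); B /= B.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        Z = A @ X                                   # current archetypes, (k, d)
        R = B @ Z - X                               # reconstruction residual
        B = np.apply_along_axis(project_simplex, 1, B - lr_B * (R @ Z.T))
        R = B @ Z - X                               # refresh residual after B step
        A = np.apply_along_axis(project_simplex, 1, A - lr_A * (B.T @ R @ X.T))
    return A @ X

# Toy usage: points sampled inside a triangle; the 3 archetypes should drift
# toward its corners (up to optimization error).
rng = np.random.default_rng(1)
corners = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])
X = rng.dirichlet(np.ones(3), size=400) @ corners   # convex combinations of corners
print(np.round(archetypal_analysis(X, k=3), 2))
```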
4/27: What are we reading II [4:10pm]
5/4: Applied Differential Geometry and Harmonic Analysis in Deep Learning Regularization with Prof. Wei Zhu [4:10pm]
Abstract: With the explosive production of digital data and information, data-driven methods, deep neural networks (DNNs) in particular, have revolutionized machine learning and scientific computing by gradually outperforming traditional hand-crafted model-based algorithms. While DNNs have proved very successful when large training sets are available, they typically have two shortcomings: First, when the training data are scarce, DNNs tend to suffer from overfitting. Second, the generalization ability of overparameterized DNNs remains a mystery despite many recent efforts. In this talk, I will discuss two works to “inject” the “modeling” flavor back into deep learning to improve the generalization performance and interpretability of DNNs. This is accomplished by deep learning regularization through applied differential geometry and harmonic analysis. In the first part of the talk, I will explain how to improve the regularity of the DNN representation by imposing a “smoothness” inductive bias over the DNN model. This is achieved by solving a variational problem with a low-dimensionality constraint on the data-feature concatenation manifold. In the second part, I will discuss how to impose scale-equivariance in network representation by conducting joint convolutions across the space and the scaling group. The stability of the equivariant representation to nuisance input deformation is also proved under mild assumptions on the Fourier-Bessel norm of filter expansion coefficients.
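To illustrate the scale-equivariance idea mentioned in the second part, the sketch below lifts a 1-D signal to a (scale, position) feature map by correlating it with dyadically dilated copies of a single filter. The filter, the dyadic scale grid, and the toy signals are assumptions for illustration, not the joint space/scale convolutions or Fourier-Bessel filters from the talk.

```python
import numpy as np

def lift_to_scale(signal, base_filter, n_scales=4):
    """Lift a 1-D signal to a (scale, position) feature map by correlating it
    with dyadically dilated copies of one filter. Rescaling the input then
    roughly shifts the response along the scale axis, which is the kind of
    equivariance that joint space/scale convolutions are built to preserve.
    Purely illustrative; not the construction from the talk."""
    responses = []
    for s in range(n_scales):
        dilated = np.repeat(base_filter, 2 ** s) / 2 ** s   # dilate, keep total mass
        responses.append(np.convolve(signal, dilated, mode="same"))
    return np.stack(responses)                              # (n_scales, len(signal))

# Toy usage: a Gaussian bump and a 2x-wider copy; their feature maps are related
# (approximately) by a one-step shift along the scale axis.
t = np.linspace(-4.0, 4.0, 512)
narrow = np.exp(-(t / 0.5) ** 2)
wide = np.exp(-(t / 1.0) ** 2)
smoothing = np.array([1.0, 2.0, 1.0]) / 4.0
print(lift_to_scale(narrow, smoothing).shape, lift_to_scale(wide, smoothing).shape)
```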
5/11: K-Deep Simplex: Structured Manifold Learning with Simplex Constraints with Abiy Tasissa [4:10pm]
Abstract: Sparse manifold clustering and embedding (SMCE) is an algorithm to cluster nonlinear manifolds using self-representation in the dictionary of data points and a proximity regularization. A computational bottleneck of SMCE is its dependence on a dictionary that scales with the number of data points. In this talk, I will discuss K-Deep Simplex (KDS), a unified optimization framework for nonlinear dimensionality reduction that combines the strengths of manifold learning and sparse dictionary learning. KDS learns local dictionaries that represent a data point with reconstruction coefficients supported on the probability simplex. The dictionaries are learned using algorithm unrolling, an increasingly popular technique for structured deep learning. I will present the application of KDS to the clustering problem and demonstrate its scalability and accuracy on both real and synthetic datasets. This is joint work with Pranay Tankala, James M. Murphy, and Demba Ba.
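As a toy picture of simplex-constrained representation, the sketch below encodes a point over a small dictionary with coefficients on the probability simplex plus a proximity penalty, using a fixed number of projected-gradient steps (the kind of iteration that algorithm unrolling turns into network layers). The penalty weight, step count, and dictionary are illustrative assumptions, not the exact KDS formulation.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of a 1-D array onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1.0), 0.0)

def simplex_code(x, D, lam=0.1, n_steps=50, step=0.1):
    """Encode x with simplex-constrained coefficients over dictionary rows D.

    Minimizes ||x - w @ D||^2 + lam * sum_j w_j * ||x - D_j||^2 over the
    probability simplex by projected gradient. Running a small, fixed number
    of such steps is the kind of iteration that algorithm unrolling turns
    into network layers; illustrative only, not the exact KDS formulation.
    """
    k = D.shape[0]
    w = np.full(k, 1.0 / k)                       # start at the simplex center
    proximity = np.sum((D - x) ** 2, axis=1)      # squared distance to each atom
    for _ in range(n_steps):
        grad = 2.0 * (w @ D - x) @ D.T + lam * proximity
        w = project_simplex(w - step * grad)
    return w

# Toy usage: a point near atom 0 should put most of its weight on nearby atoms.
rng = np.random.default_rng(0)
D = rng.normal(size=(6, 2))                       # 6 atoms in the plane
x = D[0] + 0.05 * rng.normal(size=2)
print(np.round(simplex_code(x, D), 2))
```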
5/18: A Fast Graph-Based Data Classification Method with Applications to 3D Sensory Data in the Form of Point Clouds with Prof. Ekaterina Rapinchuk (Merkurjev) [12:10pm]
Abstract: Data classification, where the goal is to divide data into predefined classes, is a fundamental problem in machine learning with many applications, including the classification of 3D sensory data. In this talk, we present a data classification method which can be applied to both semi-supervised and unsupervised learning tasks. The algorithm is derived by unifying complementary region-based and edge-based approaches; a gradient flow of the optimization energy is performed using modified auction dynamics. In addition to being unconditionally stable and efficient, the method is equipped with several properties allowing it to perform accurately even with small labeled training sets, often with considerably fewer labeled training elements compared to competing methods; this is an important advantage due to the scarcity of labeled training data. Some of the properties are: the embedding of data into a weighted similarity graph, the in-depth construction of the weights using, e.g., geometric information, the use of a combination of region-based and edge-based techniques, the incorporation of class size information, and the integration of random fluctuations. The effectiveness of the method is demonstrated by experiments on the classification of 3D point clouds; the algorithm classifies a point cloud of more than a million points in 1-2 minutes.
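For orientation, the sketch below builds a weighted k-nearest-neighbor similarity graph and classifies unlabeled points by harmonic label propagation on the graph Laplacian. This is only the generic graph-based baseline that such methods build on; the talk's algorithm combines region- and edge-based terms and uses modified auction dynamics instead, and the graph parameters and toy data here are assumptions.

```python
import numpy as np

def knn_similarity_graph(X, k=10, sigma=1.0):
    """Dense weighted k-NN similarity graph with Gaussian edge weights.
    (The talk's method builds a similar graph, possibly with richer
    geometric weights; this is just the generic construction.)"""
    n = X.shape[0]
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]         # skip the point itself
        W[i, nbrs] = np.exp(-d2[i, nbrs] / sigma ** 2)
    return np.maximum(W, W.T)                      # symmetrize

def laplacian_classify(W, labels):
    """Harmonic-function label propagation: solve L_uu F_u = -L_ul Y, i.e.
    minimize the graph Dirichlet energy with labeled values held fixed.
    The talk's algorithm uses auction dynamics on a similar graph instead."""
    L = np.diag(W.sum(axis=1)) - W                 # unnormalized graph Laplacian
    labeled = labels >= 0
    classes = np.unique(labels[labeled])
    Y = (labels[labeled][:, None] == classes[None, :]).astype(float)
    F = np.linalg.solve(L[~labeled][:, ~labeled], -L[~labeled][:, labeled] @ Y)
    out = labels.copy()
    out[~labeled] = classes[F.argmax(axis=1)]
    return out

# Toy usage: two Gaussian blobs, one labeled point per class (label -1 = unknown).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (100, 2)), rng.normal(2.0, 0.3, (100, 2))])
y = -np.ones(200, dtype=int); y[0], y[100] = 0, 1
pred = laplacian_classify(knn_similarity_graph(X), y)
print((pred[:100] == 0).mean(), (pred[100:] == 1).mean())
```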
5/25: What are we reading III [4:10pm]
6/1: TBA
3/30:
Stefan Schonsheck: The Dawning of a New Era in Applied Mathematics - https://www.ams.org/journals/notices/202104/rnoti-p565.pdf
Edgar Jaramillo Rodriguez: Tolerance for Colorful Tverberg Partitions - https://arxiv.org/pdf/2005.13495.pdf
Dan Romik: The Sphere Packing Problem in Dimension 8 - https://arxiv.org/abs/1603.04246