Distortion Growth
Understanding the structure of massive collections of objects in terms of pairwise distances among their members is fundamentally important to human activities and interests throughout science, industry, and beyond. Here, the notion of “distance” is varied and hence highly expressive: it can include familiar settings such as the 3-dimensional world in which we live and the geometry of curved surfaces, and it covers important notions of proximity within, for example, social networks, collections of proteins or images, the Internet, and many more; such geometries typically reside in high dimensions and seem nonintuitive or strange in comparison to our Euclidean surroundings. A paradigm that has proven ubiquitously powerful and useful in a rich variety of pure and applied investigations is embedding a geometric space of interest into a “nicer” space in which one knows how to perform important tasks (e.g., discovering clusters, efficient storage, compression, optimization, and many more). Typically, this goal cannot be achieved without distorting distances, so the pertinent issue is determining the rate of growth of the smallest possible distortion. This is the overarching theme of the project, which is devoted to resolving central issues and longstanding mysteries of the field. For example, it is not even known how much distances must be distorted, in terms of the total number of objects, when one wishes to represent them in the 3-dimensional space in which we reside. As another example, while famous and very important work in the 1980s and 1990s determined the largest possible rate of growth of the distortion required to embed a finite metric space into Euclidean space (without restricting the target dimension as above), it remains a longstanding conundrum whether this worst-case scenario can occur when the collection that we wish to embed is a subset of a classical space of vectors (e.g.,
when distances are measured in terms of the sum of powers of the differences of coordinates, or collections of matrices with natural notions of distance such as the so-called nuclear norm, which occurs in settings ranging from pure mathematics to machine learning and signal processing). When the distortion is shown to grow slowly, as it often does (this entails devising a meaningful geometric embedding), the project harnesses it to devise efficient algorithms for a range of scientific tasks. On the flip side, demonstrating that there is no embedding with small distortion entails obtaining insights and devising tools that are useful and interesting in their own right (e.g., meaningful ways to decompose geometries). The project is devoted to addressing specific open questions in the above directions, as well as deriving unforeseen applications and developing overarching methodologies. The project will include a major effort to train and mentor graduate students and early career researchers through the PI’s advising role and his leadership roles in the mathematics research community.
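The “distortion” that the paragraph above refers to has a standard formal meaning; as a sketch (the notation below is mine, not taken from the abstract):

```latex
% Bi-Lipschitz distortion of an embedding f : (X, d_X) \to (Y, d_Y):
% the least D \ge 1 for which some scaling factor s > 0 satisfies
s \, d_X(x, y) \;\le\; d_Y\bigl(f(x), f(y)\bigr) \;\le\; D \, s \, d_X(x, y)
\qquad \text{for all } x, y \in X.
```

The project's guiding question is then how the smallest achievable D must grow with the number of points of X for various source and target spaces.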
It is a common phenomenon in analysis and geometry that natural questions about representing objects using other, simpler objects cannot be resolved without incurring large errors. For example, Bourgain’s famous embedding theorem states that any finite metric space can be embedded into a Hilbert space with bi-Lipschitz distortion of order the logarithm of the number of points, and subsequent work showed that this cannot be improved. However, if the given metric space is a subset of a classical space such as a Lebesgue space, a Schatten-von Neumann trace class, or a subset of a discrete group such as the Heisenberg group or the lamplighter group, can one say anything about its embeddability properties beyond what follows merely from its being a metric space? It also remains a longstanding open question to determine the growth rate of the largest distortion that is required to embed a finite metric space of a given size into 3-dimensional Euclidean space. As a final example, suppose that we are given a 1-Lipschitz mapping from a subset of a finite-dimensional normed space to a Banach space and we wish to extend it to a Lipschitz function that is defined on the entire ambient space and whose Lipschitz constant is as small as possible: how must this Lipschitz constant grow with the dimension? This is open even for Euclidean space. The project studies the above questions, and many more along those lines. When low distortion is possible, it yields structural insights that are impactful elsewhere, including in algorithm design; such applications are pursued in this project after embedding theorems are proved. The project also studies impossibility results showing that large distortion must be incurred; this involves developing new geometric structural insights and tools in harmonic analysis, probability, and isoperimetry.
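To make the bi-Lipschitz distortion discussed above concrete, here is a small illustrative sketch (the cycle example and all names are my own choices, not taken from the project): it computes the distortion of the map sending the n-cycle, equipped with its shortest-path metric, to the vertices of a regular n-gon in the plane.

```python
import itertools
import math

def distortion(d_X, d_Y, points):
    """Bi-Lipschitz distortion of a map between finite metric spaces:
    the smallest D such that, after a single rescaling, all pairwise
    distances are preserved up to a multiplicative factor of D."""
    ratios = [d_Y(p, q) / d_X(p, q)
              for p, q in itertools.combinations(points, 2)]
    return max(ratios) / min(ratios)

# Toy source space: the n-cycle with its shortest-path (graph) metric.
n = 16

def d_cycle(i, j):
    k = abs(i - j)
    return min(k, n - k)

# Toy embedding: vertex i goes to the i-th vertex of a regular n-gon,
# so the target distance is the Euclidean chord length.
def d_plane(i, j):
    ti, tj = 2 * math.pi * i / n, 2 * math.pi * j / n
    return math.hypot(math.cos(ti) - math.cos(tj),
                      math.sin(ti) - math.sin(tj))

D = distortion(d_cycle, d_plane, range(n))
```

For n = 16 the computed distortion D is roughly 1.56: short cycle distances are nearly preserved by chord lengths, while antipodal points are pulled relatively closer, and the ratio between the two effects is what the distortion measures. Bourgain’s theorem mentioned above concerns the much harder worst case, where distortion into Hilbert space must grow logarithmically in the number of points.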
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Up to $350K
machine learning, mathematics, social science