All Seminars

Title: A Brief History of Ramsey Theory
Seminar: General Colloquium
Speaker: Steven La Fleur of Emory University
Contact: Steve Batterson, sb@mathcs.emory.edu
Date: 2014-03-21 at 4:00PM
Venue: MSC W303
Abstract:
Ramsey's theorem asserts that within large enough systems, complete disorder is impossible. This talk will focus on the history of Ramsey theory, as well as on Ramsey himself. We will discuss some of the motivating questions of the early 20th century leading up to the pivotal result by F.P. Ramsey in 1927, as well as the effect that this result has had on mathematics, and specifically combinatorics, since then. We will look at some of the variations of the initial problem that have been considered throughout the last century and partially assess the current state of affairs for these topics. Part of the talk will also look at F.P. Ramsey himself, briefly mentioning some of his other contributions to subjects such as mathematical logic (which was his interest when he stated his now famous theorem), as well as economic theory and probability.
Title: Opinion Mining for the Internet: Models, Algorithms and Predictive Analytics
Seminar: Computer Science
Speaker: Arjun Mukherjee of University of Illinois at Chicago
Contact: Vaidy Sunderam, vss@emory.edu
Date: 2014-03-20 at 4:00PM
Venue: W302
Abstract:
The massive amounts of user-generated content in social media offer new forms of actionable intelligence. Public sentiments in debates, blogs, and news comments are crucial to governmental agencies for passing new bills/policies, gauging social unrest, predicting elections, and socio-economic indices. The goal of my research is to build robust statistical models for opinion mining with applications to marketing, social, and behavioral sciences. To achieve this goal, a number of research challenges need to be addressed. The first challenge is fine-grained information extraction which can capture diverse types of opinions (e.g., agreement/disagreement, contention/controversy, etc.) and various other latent sentiments expressed in social conversations and discussions. The state-of-the-art machinery (e.g., topic modeling) falls short for such a task. I develop several novel knowledge-induced sentiment topic models which respect notions of human semantics. The second challenge is that social sentiments are inherently dynamic and change over time. To leverage the sentiments over time for predictive analytics (e.g., predicting financial markets), I develop Bayesian nonparametric topic-based sentiment time-series and vector autoregression models. The third challenge is to filter deceptive opinion spam/fraud. It is estimated that 15-20% of opinions on the Web are fake. Hence, detecting opinion spam is a precondition for reliable opinion mining. In this talk, I will present novel statistical models for sentiment analysis and talk about two key frameworks: (1) semi-supervised graphical models for mining fine-grained opinions in social conversations, and (2) Bayesian nonparametrics, sentiment time-series, and vector autoregression models for stock market prediction. In the latter part of the talk, I will discuss the problem of opinion spam and shed light on some techniques for filtering opinion spam.
The focus will be on modeling collusion and combating group spam in e-Commerce reviews. The talk will conclude with a discussion about my ongoing research and future research vision in opinion contagions, forecasting socio-economic indices, and healthcare.
Title: Optimization with sparse matrix cone constraints
Colloquium: N/A
Speaker: Martin Andersen of Technical University of Denmark
Contact: James Nagy, nagy@mathcs.emory.edu
Date: 2014-03-17 at 1:00PM
Venue: W306
Abstract:
Optimization problems with sparse matrix cone constraints arise naturally in a wide range of applications, and such problems can often be solved efficiently by carefully utilizing the underlying structure. Two kinds of sparse matrix cones are of particular interest: the cone of symmetric positive semidefinite matrices with a given sparsity pattern and its dual cone, the cone of sparse, positive semidefinite-completable matrices. These cones are very general and include, as special cases, the nonnegative orthant, the quadratic cone, and the cone of positive semidefinite matrices. Using techniques from sparse numerical linear algebra, the structure of the sparse matrix cones can be exploited to construct faster optimization algorithms. This talk will focus on the usefulness of sparse matrix cone formulations, which will be demonstrated through numerical examples drawn from a variety of problems such as optimal power flow and robust estimation.
Title: Secure and Privacy-Assured Outsourced Cloud Data Services
Seminar: Computer Science
Speaker: Ming Li of Utah State University
Contact: Vaidy Sunderam, vss@emory.edu
Date: 2014-03-17 at 4:00PM
Venue: MSC W303
Abstract:
Cloud computing is envisioned as the next generation architecture of IT enterprises, which provides convenient remote access to massively scalable data storage and application services. Despite the cloud’s promise of huge potential economic savings, by outsourcing data services to the cloud, users lose physical control over their data while cloud service providers can no longer be trusted to guarantee their data security and privacy. This has led to a paradigm shift in cloud security research in recent years, under which many issues including data confidentiality, access control, integrity protection and utilization need to be revisited. In this talk, I will present our research efforts in data security and privacy in cloud computing, which aim at returning full control over outsourced data to their owners through cryptographic approaches. The first part introduces a scalable and owner-centric secure data sharing scheme, where owners can cryptographically enforce fine-grained data access control on any untrusted server by specifying access policies based on attributes of the data itself and authorized users, which is achieved by adapting a new cryptographic primitive called attribute-based encryption. The second part gives an overview of our other research projects, including secure integrity auditing of shared outsourced data (without physically possessing a copy of the data), and privacy-preserving searches over encrypted cloud data (without letting the cloud learn either the data contents or the search keywords). Finally, I will outline future research directions on secure computation outsourcing, big data security and privacy, and secure cyber-physical systems.
Title: Regularization in Tomography - Dealing with Ambiguity and Noisy Data
Seminar: Computational Mathematics
Speaker: Per Christian Hansen of Technical University of Denmark
Contact: James Nagy, nagy@mathcs.emory.edu
Date: 2014-03-13 at 2:00PM
Venue: W306
Abstract:
Tomographic reconstructions are routinely computed each day. Our reconstruction algorithms are so reliable that we sometimes forget we are actually dealing with inverse problems with inherent stability problems. This is because the algorithms automatically incorporate regularization techniques that, in most cases, handle the stability issues very well.
In this talk we take a basic look at the inverse problem of CT reconstruction, in order to understand the stability problems that manifest themselves in solutions that may be very sensitive to data errors and may also fail to be unique. We demonstrate how regularization is used to avoid these problems and make the reconstruction process stable, and how the regularization is incorporated in standard reconstruction algorithms. Moreover, we shall see that different regularization techniques have different impacts on the computed reconstructions.
Title: Scalable Probabilistic Inference for Complex Dynamical Models
Colloquium: N/A
Speaker: Lei Li of University of California, Berkeley
Contact: Vaidy Sunderam, vss@emory.edu
Date: 2014-03-07 at 3:00PM
Venue: MSC W303
Abstract:
Time series data arise in numerous applications, such as data center monitoring, tracking web user activities, and health care. Detecting patterns and learning features in collections of data sequences are crucial to solving real-world, domain-specific problems, for example, tracking moving objects in videos, spotting nefarious online activities, and forecasting patients' health states.
In this talk, I define a new tensor dynamical model for multivariate data as well as an efficient algorithm for learning such models from data. In addition, I will present an efficient approach to jointly estimate parameters and latent states for a large class of models including nonlinear dynamical systems. Finally, I will present my work on efficient inference for a probabilistic declarative programming language, which aims to democratize machine learning and to enable practitioners to solve their domain-specific problems.
Bio: Dr. Lei Li is a postdoctoral researcher in the EECS department of UC Berkeley. His research interests lie at the intersection of machine learning, statistical inference, and database systems. Specifically, he has been working on Bayesian inference in open universe probabilistic models, probabilistic programming languages, large-scale learning, time series, communication and social networks. He has served on the program committees of ICML 2014, SDM 2013/2014, and IJCAI 2011/2013. He has been invited as a reviewer for TOMCCAP, DAMI, TKDE, TOSN, Neurocomputing, KDD, SIGMOD, VLDB, PKDD, and WWW. He was invited to review NSF proposals in 2010 and to DARPA's Information Science and Technology (ISAT) probabilistic programming workshop in 2013.
Lei received his B.S. in Computer Science and Engineering from Shanghai Jiao Tong University in 2006 and his Ph.D. in Computer Science from Carnegie Mellon University in 2011. His dissertation work on fast algorithms for mining co-evolving time series was awarded ACM KDD best dissertation (runner up).
Title: Finding a Happy Medium between Accuracy and Speed for Dependency Parsing
Colloquium: N/A
Speaker: Jinho Choi of University of Massachusetts Amherst
Contact: Vaidy Sunderam, vss@emory.edu
Date: 2014-03-06 at 4:00PM
Venue: W306
Abstract:
Why is Natural Language Processing interesting? What makes NLP hard? How can we bring NLP research to practice? These are all open-ended questions. In this talk, I present a novel approach called selectional branching, which optimizes both accuracy and speed for one of the core NLP tasks, dependency parsing. Our approach uses confidence estimates to decide when to employ a beam, providing the accuracy of beam search at speeds close to a greedy dependency parsing approach. Selectional branching is guaranteed to perform faster than beam search yet performs as accurately. With the benchmark setup in English, our parser shows an accuracy of 92.96% and a speed of 9 milliseconds per sentence, which is faster and more accurate than the previous state-of-the-art transition-based parser using beam search. It also outperforms other dependency parsers using beam search, dynamic programming, integer linear programming, etc. for languages such as Danish, Dutch, Slovene, and Swedish.
Title: Towards Large Scale Open Domain Natural Language Processing
Colloquium: N/A
Speaker: Gourab Kundu of University of Illinois
Contact: Vaidy Sunderam, vss@emory.edu
Date: 2014-03-05 at 3:00PM
Venue: MSC W201
Abstract:
Machine learning and inference methods are becoming ubiquitous: a broad range of scientific advances and technologies rely on machine learning techniques. In particular, the big data revolution heavily depends on our ability to use statistical machine learning methods to make sense of the large amounts of data we have. Research in Natural Language Processing has both benefited from and contributed to the advancement of machine learning and inference methods. However, multiple problems still hinder the broad application of some of these methods. Performance degradation of machine learning based systems in domains other than the training domain is one of the key problems hindering widespread deployment of these systems.
In this talk, I will present techniques for domain adaptation "on the fly", which allow adaptation to test domains using the same model from the training domain. This is accomplished by transforming text from the test domain to look more like the training domain and running the same model from the training domain. This process of text adaptation treats the model as a black box, and thus makes the adaptation of complex pipelines of models easy and flexible. The next key challenge for machine learning is the processing of vast amounts of data in an efficient manner. In natural language processing and other disciplines, the prediction problems these tools must solve are often complicated, making application of the tools to big data infeasible. The latter part of the talk will focus on improving the scalability of machine learning tools with complex prediction components to meet the challenges of big data. I will show how it is possible to amortize the cost of prediction over the lifetime of any machine learning tool. In particular, I will focus on amortizing integer linear programs, which can represent a wide variety of prediction problems. I will present exact and approximate theorems for speeding up the solution time of new integer programs by reusing solutions of previously solved integer programs.
Gourab Kundu is a doctoral candidate in the Computer Science Department of the University of Illinois at Urbana-Champaign, supervised by Prof. Dan Roth. He has also worked at IBM Research and Google during summer internships. He is broadly interested in all aspects of machine learning and natural language processing. He has publications in top-tier natural language processing conferences along with a best student paper at CoNLL 2011.
Title: Bounded gaps between primes in Chebotarev sets
Seminar: Algebra
Speaker: Jesse Thorner of Emory University
Contact: David Zureick-Brown, dzb@mathcs.emory.edu
Date: 2014-03-04 at 4:00PM
Venue: W302
Abstract:
A new and exciting breakthrough due to Maynard establishes that there exist infinitely many pairs of primes $p_1,p_2$ with $|p_1-p_2|\leq 600$ as a consequence of the Bombieri-Vinogradov Theorem. In this paper, we apply their general method to the setting of Chebotarev sets of primes. We study applications of these bounded gaps with an emphasis on ranks of prime quadratic twists of elliptic curves over $\mathbb{Q}$.
Title: Weights and Measures: Fast Prediction in an Era of Big-Data
Colloquium: N/A
Speaker: Lev Reyzin of University of Illinois at Chicago
Contact: Vaidy Sunderam, vss@emory.edu
Date: 2014-03-04 at 4:00PM
Venue: MSC W303
Abstract:
In this talk, I will discuss algorithms I have developed for learning in a world where data is abundant and predicting quickly and accurately is essential. In particular, I will focus on some recent work on modern variants of supervised and bandit learning. One common element of the algorithms I will present is that they nontrivially improve upon classical weighting and sampling methods to produce provable and practical improvements over traditional approaches.