All Seminars

Title: A Study of Benford's Law for the Values of Arithmetic Functions
Defense: Honors
Speaker: Letian Wang of Emory University
Contact: Letian Wang, letian.wang@emory.edu
Date: 2017-03-29 at 1:00PM
Venue: MSC E408
Abstract:
Benford's Law characterizes the distribution of initial digits in large datasets across disciplines. Since its discovery by Simon Newcomb in 1881, Benford's Law has prompted extensive study. In this paper, we will start by introducing the history of Benford's Law and discussing in detail the explanations proposed by mathematicians for why various datasets are Benford. Such explanations include the Spread Hypothesis, the Geometric, the Scale-Invariance, and the Central Limit explanations.

To rigorously define Benford's Law and to motivate criteria for Benford sequences, we will provide fundamental theorems on uniform distribution modulo 1 in Chapter 2. We will state and prove criteria for checking uniform distribution, including Weyl's Criterion and Van der Corput's Difference Theorem, as well as their corollaries.

In Chapter 3, we will introduce the logarithm map, which allows us to reformulate Benford's Law in terms of the uniform distribution modulo 1 studied earlier. We will start by examining the case of base 10 only and then generalize to arbitrary bases.

Finally, we will elaborate on the idea of good functions. We will prove that good functions are Benford, which in turn enables us to find a new class of Benford sequences. We will use this theorem to show that the partition function p(n) and the factorial sequence n! follow Benford's Law.
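As a rough numerical illustration of the final claim (not part of the thesis, and no substitute for its proof), the sketch below compares the leading digits of the first 1000 factorials with the standard Benford probabilities log10(1 + 1/d). The function names and the cutoff of 1000 are assumptions chosen for this example.

```python
import math
from collections import Counter


def benford_probability(d):
    """Benford's Law: probability that the leading decimal digit equals d."""
    return math.log10(1 + 1 / d)


def leading_digit(n):
    """Leading decimal digit of a positive integer."""
    return int(str(n)[0])


def factorial_leading_digit_frequencies(count):
    """Relative frequency of each leading digit among 1!, 2!, ..., count!."""
    counts = Counter()
    value = 1
    for n in range(1, count + 1):
        value *= n  # exact integer arithmetic, so no rounding error
        counts[leading_digit(value)] += 1
    return {d: counts[d] / count for d in range(1, 10)}


# Hypothetical cutoff: the first 1000 factorials.
observed = factorial_leading_digit_frequencies(1000)
for d in range(1, 10):
    print(d, round(observed[d], 3), round(benford_probability(d), 3))
```

Each printed row shows a digit, its observed frequency among the factorials, and the Benford prediction. A finite sample can only suggest the pattern; the statement about n! is an asymptotic one.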
Title: Utility-cost of provable privacy: A case study on US Census data.
Seminar: Computer Science
Speaker: Ashwin Machanavajjhala of Duke University
Contact: Li Xiong, lxiong@emory.edu
Date: 2017-03-29 at 4:00PM
Venue: MSC W303
Abstract:
Privacy is an important constraint that algorithms must satisfy when analyzing sensitive data from individuals. Differential privacy has revolutionized the way we reason about privacy, and has championed the need for data analysis algorithms with provable privacy guarantees. Differential privacy and its variants have arisen as the gold standard for exploring the tradeoff between the privacy ensured to individuals and the utility of the statistical insights mined from the data, and are in use by many commercial (e.g., Google and Apple) and government entities (e.g., US Census) for collecting and sharing sensitive user data.

In today's talk I will highlight key challenges in designing differentially private algorithms for emerging applications, and highlight research from our group that tries to address these challenges. In particular, I will describe our recent work on modernizing the data publication process for a US Census Bureau data product called LODES/OnTheMap. In this work, we identified legal statutes and their current interpretations that regulate the publication of LODES/OnTheMap data, formulated these regulations mathematically, and designed algorithms for releasing tabular summaries that provably ensured these privacy requirements. Our solutions are able to release summaries of the data with error comparable to, or even lower than, that of current releases (which are not provably private), for reasonable settings of privacy parameters.

Bio: Ashwin Machanavajjhala is an Assistant Professor in the Department of Computer Science, Duke University. Previously, he was a Senior Research Scientist in the Knowledge Management group at Yahoo! Research. His primary research interests lie in algorithms for ensuring privacy in statistical databases and augmented reality applications. He is a recipient of the National Science Foundation Faculty Early CAREER award in 2013, and the 2008 ACM SIGMOD Jim Gray Dissertation Award Honorable Mention. Ashwin graduated with a Ph.D. from the Department of Computer Science, Cornell University and a B.Tech in Computer Science and Engineering from the Indian Institute of Technology, Madras.
Title: On Saturation Spectrum
Defense: Dissertation
Speaker: Jessica Fuller of Emory
Contact: Jessica Fuller, jfulle@emory.edu
Date: 2017-03-28 at 2:45PM
Venue: MSC E406
Abstract:
Given a graph H, we say a graph G is H-saturated if G does not contain H as a subgraph and the addition of any edge not already in G results in H as a subgraph. The question of the minimum number of edges of an H-saturated graph on n vertices, known as the saturation number, and the question of the maximum number of edges possible in an H-saturated graph, known as the Turán number, have been addressed for many different types of graphs. We are interested in the existence of H-saturated graphs for each edge count between the saturation number and the Turán number. We determine the saturation spectrum of $(K_t - e)$-saturated graphs and $F_t$-saturated graphs. Let $K_t - e$ be the complete graph on t vertices minus one edge. We prove that $(K_t - e)$-saturated graphs do not exist for small edge counts and construct $(K_t - e)$-saturated graphs with edge counts in a continuous interval. We then extend the constructed $(K_t - e)$-saturated graphs to create $(K_t - e)$-saturated graphs. Let $F_t$ be the graph consisting of t edge-disjoint triangles that intersect at a single vertex v. We prove that $F_2$-saturated graphs do not exist for small edge counts and construct a collection of $F_2$-saturated graphs with edge counts in a continuous interval. We also establish more general constructions that yield a collection of $F_t$-saturated graphs with edge counts in a continuous interval.
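The definition of an H-saturated graph can be checked by brute force on small examples. The sketch below is only an illustration of the definition, not a construction from the dissertation: it takes H to be a triangle (a simpler choice than the graphs studied in the talk) and tests a few graphs on five vertices. All function and variable names are hypothetical.

```python
from itertools import combinations, permutations


def contains_subgraph(g_edges, n, h_edges, h_n):
    """True if the graph on vertices 0..n-1 contains a copy of H
    (as a subgraph, not necessarily induced). Brute force over all
    injective maps from the h_n vertices of H into the n vertices of G."""
    g = {frozenset(e) for e in g_edges}
    for image in permutations(range(n), h_n):
        if all(frozenset((image[u], image[v])) in g for u, v in h_edges):
            return True
    return False


def is_saturated(g_edges, n, h_edges, h_n):
    """True if G is H-saturated: G contains no copy of H, but adding
    any missing edge creates one."""
    g = {frozenset(e) for e in g_edges}
    if contains_subgraph(g, n, h_edges, h_n):
        return False
    for u, v in combinations(range(n), 2):
        e = frozenset((u, v))
        if e not in g and not contains_subgraph(g | {e}, n, h_edges, h_n):
            return False
    return True


# H is a triangle on vertices 0, 1, 2.
triangle = [(0, 1), (1, 2), (0, 2)]

# Graphs on 5 vertices with 4, 5, and 6 edges respectively.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
cycle5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
k23 = [(0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (1, 4)]
# A path is triangle-free but not saturated: adding edge (0, 4) gives a 5-cycle.
path5 = [(0, 1), (1, 2), (2, 3), (3, 4)]

for name, edges in [("star", star), ("cycle5", cycle5),
                    ("k23", k23), ("path5", path5)]:
    print(name, len(edges), is_saturated(edges, 5, triangle, 3))
```

On five vertices the star, the 5-cycle, and the complete bipartite graph with parts of size 2 and 3 are all triangle-saturated, with 4, 5, and 6 edges, which gives a small picture of what it means for every edge count in a range to be realized.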
Title: Finite index for arboreal Galois representations
Seminar: Algebra
Speaker: Andrew Bridy of Texas A&M University
Contact: David Zureick-Brown, dzb@mathcs.emory.edu
Date: 2017-03-28 at 4:00PM
Venue: MSC W201
Abstract:
Let K be a global field of characteristic 0, let $f \in K(x)$ and $b \in K$, and set $K_n := K(f^{-n}(b))$. The projective limit of the groups $\mathrm{Gal}(K_n/K)$ embeds in the automorphism group of an infinite rooted tree. A difficult problem is to find criteria that guarantee that the image of this embedding has finite index; a complete answer would give a dynamical analogue of Serre's famous open image theorem. When f is a cubic polynomial over a function field, I prove a set of necessary and sufficient conditions for finite index (for number fields, the proof is conditional on Vojta's conjecture). This is joint work with Tom Tucker.
Title: Compositional Models for Information Extraction
Seminar: N/A
Speaker: Mark Dredze of Johns Hopkins University
Contact: Eugene Agichtein, eugene@mathcs.emory.edu
Date: 2017-03-27 at 4:00PM
Venue: White Hall 207
Abstract:
Information extraction systems are the backbone of many end-user applications, including question answering, web search, and clinical text analysis. These applications depend on underlying technologies that can identify entities and relations as expressed in natural language text. For example, Amazon Echo may answer a user question based on a relation extracted from a news article. A clinical decision support system may offer a physician suggestions based on a symptom identified in the clinical notes from a previous patient visit. In political science, we may seek to aggregate opinions expressed in public comments about a new public policy. Advances in machine learning have led to new neural models for learning effective representations directly from data that improve information extraction tasks. Yet for many tasks, years of research have created hand-engineered features that yield state-of-the-art performance. I will present feature-rich compositional models that combine hand-engineered features with learned text representations to achieve new state-of-the-art results for relation extraction. These models are widely applicable to problems within natural language processing and beyond. Additionally, I will survey how these models fit into my broader research program by highlighting work by my group on developing new machine learning methods for extracting public health information from clinical and social media text.
Title: TBD
Seminar: N/A
Speaker: TBD
Contact: TBA
Date: 2017-03-23 at 0:00AM
Venue: TBA
Abstract:
Title: Human-centered Data Science for Crisis Informatics
Seminar: Computer Science
Speaker: Marina Kogan of University of Colorado
Contact: Li Xiong, lxiong@emory.edu
Date: 2017-03-23 at 4:00PM
Venue: MSC W201
Abstract:
Disasters arising from natural hazards are associated with the disruption of existing social structures, but they also result in the creation of new social ties by those affected as they problem-solve alone and together. With social media now being a site for some of this interaction, there is much to learn about the nature of those changing social structures, including how and why they shift. However, the study of this social arena is challenging, because the high-tempo, high-volume, convergent nature of crisis events produces vast amounts of social media data, necessitating the use of data science methods. On the other hand, to glean meaningful insight from crisis-related social media activity, it is necessary to use methods that account for the complex social context of the user activity, including qualitative analysis.

In this talk Kogan will show how Human-Centered Data Science provides methodological approaches that both harness the power of computational methods and account for the highly situated nature of social media activity in disaster. Utilizing these methodological approaches, she will show how disaster-related coordination and distributed problem solving take shape on two social media platforms: Twitter and OpenStreetMap.
Title: Congruence of Galois representations
Seminar: Algebra
Speaker: Sujatha Ramdorai of University of British Columbia
Contact: David Zureick-Brown, dzb@mathcs.emory.edu
Date: 2017-03-21 at 4:00PM
Venue: W306
Abstract:
We consider Galois representations whose residual representations are isomorphic and study what this implies for invariants associated to such representations.
Title: Extremal Problems for Graphs and Hypergraphs
Defense: Dissertation
Speaker: Bill Kay of Emory University
Contact: Bill Kay, w.w.kay@emory.edu
Date: 2017-03-21 at 4:00PM
Venue: MSC W301
Abstract:
We discuss a pair of papers in extremal combinatorics. One establishes asymptotically the chromatic number of the so-called type graphs, and the other investigates a certain property of oriented hypergraphs (Property O).
Title: Designing Abstract Meaning Representations
Seminar: Computer Science
Speaker: Martha Palmer of University of Colorado
Contact: Jinho Choi, choi@mathcs.emory.edu
Date: 2017-03-17 at 3:00PM
Venue: MSC W301
Abstract:
Abstract Meaning Representations (AMRs) provide a single, graph-based semantic representation that abstracts away from the word order and syntactic structure of a sentence, resulting in a more language-neutral representation of its meaning. AMR implements a simplified, standard neo-Davidsonian semantics. Each word in a sentence maps to a concept or a relation, or is omitted if it is already inherent in the representation or conveys interpersonal attitude (e.g., stance or distancing). The basis of AMR is PropBank’s lexicon of coarse-grained senses of verb, noun and adjective relations as well as the roles associated with each sense (each lexicon entry is a ‘roleset’). By marking the appropriate roles for each sense, this level of annotation provides information regarding who is doing what to whom. However, unlike PropBank, AMR also provides a deeper level of representation of discourse relations, non-relational noun phrases, prepositional phrases, quantities and time expressions (which PropBank largely leaves unanalyzed), as well as Named Entity tags with Wikipedia links. Additionally, AMR makes a greater effort to abstract away from language-particular syntactic facts. The latest version of AMR adds coreference links across sentences, including links to implicit arguments. This talk will explore the differences between PropBank and AMR, the current and future plans for AMR annotation, and the potential of AMR as a basis for machine translation. It will end with a discussion of areas of semantic representation that AMR is not currently addressing, which remain as open challenges.

Martha Palmer is a Professor at the University of Colorado in Linguistics, Computer Science and Cognitive Science, and a Fellow of the Association for Computational Linguistics. She works on trying to capture elements of the meanings of words that can comprise automatic representations of complex sentences and documents. Supervised machine learning techniques rely on vast amounts of annotated training data, so she and her students are engaged in providing data with word sense tags, semantic role labels and AMRs for English, Chinese, Arabic, Hindi, and Urdu, both manually and automatically, funded by DARPA and NSF. These methods have also recently been applied to biomedical journal articles, clinical notes, and geo-science documents, funded by NIH and NSF. She is a co-editor of LiLT, Linguistic Issues in Language Technology, and has been on the CLJ Editorial Board and a co-editor of JNLE. She is a past President of the Association for Computational Linguistics, past Chair of SIGLEX and SIGHAN, co-organizer of the first few Sensevals, and was the Director of the 2011 Linguistics Institute held in Boulder, Colorado.