All Seminars

Title: Data Warehousing and Ensemble Learning of Omics Data
Graduate Student Seminar: Computer Science
Speaker: Xiaobo Sun of Emory University
Contact: TBA
Date: 2018-04-20 at 1:00PM
Venue: Room GCR311 of Department of Biostatistics
Abstract: The development and application of high-throughput genomics technologies have resulted in massive quantities of diverse omics data that continue to accumulate rapidly. These rich datasets offer unprecedented and exciting opportunities to address long-standing questions in biomedical research. However, our ability to explore and query the content of diverse omics data is very limited. Existing dataset search tools rely almost exclusively on metadata, and a text-based query for gene names does not work well on datasets whose content is overwhelmingly numeric. To overcome this barrier, we have developed Omicseq, a novel web-based platform that makes it easy to interrogate omics datasets holistically, going beyond metadata to improve “findability”. The core component of Omicseq is trackRank, a novel algorithm for ranking omics datasets that uses the full numerical content of each dataset to determine its relevance to the query entity. The Omicseq system is supported by a scalable, elastic NoSQL database that hosts a large collection of processed omics datasets. In the front end, a simple web-based interface allows users to enter queries and instantly receive search results as a ranked list of the datasets deemed most relevant. Omicseq is freely available at http://www.omicseq.org.

Title: The maximum number of cycles in a graph
Seminar: Combinatorics
Speaker: Andrii Arman of The University of Manitoba
Contact: Dwight Duffus, dwight@mathcs.emory.edu
Date: 2018-04-20 at 4:00PM
Venue: MSC W301
Abstract: The problem of bounding the total number of cycles in a graph is more than a century old. In 1897, Ahrens proved bounds on the number of cycles using the cyclomatic number of the graph, and since then many results have appeared on the maximum number of cycles in graphs with different restrictions. In this talk I will consider the problem of maximizing the number of cycles for three classes of graphs: graphs with a given number of edges (and an unrestricted number of vertices), graphs with a given average degree, and graphs without a clique of a specific size. For the first two classes I will show that the maximum number of cycles in a graph has bounds exponential in the number of edges of the graph. I will also present exponentially tight bounds for the maximum number of cycles in a multigraph with a fixed number of vertices and edges.

Title: Vector-valued Hirzebruch-Zagier series and class number sums
Seminar: Algebra
Speaker: Brandon Williams of UC Berkeley
Contact: David Zureick-Brown, dzb@mathcs.emory.edu
Date: 2018-04-17 at 4:00PM
Venue: W304
Abstract: For any fundamental discriminant $D > 0$, Hirzebruch and Zagier constructed a modular form of weight two whose Fourier coefficients are corrections of the Hurwitz class number sums $\sum_{r^2 \equiv 4n \, (D)} H((4n - r^2) / D)$. In this talk, we will discuss how one can reinterpret their result and remove the condition that $D$ is fundamental by working instead with vector-valued modular forms for Weil representations.

Title: Primes fall for the gambler's fallacy
Colloquium: N/A
Speaker: K. Soundararajan of Stanford University
Contact: David Zureick-Brown, dzb@mathcs.emory.edu
Date: 2018-04-12 at 5:00PM
Venue: MSC W301
Abstract: The gambler's fallacy is the erroneous belief that if (for example) a coin comes up heads often, then on the next toss it is more likely to come up tails. In recent work with Robert Lemke Oliver, we found that, funnily, the primes exhibit a kind of gambler's fallacy: for example, consecutive primes do not like to have the same last digit. I'll show some of the data on this, and explain what we think is going on.
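The last-digit observation in the abstract can be checked empirically. The following is a minimal sketch, not the speaker's code: the function names `primes_up_to` and `last_digit_pairs` and the cutoff of one million are assumptions chosen for illustration. It counts how often consecutive primes greater than 5 share a last digit and compares that with the 1/4 one might naively expect if the last digits 1, 3, 7, 9 were independent.

```python
from collections import Counter


def primes_up_to(n):
    """Sieve of Eratosthenes: return all primes <= n (hypothetical helper)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            # Cross out every multiple of p starting at p*p.
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i, flag in enumerate(sieve) if flag]


def last_digit_pairs(n):
    """Count (last digit, next last digit) over consecutive primes in (5, n]."""
    primes = [p for p in primes_up_to(n) if p > 5]
    return Counter((a % 10, b % 10) for a, b in zip(primes, primes[1:]))


# The cutoff 1_000_000 is an arbitrary choice for this illustration.
counts = last_digit_pairs(1_000_000)
same = sum(c for (a, b), c in counts.items() if a == b)
total = sum(counts.values())
print(f"same last digit: {same / total:.3f} (naive expectation 0.25)")
```

Printing the full `counts` table shows the breakdown by digit pair; the sketch only reports the aggregate share of repeated last digits.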

Title: Sleeping Beauty and Other Probability Conundrums
Seminar: Combinatorics
Speaker: Peter Winkler of Dartmouth
Contact: Dwight Duffus, dwight@mathcs.emory.edu
Date: 2018-04-11 at 4:00PM
Venue: MSC W301
Abstract: Probability theory rests on seemingly firm axioms, yet simple questions continue to confound philosophers and intrigue the public. We'll examine some of these questions and try to determine whether they uncover real problems with the foundations of probability theory, or just challenges to our flawed human intuition.

Title: Statistical and Machine Learning Methods in the Studies of Epigenetics Regulation
Defense: Dissertation
Speaker: Tianlei Xu of Emory University
Contact: Tianlei Xu, txu28@emory.edu
Date: 2018-04-10 at 10:00AM
Venue: Claudia Nance Rollins Bldg. Rm 1036
Abstract: Rapid development of next-generation sequencing technologies has produced a plethora of large-scale epigenome profiling data. Despite the quantity of available epigenome datasets, obtaining a clear and comprehensive picture of the underlying regulatory network remains a challenge. The extent of cell-type heterogeneity and temporal change in the epigenome makes it impossible to assay all epigenome events for each type of cell. Computational models have advantages in capturing intrinsic correlations among epigenetic features and adaptively predicting epigenome marks in a dynamic scenario. Current progress in machine learning provides opportunities to uncover higher-level patterns of epigenome interactions and to integrate regulatory signals from different resources. My work aims to use public data resources to characterize, predict and understand epigenome-wide regulatory relationships. The first part of my work is a novel computational model to predict in vivo transcription factor (TF) binding using base-pair resolution methylation data. The model combines cell-type-specific methylation patterns and static genomic features, and accurately predicts binding sites of a variety of TFs among diverse cell types. The second part of my work is a computational framework to integrate sequence, gene expression and epigenome data for genome-wide TF binding prediction. This extended supervised framework integrates motif features, context-specific gene expression and chromatin accessibility profiles across multiple cell types, and scales up the TF prediction task beyond the limits of candidate sites with limited known motifs. The third part of my work is a novel computational strategy for functional annotation of non-coding genomic regions. It takes advantage of newly emerged, genome-wide and tissue-specific expression quantitative trait loci (eQTL) information to help annotate a set of genomic intervals in terms of transcription regulation. This method builds a bridge connecting genomic intervals with biological pathways and pre-defined, biologically meaningful gene sets. Tissue specificity analysis provides additional evidence of the distinct roles of different tissues in disease mechanisms.

Title: The Translation from SQL to Relational Algebra
Defense: Honors Thesis
Speaker: Yicong Li of Emory University
Contact: Shun Yan Cheung, chueng@mathcs.emory.edu
Date: 2018-04-06 at 11:00AM
Venue: MSC N301
Abstract: SQL (Structured Query Language) and Relational Algebra are two important languages for manipulating relational databases. SQL is an international standard language used to express queries on data stored in a database. Relational Algebra is a mathematical language with operations on sets. In query processing, SQL queries are first translated to an equivalent expression in Relational Algebra. The thesis explores the translation from SQL to Relational Algebra to gain a deeper understanding of database systems. The thesis begins with an introduction to the problem (including the motivation for working on the translation) and the background knowledge needed to handle the translation, and follows with the project design. It then discusses the evaluation of the result, reflects on my learning experience from the project, and makes suggestions for further improvement.
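As an illustration of the kind of translation the abstract describes (an example added here, not taken from the thesis), consider a hypothetical table Employee(name, dept, salary). The query `SELECT DISTINCT name FROM Employee WHERE salary > 50000` corresponds to a selection followed by a projection:

```latex
\[
\pi_{\text{name}}\bigl(\sigma_{\text{salary} > 50000}(\text{Employee})\bigr)
\]
```

Here $\sigma$ keeps the rows that satisfy the WHERE condition and $\pi$ keeps the listed columns. The DISTINCT matters: classical Relational Algebra works on sets, whereas SQL without DISTINCT may return duplicate rows.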

Title: Beating flops, communication and synchronization in sparse factorizations
Colloquium: Computational Mathematics
Speaker: Sherry Li of Lawrence Berkeley National Lab
Contact: Lars Ruthotto, lruthotto@emory.edu
Date: 2018-04-05 at 3:00PM
Venue: MSC E 208
Abstract: Multiphysics and multiscale simulations often need to solve discretized sparse algebraic systems that are highly indefinite, nonsymmetric and extremely ill-conditioned. For such problems, factorization-based algorithms are often at the core of the solver toolchain. Compared to pure iterative methods, the higher computation and communication costs of factorization methods present serious hurdles to utilizing extreme-scale hardware. I will present several research vignettes aimed at reducing those costs. By incorporating data-sparse low-rank structures, such as hierarchical matrix algebra, we can obtain lower arithmetic complexity as well as robust preconditioners. By replicating a small amount of data in sparse factorization, we can avoid communication, with provably lower communication complexity. By means of asynchronous, customized broadcast/reduction, we can reduce the dominating latency cost in sparse triangular solution. The effectiveness of these techniques will be demonstrated with our open source software STRUMPACK and SuperLU.
Bio: Sherry Li is a Senior Scientist at Lawrence Berkeley National Laboratory. She has worked on diverse problems in high performance scientific computations, including parallel computing, sparse matrix computations, high precision arithmetic, and combinatorial scientific computing. She has (co)authored over 100 publications. She is the lead developer of the SuperLU sparse direct solver library, and has contributed to several other widely used mathematical libraries, including ARPREC, LAPACK, STRUMPACK, and XBLAS. She received her Ph.D. in Computer Science from UC Berkeley in 1996. She is a SIAM Fellow and an ACM Senior Member.

Title: Efficient, stable, and reliable solvers for the steady incompressible Navier-Stokes equations in computational hemodynamics
Defense: Dissertation
Speaker: Alexander Fuller Viguerie of Emory University
Contact: Alexander Fuller Viguerie, aviguer@emory.edu
Date: 2018-04-04 at 9:00AM
Venue: MSC E406
Abstract: In recent years, improvements in medical imaging and image-reconstruction algorithms have led to increased interest in the use of Computational Fluid Dynamics (CFD) as a clinical tool in hemodynamics. While such methods have long been employed in the design of medical devices and in basic medical research, many of the techniques commonly employed in these contexts are not ideal in the clinical setting. In particular, in clinical settings one typically faces more demanding turnaround times for simulations, less powerful computational resources, and noisy, incomplete, or missing data. In this thesis, we discuss these challenges and introduce CFD methods that are more practical for direct clinical application. Frequently in these settings, the variable of interest is the temporal average of some time-periodic quantity, such as wall shear stress, over a cardiac cycle. In these cases, the standard procedure is to perform an unsteady simulation over several cardiac cycles and then to take the time average of the last one. Here, we propose instead to use the solution of a steady-state problem as a surrogate for the time-averaged unsteady solution, allowing us to compute it directly. This approach, if properly applied, can dramatically lower computational cost, as we show here; however, in many respects the steady problem is arguably more difficult numerically than its unsteady counterpart. We will address these difficulties and propose effective workarounds. In particular, we aim to develop methods for steady solvers that are efficient, stable, and reliable. Roughly speaking, this work is divided into three parts, with each part focusing on one of these aspects. Concerning efficiency, we extend the inexact algebraic factorization approach popular for the unsteady problem to the steady setting. We will address the issue of stability by taking inspiration from nonlinear filtering techniques used in turbulence modeling to develop stabilization techniques for the steady problem. Finally, we will develop and validate methods for assigning boundary conditions in data-deficient settings while maintaining reliability. Throughout each section, we will provide both theoretical and numerical justification for our methods.

Title: Deligne's Exceptional Series and Modular Linear Differential Equations
Type: Master's Defense
Speaker: Robert Dicks of Emory University
Contact: Robert Dicks, robert.julian.dicks@emory.edu
Date: 2018-04-03 at 1:00PM
Venue: White Hall 200
Abstract: In 1988, Mathur, Mukhi, and Sen studied rational conformal field theories in terms of differential equations satisfied by their characters. These differential equations are modular invariant, and the solutions they obtain for order 2 equations have relationships with certain Lie algebras. In fact, the Lie algebras of the Deligne Exceptional Series appear here; their study is motivated by uniformities in their representation theory. This thesis studies the Deligne Exceptional Series from these two perspectives, and gives a sequence of finite groups which has analogies with the Deligne series.