MATH Seminar

Title: Gossip-based distributed matrix computations
Seminar: Scientific Computing Seminar
Speaker: Hana Strakova of University of Vienna and GA Tech
Contact: TBA
Date: 2012-11-28 at 12:50PM
Venue: W306
Abstract:
Truly distributed matrix computations with randomized communication schedules, such as gossip-based algorithms, can offer many attractive properties. Because their randomized communication is restricted to direct neighbors, they are very flexible with respect to the underlying hardware infrastructure. They can operate on arbitrary topologies, and they can be made resilient against dynamic changes in the network, against message loss or node failures, and against asynchrony between compute nodes. Moreover, their overall cost can be reduced through accuracy-communication trade-offs. Such properties are especially attractive for loosely coupled distributed systems with unreliable communication links, such as sensor or P2P networks. However, given the growth in the number of nodes expected for future extreme-scale HPC systems and the anticipated decrease in reliability, some properties of gossip-based distributed algorithms may also become important for those systems. We are investigating distributed algorithms for various prototypical matrix computation problems that use gossip-based aggregation algorithms to perform reduction operations in a distributed manner. The questions addressed relate to the (communication) cost paid for the increased flexibility and robustness, to convergence properties, to the numerical accuracy achieved, and to the benefits of accuracy-communication cost trade-offs.
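To make the idea of a gossip-based reduction concrete, the following is a minimal sketch of push-sum, a well-known gossip averaging protocol. It is not necessarily the algorithm presented in the talk. The function name `push_sum_average`, the ring topology, the synchronous round structure, and all parameter values are assumptions chosen for illustration only.

```python
import random


def push_sum_average(values, neighbors, rounds=300, seed=0):
    """Simulate push-sum gossip averaging (illustrative sketch only).

    values    -- one initial number per node
    neighbors -- dict mapping node index -> list of direct neighbors
    Returns each node's local estimate of the global average.
    """
    rng = random.Random(seed)
    n = len(values)
    s = [float(v) for v in values]  # running sums, one per node
    w = [1.0] * n                   # running weights, one per node

    for _ in range(rounds):
        new_s = [0.0] * n
        new_w = [0.0] * n
        for i in range(n):
            # Each node picks one random direct neighbor ...
            j = rng.choice(neighbors[i])
            # ... keeps half of its (sum, weight) pair and sends the other half.
            half_s, half_w = s[i] / 2.0, w[i] / 2.0
            new_s[i] += half_s
            new_w[i] += half_w
            new_s[j] += half_s
            new_w[j] += half_w
        s, w = new_s, new_w

    # The ratio sum/weight at every node approaches the global average.
    return [s[i] / w[i] for i in range(n)]


# Hypothetical example: 8 nodes on a ring, node i starts with value i.
n = 8
ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
values = [float(i) for i in range(n)]
estimates = push_sum_average(values, ring)
true_average = sum(values) / n
max_error = max(abs(e - true_average) for e in estimates)
```

Each node talks only to a direct neighbor, yet every local estimate approaches the global average, which is the reduction result. Lowering `rounds` cuts communication at the price of a larger `max_error`, a simple instance of the accuracy-communication trade-off mentioned in the abstract. The simulation uses synchronous rounds and loses no messages, so it does not exercise the robustness properties discussed there.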
