Advanced Networks and High Performance Computing: The Big Data Challenge

A Virtual Symposium
Target Audience: 
Higher Education Faculty, Researchers, Administrators, IT and Network Staff
Requirements for Participation: 

Interactive IVC Participation: Open to MAGPI Members and Non-Members with advanced networking and H.323 videoconference capabilities. However, interactive seats are limited to 7 spots for MAGPI Members and 7 spots for Non-Members. Registration is required only if you wish to be a live, interactive videoconference site. Host sites are encouraged to register so that larger groups of faculty and staff can participate.

LIVE WEBSTREAM: A live webstream of the entire Symposium, including all breakout sessions, will be available at Click on "Live Videos" on the day of the event. Registration is not required to view the live webstream. Webstream participants can send questions to the presenters by tweeting them to @magpik20 (with hashtag #magpi_hpc) or posting them to our Facebook page at Please note: If you are behind a firewall and have problems viewing the webstream, you may need to open TCP port 1935 for it to work properly. If you continue to have problems with the webstream, contact your firewall administrator.
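For sites that want to check the firewall requirement before the event, the following is a minimal sketch, not an official MAGPI tool. It only tests whether an outbound TCP connection on port 1935 can be opened. The function name `can_connect` and the constant `STREAM_PORT` are illustrative assumptions, and the streaming server's hostname is not given in this announcement, so it must be supplied by the user.

```python
import socket

# Port named in the announcement; the streaming host itself is not
# given there, so callers must supply it (assumption).
STREAM_PORT = 1935


def can_connect(host: str, port: int = STREAM_PORT, timeout: float = 5.0) -> bool:
    """Return True if an outbound TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

Calling `can_connect("<streaming host>")` from a machine on the campus network returns `False` when the connection is refused, times out, or the name cannot be resolved. A `False` result does not prove the firewall is the cause, so it is best treated as a prompt to contact the firewall administrator.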

Symposium Overview

Advanced research and education networks, such as Internet2, offer researchers and higher education faculty opportunities to connect to a diverse array of resources and "Big Data" and to collaborate with partners around the world on some of the biggest challenges of our time, across disciplines including genomics, bioinformatics, physics, and more.

The MAGPI High Performance Computing and Big Data Symposium aims to present current areas of research, pedagogy, and information technology best practices to higher education and research faculty, staff and administrators. The keynote and breakout sessions during this half-day symposium are designed to demonstrate to faculty and practitioners the range of HPC projects and resources that exist and how advancements in the HPC space can be applied to their own teaching and research activities. 

Symposium Schedule

1:00-1:10PM EST — Welcome and Introductions

1:10-1:50PM EST — Keynote Address - High Performance Computing at NOAA: Successes, Challenges, and Insights
Ron Bewtra, Technology Advisor, Head of IT Services, National Oceanic and Atmospheric Administration (NOAA)

Recently, the National Oceanic and Atmospheric Administration (NOAA) embarked on a transformation of its Research and Development High Performance Computing Systems. These systems leverage petascale computing in a distributed environment while taking on some of the most challenging big data problems of today. This talk will discuss current computing capability, the challenges of 50 petabytes of active data, large data movement, distributed computing, and how NOAA will have to transform in the future.

2:00-2:50PM EST — Breakout Session A

Session 1: Making Molehills Out of Mountains: Remote Visualizations and HPC 
Mike Chupa, Manager of Research Computing, Lehigh University 

The "Big Data Challenge" has captured the public's imagination, and figures prominently in IT industry publications. However, the root challenges of Big Data have a long history in HPC. Remote visualization tools provide a viable solution to the data movement problem by vastly diminishing data movement requirements while still providing effective visualization capabilities required to understand and generate insights from large engineering and scientific datasets. Chupa will describe remote visualization capabilities at national sites (XSEDE, NERSC) and how they can be leveraged by institutions like Lehigh University to improve research productivity in the era of Big Data.

3:00-3:50PM EST — Breakout Session B
Session 1: The Computational Needs of Modern Genomics
Jeffrey Rosenfeld, Senior Scientific Programmer, University of Medicine and Dentistry of New Jersey

Since the introduction of high-throughput sequencing machines, the amount of data produced by genetic experiments has exploded. Each run of a machine produces hundreds of gigabytes of data, which must be stored and analyzed. This has greatly increased computational requirements, and substantial computational power is needed to perform even a small-scale experiment. In this talk, Dr. Rosenfeld will give an introduction to the data requirements of what is generally known as next-generation sequencing (NGS). He will illustrate this through a discussion of the 1000 Genomes and ENCODE Projects, which are run by major international consortia. In addition, he will discuss the ways in which smaller labs deal with this data.

Session 2: Supercomputing for Atmospheric Sciences
Anke Kamrath, Director, Operations and Services Division, Computational & Information Systems Laboratory, NCAR

NCAR (National Center for Atmospheric Research) completed construction of its new “green” datacenter, NWSC (NCAR-Wyoming Supercomputing Center), in 2011 and during 2012 deployed a new supercomputing resource to support the needs of atmospheric sciences researchers from across the U.S. This presentation will focus on this new supercomputing system, Yellowstone; its supporting storage and interconnect components; the early scientific computing efforts on Yellowstone; and the key energy-efficient innovations of the NWSC.

About the Presenters

Ron Bewtra serves as the Chief Technology Officer for the National Oceanic and Atmospheric Administration (NOAA). Mr. Bewtra’s most recent activity is leading the NOAA-wide network optimization initiative. Mr. Bewtra also leads the NOAA High Performance Computing (HPC) Integrated Management Team, is responsible for the agency's HPC relationship with DOE, and is responsible for IT at the Geophysical Fluid Dynamics Laboratory located on Princeton University’s Forrestal campus. In addition, Mr. Bewtra supports HPC acquisitions and design for other Federal Agencies.

Michael Chupa is the Manager of Research Computing at Lehigh University. He is the liaison between the university's IT organization (Library & Technology Services) and the university community, focusing on high-performance and data-intensive computing. The Research Computing group provides system administration for the centrally located computing resources, application support for open-source and commercial technical computing software, and consulting services to the research community. Chupa is also Lehigh's Campus Champion for the NSF XSEDE program. Chupa has an M.S. in Computational Engineering from Mississippi State University, where he served as a Research Associate at the NSF ERC for Computational Field Simulation, and a B.A. in Physics from Oberlin College.

Anke Kamrath is Director of Computing Operations and Services in NCAR’s Computational and Information Systems Laboratory. She came to NCAR in 2009 after 22 years at the San Diego Supercomputer Center at the University of California, San Diego. Ms. Kamrath has over 25 years of experience in supporting, operating, deploying and managing world-class supercomputing resources. She has oversight responsibilities for the NCAR-Wyoming Supercomputing Center, all supercomputing operations and for all computing systems, operational and services staff. Prior to her experience in supercomputing, she worked as a rocket scientist at the Aerospace Corporation in El Segundo, California, and she has an M.S. in Mechanical Engineering from U.C. Berkeley.

Jeffrey Rosenfeld, PhD, is an Assistant Professor of Medicine at the New Jersey Medical School-UMDNJ and a Scientific Programmer in the High Performance and Research Computing Group. Since joining UMDNJ, Dr. Rosenfeld has been helping faculty across the university become familiar with the different sequencing technologies and their applications. He is familiar with Complete Genomics, Illumina, ABI SOLiD, Pacific Biosciences, Ion Torrent and other sequencing technologies. Dr. Rosenfeld serves as an expert on next-generation DNA sequencing. Previously, he was a Research Scientist at North Shore-Long Island Jewish Health System. After completing his PhD, Dr. Rosenfeld worked on targeted resequencing, exome sequencing and structural variant detection directed towards understanding the genetic basis of schizophrenia. He is also a member of the 1000 Genomes Project, where he serves as co-chair of the MNP sub-group of the analysis group and is focused on identifying complex variants in the human genome. In addition, Dr. Rosenfeld is a Research Associate at the American Museum of Natural History, where he is working on bacterial genomics (Without Compensation). He completed his PhD in Biology jointly at NYU and the Cold Spring Harbor Laboratory, where he was one of the first people to work with Illumina sequencing technology. In addition to his doctorate, he received a Master in Biotechnology and a Bachelor in Biology from the University of Pennsylvania.

For more information, please contact Jennifer Oxenford at