Data Parallelism: How to Train Deep Learning Models on Multiple GPUs

The NVIDIA AI Technology Center at the University of Florida is offering an instructor-led Deep Learning Institute workshop in April: Data Parallelism: How to Train Deep Learning Models on Multiple GPUs.

Workshop Dates: April 11-12, 2024 (Thursday and Friday), from 1:00-5:00 p.m.

Registration Link: https://forms.gle/KiNxdjqxJ7AZCZFk6

The workshop will be held over two days (four hours each day) in Malachowsky Hall’s NVIDIA Auditorium. Its focus is on techniques for data-parallel deep learning training on multiple GPUs to shorten the training time required for data-intensive applications. Working with deep learning tools, frameworks, and workflows to perform neural network training, attendees will learn how to decrease model training time by distributing data to multiple GPUs, while retaining the accuracy of training on a single GPU. The full course outline may be found on this NVIDIA website page.

The course is FREE and open to the university community, but pre-registration is required. Also required is experience with Python. Technologies used in the workshop are PyTorch, PyTorch Distributed Data Parallel, and NCCL.

If you have any questions about this workshop, please email the instructor, NVIDIA Data Scientist Yunchao Yang (yunchaoyang@ufl.edu).

 

‘Hero’ Calculation Capability Yields Significant Achievement

Basic biology textbooks will tell you that all life on Earth is built from four types of molecules: proteins, carbohydrates, lipids, and nucleic acids. But what if we could actually show that these “molecules of life,” such as amino acids and DNA bases, can be formed naturally in the right environment? Researchers at the University of Florida are using HiPerGator – the fastest supercomputer in U.S. higher education – to test this idea.

“Our previous success enabled us to use machine learning and AI to calculate energies and forces on molecular systems, with results that are identical to those of high-level quantum chemistry but around 1 million times faster,” said Adrian Roitberg, Ph.D., a professor in UF’s Department of Chemistry who has been using machine learning to study chemical reactions for six years. “These questions have been asked before but, due to computational limitations, previous calculations used small numbers of atoms and could not explore the range of time needed to obtain results. But with HiPerGator, we can do it.”

HiPerGator – with its AI models and vast capacity for Graphics Processing Units, or GPUs (specialized processors designed to accelerate graphics renderings) – is transforming the molecular research game. Until a decade ago, conducting research on the evolution and interactions of large collections of atoms and molecules could only be done using simple computer simulation experiments; the computing power needed to handle the datasets just wasn’t available.  Read the full press release here.

UFIT Senior Director Erik Deumens explained how this full takeover of HiPerGator was possible: 

“HiPerGator has the unique capability to run very large ‘hero’ calculations that use the entire machine, with the potential to lead to breakthroughs in science and scholarship,” Deumens said. “When we found out about the work Dr. Roitberg’s group was doing, we approached him to try a ‘hero’ run with the code he developed.” 

Researchers interested in discussing using HiPerGator for hero calculations are welcome to contact Dr. Deumens.

Collaborate with NVIDIA Center at UF

University of Florida researchers have the opportunity to collaborate with NVIDIA experts to accelerate their workflows, improve algorithm performance, and receive regular consultations during their projects. The NVIDIA AI Technology Center at UF (NVAITC) is a joint research center of UF and NVIDIA, with a mission to advance artificial intelligence education and research. The NVAITC is the first in the U.S. and gives UF’s researchers access to NVIDIA experts and the chance to be early adopters of NVIDIA’s advanced technologies.

Both research groups and individual researchers are eligible to apply. The NVAITC is university-wide, so faculty from any college or department are welcome to become NVAITC collaborators.

Interested in working with the NVAITC? Begin the process by contacting UF Site Manager Kaleb Smith. Dr. Smith is a senior data scientist with NVIDIA and evaluates prospective collaborations. For additional information or to request a consultation about the process, please email UFIT Research Computing’s AI Team Lead Ying Zhang.

UF-NVIDIA Hackathon: May 17-25

“Attending the 2023 Hackathon will help our team optimize our models to run on HiPerGator and increase their efficiency and performance,” wrote Warrington College of Business Assistant Professor Ivy Munoko. “We use a large dataset with tens of millions of data points.”

Partnering with NVIDIA and OpenACC, the second annual UF-NVIDIA GPU Hackathon began this week. Ten teams of computational researchers and developers are participating, including three external teams representing the National Oceanic and Atmospheric Administration, the University of Alabama, and Arizona State University. Each team is receiving mentorship in GPU programming, high-performance computing, and data applications from NVIDIA and UFIT staff. Professor Munoko’s team includes Karla Saldaña Ochoa, assistant professor, College of Design, Construction, and Planning, and Maxim Terekhov, Ph.D. candidate, Department of Information Systems and Operations Management.

The hackathon is an opportunity to port, accelerate, and optimize scientific applications with programming models and tools hosted through HiPerGator. Participants are also developing a deeper understanding of HiPerGator’s computational capabilities while running applications on the latest supercomputing hardware. Researchers who have questions about the hackathon or who would like to schedule a consultation about UF-AI computing support may contact Applications Specialist and AI Support Team Lead Ms. Ying Zhang.

2023 UF+NVIDIA Hackathon

Together with NVIDIA and OpenACC, UFIT is hosting the second annual University of Florida Open Hackathon from May 17 – May 25, 2023.

Advanced parallel computing or GPU skills are NOT required. However, it is helpful for teams to know the basics of GPU programming and profiling. The application deadline is March 1, 2023, with selected teams being notified shortly thereafter.

Scientists and computing experts from NVIDIA, along with UFIT’s Research Computing AI team, will serve as mentors to help the hackathon teams optimize their code for GPU acceleration. UFIT will provide HiPerGator as the work platform for the hackathon. Priority acceptance will be given to UF-affiliated research groups and their collaborators, but faculty, students, and research staff from all Florida universities and SEC member institutions are encouraged to apply. Anyone with questions about the application process or the hackathon contest format is welcome to contact AI Support Team Lead Ms. Ying Zhang.

Spring 2023 HiPerGator Training

UFIT Research Computing is hosting a variety of trainings and workshops throughout the Spring 2023 semester. The options include HiPerGator user training, panel events, in-person training, and networking opportunities for UF’s research community.

The robust schedule features multiple virtual NVIDIA Deep Learning Institute (DLI) workshops on the fundamentals of deep learning and of accelerated computing with CUDA Python. The always-popular Birds-of-a-Feather (BOF) sessions, facilitated by Research Computing staff, introduce current and potential HiPerGator users to the available high-performance computing and AI resources and services, such as accelerated genomics and MLflow. There are also two AI panels scheduled. The first promotes women in HPC and AI, and the second will discuss the use of AI in arts and humanities research.

All UFIT Research Computing training, panels, and BOF sessions are free. To register for any of the offerings, visit https://rc.ufl.edu/calendar/. Faculty and staff can also request group, department, or 1-on-1 training consultations. For assistance with custom training needs, please contact UFIT’s Training and Biocomputing Specialist, Dr. Matt Gitzendanner.

Full-Day NVIDIA Workshops–Summer 2022

UFIT is offering two full-day NVIDIA workshops this summer. Registration for the Deep Learning Institute (DLI) offerings is open to faculty and to staff who support research computing applications. Anyone with questions prior to registering may contact AI Support Team Lead Ying Zhang, yingz@ufl.edu.

NVIDIA DLI: Building Transformer-Based Natural Language Processing Applications
This is an online workshop, held via Zoom.
DATE: June 21, 2022
TIME: 9:00 a.m. – 6:00 p.m.
INFORMATION: https://rc.ufl.edu/calendar/#!view/event/date/20220621/event_id/24401

NVIDIA DLI: Fundamentals of Deep Learning
This is an in-person workshop, held at the UF Informatics Institute (432 Newell Drive).
DATE: July 28, 2022
TIME: 9:00 a.m. – 5:00 p.m.
INFORMATION: https://www.rc.ufl.edu/calendar/#!view/event/date/20220728/event_id/24328

Participants receive an NVIDIA DLI certificate to recognize their subject matter competency after the successful completion of the post-workshop assessment. UFIT offers year-round training opportunities to support research inquiry. Visit the calendar of training and events for other learning opportunities.

UF and NVIDIA Co-Hosting Hackathon

UF and NVIDIA, in collaboration with OpenACC, are jointly hosting the UF Hackathon from March 29–April 6, 2022. The deadline for teams to apply is Feb. 21, with selected teams being notified shortly thereafter.

UFIT’s AI team, along with NVIDIA AI staff, will serve as mentors to help teams parallelize and optimize code for GPU acceleration. UFIT is also providing HiPerGator AI as the work platform for the UF Hackathon. Teams from the University of Florida have priority during the application process, but teams from other Florida universities and all SEC universities are also able to apply.

The UF Hackathon is a multi-day, intensive, hands-on event designed to help computational scientists and researchers port and optimize their applications using GPUs. It pairs participants with dedicated mentors experienced in GPU programming and development in AI, high-performance computing, and data science applications. The event will use computing resources from HiPerGator AI, currently ranked as the second most powerful supercomputer in U.S. higher education.

Participating teams will leave the event either with applications running on GPUs or with a clear roadmap of next steps to leverage GPUs. Anyone with questions about the UF-NVIDIA Hackathon may contact Ms. Ying Zhang, applications specialist and AI team lead for UFIT.

Options for Using HiPerGator and HiPerGator AI

HiPerGator and HiPerGator AI can be used for teaching and research by UF faculty and faculty from Florida’s state universities. Options for using University of Florida supercomputing resources are as follows:

1. For teaching a class, allocations are free and last for one semester.
2. For research, allocations can be purchased for periods ranging from three months to several years. The rates are listed at https://www.rc.ufl.edu/services/rates/.
3. A free three-month trial allocation may also be requested. Trial allocations can be used to develop a course and to explore HiPerGator’s use for research. Interested faculty should complete the trial application form. Upon completion of the trial period, faculty will work with UFIT to find the best way forward for continuing their use of HiPerGator and HiPerGator AI.
4. Colleges and departments can also request a free three-month trial allocation to be shared between faculty in the unit. This option provides access for learning about AI and preparing to include AI in courses at no cost to individual faculty. Details of a basic AI Starter Allocation are available on the https://www.rc.ufl.edu/artificial-intelligence page.

HiPerGator has been successfully operating on the financial model described above since 2013. Financial support is due to significant investment from the Provost and Senior VP for Academic Affairs, the VP for Research, and the Office of the VP and CIO. Anyone with questions about UFIT’s computational resources and support for teaching or research may contact UFIT Research Computing Director Erik Deumens.

Sharing UF’s AI Journey with the World

Enhancements to UF’s AI website (https://ai.ufl.edu/) debuted on January 26. The site now more fully showcases UF’s commitment to integrating AI across academic, research, and outreach efforts. The Office of Strategic Communications and Marketing partnered with UFIT’s Web Services to bring about the site enhancements.

“The story we want to share digitally is one that encompasses all aspects of AI at UF,” notes Melanie Schramm, assistant vice president for strategic communications. “UF’s AI Initiative touches on every aspect of the university. Working with UFIT, this site will help us share information and discoveries with stakeholders around the world.”

The new design features a restructured homepage with content blocks dedicated to university-wide research and announcements. Two new tabs have been added: the Industry tab details UF’s investments in solidifying its role as the first “AI University,” while the News tab lists updates and events from departments and units. Students can also check out the chart on the Academics tab that lists all new and enhanced undergraduate and graduate courses that include AI components. The popular Calendar link lists the symposia, trainings, webinars, and other events focusing on AI across the enterprise. UF Human Resources also has a Jobs page dedicated solely to the AI-focused faculty and staff positions available with the university.