Data Parallelism: How to Train Deep Learning Models on Multiple GPUs

The NVIDIA AI Technology Center at the University of Florida is offering an instructor-led Deep Learning Institute workshop in April: Data Parallelism: How to Train Deep Learning Models on Multiple GPUs.

Workshop Dates: April 11-12, 2024 (Thursday and Friday), from 1:00-5:00 p.m.

Registration Link: https://forms.gle/KiNxdjqxJ7AZCZFk6

The workshop will be held over two days (four hours each day) in Malachowsky Hall’s NVIDIA Auditorium. Its focus is on techniques for data-parallel deep learning training on multiple GPUs to shorten the training time required for data-intensive applications. Working with deep learning tools, frameworks, and workflows to perform neural network training, attendees will learn how to decrease model training time by distributing data to multiple GPUs, while retaining the accuracy of training on a single GPU. The full course outline may be found on this NVIDIA website page.

The course is FREE and open to the university community, but pre-registration is required. Also required is experience with Python. Technologies used in the workshop are PyTorch, PyTorch Distributed Data Parallel, and NCCL.
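As a rough illustration of why data-parallel training can match single-GPU accuracy, the sketch below simulates the idea in plain Python: each "worker" computes a gradient on its own equal-sized shard of a batch, and the shard gradients are averaged, which is the role an all-reduce (for example, via NCCL) plays across GPUs. This is not workshop material; the toy model, the function names, and the synthetic data are all assumptions made for illustration, and no real GPUs or PyTorch calls are involved.

```python
import random


def grad_mse(w, batch):
    """Gradient of the mean squared error for the toy 1-D model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def data_parallel_grad(w, batch, num_workers):
    """Simulate data parallelism: split the batch into equal shards, compute
    one gradient per shard, then average the shard gradients (the step an
    all-reduce would perform across GPUs). Assumes len(batch) is divisible
    by num_workers."""
    shard_size = len(batch) // num_workers
    shards = [batch[i * shard_size:(i + 1) * shard_size]
              for i in range(num_workers)]
    local_grads = [grad_mse(w, shard) for shard in shards]
    return sum(local_grads) / num_workers


# Hypothetical synthetic data: targets follow y = 3x exactly.
random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(64)]
batch = [(x, 3.0 * x) for x in xs]

w = 0.5
single = grad_mse(w, batch)                              # one device, full batch
parallel = data_parallel_grad(w, batch, num_workers=4)   # four simulated workers

# The two gradients agree up to floating-point rounding.
print(abs(single - parallel) < 1e-9)
```

Because the shards are the same size, the average of the per-shard mean gradients equals the full-batch mean gradient, so each optimizer step matches the single-device step while each worker processes only a fraction of the data.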

If you have any questions about this workshop, please email the instructor, NVIDIA Data Scientist Yunchao Yang (yunchaoyang@ufl.edu).

 

New NVIDIA DLI Workshop Offered in February

The University of Florida’s ambassadorship status with NVIDIA means that faculty, students, and staff have free training opportunities in accelerated computing and applied AI. Through the NVIDIA Deep Learning Institute (DLI) and in coordination with UFIT, a two-day Generative AI with Diffusion Models workshop is being offered for the first time at UF on February 22-23.

Day/Date/Time: Thursday and Friday, Feb. 22–23, from 12:00 – 4:00 p.m. each day

Location: Malachowsky Hall Auditorium – Room 1000

Register to Attend: Registration Link

The Generative AI with Diffusion Models workshop is taught by Kaleb Smith, Site Manager and Senior Data Scientist at UF’s NVIDIA AI Technology Center. Participants will gain a deeper understanding of how denoising diffusion models generate images from text prompts. Proficiency in PyTorch and deep learning models is required to attend. Participants who complete the 8-hour course will earn a certificate of completion.

Learning highlights in this workshop include:

  • Building a U-Net to generate images from pure noise
  • Improving the quality of generated images with the denoising diffusion process
  • Controlling the image output with context embeddings
  • Generating images from English text prompts using CLIP
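For readers unfamiliar with the denoising diffusion process named above, the sketch below shows the standard forward (noising) step that such models learn to reverse: a clean sample is blended with Gaussian noise according to a noise schedule. It is a minimal plain-Python illustration, not the workshop's code; the linear schedule, its start and end values, the step count, the function names, and the four-number stand-in for an image are all assumptions.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, increasing linearly (assumed defaults)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar[t] = product of (1 - beta) over steps 0..t: the fraction
    of the original signal's variance that survives at step t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Forward step: x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    a = alpha_bars[t]
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * n for x, n in zip(x0, noise)]
    return xt, noise


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))

x0 = [0.5, -0.25, 1.0, 0.0]          # hypothetical stand-in for image pixels
x_early, _ = add_noise(x0, 0, alpha_bars, rng)     # almost unchanged
x_late, _ = add_noise(x0, 999, alpha_bars, rng)    # almost pure noise
```

A denoising network such as a U-Net is typically trained to predict the `noise` term from the noised sample and the step index, which is what allows images to be generated starting from pure noise.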

NVIDIA DLI workshops are in-person only and not recorded for later/repeat viewing. Anyone with questions about this workshop is welcome to contact Research Computing Training Team Lead Matt Gitzendanner.

Spring 2023 HiPerGator Training

UFIT Research Computing is hosting a variety of trainings and workshops throughout the Spring 2023 semester. The options include HiPerGator user training, panel events, in-person training, and networking opportunities for UF’s research community.

The schedule features multiple virtual NVIDIA Deep Learning Institute (DLI) workshops on the fundamentals of deep learning and of accelerated computing with CUDA Python. The always-popular Birds-of-a-Feather (BOF) sessions, facilitated by Research Computing staff, introduce current and potential HiPerGator users to the available high-performance computing and AI resources and services, such as accelerated genomics and MLflow. There are also two AI panels scheduled. The first promotes women in HPC and AI, and the second will discuss the use of AI in arts and humanities research.

All UFIT Research Computing training, panels, and BOF sessions are free. To register for any of the offerings, visit https://rc.ufl.edu/calendar/. Faculty and staff can also request group, department, or 1-on-1 training consultations. For assistance with custom training needs, please contact UFIT’s Training and Biocomputing Specialist, Dr. Matt Gitzendanner.

Fall 2022 HiPerGator Training

Registration for the fall 2022 Research Computing training schedule is now open. The training sessions will be held on Thursdays from 10:40 a.m. to 12:00 p.m. For the first time, UFIT is offering both in-person and Zoom attendance options for each training this semester.

The in-person location is the UF Informatics Institute Seminar Room. All sessions are open to faculty, lab staff, and undergraduate and graduate students. Please register by 9:00 a.m. on the day of the training to ensure you receive the Zoom link. Sessions will be recorded and posted on Research Computing’s pre-recorded training page. To learn more about any of the training options and to register, visit https://rc.ufl.edu/calendar/.

Sep. 08: 10:40 a.m.-12:00 p.m. │ Intro to Research Computing and HiPerGator
Sep. 22: 10:40 a.m.-12:00 p.m. │ Intro to the Linux Command Line
Sep. 29: 10:40 a.m.-12:00 p.m. │ HiPerGator SLURM Submission Scripts
Oct. 06: 10:40 a.m.-12:00 p.m. │ HiPerGator SLURM Scripts for MPI Jobs
Oct. 20: 10:40 a.m.-12:00 p.m. │ Running Graphical Applications on HiPerGator
Nov. 03: 10:40 a.m.-12:00 p.m. │ Jupyter Notebook and Managing Conda Environments
Nov. 10: 10:40 a.m.-12:00 p.m. │ Git and github.com

Optimizing Performance on the NVIDIA Platform

UFIT is offering a 90-minute tutorial, “Performance Analysis and Optimization on the NVIDIA Platform,” on April 14. The tutorial is free and open to faculty, students, and staff.

Tutorial: Performance Analysis and Optimization on the NVIDIA Platform
Date: Thursday, April 14, 2022
Time: 12:00 PM – 1:30 PM
Pre-registration is required

The tutorial will be led by an NVIDIA scientist who will present an introduction to performance analysis on accelerated CPU-GPU servers. Attendees will learn how to use NVIDIA Nsight Visual Studio profiling tools to understand the behavior of AI and high-performance applications and to determine which optimization steps are appropriate for improving the overall time-to-solution.

Registrants will be sent the secure Zoom link the day before the tutorial. Anyone with questions about this event may contact UFIT’s AI Team Lead and Senior Application Developer Ying Zhang.