Campus Computing News
UNT Plans High Performance Computing Upgrade
By Dr. Philip Baczewski, Senior Director of Academic Computing and User Services and Deputy Chief Information Officer for University Information Technology
In 2009, UNT made a major investment in supporting computational research with the purchase of a large-scale campus-wide High Performance Computing (HPC) installation that the University named Talon. The Talon system was expected to operate for three years, after which a replacement system would need to be purchased. Talon is now in its fourth year of operation, and since High Performance Computing stresses the compute servers by constantly applying 100 percent processing loads, disk and memory failures are becoming more common as the equipment ages.
Since the 2009 installation of Talon, use of HPC resources has grown from 9 research groups in 4 departments to over 53 Principal Investigator groups in 12 departments. The number of individual users has grown from 55 to over 300. Departments utilizing central HPC resources include: Biological Sciences, Chemistry, Mathematics, and Physics in the College of Arts & Sciences; Materials Science and Engineering, Computer Science and Engineering, Electrical Engineering, Mechanical & Energy Engineering, and Engineering Technology in the College of Engineering; and Sociology in the College of Public Affairs and Community Service. The number of participating departments is expected to grow. The current system is heavily used, with 70-80% of system capacity commonly occupied and 100% of servers frequently engaged in research calculations. Talon averages about 800,000 processing hours (91 processing years) per month, with 8 out of 36 months exceeding a million processing hours. Usage reached 1.4 million processing hours in February 2012.
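The conversion between processing hours and processing years quoted above can be checked with a few lines of Python. This is only a sketch: it assumes a processing year is simply 8,760 processing hours (365 days of 24 hours), and the function name is ours, not part of any UNT tooling.

```python
# Assumption: one processing year = 24 * 365 = 8,760 processing hours.
HOURS_PER_YEAR = 24 * 365


def processing_years(processing_hours: float) -> float:
    """Convert a count of processing hours into processing years."""
    return processing_hours / HOURS_PER_YEAR


# Monthly figures taken from the article.
average_month = processing_years(800_000)
february_2012 = processing_years(1_400_000)

print(f"Average month: {average_month:.1f} processing years")
print(f"February 2012: {february_2012:.1f} processing years")
```

Under that assumption, the monthly average works out to a little over 91 processing years, in line with the figure given above.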
Replacing our current HPC equipment
At its February 14 meeting, the UNT Board of Regents approved a proposal to enter into a three-year lease with Dell to replace our current HPC equipment. Coordination among UNT's Research and Economic Development, University Information Technology, and UNT's Finance and Budget office has resulted in a funding model that establishes a regular three-year cycle for refreshing HPC technology. The model is intended to keep researchers competitive while providing a base level of broad-usage HPC technology as a utility for the entire campus. As part of the new lease agreement, Dell has offered to provide UNT four additional months of lease at no additional charge. Accepting the extended lease offer gives UNT researchers early access to current HPC technology and allows a smoother transition between the current Talon system and its replacement equipment.
The proposed system has roughly 5 times the processing power of the original Talon system, with 248 compute nodes, 4096 cores, and 16,384 NVIDIA graphics processing unit (GPU) cores. It also has about 10 times the high-performance storage of the original Talon system, for close to 1.5 petabytes of usable storage. Installation of this system will enable researchers to solve problems that require more RAM and disk storage than are currently available and will support new research areas such as those involving large-scale data science. The heterogeneous design of the HPC system features 4 distinct types of compute servers that can handle specialized workloads or function as part of a large processor group. This design ensures that the system can support the varied computational problems pursued by UNT researchers and can be adapted to new research areas that may be pursued at UNT.
New System Configuration
The new system configuration will be as follows:
- 160 Parallel nodes (32 GB RAM, 16 cores each) – useful for computations that use multiple nodes (servers) to perform one set of calculations, such as materials science applications.
- 64 Large-memory nodes (64 GB RAM, 16 cores each) – useful for single-server, multi-core computations that require a large memory space, but also available for parallel processing, such as computational physics applications.
- 8 Extra-large memory nodes (512 GB RAM, 32 cores each) – useful for single-server calculations that are the most memory intensive, such as computational chemistry modeling of large molecular systems.
- 16 GP-GPU nodes (64 GB RAM, 8 cores each, 1,024 GPU cores each) – onboard NVIDIA co-processor cards provide access to thousands of additional processor cores in a single or parallel processing model. These nodes can be applied to some existing computational problems, but could also be used for tasks like rendering complex visual output.
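The configuration above can be tallied with a short Python sketch. The class and function names here are illustrative only, and the sketch reads the 1,024 per-node figure for the GP-GPU nodes as GPU cores, which is what makes the total match the 16,384 GPU cores quoted earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    """One class of compute node, with per-node resources."""
    name: str
    count: int
    ram_gb: int
    cpu_cores: int
    gpu_cores: int = 0  # assumption: only GP-GPU nodes carry GPU cores


# Figures copied from the configuration list above.
NODE_TYPES = [
    NodeType("Parallel", count=160, ram_gb=32, cpu_cores=16),
    NodeType("Large-memory", count=64, ram_gb=64, cpu_cores=16),
    NodeType("Extra-large memory", count=8, ram_gb=512, cpu_cores=32),
    NodeType("GP-GPU", count=16, ram_gb=64, cpu_cores=8, gpu_cores=1024),
]


def totals(node_types):
    """Sum node counts and per-node resources across all node types."""
    return {
        "nodes": sum(n.count for n in node_types),
        "cpu_cores": sum(n.count * n.cpu_cores for n in node_types),
        "gpu_cores": sum(n.count * n.gpu_cores for n in node_types),
        "ram_gb": sum(n.count * n.ram_gb for n in node_types),
    }


for key, value in totals(NODE_TYPES).items():
    print(f"{key}: {value:,}")
```

The itemized figures sum to 248 nodes and 16,384 GPU cores, matching the totals given earlier. The CPU cores sum to 3,968, slightly below the 4096 quoted for the system as a whole; the list does not account for the difference.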
We expect equipment for the upgraded HPC system to begin arriving by the end of March. Installation will likely occur in April and May, and partial operations on the new equipment could begin as early as May. For more information on UNT's High Performance Computing services, see it.unt.edu/hpc. Researchers interested in using HPC services should contact Dr. Scott Yockel, Manager of HPC Services within Academic Computing and User Services.