Documentation for Grid Technology Grants
Wayne State University's (WSU) High Performance Computing Services Department develops, deploys, and maintains a centrally managed, scalable, Grid-enabled computing system capable of storing and running research-related high performance computing (HPC) projects. The Grid infrastructure at WSU is designed to give research groups a range of options suited to the nature of the research being performed. The core Grid services are maintained by the University's central computing staff within the Computing and Information Technology (C&IT) Department.
The Grid consists of clusters available for general use, with the option of preemption available to the owners of the clusters, as well as separate clusters dedicated to specific research groups. These clusters use high-speed 10 Gb Ethernet and InfiniBand networks. The Grid currently has a combined processing power of 7,256 cores (2,328 Intel cores and 4,928 AMD cores), with over 22 TB of RAM and 1.2 PB of disk space.
In addition to the hardware described above, the primary cluster site includes several key systems dedicated to core Grid services. These systems consist of development machines, an NFS server for central user directories, management systems for monitoring and notification, a web server for training and documentation, and dedicated backup machines for managing data and integrating it with the University's enterprise backup system. All machines are managed through a private administrative network and employ out-of-band management techniques for central administration, monitoring, and maintenance.
WSU's Grid runs Linux and is currently installed with Rocks version 6.1 and CentOS 6. These resources are managed by Altair's PBS Professional 12 job scheduler, which allows researchers to access different networks and architectures using a standard, simple command set. This software suite provides a suitable framework for developing and deploying Grid-based applications and performing Grid-based research at WSU. WSU also maintains software agreements and site license contracts with many vendors and actively uses these agreements to provide software at a reduced cost to research groups on campus.
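As a rough illustration of the kind of command set a PBS Professional scheduler exposes, the following Python sketch assembles a minimal job script. The `#PBS` directives shown (`-N`, `-q`, `-l select=...`, `-l walltime=...`) are standard PBS Professional options, but the `build_pbs_script` helper, the queue name, and the resource values are hypothetical and are not taken from the WSU configuration.

```python
def build_pbs_script(job_name, command, queue="example_queue",
                     ncpus=4, mem_gb=8, walltime="01:00:00"):
    """Return the text of a minimal PBS Professional job script.

    All default values are illustrative assumptions, not WSU settings.
    """
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",                            # job name
        f"#PBS -q {queue}",                               # hypothetical queue name
        f"#PBS -l select=1:ncpus={ncpus}:mem={mem_gb}gb",  # one chunk of resources
        f"#PBS -l walltime={walltime}",                   # maximum run time
        "",
        "# Run from the directory where the job was submitted",
        'cd "$PBS_O_WORKDIR"',
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The command below is a placeholder for a real research workload.
    print(build_pbs_script("demo_job", "./my_simulation --input data.in"))
```

The generated text could be saved to a file and submitted with `qsub`, with job status checked through `qstat`. The queues and resource limits that actually apply would depend on the site configuration.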
WSU employs a full-time staff dedicated to maintaining Grid resources and supporting systems. These highly trained and educated professionals assist research groups with integrating their work into the WSU Grid. This central staff ensures that independent research groups operate within the parameters set forth by the University as a whole, and that best practices are followed by all researchers on campus involved with HPC. The staff works closely with research groups to minimize installation time and to ensure a high return on investment when working with computing hardware that can depreciate quickly.
The WSU Grid implements an NFS-attached parallel storage system to house critical research data in a secure, scalable system that can grow to meet the demands of the Grid. The current configuration consists of a Panasas ActiveStor system with 48 TB of highly redundant usable storage across 6 chassis. The Panasas system offers up to 1.5 GB/s of throughput per chassis, object RAID assigned per file with fast reconstruction times, and a per-chassis battery backup system.
Backup and Disaster Recovery
All critical research data on the Grid is backed up daily via WSU's Symantec NetBackup Enterprise Server. This system operates on a private Gigabit network and is composed of an IBM TS3500 (ts3584) tape robot with 12 LTO-4 FC tape drives and a pair of Sun Fire 6800 systems to maintain catalogues, tapes, and data pools. This backup system is an enterprise operation that is maintained by the central computing department and provides recovery capability for all of the University's critical data. The tape library system can presently house petabytes of data.
The central Grid site at WSU is located in a secured, 24-hour monitored facility with over 6,900 square feet of raised floor space, central air conditioning, battery-backed electrical service, and an emergency natural gas generator. This multimillion-dollar environment houses not only critical research equipment, but also the primary information systems that maintain the University's financial and academic records. The computing center environment is protected by an FM-200 waterless fire suppression system, which interrupts the chemical chain reaction in fires and absorbs heat, thereby protecting the computing hardware and data.
A Gigabit Ethernet fiber backbone connecting over 90 buildings forms WSU's primary network infrastructure. This state-of-the-art, high-speed network allows researchers at WSU to connect to our central Grid services over a reliable, fast, secure connection, and guarantees the availability of decentralized Grid components regardless of their physical location on campus. The current network can scale to 10 Gbps in the future and presently contains Juniper M160 gigabit routers and Cisco 6500 series electronics.
The Grid is also connected to the Michigan LambdaRail. MiLR (pronounced "MY-lar") is a very high-speed, special-purpose data network built jointly by Michigan State University, the University of Michigan, and Wayne State University, and operated by the Merit Network. MiLR provides campus researchers low-cost, 10 Gbps Ethernet connections between the three university campuses and to national and international research and education connection points in Chicago. Work is underway to interconnect MiLR with other similar networks being built in the U.S. and internationally.
Security and Authentication
WSU uses Sun Microsystems' LDAP directory system for central authentication to all Grid resources. This central production system is maintained by a full-time staff and is used for many other systems on campus, such as email, library systems, general computing labs, student records, and registration. This central authentication system provides a secure way to ensure that only appropriate users can access the Grid and their respective data. WSU is also a founding member of Merit Network, which provides connectivity to all of the state-run universities in Michigan.