Hardware Today: Sun Cluster Bolsters HPC at Penn State

by Drew Robb

Despite its heterogeneous nature, the server room for Pennsylvania State University's research computing department reflects its 15-year relationship with Sun Microsystems. The latest addition: a Sun Fire V20z Opteron-based Linux cluster.

Pennsylvania State University's research computing department has a 15-year relationship with Sun Microsystems. During this time, it accumulated a large collection of Sun SPARC and Sun Xeon boxes in a heterogeneous server room where relationships with server vendors like IBM and Dell are equally cultivated.

"Sun accounts for about 800 processors out of a total of 1,500 we have in our clustering environment," says Vijay Agarwala, director of high-performance computing (HPC) and visualization at Penn State. "These include servers using the SPARC, Xeon, and Opteron processors."

The university's HPC needs consistently outpace its capacity. To keep up with demand, the organization now adds new clusters regularly. While older SPARC and Xeon clusters remain an integral part of operations, it recently decided to establish a new and more powerful cluster built on Sun Fire V20z units using AMD Opteron processors and running the Linux operating system. These 2-way servers currently have 8 GB of RAM with room for a maximum of 16 GB, which the university is planning to take advantage of in the near future.

"The Sun/Opteron combination was the most cost-effective way for us to significantly increase our compute capacity," says Agarwala. "Our stock of V20z's will soon be augmented by some Sun Fire V40z servers, and later by dual-core V40z machines."

The research computing group at Penn State University consists of a staff of 15 people. Its resources are an important part of the university's research activities, and scientists often run large-scale computations that cannot be performed on a desktop. Those with research grants have a choice of acquiring their own hardware to build a small cluster on which to attempt their own computations or paying for the services they use on the university research clusters.

Agarwala's unit receives central funding to cover salaries and other basic expenses, so it can offer its services at a competitive rate.

"Over 95 percent of those wanting to do serious computing choose to work with us due to the excellence of our services," he says. "They only have to pay for the hardware acquisition and not for the support of the IT group."

The research computing group runs various engineering, mathematical, and scientific calculations, in areas such as computational physics, chemistry, weather predictions, bio-informatics, aerospace engineering, and geosciences.

Several years ago, the university added a cluster containing 350 Intel Xeon processors running on Sun hardware. Nine months ago, when it realized it needed to expand its clustering capabilities, it purchased 84 dual-processor Sun Fire V20z servers with 2.6 GHz AMD Opteron single-core chips running Red Hat Enterprise Linux 3. The university plans to move to version 4 within a month.

The V20z server comes with up to 16 GB of memory and is particularly suited to demanding jobs, such as the high-performance technical computing tasks done at Penn State. In addition, Sun says the server can be used for Web or application serving, horizontally scaled databases, and grid computing.

Why did the university select Opteron-based machines? Agarwala's experience led him to believe there are many benefits to moving to that specific processor. AMD's HyperTransport Technology, for example, offers an interconnect designed to eliminate I/O performance bottlenecks. Agarwala also claims the Opteron is 25 percent less power hungry than the Xeon.

"When Sun moved away from Intel Xeon to AMD Opteron, we thought that was a good move," says Agarwala. "Opteron gives us better access to memory, lower cooling demands and a better price/performance point, and we felt that the Sun V20z was an excellent design."

Having run Xeon- and Opteron-based Sun machines side-by-side, Agarwala feels he's earned the stripes to compare them. To be fair, the Xeon machines have older processors, so an apples-to-apples comparison isn't easy or even always possible. Agarwala took that into account when commenting on the qualities of both processors. While he prefers the Opteron, he also sees value in the Xeon.

"Opteron's integrated memory controller makes it ideal for memory-intensive tasks," he says. "For research computations that require smaller amounts of memory, however, we can achieve similar performance levels on Xeon. We tend to flow jobs to those machines that can deal with them best, although there is always some mixing."

Future Clusters

Penn State's insatiable demand for research compute power continues unabated. Forty 4-way Sun V40z servers are en route, and the university is talking about adding another 40 in the near future. These boxes will also have 2.6 GHz AMD processors. For the past few months, Agarwala has been testing one such unit in-house. Based on his benchmarks, he feels it scales well on the kinds of problems the university runs.

"With many of the problems we run, the V40z will speed up the end-to-end time to execution," he says. "However, some computations don't lend themselves well to multiple processors."

These 4-processor machines will have 32 GB of memory. Their introduction is expected to multiply the options available to Penn State. Some computations take a very long time to process because they require an excessive amount of I/O from disk. By moving I/O to memory, researchers can solve problems that were formerly beyond them. Alternatively, specific situations might demand that three processors in the 4-way Sun server sit idle. One CPU using all 32 GB of RAM might be the most efficient configuration for certain problems.
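The capacity figures quoted in this article can be tallied with a few lines of Python. This is only an illustrative sketch: the `ServerBatch` class and its field names are hypothetical, and the numbers are the ones cited here (84 V20z servers with two processors and 8 GB each, 40 V40z servers with four processors and 32 GB each).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerBatch:
    """Hypothetical record for one purchase of identical servers."""
    model: str
    servers: int
    sockets_per_server: int
    ram_gb_per_server: int

    @property
    def total_sockets(self) -> int:
        return self.servers * self.sockets_per_server

    @property
    def total_ram_gb(self) -> int:
        return self.servers * self.ram_gb_per_server

    @property
    def ram_gb_per_socket(self) -> float:
        # Memory available to each processor if all sockets are busy.
        return self.ram_gb_per_server / self.sockets_per_server


# Figures as quoted in the article.
v20z = ServerBatch("Sun Fire V20z", servers=84, sockets_per_server=2, ram_gb_per_server=8)
v40z = ServerBatch("Sun Fire V40z", servers=40, sockets_per_server=4, ram_gb_per_server=32)

for batch in (v20z, v40z):
    print(
        f"{batch.model}: {batch.total_sockets} processors, "
        f"{batch.total_ram_gb} GB RAM total, "
        f"{batch.ram_gb_per_socket:.0f} GB per processor"
    )
```

With all four processors busy, a V40z offers 8 GB per processor. Leaving three idle, as described above, gives a single processor the full 32 GB.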

Agarwala is also leaning in the direction of dual-core technology. After the arrival of the 40 single-core Sun V40z's, he believes it extremely likely the university will order dual-core V40z's next. These are 4-way dual-core (eight-core) 2.2 GHz machines.

"We expect to have at least 100 V40z servers eventually so it is natural that many of these will end up being dual-core Sun machines," says Agarwala. "I believe that dual-core technology is one of the most significant advances of the past five years."

Agarwala's imagination is already full of the possibilities. When attempting to solve a research computation for which 32 GB of memory is enough, he can throw eight cores at it instead of four, giving it a performance boost of 50 percent to 70 percent at the same cost.

"Sun's new dual-core Opteron systems allow the university to almost double its processing core capacity while maintaining the same heat and real-estate footprint," Agarwala says. "If I can get even 50 percent more processing performance, while using the same amount of power, then this will be huge."
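As a rough sanity check on the 50 to 70 percent figure, one can compare a 4-way single-core V40z at 2.6 GHz with a 4-way dual-core V40z at 2.2 GHz using a deliberately naive model. The assumption, which is not from Agarwala or Sun, is that throughput scales with cores multiplied by clock speed. That ignores memory bandwidth, cache effects, and how well a given job parallelizes, so it is an upper-bound sketch only. The function name is hypothetical.

```python
def naive_throughput_ratio(cores_new: int, ghz_new: float,
                           cores_old: int, ghz_old: float) -> float:
    """Ratio of aggregate core-GHz, new machine over old.

    Assumes perfectly parallel work and equal work per clock cycle,
    which real workloads rarely achieve.
    """
    return (cores_new * ghz_new) / (cores_old * ghz_old)


# Figures quoted in the article: eight cores at 2.2 GHz vs. four at 2.6 GHz.
ratio = naive_throughput_ratio(cores_new=8, ghz_new=2.2, cores_old=4, ghz_old=2.6)
print(f"Naive gain from dual-core V40z: {(ratio - 1) * 100:.0f} percent")
```

Under this simple model the gain comes out just under 70 percent, which sits at the top of the range cited above. Jobs that do not spread well across multiple processors would see less.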

This article was originally published on Monday May 16th 2005