Hardware Today: Clusters Catch on in the Enterprise

Monday Mar 6th 2006 by Drew Robb

Out of the NASA lab and into an enterprise near you, clustering technology is practically plug-and-play.

Tang isn't the only product to come out of the space program. Server clustering started there as well. And, unlike the beverage, it delivers benefits directly related to ROI. Clustering provides a scalable approach for enterprises to achieve supercomputing power by simply running commodity servers in parallel over an Ethernet connection.

Thomas Sterling and Don Becker built the first such cluster, known as Beowulf, in 1994 at NASA's Goddard Space Flight Center in Greenbelt, Maryland. That unit consisted of 16 Intel DX4 processors connected by 10 Mbps Ethernet. By the end of that decade, more than 100 clusters were in use at research universities, laboratories, and corporations. The most recent list of the 500 fastest supercomputers includes 360 clusters, up from two just seven years earlier. And the growth isn't limited to top-end systems.


"The big news is that server clustering is going mainstream — moving from purely the domain of the scientists and academics in federal labs and universities to commercial — aerospace, auto design, oil and gas, financial services — and into the general enterprise," says Pauline Nist, senior vice president for product development and management at Penguin Computing. "CIOs are increasingly moving to managing large clusters and grids of servers for applications, such as Web services and business logic."

2+2=3

The initial driver for developing clusters was to reduce the cost of building high performance computers. And the technology can still do that — many clusters are built with off-the-shelf Intel or AMD boxes. Virginia Tech even has a cluster consisting of 1,100 dual-processor Apple Xserve units that produces 12 teraflops, enough to rank it the 20th fastest supercomputer in the world. But for many organizations, cost cutting is no longer the only motivator. Instead, clustering offers them a way to improve scalability and reliability.

"Having the ability to add a node into the configuration and redistribute the workload can be very appealing," says Chip Nickolett, president of Comprehensive Consulting Solutions in Brookfield, Wisc. "While not fully fault tolerant, they do provide companies with the ability to resume business operations and/or processing with a minimal amount of disruption, downtime, and loss of data."

Setting up a cluster involves more than just wiring together a bunch of servers. "Clusters are complex to set up and complex to manage," says John Humphreys, research manager with IDC.

Nickolett cites several misconceptions customers frequently have about clustering, misconceptions that lead them to underestimate what is involved in setting up and operating a cluster. The first is the expectation that clusters scale linearly. In reality, they carry a fair amount of overhead, typically in the 15 to 25 percent range.

"They [customers] expect near-linear results when combining systems and may soon realize that this is not a realistic expectation," he says. "This can result in having to acquire more hardware than originally planned or enhancing the software to perform more efficiently.

"When you have a computer cluster with 10,000 files created, you might have 1,000 storage servers attached. A modular approach makes a better approach than big iron." — David Freund, Illuminata analyst

And then there is the matter of software. Managing a cluster involves far more than just sharing disk space or distributing a processing workload.

"Software needs to be cluster aware," Nickolett continues. "That means that a distributed lock manager (which causes most of the overhead) needs to be built into the software to manage concurrency and data consistency when being accessed by more than one physical system."

He believes Ingres Corporation, based in Redwood City, Calif., is one of the leaders in this area due to its database software. In addition, Oracle has the Real Application Cluster version of its enterprise databases, and IBM has the DB2 Integrated Cluster Environment for Linux, which consists of its DB2 Universal Database running on an eServer 1350 cluster.

But no matter what changes are made to the databases or applications, there is still the matter of managing the cluster itself, something Penguin Computing's Nist acknowledges is "still very complex, time consuming, and error prone." To simplify matters, one growing area is "stateless provisioning," where the operating system, middleware, and application stacks are loaded into memory rather than on the hard disks.

"There is a growing recognition among organizations that stateless provisioning is a promising approach to dramatically improving the management of large pools of servers," says Nist. "It loads orders of magnitude faster, guards against having the wrong versions, and is much more effortless. And repurposing can occur on-demand as service and business demands change and resource allocation must be adjusted."


Split Storage

In addition to being used in traditional high-performance computing and failover applications, specialized clusters are making their way into the storage server market.

"Traditional clusters have trouble scaling and hit bottlenecks when trying to create thousands of files every minute," says Illuminata analyst David Freund.

He says traditional clustered file systems break down when asked to handle such loads. The answer is to parallelize the structure, an approach HP takes with its HP StorageWorks Reference Information Storage System (RISS), an active-archiving solution for e-mail and Microsoft Office documents (later releases will address other file types). RISS addresses scalability and performance issues by distributing content across a grid of storage units called "smart cells," which are connected in a peer-to-peer fabric. Network-attached storage (NAS) vendors Network Appliance and EMC also sell clustering products.
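One way to picture the smart-cell approach is as hash-based placement of documents across many small, independent stores. The following Python sketch is our illustration of that general idea, not HP's actual RISS implementation; capacity and file-creation load scale by adding cells rather than by buying a bigger box:

```python
# Minimal sketch of distributing documents across a grid of storage cells.
# Hash-based placement illustrates the general approach; it is not HP's
# actual RISS algorithm.
import hashlib

class StorageGrid:
    def __init__(self, cell_count: int):
        self.cells = [dict() for _ in range(cell_count)]   # each dict stands in for one cell

    def _cell_for(self, doc_id: str) -> int:
        digest = hashlib.sha1(doc_id.encode()).hexdigest()
        return int(digest, 16) % len(self.cells)

    def put(self, doc_id: str, content: bytes) -> None:
        self.cells[self._cell_for(doc_id)][doc_id] = content

    def get(self, doc_id: str) -> bytes:
        return self.cells[self._cell_for(doc_id)][doc_id]

grid = StorageGrid(cell_count=8)
for i in range(10_000):
    grid.put(f"mail-{i}", b"archived message body")

print([len(cell) for cell in grid.cells])   # documents spread roughly evenly across cells
```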

Products from smaller firms shouldn't be overlooked. For example, Boulder, Colo.-based Cluster File Systems' Lustre file system is used by firms such as HP and SGI, while Panasas, of Fremont, Calif., sells storage clusters with throughput as high as 10 GBps and 300,000 I/O operations per second.

"When you have a computer cluster with 10,000 files created, you might have 1,000 storage servers attached," says Freund. "A modular approach makes a better approach than big iron."

Out of the Laboratory

As any horror-film buff knows, creatures developed in the lab always manage to escape, and Beowulf is no exception. Like a movie monster, it mutated into a variety of forms and infiltrated society. Beowulf has already taken over the supercomputer market and is now going for complete domination. The list of companies currently manufacturing or supporting clustered architectures includes most of the major hardware and software vendors. IBM, Sun, HP, and Dell all offer server clusters with associated software, wiring, and networking.

"HP, IBM, and Dell are focused on providing a solution in a box or a rack," says IDC's Humphreys. "Drop in the racks and you are off and running."

Initially, clusters, despite their adoption of off-the-shelf hardware, were difficult to manage, thus limiting their usefulness and market penetration. Now, the clustering technology itself has reached near plug-and-play functionality in some cases and no longer requires a team of experts to operate.

Although clusters generally run on Linux, even Microsoft is getting into the picture with its 64-bit Windows Compute Cluster Server 2003, currently in a Beta 2 release. A number of hardware and software vendors have based their business models on clustering: Myricom, of Arcadia, Calif., which makes Myrinet and 10-Gigabit Ethernet cluster interconnects, and Linux Networx of Bluffdale, Utah, which builds supercomputing clusters, are two examples.

"The picture is decidedly fragmented," says Nist. "There are a number of open source cluster management projects that came out of the academic high performance computing space: many ISV and captive general server management software suites, a few commercial vendors focused on grid architecture, and a very small number of commercial cluster management players, such as Penguin Computing and Scyld Software, that offer commercial-grade solution stacks."

As clustering technology develops, it is reaching into new areas.

Penguin Computing, for example, recently released the Penguin Personal Cluster, which packs 6 to 24 CPUs delivering up to 200 Gflops into a workstation form factor and is managed as simply as a workstation.

"Because clusters are going mainstream, Penguin Computing has created a high performance Linux cluster to-go," says Nist. "It delivers powerful, scalable, and easy-to-use supercomputing resources in any place that customers need including project teams, departmental computing, and highly productive individual contributors."
