In the first week of September last year, I explained "Why You Need a Fuller Data Center" and challenged you to take on some data center minimizing activities. One of those activities was workload (server) consolidation. If workload consolidation wasn't on your 2009 to-do list, I'd bet money that it's on there this year. Workload consolidation is one way to decrease the number of servers that draw power while delivering diminishing returns relative to what they cost to maintain.
Workload consolidation involves analyzing system performance and combining the workloads of underutilized systems to create a more efficient data center. For example, if you have 10 web servers all humming along at 15 to 20 percent CPU and memory utilization on each system, from a practical standpoint, those systems are idle. Their low impact workloads present you with the opportunity to combine them into a pair of highly available systems whose average utilization will hover in the 65 to 75 percent range. Peak utilization might reach 95 percent at times, but the average utilization range is a comfortable target for which to aim.
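The arithmetic behind that example can be sketched in a few lines of Python. This is a rough estimate only: it assumes all ten servers have equal capacity, uses 17.5 percent as the midpoint of the 15 to 20 percent range, and takes 70 percent as the target average. The function name and the "old-server equivalents" unit are illustrative, not from any tool.

```python
def consolidation_estimate(utilizations, target_avg=0.70):
    """Return (total_demand, capacity_needed) in units of one old server.

    utilizations: average utilization of each existing server (0.0 to 1.0).
    target_avg: desired average utilization of the consolidated system.
    """
    total_demand = sum(utilizations)
    capacity_needed = total_demand / target_avg
    return total_demand, capacity_needed


# Ten web servers averaging 15 to 20 percent; 17.5 percent is the midpoint.
demand, capacity = consolidation_estimate([0.175] * 10)
print(f"Total demand: {demand:.2f} old-server equivalents")
print(f"Capacity needed at 70% average: {capacity:.2f} old-server equivalents")
```

If the pair is sized so that either node can carry the full load alone during a failover (an assumption about how the pair is run), each node needs roughly the capacity figure printed above.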
This week, I present five steps to a more efficient data center through workload consolidation.
Step 1: Collect Data
The first step in this process is to collect system performance data. This step is likely to take the longest. You don't want performance snapshots but rather a full picture of performance trends. You must gather enough data so that you can see hourly trends, day-of-week trends and even monthly trends. A year's worth of data is a reasonable amount to gather. If you already have this data, then you're ahead of the game and you may proceed to the next step: Data Analysis.
If you haven't gathered system performance data, you must engage your staff to do so. There are numerous tools available for gathering this data, but the free, open source product called Orca is a good example of the kind of data collector you need for this activity. If you can't wait a full year, collect data for at least two weeks before attempting any data analysis in the next step.
Step 2: Data Analysis
After you've collected enough data, it's time to analyze that data. This analysis is the basis for the workload consolidation plan you'll create in Step 3. Fortunately, tools like Orca have strong visual and numeric components. The hourly, daily, weekly, monthly, quarterly and yearly graphs offer great insight into your systems' performance at a glance. You will not need your calculator to visualize positive or negative trends.
You'll need to focus on four main performance areas: CPU, memory, network and disk. For CPU usage, you must check the amounts of CPU used by user and system processes. Idle tells you how quiet the CPU is for a given period. Memory statistics vary widely among operating systems, so much so that it's difficult to generalize. Pages in and Pages out, plus the amount of Swap in and Swap out, provide a very quick view of overall memory performance. Network statistics show the amount of data coming into and leaving the system as input and output. For most systems, disk performance is far more important than disk space. You can always add more disk to increase free space. Disk reads and writes per second, especially for disk-intensive applications (databases), can tip you off as to when you must buy faster disks, change your RAID strategy or move to SAN-based storage for increased performance.
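If you also want the numbers behind the graphs, the hourly and day-of-week views can be approximated with a small script. The sketch below assumes you have exported samples as (timestamp, CPU utilization percent) pairs; the sample values and the `trend_by` helper are hypothetical and are not part of Orca.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def trend_by(samples, key):
    """Average utilization grouped by a key derived from each timestamp."""
    buckets = defaultdict(list)
    for timestamp, cpu_percent in samples:
        buckets[key(timestamp)].append(cpu_percent)
    return {k: mean(v) for k, v in sorted(buckets.items())}


# Hypothetical samples: (timestamp, CPU utilization percent).
samples = [
    (datetime(2010, 1, 4, 9, 0), 22.0),   # Monday 09:00
    (datetime(2010, 1, 4, 9, 30), 18.0),  # Monday 09:30
    (datetime(2010, 1, 4, 14, 0), 35.0),  # Monday 14:00
    (datetime(2010, 1, 5, 9, 0), 20.0),   # Tuesday 09:00
]

hourly = trend_by(samples, lambda t: t.hour)            # keyed by hour of day
by_weekday = trend_by(samples, lambda t: t.weekday())   # 0 = Monday

print("By hour:", hourly)
print("By weekday:", by_weekday)
```

Grouping by `t.month` in the same way gives the monthly view once enough data has accumulated.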
- CPU User, System and Idle
- Memory Pages in/out and Swap in/out
- Network Input and Output (bits/sec)
- Disk Reads and Writes per second
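As an illustration of where the CPU User, System and Idle figures come from, the sketch below derives them from two snapshots of the aggregate `cpu` line in `/proc/stat`. It assumes a Linux system; other operating systems expose these counters differently. The function names and the snapshot values are made up for the example, and folding "nice" time into "user" is a simplification.

```python
def parse_cpu_line(stat_text):
    """Return the aggregate CPU counters from /proc/stat content as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(x) for x in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percentages(before, after):
    """Percent of time spent in user, system and idle between two snapshots."""
    old, new = parse_cpu_line(before), parse_cpu_line(after)
    delta = [b - a for a, b in zip(old, new)]
    # Fields: user nice system idle iowait irq softirq steal [guest guest_nice].
    # Guest time is already included in user time, so sum only the first eight.
    total = sum(delta[:8])
    if total <= 0:
        raise ValueError("snapshots are identical or out of order")
    parts = {"user": delta[0] + delta[1], "system": delta[2], "idle": delta[3]}
    return {name: 100.0 * value / total for name, value in parts.items()}


# Made-up snapshots taken some seconds apart.
before = "cpu  1000 0 500 8000 500 0 0 0 0 0\n"
after = "cpu  1300 0 600 8500 600 0 0 0 0 0\n"
usage = cpu_percentages(before, after)
print(usage)
```

On a live Linux host, the two snapshots would come from reading `/proc/stat` twice with a pause in between. The three values do not sum to 100 here because I/O wait and interrupt time are reported separately.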
Systems that average between 0 and 50 percent utilization are candidates for consolidation. Those that run in the 50 to 70 percent range can move to a consolidated system if the other workloads prove compatible. Compatible workloads are workloads that complement each other. For example, a CPU-intensive workload may pair well with one that is heavy on disk reads and writes. A web server and a database server consolidated onto a single system often have this kind of relationship.
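A minimal sketch of that triage follows. It reads the thresholds as average utilization, which is consistent with the 15 to 20 percent example earlier, and it treats two workloads as complementary when they stress different resources. The server names, the numbers and the `dominant` field are hypothetical.

```python
def classify(avg_utilization):
    """Bucket a system by its average utilization percent."""
    if avg_utilization <= 50:
        return "candidate"
    if avg_utilization <= 70:
        return "candidate if workloads are compatible"
    return "leave in place"


def complementary(a, b):
    """True when two workloads stress different resources."""
    return a["dominant"] != b["dominant"]


# Hypothetical inventory: average utilization and the most-stressed resource.
servers = {
    "web01": {"avg": 18, "dominant": "cpu"},
    "db01": {"avg": 55, "dominant": "disk"},
    "render01": {"avg": 85, "dominant": "cpu"},
}

for name, info in servers.items():
    print(f"{name}: {classify(info['avg'])}")

print("web01 + db01 complementary:", complementary(servers["web01"], servers["db01"]))
```

In practice, the dominant resource would come out of the Step 2 analysis rather than being typed in by hand.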
Step 3: Draft a Plan
Your data analysis should identify systems whose workloads are appropriate for consolidation. Systems can run multiple instances of web services, multiple databases and multiple network services as well as provide storage for multiple users. There are two basic rules of workload consolidation with which you're familiar from the previous section: Group similar workloads and group complementary workloads onto your consolidated systems.
Your plan should include the consolidation of similar workloads by combining as many idle workloads as possible onto a single system. By single system, I mean a pair of twin systems connected together for the purpose of failover. A literal single system is a single point of failure, and you should avoid that. Your plan should include the combination of complementary workloads (web/database) to fully use the new system.
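One way to draft the grouping on paper is a simple packing pass. The sketch below uses first-fit decreasing, a common bin-packing heuristic chosen here for illustration rather than anything this column prescribes. It assumes each workload's demand is expressed as a percentage of one consolidated system's capacity and that 70 percent is the target average. Workload names and figures are hypothetical, and each "system" in the output stands for a failover pair.

```python
def plan_consolidation(workloads, target=70.0):
    """Assign workloads to consolidated systems using first-fit decreasing."""
    systems = []  # each entry: {"load": float, "workloads": [names]}
    largest_first = sorted(workloads.items(), key=lambda kv: kv[1], reverse=True)
    for name, demand in largest_first:
        for system in systems:
            if system["load"] + demand <= target:
                system["load"] += demand
                system["workloads"].append(name)
                break
        else:
            # No existing system has room, so start a new one.
            systems.append({"load": demand, "workloads": [name]})
    return systems


# Hypothetical demands, as percent of one consolidated system's capacity.
workloads = {"web1": 8.0, "web2": 7.0, "web3": 6.0,
             "db1": 40.0, "mail": 25.0, "files": 30.0}

plan = plan_consolidation(workloads)
for number, system in enumerate(plan, start=1):
    print(f"System {number}: {system['load']:.0f}% -> {', '.join(system['workloads'])}")
```

This pass only balances a single utilization number. Checking that the grouped workloads are similar or complementary still has to be done against the per-resource data.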
Step 4: Implement the Plan
Copy existing workloads onto the consolidated systems and run them in parallel with your existing systems prior to cutover so you can work out any flaws in your plan. After the cutover, you should keep your existing systems in place for a comfortable period in case you must back out of a consolidation move. It's rare, but it does happen. You'll assess the success of your plan in Step 5.
Step 5: Assess the Results
Install performance collection software on the consolidated systems to assess the results of your efforts. You should expect these consolidated systems to run "hot" compared to your old underutilized ones. Acceptable utilization numbers might exceed 80 percent and peak near 100 percent on consolidated systems. Keep an eye on wait states and excessive swap usage, which can indicate performance problems. If you encounter performance problems after cutover, spread out or reconfigure your workloads. Re-analyze your data to make sure the numbers didn't deceive you.
Workload consolidation is your first step toward a more efficient data center and one that's less power needy and maintenance heavy. This exercise might also put off your conversion to a virtualized infrastructure for another year or more. Decreasing the number of systems running your workloads is a good thing for you and your technology-drained budget. I've given you the first five steps on your thousand-mile journey to data center efficiency. Only 995 more to go.
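A post-cutover check along those lines might look like the sketch below. The metric names and the thresholds for I/O wait and swap activity are illustrative assumptions, not recommended values. High CPU utilization alone is deliberately not flagged, since consolidated systems are expected to run hot.

```python
def assess(metrics, max_iowait=20.0, max_swap_out=10.0):
    """Return warnings for one consolidated system's averaged metrics.

    The thresholds are illustrative assumptions, not recommendations.
    """
    warnings = []
    if metrics["iowait_percent"] > max_iowait:
        warnings.append("high I/O wait")
    if metrics["swap_out_per_sec"] > max_swap_out:
        warnings.append("heavy swapping")
    return warnings


# Hypothetical post-cutover averages for two consolidated systems.
results = {
    "pair-a": {"cpu_percent": 82.0, "iowait_percent": 5.0, "swap_out_per_sec": 0.0},
    "pair-b": {"cpu_percent": 88.0, "iowait_percent": 31.0, "swap_out_per_sec": 45.0},
}

for name, metrics in results.items():
    problems = assess(metrics)
    status = ", ".join(problems) if problems else "ok"
    print(f"{name}: {metrics['cpu_percent']:.0f}% CPU, {status}")
```

A system flagged this way is a candidate for having one of its workloads moved elsewhere and the data re-analyzed.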
Ken Hess is a freelance writer who writes on a variety of open source topics including Linux, databases, and virtualization. He is also the coauthor of Practical Virtualization Solutions, which is scheduled for publication in October 2009. You may reach him through his web site at http://www.kenhess.com.