One of the greatest concerns today is the high availability of your IT systems.
Downtime can be caused by hardware failure, software errors and much more, but
industry studies show that 80 percent of system failures can be traced to human
error or flawed processes. I think that everyone knows someone who has lost
vital information because they forgot to make a backup. This is a classic example
of the kind of problem a rigorous IT operations environment can help avoid.
The W2K Advanced Server and W2K Datacenter Server are not the first products
that Microsoft has brought to market for high availability: NT 4.0 Server Enterprise
Edition already supported a two-node failover scenario.
Microsoft claims that its new products should deliver an uptime of 99.999 percent, which means roughly 5 minutes of downtime per year. In this article I argue that high availability requires not only good products, but also an organization that is ready for it.
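The arithmetic behind that figure is easy to check. The following is a minimal sketch in Python; the function name and the 365-day year are assumptions made for illustration and are not taken from any Microsoft documentation.

```python
# Assumption: a non-leap year of 365 days.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the downtime per year, in minutes, implied by an availability percentage."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


# Compare a few common availability levels.
for pct in (99.0, 99.9, 99.99, 99.999):
    print(f"{pct}% uptime allows {downtime_minutes_per_year(pct):.2f} minutes of downtime per year")
```

For 99.999 percent this works out to about 5.26 minutes per year, which matches the "5 minutes" claim. By comparison, 99.9 percent already allows more than eight hours of downtime per year.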
What about high availability:
Clearly, a good operating system and the proper hardware are a good start toward better uptime, but what about the following points:
One of the best things you can do is to build the services of your IT department on standardized best practices. At the moment, one of the leading sets of best practices for IT is documented in the Central Computer and Telecommunications Agency's (CCTA) IT Infrastructure Library (ITIL). ITIL is not simply a book or a tool; it is a set of practices that describe procedures. It is also a mindset, so implementing ITIL is not something you do on a rainy afternoon: it takes time before people have adopted the proper mindset. Do not underestimate an ITIL implementation.
Certification pays off. Before you give somebody a car, you make sure that he or she knows how to drive it. The same goes for IT: you can give your employees the best software or hardware solutions, but if they do not know how to work with them, even the best solution is worth nothing. Make sure that your employees have the proper training to work with the systems, e.g. MCSE, CCNA or CNE training.
It is not only the administrators and engineers who need to be trained, but also the management. Managers need training so that they understand the problems of their administrators and engineers, not in detail, but in broad terms. They also need to know how to follow the procedures, so that they can set clear guidelines for the customers of the IT department and for the employees of the department.
You can have the best operating system and hardware, but if your infrastructure has a single point of failure, they are worthless. So make sure that you take inventory of your entire infrastructure when you are thinking about high availability. Use redundant router and switch solutions. Create backup routes in your WAN or to the Internet. It is not simple, but it is worth doing.
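The effect of a single point of failure can be illustrated with the standard availability formulas for components in series and in parallel. The sketch below is in Python. The function names and all availability figures are hypothetical, and the formulas assume that components fail independently of each other, which real infrastructure does not always satisfy.

```python
from math import prod


def series_availability(parts):
    """Availability of a chain in which every component must be up."""
    return prod(parts)


def redundant_availability(parts):
    """Availability of a redundant group in which one working component is enough."""
    return 1 - prod(1 - a for a in parts)


# Hypothetical figures: a 99.999% server behind a 99% router.
server = 0.99999
router = 0.99

# A single router is a single point of failure in front of the server.
single_path = series_availability([server, router])

# Two redundant routers, either of which can carry the traffic.
router_pair = redundant_availability([router, router])
redundant_path = series_availability([server, router_pair])

print(f"single router:     {single_path:.5%}")
print(f"redundant routers: {redundant_path:.5%}")
```

With a single router, the chain as a whole cannot be more available than the router itself, however good the server is. Adding a second router brings the overall figure much closer to that of the server.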
The physical environment is one of the things people often forget when talking and thinking about high availability. How often do we see that anybody can enter the server room without any authentication, or that the room has no air conditioning? Another thing to consider is the geographical location of your building (earthquakes, floods, fire), so make sure that you have a good overall backup solution. This is also part of the ITIL set.
In this article, which is not a very technical one, I have pointed out some of the other things that matter for high availability.