In the previous article in our series on Windows 2003 Server High Availability solutions, we reviewed the SCSI architecture, which has been providing shared storage capabilities since the earliest Microsoft server cluster deployments. Although this approach is still available and frequently used in lower-end, two-node clustering implementations (a higher number of nodes is not supported on this hardware platform), its popularity has declined, in part due to the introduction of other, considerably more efficient, flexible, stable, scalable and secure options.
The undisputed lead in this area now belongs to Fibre Channel storage area network (FC SAN) solutions, although iSCSI and Network Attached Storage are quickly catching up. FC SANs are the subject of this article.
FC SANs represent a considerable shift from the directly attached storage paradigm. They offer significant functionality and performance improvements. The basic idea is to use a network infrastructure for connecting servers to their disks, allowing physical separation of the two by far greater distances than was previously possible. But there are also other, equally important advantages of this separation. Managing storage in larger environments no longer requires dealing with each individual system, as was the case with directly attached models. Disks are grouped together, simplifying their administration (e.g., monitoring, backups, restores, provisioning and expansion) and making it more efficient, through such inventions as LAN-free or server-free backups and restores, or booting from a SAN.
In addition, since a large number of servers and storage devices can participate in the same SAN, it is possible to attach new ones as needed, making allocation of additional space a fairly easy task. This is especially true when comparing the SAN with a SCSI-based setup, where the limited number of internal or external connectors and the adjacent physical space available must be taken into account. Expansion is further simplified by the DISKPART.EXE Windows 2003 Server utility, which is capable of dynamically extending basic and dynamic volumes, as explained in Microsoft Knowledge Base Article Q325590.
Fibre Channel SAN technology leverages the SCSI-3 specifications for communication between hosts and target devices, since its implementation is based on the SCSI command set. The commands themselves, however, are transmitted using the FC transport protocol. Transmission is serial, typically over fiber optic cabling (although copper-based media are allowed), which eliminates the distance limitations inherent to parallel SCSI.
Note, however, that the term "network" should not be interpreted in the traditional sense, since SANs do not offer routing capabilities, primarily because they are intended for high-speed, low-latency communication. SANs also use a distinct end node identification mechanism, which does not rely on the Media Access Control (MAC) addresses associated with each network adapter but instead employs 64-bit World Wide Names (WWNs), usually expressed as eight pairs of hexadecimal characters and burned into Fibre Channel host bus adapters (HBAs) by their manufacturers. FC interconnecting devices handle dynamic address allocation on the fabric level. In addition, unlike the majority of IP-based networks, FC SANs have a primarily asymmetric character, with active servers on one end connecting mostly to passive devices, such as disk arrays or tape drives, on the other, arranged in one of the following topologies:
- Point-to-point links a single host and its storage device directly via a fiber connection. This type of configuration is the simplest and least expensive to deploy and manage, but it lacks the flexibility and expandability of the other two, since it is conceptually equivalent to SCSI-based directly attached disks. It is, therefore, rarely implemented.
- Shared, also known as Fibre Channel Arbitrated Loop (FC-AL), takes the shape of a logical ring (physically wired as a star), with an FC hub or a loop switch serving as the interconnecting device. The design is similar to the Token Ring architecture, and the similarity is also apparent when it comes to arbitration of loop usage.
Since FC-AL devices share the same media, whenever one of them needs to communicate with another, it must send an arbitration packet around the loop. Once the packet returns to the sender, it signals that exclusive loop access can be granted. Should conflicts occur when multiple devices attempt to communicate at the same time, the one with the lowest address wins. Addresses, which differentiate among all nodes participating in the loop, can be hard-coded or assigned dynamically, a capability that the majority of loop switches provide. Although dynamic allocation simplifies configuration in multi-node scenarios, it might also cause instability when devices are restarted or new ones are added, since such events trigger loop reinitialization and node readdressing.
Although considerably less expensive than their switch-based counterparts, FC-ALs are not as efficient. Access to the loop is shared across all interconnected devices, which allows only two of them to communicate at any given time. They are also not as scalable and support fewer nodes (126 is the upper limit). As with SCSI-based shared storage, FC-AL-based Windows 2003 Server clusters are limited to two nodes. In addition, Microsoft recommends dedicating an arbitrated loop to a single cluster, rather than sharing it with other clusters or non-clustered devices. Larger or shared implementations require a switched configuration.
- Switched, referred to as Switched Fibre Channel Fabric (FC-SW), uses FC switches as interconnecting devices. This topology addresses the efficiency limitations of the loop configuration by allowing simultaneous, dedicated paths at full wire speed between any two Fibre-attached nodes. This is based on the same principle as traditional LAN switching. Scalability is greatly increased due to a hierarchical, fully redundant architecture. It consists of up to three layers: the core, employing the highest-speed and highest-port-density switches; distribution, relying on midrange hardware; and access, characterized by low-end switches, arbitrated loops, and point-to-point connections.
Switches keep track of all fabric-attached devices, including other switches in federated and cascaded configurations, using 3-byte identifiers. This sets the theoretical limit at roughly 16 million unique addresses. Stability is improved as well, since restarts and new connections are handled gracefully, without changes to the already established addressing scheme and without a negative impact on the status of the fabric. This is partially because of the introduction of less disruptive, targeted LUN and SCSI ID resets, which are attempted first, before resorting to the bus-wide SCSI Reset command. The bus-wide reset was the only available option in Windows 2000 Server cluster implementations. Keep in mind, however, that the availability of this feature depends on the vendor-developed, HBA-specific miniport driver, which must be written specifically to interact with the Microsoft-provided StorPort port driver. StorPort is new in Windows 2003 Server and, unlike the legacy SCSIPort driver, is designed specifically to take advantage of the performance-enhancing capabilities of FC adapters.
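The two addressing rules described above (lowest address wins loop arbitration, and 3-byte fabric identifiers yield roughly 16 million addresses) can be checked with a few lines of Python. This is only an illustrative sketch: the function names are invented for this example, and it models the arithmetic, not the actual Fibre Channel protocol.

```python
from __future__ import annotations

# 3-byte fabric identifiers, as stated in the text.
FABRIC_ADDRESS_BITS = 3 * 8


def fabric_address_space() -> int:
    """Number of distinct values a 3-byte identifier can hold."""
    return 2 ** FABRIC_ADDRESS_BITS


def split_fabric_address(address: int) -> tuple[int, int, int]:
    """Split a 24-bit address into its three bytes, high byte first."""
    if not 0 <= address < fabric_address_space():
        raise ValueError("address does not fit in 3 bytes")
    return (address >> 16) & 0xFF, (address >> 8) & 0xFF, address & 0xFF


def arbitration_winner(contenders: list[int]) -> int:
    """Loop arbitration as described in the text: the lowest address wins."""
    if not contenders:
        raise ValueError("no device is arbitrating")
    return min(contenders)


if __name__ == "__main__":
    print(fabric_address_space())                   # 16777216
    print(split_fabric_address(0x010203))           # (1, 2, 3)
    print(arbitration_winner([0x23, 0x01, 0xE8]))   # 1
```

The first result, 16,777,216, is the "roughly 16 million" figure quoted above.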
The increased performance, flexibility and reliability of switched implementations come with their own set of drawbacks. Besides considerably higher cost (compared to arbitrated loops) and interoperability issues across components from different vendors, one of the most significant is the increased complexity of configuration and management. In particular, it is frequently necessary to provide an appropriate degree of isolation between the multiple hosts connected to the same fabric and the shared devices with which they are supposed to interact.
As mentioned earlier, this exclusive access is required to avoid data corruption, which is bound to happen with unarbitrated, simultaneous writes to the same disk volume. In general, three mechanisms deliver this functionality: zoning, LUN masking (known also as selective presentation), and multipath configurations.
Zoning can be compared to Virtual LANs (VLANs) in traditional networks, since it defines logical boundaries (known in SAN terminology as zones) that encompass arbitrarily designated switch ports. Zone definitions in clustered deployments are typically stored and enforced by the switch port ASIC (Application-Specific Integrated Circuit) firmware, with communication permitted only between nodes attached to switch ports that belong to the same zone. Zones can also be implemented by referencing the WWNs of host bus adapters. In addition to preventing accidental data corruption, zoning offers an additional level of security by protecting servers from unauthorized access. In clustered configurations, cluster nodes, along with the shared disks that constitute clustered resources, should belong to the same zone.
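To make the zoning rule concrete, here is a minimal Python sketch of port-based zoning, together with a helper that renders a 64-bit WWN as eight pairs of hexadecimal characters. The zone names, port numbers, sample WWN value and colon separator are all assumptions made for illustration. Real zone tables live in switch firmware and are managed with vendor tools.

```python
from __future__ import annotations


def format_wwn(value: int) -> str:
    """Render a 64-bit WWN as eight hex pairs (colon separator is assumed)."""
    if not 0 <= value < 2 ** 64:
        raise ValueError("a WWN must fit in 64 bits")
    raw = f"{value:016x}"
    return ":".join(raw[i:i + 2] for i in range(0, 16, 2))


# Hypothetical port-based zone table: zone name -> set of switch ports.
ZONES = {
    "cluster_a": {1, 2, 5},  # two cluster nodes plus their shared storage port
    "backup": {3, 7},        # a backup server and a tape library
}


def can_communicate(port_a: int, port_b: int,
                    zones: dict[str, set[int]] = ZONES) -> bool:
    """Two ports may talk only if at least one zone contains both of them."""
    return any(port_a in members and port_b in members
               for members in zones.values())


if __name__ == "__main__":
    print(format_wwn(0x10000000C9123456))  # 10:00:00:00:c9:12:34:56
    print(can_communicate(1, 5))           # True: same zone
    print(can_communicate(1, 3))           # False: no common zone
```

A WWN-based zone would follow the same membership test, with WWN strings in place of port numbers.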
LUN masking makes it possible to limit access to individual, arbitrarily selected LUNs within a shared storage device (LUN stands for Logical Unit Number and describes a logical disk defined in an FC SAN). Such functionality is typically required in configurations involving large multidisk systems, where port-level zoning does not offer sufficient granularity. LUN masking also provides the necessary isolation in cases of overlapping zones, where hosts or storage devices belong to more than one zone. The relevant configuration is performed and stored at the storage controller level.
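Conceptually, a LUN mask is a lookup table held by the storage controller that maps each initiator to the LUNs it is allowed to see. The following Python sketch illustrates that idea only. The table contents, WWN values and function name are hypothetical and do not reflect any particular vendor's implementation.

```python
from __future__ import annotations

# Hypothetical masking table kept by a storage controller:
# initiator WWN -> LUNs that initiator is allowed to see.
LUN_MASKS = {
    "10:00:00:00:c9:12:34:56": {0, 1},  # cluster node 1
    "10:00:00:00:c9:12:34:57": {0, 1},  # cluster node 2 (same shared LUNs)
    "10:00:00:00:c9:ab:cd:ef": {2},     # unrelated server on the same port
}


def visible_luns(initiator_wwn: str, all_luns,
                 masks: dict[str, set[int]] = LUN_MASKS) -> list[int]:
    """Return the LUNs presented to an initiator; unknown initiators see none."""
    allowed = masks.get(initiator_wwn.lower(), set())
    return sorted(lun for lun in all_luns if lun in allowed)


if __name__ == "__main__":
    luns_on_array = range(4)  # LUNs 0..3 exist on the array
    print(visible_luns("10:00:00:00:C9:12:34:56", luns_on_array))  # [0, 1]
    print(visible_luns("10:00:00:00:c9:ab:cd:ef", luns_on_array))  # [2]
    print(visible_luns("20:00:00:00:00:00:00:01", luns_on_array))  # []
```

Both cluster nodes receive the same LUN set, while a host sharing the same storage port sees only its own disk. This is the granularity that port-level zoning alone cannot provide.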
Multipath technology is the direct result of the drive for full redundancy in a SAN environment. Such redundancy is available on the storage side (through fault-tolerant disk configurations and dual controllers with their own dedicated battery-backed caches and power supplies) and on the server side (through server clustering, with each of the member servers featuring dual, hot-swappable components). It is reasonable to expect the same when it comes to SAN connectivity.
Unfortunately, the solution is not as simple as installing two FC host bus adapters (HBAs) and connecting them to two, redundant switches, each of which, in turn, attaches to separate FC connections on the storage controller. This is because without additional provisions, Windows would detect two distinct I/O buses and separately enumerate devices connected to each (resulting in a duplicate set of drives presented to the operating system), which could potentially lead to data corruption. To resolve this issue, Microsoft Windows 2003 Server includes native support for Multipath I/O, which makes it possible to connect dual HBAs to the same target storage device with support for failover, failback and load balancing functionality.
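The duplicate-drive problem and its resolution can be sketched in Python. The sketch assumes each path reports some unique identifier for the LUN behind it, so that paths with the same identifier are grouped into one logical disk and a healthy path is chosen for I/O. The `Path` class, the `device_id` field and both functions are hypothetical. This is not how the Windows Multipath I/O driver stack is implemented, only an outline of the concept.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    hba: str            # adapter the path goes through
    device_id: str      # hypothetical unique identifier reported by the LUN
    healthy: bool = True


def group_paths(paths: list[Path]) -> dict[str, list[Path]]:
    """Collapse paths that report the same device ID into one logical disk."""
    disks: dict[str, list[Path]] = {}
    for path in paths:
        disks.setdefault(path.device_id, []).append(path)
    return disks


def pick_path(paths: list[Path]) -> Path | None:
    """Failover: return the first healthy path, or None if every path failed."""
    for path in paths:
        if path.healthy:
            return path
    return None


if __name__ == "__main__":
    seen = [
        Path("hba0", "lun-A"), Path("hba1", "lun-A"),
        Path("hba0", "lun-B", healthy=False), Path("hba1", "lun-B"),
    ]
    disks = group_paths(seen)
    print(len(seen), "paths ->", len(disks), "logical disks")
    print(pick_path(disks["lun-B"]).hba)  # hba1, since hba0's path is down
```

Without the grouping step, the four paths above would surface as four drives. With it, the operating system sees two disks, each with a spare path.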
Each Windows 2003 Server cluster must belong to a dedicated zone, to eliminate the potential adverse effect that the disk access protection mechanism included in the clustering software could have on other devices. This does not apply, however, to storage controllers, which can be shared across multiple zones, as long as they are included on the Cluster/Multi-Cluster Device HCL. In addition, you should avoid collocating disk and tape devices in the same zone, as SCSI bus reset commands can interfere with normal tape operations.
Remember, the rule regarding consistent hardware and software setup across all cluster nodes extends to SAN connections, including host bus adapter models, their firmware revision levels and driver versions.
You should also ensure that automatic mounting of basic disk volumes is disabled. (This does not apply to volumes residing on dynamic disks or removable media, which are always automatically mounted.) Earlier versions of Windows would spontaneously mount every newly detected volume. In a SAN environment, this could create a problem if zoning or LUN masking was misconfigured or if prospective cluster nodes had access to the shared LUNs prior to installation of the clustering software. In Windows 2003 Server, this feature is configurable and disabled by default. It can be controlled by running the MOUNTVOL command or by using the AUTOMOUNT option of the DISKPART utility.