Layer 2 switching over the WAN or the metro network, whether it is a native Ethernet frame format or a Layer 2 over TCP/IP over any type of transport, should not add any latency to that imposed by the physical distance between sites (mostly dictated by the speed of light). Switching technologies used to extend the Layer 2 over Layer 3 (L2oL3)and obviously the native Layer 2 protocol must be computed by the hardware (ASIC) to achieve line rate transport. The type of LAN extension and the choice of distance are imposed by the maximum latency supported by HA cluster framework and the virtual mobility.
It is crucial to consider whether the number of sites to be connected is two or more than two. Technologies used to interconnect two data centers in a back-to-back or point-to-point fashion are often simpler to deploy and to maintain. These technologies usually differ from more complex multipoint solutions that provide interconnections for multiple sites.
The drawback to simple point-to-point technology is reduced scalability and flexibility. This is a serious disadvantage for cloud computing applications, in which supporting an increasing number of resources in geographically distributed remote sites is critical.
Therefore, enterprises and service providers should consider possible future expansion of cloud networking and the need to operate without disruption when considering Layer 2 Extension technologies. Solutions for interconnecting multiple sites should offer the same simplicity as interconnecting two sites, with transparent impact for the entire site. Dynamically adding or removing one or several resource sites in an autonomous fashion is often referenced as “Point to Cloud DCI”. The network manager should be able to seamlessly insert or remove a new data center or a segment7of a new data center in the cloud to provide additional compute resources without modifying the existing interconnections and regardless of the status of the other remote sites.
Whatever DCI technology solution is chosen to extend the Layer 2 VLANs between remote sites, the network transport over the WAN must also provide secure ingress and egress access into those data centers.
When extending Layer 2, a number of rules must be applied to improve the reliability and effectiveness of distributed cloud networking:
- The spanning tree domain should not be extended beyond a local data center, although all the links and switches that provide Layer 2 extension must be fully redundant.
- The broadcast traffic must be controlled and limited by the edge devices to avoid the risk of polluting remote sites and should not have any performance impact.
- All existing paths between the data centers must be forwarded and intra-cloud networking traffic should be optimized to better control bandwidth and latency.
- Some workflows are more sensitive than others. Therefore, when possible, diverse path should be enabled for some services such as heartbeat, used by clusters, or for specific applications such as management or monitoring.
- The Layer 3 services may not be able to natively locate the final physical position of a VM that has migrated from one host to another. This normal behavior of routed traffic may not be efficient when the Layer 2 network (Broadcast Domain) is extended over a long distance and hosts are spread over different locations. Therefore, the traffic to and from the default gateway must be controlled and restricted onto each local data center when appropriated. Similarly, the incoming traffic should be redirected dynamically on the physical site where virtualized applications have been activated.
- For long-distance inter-site communication, mechanisms to protect the links must be enabled and rapid convergence algorithms must be provided to keep the transport as transparent as possible.
- The VLANs to be extended must have been previously identified by the server and network team. Extending all possible VLANs may consume excessive hardware resources and increase the risk of failures. However, the DCI solution for LAN extension must provide the flexibility to dynamically remove or add any elected VLAN on demand without disrupting production traffic.
- Multicast traffic should be optimized, especially in a cloud architecture made up of geographically dispersed resources.