Distributed Virtual Data Center

Some of the individuals posting to this site, including the moderators, work for Cisco.  Opinions expressed here and in any corresponding comments are the personal opinions of the original authors, not those of Cisco.

Dear readers,

My first recommendation, before going deeper into the different articles, is this: don’t extend the Layer 2 segment beyond your physical DC if you don’t need to. Keep in mind that, for better cost containment and easier IT operations, there are different solutions to geographically (cold) migrate an application while maintaining the same platform identifiers, without extending the LAN across long distances, e.g. LISP IP Mobility Across Subnets Mode. We also need to take the distances and the transport into consideration: a dedicated 3 km fiber link between two DCs does not necessarily suffer from the same challenges as a Layer 2 pseudowire or overlay deployed over long distances.
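To put the distance point into numbers, here is a rough sketch of the propagation delay over fiber. It assumes a refractive index of about 1.47 for the fiber (a typical value, not a figure from this post) and only counts propagation: equipment, serialization and queuing delays come on top.

```python
# Rough propagation delay over optical fiber.
# Assumption: refractive index of ~1.47 (typical single-mode fiber, not from the post).
SPEED_OF_LIGHT_KM_S = 299_792.458
FIBER_REFRACTIVE_INDEX = 1.47


def fiber_delay_us(distance_km: float) -> float:
    """Return the one-way propagation delay in microseconds."""
    speed_in_fiber_km_s = SPEED_OF_LIGHT_KM_S / FIBER_REFRACTIVE_INDEX
    return distance_km / speed_in_fiber_km_s * 1_000_000


# Compare a short dedicated link with longer hypothetical distances.
for km in (3, 100, 1000):
    one_way = fiber_delay_us(km)
    print(f"{km:>5} km: one-way {one_way:8.1f} us, round trip {2 * one_way:8.1f} us")
```

This works out to roughly 5 microseconds per kilometer one way, which is why a 3 km link behaves very differently from an extension over hundreds of kilometers.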

Distributed Virtual DC, aka DC Interconnect, is not only about extending Layer 2 to offer a solid and transparent interconnection for workload mobility (disaster avoidance). Layer 3 transport may also be required for IP mobility and business continuity (disaster recovery), as well as for intelligent IP localization services that optimize the path.

– Do I really need to extend my subnet outside my data center?

– Can Layer 3 help to support application mobility between sites without LAN extension?

– When and how can we optimize the traffic to and from outside the DC, or between application tiers, after a machine has moved?

In the following paper, I have tried to summarise most of the components and requirements imposed by interconnecting resources spread over long distances:

DCI or Inter-cloud Networking

Let’s go step by step through the different topics (IP Mobility, LAN Extension, SAN Extension and Path Optimization) to give you details on each technology and, above all, to let you comment based on your own experience. Feel free to ask any question you may have.

Having said that, please feel free to use this blog to share your experiences of interconnecting geographically dispersed resources between multiple data centers as well.

With the evolution of the DC fabric network, I am also going to add some articles on intra-DC fabric solutions and on how DCI solutions are evolving, and will evolve, to better support application resources spanned from site to site.

Keep cool and don’t be shy, it’s a friendly blog 🙂

Remember above all that, because the speed of light is greater than the speed of sound, some folks appear brilliant before they sound like idiots!…


13 Responses to Distributed Virtual Data Center

  1. Peter says:

    Hi Yves,
    It looks like the link (DCI or Inter-cloud Networking) on this page is broken
    Broken Hyperlink:
    Best Wishes,

  2. Peter says:

    Hi Yves,
    I wondered if you were planning a blog post on the subject of DCI specifically in the context of ACI.
    And thanks for providing a link to that white paper.

    • Yves says:

      Absolutely 🙂
      There are multiple options and scenarios that will be elaborated.
      I will post a couple of articles soon.
      Thank you, yves

      • Vincent says:

        Hello Yves,

        If you want to use ACI with 9336PQ as spines and 9396PX as leaves (with FEXes connected to these leaves) and you want to have ACI present on two different sites separated by an MPLS network, which design would you advise?
        – Stretched fabric with 3 APICs in total?
        – Dual-fabric with 6 APICs in total?
        –> I thought dual-fabric would fit because there will be an L3 separation between both DCs. But does it mean you need OTV to stretch the L2 VLANs across the MPLS network? Does it mean the additional use of N7K or ASR? Isn’t there a solution with only the N9K?

        Thank you in advance,


        • Yves says:

          Hi Vincent,

          Let me try to be very succinct; I will come back soon with a series of articles on ACI.

          Both options are valid, but the stretched fabric requires more caution and imposes some rules.
          As of today, the stretched fabric requires a partial-mesh design as explained in post 29 (http://yves-louis.com/DCI/wp-content/uploads/2015/06/transit-leaf-1.png), and you need 40GE point-to-point fiber from the spine to the leaf layer.
          How do we address this design with an MPLS core?
          We have tested and validated this scenario using EoMPLS port X-connect, which offers the speed adaptation (40GE <=> 10GE EoMPLS <=> 40GE) as well as the pseudowire link (EoMPLS). The APIC cluster is stretched across the two sites with its 3 members (2+1). Keep in mind that the latency between a spine node (Fabric A) and a leaf node (Fabric B) should not exceed 10 ms.

          For the dual-fabric option (2 independent APIC clusters), you can enable OTV (N7K or ASR1K) from a pair of vPC border leaf nodes at each site and initiate the overlay over the MPLS core. Alternatively, you can simulate a double-sided, back-to-back vPC between the vPC border leaf nodes of each site, using EoMPLS as the pseudowire.

          Kind regards, yves

  3. Dharmesh Patel says:

    Hi Yves,

    My customer is planning to deploy Active-Active Datacenters with workloads present on both DC-1 and DC-2. Some application work flows are both Inter-Site and Inter-VRFs (like host in VRF-A on DC-1 to VRF-B on DC-2). Customer requires all inter-VRF traffic to be handled via Firewalls. There are 2 pair of Firewalls at each site. So total 4 firewalls.

    Can you advise on what sort of firewall architecture the customer can choose with Active-Active DCs?
    I understand clustering is one solution, so that sessions are load-balanced across all FWs, but is there a way to handle the front-end bottleneck issue arising from clustering? My understanding is that clustering will have one FW acting as Primary or Active and advertising a VIP, and state data will be synced over the cluster link. In such deployments, when the FWs in DC-1 are designated or elected as Primary, all inter-VRF traffic from DC-2 will have to traverse the DCI to DC-1 and be handled by the Primary FW first.

    Note – The Multipod option is used for DCI.

    Any input is highly appreciated.


    • Yves says:

      Hi Dharmesh

      You mentioned Multipod. Can you please clarify which fabric solution the customer is planning to deploy, ACI or VXLAN EVPN?
      I don’t think there is one answer. It depends on the customer requirements, on the distances between the DCs and on the migration scenarios.

      IMO, as a very short answer, the best approach is certainly to confine hot live migration to intra-DC needs and to allow only warm/cold migration across sites. When migration is needed, migrate the whole (multi-tier) application using, for example, a VM-Host affinity policy. Each session then remains local, using the local FW (deploy an Act/Sby FW pair in each site).
      You can also leverage host route advertisement to steer the traffic to the location where the application is active, and use the local default gateway to exit from the same DC.
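As a minimal illustration of why a host route steers traffic to the site where the application is active, the sketch below applies longest-prefix matching to a tiny, hypothetical routing table. The prefixes and site names are invented for the example, and a real router does this in its forwarding table, not in Python.

```python
import ipaddress

# Hypothetical routes as seen from outside the data centers: prefix -> advertising site.
routes = {
    ipaddress.ip_network("10.1.1.0/24"): "DC-1",   # subnet advertised by its home site
    ipaddress.ip_network("10.1.1.20/32"): "DC-2",  # host route for a machine that moved
}


def next_site(destination: str) -> str:
    """Return the site chosen by longest-prefix match for the destination."""
    address = ipaddress.ip_address(destination)
    matches = [network for network in routes if address in network]
    if not matches:
        raise LookupError(f"no route to {destination}")
    # The most specific prefix (largest prefix length) wins.
    best = max(matches, key=lambda network: network.prefixlen)
    return routes[best]


print(next_site("10.1.1.20"))  # the moved machine follows its /32
print(next_site("10.1.1.30"))  # other hosts still follow the /24
```

Because the /32 is more specific than the /24, only the machine that moved is reached through the other site; the rest of the subnet is unaffected.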
      For inter-VRF traffic, you need to prevent asymmetric flows using IGP Assist (check post 35). With ACI or VXLAN EVPN, host route advertisement is natively embedded, hence inter-VRF traffic is symmetrical by definition.
      I know it’s a short answer, but I’m afraid I would need a full post to answer your question accurately.
      best regards, yves

  4. Dharmesh Patel says:

    Hi Yves, Thanks for your inputs.
    Regarding Multi-Pod, yes, it’s VXLAN EVPN.
    The inter-DC distance is not more than 2 km.
    I am more focused on resolving the asymmetry of the traffic hitting the firewalls at each site. The firewalls will be the default GWs for any traffic that exits the VRF. The customer has an application flow as follows: Host 1 sends a request to Host 2
    Host 1 —- DC-1 VRF-A —- DC1 Act FW —–DC1- VRF-B DC-2 VRFB <—-Host 2
    Now when Host 2 in VRF-B responds back to remote Host 1, DC2 Active FW will be their default GW and the session will be dropped (since it never received the incoming TCP req).
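The drop described above can be reproduced with a toy model of a stateful firewall. This is only a sketch under simplifying assumptions: a session is keyed on the two host names alone, and the class and its methods are hypothetical, not a vendor API.

```python
class StatefulFirewall:
    """Toy stateful firewall: forwards a packet only if it opens or matches a session."""

    def __init__(self, name):
        self.name = name
        self.sessions = set()  # flows this firewall has seen being opened

    def forward(self, src, dst, new_session=False):
        """Return True if the packet is forwarded, False if it is dropped."""
        flow = frozenset((src, dst))  # same key for both directions of the flow
        if new_session:
            self.sessions.add(flow)
            return True
        return flow in self.sessions


dc1_fw = StatefulFirewall("DC1 Act FW")
dc2_fw = StatefulFirewall("DC2 Act FW")

# Host 1 opens the session through the DC-1 firewall.
request_ok = dc1_fw.forward("host1", "host2", new_session=True)

# Asymmetric return: Host 2 answers through its local DC-2 firewall, which has no state.
asymmetric_reply_ok = dc2_fw.forward("host2", "host1")

# Symmetric return: the answer goes back through the DC-1 firewall that holds the session.
symmetric_reply_ok = dc1_fw.forward("host2", "host1")

print(request_ok, asymmetric_reply_ok, symmetric_reply_ok)
```

The reply is only forwarded when it crosses the same firewall as the request, which is why the return path has to be kept symmetrical.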

    I am not sure I have put my case clearly, but is there a workaround or tweak for this sort of scenario?

    I will read Post 35 thoroughly to see if it gives me an answer.


    • Dharmesh Patel says:

      Hi Yves,

      I walked through Post 35 on LISP IGP Assist. My customer is using Nexus 9236 as spine and there are no N7Ks.
      I tried googling LISP support on N9Ks, but I am not sure whether the current 9236 model can support LISP.
      Also, can you give any pointers on how LISP IGP Assist can work with MultiPod as the DCI? Based on your post, I understand we would certainly need OTV between DC1 and DC2 and a secured L3 DCI between the FWs at each site?


      • Yves says:

        Hi Dharmesh

        As long as you are using VXLAN EVPN, you don’t need LISP IGP Assist, as the host route notification toward the external WAN is embedded in VXLAN EVPN.
        If this is a private inter-fabric network, then you don’t need LISP IP Mobility. /32 advertisement works like a charm.
        If the inter-fabric network is managed by an ISP, then host route injection is not an option (/32 is not accepted). In that case, you may want to combine VXLAN EVPN Multi-Pod with an external LISP site gateway (N7K, ASR series, etc.). The Nexus 9000 doesn’t support LISP IP Mobility (LISP encapsulation/decapsulation and the LISP mapping database).

        You don’t need OTV for L2 extension in your case. However, instead of using VXLAN EVPN Multi-Pod, I would recommend deploying a real VXLAN EVPN Multi-site.

        Let me know

        PS: I’m on vacation until mid-March with very limited network access, so please expect some delays in my responses. Sorry!

        Best regards, yves

  5. Dharmesh Patel says:

    Thanks, Yves, for your input. We have decided to go with Multi-Pod, as it’s private fiber between the two DCs. Multi-site was discussed, but the existing spine models do not seem to have been tested for Multi-site per CCO. Inter-site/inter-VRF traffic is taken care of by the DCI (having a transit VRF on the firewalls and stretching it across the fabrics using Multi-Pod).

    • Yves says:

      Good day Dharmesh,

      Thank you for your comment. I understand your point of view but, if I may, I just want to make sure you have the whole picture of VXLAN Multi-site.

      VXLAN Multi-Pod was a slight evolution, more than 3 years ago, to stretch a VXLAN fabric across different locations in a more solid way, using MP-eBGP for inter-site communication to dampen bouncing links (independent ASN per pod). At that time, only a VXLAN EVPN multi-fabric solution (post 36.x), which relies on an independent DCI solution, was available. Contrary to “ACI Multi-Pod”, “VXLAN EVPN Multi-Pod” should not be considered a DCI solution per se, but a single extended VXLAN fabric with the same, broad failure domain from data center to data center.

      In addition to all the other inter-site connectivity options across a complex Layer 3 WAN/MAN and from border leaf nodes, VXLAN Multi-site also allows direct back-to-back connectivity (including back-to-back over private fiber) from border spine devices, from the same “All-in-One” switch (VXLAN-to-VXLAN stitching via the same VTEP/ASIC). Note that BGW Anycast mode from spine nodes has been fully validated since day 1 & 1/2.

      From a VXLAN EVPN fabric and multi-site management and automation point of view, DCNM 11.1(1) allows the deployment of back-to-back VXLAN Multi-site in 3 clicks, with all underlay and overlay connectivity automatically configured and deployed, and the whole VXLAN EVPN Multi-site built according to best practices.

      Contrary to VXLAN EVPN Multi-Pod, which lacks a strong boundary for the data plane and control plane, VXLAN EVPN Multi-site reduces the failure domain to its smallest diameter, offering a rate limiter for each type of BUM traffic at the border gateway as well as addressing the scalability concerns. Some may say that scalability is not a roadblock, as a single VXLAN fabric domain now supports up to 512 leaf nodes, which is true since 9.2(3). However, does that mean we want all VTEPs, including those in remote pods, to belong to the same failure domain? This is not what we would recommend, and that’s one of the key reasons VXLAN EVPN Multi-site came out. Inter-site (L2 and L3) and inter-VRF communication (external FW) is achieved using the same VXLAN EVPN transport (external site-to-site VXLAN EVPN domain), from the same devices.

      Last but not least, VXLAN EVPN Multi-site is not limited to interconnecting distant VXLAN EVPN fabrics. It is also recommended for large VXLAN domains in the same building or campus, offering a solid hierarchical architecture that interconnects multiple VXLAN EVPN fabrics in a sturdy fashion.

      All this being said, I guess you now have the main arguments for moving from VXLAN Multi-Pod to VXLAN Multi-site 🙂
      If you still have any doubt, please feel free to contact me directly via email (ylouis@cisco.com).

      Hope that helps

      Best regards, yves
