35 – East-West Endpoint localization with LISP IGP Assist

East-West Communication Intra and Inter-sites

For the following scenario, subnets are stretched across multiple locations using a Layer 2 DCI solution. Several use cases require LAN extension between sites, such as live migration, health-check probing for HA clusters (heartbeat), and operational cost containment (e.g. migration of mainframes). It is assumed that, due to the long distances between sites, the network services are duplicated and active on each site. This option allows the use of local network services such as default gateways, load balancers and security engines distributed across each location, and helps reduce server-to-server communication latency (East-West workflows).

Traditionally, an IP address serves as a unique identifier assigned to a specific network entity such as a physical system, virtual machine, firewall or default gateway. The routed WAN also uses that identifier to determine the network entity's location within the IP subnet. When a virtual machine migrates from one data center to another, it retains its original unique identifier, although its physical location has actually changed. As a result, the extended VLAN must share the same subnet so that the TCP/IP parameters of the VM remain the same from site to site, which is necessary to maintain active sessions for migrated applications.

When deploying a logical data center architecture stretched across multiple locations with duplicated active network services, a couple of critical behaviors have to be taken into consideration. First, it is extremely important to transparently maintain the same level of security within and across sites, with no interruption, before, during and after the live migration of any Endpoint. Secondly, while machines follow a stateful migration process (hot live migration) from site to site, security and network services that own current active sessions must not reject those sessions due to an asymmetric workflow. It is mandatory that the round-trip data packet flow always reach the stateful device that owns the current session; otherwise the session is broken by that device. To achieve the required symmetric workflow, it is fundamental that each IP Endpoint is localized dynamically, and that any movement of a virtual machine is signaled in real time so that all relevant routing tables are updated.

The scenario discussed below covers security appliances in routed mode; however, the same behavior applies to any stateful device, such as SLB engines, whether deployed as physical appliances or virtual services.

In the following use-case, two physical data centers are interconnected using a Layer 2 DCI solution, offering a single logical data center from a multi-tiered application framework point of view. Using OTV, it is possible to address intercontinental distances between the two sites in a simple and robust fashion. However, this solution is agnostic to the DCI transport (OTV, VPLS, PBB EVPN, MPLS EVPN, VXLAN EVPN) as long as a method to filter the FHRP handshake between data centers is supported.

The figure below depicts a generic deployment of multiple sites interconnected with a Layer 2 overlay network extended over the Layer 3 connection. The two data centers, tightly coupled by the Layer 2 extension, are organized to support multitenancy; in this example, Tenant Red and Tenant Green, elaborated later.

Physical Architecture of DC Multi-sites with Stateful Devices

DC Multi-sites – High Level Physical Architecture

The goal is to utilize the required services from local stateful devices, reducing latency for server-to-server communication in a multi-tenant environment, while keeping traffic flows symmetric end-to-end.

In the following example, when host R1 from Tenant Red communicates with host G1 from Tenant Green, both located in DC-1, the traffic is inspected by the local FW in DC-1. On the other side, when host R2 communicates with host G2, the traffic is inspected by the local FW in DC-2.

DC Multi-sites - Localized E-W traffic


While this behavior sounds obvious for independent data centers, with geographically dispersed data centers it becomes a real concern when the broadcast domains are extended across multiple sites with duplicated active stateful devices.

The following logical view depicts the issue. Traffic from VLAN 10 (subnet Red) destined to VLAN 20 (subnet Green) is routed via each local active Firewall. As a consequence, when host R1 (VLAN 10) in DC-1 needs to communicate with host G2 (VLAN 20) in DC-2, its data packets are routed toward the local FW in DC-1, which in turn routes the traffic destined to G2 onto subnet Green, extended to the remote site. The Layer 2 extension of those subnets is therefore needed to reach the endpoints wherever they are located. The routed traffic is forwarded toward host G2 across the extended subnet Green (Layer 2 DCI connectivity established across the OTV overlay). G2 receives the data packets and replies to R1 as expected using its local default gateway, which in turn routes the response destined to R1 (subnet Red) toward its local FW in DC-2. In this design, the local FW on each site receives by default the preferred path for the Red and Green subnets from its local fabric; hence, routing between endpoints belonging to those two IP subnets is always kept local to a site. As a result, the workflow becomes asymmetric, and consequently the FW in DC-2 terminates the current session for security reasons.

Asymmetric flow with Stateful device not allowed


Two solutions exist, based on host-routing, to provide the physical location of the IP Endpoint, with and without Layer 2 segments extended between locations. The concept relies on more specific routes (/32) propagated to multiple host databases (DC-1 and DC-2). Having a dynamic database of endpoints associated with their physical location makes it possible to redirect the traffic destined to the IP machines of interest over a specific Layer 3 path.

Host Route Injection

The following section describes a sub-function of LISP that consists of injecting host-routes into the IGP in conjunction with subnets extended across distant locations; this is known as LISP IGP Assist ESM (Extended Subnet Mode). LISP IGP Assist is agnostic to the DCI technology deployed; however, because IGP Assist uses a multicast group to transport host-route notifications from site to site, it is important that IP multicast traffic can be routed across the DCI network.

Note that the latter doesn't mean that the WAN must be IP Multicast capable; it means that multicast data packets can be carried across all remote locations. Hence the choice of OTV in this DCI design, which supports and optimizes multicast traffic using a head-end replication technique (aka OTV Adjacency Server) to transport multicast data packets over a non-multicast-capable WAN.

LISP IGP Assist can be leveraged to dynamically update its Endpoint Identifier (EID) database each time a new machine belonging to a selected subnet is detected. As soon as an Endpoint is powered on or has moved to a new physical host, it is automatically detected by its default gateway (LISP First Hop Router function). The LISP process running on that switch registers the new host reachability information (host route) in its EID database along with its location, and notifies all remote First Hop Routers accordingly using a dedicated multicast group across the L2 DCI. Meanwhile, the FHR redistributes the host route into its IGP routing table, adding a /32 host route for each local EID.

Based on the above, the key concept is to propagate the host routes dynamically to the remote site using a dedicated Layer 3 DCI network. As a consequence, a more specific route (/32) is announced dynamically to attract the traffic to the EID of interest using a specific path. This L3 DCI connectivity is depicted in the next figure as the Secure L3 DCI connection for inter-site inter-VRF routed traffic.

  • R1 and G1 Host routes are propagated toward DC-2 over the Layer 3 DCI.
  • R2 and G2 Host routes are propagated toward DC-1 over the Layer 3 DCI.

The following diagram provides further details on the test bed used in the following sections.

DC Multi-sites - Physical Architecture

Physical Architecture of DC Multi-sites with dedicated L3 DCI

Site to site Inter-VRF secure communication


To summarize the logical view above:

  • Hosts R1 and R2 belong to the same VLAN 10 within Tenant Red, while G1 and G2 share the same VLAN 20 that belongs to Tenant Green.
  • VLAN 10 and VLAN 20 are extended across the L2 DCI connection established using OTV.
  • R1 and G1 are located in DC-1, R2 and G2 are located in DC-2.
  • Communications within and between IP subnets belonging to a given DMZ or tenant happens freely, while inter-tenant packet flows must be enforced through the Firewall. As a result, a L3 segmentation (VRF) is performed between tenants to force the routed traffic to use the Firewall. VLAN 10 belongs to VRF Red and VLAN 20 to VRF Green.
  • FHRP filter is enabled in conjunction with OTV, so that the same SVI can be active on both sites.
  • LISP First Hop Router (FHR) must be configured as the default gateway for all hosts of interest, since it relies on ARP messages to trigger the host-route notification for the EIDs.
  • An active Firewall on each site is used to secure the routed traffic between VRF Red and VRF Green.

It is assumed (but not discussed in this document) that each network service such as SVI, LISP, FW, OTV, etc., is fully redundant within each data center (usually Active/Standby mode per site).

We can consider 3 types of data flows:

  • Intra-Subnet communication via L2 DCI: Bridged traffic destined for a remote host within the same broadcast domain uses the Layer 2 DCI connectivity (OTV). Note that additional security services in transparent mode and/or encryption can be leveraged if needed to secure the Layer 2 DCI connectivity without impacting this scenario (not discussed in this post).
Intra-Subnet communication via L2 DCI


 

  • Local routed traffic inter-Tenant intra-DC: Each Firewall within a data center is first of all used to secure the local traffic between the two VRFs. Firewalls are configured using dynamic routing.

Local Inter-VRF communication

 

  • Routed traffic inter-Tenant inter-DC: Routed traffic destined for a remote machine that belongs to a different DMZ uses both Firewalls, one on each site. The traffic is routed via a dedicated Layer 3 DCI connection (depicted in the following figure as the Secured L3 DCI link), preventing asymmetric traffic. Indeed, when receiving a host-route notification update from the remote LISP FHR, the local gateway records the concerned Endpoint with a /32 entry. As a result, the more specific route to reach that host now points to the remote Firewall as the next hop. Consequently, the relevant inter-VRF traffic is transported across the dedicated Layer 3 DCI path toward the remote Firewall.

Site to site Inter-VRF communication across both FWs

Configuration

The full configuration used for this testbed can be downloaded here:

DC-1 OTV

OTV has been tested in multicast mode and in adjacency-server mode (head-end replication), carrying the multicast group used to exchange the LISP EID notifications across sites. HSRP filtering is applied on the OTV edge device.

interface Overlay1
 otv join-interface Ethernet1/10
 otv control-group 239.1.1.100
 otv data-group 232.10.1.0/24
 otv extend-vlan 10, 20
 no shutdown
!
otv-isis default
 vpn Overlay1
  redistribute filter route-map stop-HSRP
!
otv site-identifier 0001.0001.0001
!
interface Ethernet1/9
 description OTV Internal interface
 switchport
 switchport mode trunk
 switchport trunk allowed vlan 10,20,210
!
interface Ethernet1/10
 description OTV Join Interface
 ip address 192.168.1.2/24
 ip ospf network point-to-point
 ip router ospf 1 area 0.0.0.0
 ip igmp version 3
 no shutdown
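The route-map stop-HSRP referenced in the OTV IS-IS configuration is not listed here. As a sketch, based on the commonly documented OTV FHRP-isolation approach, it denies the HSRPv1 and HSRPv2 virtual MAC ranges from being advertised by OTV while permitting everything else; the mac-list name is an assumption:

```
mac-list HSRP-vMAC seq 10 deny 0000.0c07.ac00 ffff.ffff.ff00
mac-list HSRP-vMAC seq 20 deny 0000.0c9f.f000 ffff.ffff.f000
mac-list HSRP-vMAC seq 30 permit 0000.0000.0000 0000.0000.0000
route-map stop-HSRP permit 10
 match mac-list HSRP-vMAC
```

A complementary VACL is usually applied on the extended VLANs to drop HSRP hellos at the site boundary; that part is omitted here.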

DC-1 Firewall

For the tests, ICMP (to check host reachability) and SSH (to validate that routed traffic remains symmetric from site to site) are permitted from any source to any destination.
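The firewall rule set itself is not shown; as a hypothetical illustration only (ASA-style syntax, with made-up ACL and interface names), the test policy could look like this:

```
access-list TEST-POLICY extended permit icmp any any
access-list TEST-POLICY extended permit tcp any any eq ssh
access-group TEST-POLICY in interface RED
access-group TEST-POLICY in interface GREEN
```

Any equivalent stateful inspection policy works; the point is simply that the return traffic must reach the same firewall that inspected the initial packet.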

DC-1 LISP IGP Assist

The minimum configuration required for LISP IGP Assist is quite simple.

The first step is to enable PIM to distribute the host-route notifications using a multicast group. All interfaces of interest must be configured with PIM sparse mode. In this setup, a single RP is configured in DC-1, using the loopback address that also serves as the LISP locator.

interface loopback10
 description LISP Loopback
 ip address 10.10.10.10/32
 ip router ospf 1 area 0.0.0.0
 ip pim sparse-mode
ip pim rp-address 10.10.10.10 group-list 224.0.0.0/4
ip pim ssm range 232.0.0.0/8

Next, it is required to configure the route-map that advertises the host-routes toward the distant LISP FHR. The prefix-list shown here matches any /32 host-route from any subnet, but more specific prefixes can be used. The route-map is redistributed via the OSPF process per VRF.

ip prefix-list HOST-ROUTES seq 5 permit 0.0.0.0/0 eq 32
route-map ADV-HOST-ROUTES deny 5
 match interface Null0
route-map ADV-HOST-ROUTES permit 10
 match ip address prefix-list HOST-ROUTES
router ospf 100
 router-id 1.1.1.2
 vrf GREEN
  redistribute lisp route-map ADV-HOST-ROUTES
 vrf RED
  redistribute lisp route-map ADV-HOST-ROUTES

Because it's a multi-tenant environment, LISP IGP Assist is configured under each relevant VRF. The notification of host routes is achieved using a dedicated multicast group 239.1.1.n per Tenant, and a dedicated LISP instance is also required per Tenant. The subnet 20.1.1.0/24 used for VRF GREEN is added into the "local" mapping database (actually the loopback address 10.10.10.10 in DC-1), and the subnet 10.1.1.0/24 used for VRF RED is added into the same "local" mapping database. In this testbed, the EID (host-route) notifications are performed using multicast group 239.1.1.1 for VRF RED and 239.1.1.2 for VRF GREEN.

vrf context GREEN
 ip pim ssm range 232.0.0.0/8
 ip lisp itr-etr
 lisp instance-id 2
 ip lisp locator-vrf default
 lisp dynamic-eid LISP_EXTENDED_SUBNET
  database-mapping 20.1.1.0/24 10.10.10.10 priority 1 weight 50
  map-notify-group 239.1.1.2
  no route-export away-dyn-eid
vrf context RED
 ip pim ssm range 232.0.0.0/8
 ip lisp itr-etr
 lisp instance-id 1
 ip lisp locator-vrf default
 lisp dynamic-eid LISP_EXTENDED_SUBNET
  database-mapping 10.1.1.0/24 10.10.10.10 priority 1 weight 50
  map-notify-group 239.1.1.1
  no route-export away-dyn-eid

All concerned VLAN interfaces must be configured to use the LISP dynamic-EID process:

interface Vlan10
 no shutdown
 vrf member RED
 lisp mobility LISP_EXTENDED_SUBNET
 lisp extended-subnet-mode
 ip address 10.1.1.251/24
 ip ospf passive-interface
 ip pim sparse-mode
 hsrp 10
  preempt
  priority 110
  ip 10.1.1.254
interface Vlan20
 no shutdown
 vrf member GREEN
 lisp mobility LISP_EXTENDED_SUBNET
 lisp extended-subnet-mode
 ip address 20.1.1.251/24
 ip ospf passive-interface
 ip pim sparse-mode
 hsrp 20
  preempt
  priority 110
  ip 20.1.1.254

The above configuration is the minimum required for IGP Assist to detect host-routes and inject them dynamically into the IGP routing table, offering a more specific route for remote Endpoints.

DC-2

The same configuration is duplicated on each site with the relevant changes.

!
interface loopback20
  description LISP Loopback
  ip address 20.20.20.20/32
  ip router ospf 1 area 0.0.0.0
  ip pim sparse-mode
!
ip pim rp-address 10.10.10.10 group-list 224.0.0.0/4
ip pim ssm range 232.0.0.0/8
!
ip prefix-list HOST-ROUTES seq 5 permit 0.0.0.0/0 eq 32
route-map ADV-HOST-ROUTES deny 5
  match interface Null0
route-map ADV-HOST-ROUTES permit 10
  match ip address prefix-list HOST-ROUTES
!
vrf context GREEN
  ip pim ssm range 232.0.0.0/8
  ip lisp itr-etr
  lisp instance-id 2
  ip lisp locator-vrf default
  lisp dynamic-eid LISP_EXTENDED_SUBNET
    map-notify-group 239.1.1.2
    database-mapping 20.1.1.0/24 20.20.20.20 priority 1 weight 50
    no route-export away-dyn-eid
!
vrf context RED
  ip pim ssm range 232.0.0.0/8
  ip lisp itr-etr
  lisp instance-id 1
  ip lisp locator-vrf default
  lisp dynamic-eid LISP_EXTENDED_SUBNET
    database-mapping 10.1.1.0/24 20.20.20.20 priority 1 weight 50
    map-notify-group 239.1.1.1
    no route-export away-dyn-eid
!
interface Vlan10
  no shutdown
  vrf member RED
  lisp mobility LISP_EXTENDED_SUBNET
  lisp extended-subnet-mode
  ip address 10.1.1.252/24
  ip ospf passive-interface
  ip pim sparse-mode
  hsrp 10
    preempt
    priority 120
    ip 10.1.1.254
!
interface Vlan20
  no shutdown
  vrf member GREEN
  lisp mobility LISP_EXTENDED_SUBNET
  lisp extended-subnet-mode
  ip address 20.1.1.252/24
  ip ospf passive-interface
  ip pim sparse-mode
  hsrp 20
    preempt
    priority 120
    ip 20.1.1.254
!
router ospf 100
  vrf GREEN
    redistribute lisp route-map ADV-HOST-ROUTES
  vrf RED
    redistribute lisp route-map ADV-HOST-ROUTES
no system default switchport shutdown

Verification

Check the dynamic-eid database on each site

Note that the output has been reduced for readability.

The EID reachability information is given for each VRF:

  • R1 (10.1.1.1)
  • R2 (10.1.1.2)
  • G1 (20.1.1.1)
  • G2 (20.1.1.2)

Data Center 1 (Left)

LISP-IGP_DC-1# sho lisp dynamic-eid summary vrf RED
LISP Dynamic EID Summary for VRF "RED"
LISP_EXTENDED_SUBNET      10.1.1.1    Vlan10
!
LISP-IGP_DC-1# sho lisp dynamic-eid summary vrf GREEN
LISP Dynamic EID Summary for VRF "GREEN"
LISP_EXTENDED_SUBNET      20.1.1.1    Vlan20

Data Center 2 (Right)

LISP-IGP_DC-2# sho lisp dynamic-eid summary vrf RED
LISP Dynamic EID Summary for VRF "RED"
LISP_EXTENDED_SUBNET      10.1.1.2    Vlan10
!
LISP-IGP_DC-2# sho lisp dynamic-eid summary vrf GREEN
LISP Dynamic EID Summary for VRF "GREEN"
LISP_EXTENDED_SUBNET       20.1.1.2    Vlan20

Check the routing table (keeping only the relevant EIDs). From VRF RED, R1 (10.1.1.1/32) is locally attached to VLAN 10. R2 (10.1.1.2), G1 (20.1.1.1) and G2 (20.1.1.2) are reachable via the next-hop router (100.1.1.254).

Data Center 1 (Left)

LISP-IGP_DC-1# sho ip route vrf RED
IP Route Table for VRF "RED"
...
10.1.1.1/32, ubest/mbest: 1/0, attached 
 *via 10.1.1.1, Vlan10, [240/0], 00:31:42, lisp, dyn-eid
10.1.1.2/32, ubest/mbest: 1/0   
 *via 100.1.1.254, Vlan100, [110/1], 00:32:49, ospf-100, type-2
...
20.1.1.1/32, ubest/mbest: 1/0
 *via 100.1.1.254, Vlan100, [110/1], 00:31:40, ospf-100, type-2
20.1.1.2/32, ubest/mbest: 1/0
 *via 100.1.1.254, Vlan100, [110/1], 00:32:47, ospf-100, type-2
...

Firewall in DC-1

FW-DC-1# sho route
..//..
O E2 10.1.1.2 255.255.255.255 [110/1] via 30.1.1.2, 60:53:29, Inter-DMZ  <== via L3 DCI toward DC-2
O E2 10.1.1.1 255.255.255.255 [110/1] via 100.1.1.10, 60:52:56, RED          <== Local routed traffic
..//..

Firewall in DC-2

FW-DC-2# sho route
..//..
O E2     10.1.1.1 255.255.255.255 [110/1] via 30.1.1.1, 2d12h, Inter-DMZ  <== via L3 DCI toward DC-1
O E2     10.1.1.2 255.255.255.255 [110/1] via 102.1.1.20, 2d12h, RED     <== Local routed traffic
..//..

Data Center 1 (Left)

From VRF GREEN, G1 (20.1.1.1) is locally attached to VLAN 20. R1 (10.1.1.1/32), R2 (10.1.1.2) & G2 (20.1.1.2) are reachable via the next hop router (200.1.1.254).

LISP-IGP_DC-1# sho ip route vrf GREEN
IP Route Table for VRF "GREEN"
...
10.1.1.1/32, ubest/mbest: 1/0
    *via 200.1.1.254, Vlan200, [110/1], 01:04:23, ospf-100, type-2
10.1.1.2/32, ubest/mbest: 1/0
    *via 200.1.1.254, Vlan200, [110/1], 01:05:30, ospf-100, type-2
...
20.1.1.1/32, ubest/mbest: 1/0, attached
    *via 20.1.1.1, Vlan20, [240/0], 01:04:22, lisp, dyn-eid
20.1.1.2/32, ubest/mbest: 1/0
    *via 200.1.1.254, Vlan200, [110/1], 01:05:28, ospf-100, type-2

Let's perform a live migration, with R1 moving to DC-2 and R2 moving to DC-1. As soon as the migration is complete, the dynamic EID mapping database is updated accordingly.

Data Center 1 (Left)

LISP-IGP_DC-1# sho lisp dynamic-eid summary vrf red
LISP Dynamic EID Summary for VRF "RED"
LISP_EXTENDED_SUBNET      10.1.1.2    Vlan10 

Data Center 2 (Right)

LISP-IGP_DC-2# sho lisp dyn summary vrf red
LISP Dynamic EID Summary for VRF "RED"
LISP_EXTENDED_SUBNET      10.1.1.1    Vlan10


Ping and SSH sessions inter-DC between the two VRFs continue to work, with only a sub-second interruption.

Configuration (continued)

As previously mentioned, the LISP IGP Assist setup given above is the minimum configuration required to notify the EIDs dynamically, using multicast across the L2 DCI links, and to redistribute the host routes into the IGP routing table. As is, it already works like a charm, as long as the multicast group can reach the remote LISP mapping database using the Layer 2 DCI extension.

It is optionally possible, but recommended, to add an alternative mechanism that notifies the EIDs via a routing protocol established with a LISP Map-Server, in case the primary mechanism fails. If, for any reason, the multicast transport or the L2 extension stops working, the Map-Server notifies the remote mapping database about the new EID using the routing protocol. This is actually the method used for IGP Assist in ASM mode (Across Subnets Mode, without any L2 extension), when no extended VLAN exists across data centers to carry the multicast for each VRF.

The Map-Resolver is responsible for receiving map requests from remote Ingress Tunnel Routers (ITRs) to retrieve the mapping between an Endpoint Identifier and its current location (Locator). For the purpose of this specific scenario, there is no inbound path optimization, nor any ETR/ITR function or LISP encapsulation. Hence, only the Map-Server function is relevant for this solution, as a backup mechanism to trigger the Endpoint notifications. In the context of IGP Assist, the Map-Server is responsible for exchanging EID mappings between all other LISP devices. The mapping database can cohabit on the same device as the LISP FHR, or multiple mapping databases can be distributed over dedicated hosts. For the purpose of our test, the MS function runs on the same switch.

To configure the LISP Map-Server, apply the following global configuration on each LISP IGP router.

Data Center 1 (Left)

ip lisp itr-etr
ip lisp map-server
lisp site DATA_CENTER
  eid-prefix 10.1.0.0/16 instance-id 1 accept-more-specifics
  eid-prefix 20.1.0.0/16 instance-id 2 accept-more-specifics
  authentication-key 3 9125d59c18a9b015
ip lisp etr map-server 10.10.10.10 key 3 9125d59c18a9b015
ip lisp etr map-server 20.20.20.20 key 3 9125d59c18a9b015

Data Center 2 (Right)

ip lisp itr-etr
ip lisp map-server
lisp site DATA_CENTER
  eid-prefix 10.1.0.0/16 instance-id 1 accept-more-specifics
  eid-prefix 20.1.0.0/16 instance-id 2 accept-more-specifics
  authentication-key 3 9125d59c18a9b015
ip lisp etr map-server 10.10.10.10 key 3 9125d59c18a9b015
ip lisp etr map-server 20.20.20.20 key 3 9125d59c18a9b015

Additional document

http://lisp.cisco.com/docs/LISP_IGP_ASSIST_WHITEPAPER.pdf

 
This entry was posted in DCI.

2 Responses to 35 – East-West Endpoint localization with LISP IGP Assist

  1. mimu says:

    Hi Yves! Thanks for your write-up (and all the other ones :) )
    What hardware at min would you recommend for the setup above?

    Thanks!
    Michael

    • Yves says:

      Hi Michael,

      For LISP IGP Assist, in a traditional DC network deployment, AFAIK you will need an N7k series switch as the default gateway (First Hop Router) in order to leverage a piece of the LISP Multi-hop function (endpoint detection and notification). I'm not aware of any other platforms that support this, but I may be wrong.
      Let me double-check and I will come back to this reply.

      Having said that, if you are running a modern fabric with a distributed Layer 3 anycast gateway, you can also address the same type of requirements using VXLAN MP-BGP EVPN. I'm going to test it when I can, and will make another post for that particular fabric-based scenario. Then, for multiple distant VXLAN EVPN fabrics, you should be fine with the Nexus 7k or Nexus 9k.

      best regards, yves
