Thursday, August 2, 2012

Alternatives to VXLAN


NVGRE

NVGRE (Network Virtualization using Generic Routing Encapsulation) achieves the same functional goal as VXLAN on the network, but the encapsulation of VLAN traffic is carried out on the virtualised server using GRE. In essence, the VLAN limitations of the network are overcome in software on the host.

The server builds a GRE tunnel, and the virtual subnet ID is carried in the GRE header. Each virtual machine is mapped to a host PA (provider address, i.e. the physical address of the host). End-to-end communication is completed over existing networks that are essentially unaware of the encapsulated payloads and the server VLAN mappings.
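
As a rough illustration of where the virtual subnet ID sits, here is a minimal Python sketch that packs and unpacks an 8-byte GRE header with a key field. The layout (Key Present bit set, protocol type 0x6558, a 24-bit virtual subnet ID followed by an 8-bit flow ID) is my reading of the NVGRE specification and should be treated as an assumption; the function names are hypothetical.

```python
import struct

GRE_KEY_PRESENT = 0x2000   # K bit in the first 16 bits of the GRE header
ETHERTYPE_TEB = 0x6558     # Transparent Ethernet Bridging (inner Ethernet frame)


def nvgre_header(vsid, flow_id=0):
    """Build an 8-byte GRE header carrying a 24-bit virtual subnet ID."""
    if not 0 <= vsid < 2 ** 24:
        raise ValueError("virtual subnet ID must fit in 24 bits")
    if not 0 <= flow_id < 2 ** 8:
        raise ValueError("flow ID must fit in 8 bits")
    key = (vsid << 8) | flow_id          # upper 24 bits VSID, lower 8 bits flow ID
    return struct.pack("!HHI", GRE_KEY_PRESENT, ETHERTYPE_TEB, key)


def vsid_of(header):
    """Recover the virtual subnet ID from a header built by nvgre_header."""
    _flags, _proto, key = struct.unpack("!HHI", header[:8])
    return key >> 8
```

The physical network only ever routes on the outer IP header; the virtual subnet ID is meaningful to the servers at each end of the tunnel.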


IP Rewrite

An alternative approach to GRE tunnelling is 'IP rewrite'. This solution requires that each virtual server has its own IP address. This IP address is rewritten to a physical IP address for transport across the network, and the mapping is reversed on reaching the destination.
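
The rewrite and its reversal can be pictured as a one-to-one lookup in each direction. The Python sketch below is purely illustrative: the addresses and the function names are made up, and a real implementation would rewrite packet headers in the virtual switch rather than handle strings.

```python
# Hypothetical one-to-one mapping of virtual server IPs to physical IPs.
VIRTUAL_TO_PHYSICAL = {
    "10.0.0.5": "192.168.1.10",
    "10.0.0.6": "192.168.2.20",
}
PHYSICAL_TO_VIRTUAL = {phys: virt for virt, phys in VIRTUAL_TO_PHYSICAL.items()}


def rewrite_outbound(src, dst):
    """Swap virtual addresses for physical ones before the packet leaves the host."""
    return VIRTUAL_TO_PHYSICAL[src], VIRTUAL_TO_PHYSICAL[dst]


def rewrite_inbound(src, dst):
    """Reverse the mapping when the packet reaches the destination host."""
    return PHYSICAL_TO_VIRTUAL[src], PHYSICAL_TO_VIRTUAL[dst]
```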

Cisco OTV
OTV, or Overlay Transport Virtualisation, is Cisco's proprietary solution for extending VLANs over IP networks. It encapsulates L2 within IP, allowing VLANs to be tunnelled over a routed IP network. It is currently available in NX-OS on the Nexus platform.


TRILL
Transparent Interconnection of Lots of Links, or TRILL, enables extension of L2 domains through the use of RBridges (Routing Bridges). TRILL RBridges communicate using a link-state protocol (IS-IS); however, this runs within the L2 domain, i.e. no IP addresses are required. Each RBridge has knowledge of the full topology, consisting of all the RBridges and all the links between them.

When an RBridge receives an Ethernet frame from an end node, it encapsulates the frame in a TRILL header and addresses it to the RBridge to which the destination MAC is attached. The destination RBridge performs the decapsulation before sending the frame onwards.

TRILL, like the other technologies described, enables a large L2 cloud to be created with a common subnet. Hosts within the L2 cloud therefore do NOT need to change their IP address if they relocate. TRILL is an IETF standard.



to be continued .....

Monday, July 30, 2012

VXLANs - Building On VLANS


VLANs have been around ever since I have been in networking, providing a network segregation function at Layer 2.

802.1Q is the IEEE standard that defines VLAN tagging. In summary, a 32-bit field is added between the source MAC and the EtherType field in an Ethernet frame. 12 bits of this field are set aside for the VLAN ID, giving 4096 values, of which 4094 are usable VLANs (0 and 4095 are reserved).
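
To make the bit layout concrete, here is a small Python sketch that splits the 32-bit tag into its parts: a 16-bit TPID (0x8100), then 3 bits of priority, 1 drop-eligible bit and the 12-bit VLAN ID. The function name is my own and the example is only meant to show where the 12 bits sit.

```python
import struct

TPID_8021Q = 0x8100  # Tag Protocol Identifier that marks an 802.1Q tag


def parse_dot1q_tag(tag):
    """Split a 4-byte 802.1Q tag into priority, drop-eligible bit and VLAN ID."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return {
        "pcp": tci >> 13,          # top 3 bits: priority code point
        "dei": (tci >> 12) & 0x1,  # next bit: drop eligible indicator
        "vid": tci & 0x0FFF,       # low 12 bits: VLAN ID
    }


# 12 bits give 4096 values; 0 and 4095 are reserved, leaving 4094 VLANs.
USABLE_VLANS = 2 ** 12 - 2
```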

In hindsight, setting aside only 12 bits for the VLAN ID could be viewed as an oversight. Today, especially in the cloud infrastructure space, single physical topologies have become multi-tenanted spaces, with each tenant requiring its own isolated network. 4094 VLANs has become a constraint.

Additionally, when the VLAN concept was devised it was designed to run in a 'localised' environment. Today there are requirements for multiple physical environments to be logically connected at L2.

VXLAN, or Virtual eXtensible LAN, aims to build upon the existing VLAN concept while solving some of the problems described. Firstly, the VXLAN ID is 24 bits, doubling the size of the old VLAN ID field and enabling over 16 million IDs. Secondly, VXLAN extends the reach of a VLAN by enabling it to be encapsulated and transported over an IP, Layer 3 routed domain.
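
For a feel of the numbers, the Python sketch below builds the 8-byte VXLAN header as I understand it from the VXLAN specification: a flags byte with the 'VNI valid' bit (0x08), 24 reserved bits, the 24-bit VXLAN ID and a final reserved byte. The function name is hypothetical and the outer UDP/IP headers are left out.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # 'I' flag: the VXLAN ID field is valid


def vxlan_header(vni):
    """Build the 8-byte VXLAN header for a 24-bit VXLAN ID."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VXLAN ID must fit in 24 bits")
    # Word 1: flags in the top byte, 24 reserved bits.
    # Word 2: VXLAN ID in the top 24 bits, 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


VXLAN_ID_SPACE = 2 ** 24  # 16,777,216 possible IDs
```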

VXLAN does not represent the only solution to the VLAN limitations of 802.1Q.  I hope to post on some of the other solutions available.

Monday, March 12, 2012

Troubleshooting OSPF


Useful commands

show ip ospf interface brief  
show ip ospf neighbor
debug ip ospf hello
debug ip ospf adj


Common OSPF problems

1) NETWORK TYPES, plus HELLO AND DEAD TIMERS


Check that the OSPF network types and timers are compatible on both ends of the link. The defaults are:



 Broadcast to Broadcast                           (DR)
 Timer intervals configured, Hello 10, Dead 40, Wait 40, Retransmit 5

 Non-Broadcast to Non-Broadcast                   (DR)
 Timer intervals configured, Hello 30, Dead 120, Wait 120, Retransmit 5

 Point-to-Point to Point-to-Point                 (NO DR)
 Timer intervals configured, Hello 10, Dead 40, Wait 40, Retransmit 5

 Point-to-Multipoint to Point-to-Multipoint       (NO DR)
 Timer intervals configured, Hello 30, Dead 120, Wait 120, Retransmit 5



Network types that elect a DR can be mixed with each other. Likewise, network types that do not elect a DR can be mixed with each other. However, the hello and dead timers will then need to be adjusted so that they match, e.g. ip ospf hello-interval 10


debug ip ospf hello with point-to-point towards point-to-multipoint

*Mar  1 00:32:29.935: OSPF: Mismatched hello parameters from 1.1.1.2
*Mar  1 00:32:29.935: OSPF: Dead R 40 C 120, Hello R 10 C 30


Notes

i) If you mix network types that have compatible timers, the adjacency may well form, but the route exchange won't work as expected!

ii) A DR and BDR on a frame relay network must have full reachability to the other routers in the region. Make use of the neighbor command to achieve this.



2) MASK

The network mask must match on adjoining interfaces (unless it is a point-to-point OSPF network type).

This will be highlighted by debug ip ospf hello

*Mar  1 00:18:21.327: OSPF: Mismatched hello parameters from 1.1.1.2
*Mar  1 00:18:21.327: OSPF: Dead R 40 C 40, Hello R 10 C 10  Mask R 255.255.255.0 C 255.255.255.128
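
In the debug output, R is the value received in the neighbour's hello and C is the value configured locally. The comparisons from sections 1 and 2 can be sketched in Python as below. This is only a model of the checks described here, not IOS logic; the class and function names are my own, and the output strings simply imitate the debug format.

```python
from dataclasses import dataclass


@dataclass
class HelloParams:
    hello: int                    # hello interval in seconds
    dead: int                     # dead interval in seconds
    mask: str                     # interface network mask
    point_to_point: bool = False  # mask is not compared on point-to-point


def hello_mismatches(received, configured):
    """Return a list of mismatches in the style 'Hello R 10 C 30'."""
    problems = []
    if received.hello != configured.hello:
        problems.append(f"Hello R {received.hello} C {configured.hello}")
    if received.dead != configured.dead:
        problems.append(f"Dead R {received.dead} C {configured.dead}")
    if not configured.point_to_point and received.mask != configured.mask:
        problems.append(f"Mask R {received.mask} C {configured.mask}")
    return problems
```

An empty list means the hello would be accepted as far as these three fields are concerned; area ID, stub flag and authentication are checked separately.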


3) MTU

Interface MTUs must match. A mismatch leaves the adjacency stuck in EXSTART and can be highlighted by debug ip ospf adj

*Mar  1 00:14:15.879: OSPF: Rcv DBD from 2.2.2.2 on FastEthernet0/0
       seq 0x9C9 opt 0x52 flag 0x7 len 32  mtu 500 state EXSTART

4) AREA ID AND STUB SETTINGS

The area ID must match on both ends of a link. If an area is a stub area, all routers in that area must be configured as stub.

5) AUTHENTICATION

Using debug ip ospf adj


*Mar  1 00:07:26.607: OSPF: Rcv pkt from 10.10.10.2, FastEthernet0/0 : Mismatch Authentication Key - Message Digest Key 1
*Mar  1 00:07:30.843: OSPF: Send with youngest Key 1
*Mar  1 00:07:30.843: OSPF: Send hello to 224.0.0.5 area 0 on FastEthernet0/0 from 10.10.10.1

6) ROUTER IDS MUST BE UNIQUE

*Mar  1 00:02:43.899: %OSPF-4-DUP_RTRID_NBR: OSPF detected duplicate
router-id 1.1.1.1 from 10.10.10.1 on interface FastEthernet0/0

Friday, February 24, 2012

MSDP

MSDP - Multicast Source Discovery Protocol


The purpose of an MSDP topology is to enable discovery of multicast sources in other domains.

Each PIM-SM domain uses its own RPs. RPs run MSDP over TCP to discover multicast sources (generators) in other domains. Once the MSDP peering is established, a list of sources sending to multicast groups is exchanged.

SA (Source-Active) messages are sent to the neighbouring RP over MSDP. By default an RP does not ask for SA messages, i.e. it generally operates as a PUSH, not a PULL! RPs generally wait to receive regular source announcements from their MSDP peer RPs. This means a host joining a group may see some latency before sources in other domains are learned, but it reduces router memory requirements. (This behaviour can be changed with the ip msdp sa-request command.)

An RP will periodically send (PUSH) SA messages about registered sources to its MSDP peers. By default, any source that registers with an RP will be advertised. However, there are two control mechanisms available for these exchanges: you can control the source information that the router originates, and you can filter the SA messages exchanged with a peer, as per the following examples.

example 1

The router here is configured to send SA messages for only a subset of the possible sources that might send PIM-SM Register messages to it. The ip msdp redistribute statement references access list 122, which in turn permits source prefixes of 100.0.0.0/8 and group address prefixes of 224.0.0.0/4 (all multicast groups). Any source can still register with the RP, but only those sources whose first 8 address bits match 100.0.0.0/8 (i.e. a first octet of 100) are advertised in SA messages.
 
ip msdp redistribute  {acl}

ip msdp redistribute list 122  

access-list 122 permit ip 100.0.0.0 0.255.255.255 224.0.0.0 15.255.255.255
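
To check which (S,G) pairs access list 122 would permit, the wildcard-mask comparison can be sketched in Python as below. A 0 bit in the wildcard means 'must match' and a 1 bit means 'ignore'. The function names are my own; this only models the single permit line shown above.

```python
import ipaddress


def wildcard_match(addr, base, wildcard):
    """True if addr equals base on every bit where the wildcard mask is 0."""
    a = int(ipaddress.IPv4Address(addr))
    b = int(ipaddress.IPv4Address(base))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (a & care) == (b & care)


def acl_122_permits(source, group):
    """Model 'permit ip 100.0.0.0 0.255.255.255 224.0.0.0 15.255.255.255'."""
    return (wildcard_match(source, "100.0.0.0", "0.255.255.255")
            and wildcard_match(group, "224.0.0.0", "15.255.255.255"))
```

For example, acl_122_permits("100.1.2.3", "239.1.1.1") is True, while a source of 10.1.2.3 is not permitted and so would not be advertised in an SA message.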

 

example 2

The router is using the ip msdp sa-filter command to control the SA messages received from and sent to the specified peer. Again, the ACL denotes an (S,G) pairing.
 
ip msdp sa-filter in 1.1.1.1 list 101

ip msdp sa-filter out 1.1.1.1 list 103

 




Sunday, February 5, 2012

OER MC Best Route Selection - Part 3


From the Master Controller R1 I trace to the loopback of R9.  It can be seen I have two valid paths: one via R4 and one via R5. 

trace 9.9.9.9

Type escape sequence to abort.
Tracing the route to 9.9.9.9

  1 11.11.11.2 84 msec
     10.10.10.1 72 msec
     11.11.11.2 56 msec
  2 12.1.1.5 72 msec
     13.1.1.6 72 msec
     12.1.1.5 56 msec
  3 21.0.0.9 [AS 2] 88 msec
     20.0.0.9 [AS 2] 120 msec *



I now enable OER 'route control' on the master. I also configure 'select-exit best'.

oer master
mode route control
mode select-exit best


After a few minutes I notice the master has selected a best exit via R4. It achieves this by raising the BGP local preference on that path.

s ip bgp 9.9.9.9
BGP routing table entry for 9.9.9.9/32, version 7
Paths: (2 available, best #2, table Default-IP-Routing-Table)
Multipath: iBGP
  Not advertised to any peer
  2
    13.1.1.6 (metric 110) from 5.5.5.5 (5.5.5.5)
      Origin incomplete, metric 409600, localpref 100, valid, internal
  2
    12.1.1.5 (metric 20) from 4.4.4.4 (4.4.4.4)
      Origin incomplete, metric 409600, localpref 5000, valid, internal, best
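
The effect of the raised local preference can be shown with a tiny Python sketch using the two paths from the output above. It models only the 'highest local preference wins' step of BGP best-path selection and ignores every other step (weight, AS path length and so on); the names are my own.

```python
# The two paths to 9.9.9.9/32 as shown in the output above.
paths = [
    {"next_hop": "13.1.1.6", "from": "5.5.5.5", "localpref": 100},
    {"next_hop": "12.1.1.5", "from": "4.4.4.4", "localpref": 5000},
]


def best_by_localpref(candidates):
    """Pick the path with the highest local preference."""
    return max(candidates, key=lambda p: p["localpref"])
```

Here best_by_localpref(paths) returns the path via 12.1.1.5, learned from 4.4.4.4 (R4), which matches the path marked 'best'.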



 I now trace to R9 again and can see the path taken is the OER selected best route.

trace 9.9.9.9
Type escape sequence to abort.
Tracing the route to 9.9.9.9

  1 10.10.10.1 28 msec 44 msec 32 msec
  2 12.1.1.5 36 msec 56 msec 32 msec
  3 20.0.0.9 [AS 2] 36 msec *  72 msec







Saturday, February 4, 2012

Learning Prefixes on the OER Master Controller (MC) - Part 2


I enable learning on the OER Master.

oer master
learn
throughput
delay

I examine the 'Learn Settings'.

R4#s oer master | b Learn Settings

Learn Settings:
  current state : SLEEP
  time remaining in current state : 6957 seconds
  throughput
  no delay
  no inside bgp
  no protocol
  monitor-period 5
  periodic-interval 120
  aggregation-type prefix-length 24
  prefixes 100
  expire after time 720

I notice the OER master is 'sleeping', i.e. not learning prefixes. By default the OER master learns for 5 minutes and then sleeps for 120 minutes! To make the OER master learn more often than the default settings allow, I adjust the monitor-period (the length of time the MC learns prefixes) and the periodic-interval (the time between monitoring periods).

oer master
learn
monitor-period 1
periodic-interval 0

I generate some traffic from the MC to R9

ip sla 1
 icmp-echo 9.9.9.9
 frequency 10
ip sla schedule 1 life forever start-time now
ip sla 2
 tcp-connect 9.9.9.9 23 source-ip 1.1.1.1
 timeout 500
 frequency 2
ip sla schedule 2 life forever start-time now

On R9
ip sla responder

I now check the MC is learning about this prefix

R4#show oer master prefix learned throughput
OER Prefix Statistics:
 Pas - Passive, Act - Active, S - Short term, L - Long term, Dly - Delay (ms),
 P - Percentage below threshold, Jit - Jitter (ms),
 MOS - Mean Opinion Score
 Los - Packet Loss (packets-per-million), Un - Unreachable (flows-per-million),
 E - Egress, I - Ingress, Bw - Bandwidth (kbps), N - Not applicable
 U - unknown, * - uncontrolled, + - control more specific, @ - active probe all
 # - Prefix monitor mode is Special, & - Blackholed Prefix
 % - Force Next-Hop, ^ - Prefix is denied

Prefix                  State     Time Curr BR         CurrI/F         Protocol
                      PasSDly  PasLDly   PasSUn   PasLUn  PasSLos  PasLLos
                      ActSDly  ActLDly   ActSUn   ActLUn      EBw      IBw
                      ActSJit  ActPMOS  ActSLos  ActLLos
--------------------------------------------------------------------------------
9.9.9.9/32             INPOLICY*        0 4.4.4.4         Fa0/1           U    
                             123      127        0        0        0        0
                               U       75        0        0        1        1
                               N        N

My next post is on how to get the MC to choose the best route.

Wednesday, February 1, 2012

Optimised Edge Routing / Performance Routing - Part 1

Here is a simple OER/PFR configuration example. R1 is the master controller with R4 and R5 the border routers. 


Border router 1

 key chain OER
 key 1
   key-string CISCO


oer border
 local Loopback0
 master 1.1.1.1 key-chain OER


Border router 2

 key chain OER
 key 1
   key-string CISCO



oer border
local Loopback0
master 1.1.1.1 key-chain OER


Master router

 oer master
 border 4.4.4.4 key-chain OER
  interface FastEthernet0/1 external
  interface FastEthernet0/0 internal

 border 5.5.5.5 key-chain OER
   interface FastEthernet0/1 external
   interface FastEthernet0/0 internal

 Verification Commands


R1#s oer master border
Border           Status   UP/DOWN             AuthFail  Version
5.5.5.5          ACTIVE   UP       00:13:00          0  2.1
4.4.4.4          ACTIVE   UP       00:13:32          0  2.1


R1#show oer border
OER BR 5.5.5.5 ACTIVE, MC 1.1.1.1 UP/DOWN: UP 00:22:18,
  Auth Failures: 0
  Conn Status: SUCCESS, PORT: 3949
  Version: 2.1  MC Version: 2.1
  Fa0/0           INTERNAL
  Fa0/1           EXTERNAL