Thursday, July 23, 2009

MPLS - BGP L3 VPN




With CCIE version 4 coming round the corner in October I thought I would turn my attention to some of the new topics on the syllabus. Here I look at MPLS VPNs, and in this post I configure an MPLS L3 VPN.

There are 3 customers: A, B and C. These are connected across the shared MPLS infrastructure. The goal is to allow each customer to see its partner sites' routes, and those routes only, across the MPLS cloud.

In this post I do not plan to look at the detailed workings of MPLS VPNs but rather just detail the steps necessary to build and configure one.

In the MPLS cloud, BGP peering to the customer sites is implemented. The IGP routing protocol in the PE network is OSPF. The config I used to achieve this is laid out below.
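
The provider-core addressing and IGP config are not shown in this post. As a minimal sketch only, assuming PE1 uses Loopback0 100.100.100.100 (the BGP router ID seen in the verification output further down) and OSPF process 1, the core config on PE1 might look something like:

interface Loopback0
ip address 100.100.100.100 255.255.255.255
!
router ospf 1
network 13.0.0.0 0.0.0.255 area 0
network 100.100.100.100 0.0.0.0 area 0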

The first step is to define the customer VRFs, and I apply the following config on each of the PE routers.

ip vrf CUSTA
rd 1:1
route-target export 1:1
route-target import 1:1
!
ip vrf CUSTB
rd 2:2
route-target export 2:2
route-target import 2:2
!
ip vrf CUSTC
rd 3:3
route-target export 3:3
route-target import 3:3


The second step is to apply the VRF config to the customer-facing interfaces on the PE routers. At each step I verify my config with the show ip vrf command.


PE1(config-if)#do siib
Interface IP-Address OK? Method Status Protocol
FastEthernet0/0 13.0.0.1 YES NVRAM up up
Serial2/0 10.0.0.2 YES manual up up
Serial2/1 10.0.0.6 YES NVRAM up up
Serial2/2 10.0.0.10 YES NVRAM up up


PE1(config)#int s2/0
PE1(config-if)#ip vrf forwarding CUSTA
% Interface Serial2/0 IP address 10.0.0.2 removed due to enabling VRF CUSTA
PE1(config-if)#ip address 10.0.0.2 255.255.255.252
PE1(config)#int s2/1
PE1(config-if)#ip vrf forwarding CUSTB
% Interface Serial2/1 IP address 10.0.0.6 removed due to enabling VRF CUSTB
PE1(config-if)#ip address 10.0.0.6 255.255.255.252
PE1(config-if)#int s2/2
PE1(config-if)#ip vrf forwarding CUSTC
% Interface Serial2/2 IP address 10.0.0.10 removed due to enabling VRF CUSTC
PE1(config-if)#ip address 10.0.0.10 255.255.255.252


PE1#s ip vrf
Name Default RD Interfaces
CUSTA 1:1 Se2/0
CUSTB 2:2 Se2/1
CUSTC 3:3 Se2/2


PE2#siib
Interface IP-Address OK? Method Status Protocol
FastEthernet0/0 13.0.0.2 YES NVRAM up up
Serial2/0 12.0.0.2 YES NVRAM up up
Serial2/1 12.0.0.6 YES NVRAM up up


PE2(config)#int s2/0
PE2(config-if)#ip vrf for
PE2(config-if)#ip vrf forwarding CUSTA
% Interface Serial2/0 IP address 12.0.0.2 removed due to enabling VRF CUSTA
PE2(config-if)#ip address 12.0.0.2 255.255.255.252
PE2(config-if)#int s2/1
PE2(config-if)#ip vrf forwarding CUSTC
% Interface Serial2/1 IP address 12.0.0.6 removed due to enabling VRF CUSTC
PE2(config-if)#ip address 12.0.0.6 255.255.255.252

PE2(config-if)#do s ip vrf
Name Default RD Interfaces
CUSTA 1:1 Se2/0
CUSTB 2:2
CUSTC 3:3 Se2/1


PE3#siib
Interface IP-Address OK? Method Status Protocol
FastEthernet0/0 13.0.0.3 YES manual up up
Serial2/0 11.0.0.2 YES NVRAM up up
Serial2/1 11.0.0.6 YES NVRAM up up

PE3(config-vrf)#int s2/0
PE3(config-if)#ip vrf forwarding CUSTB
% Interface Serial2/0 IP address 11.0.0.2 removed due to enabling VRF CUSTB
PE3(config-if)#ip address 11.0.0.2 255.255.255.252
PE3(config-if)#int s2/1
PE3(config-if)#ip vrf forwarding CUSTC
% Interface Serial2/1 IP address 11.0.0.6 removed due to enabling VRF CUSTC
PE3(config-if)#ip address 11.0.0.6 255.255.255.252

PE3(config-if)#do s ip vrf
Name Default RD Interfaces
CUSTA 1:1
CUSTB 2:2 Se2/0
CUSTC 3:3 Se2/1


The 3rd step is to configure the PE to CE BGP adjacencies. N.B. The CE to PE side is standard BGP config and I do not detail it here (a rough sketch of what it might look like follows the PE1 config below). To keep the output down I have detailed the config required on PE1 only, as the config on the other PE routers is very similar.

router bgp 1000
no bgp default ipv4-unicast
bgp log-neighbor-changes
neighbor 10.0.0.1 remote-as 1
neighbor 10.0.0.5 remote-as 2
neighbor 10.0.0.9 remote-as 3

address-family ipv4 vrf CUSTC
neighbor 10.0.0.9 remote-as 3
neighbor 10.0.0.9 activate

address-family ipv4 vrf CUSTB
neighbor 10.0.0.5 remote-as 2
neighbor 10.0.0.5 activate

address-family ipv4 vrf CUSTA
neighbor 10.0.0.1 remote-as 1
neighbor 10.0.0.1 activate
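
As a rough sketch of the CE side (addressing and AS number taken from the outputs in this post; the '?' origin codes in the later BGP tables suggest the customer routes are redistributed into BGP), CUSTA1 might run something like:

router bgp 1
neighbor 10.0.0.2 remote-as 1000
redistribute connected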


The 4th step is to configure the PE to PE VPNv4 adjacencies. Again I have detailed the PE1 config only here.

PE1
router bgp 1000
neighbor 13.0.0.2 remote-as 1000
neighbor 13.0.0.3 remote-as 1000
!
address-family vpnv4
neighbor 13.0.0.2 activate
neighbor 13.0.0.2 send-community extended
neighbor 13.0.0.3 activate
neighbor 13.0.0.3 send-community extended
exit-address-family


The 5th step is to enable MPLS in the provider network.
On PE1, PE2 and PE3

conf t
mpls ip
int fa0/0
mpls ip
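
Before moving on to verification of the VPN routes, label distribution itself can be sanity-checked on the PE routers. I have not included the output here, but the usual commands are:

show mpls interfaces
show mpls ldp neighbor
show mpls forwarding-table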



For verification

PE1#s ip bgp vpnv4 all sum | beg Neigh
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
10.0.0.1 4 1 46 49 24 0 0 00:17:01 2
10.0.0.5 4 2 6 6 24 0 0 00:01:33 2
10.0.0.9 4 3 5 4 19 0 0 00:00:43 2
101.101.101.101 4 1000 30 32 24 0 0 00:17:31 2
102.102.102.102 4 1000 31 34 24 0 0 00:17:19 2

PE1#s ip bgp vpnv4 *
BGP table version is 30, local router ID is 100.100.100.100
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network Next Hop Metric LocPrf Weight Path
Route Distinguisher: 1:1 (default for vrf CUSTA)
*> 1.1.1.0/24 10.0.0.1 0 0 1 ?
r> 10.0.0.0/30 10.0.0.1 0 0 1 ?
*>i12.0.0.0/30 101.101.101.101 0 100 0 7 ?
*>i102.102.102.0/24 101.101.101.101 0 100 0 7 ?
Route Distinguisher: 2:2 (default for vrf CUSTB)
*> 2.2.2.0/24 10.0.0.5 0 0 2 ?
*>i4.4.4.0/24 102.102.102.102 0 100 0 4 ?
r> 10.0.0.4/30 10.0.0.5 0 0 2 ?
*>i11.0.0.0/30 102.102.102.102 0 100 0 4 ?
Route Distinguisher: 3:3 (default for vrf CUSTC)
*> 3.3.3.0/24 10.0.0.9 0 0 3 ?
*>i5.5.5.0/24 102.102.102.102 0 100 0 5 ?
*>i6.6.6.0/24 101.101.101.101 0 100 0 6 ?
r> 10.0.0.8/30 10.0.0.9 0 0 3 ?
*>i11.0.0.4/30 102.102.102.102 0 100 0 5 ?
*>i12.0.0.4/30 101.101.101.101 0 100 0 6 ?



Finally I examine the routing tables on peer customer sites to check that routes have been shared. Here I dump the Customer A routing tables.

CUSTA1>s ip route

1.0.0.0/24 is subnetted, 1 subnets
C 1.1.1.0 is directly connected, Loopback0
7.0.0.0/24 is subnetted, 1 subnets
B 7.7.7.0 [20/0] via 10.0.0.2, 00:30:23
10.0.0.0/30 is subnetted, 1 subnets
C 10.0.0.0 is directly connected, Serial2/0
12.0.0.0/30 is subnetted, 1 subnets
B 12.0.0.0 [20/0] via 10.0.0.2, 00:37:16

CUSTA2#sir
1.0.0.0/24 is subnetted, 1 subnets
B 1.1.1.0 [20/0] via 12.0.0.2, 00:37:54
7.0.0.0/24 is subnetted, 1 subnets
C 7.7.7.0 is directly connected, Loopback0
10.0.0.0/30 is subnetted, 1 subnets
B 10.0.0.0 [20/0] via 12.0.0.2, 00:37:54
12.0.0.0/30 is subnetted, 1 subnets
C 12.0.0.0 is directly connected, Serial2/0


I ping across the cloud

For Customer C
CUSTC3>ping 3.3.3.3

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 3.3.3.3, timeout is 2 seconds:
.!!!!
Success rate is 80 percent (4/5), round-trip min/avg/max = 544/1161/1684 ms
CUSTC3>ping 6.6.6.6

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 6.6.6.6, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 624/1000/1700 ms

For Customer B
CUSTB2>p 2.2.2.2

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2.2.2.2, timeout is 2 seconds:
.!!!!
Success rate is 80 percent (4/5), round-trip min/avg/max = 692/1121/1388 ms

For Customer A
CUSTA1#ping 7.7.7.7

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 7.7.7.7, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 792/1004/1228 ms

Monday, July 20, 2009

MPLS Troubleshooting

MPLS Basic Troubleshooting

1) Is CEF enabled? This is a pre-requisite for MPLS.

show ip cef summary

On Router 1 where CEF is explicitly disabled....

Router_1#s ip cef sum
IP CEF without switching (Table Version 15), flags=0x0
4294967293 routes, 0 reresolve, 0 unresolved (0 old, 0 new), peak 0
0 leaves, 0 nodes, 0 bytes, 18 inserts, 18 invalidations
0 load sharing elements, 0 bytes, 0 references
universal per-destination load sharing algorithm, id 1D530DA6
2(1) CEF resets, 0 revisions of existing leaves
Resolution Timer: Exponential (currently 1s, peak 1s)
0 in-place/0 aborted modifications
refcounts: 0 leaf, 0 node

Table epoch: 0

%CEF not running

On Router 2 with cef globally enabled....

Router_2#show ip cef summ
IP CEF with switching (Table Version 11), flags=0x0
10 routes, 0 reresolve, 0 unresolved (0 old, 0 new), peak 0
13 leaves, 14 nodes, 16328 bytes, 14 inserts, 1 invalidations
0 load sharing elements, 0 bytes, 0 references
universal per-destination load sharing algorithm, id 0CFD412D
2(0) CEF resets, 0 revisions of existing leaves
Resolution Timer: Exponential (currently 1s, peak 1s)
0 in-place/0 aborted modifications
refcounts: 3851 leaf, 3840 node

Table epoch: 0 (13 entries at this epoch)

Adjacency Table has 1 adjacency


2) Is MPLS enabled?

On Router 1 where MPLS is NOT enabled....

Router_1#show mpls for
Tag switching is not operational.
CEF or tag switching has not been enabled.
No TFIB currently allocated.

and on Router 3 where MPLS is enabled....

Router_3#show mpls forwarding-table
Local Outgoing Prefix Bytes tag Outgoing Next Hop
tag tag or VC or Tunnel Id switched interface
16 Untagged 150.1.4.4/32 0 Fa0/0 192.168.34.4
17 Untagged 192.168.45.0/24 0 Fa0/0 192.168.34.4
18 Untagged 150.1.1.0/24[V] 0 Se2/1 point2point
19 Aggregate 192.168.13.0/24[V] \
0



3) Is MPLS enabled on all required interfaces?

Normally MPLS will be enabled on all links between core routers and disabled on links connecting to insecure devices.

Router_3#show mpls int
Interface IP Tunnel Operational
FastEthernet0/0 Yes (tdp) No Yes




Thursday, July 16, 2009

QOS - MQC shape average vs shape peak

The operation of shape peak is almost exactly the same as shape average: it calculates the default Bc in the same manner, except that each interval it also gets to fill up the Be bucket. With the shape average command the excess burst is only sent if the Bc bucket is already full, i.e. after periods of inactivity.

If a network has additional bandwidth available (over the provisioned CIR) and the application can tolerate occasional packet loss, the extra bandwidth can be exploited through the use of peak rate shaping. There may be occasional packet drops when network congestion occurs.

If the traffic being sent to the network must strictly conform to the configured network provisioned CIR, then use average traffic shaping.

If you had: shape peak 8000

We get a default Tc of 1/8th of a second, which gives us a Bc of 1000 bits and a Be of the same value. So the sending rate equals Bc + Be per Tc, or 16000 bps. The rate equation is as follows:

peak rate = CIR * (1 + Be/Bc)
peak rate = 8000 * (1 + 1000/1000) = 16000
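
Working through where the Bc of 1000 comes from:

bc = CIR x tc = 8000 bps x 0.125 s = 1000 bits
be = bc = 1000 bits
sending rate = (bc + be) / tc = 2000 bits / 0.125 s = 16000 bps

This ties up with the Sustain, Excess and Interval columns in the show policy-map output below.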


In summary

shape average 8000 equates to a traffic flow of 8000 bps.
shape peak 8000 equates to a traffic flow of 16000 bps.

R1
policy-map RICH
class class-default
shape average 8000 1000 1000

int fa0/0
service-policy output RICH


show policy-map int fa0/0
Service-policy output: RICH

Class-map: class-default (match-any)
86 packets, 7295 bytes
5 minute offered rate 0 bps, drop rate 0 bps
Match: any
Traffic Shaping
Target/Average Byte Sustain Excess Interval Increment
Rate Limit bits/int bits/int (ms) (bytes)
8000/8000 250 1000 1000 125 125

Adapt Queue Packets Bytes Packets Bytes Shaping
Active Depth Delayed Delayed Active
- 0 86 7295 0 0 no

R1
policy-map RICH
class class-default
shape peak 8000 1000 1000

int fa0/0
service-policy output RICH



show policy-map int fa0/0
Service-policy output: RICH

Class-map: class-default (match-any)
159 packets, 13622 bytes
5 minute offered rate 0 bps, drop rate 0 bps
Match: any
Traffic Shaping
Target/Average Byte Sustain Excess Interval Increment
Rate Limit bits/int bits/int (ms) (bytes)
16000/8000 250 1000 1000 125 250

Adapt Queue Packets Bytes Packets Bytes Shaping
Active Depth Delayed Delayed Active
- 0 4 298 0 0 no

Monday, July 13, 2009

QOS - MQC Policer


The lab requirement here is to meter incoming HTTP traffic. When the traffic rate is less than 256kbps packets should be marked with precedence 4, and when the traffic exceeds 256kbps the traffic should be marked with precedence 0. The normal burst duration is 100 ms and an excess burst of 100 ms should be allowed. Traffic exceeding these parameters should be dropped.

With the policing config the traffic rate is configured in bps whereas the burst size is configured in bytes. For a burst duration of 100 ms the burst size is calculated as follows: 256000 / 10 / 8 = 3200 bytes.

I apply the configuration on R1 as follows

R1
class-map HTTP
match protocol http

policy-map POLICE
class HTTP
police 256000 bc 3200 be 3200 conform-action set-prec-transmit 4 exceed-action set-prec-transmit 0 violate-action drop

int fa0/0
service-policy input POLICE



Verification


Router_1#show policy-map int fa0/0
FastEthernet0/0

Service-policy input: POLICE

Class-map: HTTP (match-all)
0 packets, 0 bytes
5 minute offered rate 0 bps, drop rate 0 bps
Match: protocol http
police:
cir 256000 bps, bc 3200 bytes, be 3200 bytes
conformed 0 packets, 0 bytes; actions:
set-prec-transmit 4
exceeded 0 packets, 0 bytes; actions:
set-prec-transmit 0
violated 0 packets, 0 bytes; actions:
drop
conformed 0 bps, exceed 0 bps, violate 0 bps

Class-map: class-default (match-any)
0 packets, 0 bytes
5 minute offered rate 0 bps, drop rate 0 bps
Match: any


A further addendum to this post is the ability to police individual traffic flows inside a pre-existing policer!

For example, R1 is on a LAN segment connected to R6 and R4. A further requirement might be that traffic flows from these routers should only be able to consume half of the available bandwidth i.e. 128kbps each. This can be achieved by nesting policers as follows.

ip access-list extended R4
permit ip host 155.1.146.4 any
ip access-list extended R6
permit ip host 155.1.146.6 any

class-map R4
match access-group name R4
class-map R6
match access-group name R6

policy-map POLICE2
class R4
police 128000 1600 1600 conform-action set-prec-transmit 4 exceed-action set-prec-transmit 0 violate-action drop
class R6
police 128000 1600 1600 conform-action set-prec-transmit 4 exceed-action set-prec-transmit 0 violate-action drop

policy-map POLICE
class HTTP
police 256000 bc 3200 be 3200 conform-action transmit exceed-action set-prec-transmit 0 violate-action drop
service-policy POLICE2





Verification

Router_1#s policy-map int fa0/0
FastEthernet0/0

Service-policy input: POLICE

Class-map: HTTP (match-all)
0 packets, 0 bytes
5 minute offered rate 0 bps, drop rate 0 bps
Match: protocol http
police:
cir 256000 bps, bc 3200 bytes, be 3200 bytes
conformed 0 packets, 0 bytes; actions:
set-prec-transmit 4
transmit
exceeded 0 packets, 0 bytes; actions:
set-prec-transmit 0
violated 0 packets, 0 bytes; actions:
drop
conformed 0 bps, exceed 0 bps, violate 0 bps

Service-policy : POLICE2

Class-map: R4 (match-all)
0 packets, 0 bytes
5 minute offered rate 0 bps, drop rate 0 bps
Match: access-group name R4
police:
cir 128000 bps, bc 1600 bytes, be 1600 bytes
conformed 0 packets, 0 bytes; actions:
set-prec-transmit 4
exceeded 0 packets, 0 bytes; actions:
set-prec-transmit 0
violated 0 packets, 0 bytes; actions:
drop
conformed 0 bps, exceed 0 bps, violate 0 bps

Class-map: R6 (match-all)
0 packets, 0 bytes
5 minute offered rate 0 bps, drop rate 0 bps
Match: access-group name R6
police:
cir 128000 bps, bc 1600 bytes, be 1600 bytes
conformed 0 packets, 0 bytes; actions:
set-prec-transmit 4
exceeded 0 packets, 0 bytes; actions:
set-prec-transmit 0
violated 0 packets, 0 bytes; actions:
drop
conformed 0 bps, exceed 0 bps, violate 0 bps

Class-map: class-default (match-any)
0 packets, 0 bytes
5 minute offered rate 0 bps, drop rate 0 bps
Match: any

Class-map: class-default (match-any)
0 packets, 0 bytes
5 minute offered rate 0 bps, drop rate 0 bps
Match: any

Friday, July 10, 2009

QOS - FRTS PVC Priority Queue

Within Frame Relay it is possible to prioritise one VC's traffic over and above another's.
A priority queue for PVCs!!


map-class frame-relay DLCI_503
frame-relay interface-queue priority high
map-class frame-relay DLCI_502
frame-relay interface-queue priority medium
map-class frame-relay DEFAULT
frame-relay interface-queue priority low

int s2/0
frame-relay interface-queue priority
frame-relay class DEFAULT
frame-relay interface-dlci 503
class DLCI_503
frame-relay interface-dlci 502
class DLCI_502
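
I don't include verification output here, but the DLCI-to-priority mapping and the interface queueing strategy could be checked with, for example:

show frame-relay pvc 503
show interface s2/0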

Thursday, July 9, 2009

QOS - FRTS custom queue

Here I define a Frame Relay custom queue on dlci 502 on R5.

WWW traffic is defined to use 50% of the bandwidth, telnet traffic 30%, and everything else uses the default queue with a 20% share of the bandwidth. The byte-counts below (500/300/200) implement this ratio, since custom queueing services each queue in proportion to its configured byte count.

Additionally the queue size is set to 40 packets for the queues in use i.e. 1-3.

queue-list 2 protocol ip 1 tcp www
queue-list 2 protocol ip 2 tcp telnet
queue-list 2 default 3
queue-list 2 queue 1 byte-count 500 limit 40
queue-list 2 queue 2 byte-count 300 limit 40
queue-list 2 queue 3 byte-count 200 limit 40


map-class frame-relay DLCI_502
frame-relay cir 128000
frame-relay bc 1280
frame-relay be 0
frame-relay custom-queue-list 2


For verification I use the show traffic-shape queue command and the show frame pvc 502 command.

Router_5#show traffic-shape queue
Traffic queued in shaping queue on Serial2/0 dlci 501
Queueing strategy: fcfs
Traffic queued in shaping queue on Serial2/0 dlci 504
Queueing strategy: fcfs
Traffic queued in shaping queue on Serial2/0 dlci 503
Queueing strategy: fcfs
Traffic queued in shaping queue on Serial2/0 dlci 502
Queueing strategy: custom-queue list 2


Router_5#show frame pvc 502

PVC Statistics for interface Serial2/0 (Frame Relay DTE)

DLCI = 502, DLCI USAGE = LOCAL, PVC STATUS = ACTIVE, INTERFACE = Serial2/0

input pkts 48 output pkts 67 in bytes 1548
out bytes 2227 dropped pkts 0 in pkts dropped 0
out pkts dropped 0 out bytes dropped 0
in FECN pkts 0 in BECN pkts 0 out FECN pkts 0
out BECN pkts 0 in DE pkts 0 out DE pkts 0
out bcast pkts 0 out bcast bytes 0
5 minute input rate 0 bits/sec, 0 packets/sec
5 minute output rate 0 bits/sec, 0 packets/sec
pvc create time 00:19:07, last time pvc status changed 00:17:28
cir 128000 bc 1280 be 0 byte limit 160 interval 10
mincir 64000 byte increment 160 Adaptive Shaping none
pkts 79 bytes 2371 pkts delayed 0 bytes delayed 0
shaping inactive
traffic shaping drops 0
Queueing strategy: custom-list 2

List Queue Args
2 3 default

List Queue Args
2 1 protocol ip tcp port www
2 2 protocol ip tcp port telnet
2 1 byte-count 500 limit 40
2 2 byte-count 300 limit 40
2 3 byte-count 200 limit 40
Output queues: (queue #: size/max/drops/dequeued)
0: 0/20/0/0 1: 0/40/0/0 2: 0/40/0/0 3: 0/40/0/0 4: 0/20/0/0
5: 0/20/0/0 6: 0/20/0/0 7: 0/20/0/0 8: 0/20/0/0 9: 0/20/0/0
10: 0/20/0/0 11: 0/20/0/0 12: 0/20/0/0 13: 0/20/0/0 14: 0/20/0/0
15: 0/20/0/0 16: 0/20/0/0

Wednesday, July 8, 2009

QOS - FRTS priority queue

Priority queueing can also be applied on a per VC basis. Configuration of the priority queues and flows is done in the standard manner.

access-list 150 permit tcp any any eq www
access-list 151 permit udp any any eq tftp
access-list 152 permit tcp any any eq cmd

priority-list 1 protocol ip high list 151
priority-list 1 protocol ip medium list 150
priority-list 1 protocol ip normal list 152
priority-list 1 default low


Applying the priority queueing is NOT done under the interface. Without thinking I tried this and received the following error.

Router_5(config)#int s2/0
Router_5(config-if)#priority-group 1
Cannot change interface queuing when Frame-Relay traffic-shaping is configured

For per VC priority queueing the priority queue must be applied to the map class used by the VC.

map-class frame-relay DLCI_503
frame-relay priority-group 1


This can then be verified using the show traffic-shape queue command and the show frame pvc 503 command.



Router_5#show traffic-shape queue
Traffic queued in shaping queue on Serial2/0 dlci 501
Queueing strategy: fcfs
Traffic queued in shaping queue on Serial2/0 dlci 504
Queueing strategy: fcfs
Traffic queued in shaping queue on Serial2/0 dlci 503
Queueing strategy: priority-group 1

Traffic queued in shaping queue on Serial2/0 dlci 502
Queueing strategy: fcfs



Router_5#show frame-relay pvc 503

PVC Statistics for interface Serial2/0 (Frame Relay DTE)

DLCI = 503, DLCI USAGE = LOCAL, PVC STATUS = ACTIVE, INTERFACE = Serial2/

input pkts 71 output pkts 68 in bytes 6372
out bytes 6608 dropped pkts 0 in pkts dropped 0
out pkts dropped 0 out bytes dropped 0
in FECN pkts 0 in BECN pkts 0 out FECN pkts 0
out BECN pkts 0 in DE pkts 0 out DE pkts 0
out bcast pkts 0 out bcast bytes 0
5 minute input rate 0 bits/sec, 0 packets/sec
5 minute output rate 0 bits/sec, 0 packets/sec
pvc create time 00:30:32, last time pvc status changed 00:28:53
cir 256000 bc 2560 be 0 byte limit 320 interval 10
mincir 128000 byte increment 320 Adaptive Shaping none
pkts 43 bytes 3952 pkts delayed 0 bytes delayed 0
shaping inactive
traffic shaping drops 0
Queueing strategy: priority-list 1

List Queue Args
1 low default

List Queue Args
1 high protocol ip list 151
1 medium protocol ip list 150
1 normal protocol ip list 152
Output queue: high 0/20/0, medium 0/40/0, normal 0/60/0, low 0/80/0

Tuesday, July 7, 2009

QOS - FRTS fair queue

By default the queue will be FIFO or fcfs (first come first served). This can be seen by executing the show traffic-shape queue command.

Router_3#show traffic-shape queue
Traffic queued in shaping queue on Serial2/0 dlci 305
Queueing strategy: fcfs


Effectively this queueing mechanism is a 'per interface' queue. Per-VC WFQ can be enabled using the frame-relay fair-queue command.

map-class frame-relay DLCI_305
frame-relay cir 384000
frame-relay bc 3840
frame-relay be 0
frame-relay mincir 256000
frame-relay adaptive-shaping becn
frame-relay adaptive-shaping interface-congestion
frame-relay fair-queue 16 32 0 512

The change can be verified using the show traffic-shape queue command. The parameters map to a congestive discard threshold of 16, 32 dynamic conversations, 0 reservable conversations and a maximum queue size of 512 packets, which ties up with the output below.

Router_3#show traffic-shape queue
Traffic queued in shaping queue on Serial2/0 dlci 305
Queueing strategy: weighted fair
Queueing Stats: 0/512/16/0 (size/max total/threshold/drops)
Conversations 0/0/32 (active/max active/max total)
Reserved Conversations 0/0 (allocated/max allocated)
Available Bandwidth 256 kilobits/sec

QOS - FRTS adaptive shaping

R3 ------- R5

The lab requirement is to allow router 3 to 'oversubscribe' the link to router 5 at 384k. The provider actually guarantees 256k.
In this situation the CIR is set to the 'oversubscribed' rate and the mincir is set to the provider's guaranteed rate.

Router 3
map-class frame-relay DLCI_305
frame-relay cir 384000
frame-relay mincir 256000
frame-relay bc 3840
frame-relay be 0
frame-relay adaptive-shaping becn
frame-relay adaptive-shaping interface-congestion


frame-relay adaptive-shaping becn allows the router to adjust its sending rate down to minCIR when BECNs are received from the provider cloud.

frame-relay fecn-adapt when set on the receiving router enables it to generate BECNs when a FECN is received from the provider cloud.

Router 5
map-class frame-relay DLCI_503
frame-relay fecn-adapt


Also in this scenario, the feature frame-relay adaptive-shaping interface-congestion enables the sending router to slow down transmission when the interface queue reaches its threshold.

QOS - FRTS

FRTS, or Frame Relay Traffic Shaping, was intended as a replacement for GTS. It allows a more granular approach to QOS, with shaping per VC.

Once traffic shaping is enabled on a physical interface, a default CIR of 56kbps and a Tc of 125ms applies. Configuration parameters are defined using the map-class frame-relay command and are then applied to the VC with the class command under frame-relay interface-dlci xxx.
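
Working those defaults through, the per-interval burst is:

Bc = CIR x Tc = 56000 bps x 0.125 s = 7000 bits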

In this example R3 and R5 are connected via a frame-relay circuit. The CIR is 256k, with bursts up to 384k permitted (with Bc 2560 and a 10ms interval, the Be of 1280 bits per interval provides the extra 128k). R5 must not overwhelm R3, i.e. it must conform to the CIR of R3.

Router 3

map-class frame-relay DLCI_305
frame-relay cir 256000
frame-relay bc 2560
frame-relay be 1280

interface Serial2/0
frame-relay traffic-shaping
frame-relay interface-dlci 305
class DLCI_305


Router 5

map-class frame-relay DLCI_503
frame-relay cir 256000
frame-relay bc 2560
frame-relay be 0

interface Serial2/0
frame-relay traffic-shaping
frame-relay interface-dlci 503
class DLCI_503


Once applied the configuration can be verified with the show traffic-shape command


Router_3#show traffic-shape

Interface Se2/0
Access Target Byte Sustain Excess Interval Increment Adapt
VC List Rate Limit bits/int bits/int (ms) (bytes) Active
305 256000 480 2560 1280 10 320 -



FRTS employs a three-tiered approach to queueing: the per-VC queues, then the main interface queue, followed by the physical interface transmit ring. These can all be adjusted as follows (a combined sketch follows the list):

per vc FIFO: frame-relay holdq
interface FIFO: hold-queue
transmit-ring: tx-ring-limit
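
A combined sketch of where each of these sits (the values are purely illustrative, and tx-ring-limit support depends on the platform):

interface Serial2/0
tx-ring-limit 3
hold-queue 60 out
frame-relay traffic-shaping
!
map-class frame-relay DLCI_305
frame-relay holdq 50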

Monday, July 6, 2009

QOS - CAR

CAR or Committed Access Rate

The lab topology and requirement in the post is the same as with the QOS - GTS post.
This time it is achieved using CAR.

TOPOLOGY: R2 ------ R3 ------ R1 ------ R4

The lab requirement is to use CAR to limit the traffic flow to R4.
Packets destined to R4's loopback (150.1.4.4) are allowed 16k and packets destined to R4's fa0/0 (155.1.148.4) are allowed only 8k.



Router_1
int fa0/0
rate-limit access-group 100 8000 1000 2000 conform-action continue exceed-action drop
rate-limit access-group 101 16000 1000 2000 conform-action continue exceed-action drop
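
The access groups referenced above are the same ACLs defined in the QOS - GTS post below; repeated here for reference:

access-list 100 permit ip any 155.1.148.0 0.0.0.255
access-list 101 permit ip any 150.1.4.0 0.0.0.255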


I execute pings from R3 and R2

Router_3#ping 155.1.148.4 size 4000 repeat 1000 timeout 1
Router_2#ping 155.1.148.4 size 4000 repeat 1000 timeout 1


I now verify the rate limiting on R1

Router 1
show int fa0/0 rate-limit

FastEthernet0/0
Output
matches: access-group 100
params: 8000 bps, 1500 limit, 2000 extended limit
conformed 59 packets, 65866 bytes; action: transmit
exceeded 136 packets, 199464 bytes; action: drop
last packet: 20ms ago, current burst: 1810 bytes
last cleared 00:01:04 ago, conformed 8000 bps, exceeded 24000 bps
matches: access-group 101
params: 16000 bps, 1500 limit, 2000 extended limit
conformed 95 packets, 130030 bytes; action: transmit
exceeded 100 packets, 135300 bytes; action: drop
last packet: 36ms ago, current burst: 1676 bytes
last cleared 00:01:04 ago, conformed 16000 bps, exceeded 16000 bps

QOS - GTS

TOPOLOGY: R2 ------ R3 ------ R1 ------ R4

GTS, or Generic Traffic Shaping, predates MQC.
The lab requirement here is to use GTS to limit the traffic flow to R4. Packets destined to R4's loopback (150.1.4.4) are allowed 16k and packets destined to R4's fa0/0 (155.1.148.4) are allowed only 8k.

I create two acls matching the required traffic flows on R1

access-list 100 permit ip any 155.1.148.0 0.0.0.255
access-list 101 permit ip any 150.1.4.0 0.0.0.255



I apply 2 traffic-shape commands to the fa0/0 interface on R1.

interface FastEthernet0/0
ip address 155.1.148.1 255.255.255.0
duplex half
traffic-shape group 100 8000 1000 2000 10
traffic-shape group 101 16000 1000 2000 10


I execute 2 pings: One from R3 to fa0/0 on R4, and one from R2 to lo0 on R4.

Router_3#ping 155.1.148.4 size 2000 repeat 1000 timeout 1
Router_2#ping 150.1.4.4 size 2000 repeat 1000 timeout 1


I then examine the traffic shaping with the following three commands on R1.

show traffic-shape
show traffic-shape statistics
show traffic-shape queue fa0/0



Router_1#show traffic-shape

Interface Fa0/0
Access Target Byte Sustain Excess Interval Increment Adapt
VC List Rate Limit bits/int bits/int (ms) (bytes) Active
- 100 8000 375 1000 2000 125 125 -
- 101 16000 375 1000 2000 62 125 -


Router_1#show traffic-shape queue fa0/0
Traffic queued in shaping queue on FastEthernet0/0
Traffic shape group: 100
Queueing strategy: weighted fair
Queueing Stats: 10/10/64/46 (size/max total/threshold/drops)
Conversations 1/1/16 (active/max active/max total)
Reserved Conversations 0/0 (allocated/max allocated)
Available Bandwidth 8 kilobits/sec

(depth/weight/total drops/no-buffer drops/interleaves) 10/4048/46/0/0
Conversation 1, linktype: ip, length: 1514
source: 155.1.13.3, destination: 155.1.148.4, id: 0x002B, ttl: 254, prot: 1


Traffic shape group: 101
Queueing strategy: weighted fair
Queueing Stats: 9/10/64/61 (size/max total/threshold/drops)
Conversations 1/1/16 (active/max active/max total)
Reserved Conversations 0/0 (allocated/max allocated)
Available Bandwidth 16 kilobits/sec

(depth/weight/total drops/no-buffer drops/interleaves) 9/4048/61/0/0
Conversation 5, linktype: ip, length: 1514
source: 155.1.23.2, destination: 150.1.4.4, id: 0x004E, ttl: 253, prot: 1


Lastly the show traffic-shape statistics command indicates the ping to R4 loopback is transmitting twice the amount of data in comparison to the ping to R4 fa0/0 interface.

Router_1#show traffic-shape stat
Acc. Queue Packets Bytes Packets Bytes Shaping
I/F List Depth Delayed Delayed Active
Fa0/0 100 10 98 130592 92 128508 yes
Fa0/0 101 10 277 329510 254 309188 yes



GTS can also be applied to the interface as a whole with the traffic-shape rate command. For example.

int fa0/0
traffic-shape rate 128000 8000 8000 100
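
As a quick sanity check on those values (the burst parameters are in bits):

Tc = Bc / CIR = 8000 / 128000 = 62.5 ms
peak per interval = (Bc + Be) / Tc = 16000 bits / 0.0625 s = 256000 bps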

Saturday, June 27, 2009

QOS - Hardware Queue


Here I look at QOS starting from the ground up.

First for traffic in the outbound direction. Each interface has a hardware queue also known as the tx-ring or transmit ring. This is always serviced FIFO.

The size of this queue can be viewed

Router_1#show controllers fa0/0 | inc tx_lim
tx_limited=0(256)

In this example the default size is 256 packets. This can be adjusted; here I reduce the size to 50 packets.

config-if#tx-ring-limit 50

If the hardware queue becomes full then the output software queue is used for buffering traffic. When configuring queueing mechanisms (e.g. PQ, CBWFQ, CQ etc.) it is the logic for emptying this software queue that is being adjusted.

The size of this queue can be seen using the standard show interface command. By default it has a size of 40 packets.

Router_1#show int fa0/0 | inc Output queue
Output queue: 0/40 (size/max)


The size of the queue can be adjusted using the following command

conf-if# hold-queue 20 out

N.B. The hold-queue size applies when default FIFO queueing is in use on the interface. When other queuing methods are in use this command does not apply and the software queue sizes are set by the relevant queuing commands.


Now Input queueing....

Packets in an inbound direction are immediately handled by the interface drivers, router cpu etc. If buffering is needed due to high throughput or router load then the input queue is used.

The size of this queue is 75 packets by default and this can be viewed using the show interface command.

Router_1#show int fa0/0 | inc Input queue
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0


This can be adjusted as follows

config-if#hold-queue 20 in

Saturday, June 20, 2009

PIM - SSM

PIM SSM, or Source Specific Multicast, contrary to PIM BiDir, does NOT require the PIM shared tree and does not use it. No RPs are required, and RP protocols such as Auto-RP or BSR are not needed. With SSM the SPT is always used.

Like BiDir, SSM configuration is pretty straightforward.

The range of multicast groups that use SSM signalling must be specified on all routers in the mcast domain.

To enable PIM SSM on the default range of 232.0.0.0/8:

#conf t
config#ip pim ssm range default
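
If a non-default group range were required instead, a standard ACL can be referenced (the ACL name and range here are just an illustration):

config#ip access-list st SSM_GROUPS
config-std-acl#permit 232.1.0.0 0.0.255.255
config-std-acl#exit
config#ip pim ssm range SSM_GROUPS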


Note that for groups in the SSM range no shared trees are allowed and any (*,G) joins will be dropped.

The final step is to enable IGMP version 3 on the receiver-facing interfaces.


For SW1 to join 232.8.8.8 for source 150.1.5.5:

config-if#ip igmp version 3
config-if#ip igmp join-group 232.8.8.8 source 150.1.5.5

PIM - BIDIR


Bidirectional PIM can be used when most receivers of mcast traffic are also senders at the same time. It is an extension to PIM sparse mode that only uses the shared tree for multicast distribution. Packets flow to and from the RP only.

It is relatively easy to configure, although the BiDir configuration example on the Cisco web site doesn't quite give the full picture, as it only shows configuration on a single router.

BiDir PIM must be enabled on all multicast routers, and the specified multicast groups need to be configured as BiDir. This can be done using static RP, Auto-RP or BSR.

I use the simple router topology SW1 ----- R3 ------ R5

On each router I enable BiDir PIM.

conf t
ip pim bidir-enable


The RP (R5 in this case) must specify which bidir groups it services.

ip access-list st 45
permit 238.0.0.0 0.255.255.255


For BSR
ip pim rp-candidate lo0 group-list 45 bidir

For AUTO-RP
ip pim send-rp-announce lo0 scope 16 group-list 45 bidir

For static RP
ip pim rp-address 150.1.5.5 45 bidir

On SW1 I join the BiDir mcast group 238.0.0.1

conf t
int fa0/0
ip igmp join-group 238.0.0.1


On R3 I examine the mroute table

Router_3#s ip mroute 238.0.0.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 238.0.0.1), 00:16:10/00:02:22, RP 150.1.5.5, flags: BC
Bidir-Upstream: Serial2/0, RPF nbr 155.1.0.5
Outgoing interface list:
FastEthernet0/0, Forward/Sparse, 00:15:52/00:02:34
Serial2/0, Bidir-Upstream/Sparse, 00:16:10/00:00:00


From R5 I verify the solution with a ping to multicast group 238.0.0.1

Router_5#ping 238.0.0.1

Type escape sequence to abort.
Sending 1, 100-byte ICMP Echos to 238.0.0.1, timeout is 2 seconds:

Reply to request 0 from 155.1.0.3, 88 ms
Reply to request 0 from 155.1.37.7, 228 ms
Reply to request 0 from 155.1.0.3, 88 ms

Monday, June 15, 2009

Multicast - Rate Limiting

SW1 -------- R3 ---------- R5

Here I make use of the multicast rate limiting function on R3 to control the amount of multicast traffic allowed to reach SW1.

First SW1 joins multicast groups 225.0.0.1 and 225.0.0.3

SW1
conf t
int fa0/0
ip igmp join-group 225.0.0.1
ip igmp join-group 225.0.0.3



On R3 the requirement is to limit the mcast traffic to 225.0.0.1 to 1k and 225.0.0.3 to 3k. The aggregate multicast traffic rate must not exceed 5k.

This requirement can be achieved via the multicast rate limit function. Multiple rate limit statements can be applied to an interface and they are processed in a linear top down fashion. Hence careful consideration must be given to the order of the statements as applied.

First I use ACLs to define the mcast groups

R3
ip access-list standard GROUP_1
permit 225.0.0.1
ip access-list standard GROUP_3
permit 225.0.0.3


Then I apply the rate limit function.

interface FastEthernet0/0
ip multicast rate-limit out group-list GROUP_1 1
ip multicast rate-limit out group-list GROUP_3 3
ip multicast rate-limit out 5


After applying this config, the mroute entries can be viewed and the bandwidth limits are shown

Router_3#s ip mroute 225.0.0.1
(*, 225.0.0.1), 00:13:28/00:02:33, RP 150.1.5.5, flags: SJC
Incoming interface: Serial2/0, RPF nbr 155.1.0.5
Outgoing interface list:
FastEthernet0/0, Forward/Sparse, 00:13:28/00:02:51, limit 1 kbps

Router_3#s ip mroute 225.0.0.3
(*, 225.0.0.3), 00:07:34/00:02:51, RP 150.1.5.5, flags: SJC
Incoming interface: Serial2/0, RPF nbr 155.1.0.5
Outgoing interface list:
FastEthernet0/0, Forward/Sparse, 00:07:34/00:02:51, limit 3 kbps


I then test the rate limiting via ping tests from R5. First I try a ping size that conforms to the bandwidth limit.

Router_5#pin 225.0.0.1 size 100 repeat 2

Type escape sequence to abort.
Sending 2, 100-byte ICMP Echos to 225.0.0.1, timeout is 2 seconds:

Reply to request 0 from 155.1.37.7, 184 ms
Reply to request 0 from 155.1.37.7, 184 ms
Reply to request 1 from 155.1.37.7, 212 ms
Reply to request 1 from 155.1.37.7, 212 ms


Now I try a ping with an increased data size that exceeds the 1 kbps limit and, as expected, the traffic is dropped.

Router_5#pin 225.0.0.1 size 200 repeat 2

Type escape sequence to abort.
Sending 2, 200-byte ICMP Echos to 225.0.0.1, timeout is 2 seconds:
..



I repeat the above test on the mcast group 225.0.0.3, which has a higher bandwidth limit of 3k.

Router_5#pin 225.0.0.3 size 360 repeat 2

Type escape sequence to abort.
Sending 2, 360-byte ICMP Echos to 225.0.0.3, timeout is 2 seconds:

Reply to request 0 from 155.1.37.7, 208 ms
Reply to request 0 from 155.1.37.7, 212 ms
Reply to request 1 from 155.1.37.7, 268 ms
Reply to request 1 from 155.1.37.7, 268 ms
Router_5#pin 225.0.0.3 size 400 repeat 2

Type escape sequence to abort.
Sending 2, 400-byte ICMP Echos to 225.0.0.3, timeout is 2 seconds:
..
Router_5#

Sunday, June 14, 2009

PIM - IGMP GROUP LEAVE TIMERS


If a host on the LAN leaves a group it sends an IGMP leave message (assuming IGMP version 2). Upon receipt, the elected IGMP querier will send out an IGMP last-member group query to ascertain whether there are still other hosts on the LAN segment that are members of the group. If no hosts reply after the query has been repeated, the IGMP querier router removes the (*,G) mroute.

The timers/counters involved in this exchange are

ip igmp last-member-query-count (default 2)

ip igmp last-member-query-interval (default 1000ms)
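
Both are interface-level commands. A sketch of setting them explicitly on a LAN interface (these values simply reproduce the defaults visible in the show ip igmp interface output in the elections post below):

config#int fa0/0
config-if#ip igmp last-member-query-count 2
config-if#ip igmp last-member-query-interval 1000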

PIM - IGMP QUERIER TIMEOUT


The router elected as IGMP querier will send out an IGMP query each configured interval. If non-querier routers on the same LAN segment do not hear any IGMP queries for a period of time, they will try to assume the IGMP querier role.

The timers used in this exchange are as follows

ip igmp query-interval (default 60 seconds)

ip igmp querier-timeout (default 120 seconds, i.e. 2 x the ip igmp query-interval)

PIM - IGMP GROUP QUERY TIMERS



The IGMP querier will send out an IGMP group query to check group membership on the connected LAN segment. Normally an IGMP group report response will be received. If no group report is received then the (*,G) mroute is removed.

In this message exchange the following timers can be influenced.

ip igmp query-interval (default 60 seconds)

ip igmp query-max-response-time (default 10 seconds).
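
Again these are set per interface; as a sketch, the non-default values of 20 and 4 seconds seen in the show ip igmp interface output in the elections post below would be set with:

config#int fa0/0
config-if#ip igmp query-interval 20
config-if#ip igmp query-max-response-time 4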

Saturday, June 13, 2009

PIM/IGMP Elections


On a shared LAN segment, amongst the PIM-enabled routers, a selected router must assume the responsibility for i) sending any PIM register/prune messages to the RP and ii) sending IGMP query messages.

I was until recently under the misunderstanding that the PIM DR router performed both of these functions - wrong!! These functions are completely decoupled and in fact they have a different election process and selection criteria.

First the Querier Election Process.
At start up each router sends a query message to the all systems group 224.0.0.1 from its own interface address. The router with the lowest ip address is elected IGMP querier.

Second the PIM DR Election Process
The router with the highest ip address is elected as PIM DR. This selection process can also be influenced by configuring a pim DR priority. By default all routers have priority 1, hence highest ip address wins by default. However if DR priority is used then highest DR priority wins.

The show ip igmp interface command can be used to show the elected DR and querier. Here 155.1.148.1 is elected querier (lowest IP address on the LAN segment) and 155.1.148.6 is elected DR (highest IP address on the LAN segment).

Router_1(config)#do s ip igmp int fa0/0
FastEthernet0/0 is up, line protocol is up
Internet address is 155.1.148.1/24
IGMP is enabled on interface
Current IGMP host version is 2
Current IGMP router version is 2
IGMP query interval is 20 seconds
IGMP querier timeout is 40 seconds
IGMP max query response time is 4 seconds
Last member query count is 2
Last member query response interval is 1000 ms
Inbound IGMP access group is not set
IGMP activity: 0 joins, 0 leaves
Multicast routing is enabled on interface
Multicast TTL threshold is 0
Multicast designated router (DR) is 155.1.148.6
IGMP querying router is 155.1.148.1 (this system)
No multicast groups joined by this system

Friday, June 12, 2009

PIM - BSR load balancing

With BSR, if multiple RPs are defined to service the same multicast groups then the load is distributed amongst these RPs. This is done using a hash algorithm based on the hash mask length defined on the candidate BSR. The longer the hash mask length, the more granular the group-to-RP assignment.

Based on the following config, the RP assignment can be examined on the routers.


R1
ip pim rp-candidate Lo0

R3
ip pim rp-candidate Lo0

R5
ip pim bsr-candidate Lo0 32


Here I examine the RP for groups 238.1.1.1 and 237.1.1.1. The BSR has assigned R3 (150.1.9.9) as the RP for 238.1.1.1 and R1 (150.1.7.7) as the RP for 237.1.1.1.


R1#show ip pim rp-hash 238.1.1.1
RP 150.1.9.9 (?), v2
Info source: 150.1.5.5 (?), via bootstrap, priority 0, holdtime 150
Uptime: 00:00:48, expires: 00:01:46
PIMv2 Hash Value (mask 255.255.255.255)
RP 150.1.7.7, via bootstrap, priority 0, hash value 377749190
RP 150.1.9.9, via bootstrap, priority 0, hash value 1884030652
R1#show ip pim rp-hash 237.1.1.1
RP 150.1.7.7 (?), v2
Info source: 150.1.5.5 (?), via bootstrap, priority 0, holdtime 150
Uptime: 00:01:12, expires: 00:01:39
PIMv2 Hash Value (mask 255.255.255.255)
RP 150.1.7.7, via bootstrap, priority 0, hash value 1501822662
RP 150.1.9.9, via bootstrap, priority 0, hash value 860620476

PIM - BSR


PIM BSR (Bootstrap Router) - the basics

The BSR mechanism is a non-proprietary method of defining RPs that can be used with third-party routers. No configuration is necessary on each router separately (except on candidate-BSRs and candidate-RPs). The candidate-RPs are analogous to Auto-RP candidate RPs, and the candidate-BSRs are analogous to the Auto-RP mapping agent.

These can be defined as follows.

R1
ip pim rp-candidate Loopback0

R3
ip pim rp-candidate Loopback0

R5
ip pim bsr-candidate Loopback0 31


Router_5#show ip pim rp-hash 224.1.1.1
RP 150.1.7.7 (?), v2
Info source: 155.1.37.7 (?), via bootstrap, priority 0, holdtime 150
Uptime: 00:15:07, expires: 00:02:21
PIMv2 Hash Value (mask 255.255.255.254)
RP 150.1.7.7, via bootstrap, priority 0, hash value 1852227743
RP 150.1.9.9, via bootstrap, priority 0, hash value 800581801

Thursday, June 11, 2009

PIM - Multicast Boundary

The multicast boundary feature, when used with a standard ACL, can filter (S,G) and (*,G) join messages towards the RP as well as filter mcast traffic destined to a multicast group. Note it does not filter PIM register messages, as these are sent as unicast messages from the PIM DR to the PIM RP.

Consider the following configuration

access-list 5 deny 232.0.0.0 7.255.255.255
access-list 5 permit 224.0.0.0 15.255.255.255

int fa0/0
ip multicast boundary 5 filter-autorp



This command will filter multicast traffic for the range 232.0.0.0/5. This includes any traffic in this range plus any ranges that overlap with this range.

The addition of the filter-autorp keyword ensures the filtering is applied to Auto-RP announcements as well as multicast traffic.

For example the downstream switch SW2 was receiving announcements for the following groups.

SW2#s ip pim rp map
PIM Group-to-RP Mappings

Group(s) 224.0.0.0/5
RP 150.1.7.7 (?), v2v1
Info source: 150.1.1.1 (?), elected via Auto-RP
Uptime: 00:19:45, expires: 00:02:33
Group(s) 224.0.0.0/4
RP 150.1.9.9 (?), v2v1
Info source: 150.1.1.1 (?), elected via Auto-RP
Uptime: 00:19:15, expires: 00:01:34
Group(s) (-)224.50.50.50/32
RP 150.1.9.9 (?), v2v1
Info source: 150.1.1.1 (?), elected via Auto-RP
Uptime: 00:19:15, expires: 00:02:36
Group(s) 232.0.0.0/5
RP 150.1.9.9 (?), v2v1
Info source: 150.1.1.1 (?), elected via Auto-RP
Uptime: 00:19:15, expires: 00:01:37


After the multicast boundary statement is applied on the upstream neighbour, RP announcements for 232.0.0.0/5 and 224.0.0.0/4 are both removed.

SW2#s ip pim rp map
PIM Group-to-RP Mappings

Group(s) 224.0.0.0/5
RP 150.1.7.7 (?), v2v1
Info source: 150.1.1.1 (?), elected via Auto-RP
Uptime: 00:23:58, expires: 00:02:20
Group(s) (-)224.50.50.50/32
RP 150.1.9.9 (?), v2v1
Info source: 150.1.1.1 (?), elected via Auto-RP

Wednesday, June 10, 2009

PIM - MA Placement in a Frame Relay network




When placing the Mapping Agent in a frame relay hub and spoke environment always aim to locate it at the hub or behind it.

PIM by default will assume NBMA interfaces are broadcast capable. However, by default, when a spoke sends a multicast message the hub will not replicate this to other spokes, obeying the split horizon rule. This can in part be solved by the placement of the ‘ip pim nbma-mode’ command on the hub. It is worth noting however that this only fixes sparse-mode traffic. Dense mode traffic will NOT be replicated.

Hence this poses a problem for Auto-RP information, which uses dense mode for mcast groups 224.0.1.39 and 224.0.1.40.

If the RP and mapping agent are placed on a spoke then Auto-RP messages will only reach the hub node. If the mapping agent is on the hub then RPs could be located on the spokes, as long as the announcements reach the hub.

There are a couple of resolutions to this problem. First use sub-interfaces on the hub, or secondly create multicast enabled tunnels between the spokes.

The tunnel config for the spokes is shown here.

Router_1#
interface Tunnel0
ip address 155.1.20.20 255.255.255.0
ip pim sparse-mode
tunnel source Loopback1
tunnel destination 150.1.3.3
tunnel mode ipip

Router_3#
interface Tunnel0
ip address 155.1.20.21 255.255.255.0
ip pim sparse-mode
tunnel source Loopback0
tunnel destination 150.1.1.1
tunnel mode ipip


Another caveat is that if the tunnel is not included in the IGP routing then static multicast routes will be required pointing at the tunnel to ensure RPF checks don’t fail.

R1
ip mroute 150.1.1.1 255.255.255.255 tu0


Note: the problem with the dissemination of traffic to mcast group 224.0.1.40 can be seen on the hub (R5), as the frame relay serial interface S2/0 is missing from the outgoing interface list (OIL).


(150.1.1.1, 224.0.1.40), 01:14:19/00:02:34, flags: LT
Incoming interface: Serial2/0, RPF nbr 155.1.0.1
Outgoing interface list:
Loopback0, Forward/Sparse, 01:14:19/00:00:00
Serial2/1, Forward/Sparse, 01:14:19/00:00:00
FastEthernet0/0, Forward/Sparse, 01:14:19/00:00:00

PIM misc - dense mode reqd in sparse-dense region

Suppose the lab required that one mcast range ONLY operate in dense mode, whereas the rest of the domain should operate in sparse mode?

This can be achieved by making use of the 'deny' statement in the ACL used to denote the mcast groups serviced by the candidate RP.

SW3#s access-list 11
Standard IP access list 11
40 deny 224.50.50.50
20 permit 232.0.0.0, wildcard bits 7.255.255.255
30 permit 224.0.0.0, wildcard bits 15.255.255.255

When examining the RP mapping, the 'denied' range will be shown with a minus sign.

Group(s) (-)224.50.50.50/32
RP 150.1.9.9 (?), v2v1
Info source: 150.1.9.9 (?), elected via Auto-RP
Uptime: 00:01:31, expires: 00:02:27
RP 150.1.7.7 (?), v2v1
Info source: 150.1.7.7 (?), via Auto-RP
Uptime: 00:01:07, expires: 00:02:50

PIM RP Load Balancing and Redundancy

Here I look at achieving load-balancing and redundancy of multicast traffic between RPs.

First load-balancing

Auto-RP is being used. SW1 is configured as RP for 224.0.0.0 - 231.255.255.255 and
SW3 is RP for 232.0.0.0 - 239.255.255.255.

SW1
ip pim send-rp-announce Loopback0 scope 16 group-list 11
ip access-list st 11
permit 224.0.0.0 7.255.255.255

SW3
ip pim send-rp-announce Loopback0 scope 16 group-list 11
ip access-list st 11
permit 232.0.0.0 7.255.255.255

Now examining the rp mapping on the RP we can see the load is balanced between SW1 (150.1.7.7) and SW3 (150.1.9.9).

Router_5>s ip pim rp map
PIM Group-to-RP Mappings
This system is an RP (Auto-RP)
This system is an RP-mapping agent (Loopback0)

Group(s) 224.0.0.0/5
RP 150.1.7.7 (?), v2v1
Info source: 150.1.7.7 (?), elected via Auto-RP
Uptime: 00:00:06, expires: 00:02:53
Group(s) 224.0.0.0/4
RP 150.1.5.5 (?), v2v1
Info source: 150.1.5.5 (?), elected via Auto-RP
Uptime: 00:13:55, expires: 00:02:11
Group(s) 232.0.0.0/5
RP 150.1.9.9 (?), v2v1
Info source: 150.1.9.9 (?), elected via Auto-RP
Uptime: 00:00:31, expires: 00:02:24



The next step is to achieve redundancy: make SW1 back up SW3 should it fail, and vice versa.

This can be achieved by defining each candidate RP with the same, overlapping range. The mapping agent will select the RP with the highest IP address.

So on SW1 and SW3 I update access list 11 as follows:

ip access-list st 11
permit 224.0.0.0 15.255.255.255

Router_5>show ip pim rp map 224.0.0.0
PIM Group-to-RP Mappings
This system is an RP (Auto-RP)
This system is an RP-mapping agent (Loopback0)

Group(s) 224.0.0.0/5
RP 150.1.7.7 (?), v2v1
Info source: 150.1.7.7 (?), elected via Auto-RP
Uptime: 00:07:38, expires: 00:02:21
Group(s) 224.0.0.0/4
RP 150.1.9.9 (?), v2v1
Info source: 150.1.9.9 (?), elected via Auto-RP

Uptime: 00:02:01, expires: 00:01:56
RP 150.1.7.7 (?), v2v1
Info source: 150.1.7.7 (?), via Auto-RP
Uptime: 00:01:38, expires: 00:02:18
RP 150.1.5.5 (?), v2v1
Info source: 150.1.5.5 (?), via Auto-RP
Uptime: 00:21:27, expires: 00:02:46

The mapping agent shows SW3 as the winning candidate RP for the 224.0.0.0/4 range. On other routers only the winning RP will be shown in the rp map table.


Router_6>s ip pim rp map

PIM Group-to-RP Mappings
Group(s) 224.0.0.0/4
RP 150.1.9.9 (?), v2v1
Info source: 150.1.5.5 (?), elected via Auto-RP
Uptime: 00:04:10, expires: 00:02:42


Note. When selecting ranges to advertise the mapping agent will always advertise the longest match mcast range.

Monday, June 8, 2009

PIM DR


On a multiaccess network there may be multiple IGMP enabled routers. It is the responsibility of one of these IGMP routers to send any PIM join messages towards the RP.

If no PIM DR priority is explicitly configured, the PIM router with the highest IP address is elected as the DR and will send the join. The PIM DR priority can be used to influence which router is elected to forward the PIM join messages.

In the above scenario, without any DR priorities configured, R6 is elected DR as it has the highest ip address.

Router_1>s ip pim ne
PIM Neighbor Table
Mode: B - Bidir Capable, DR - Designated Router, N - Default DR Priority,
S - State Refresh Capable
Neighbor Interface Uptime/Expires Ver DR
Address Prio/Mode
155.1.148.6 FastEthernet0/0 00:01:26/00:01:17 v2 1 / DR S
155.1.148.4 FastEthernet0/0 00:02:16/00:01:18 v2 1 / S
155.1.0.5 Serial2/0 00:00:21/00:01:24 v2 1 / DR S


If the lab requirement states R1 should be the DR for this segment, this can be achieved with the 'ip pim dr-priority' command.

config#int fa0/0
config-if#ip pim dr-priority 100


With the above config applied I re-examine the PIM neighbors and R1 has pre-empted the DR position.

Router_4#s ip pim ne
PIM Neighbor Table
Mode: B - Bidir Capable, DR - Designated Router, N - Default DR Priority,
S - State Refresh Capable
Neighbor Interface Uptime/Expires Ver DR
Address Prio/Mode
155.1.148.6 FastEthernet0/0 00:02:14/00:01:28 v2 1 / S
155.1.148.1 FastEthernet0/0 00:02:35/00:01:28 v2 100/ DR S
155.1.46.5 Serial2/1 00:02:49/00:01:22 v2 1 / S

From R1 this can be seen as well using the 'show ip pim interface fa0/0' command.

Router_1#s ip pim interface fa0/0
Address Interface Ver/ Nbr Query DR DR
Mode Count Intvl Prior
155.1.148.1 FastEthernet0/0 v2/SD 2 30 100 155.1.148.1

In summary the PIM DR controls upstream PIM joins, and from my previous post the PIM assert mechanism controls downstream routing of multicast traffic.

Sunday, June 7, 2009

Controlling access to RP

PIM has the functionality to specify the multicast groups that an RP will allow joins from.

This allows central control over the mcast groups serviced by the RP.

The following config will only allow joins to mcast groups 224.11.11.11 and 224.111.111.111 for the RP 150.1.5.5. This can be enabled on the RP itself, or alternatively on routers on the path to the RP.

ip access-list st 5
permit 224.11.11.11
permit 224.111.111.111

ip pim accept-rp 150.1.5.5 5



With 'debug ip pim' enabled, failed attempts to join the RP are logged

*Jun 8 07:03:13.039: PIM(0): Join-list: (*, 224.20.20.20),, ignored, invalid RP 150.1.5.5 from 155.1.58.2

PIM Assert


The PIM Assert mechanism is used to shut off duplicate flows onto the same multi-access network. Routers detect this condition when they receive an (S,G) packet via a multi-access interface that is already in the (S,G) OIL. This causes the routers to send Assert messages.

In this scenario the workstation attached to R6 has joined group 239.6.6.6. A multicast feed is started and both R1 and R4 begin sending the mcast.

With 'debug ip pim' enabled on R1 and R4, it can be seen that a PIM assert exchange is initiated between them.

ON R1
*Jun 8 06:18:49.419: PIM(0): Send v2 Assert on FastEthernet0/0 for 239.6.6.6, source 155.1.58.2, metric [80/65]
*Jun 8 06:18:49.423: PIM(0): Assert metric to source 155.1.58.2 is [80/65]
*Jun 8 06:18:49.423: PIM(0): We win, our metric [80/65]


ON R4
*Jun 8 06:18:49.359: PIM(0): Received v2 Assert on FastEthernet0/0 from 155.1.148.1
*Jun 8 06:18:49.367: PIM(0): Assert metric to source 155.1.58.2 is [80/65]
Router_4#
*Jun 8 06:18:49.371: PIM(0): We lose, our metric [90/2172416]

The winner of the assert exchange is the router with best (AD,Metric). In the above case, R1 has an AD of 80 and R4 has an AD of 90. R1 wins!

As a result R4 prunes the (S,G) entries in its mroute table


Router_4#s ip mroute 239.6.6.6
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 239.6.6.6), 00:04:43/stopped, RP 0.0.0.0, flags: D
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
Serial2/1, Forward/Sparse-Dense, 00:04:43/00:00:00
FastEthernet0/0, Forward/Sparse-Dense, 00:04:43/00:00:00

(150.1.8.8, 239.6.6.6), 00:00:56/00:02:05, flags: PT
Incoming interface: Serial2/1, RPF nbr 155.1.46.5
Outgoing interface list:
FastEthernet0/0, Prune/Sparse-Dense, 00:00:56/00:02:03

(155.1.58.2, 239.6.6.6), 00:00:56/00:02:05, flags: PT
Incoming interface: Serial2/1, RPF nbr 155.1.46.5
Outgoing interface list:
FastEthernet0/0, Prune/Sparse-Dense, 00:00:56/00:02:03


On R1 the S,G entries remain with an 'A' by them denoting Assert winner!

Router_1#s ip mroute 239.6.6.6
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 239.6.6.6), 00:05:01/stopped, RP 0.0.0.0, flags: D
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
Serial2/0, Forward/Sparse-Dense, 00:05:01/00:00:00
FastEthernet0/0, Forward/Sparse-Dense, 00:05:01/00:00:00

(150.1.8.8, 239.6.6.6), 00:01:14/00:01:46, flags: T
Incoming interface: Serial2/0, RPF nbr 155.1.0.5
Outgoing interface list:
FastEthernet0/0, Forward/Sparse-Dense, 00:01:14/00:00:00, A

(155.1.58.2, 239.6.6.6), 00:01:14/00:01:46, flags: T
Incoming interface: Serial2/0, RPF nbr 155.1.0.5
Outgoing interface list:
FastEthernet0/0, Forward/Sparse-Dense, 00:01:14/00:00:00, A

Sunday, May 31, 2009

SNMP version 3

SNMP version 3 incorporates security enhancements into the SNMP protocol.

To utilise this new functionality SNMP groups with associated user names and passwords must be created.

The 1st step is to specify an ACL of hosts allowed to access the group, and to create the group itself

config#ip access-list st 1
config-std-acl#permit 130.1.1.1


config#snmp-server group IELAB v3 auth access 1

The 2nd step is to specify the user names and passwords

config#snmp-server user rich IELAB v3 auth md5 CISCO


Verification can be done with the command

R4#s snmp user
User name: rich
Engine ID: 800000090300CA0309800000
storage-type: nonvolatile active


When defining the SNMP host (the trap/inform receiver), the security level and user to use can then be specified.

config#snmp-server host 154.1.3.100 version 3 auth IELAB
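
Pulling the three steps together, a minimal end-to-end sketch (the addresses, names and passwords are just the ones from my lab; the priv/encryption options are not shown) would be:

ip access-list standard 1
 permit 130.1.1.1
!
snmp-server group IELAB v3 auth access 1
snmp-server user rich IELAB v3 auth md5 CISCO
snmp-server host 154.1.3.100 version 3 auth IELAB

show snmp group and show snmp user can then be used to confirm the group's security model (v3 auth), the access-list applied to it, and the user-to-group mapping.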

Thursday, May 21, 2009

Multicast - Shared Trees



I thought I would write about the multicast 'shared tree' and how it is built. Getting this straight has certainly helped me with multicast and with troubleshooting along the way.

This is a two-stage process: the server 'registers' with the RP and the client 'joins' towards the RP. These two processes are independent of each other and work the same way regardless of the underlying RP selection mechanism, e.g. Auto-RP, static RP or BSR.

In this example I use multicast servers attached to R6 and clients attached to R4.
The RP is 150.1.5.5 on R5.

First, the registration process.
A server at 204.12.1.254 starts sending to the multicast address 224.4.4.4. The PIM router on the local LAN segment (R6) receives the multicast packet and sends a PIM 'Register' to the RP, 150.1.5.5 in this example. This message is encapsulated as a unicast packet to 150.1.5.5.

When the RP receives this Register message it acknowledges receipt with a Register-Stop.

The output of debug ip pim on R6 (the local PIM router) and R5 (the RP) shows this...

R6#
*May 21 06:35:16.647: PIM(0): Check RP 150.1.5.5 into the (*, 224.4.4.4) entry
*May 21 06:35:16.655: PIM(0): Send v2 Register to 150.1.5.5 for 204.12.1.254, group 224.4.4.4
*May 21 06:35:17.143: PIM(0): Received v2 Register-Stop on FastEthernet0/0 from 150.1.5.5
*May 21 06:35:17.147: PIM(0): for source 204.12.1.254, group 224.4.4.4
*May 21 06:35:17.147: PIM(0): Clear Registering flag to 150.1.5.5 for (204.12.1.254/32, 224.4.4.4)

R5#
*May 21 06:35:16.483: PIM(0): Received v2 Register on Serial2/0 from 192.10.1.6
*May 21 06:35:16.487: for 204.12.1.254, group 224.4.4.4
*May 21 06:35:16.491: PIM(0): Check RP 150.1.5.5 into the (*, 224.4.4.4) entry
*May 21 06:35:16.495: PIM(0): Send v2 Register-Stop to 192.10.1.6 for 204.12.1.254, group 224.4.4.4

Following this registration process, entries are placed in the mroute tables of R6 and R5, but not on any intervening routers in the unicast path between them.

On R6 two mroute entries are created: the (*,G) entry and the (S,G) entry. At this stage both entries have a Null outgoing interface list, as no client has yet joined this multicast group. The (S,G) entry denotes that a server is sending to the multicast group.

R6#s ip mroute 224.4.4.4
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 224.4.4.4), 00:11:12/stopped, RP 150.1.5.5, flags: SPF
Incoming interface: FastEthernet0/0, RPF nbr 192.10.1.1
Outgoing interface list: Null


(204.12.1.254, 224.4.4.4), 00:11:12/00:02:51, flags: PFT
Incoming interface: FastEthernet1/0, RPF nbr 0.0.0.0
Outgoing interface list: Null


On R5, the RP, a similar pair of entries is created.

R5#s ip mroute 224.4.4.4
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 224.4.4.4), 00:10:16/stopped, RP 150.1.5.5, flags: SP
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list: Null

(204.12.1.254, 224.4.4.4), 00:10:16/00:02:47, flags: P
Incoming interface: Tunnel53, RPF nbr 154.1.0.3, Mroute
Outgoing interface list: Null


This completes the register process.
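
If there is no real multicast server available in the lab, the register process can be triggered just as easily by sourcing traffic to the group from the source-side router itself (a sketch; which interface the ping is sourced from will determine the (S,G) source address seen):

R6#debug ip pim
R6#ping 224.4.4.4 repeat 100

The debug should then show the same Register / Register-Stop exchange as above.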


Second, the client 'join' process.
A multicast client connected to R4 sends an IGMP membership report (join) for the multicast group 224.4.5.6. Upon receipt of the IGMP join, R4 sends a PIM (*,G) join towards the RP (R5).

Debug output on R5 shows receipt of the join message.

R5#
*May 21 06:59:48.519: PIM(0): Received v2 Join/Prune on Tunnel53 from 154.1.0.3, to us
*May 21 06:59:48.523: PIM(0): Join-list: (*, 224.4.5.6), RPT-bit set, WC-bit set, S-bit set
*May 21 06:59:48.527: PIM(0): Add Tunnel53/154.1.0.3 to (*, 224.4.5.6), Forward state, by PIM *G Join


On the RP (R5) a (*,G) entry is created in the mroute table

R5#s ip mroute 224.4.5.6
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 224.4.5.6), 00:02:47/00:02:58, RP 150.1.5.5, flags: SJC
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
Tunnel53, Forward/Sparse-Dense, 00:01:48/00:02:40
FastEthernet1/0, Forward/Sparse-Dense, 00:02:47/00:02:58


Notice that this time the entry has a populated outgoing interface list. As part of the join process, every PIM-enabled router in the path between the client and the RP builds such an entry in its mroute table.

This completes the join process.
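
Similarly, if no real client is available, the join can be simulated by making a router interface behave as a group member (a sketch; the R4 interface name is an assumption):

R4
interface FastEthernet0/0
 ip igmp join-group 224.4.5.6

R4 then sends the (*,G) PIM join towards the RP exactly as if a host had sent an IGMP report, and show ip igmp groups / show ip mroute 224.4.5.6 can be used to confirm.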

Tying 'register' and 'join' together
The RP is what ties the join and register processes together. I now initiate a server multicast feed to 224.4.5.6, the address the client has already joined.

I start a ping to 224.4.5.6

ping 224.4.5.6 repeat 10000
Reply to request 23 from 154.1.0.4, 552 ms
Reply to request 24 from 154.1.0.4, 676 ms
Reply to request 25 from 154.1.0.4, 788 ms
Reply to request 26 from 154.1.0.4, 964 ms
Reply to request 27 from 154.1.0.4, 824 ms

This again triggers a new Register from R6 (the local PIM router) to the RP for this multicast group.

I examine the mroute table on the RP while this multicast ping is in progress.

R5#s ip mroute 224.4.5.6
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 224.4.5.6), 00:03:55/00:03:29, RP 150.1.5.5, flags: SJC
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
Tunnel53, Forward/Sparse-Dense, 00:02:56/00:03:29
FastEthernet1/0, Forward/Sparse-Dense, 00:03:55/00:02:50

(204.12.1.254, 224.4.5.6), 00:00:22/00:02:59, flags: T
Incoming interface: Tunnel53, RPF nbr 154.1.0.3, Mroute
Outgoing interface list:
FastEthernet1/0, Forward/Sparse-Dense, 00:00:22/00:02:50


As before, the RP contains both the (*,G) and (S,G) entries. Since a client has joined this group, both entries now have an interface in the outgoing interface list (OIL).


I examine the mroute table on R6 (the PIM router connected to the multicast server). Similarly there are two entries for the multicast group 224.4.5.6. Note the (S,G) entry has a populated OIL, whereas the (*,G) entry does not.

Only those routers in the path between the multicast client and the RP will have a populated OIL for the (*,G) entry.


R6#s ip mroute 224.4.5.6
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 224.4.5.6), 00:03:13/stopped, RP 150.1.5.5, flags: SPF
Incoming interface: FastEthernet0/0, RPF nbr 192.10.1.1
Outgoing interface list: Null

(204.12.1.254, 224.4.5.6), 00:03:13/00:03:25, flags: FT
Incoming interface: FastEthernet1/0, RPF nbr 0.0.0.0
Outgoing interface list:
FastEthernet0/0, Forward/Sparse-Dense, 00:03:12/00:03:16
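
As a final check of the end-to-end tree, mtrace can be used to walk the RPF path from the receiver side back towards the source (a sketch; 154.1.0.4 is the responding client address seen in the ping output above):

R4#mtrace 204.12.1.254 154.1.0.4 224.4.5.6

Each hop listed should match the routers holding the mroute entries shown above.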

Sunday, May 17, 2009

BGP preferred path



In this scenario R3 is advertising network 154.1.5.0/24 to BGP peers R1 and R2. The lab requirement is for AS 300 to be configured so that the R1-to-R3 link is the preferred path to reach this network.

In such a scenario there are two usual candidates to meet this requirement: AS-path prepending and MED. In this scenario AS-path prepending is not allowed.

So the configuration for MED on R3 is as follows

ip prefix-list VLAN10 permit 154.1.5.0/24
route-map R1 permit 10
match ip address prefix-list VLAN10
set metric 100
route-map R1 permit 20


route-map R2 permit 10
match ip address prefix-list VLAN10
set metric 200
route-map R2 permit 20


router bgp 300
neighbor 154.1.13.1 route-map R1 out
neighbor 154.1.23.2 route-map R2 out


The above was my first configured solution. I examined the BGP table on R2 to verify the results:

R2#s ip bgp
Network Next Hop Metric LocPrf Weight Path
*>i154.1.5.0/24 154.1.13.3 100 100 0 300 400 i
* 154.1.23.3 200 0 300 400 i

As expected, R2 had two paths to network 154.1.5.0/24. However, R2's next hop for both learned routes was R3! I had missed one vital configuration element in terms of meeting the lab requirement: R1's advertisement of 154.1.5.0/24 to R2 was NOT adjusting the next hop. This is the correct behaviour, since R1 learned the route from an eBGP peer (R3) and by default the next hop is carried unchanged into iBGP.

To ensure traffic from R2 destined to this network goes via R1, it is necessary for R1 to set the next hop of eBGP-learned routes to itself when advertising them to R2.

R1
router bgp 200
neighbor 192.10.1.2 next-hop-self


Once applied i examined the bgp table on R2 again..

Rack1R2#s ip bgp
Network Next Hop Metric LocPrf Weight Path
*>i154.1.5.0/24 192.10.1.1 100 100 0 300 400 i
* 154.1.23.3 200 0 300 400 i


Now R2's preferred route to reach 154.1.5.0/24 is via R1. Job done!
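
One last point worth remembering when labbing this: the route-maps are applied outbound, so the new MED values are not sent until the sessions are refreshed. A soft clear on R3 towards both neighbours does the trick, and the result can then be re-checked on R2 (a sketch using the neighbour addresses from this scenario):

R3#clear ip bgp 154.1.13.1 soft out
R3#clear ip bgp 154.1.23.2 soft out
R2#s ip bgp 154.1.5.0

The output on R2 should show both paths, with the path via 192.10.1.1 selected as best on the lower MED of 100.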

Saturday, May 16, 2009

srr-queue commands - part IV

The final part in my look at the SRR queues is how DSCP- or CoS-marked packets are assigned to the SRR queues.

First, the default settings.

DSCP values (0-63):
0-15          queue 2
16-31         queue 3
32-39, 48-63  queue 4
40-47         queue 1

CoS values (0-7):
0, 1          queue 2
2, 3          queue 3
4, 6, 7       queue 4
5             queue 1

Within each queue, marked packets can be placed in one of three thresholds. By default all packets are placed in threshold 1, which has the lowest tolerance to WTD (weighted tail drop), i.e. its traffic is dropped first as the queue fills.

The above default settings can all be adjusted, depending on the requirements to be met, with the following commands:

mls qos srr-queue output dscp-map queue {queue-id} threshold {threshold id} {dscp1} ....{dscp8}
mls qos srr-queue output cos-map queue {queue-id} threshold {threshold id} {cos1} ....{cos8}

To ensure higher-priority DSCP or CoS values are not dropped first, they can be assigned to a threshold ID with a higher value (2 or 3), since the higher threshold IDs have a higher tolerance to WTD and so drop traffic later as the queue fills.
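
As a quick hedged example (DSCP 46/EF and CoS 5 are my choice here purely for illustration), voice markings could be moved to queue 1, threshold 3, so that they are the last traffic to be tail-dropped within that queue:

mls qos srr-queue output dscp-map queue 1 threshold 3 46
mls qos srr-queue output cos-map queue 1 threshold 3 5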

Finally to review assignments use the show mls qos maps command.

That's it for srr-queues. In my opinion, an absolute beast of a subject. Good to have an understanding of the configurable parameters, but the doc CD will be my friend should this come up.

Friday, May 15, 2009

srr-queue commands - part III

Before I write about how traffic is allocated to queues, I realised there is another important piece to the srr-queue puzzle, namely how buffers are allocated and managed across the 4 SRR queues.

This in itself appears to be a science best approached in a dark room!:-)

Buffer allocations and thresholds are configured against a queue-set, of which two are available. An interface is then assigned to a queue-set, which applies the configured buffer settings to that interface. By default an interface uses queue-set 1.

An interface is assigned a queue-set as follows
config-if#queue-set 2
or
config-if#queue-set 1



As we know already there are 4 srr queues. A number of values can be set for each of these queues.

1) Buffer allocation
In percentage terms how much of the available interface buffer space is mapped to this queue. Allocation for the 4 srr queues must total 100%.

2) Buffer thresholds - of which there are 4
2 drop WTD (weighted tail drop) thresholds
1 reserved threshold
1 maximum threshold

First, buffer allocation
mls qos queue-set output {1-2} buffers {%1 %2 %3 %4}

e.g. mls qos queue-set output 1 buffers 30 30 30 10
This sets the buffer allocation for SRR queue 1 to 30%, queue 2 to 30%, queue 3 to 30% and queue 4 to 10%. N.B. if this command is not used the default allocation is 25% for each queue.

Second, buffer thresholds
As mentioned, there are 4 thresholds per queue. If none are explicitly set then the following percentage defaults apply to the available buffer space (WTD threshold 1, WTD threshold 2, reserved, maximum):

queue 1 100 100 50 400
queue 2 200 200 50 400
queue 3 100 100 50 400
queue 4 100 100 50 400

e.g. for queue 1
100 wtd threshold 1
100 wtd threshold 2
50 reserved threshold
400 maximum threshold


So, bringing the above all together in one example

mls qos queue-set output 1 buffers 30 20 30 20
mls qos queue-set output 1 threshold 1 40 60 100 200
mls qos queue-set output 1 threshold 2 40 60 100 200
mls qos queue-set output 1 threshold 3 40 60 100 200
mls qos queue-set output 1 threshold 4 40 60 100 200
int gi1/1
queue-set 1

i) SRR buffer allocation for queues 1-4 is 30%, 20%, 30% and 20% respectively
ii) the SRR queue thresholds are set identically for all 4 queues to 40%, 60%, 100% and 200%
iii) all of the above config is applied to queue-set 1, which is then applied to interface gi1/1 (a quick verification sketch follows below)
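
To verify, the queue-set values and the buffers actually in use on the port can be reviewed with (a sketch, using the same queue-set and interface as the example above):

show mls qos queue-set 1
show mls qos interface gi1/1 buffers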

As mentioned at the start, when I first looked at this it appeared to be another science in itself. I know I had to read the Cisco doc at least a couple of times to get it straight - or maybe that's just me :-)

Monday, May 11, 2009

srr-queue commands - part II

In this post I look at the srr-queue shape and share commands, what they do and how they interact.

There are 4 interface queues serviced by SRR. Each queue can be configured for either shaping or sharing, but not both. If shaping is configured then this takes precedence.
(The way I remember this is that shaping comes alphabetically before sharing.)

Shaping guarantees a percentage of the bandwidth and limits the traffic to the configured amount. Conversely sharing allocates the bandwidth amongst the sharing queues according to the ratios configured, but does NOT limit it to this level.

Shaped and shared settings are configured using

config-if#srr-queue bandwidth shape {n} {n} {n} {n}
config-if#srr-queue bandwidth share {n} {n} {n} {n}


If the values are not explicitly set then the following default values apply

config-if#srr-queue bandwidth shape 25 0 0 0
config-if#srr-queue bandwidth share 25 25 25 25


Bandwidth allocation for a 10 Mb link can be calculated as follows:

SHAPED Q
Bandwidth allocated = 1/weight * BW = 1/25 * BW
Hence for a 10 Mb interface, the bandwidth for queue 1 would be 400 kbps.

SHARED Qs
10 Mb - 400 kbps = 9.6 Mb remaining
Hence for queues 2, 3 and 4, BW = 25/(25+25+25) * 9.6 Mb = 3.2 Mb each.


Suppose a lab requirement was to guarantee queue 1 2 Mb and queue 2 2 Mb, with queues 3 and 4 sharing the remainder. Since the shaped bandwidth is 1/weight of the interface rate, a weight of 5 gives 1/5 * 10 Mb = 2 Mb, so this could be achieved with the following configuration

config-if#srr-queue bandwidth shape 5 5 0 0
config-if#srr-queue bandwidth share 0 0 25 25
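
Applying the above to a port and checking the result might look like the following sketch (the interface is just an example):

interface FastEthernet0/1
 srr-queue bandwidth shape 5 5 0 0
 srr-queue bandwidth share 0 0 25 25

Verification is then via show mls qos interface fa0/1 queueing, which lists the shaped and shared weights currently applied to each queue.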


In the next post I look at how traffic is mapped to the SRR queues.