Saturday, April 12, 2008

IVR concepts and some troubleshooting!

IVR TEST SETUP (I will add some pictures)

show run of the switches:
9216
9509
Steps:
9216 Config Steps
9509 Config Steps

When you have two different IVR fabrics, even if you link
only one VSAN between them (even a non-IVR VSAN), the IVR
zonesets will merge.

Also, please make sure you use the ivr distribute option
to keep the configs the same on all IVR edge switches.

IVR NAT
I. Connected an ISL between the 9216 and the 9509, put both the
Xiotech and the Dell PC in the same VSAN, and used Qlogic
SANSurfer to check whether the PC sees the Xiotech disks.
(Set the Xiotech so that new servers can see all the LUNs,
and set zoning to permit. Note that this may be disruptive
when creating the IVR zone.)

II. moved interface fc1/8 on 9509 (Xiotech) to vsan 26
moved interface fc1/2 on 9216 (PC) to vsan 24
created empty vsan 23 on both switches.
default zone permit on vsan 26 and 24. (this is test,
so no problem if activating ivr zone is disruptive).

III. created ivr topology database and activated it.

IV. created ivr zoneset and activated it.
V. SAN BLADE on PC
VI. IVR Topology: show ivr internal top
ZoneStatus on the switches
VII. Debug IVR trace
VIII. show ivr (fsm ) commands are good to debug.

Links: Config Example in CCO
Uses of IVR:
IVR between brocade and MDS ( how to prevent ISL failure due to read-only zones ) -Dallas McCloon's Webpage.
IVR between SN5428 and MDS ( read-only zones and ivr virtual-fcdomain-add ) - Paul's Webpage.

IVR troubleshooting:

- Check that the IVR topology is activated: show ivr vsan-topology
- show ivr zoneset active (look for * on all the ports)
- show zoneset active (check that the IVR zone names are there and marked * active; also check show run).
If not, run show ivr zone status and check for failures. In most cases the fix is to
remove default permit and use valid zones/zonesets.
- show ivr internal event-history error-log

------
Case # 600921106
Let us assume you have four switches connected via vsan 1000

switch 1 and 2 ---> common vsans 200,220,240,1000 : Fabric 1
switch 3 and 4 ---> common vsans 300,320,340,1000 : Fabric 2
switch1,2,3,4 - common vsan 1000

If you want to configure IVR to share resources among VSANs (220 and 240 in Fabric 1,
and 300 and 320 in Fabric 2):

The IVR topology would be as follows (the same on all 4 switches).
The transit VSAN will be 1000. (Without the transit VSAN, even though
the full zoneset may have the appropriate zones/zonesets via FM, you will
not get an active zoneset containing the IVR zones unless you activate the
zoneset locally. With the transit VSAN, you can activate the IVR zoneset
from one switch.)

autonomous-fabric-id 1 switch-wwn switch1 vsan-ranges 220,240,1000
autonomous-fabric-id 1 switch-wwn switch2 vsan-ranges 220,240,1000
autonomous-fabric-id 1 switch-wwn switch3 vsan-ranges 300,320,1000
autonomous-fabric-id 1 switch-wwn switch4 vsan-ranges 300,320,1000

(The maximum number of VSANs you can enter is 5. Use the GUI
if you need more, or use a range such as vsan-ranges 220-240,1000.)

(If you type no autonomous-fabric-id 1 xxxx, you will still see the entry in
show run until you activate the topology.)

(In the GUI you should see 16 entries in the active topology if you have
4 switches.)

and then activate topology.

Ivr zones and zonesets will be same across all 4 switches (FM will take care of
it)

ivr_zone1 - pwwnhost vsan 220, pwwnstorage vsan 240
ivr_zone2- pwwnhost vsan 300, pwwnstorage vsan 320
ivr_Zoneset- ivr_zone1+ivr_zone2
and activate the zoneset

If IVR zoneset activation is stuck in activating (show ivr zoneset status), or
show log logfile reports waiting for lowest wwn, then it may be a bug in
1.3.4a, CSCeh02256. At that point, the only way out is to deactivate the
IVR zoneset (disruptive if IVR zones are in use) and then activate it again.

It might also be due to an incorrect IVR topology. In that case, correct the IVR
topology and activate it again.

The best way to avoid this bug is to have a correct topology, but
you can also use this workaround:

From the list I have ( show ivr and show wwn switch will
give switch's wwn)

ivr vsan-topology database
autonomous-fabric-id 1 switch-wwn 20:00:00:0d:ec:0f:4f:80 vsan-ranges 220,230,240,270,290,1000
autonomous-fabric-id 1 switch-wwn 20:00:00:0d:ec:0f:2e:00 vsan-ranges 220,230,240,270,290,1000
autonomous-fabric-id 1 switch-wwn 20:00:00:0d:ec:0f:2d:c0 vsan-ranges 320,330,340,370,390,1000
autonomous-fabric-id 1 switch-wwn 20:00:00:0d:ec:0f:50:40 vsan-ranges 320,330,340,370,390,1000

You can verify it one more time.
Lowest wwn is 20:00:00:0d:ec:0f:2d:c0 TACOMA_BD4
20:00:00:0d:ec:0f:2e:00 TUNDRA_AD1
20:00:00:0d:ec:0f:4f:80 TACOMA_AD3
Highest wwn is 20:00:00:0d:ec:0f:50:40 TUNDRA_BD2
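
Since Step II onwards has to go from lowest wwn to highest wwn, it helps to sort the list instead of eyeballing it. Here is a minimal Python sketch that does this for the four switch WWNs above. The helper names (wwn_to_int, activation_order) are made up for illustration; nothing here talks to a switch.

```python
# Switch WWNs and names copied from the list above.
SWITCHES = {
    "20:00:00:0d:ec:0f:4f:80": "TACOMA_AD3",
    "20:00:00:0d:ec:0f:2e:00": "TUNDRA_AD1",
    "20:00:00:0d:ec:0f:2d:c0": "TACOMA_BD4",
    "20:00:00:0d:ec:0f:50:40": "TUNDRA_BD2",
}


def wwn_to_int(wwn: str) -> int:
    """Turn a colon-separated WWN into an integer so it sorts numerically."""
    return int(wwn.replace(":", ""), 16)


def activation_order(switches: dict) -> list:
    """Return switch names ordered from lowest to highest switch WWN."""
    return [switches[wwn] for wwn in sorted(switches, key=wwn_to_int)]


if __name__ == "__main__":
    for position, name in enumerate(activation_order(SWITCHES), start=1):
        print(position, name)
```

The first name printed is the switch to start on, and the last one is the highest wwn.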


Step I:
You can push the ivr zones and zoneset from Fabric manager.
(from IVR zone dialog)

Step II onwards should be done from lowest wwn to highest wwn in that order.

Step II: (verification of the IVR zones, to see whether they got pushed
from FM correctly)

From CLI of lowest wwn, verify if the zoneset in the config is correct by
show ivr zoneset

or show ivr zones

Step III: (activation)
and then activate it using
config t
ivr zoneset activate name
exit

Step IV ( commands to verify if activation completed in this switch)

show ivr zoneset active
show ivr zoneset status (look for activation success on all VSANs).

If it succeeds, you can move on to the higher-WWN switches.


_________________________________________________________________________________

Some more commands:

show ivr internal capa vsa
show ivr internal zone-per-vsan vsan
show ivr internal device

clear ivr zone
clear ivr session

Waiting for lowest wwn
in ivr zoneset activation:

show ivr zoneset status
show fcdomain domain vsan X (whichever VSAN it is waiting on)
Look at the lowest WWN and check that switch.
---------------------------

If you want to use IVR between VSAN 70 on switch 1 and VSAN 240 on switch 2,
you do not need VSAN 70 on switch 2 or VSAN 240 on switch 1.

-----

ivr withdraw domain
ivr refresh

similar to reactivating ivr zoneset.
-------
Transit VSANs will have all the devices.
----

IVR zoneset activation failing with fabric unstable:

Look at show ivr internal global for a domain ID marked false, and
recreate the VSAN that complains about the domain ID.
show ivr int vdri summary
-----

If you have two IVR switches and virtual domains have not been
added on all of them, then when activating the IVR zoneset you
will see errors that certain FCIDs are already there.

If that is the case, check
show ivr virtual-fcdomain

and add
ivr virtual-domain vsan 5 - enables rdi.
--------

ivr withdraw domain and ivr refresh are useful for the
swwn 00:00:00 error.

show vsan usage


show ivr int zone-fsm - gives extra info
beyond show ivr zoneset status

show ivr lists the IVR-enabled switches.

If a VSAN is missing from show ivr zoneset status
but shows up in show ivr internal global, then suspend and no suspend
the VSAN.
--------
Problem Description

a. swwn 00:00 in vsan 5
(due to RDI issue when upgrading to 2.1.1b without following
the right steps)
b. fswb stuck in waiting for lowest wwn due to lower wwn
switch running 2.1.1 and higher wwn switch running 2.0.x code.
c. Qlogic - zone activation failure in vsan 5 due to problem a.

Actions taken:
a. We saw an extra zone, vsan_1_test_zone, in the VSAN 5 IVR config, so
we tried to add it and activate the IVR zoneset, but fswb was
still stuck at waiting for lowest wwn.
b. so we disabled ivr, enabled and added vsan topology
and copied ivr zone config to bootflash:dave.txt and
copy bootflash:dave.txt run
c. then activated the ivr zoneset and there were issues
with certain fcids not added because already there,
so to resolve that we did step b. again
d. before activating the zoneset we added
ivr virtual vsan 5 on fswb where it was missing.
e. then we activated zoneset. and show ivr zoneset
status showed all vsans are in active mode.
f. we then tested couple of ivr zoneset changes from
FM and it activated fine.

Problem 2:
1. we did ivr withdraw domain X vsan 5 on domain
with swwn 00:00:00
2. and then did ivr refresh. That domain then had the correct swwn, but some other domain failed.
3. Instead of doing it on each domain, we removed vsan 5 from the fswa vsan-topology
and added it back in.
4. then all the domains had correct swwn (show fcdom dom v 5)
5. show ivr zoneset status did not show vsan 5 , even though
internal vdri summary showed vsan 5 in RDI mode (because
of ivr virtual domain add)
6. we initiated Build Fabric (BF) by fcdomain domain restart v 5
but did not help.
7. we did disruptive suspend and no suspend vsan 5, it fixed the
issue.
8. we looked at qlogic and it was able to see the ns (show ns all) and we were able to activate the zoneset
on vsan 5 without any issue ( fixed problem c.)
We had some issues because proposed changes showed
a lot of zones but we added only one zone, which was fixed
by zone copy active full and then going to FM again

New Problem:

1. when looking at show ivr zoneset status on fswb it
had vsan 5 stuck again in lowest wwn issue.
2. we activated the ivr zoneset from FM and it fixed
the issue.
3. So Dave, please run show ivr zoneset status after
you activate a normal zoneset in any IVR-enabled VSAN.
If it is stuck, please add a test IVR zone and activate the
IVR zoneset from Fabric Manager. It should not happen
now; it happened because of too many changes yesterday.
-------------------------

IVR NAT:

When IVR NAT is enabled, two VSANs that need to talk to each other can have the
same domain ID. The FCID used for PLOGI differs per VSAN: show fcns database will show
one FCID for a device in its local VSAN and another FCID for it in the other VSAN
(persistent FCIDs and domain IDs for IVR NAT are a separate issue).
`show flogi database`
---------------------------------------------------------------------------
INTERFACE VSAN FCID PORT NAME NODE NAME
---------------------------------------------------------------------------
fc1/3 100 0x0a00ef 50:06:01:62:30:60:22:a8 50:06:01:60:b0:60:22:a8
fc1/12 200 0x0a0000 10:00:00:00:c9:4b:4f:de 20:00:00:00:c9:4b:4f:de

Note that VSANs 100 and 200 have the same domain ID, 10 (0x0a).
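
The domain ID is simply the top byte of the 24-bit FCID (the layout is domain, area, port). A small Python sketch, using the two FCIDs from the show flogi database output above, makes the overlap easy to check. split_fcid is a hypothetical helper name, not a switch command.

```python
def split_fcid(fcid: int) -> tuple:
    """Split a 24-bit FCID into its (domain, area, port) bytes."""
    return (fcid >> 16) & 0xFF, (fcid >> 8) & 0xFF, fcid & 0xFF


# (vsan, fcid) pairs taken from the flogi output above.
LOGINS = [(100, 0x0A00EF), (200, 0x0A0000)]

if __name__ == "__main__":
    for vsan, fcid in LOGINS:
        domain, area, port = split_fcid(fcid)
        print(f"vsan {vsan}: fcid {fcid:#08x} -> domain {domain} ({domain:#04x})")
```

Both lines report domain 10, which is the overlap that IVR NAT has to hide from the other VSAN.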

VSAN 100:
--------------------------------------------------------------------------
FCID TYPE PWWN (VENDOR) FC4-TYPE:FEATURE
--------------------------------------------------------------------------
0x0a00ef N 50:06:01:62:30:60:22:a8 (Clariion) scsi-fcp
0xc2510c N 10:00:00:00:c9:4b:4f:de (Emulex) ipfc scsi-fcp

Total number of entries = 2

VSAN 200:
--------------------------------------------------------------------------
FCID TYPE PWWN (VENDOR) FC4-TYPE:FEATURE
--------------------------------------------------------------------------
0x0a0000 N 10:00:00:00:c9:4b:4f:de (Emulex) ipfc scsi-fcp
0x858419 N 50:06:01:62:30:60:22:a8 (Clariion) scsi-fcp

ivr enable
ivr distribute
ivr nat
ivr vsan-topology database
autonomous-fabric-id 1 switch-wwn 20:00:00:0d:bc:76:76:80 vsan-ranges 100,200
ivr vsan-topology auto

ivr zone name z_udmstest_4fde_cxtest_spa2
member pwwn 50:06:01:62:30:60:22:a8 vsan 100
member pwwn 10:00:00:00:c9:4b:4f:de vsan 200
ivr zoneset name zs_122405_jmm
member z_udmstest_4fde_cxtest_spa2
ivr zoneset activate name zs_122405_jmm force
ivr commit
-----


If the host and storage need to talk to each other,
the host will get 0x858419, which is NATed, as the storage FCID
and will PLOGI into that address.

vsan 200 --- 0x0a0000 (host) --- plogi --- 0x858419 (storage) --- (ivr nat)

(ivr nat) --- 0xc2510c (host) --- plogi --- 0x0a00ef (storage)

The ACC will travel the same way. We hold the ACC (PLOGI) for 2 seconds to fix some
issue. PLOGI and PLOGI ACC can wait for R_A_TOV (10 s).
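
To make the two-hop picture above concrete, here is a rough Python sketch of the address rewrite, built only from the two show fcns database tables earlier in this section. It is a lookup-table model for illustration, not how the switch implements NAT; the FCNS table layout and the rewrite function are assumptions of mine.

```python
# Name-server view per VSAN, copied from the fcns output above: pwwn -> FCID.
HOST = "10:00:00:00:c9:4b:4f:de"
STORAGE = "50:06:01:62:30:60:22:a8"
FCNS = {
    100: {STORAGE: 0x0A00EF, HOST: 0xC2510C},
    200: {HOST: 0x0A0000, STORAGE: 0x858419},
}


def rewrite(src_vsan: int, dst_vsan: int, s_id: int, d_id: int) -> tuple:
    """Map a frame's (S_ID, D_ID) from one VSAN's addressing to the other's."""
    by_fcid = {fcid: pwwn for pwwn, fcid in FCNS[src_vsan].items()}
    return FCNS[dst_vsan][by_fcid[s_id]], FCNS[dst_vsan][by_fcid[d_id]]


if __name__ == "__main__":
    # Host PLOGI as sent in VSAN 200, then as it appears in VSAN 100.
    s_id, d_id = rewrite(200, 100, 0x0A0000, 0x858419)
    print(f"vsan 100: {s_id:#08x} -> {d_id:#08x}")
```

Running rewrite in the other direction (100 to 200) with the ACC's addresses gives back the pair the host originally used.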

IVR TRACES- NAT from Univ of M
note the time difference between plogi and plogi ACC.

Also IVR NATed FCIDs may be different.
--------

Lab setup
MDS9216I-86# show flogi database
---------------------------------------------------------------------------
INTERFACE VSAN FCID PORT NAME NODE NAME
---------------------------------------------------------------------------
fc1/4 1 0x640201 20:00:00:05:ad:22:8e:3c 20:00:00:05:ad:02:8e:3c
fc1/4 1 0x640202 20:01:00:05:ad:22:8e:3c 20:01:00:05:ad:02:8e:3c
fc1/4 1 0x640204 20:02:00:05:ad:22:8e:3c 20:02:00:05:ad:02:8e:3c
fc1/6 2 0x640000 50:06:04:82:c3:a1:2f:52 50:06:04:82:c3:a1:2f:52
fc1/7 1 0x640500 21:00:00:e0:8b:0b:fc:0d 20:00:00:e0:8b:0b:fc:0d

Total number of flogi = 4.
MDS9216I-86# show fcns database

VSAN 1:
--------------------------------------------------------------------------
FCID TYPE PWWN (VENDOR) FC4-TYPE:FEATURE
--------------------------------------------------------------------------
0x640201 N 20:00:00:05:ad:22:8e:3c scsi-fcp
0x640202 N 20:01:00:05:ad:22:8e:3c scsi-fcp
0x640204 N 20:02:00:05:ad:22:8e:3c scsi-fcp
0x640500 N 21:00:00:e0:8b:0b:fc:0d (Qlogic) scsi-fcp:init

Total number of entries = 3

VSAN 2:
--------------------------------------------------------------------------
FCID TYPE PWWN (VENDOR) FC4-TYPE:FEATURE
--------------------------------------------------------------------------
0x640000 N 50:06:04:82:c3:a1:2f:52 (EMC) scsi-fcp:target 250

---
Config persistent:
ivr fcdomain database autonomous-fabric-num 1 vsan 1
native-autonomous-fabric-num 1 native-vsan 2 domain 223
pwwn 50:06:04:82:c3:a1:2f:52 fcid 0xdf407a
ivr fcdomain database autonomous-fabric-num 1 vsan 2
native-autonomous-fabric-num 1 native-vsan 1 domain 83
pwwn 20:01:00:05:ad:22:8e:3c fcid 0x533f4f
pwwn 20:02:00:05:ad:22:8e:3c fcid 0x533f5f
pwwn 21:00:00:e0:8b:0b:fc:0d fcid 0x530000

ivr vsan-topology auto
ivr nat
ivr distribute
ivr zone name IVR_Zone1
member pwwn 20:02:00:05:ad:22:8e:3c vsan 1
member pwwn 50:06:04:82:c3:a1:2f:52 vsan 2
member pwwn 20:01:00:05:ad:22:8e:3c vsan 1
ivr zoneset name IVR_ZoneSet1
member IVR_Zone1

----

activate the zoneset and commit
zoneset name IVR_ZoneSet1
zone name IVR_Zone1
* pwwn 20:02:00:05:ad:22:8e:3c vsan 1 autonomous-fabric-id 1
* pwwn 50:06:04:82:c3:a1:2f:52 vsan 2 autonomous-fabric-id 1
* pwwn 20:01:00:05:ad:22:8e:3c vsan 1 autonomous-fabric-id 1
MDS9216I-86# show fcns database

VSAN 1:
--------------------------------------------------------------------------
FCID TYPE PWWN (VENDOR) FC4-TYPE:FEATURE
--------------------------------------------------------------------------
0x640201 N 20:00:00:05:ad:22:8e:3c scsi-fcp
0x640202 N 20:01:00:05:ad:22:8e:3c scsi-fcp
0x640204 N 20:02:00:05:ad:22:8e:3c scsi-fcp
0xdf407a N 50:06:04:82:c3:a1:2f:52 (EMC) scsi-fcp:target 250
0x640500 N 21:00:00:e0:8b:0b:fc:0d (Qlogic) scsi-fcp:init
Total number of entries = 4

VSAN 2:
--------------------------------------------------------------------------
FCID TYPE PWWN (VENDOR) FC4-TYPE:FEATURE
--------------------------------------------------------------------------
0x533f4f N 20:01:00:05:ad:22:8e:3c scsi-fcp
0x533f5f N 20:02:00:05:ad:22:8e:3c scsi-fcp
0x640000 N 50:06:04:82:c3:a1:2f:52 (EMC) scsi-fcp:target 250
0x530000 N 21:00:00:e0:8b:0b:fc:0d (Qlogic) scsi-fcp:init
----

Here is how persistent setup needs to be done.
----
MDS9216I-86# show ivr fcdomain database
----------------------------------------------------
AFID Vsan Native-AFID Native-Vsan Virtual-domain
----------------------------------------------------
1 1 1 2 0xdf(223)
1 2 1 1 0x53(83)

Number of Virtual-domain entries: 2

----------------------------------------------------
AFID Vsan Pwwn Virtual-fcid
----------------------------------------------------
1 1 50:06:04:82:c3:a1:2f:52 0xdf407a
1 2 20:01:00:05:ad:22:8e:3c 0x533f4f
1 2 20:02:00:05:ad:22:8e:3c 0x533f5f
1 2 21:00:00:e0:8b:0b:fc:0d 0x530000

Number of Virtual-fcid entries: 3
MDS9216I-86# show ivr int pnat vdom-info
IVR2 PNAT: Virtual domain info for 1:1:223
--------------------------------------------------
is_owner=true, owner_dom=100, local_dom=100
ID: VDOM-1:1:223
Peer domain list: 100
Response pending list:
IVR2 PNAT: Virtual domain info for 1:2:83
--------------------------------------------------
is_owner=true, owner_dom=100, local_dom=100
ID: VDOM-1:2:83
Peer domain list: 100
Response pending list:
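
A quick sanity check on the show ivr fcdomain database output above: every virtual FCID should sit inside the virtual domain configured for the same AFID/VSAN. The Python sketch below parses just the data rows of that output (pasted in as a string) and reports mismatches. The regexes and the check function are my own and assume the column layout shown above.

```python
import re

# Data rows copied from "show ivr fcdomain database" above.
OUTPUT = """\
1 1 1 2 0xdf(223)
1 2 1 1 0x53(83)
1 1 50:06:04:82:c3:a1:2f:52 0xdf407a
1 2 20:01:00:05:ad:22:8e:3c 0x533f4f
1 2 20:02:00:05:ad:22:8e:3c 0x533f5f
1 2 21:00:00:e0:8b:0b:fc:0d 0x530000
"""

# AFID, VSAN, native AFID, native VSAN, virtual domain as 0xNN(decimal).
DOMAIN_ROW = re.compile(r"^(\d+)\s+(\d+)\s+\d+\s+\d+\s+0x([0-9a-f]+)\((\d+)\)$")
# AFID, VSAN, pwwn, virtual FCID.
FCID_ROW = re.compile(r"^(\d+)\s+(\d+)\s+((?:[0-9a-f]{2}:){7}[0-9a-f]{2})\s+0x([0-9a-f]{6})$")


def check(output: str) -> list:
    """Return a list of human-readable problems; empty means consistent."""
    domains, problems = {}, []
    for raw in output.splitlines():
        line = raw.strip()
        m = DOMAIN_ROW.match(line)
        if m:
            afid, vsan, hex_dom, dec_dom = m.groups()
            if int(hex_dom, 16) != int(dec_dom):
                problems.append(f"domain hex/decimal mismatch: {line}")
            domains[(int(afid), int(vsan))] = int(dec_dom)
            continue
        m = FCID_ROW.match(line)
        if m:
            afid, vsan, pwwn, fcid = m.groups()
            expected = domains.get((int(afid), int(vsan)))
            if expected != (int(fcid, 16) >> 16):
                problems.append(f"{pwwn}: fcid 0x{fcid} not in domain {expected}")
    return problems


if __name__ == "__main__":
    print(check(OUTPUT) or "virtual FCIDs match their virtual domains")
```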

-----
fcanalyzer trace will show plogi and prli.

MDS9216I-86(config)# fcanalyzer local br limit-captured-frames 0

8.134606 64.05.00 -> df.40.7a 0x43c0 0xffff FC ELS PLOGI <<<<<<<
2 second delay to pass it on to next VSAN (added in 2.1.2b)
10.139095 53.00.00 -> 64.00.00 0x43c0 0xffff FC ELS PLOGI
10.139793 ff.ff.fc -> 64.00.00 0x8027 0x5e9 dNS ACC (GNN_ID)
10.148506 64.00.00 -> 53.00.00 0x43c0 0x0 FC ELS ACC (PLOGI)
10.148953 df.40.7a -> 64.05.00 0x43c0 0x0 FC ELS ACC (PLOGI)
10.149091 64.05.00 -> df.40.7a 0x43c0 0xffff FC ELS PRLI
10.149479 53.00.00 -> 64.00.00 0x43c0 0xffff FC ELS PRLI
10.151872 64.00.00 -> ff.ff.fc 0x8028 0xffff dNS GPN_ID
10.152399 ff.ff.fc -> 64.00.00 0x8028 0x5ea dNS ACC (GPN_ID)
10.163429 64.00.00 -> 53.00.00 0x43c0 0xffff FC ELS ACC (PRLI)
10.163791 df.40.7a -> 64.05.00 0x43c0 0xffff FC ELS ACC (PRLI)
10.171180 64.00.00 -> ff.ff.fc 0x8029 0xffff dNS GSNN_NN
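
To put a number on the delay in the trace above, here is a small Python sketch that parses the brief fcanalyzer lines and measures the gap between the PLOGI arriving in one VSAN and being passed on to the next. Only the PLOGI-related lines are pasted in, and the closing parenthesis on the last ACC line is added. The regex assumes the column layout shown above, and the function names are mine.

```python
import re

# PLOGI-related lines copied from the trace above.
TRACE = """\
8.134606 64.05.00 -> df.40.7a 0x43c0 0xffff FC ELS PLOGI
10.139095 53.00.00 -> 64.00.00 0x43c0 0xffff FC ELS PLOGI
10.148506 64.00.00 -> 53.00.00 0x43c0 0x0 FC ELS ACC (PLOGI)
10.148953 df.40.7a -> 64.05.00 0x43c0 0x0 FC ELS ACC (PLOGI)
"""

# timestamp, source, destination, two exchange-id columns, then the frame type.
LINE = re.compile(r"^(\d+\.\d+)\s+(\S+) -> (\S+)\s+\S+\s+\S+\s+(.*)$")


def parse(trace: str) -> list:
    """Return (time, src, dst, kind) tuples for every line that matches."""
    frames = []
    for raw in trace.splitlines():
        m = LINE.match(raw.strip())
        if m:
            frames.append((float(m.group(1)), m.group(2), m.group(3), m.group(4).strip()))
    return frames


def plogi_hold_seconds(frames: list) -> float:
    """Time between the original PLOGI and its forwarded copy."""
    plogis = [t for t, _src, _dst, kind in frames if kind.endswith("ELS PLOGI")]
    return plogis[1] - plogis[0]


if __name__ == "__main__":
    frames = parse(TRACE)
    print(f"PLOGI held for {plogi_hold_seconds(frames):.3f} s")
    print(f"PLOGI to final ACC: {frames[-1][0] - frames[0][0]:.3f} s")
```

On this trace the hold comes out at just over 2 seconds, and almost all of the PLOGI-to-ACC time is that hold.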

show ivr int pnat debug-history
21:44:12:is_sync_done() called - (1, 223)->TRUE
21:44:12:Received ELS_PLOGI from 0xfffc64 to 0xdf407a
21:44:12:Forwarding to (2, 0xfffc64, 0x640000)
21:44:12:Received ELS_ACC from 0x640000 to 0xfffc64
21:44:12:Forwarding to (1, 0xdf407a, 0xfffc64)
21:44:12:Received ELS_PRLI from 0xfffc64 to 0xdf407a
21:44:12:Forwarding to (2, 0xfffc64, 0x640000)
21:44:12:Received ELS_ACC from 0x640000 to 0xfffc64
21:44:12:Forwarding to (1, 0xdf407a, 0xfffc64)
21:44:12:Received ELS_PRLO from 0xfffc64 to 0xdf407a
21:44:12:Forwarding to (2, 0xfffc64, 0x640000)
21:44:12:Received ELS_ACC from 0x640000 to 0xfffc64
21:44:12:Forwarding to (1, 0xdf407a, 0xfffc64)
21:44:12:Received ELS_LOGO from 0xfffc64 to 0xdf407a
21:44:12:Forwarding to (2, 0xfffc64, 0x640000)
21:44:12:Received ELS_ACC from 0x640000 to 0xfffc64
21:44:12:Forwarding to (1, 0xdf407a, 0xfffc64)
00:12:12:Received ELS_PLOGI from 0xfffc64 to 0x530000
00:12:12:Forwarding to (1, 0xfffc64, 0x640500)
00:12:12:Received ELS_ACC from 0x640500 to 0xfffc64
00:12:12:Forwarding to (2, 0x530000, 0xfffc64)
00:12:12:Received ELS_PRLI from 0xfffc64 to 0x530000
00:12:12:Forwarding to (1, 0xfffc64, 0x640500)
00:12:12:Received ELS_ACC from 0x640500 to 0xfffc64
00:12:12:Forwarding to (2, 0x530000, 0xfffc64)
00:12:12:Received ELS_LOGO from 0xfffc64 to 0x530000
00:12:12:Forwarding to (1, 0xfffc64, 0x640500)
00:12:12:Received ELS_ACC from 0x640500 to 0xfffc64
00:12:12:Forwarding to (2, 0x530000, 0xfffc64)
00:12:15:Received ELS_ACC from 0x640000 to 0x530000
00:12:15:Routing (2, 0x640000, 0x530000) ->(1, 0xdf407a, 0x640500)
00:12:15:Forwarding to (1, 0xdf407a, 0x640500)
00:12:15:Received ELS_PRLI from 0x640500 to 0xdf407a <<<<<<<<<<<<<<<<<<<<<<
00:12:15:Routing (1, 0x640500, 0xdf407a) ->(2, 0x530000, 0x640000)
00:12:15:Forwarding to (2, 0x530000, 0x640000)
00:12:15:Received ELS_ACC from 0x640000 to 0x530000
00:12:15:Routing (2, 0x640000, 0x530000) ->(1, 0xdf407a, 0x640500)
00:12:15:Forwarding to (1, 0xdf407a, 0x640500)
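
The Routing lines in the debug-history above are the most useful ones, because they show the rewrite in both directions. Here is a rough Python sketch that pulls out the distinct rewrites from a pasted log. I am reading each triple as (VSAN, source FCID, destination FCID), which matches the Received lines next to them; the regex and the rewrites function are my own.

```python
import re

# Last few lines of the debug-history above.
LOG = """\
00:12:15:Received ELS_ACC from 0x640000 to 0x530000
00:12:15:Routing (2, 0x640000, 0x530000) ->(1, 0xdf407a, 0x640500)
00:12:15:Forwarding to (1, 0xdf407a, 0x640500)
00:12:15:Received ELS_PRLI from 0x640500 to 0xdf407a
00:12:15:Routing (1, 0x640500, 0xdf407a) ->(2, 0x530000, 0x640000)
00:12:15:Forwarding to (2, 0x530000, 0x640000)
00:12:15:Received ELS_ACC from 0x640000 to 0x530000
00:12:15:Routing (2, 0x640000, 0x530000) ->(1, 0xdf407a, 0x640500)
00:12:15:Forwarding to (1, 0xdf407a, 0x640500)
"""

TRIPLE = r"\((\d+), (0x[0-9a-f]+), (0x[0-9a-f]+)\)"
ROUTING = re.compile(r"Routing " + TRIPLE + r" ->" + TRIPLE)


def rewrites(log: str) -> list:
    """Return distinct ((vsan, s_id, d_id), (vsan, s_id, d_id)) pairs, in log order."""
    seen = []
    for m in ROUTING.finditer(log):
        v1, s1, d1, v2, s2, d2 = m.groups()
        entry = ((int(v1), s1, d1), (int(v2), s2, d2))
        if entry not in seen:
            seen.append(entry)
    return seen


if __name__ == "__main__":
    for before, after in rewrites(LOG):
        print(before, "->", after)
```

The two entries it finds are mirror images of each other: 0x640500/0xdf407a in VSAN 1 pair up with 0x530000/0x640000 in VSAN 2, the same mapping that show ivr fcdomain database lists.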


---------------

Troubleshooting:

If the fcns database does not have the FCID of the storage/hosts in the appropriate VSAN
and IVR NAT is enabled, collect the following:

show ivr internal fcid-rewrite-list
show ivr int event-history fcid-rewrite-fsm vsan 20 did 0x65440
show ivr int event-history pv-fsm pwwn 21:00:00:e0:8b:1e:22:82 vsan 20
show ivr int event-history pv-fsm pwwn 21:00:00:e0:8b:1e:32:82 vsan 20
show ivr internal event-history err
show ivr int event-history pv-fsm err
sh ivr internal area-port-allocation pwwn 21:00:00:e0:8b:1e:22:82
sh ivr internal area-port-allocation pwwn 21:00:00:e0:8b:1e:32:82
sh ivr internal area-port-allocation pwwn vsan 20
show ivr internal pvm

---

If you have issues with a host not talking to storage, even though
show ivr zoneset active
show zoneset active
show those devices zoned together and active:

Look at show fcns database vsan X, where VSAN X is the host VSAN, check the storage edge VSAN as well, and look at the FC4 type and node type. If a device is not exported correctly
even though it shows correctly in its native VSAN,
then it may be one of a few bugs,
but here are a few commands (non-disruptive) that you can try.
x9# ivr dev pwwn fcns register vsan
x9# ivr dev pwwn fcns register vsan
or
Here was my reply to the customer

Symptom:

0xde0010 - 50:---------------:24 on x9 vsan 21.

Action Plan 1:

shut/no shut on the device port 50:-----------------:24 and
50:-------------------:24
Action Plan 2:

ivr withdraw domain 0xde vsan 21
ivr withdraw domain 0x32 vsan 21
ivr refresh and see if it fixes the issue.

Action Plan 3:
ivr pv pwwn vsan 21 ns-query
ivr pv pwwn 50:------------:24 vsan 21 post 28


or Action Plan 4: (customer tried this and it worked)


x9# ivr dev pwwn 50:----------------:24 fcns register vsan 21
x9# ivr dev pwwn 50:------------:24 fcns register vsan 21

Bug might be
CSCsk49761

VSAN 21: then try this and see if fc4 type is populated correctly.
show fcns database | include 50:----------- :24
