
DSS iSCSI failover solved July 25, 2009

Posted by jamesisaac in Uncategorized.

Two weeks ago I was in the throes of a confusing puzzle of how to make iSCSI failover work with the open-e DSS. I was searching for the magic switch that would make it work, and I found it – “it” being ASCII character 20h, 32 decimal, good ol’ space.

Here’s the rundown – the DSS high availability construction kit goes like this:

  1. Create a volume on your “source” DSS.
  2. Create an iSCSI lun in that volume.
  3. Create a replica volume on the “target” DSS (“target”, “replica”, whatever you want to call it)
  4. Create an identical iSCSI lun, same name, same lun number.
  5. Configure a volume replication job on your source DSS. Here’s the important thing: don’t create the job name with a space. So, “replicate lv0000” will work for the replication job, but it won’t even show up in the iSCSI failover job list. Create your job and call it “replicate_lv0000” instead.
  6. Start the replication job and wait until the volumes are synchronized.
  7. Configure iSCSI failover – you should see your replication job listed.

It’s amazing and a little disconcerting to think of all the time wasted because one part of the UI allowed a job with a space in the name, and another part of the UI wouldn’t list jobs with spaces in their names.

Now, admittedly, I didn’t run this by support, nor have I seen the source code, so I may be barking up the wrong tree. All I know is that 20 hours later, the space character is the one change I made that allowed everything else to work.

DSS issues July 12, 2009

Posted by jamesisaac in Uncategorized.

Configuring one open-e DSS for iSCSI? Easy. Takes about 15 minutes. A little longer to format the volume, but actual config time is quick.

Configuring two DSSes to talk to each other and do the iSCSI automatic failover? Well, I’m ten hours in and it doesn’t work yet. I’m sure there’s a magic switch somewhere that I’m just not finding, because it looks like it should work. My problem at this point is that I just don’t get, conceptually, how they’re doing the networking. I have a three-segment network (LAN/management, iSCSI, and replication channel), and I can’t figure out what IP address to put where when doing the configs. I wish that open-e had a set of documentation that said, “Ok, here’s what’s going on behind the scenes. Figure it out.” That actually would help.

VLAN across WAN July 9, 2009

Posted by jamesisaac in Uncategorized.

Stop me if you’ve heard this: you can’t extend a VLAN across a WAN. Or the alternative comment: you can, but why would you want to? After all, a VLAN is a container for a broadcast domain, right? And those are done with local, physical entities. Routers act to block broadcasts, so your broadcast domain can’t extend past a router.

Sure, that’s true to one degree or another. In a bandwidth-constricted environment, forwarding all your broadcasts across a small pipe is a recipe for disaster. But what if you’ve got a larger pipe, say, 10 Mbps Ethernet, and you promise that you’ll selectively forward some VLANs and not others? Then can you do it?

I pursued this for practical and theoretical reasons, and found that you can in fact span a VLAN across a WAN by reaching waaaaay back and building a bridge. Yep, we’re going to bridge that WAN.

I have two routers, with two ethernet interfaces each. Fast0/0 is the inside and Fast0/1 is the outside on both routers. The secret is to create subinterfaces and encapsulate dot1q for your subinterfaces. That puts the VLAN tag on that traffic. Then, just enable bridging for each respective subinterface, and you’re gold.

This config is for Cisco routers. YMMV.

bridge crb
!
!
interface FastEthernet0/0
 description Corp local network
 no ip address
 duplex auto
 speed auto
!
interface FastEthernet0/0.1
 encapsulation dot1Q 3
 ip address 192.168.1.1 255.255.255.0
 no snmp trap link-status
!
interface FastEthernet0/0.102
 encapsulation dot1Q 102
 no snmp trap link-status
 bridge-group 102
!
interface FastEthernet0/0.103
 encapsulation dot1Q 103
 no snmp trap link-status
 bridge-group 103
!
interface FastEthernet0/1
 description Interface to DC
 no ip address
 duplex auto
 speed auto
!
interface FastEthernet0/1.3
 encapsulation dot1Q 3
 ip address 192.168.2.1 255.255.255.252
 no snmp trap link-status
!
interface FastEthernet0/1.102
 encapsulation dot1Q 102
 no snmp trap link-status
 bridge-group 102
!
interface FastEthernet0/1.103
 encapsulation dot1Q 103
 no snmp trap link-status
 bridge-group 103
!
bridge 102 protocol ieee
bridge 103 protocol ieee

So what I did was build three subinterfaces on this wire. VLAN 3 is routed using a subnet on one side and a different subnet on the other, with a tiny subnet in between to glue the two networks together. We use VLANs here even though this is just a routed network because the network ports on either side are full 802.1Q trunk ports. VLAN 102 and VLAN 103 are true “broadcast” VLANs. There’s no IP information contained in them, because you don’t use IP routing with a bridge. The secret sauce is configuring a bridge-group for each VLAN and then turning on broadcast traffic with the “bridge 102 protocol ieee” command, which assigns the IEEE spanning-tree protocol to that bridge group. This doesn’t show up explicitly in the configs but is not on by default (at least in the version of code I was using). The other router should be configured identically, except that the VLAN 3 information would be for the local network on the other side. Use the same VLAN encapsulation and bridge-group numbering.
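For illustration, here is a rough sketch of what the other router’s config might look like, trimmed to the lines that matter. The bridged subinterfaces are copied straight from the config above. The 192.168.3.0/24 LAN subnet is made up for the example, and the static route at the end is my assumption about how the two routed networks learn about each other, since the excerpt above doesn’t show any routing. The only address that isn’t a guess is 192.168.2.2, which is the one other usable host in the 192.168.2.0/30 link subnet.

```
! Far-side router (sketch; differs from the near side only in VLAN 3 addressing)
bridge crb
!
! Hypothetical far-side LAN subnet; substitute your own
interface FastEthernet0/0.1
 encapsulation dot1Q 3
 ip address 192.168.3.1 255.255.255.0
!
interface FastEthernet0/0.102
 encapsulation dot1Q 102
 bridge-group 102
!
interface FastEthernet0/0.103
 encapsulation dot1Q 103
 bridge-group 103
!
! Other end of the /30 link subnet (near side is 192.168.2.1)
interface FastEthernet0/1.3
 encapsulation dot1Q 3
 ip address 192.168.2.2 255.255.255.252
!
interface FastEthernet0/1.102
 encapsulation dot1Q 102
 bridge-group 102
!
interface FastEthernet0/1.103
 encapsulation dot1Q 103
 bridge-group 103
!
bridge 102 protocol ieee
bridge 103 protocol ieee
!
! Assumption: static route back to the near-side LAN via the /30 link
ip route 192.168.1.0 255.255.255.0 192.168.2.1
```

If you go the static route way, the near-side router would need the mirror-image route pointing at 192.168.2.2. A routing protocol would do the job just as well.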

I don’t recommend doing this with your main networks, as you will then be sending all of your broadcast traffic across the wire for (probably) no good reason. I’m doing it to fix some workstation deployment issues using a non-standard PXE boot appliance, as well as just to see if it is possible. Using VLANs in this manner essentially makes your VLAN’ed network portable between physical networks. Since the ip addresses don’t change (remember, it’s not a routed network), you can move your devices around from one site to another without having to renumber them. Keep in mind that their default router may be on the other side of the physical network, though, so you may want to fix that once you finish moving devices around.

WAN and VLAN issues July 8, 2009

Posted by jamesisaac in Uncategorized.

With the arrival of the DSS SANs, the project is moving ahead with great speed. We purchased two SANs from Silicon Mechanics, with the goal of configuring synchronous replication between them. Each arrived in a large box, containing the 2U chassis and a large pink foam insert with each drive packaged up separately. The Silicon Mechanics technicians had loaded the drives, configured the RAID groups, then taken everything apart and packed the drives along with instructions for reassembling everything at the destination site. I assume this improves the reliability by not shipping the storage server with drives installed. They did a very thorough job of protecting everything. The end result is 3.5 TB of fast, SAS-based storage. I’m extremely happy that we are able to use 1TB SAS drives for one of our RAID groups and not have to burn drive slots just to accommodate our larger, but less I/O-intensive VMs.

I spent a few hours figuring out how to trunk VLANs across the WAN and into the datacenter setup. On the Cisco router side, this involves configuring subinterfaces on the inside and outside interfaces of each router, and setting the “encap dot1q” for each subinterface. It’s an interesting game to play to try to telnet into the appropriate interface and change the ip config of the other interface, so that you don’t cut off the limb that you’re standing on, so to speak. After a few tries I resorted to the console cable and got everything squared away.

The Netgear switches proved to be another mindbender, though: even though I’ve done this before (and documented it), the VLAN trunking is just not intuitive for someone with a Cisco background. The other troublesome part of the equation is that some devices natively understand and inject their own VLAN information (e.g., the VMware host servers), but others do not and have to have their native VLAN set at the port. In the end, I found it easiest to set the native VLAN for each device to something other than 1 – that way I was certain that if I was reaching a device, it was through the appropriate VLAN.