Have you ever woken up from a work-related dream and been thankful it was just a dream? Designing and executing the network for a new data center will definitely give you a few night terrors. I recall that the night before our data center move, I dreamed that I woke up late on the day of the move. That’s not too bad, right? Well, for some reason, in my dream the data center was in another state instead of an hour away. Then my car did not start; I had to “borrow” a car, and I got lost on the way there. Again, I was thankful it was a dream. The actual move went a bit smoother. However, these types of projects can be real nightmares for teams. The main thing to do is plan. You then follow this with a few healthy doses of additional planning.
My company’s data center was in our world-wide headquarters. However, since it is an older building, the room that housed the data center was old and a little flawed. What do I mean by flawed? Well, once in a while water streamed below the raised floor. Unless you’re going for an underwater server rack, it is probably best to keep water nowhere near the room. The place had some power problems in the past as well. Finally, access to the room was not as secure as we wanted, since the facility teams needed access to generators and other equipment. When your company provides 24/7 services to world-wide locations, having a data center that is resilient and reliable is of utmost importance.
To Stay or Go?
Sure, a little tender loving care could have fixed some of the flaws I mentioned above, but how much would it cost? Let’s say a little more than moving the whole thing to a colo. When deciding to move a data center, cost will be one of the main factors. In our case, even after repairing all the flaws, our company would still have needed to provide maintenance and ensure things like power were not issues. If you have a dedicated team of people who do that, it might work out. However, we found it wise to move the data center to a place that has redundant power, cooling, and someone who monitors it day and night. It’s not free, but it ensures we provide the two Rs I mentioned above: resiliency and reliability, at least from a physical perspective. It is still up to the engineer to design a network that also delivers those two Rs.
Now the business gets some physical space back to turn into offices (which was needed anyway).
Yes, you might be the person, team or director who wants to stay and rebuild. Sometimes that is the only choice. If that is the route to take, you will have to take a look at multiple outage windows to accommodate construction. Will you have to move the equipment? If there is dust, you probably want to cover things up. That in itself is a hassle, but sometimes the only option.
Another big factor in staying or going is the budget. Can it be afforded? Are we ripping out the existing equipment, or is it a good opportunity to purchase new gear? The bigger the budget, the more options will exist.
So you’ve decided to move your data center. Good for you! Now unplug everything, toss it in the back of your car and hit the road. That sounds like a recipe for disaster. Moving a data center will consist of many steps and about the same number (or more) of meetings. Thank goodness you have a dedicated project manager to ensure nothing falls out of place. If you do, good for you! In our case, our data center team’s manager (servers, virtualization, etc.) was to play the PM role as well as lead his team through the changes needed. Whether you have a PM or not, you should be the PM for your own area. There are too many vendors, questions, scenarios and ideas to keep track of. The business will need to know what is going on and when. On the network side, things to think about and plan for:
- Are there any circuits that need to be activated by certain dates and with what providers?
- Is new equipment going to be purchased? Does it have all the memory, interfaces and licenses needed?
- Are you ripping out existing cabling or purchasing new? Color coded? How long do they need to be?
- Who is moving equipment and how?
- How long will any outage be?
- Where are the diagrams?
- Any existing equipment and cabling that will move, is it tagged?
- Do you have enough power in the new place for all the equipment that will be there?
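The power question in the list above comes down to simple arithmetic that is worth doing before anything is racked. Here is a minimal sketch in Python. The device names, wattages, circuit size and the 80% headroom figure are all hypothetical placeholders for illustration, not numbers from our move; use the maximum draw from your own data sheets and the limits your colo gives you.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    max_watts: float  # worst-case draw from the data sheet (hypothetical values below)
    count: int = 1


def total_watts(devices):
    """Sum the worst-case power draw of every device in the list."""
    return sum(d.max_watts * d.count for d in devices)


def fits_budget(devices, circuit_watts, headroom=0.8):
    """Return True if the total draw stays within a fraction of the circuit.

    headroom=0.8 is an assumed safety margin, not a rule from the colo.
    """
    return total_watts(devices) <= circuit_watts * headroom


# Hypothetical inventory for illustration only.
inventory = [
    Device("core-switch", 950.0, 2),
    Device("wan-router", 500.0, 2),
    Device("firewall", 400.0, 2),
]

if __name__ == "__main__":
    print("Total draw (W):", total_watts(inventory))
    print("Fits a 5000 W circuit:", fits_budget(inventory, 5000))
```

Keeping the inventory as data means the same list can later feed the tagging and cabling questions as well.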
In the network world, my manager and I had various discussions. We already knew the plan was to move and what the budget was. Now it was time to get organized. Since we used much of the equipment for the local site as well as the world-wide services, we could not just pick up and go. Equipment had to be purchased either for the new data center or, if we were to rip the existing gear out, for the existing site.
We felt it was a good idea to mostly start from scratch, purchasing new infrastructure for the data center. This gave us the important opportunity of pre-staging equipment. Anything you can do in advance, do it. Keep track of what you have done and what there is to do during each of the weeks leading up to the go-live.
We ordered new ISRs for internet connectivity, ASRs for the WAN and Catalyst 9500s for the core. We sprinkled in some smaller Cat 9300s for DMZs and management connectivity. Keep in mind, the 9500s are fiber switches. What do you do with all the copper connections you might have from different devices’ management interfaces? Because they are all-fiber switches, many purchases of GLC-TE SFPs followed. All of those little things add up. Purchasing one SFP or cable might not be a big deal; however, ordering a bunch of each will have a hefty price tag attached.
The rest of the equipment, such as firewalls/VPNs, virtualization and other servers, was to be taken from the existing site. This meant there were some re-IPs to be performed. In an effort not to re-point many things around the world to new IP addresses, it was better to re-IP a few non-critical services at the existing site. Obviously these mini-outages needed to occur off-hours. Those are all things to plan for. Sometimes you think there is only a linear path through a project, from point A to point B. However, the more you sit and plan, the more the bigger picture starts to appear. You then realize that right after point A, you need to visit point K. Then your vendors will remind you that point F needs to take place before point B. The hope is that you actually arrive at point B around the time you said you would. This is why it is important to plan early. Run through the plan in your mind as if you are in the middle of the move. Does it seem like a lot? There is nothing like running over to a warehouse miles away in the middle of the move for an SFP you desperately need. Thankfully, that did not happen. If you have the opportunity to order, order extras. It is better to have too many than too few.
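The detour from point A through K and F before reaching B is really a dependency ordering problem, and writing the dependencies down makes the hidden steps visible. Below is a rough sketch using Python’s standard `graphlib` module (Python 3.9+). The task names and their dependencies are hypothetical examples loosely modeled on the steps in this post, not our actual project plan.

```python
from graphlib import TopologicalSorter

# Map each task to the set of tasks that must finish before it can start.
# All task names here are hypothetical placeholders.
dependencies = {
    "activate circuits": {"order circuits"},
    "pre-stage new gear": {"order new gear"},
    "rack new gear": {"pre-stage new gear"},
    "re-IP non-critical services": set(),
    "move existing firewalls": {"rack new gear", "re-IP non-critical services"},
    "go live": {"activate circuits", "move existing firewalls"},
}


def plan_order(deps):
    """Return one valid order in which every task follows its prerequisites."""
    return list(TopologicalSorter(deps).static_order())


order = plan_order(dependencies)

if __name__ == "__main__":
    for step, task in enumerate(order, start=1):
        print(step, task)
```

If two tasks end up depending on each other, `static_order()` raises a `CycleError`, which is a useful signal that the plan itself needs another meeting.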
Let’s Get Physical
The new data center was ready to go. I had the equipment configured, diagrams created, orders of cabling and connectors flowing in. Now was the time to get in early and rack what I could. All of the new equipment was gently placed into my vehicle and off we went. Yes, there is always a risk of my car flying off a cliff with all of that gear, but it did not happen.
Once we reached our data center space in the colo, the gear was racked and connected. By this time, I had already sorted out the configuration, such as what would connect to each port, what routing protocols to use and which best practices to implement. Sometimes making changes to existing environments opens the floodgates to change requests and outage windows. However, with new equipment being set up, take advantage and look at what is best practice. What does Cisco recommend? If it works out in your environment, document and implement. Here is where you have the opportunity to configure resiliency and reliability. This is a world-wide data center; you had better have those things. With multiple WAN routers for MPLS connectivity and multiple providers for internet connectivity, I am hoping we are well on the way. However, those things are mostly physical. Logically, you need to design how things will fail over if there are physical outages. The goal is for things to happen automatically without intervention. My next post will go into some of the FHRPs, Event Manager and IP SLA configs used. Technology facilitates your success if you know how to use it.
As the day of the actual move approaches, double check and triple check everything. Perhaps I am a little paranoid, but with big projects, paranoia might be a good catalyst. If you have Change Management in your organization, by this time your implementation plans will be known to all involved in the project. Whenever moving equipment, keep your vendors on speed dial in case there is that well-timed power supply failure that occurs when you shut down equipment after 3 years. Again, thankfully that did not occur. The morning of the move came, and critical equipment from the old site (firewalls, wireless controllers, etc.) was yanked out by yours truly and driven out to the new data center. The team split in two: I was responsible for bringing up the new location to the world, and the other half was in charge of wrapping up configs and cleanup at the old site. Everything worked out pretty well with a few exceptions:
- The occasional broken clip on a cable, ensuring you scratch your head extra hard trying to find out why a particular server will not ping.
- Technology being technology when an HA pair of firewalls can’t decide who will be active.
- A few other devices you can’t ping, not because of broken clips but because of forgotten firewall rules.
I think because time was a friend, things worked out well. This is why I urge everyone to plan, plan and plan. Do as much as you can as early as possible. All core infrastructure has been working well now for about two weeks. However, that was just part one. Part two is the move of all servers, SANs and anything left in that realm. That move will occur right before the end of the year, but I am pretty confident all will go well. The road has been paved, which was the important part. With proper planning you lay down a good foundation to build successes on. Of course, there is always Murphy’s law. However, for anything that is under your control, planning will ensure a slam dunk. On to the next project!
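Surprises like the ones above are easier to catch with a scripted reachability check than by pinging devices one at a time. The following is a minimal sketch, not what we ran during the move. It uses a TCP connection attempt rather than an ICMP ping, since that needs no special privileges and also exercises the firewall rule for the service port. The host names, addresses (taken from the 192.0.2.0/24 documentation range) and ports are hypothetical.

```python
import socket


def tcp_probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def failed_targets(targets, probe=tcp_probe):
    """Return the (name, host, port) entries that the probe could not reach.

    The probe is injectable so the logic can be exercised without a network.
    """
    return [(name, host, port) for name, host, port in targets if not probe(host, port)]


# Hypothetical checklist; replace with the services that matter at your site.
targets = [
    ("intranet-web", "192.0.2.10", 443),
    ("jump-host-ssh", "192.0.2.11", 22),
]

if __name__ == "__main__":
    for name, host, port in failed_targets(targets):
        print(f"UNREACHABLE: {name} ({host}:{port})")
```

Running the same list before and after the move gives a quick diff of what broke, whether the cause turns out to be a cable clip or a missing rule.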