A data center consolidation is under way at global biotechnology company Vertex Pharmaceuticals Inc., and virtualization and virtualization management are playing a key role. Chris Pray, senior engineer for global information systems at Cambridge, Mass.-based Vertex, is overseeing this consolidation and steering the company down its virtualization path. He recently spoke with SearchCIO.com about virtualization management, including must-have policies, tools and procedures that will ensure the benefits the company has gained through virtualization are not lost as the environment quickly grows and shifts.
SearchCIO.com: How is your data center design strategy changing?
Pray: Up until about three months ago, we owned four data centers on-site in Vertex-owned buildings: one [data center] in San Diego, two in Cambridge and one in the U.K. We just finished consolidating the two in Cambridge into a colocation [he could not disclose the name of the colocation provider]. With that most recent [colocation], we moved about 450 servers, virtual and physical, along with a bunch of networking gear and backbone. The one back in July was a lot larger, probably double the size. We also have plans to colocate the sites in San Diego and the U.K.
Why are you moving away from owning data centers to colocation?
Pray: We outgrew our data centers on-site. Even with virtualization, we were still outgrowing the square footage, electricity and cooling of on-site data centers. Moving to a hosted facility gave us stability in all those areas, and scalability.
What virtualization path did you go down leading up to the consolidation?
Pray: There has to be a substantial argument for a nonvirtualized environment. We have a virtualization-first policy for provisioning. Sometimes there are arguments for not virtualizing, for either large databases or large file systems. We process a lot of scientific data and oftentimes that data is enormous, with enormous storage requirements. That exceeds the capabilities of VMware. Other times -- a high-end database for example -- the limitation is the memory. If we have one Oracle database that requires 64 gig in order to run effectively, that doesn't make a good case for a virtual machine.
When you started out, what benefits were you hoping to gain through virtualization?
Pray: First are operational expenses, HVAC cooling and electricity. Next is portability. Virtual machines are much more portable than physical machines. We can transfer workloads between operating systems easily. We can transfer operating systems between cluster and hosted resources easily. Upgrades are much easier to manage. Self-service is a big proponent for allowing applications and developers to have a single pane of glass to manage all their virtual machines and have visibility into them.
What other technologies are you combining with virtualization to optimize data center efficiency?
Pray: Server density is something Vertex has really been trying to do. We've not only taken a virtualization-first approach, but we've moved into high-density computing in the form of blades. So, we're standardizing on all blade hardware to constitute our [VMware] ESX clusters, which is a much more efficient approach when dealing with cooling, electricity and footprint. On a hosted facility, they charge you by the square footage, so if you can shave three racks down to half a rack, that's a big gain.
How has virtualization translated into gains for the business?
Pray: It's given the business a lot more stability in our infrastructure. It's a much higher-availability platform than standalone hosts. We have [VMware] High Availability and DRS [VMware Distributed Resource Scheduler] fully enabled on all of our clusters, so it provides peace of mind, knowing that if something does go down, it's going to come right back up.
Has application performance improved as a result?
Pray: Several applications performed better when we virtualized them. Just moving from one older server core at a slower speed, to a newer core at a higher speed, even with a virtualization layer in between, the application owners and business owners have accolades for speed increases at the application level.
How is virtualization changing your disaster recovery strategy?
Pray: Our DR strategy is a work in progress. Ultimately the goal is to have a primary site in Cambridge and a DR site in San Diego. In broad strokes, the strategy would work with our Hewlett-Packard EVA [Enterprise Virtual Array] storage arrays replicating at the SAN [storage area network] level using Continuous Access. Then we will use VMware Site Recovery Manager [SRM] to tie in and orchestrate a DR failover, with all of the scripted pieces that come with failing over to another location, meaning IP address changes, DNS [Domain Name System] changes, protected storage groups.
Right now we're not there. The SRM product is in place, it's been qualified, and we are working toward that end result. But, even in a virtualized platform, we are seeing increased speeds on our traditional combo backups. So we're getting a shorter completion time for systems breaks. Also, the recovery time objective is much smaller.
How difficult is this environment to manage?
Pray: It is quite complex to administer a virtual environment. There's a lot more moving parts, a lot more things to consider. You do get gains in efficiencies in administration, but there are also administration tasks that you need to consider when you are virtualized. It's extremely difficult to administer and manage something that you can't see.
What tools do you use for capacity management and other aspects of virtualization management?
Pray: We use tools like Akorri [Inc.'s BalancePoint] for capacity and monitoring. That [tool's] capacity management spans not just virtual machines but physical as well. We use Groundwork [Open Source Inc.'s network monitoring software]. It's sort of a monitor of monitors. It takes input from our Orion [network monitoring product] by SolarWinds [Inc.], VMware vSphere and Red Hat [Inc.'s Satellite Server], and warns us on down services, down servers, a downed network connection. We most recently purchased VKernel [Corp.'s] capacity management suite -- to answer the questions, "Where can I put these next dozen VMs?" and "What's it going to do to my resources?"
We also use the tool for rightsizing. Rightsizing is a very overlooked task in a virtual environment. [The tool] has an abstract slider that lets you adjust the CPU, memory, disk and network resources as necessary. Collecting data points on consumption of those resources gives you a projection of what's going to be needed in the future. It alarms when standard deviations are broken and thresholds are being crept up on, so it's a great tool for rightsizing virtual machines that are overprovisioned. The job of the tool is to make [the environment] lean, to do the job with just the resources required. That's really the whole objective of virtualization.
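To make the rightsizing idea concrete, here is a minimal Python sketch. It is not the VKernel tool's actual method, which the interview does not describe. It assumes usage samples have already been collected for one resource on one VM, and the function name, the `k` multiplier and the `headroom` factor are all hypothetical choices for illustration.

```python
from statistics import mean, pstdev


def suggest_allocation(samples, allocated, k=2.0, headroom=1.2):
    """Suggest a right-sized allocation from observed usage samples.

    samples   -- observed usage, same unit as `allocated` (e.g. GB of RAM)
    allocated -- what the VM is currently given
    k         -- standard deviations above the mean to cover (assumed value)
    headroom  -- extra safety multiplier on top of that (assumed value)
    """
    if not samples:
        raise ValueError("need at least one usage sample")

    # Treat mean + k standard deviations as the usage the VM should
    # realistically be able to absorb.
    expected_peak = mean(samples) + k * pstdev(samples)
    suggested = expected_peak * headroom

    return {
        "expected_peak": expected_peak,
        "suggested": suggested,
        # The VM is a rightsizing candidate when even the padded
        # estimate sits below its current allocation.
        "overprovisioned": suggested < allocated,
    }


if __name__ == "__main__":
    # A VM given 16 GB that steadily uses about 4 GB.
    print(suggest_allocation([4, 4, 4, 4], allocated=16))
```

A VM whose suggested figure sits well below its allocation is a candidate for shrinking. One whose usage is already pushing past the estimate is the "thresholds are being crept up on" case Pray mentions.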
How hard is it to find talent that specializes in managing virtual environments?
Pray: I've been looking for a year to fill full-time and contract positions. It's a difficult skill set to fill because you need to not only know about VMware, but you have to have a fundamental understanding of networking, storage, backups and operating systems. It's really a cumulative skill set. We're talking about shared resources here. You have to know how it all ties together.
How fast is your virtual environment growing?
Pray: We're probably 80% virtualized right now and climbing. It's a difficult target to hit. This environment is very big. We have 210 sockets of ESX, so that's roughly 110 hosts across three [data center] sites. Actually, I just installed seven hosts, so that's another 14 sockets, so we're about 224 sockets. There's also a last-minute order in for another 16 hosts, so that's 32 sockets. That's just ESX hosts, not virtual machines.
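The socket figures in this answer can be tallied in a few lines of Python. This is only a worked check of the numbers quoted. The two-sockets-per-host figure is an assumption inferred from "seven hosts, so that's another 14 sockets" and may not hold for every host in the environment.

```python
# Assumption: every new host has two CPU sockets, as implied by
# "seven hosts, so that's another 14 sockets".
SOCKETS_PER_HOST = 2

existing_sockets = 210   # figure quoted before the recent install
installed_hosts = 7      # hosts just installed
ordered_hosts = 16       # hosts on the last-minute order

after_install = existing_sockets + installed_hosts * SOCKETS_PER_HOST
after_order = after_install + ordered_hosts * SOCKETS_PER_HOST

print("sockets after recent install:", after_install)
print("sockets once the order arrives:", after_order)
```

Under that assumption, the pending order would take the environment from about 224 sockets to 256.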
How have your data center design policies and procedures changed in light of this speed of deployment?
Pray: Policies and procedures, it's not new, but the technology is moving faster than the documentation and the procedures, so it's always a catch-up game. Here and at other companies that I've worked at, policies and procedures grow organically as the technology evolves. You find as a seasoned administrator what works best and incorporate that into best practices. A lesson learned becomes something you follow through on the next time.
What is an example of a lesson learned that became a virtualization management policy?
Pray: Storage naming conventions. The environment I walked into had … three different storage arrays. One of the things that was difficult was identifying, through vSphere, all the different attributes of a particular piece of storage. I need to know where it is, what storage array it's on, what type of disks are on that array, what kind of protocol is being used to present the storage to the ESX host, and what the storage is going to be used for: production, validation, database virtual machines. So, one of the first things I did was put a naming convention in place for the data storage in the virtual environment. That has become exceedingly important as the storage environment grows. We're dealing with hundreds and hundreds and hundreds of terabytes across three different sites.
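A convention like the one Pray describes can be sketched in Python as a name that encodes each attribute in a fixed position. The interview does not give Vertex's actual format, so the field order, the hyphen separator and the sample values below (`cam`, `eva01`, `fc15k`) are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical field order; the real convention is not given in the interview.
FIELDS = ("site", "array", "disk", "protocol", "purpose", "index")


@dataclass(frozen=True)
class DatastoreName:
    site: str       # where the storage lives, e.g. "cam"
    array: str      # which storage array, e.g. "eva01"
    disk: str       # disk type on that array, e.g. "fc15k"
    protocol: str   # how it is presented to the ESX host, e.g. "fc"
    purpose: str    # production, validation, database, ...
    index: int      # running number within the group

    def render(self) -> str:
        """Build the datastore name from its attributes."""
        parts = [self.site, self.array, self.disk, self.protocol, self.purpose]
        for part in parts:
            # A hyphen inside a field would make the name ambiguous to parse.
            if not part or "-" in part:
                raise ValueError(f"invalid field value: {part!r}")
        return "-".join(p.lower() for p in parts) + f"-{self.index:02d}"

    @classmethod
    def parse(cls, name: str) -> "DatastoreName":
        """Recover the attributes from a name built by render()."""
        parts = name.split("-")
        if len(parts) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} fields in {name!r}")
        *text, index = parts
        return cls(*text, int(index))


if __name__ == "__main__":
    name = DatastoreName("cam", "eva01", "fc15k", "fc", "prod", 3).render()
    print(name)
    print(DatastoreName.parse(name))
```

The point of the fixed positions is that an administrator, or a script, can read the site, array, disk type, protocol and purpose straight off the name without looking anything up.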
What advice would you give others trying to manage a virtual environment?
Pray: Manage standardization, naming conventions, logical folder structures. Make sure you have a consistency across your virtual environment and an architecture that is easy to follow and easy to scale. If you don't, it will turn into the Wild, Wild West, and it will become increasingly harder to manage over time. Even if you start off small, use naming conventions, because as [the environment] grows two-, three-, ten-fold, it will be a nightmare to manage if you don't. Once it's in place, it's much more difficult to adjust later on and correct.