Ask anyone outside the enterprise IT department to identify the greatest data center cost, and they're apt to say servers. That's true in some businesses; hardware costs are not trivial. But the truth may be that large organizations are wasting money by blowing around a lot of hot air, thanks to inefficient data center airflow.
If you're running your data center inefficiently, as many companies do, the cooling budget can be twice the cost of buying and running the hardware. That's a lot of cash wasted on heating, ventilation, and air conditioning (HVAC).
A May 2015 IDC data center survey found that up to 24 percent of a data center budget can go to cooling. With a mean data center budget of $1.2 million for IDC's enterprise clientele, that means cooling alone can eat up nearly $300,000 a year.
In a word: Ouch!
Companies are getting more computing out of less power these days, thanks to the rise of virtual machines (VMs), container technologies such as Docker, and the addition of low-power, low-heat ARM servers to the data center. All of that has helped lower cooling costs. But the bottom line is that we need higher-density server racks to keep our clouds and data centers running, which means data center managers need to be better than ever at managing their cooling and their budgets.
Heating and cooling have been a problem since mainframes ruled IT. As more businesses move their compute power from the PC to the local server rack and to data center-based clouds that have tens of thousands of servers, the problem continues to intensify.
The wrong way
Enterprise IT has conventionally addressed these costs in several ways. The knee-jerk reaction is simply to add more cooling capacity, but cold air doesn't come cheap, so that's not a cost-efficient option. Another old-fashioned approach, long dismissed as wrongheaded, is to use raised floors for cooling.
IDC found that most data centers are not running anywhere close to peak cooling efficiency, as measured by power usage effectiveness (PUE). PUE is the ratio of the total power coming into the data center to the power that actually reaches the IT equipment.
"A PUE of 1.0 is considered very efficient, while anything between 2.0 and 3.0 is considered very inefficient," according to a research note from Kelly Quinn, IDC research manager. "When you get to PUE ratios at 2 and above, you are looking at a massive amount of power going not just to the IT side of the house but also to the facilities side."
The U.S. government recommended in a 2015 sustainability planning document, "New data centers shall be designed and operated to maintain a PUE of at most 1.4, and are encouraged to be designed and operated to achieve a PUE of 1.2."
Google and Facebook have already surpassed these recommendations. Google is managing a cool PUE of 1.12. Facebook reports that its Open Compute flagship data center in Prineville, Oregon, runs at 1.07 PUE.
The rest of us aren’t faring nearly so well. IDC found that more than two-thirds of enterprises logged a PUE of more than 2.0; 10 percent of enterprises had PUEs greater than 3.0 or didn't know the ratio.
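If you want to sanity-check your own numbers, the arithmetic is trivial. Here's a minimal Python sketch using the thresholds cited above; the example wattages and the bucket descriptions are illustrative, not IDC's.

```python
# A minimal sketch of the PUE arithmetic, using the thresholds cited above.
# The example wattages and the bucket labels are illustrative.

def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power usage effectiveness: total facility power / IT equipment power."""
    return total_facility_kw / it_equipment_kw

def describe(ratio: float) -> str:
    if ratio <= 1.2:
        return "excellent -- the level the U.S. government encourages for new builds"
    if ratio <= 1.4:
        return "meets the government's 1.4 ceiling for new data centers"
    if ratio < 2.0:
        return "decent, but there's room to improve"
    return "very inefficient -- a large share of the power never reaches the IT load"

# Example: 900 kW enters the facility, but only 400 kW reaches the IT racks;
# the rest goes to cooling, power distribution losses, and lighting.
ratio = pue(total_facility_kw=900, it_equipment_kw=400)
print(f"PUE = {ratio:.2f} -> {describe(ratio)}")  # PUE = 2.25 -> very inefficient ...
```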
The right way
So what can you do about it? Lots.
IBM's Robert Sullivan introduced the concept of hot aisle/cold aisle cabinets in 1992. In this configuration, cabinets are placed so the front of one cabinet never faces the back of another. It's that simple. By doing this, you create alternating rows of cold supply and hot return air.
So everyone's doing this now? Right? Wrong.
In a 2010 interview, Sullivan sniped, "What infuriates me is to see computer rooms that have been designed in the last three to five years that still have a legacy layout. Give me a break."
So if your data center isn't using hot/cold aisles yet, do it. Do it now.
Next, as Sullivan pointed out, you need to physically contain the aisles—especially the hot aisles—to maximize cost-efficient cooling.
As Vali Sorell, a data center HVAC engineer, points out in a recent article, "By closing off the hot and cold aisles (or ducting the hot return air out of the cabinets), the air flow dynamics within the data center" are greatly improved. This method is called containment.
So why isn't everyone doing it? First, there's that old human factor. We like having the ability to climb inside our cabinets, racks, and aisles. Containment makes that harder. Another problem is it's not easy to get the air pressure differences between hot and cold aisles just right.
Contained hot and cold aisles may be annoying to some of your staff, and they can be costly to set up right in the first place. But the long-term power bill reduction can save your company serious money.
To get containment right, you can't ignore the little details. Lars Strong, a senior engineer at Upsite Technologies and a data center HVAC specialist, explains that you must seal air gaps both within IT racks and between IT racks. Even the small space between the bottom of an IT rack or cabinet and the raised floor or slab can have a significant impact on IT inlet temperatures, says Strong.
Even rack doors and cables can interfere with airflow, according to a Cisco Data Center Power and Cooling White Paper. It concludes, "To the greatest extent possible, airflow obstructions should be removed from the intake and exhaust openings of the equipment mounted in the chassis. If a rack door is installed, it should be perforated and be at least 65 percent open. Solid doors, made of glass or any other material, inevitably result in airflow problems and should be avoided. Proper cable management is critical to reducing airflow blockage."
Sorell recommends the use of air restrictors to close unprotected openings at cable cutouts, explaining, "A single unprotected opening of approximately 12" x 6" can bypass enough air to reduce the system cooling capacity by 1 kW of cabinet load. When each cabinet has a cable cutout, a large proportion of the cooling capacity is lost to bypass."
You may also be running your data center at too low a temperature. IDC found that 75 percent of businesses kept data centers below 75°F, even though the American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE) has a recommended maximum of 80.6°F and an allowable maximum of 95°F.
This can make a huge difference. Jeff Klaus, Intel's general manager of Data Center Solutions, observed, "With each degree of increase, power bills reflect a two-percent increase in savings, which adds up significantly year after year." Pacific Gas & Electric (PG&E), in its Data Center Best Practices Guide, estimates that for a data center with typical infrastructure efficiency, "saving one Watt of power can save more than $15 over a three-year server lifespan; assuming infrastructure costs of $6/Watt and an annual average electric rate of $0.14 per kWh."
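Here's a rough back-of-envelope sketch in Python of how a per-watt figure in that range comes together. The $0.14/kWh rate and $6/Watt infrastructure cost come straight from the PG&E quote; the PUE values are my assumption, since the guide says only "typical infrastructure efficiency."

```python
# Back-of-envelope check of the "one watt saved is worth more than $15 over
# three years" figure. The electric rate and infrastructure cost come from
# the PG&E quote above; the PUE values in the loop are assumptions.

HOURS_PER_YEAR = 8760
ELECTRIC_RATE = 0.14       # $ per kWh (from the PG&E quote)
INFRA_COST_PER_WATT = 6.0  # $ per W of avoided capacity (from the quote)
LIFESPAN_YEARS = 3

def savings_per_it_watt(pue: float) -> float:
    """Dollars saved over the server lifespan for each IT watt eliminated.

    Every watt trimmed at the server also avoids (pue - 1) watts of facility
    overhead, so the total power avoided is pue * 1 W.
    """
    kwh_saved = pue * 1.0 * HOURS_PER_YEAR * LIFESPAN_YEARS / 1000.0
    return kwh_saved * ELECTRIC_RATE + INFRA_COST_PER_WATT

for pue in (1.5, 2.0, 2.5):
    print(f"PUE {pue}: ~${savings_per_it_watt(pue):.2f} saved per IT watt over 3 years")
# PUE 1.5 -> ~$11.52, PUE 2.0 -> ~$13.36, PUE 2.5 -> ~$15.20
```

The takeaway: the less efficient your facility, the more every wasted watt at the server actually costs you.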
Google, for example, runs its data centers at 80°F, and it operates some of the most energy-efficient data centers in the world. Here's how:
- Measure PUE. You can't manage what you don’t measure, so Google samples each server's temperature once per second.
- Manage airflow. This has two parts. First, by using thermal modeling, Google locates "hot spots" and gets a real-world view of airflow. This enables the data center staff to use techniques as basic as moving computer room air conditioners (CRACs) to just the right spots, eliminating hot spots and evening out the ambient temperature. (A toy sketch of flagging hot spots from inlet-temperature readings follows this list.)
While Google uses contained hot and cold aisles, the search giant also takes advantage of easier, inexpensive means of restricting airflow in server rooms and smaller data centers. Among them:
- Using blanking panels (or flat sheets of metal) to close off empty rack slots and prevent hot aisle air from seeping into the cold aisle
- Hanging plastic curtains—such as those used in commercial refrigerators—to seal off the cold aisle
- Enclosing areas with components that run hotter (such as power supply units) with plastic curtains
- Use free cooling. When possible, Google uses water instead of chillers, a.k.a. air conditioning units, to keep data centers cool and profitable.
Google does this in two ways. First, it uses hot tubs—no, not the kind you splash around in. These are large tanks of water fitted with cooling coils; fans pull the hot exhaust air from the servers across the water-filled coils, and the HVAC system returns the cooled air to the data center, where the servers draw it in again. In effect, it operates like a building-sized car radiator. Second, Google uses cooling towers that rely on evaporative cooling to maintain data center temperatures inexpensively. Used in conjunction with good airflow, this can reduce anyone's data center cooling bill.
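A script won't replicate Google's thermal modeling, but the basic idea of flagging hot spots from inlet-temperature readings is straightforward. Here's a toy Python sketch; the rack names and readings are made up, and the threshold is ASHRAE's recommended maximum cited earlier.

```python
# Toy illustration of hot-spot detection from rack inlet temperatures.
# The rack names and readings below are hypothetical; the 80.6 F threshold
# is ASHRAE's recommended maximum. Real deployments use thermal modeling
# and far denser sensor data.

ASHRAE_RECOMMENDED_MAX_F = 80.6

inlet_temps_f = {           # hypothetical sensor readings, one per rack
    "rack-a01": 74.3,
    "rack-a02": 83.1,       # likely hot spot: hot-aisle air recirculating?
    "rack-b01": 78.9,
    "rack-b02": 86.4,       # likely hot spot
}

hot_spots = {rack: t for rack, t in inlet_temps_f.items()
             if t > ASHRAE_RECOMMENDED_MAX_F}

for rack, t in sorted(hot_spots.items(), key=lambda kv: -kv[1]):
    print(f"{rack}: inlet {t:.1f} F exceeds ASHRAE recommended max "
          f"({ASHRAE_RECOMMENDED_MAX_F} F) -- check blanking panels, "
          "cable cutouts, and CRAC placement")
```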
Look at the server load, too. Servers that exclude all non-essential features, such as those from Open Compute, can operate in a higher-temperature environment, reducing the overall cooling load.
If this sounds like a lot of work, well, yes, it is. But according to Peter M. Curtis, who wrote the book on data center and high-availability system management, proper airflow management alone can provide up to 30 percent in energy savings.
Curtis also notes that, in some situations, mixing the cold supply air with the hot exhaust air in the data center can yield a more uniform temperature distribution. This does not mean you should ignore all the other practical advice here. It means that instead of focusing on individual AC issues, aisles, and cabinets, you should look at the system level to increase the overall efficiency of your entire data center.
This is not easy. To really manage your data center's energy use and airflow, you need expert help, and that help won't come cheap. How much can you save by getting your airflow just right? PG&E estimates that "Combined with an air-side economizer, air management can reduce data center cooling costs by over 60%." I don't know about your CFO, but I know if I had ever told mine I could cut 60 percent of our data center's HVAC costs, I would have gotten a promotion.