In the "good ol' days" of mainframes -- now called enterprise servers -- the objective was to cool the whole data centre, keeping everything at one uniform temperature of around 55 degrees Fahrenheit (13 degrees Celsius). Now that we've disguised blast furnaces as server racks, the data centre cooling model has changed. We now aim to cool individual pieces of equipment and are not as concerned with overall ambient temperatures.
We can also learn a great deal about data centre cooling from the McDLT sandwich that McDonald's introduced in 1985. A physical barrier in the packaging kept the hot part of the burger (the patty) hot and the cold side of the meal (lettuce and tomato) cool. McDonald's patented packaging is analogous to today's data centre cooling strategy -- don't mix hot and cold air.
The basic issue is that a data centre's airflow is rarely planned based on data and facts. It's hard to manage what can't be seen. Every CIO or CTO should require that the data centre manager model the facility's airflow and use the results to architect tile layout and equipment placement.
Two camps exist: those who believe that actual temperature and airflow should be measured with probes and meters, and those who believe that mathematical modelling is sufficient. Actual measurements may be more accurate (at a point in time), but taking them consumes a great deal of time and, more importantly, doesn't enable the modelling of new equipment arrangements or the design of new data centres.
Mathematical airflow models have been validated by predicting expected results and testing them against actual measurements in real data centres. The differences between the two sets of measurements were statistically insignificant. With airflow modelling, a consistent set of data can be used to compare existing and future designs. Modelling also enables testing of component failure scenarios, such as the impact of losing an air conditioner on equipment operations.
Data centre tile placement
Most people involved in data centre operations are generally not aware of the science behind the placement of perforated raised-floor tiles or behind the choice of what percentage of perforation to use. When asked why a particular tile placement was chosen, the answer is usually, "Because it felt warm here."
Modelling can also demonstrate the impact of unmanaged bypass airflow -- cold air that returns to the CRAC without ever being used to cool equipment. Bypass paths typically include cable openings in the raised floor and other places where underfloor air leaks out; as much as 53% of cooled air can escape through such gaps without removing any equipment heat. One study found that perforated tile airflow improved by 66% just by sealing cabling openings, which led to a 3 kW increase in available rack power consumption and an elimination of hot spots.
To maintain a constant operating temperature, a server needs an amount of cool air proportional to the power it consumes. Bypass air, blockages, recirculated hot air and airflow restrictions all affect this, but the front of a piece of equipment, where it draws in cool air, needs a very specific amount of air, measured in cubic feet per minute (CFM). The typical design also assumes that air exiting the equipment is 20 degrees Fahrenheit (11 degrees Celsius) warmer than the air entering it.
The two-foot-by-two-foot tile directly in front of a server rack should be the sole source of that rack's cooling needs. All the airflow coming out of the tile should be drawn into the rack and the installed equipment.
Perforated tiles and the magic number
The question is what size perforations the tile should have. An optimal selection maintains equipment temperature while limiting wasted cooling capacity, flow and pressure. The first step is to determine peak power consumption -- that's the number listed on the placard, typically on the back of the equipment. However, that number does not represent what the equipment actually draws. The real number is somewhere between 25% and 60% of what's on the placard, depending on whether the server or equipment runs constantly or only during certain periods. Using 45% to 50% of the rated power draw is a good target. Today, the leading practice is to install active energy monitoring and management software to identify and track component power consumption.
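As a rough sketch of that placard-to-actual estimate (the function name and the 47.5% default are illustrative, not from any standard tool), the calculation might look like this in Python:

```python
def estimate_actual_draw_kw(placard_kw, duty_factor=0.475):
    """Estimate actual power draw from the placard (nameplate) rating.

    Measured draw typically falls between 25% and 60% of nameplate;
    the 47.5% default splits the suggested 45%-50% planning range.
    """
    if not 0.25 <= duty_factor <= 0.60:
        raise ValueError("duty factor outside the typical 25%-60% range")
    return placard_kw * duty_factor

# A rack of servers with a combined 5.0 kW nameplate rating:
print(estimate_actual_draw_kw(5.0))  # 2.375 kW expected draw
```

Monitoring software, where installed, replaces this guesswork with measured per-component figures.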
The heat management of servers and other data centre equipment is designed to warm the incoming air by 20 degrees Fahrenheit (11 degrees Celsius) as it carries heat away, while maintaining a consistent internal equipment temperature. There is a fixed relationship between heat load (the equipment power consumption) and airflow rate (the cold air needed to maintain the desired air temperature rise). If the desired temperature rise is a constant (20 degrees Fahrenheit), the air density and the specific heat of air can be folded into a simple constant, or magic number, of 154. (The full fluid dynamics details are beyond the scope of this article.)
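For readers curious where the magic number comes from, here is a back-of-the-envelope derivation using the sensible-heat relation for air. The air density shown is an assumption back-solved to reproduce 154; the more common sea-level textbook value of 0.075 lb/ft^3 yields roughly 158, so treat the exact constant as approximate:

```python
# Sensible heat carried by an airstream: q = 60 * rho * c_p * CFM * dT
# with q in BTU/hr, rho in lb/ft^3, c_p in BTU/(lb*F), dT in F.
BTU_PER_HR_PER_KW = 3412.14  # 1 kW of heat load expressed in BTU/hr
RHO = 0.077   # lb/ft^3 -- assumed air density (0.075 is the usual sea-level value)
CP = 0.24     # BTU/(lb*F) -- specific heat of air at typical conditions
DT = 20.0     # F -- designed temperature rise across the equipment

# CFM needed to carry away 1 kW of heat at a 20 F rise:
cfm_per_kw = BTU_PER_HR_PER_KW / (60 * RHO * CP * DT)
print(round(cfm_per_kw))  # prints 154
```

This also shows why the constant needs adjusting for altitude: thinner air (lower rho) means more CFM per kilowatt.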
Once the total expected power draw of all the components in the rack is calculated, the total expected power needs for the rack is known. This needs to be converted into the cooling needs of the rack, and that's where the magic number of 154 comes into play (there is some adjustment required for altitude). The total rack power consumption in kilowatts multiplied by 154 provides the total airflow in CFM that is required to maintain the appropriate temperature for that equipment.
For example, if the total power consumption of a rack is 2.5 kW, then 385 CFM (2.5 times 154) is needed to take 55-degree air and raise it 20 degrees to 75 degrees Fahrenheit (24 degrees Celsius), which is the typical CRAC return air temperature setting. So what tile perforation -- measured in perforation percentage -- is needed? It depends on many factors, and this is why airflow modelling is required.
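Putting the magic number to work, a minimal helper (the names are illustrative) converts a rack's power draw into the tile airflow it needs:

```python
MAGIC_CFM_PER_KW = 154  # per-kW airflow constant from the article; adjust for altitude

def required_cfm(rack_kw, cfm_per_kw=MAGIC_CFM_PER_KW):
    """Airflow (CFM) the perforated tile in front of a rack must deliver
    to hold the designed 20 F temperature rise across the equipment."""
    return rack_kw * cfm_per_kw

# The 2.5 kW example rack from the text:
print(required_cfm(2.5))  # prints 385.0
```

Whether a given tile can actually deliver that flow depends on underfloor pressure and the factors the modelling captures.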
There's no easy way to determine tile airflow output: either the airflow is measured with an air volume meter or it is modelled. And knowing the flow at one specific location does not tell you the flow even one tile over. Given the broad set of influencing factors, what is sufficient at one tile may be completely wrong at the adjacent tile.
Get the airflow wrong on the low side and the servers will run hot, which could affect equipment reliability. Get it wrong on the high side and you're wasting money that could otherwise go to value-adding activities. Wasted airflow in one spot means that equipment in another location may not receive the cooling it needs.
It is also possible that the results will show that not enough CFM can be delivered through a floor tile -- after all, a hurricane can't be pushed through a keyhole. In those cases, alternative cooling techniques are needed. These include the use of water, which is up to 4,000 times more efficient than air at removing heat; point cooling solutions, which cool from above or from the sides of racks; and rear-door and side-car heat exchangers. These solutions can remove as much as 70 kW of heat from a single cabinet and can help to dramatically reduce data centre floor space requirements through significant improvements in equipment density.
Take steps toward efficient data centre design
With this information, data centre managers today can walk through their facilities and quickly identify 15% tactical power savings through bypass airflow mitigation. A more strategic set of activities is to model the data centre's airflow and redesign the raised-floor layout by appropriately placing perforated tiles based on the numbers. For financially strapped IT organisations (which organisation isn't in these economic times?), not optimizing rack cooling is tantamount to IT malpractice.
ABOUT THE AUTHOR: Lucian Lipinsky de Orlov is Director of Business Strategy for VIRTERA, an independent IT professional services and consulting firm that delivers virtualisation technologies and services to companies across the U.S. VIRTERA's proven vSpectrum consulting method helps clients in the successful and rapid adoption of virtualisation and green IT technologies while delivering optimum ROI. For additional information on how to reduce power consumption and costs in a virtual environment, please visit http://www.virteratech.com/index.php/site/solutions_overview.