This article of mine will appear in Logistics 2.0 magazine's special issue on Green IT.
The green revolution of the 1960s and 1970s increased agricultural yields in many developing countries. A different kind of green revolution is now under way in today’s environmentally conscious world, this time in information technology (IT) systems. A data center that hosts application servers, network switches and storage devices is considered green if it consumes less power for running and cooling the computer systems it hosts. Data center operators are very conscious of power consumption because the power substation that supports a data center can supply only a limited amount of power. Once that limit is reached, the operator has to open a new data center, which will initially not be fully utilized. Thus, in today’s cost-conscious era, most data center operators are vigilant about a) ensuring that computing resources are not underutilized (because underutilized resources still consume a lot of power even when they are not doing useful work), b) procuring devices that consume less power, and c) using efficient data center cooling mechanisms. In this article we will discuss how data center administrators deal with the following issues:
* How to increase resource utilization?
* How to evaluate the resources they are purchasing with respect to power?
* How to ensure the devices are consuming less power?
* How to design/leverage efficient data center cooling mechanisms?
How to increase resource utilization?
Memory, CPUs, fans, network cards and disks are some of the important computer resources that consume power. The input workload determines how much work these resources have to perform, and this, in turn, dictates the amount of power consumed by the computer system. It is interesting to note that a system that is 50 percent utilized does not consume much more power than a system that is 10 percent utilized, because a large part of the power draw is a fixed cost that is paid whenever the system is switched on. Hence, a single box that is 50 percent utilized consumes less power than five boxes that are each only 10 percent utilized. In the past, the software running on a box was tightly coupled to the hardware, and thus it was very difficult to dynamically move an application from one computer to another. However, with the emergence of hypervisor (server virtualization) technologies like Xen, Hyper-V, and VMware, it is now possible to move applications between computer systems dynamically, and thus to increase overall system utilization. Hypervisor technology allows multiple applications to run on a single box, and the failure of one application does not affect the execution of the other applications running on that computer. Hypervisor technology has been around since the 1960s (the IBM VM operating system), but only recently has it become available on commodity hardware, and thus it has become more prevalent. Most data center operators are redesigning their data centers to leverage this technology.
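The consolidation argument above can be checked with a little arithmetic. The following is a minimal sketch that assumes a simple linear power model (a fixed idle draw plus a part that grows with utilization). The wattage figures and the `server_power` helper are hypothetical and chosen only for illustration; real servers should be measured.

```python
# Hypothetical linear power model: a server draws a fixed idle power plus a
# component that grows with utilization. The wattages are assumed, not measured.
IDLE_WATTS = 200.0   # assumed fixed cost paid whenever the box is powered on
PEAK_WATTS = 300.0   # assumed draw at 100 percent utilization


def server_power(utilization: float) -> float:
    """Estimated draw in watts for a utilization between 0.0 and 1.0."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    return IDLE_WATTS + (PEAK_WATTS - IDLE_WATTS) * utilization


# Same total work, two ways of placing it.
five_lightly_loaded = 5 * server_power(0.10)   # five boxes at 10 percent each
one_consolidated = server_power(0.50)          # one box at 50 percent

print(f"five boxes at 10%: {five_lightly_loaded:.0f} W")
print(f"one box at 50%:    {one_consolidated:.0f} W")
```

Under these assumed numbers the fixed idle cost dominates, so running the same work on one box avoids paying that cost five times over.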
How to evaluate resources with respect to their power consumption properties?
Standards organizations like SNIA (Storage Networking Industry Association) are grappling with the task of how to rate a storage device with respect to its power consumption properties. For example, a storage device might use flash technology instead of disk drives, and thus consume less power while being more expensive. So it is not prudent to look at the power consumed by a device in isolation. Instead, one should look at power consumption in conjunction with performance, availability, reliability, physical shelf space requirements, and cost. System administrators therefore have to make trade-offs between these parameters, and there is a need for new power-related metrics such as IOs/Watt, IOs/Watt/dollar, or Watts/cubic foot. Furthermore, administrators need to look at their application workloads to select the proper type of computer resources. For example, the I/O characteristics of archival workloads differ from those of on-line transaction processing (OLTP) workloads. In archival workloads high throughput is not a concern, whereas in OLTP workloads throughput requirements are very important. Thus, one can purchase a storage system with slower RPM disks for archival workloads than for OLTP workloads, because slower RPM disks consume less power and are usually cheaper.
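To make the metrics above concrete, here is a minimal sketch that computes IOs/Watt, IOs/Watt/dollar and Watts/cubic foot for a few devices. The `StorageDevice` class and every figure in the `devices` list are made up for illustration; they are not vendor data or SNIA-defined values.

```python
from dataclasses import dataclass


@dataclass
class StorageDevice:
    """Hypothetical device record; all field values below are assumed."""
    name: str
    iops: float        # I/O operations per second
    watts: float       # average power draw
    dollars: float     # purchase price
    cubic_feet: float  # physical volume occupied

    @property
    def iops_per_watt(self) -> float:
        return self.iops / self.watts

    @property
    def iops_per_watt_per_dollar(self) -> float:
        return self.iops / self.watts / self.dollars

    @property
    def watts_per_cubic_foot(self) -> float:
        return self.watts / self.cubic_feet


# Made-up figures, for illustration only.
devices = [
    StorageDevice("flash", iops=20000, watts=5, dollars=2000, cubic_feet=0.02),
    StorageDevice("15k-rpm-disk", iops=200, watts=15, dollars=300, cubic_feet=0.02),
    StorageDevice("7200-rpm-disk", iops=80, watts=8, dollars=100, cubic_feet=0.02),
]

for d in devices:
    print(f"{d.name:14s} {d.iops_per_watt:8.1f} IOs/W  "
          f"{d.iops_per_watt_per_dollar:7.4f} IOs/W/$  "
          f"{d.watts_per_cubic_foot:6.0f} W/cu ft")

# Pick the device that does the most I/O per watt.
best = max(devices, key=lambda d: d.iops_per_watt)
print("best IOs/Watt:", best.name)
```

Which metric to rank by depends on the workload: an archival system may care more about cost and watts than about IOs, so no single column should be read in isolation.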
How to ensure that devices are consuming less power?
Both pro-active and re-active techniques can be used to reduce the power consumed by computer devices.
Pro-active techniques: These techniques ensure up front that devices consume less power. For example, one can cut down on the number of disks by using higher capacity disks. A one Terabyte disk will consume less power than ten 100 Gigabyte disks. Similarly, one can reduce the number of copies of data, and use data compression, thin provisioning and data de-duplication to reduce the amount of data being stored on disks. This, in turn, reduces the number of disk drives being used, which leads to less power consumption. One can also use Flash drives or higher efficiency power supplies to pro-actively cut down on the amount of power being consumed.
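The disk-count reasoning above can be sketched as follows. This is a rough estimate under stated assumptions: every drive is assumed to draw the same power regardless of capacity, and the 2:1 data reduction ratio is hypothetical. The helper names are introduced here only for illustration.

```python
import math

WATTS_PER_DISK = 10.0  # assumed: same draw for small and large drives


def disks_needed(data_gb: float, disk_capacity_gb: float) -> int:
    """Number of whole drives required to hold data_gb."""
    return math.ceil(data_gb / disk_capacity_gb)


def array_power(data_gb: float, disk_capacity_gb: float,
                reduction_ratio: float = 1.0) -> float:
    """Estimated watts after shrinking the data by reduction_ratio
    (for example through compression or de-duplication)."""
    stored_gb = data_gb / reduction_ratio
    return disks_needed(stored_gb, disk_capacity_gb) * WATTS_PER_DISK


# One Terabyte of data on ten 100 GB disks versus a single 1000 GB disk.
small_disks = array_power(1000, disk_capacity_gb=100)
large_disk = array_power(1000, disk_capacity_gb=1000)

# Ten Terabytes on 1000 GB disks, without and with an assumed 2:1 reduction.
no_reduction = array_power(10000, disk_capacity_gb=1000)
with_reduction = array_power(10000, disk_capacity_gb=1000, reduction_ratio=2.0)

print(f"ten 100 GB disks: {small_disks:.0f} W, one 1 TB disk: {large_disk:.0f} W")
print(f"10 TB raw: {no_reduction:.0f} W, after 2:1 reduction: {with_reduction:.0f} W")
```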
Re-active techniques: In re-active techniques one dynamically changes the state of a physical resource from a high power state to a low power state. CPUs, memory and disk drives can all be transitioned dynamically between different power states. It is important to note that there is a trade-off between power consumption and performance when one transitions a device to a lower power state. For example, if we spin down or shut down a disk drive, the next read from that drive will incur higher latency while the drive spins back up. This behavior is not acceptable for every type of workload. For example, interactive applications cannot wait for disks to spin up.
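The spin-down trade-off described above can be sketched as a break-even calculation: spinning down only saves energy if the disk stays idle long enough to pay back the energy spent spinning up again, and only makes sense if the workload can tolerate the spin-up delay. All constants and the `should_spin_down` policy below are hypothetical, not taken from any drive specification.

```python
# Assumed drive characteristics (illustrative values only).
IDLE_WATTS = 8.0         # platters spinning, no I/O
STANDBY_WATTS = 1.0      # platters stopped
SPIN_UP_JOULES = 140.0   # extra energy needed to spin back up
SPIN_UP_SECONDS = 10.0   # delay seen by the first request after spin-down


def break_even_seconds() -> float:
    """Idle time beyond which spinning down saves energy overall."""
    return SPIN_UP_JOULES / (IDLE_WATTS - STANDBY_WATTS)


def should_spin_down(expected_idle_seconds: float,
                     max_latency_seconds: float) -> bool:
    """Spin down only if the workload tolerates the spin-up delay
    and the expected idle period exceeds the break-even point."""
    if SPIN_UP_SECONDS > max_latency_seconds:
        return False
    return expected_idle_seconds > break_even_seconds()


# An archival volume that can wait 30 s versus an interactive one that cannot.
archival = should_spin_down(expected_idle_seconds=3600, max_latency_seconds=30)
interactive = should_spin_down(expected_idle_seconds=3600, max_latency_seconds=0.1)

print(f"break-even idle time: {break_even_seconds():.0f} s")
print("spin down archival volume:   ", archival)
print("spin down interactive volume:", interactive)
```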
How to design efficient cooling mechanisms?
In the past, people assumed that every 1 watt of power consumed required 1 watt of power for cooling. Now, however, sophisticated data centers are being built to reduce the power required for cooling. Data center builders are using hot aisles and cold aisles, and are also encasing (insulating) the racks to ensure that hot air does not mix with the freshly supplied cold air. Data center designers are also using blanking panels to fill empty space in racks in order to manage air flow efficiently, and are employing raised floor designs to facilitate better air circulation. People are also locating data centers in regions where the outside air temperature and humidity are optimum (a temperature range of 20 to 25 degrees C, and a humidity range of 40 to 45 percent with a maximum dew point of 17 degrees C). Some system designers have started to use water cooling in lieu of air cooling in order to remove heat from hot systems more efficiently. However, the plumbing infrastructure required for water cooling leads to higher startup costs. Taken together, these cooling techniques now lead to a ratio of less than 1 watt of cooling for every 1 watt of power consumed.
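The effect of the cooling ratio on the total power a data center draws can be sketched with simple arithmetic. The 1000 kW equipment load and the improved ratio of 0.5 are hypothetical figures used only to illustrate the calculation; they are not measurements from any facility.

```python
def facility_power_kw(it_load_kw: float, cooling_watts_per_it_watt: float) -> float:
    """Total draw: equipment load plus the cooling needed to remove its heat."""
    return it_load_kw * (1.0 + cooling_watts_per_it_watt)


IT_LOAD_KW = 1000.0  # assumed equipment load

traditional = facility_power_kw(IT_LOAD_KW, 1.0)  # 1 W of cooling per 1 W consumed
improved = facility_power_kw(IT_LOAD_KW, 0.5)     # assumed ratio after improvements

print(f"traditional: {traditional:.0f} kW")
print(f"improved:    {improved:.0f} kW")
print(f"saved:       {traditional - improved:.0f} kW")
```

Because the substation limit applies to the total, every watt saved on cooling is a watt that can be given to additional equipment before a new data center is needed.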
In conclusion, it is important to note that power consumption now stands alongside raw performance as a quantitative way of evaluating a computer system. Going forward, as standards bodies produce new power measurement units, power efficiency will become another key differentiator between the products of different vendors.
