post

A DCIM Quip

All the vendors sell me
In their unique and special way
They claim that I can do it all
“I’m an essential business play”

[Read more...]

post

DCIM: An Operator’s Perspective — David Schirmacher, Digital Realty

A great video providing a sneak peek into David Schirmacher’s upcoming discussion at the 2013 Uptime Symposium

post

A Colocation Efficiency Metric

Tony Greenberg, CEO of RampRate, recently distilled Digital Service Efficiency (eBay’s newest metric) into the more generic term ‘Service Efficiency’, with the goal of applying this metric to specific industries for more widespread adoption (see Tony’s blog here). Thank you, Tony, for finally pointing us in the right and logical direction. Tony challenged us to think about Service Efficiency as a metric for multiple industries. I’m going to take Tony up on this challenge and throw the ball back into his own backyard: let’s define Service Efficiency for the colocation industry.

Let me provide some background on why Service Efficiency is the right way to evaluate a data center’s true efficiency. eBay’s recent in-house metric, “Digital Service Efficiency” (or DSE…yet another three-letter acronym), is a practical use of the soup-to-nuts metric that industry experts have been talking about for years. I remember my first data center energy efficiency seminar with Dr. Bob Sullivan, who said the ultimate goal is to standardize on compute load…essentially defining the efficiency of a data center by what efficiency really is: output over input. eBay is in a strong position to quantify this metric because they own (and, more importantly, can measure) the energy input to the data center as well as the output, which in their case is online market transactions (see their DSE dashboard here). Electricity comes in, transactions go out. This is clear, but only for businesses in which ecommerce transactions are the output. How would this work for other types of industries?
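The "output over input" idea can be sketched in a few lines of Python. This is only an illustration, not eBay's actual DSE calculation: the function name `service_efficiency` and the sample figures are hypothetical, and the unit of output is whatever the business counts as useful work.

```python
def service_efficiency(output_units: float, energy_kwh: float) -> float:
    """Return useful output delivered per kWh of energy consumed."""
    if energy_kwh <= 0:
        raise ValueError("energy_kwh must be positive")
    return output_units / energy_kwh


# Hypothetical figures, for illustration only (not real eBay data).
transactions = 1_200_000   # output: transactions completed in the period
energy_kwh = 48_000        # input: energy used by the data center in the period

print(service_efficiency(transactions, energy_kwh))  # 25.0 transactions per kWh
```

Swapping the numerator for another industry's output (views, gaming hours, units shipped) gives that industry's version of the same ratio.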

Tony recognized this limitation when he wrote: “eBay’s transaction may differ from Gap’s transaction, which in turn definitely differs from a Ford, Boeing, Fidelity, or GE transaction.” He therefore dropped the ‘D’ and challenged us all to think about ‘Service Efficiency.’ Service Efficiency, however, needs to be industry specific. He offers a set of common industry outputs to define and standardize on in the chart below.

Industry | Potential DSE Denominator
Retail, Finance | Transaction
Online Gaming | Gaming Hour
Digital Media | View
Mfg | Unit Shipped

I would add one more industry to this chart, one that is growing 30%-40% year over year, probably faster than enterprise-owned data centers: the colocation industry. Using the same structure as Tony, the outputs for colos are included below:

Industry | Potential DSE Denominator
Retail, Finance | Transaction
Online Gaming | Gaming Hour
Digital Media | View
Mfg | Unit Shipped
Colocation | Power, Cooling, Network

Just as eBay provides transactions to its customers, colocation companies provide their customers with reliable power, reliable cooling, and network connectivity. The input is electricity…and the output is reliable power, reliable cooling, and rich network connectivity options. A Service Efficiency metric based on these parameters will reflect the true efficiency of data centers in the colocation market. Furthermore, this industry is unique in that the pricing of these services is directly dependent on the amount of electricity used by the data center. Colos are required to pass energy costs through to their customers, so reducing energy cost is a means to offer a lower price point. The Service Efficiency metric should reflect this. The only way colos can reduce energy costs without risking reliable power, reliable cooling, and network connectivity is by running their operations extremely intelligently and efficiently.

A colo Service Efficiency metric can thus be used by tenants to evaluate which providers are not only the most efficient, but more importantly, offer attractive pricing for the services provided.

And who wouldn’t want a colo with these characteristics?

post

2013 State of the Union on Data Centers

Mr. Speaker, Mr. Vice President, members of Congress, fellow Americans, it is my task to report the state of the union’s data centers. To improve it is the task of us all.  In this blog post, thanks to the grit and determination of the American people, there is much progress to report.

[Read more...]

post

What Causes Hot Spots?

The single issue that every facility manager hates most is hot spots. They are like a disease. You cannot see them and you only become aware of them by observing their symptoms. And beware when you do. Hot spots that continue for extended periods of time can cause poor server performance and, in the most extreme cases, server failures. Since the number one job of a facility manager is to prevent server failure, hot spots need to be diagnosed and treated early. This post will examine why hot spots exist and why the industry’s current solution for mitigating them may actually make them worse.

[Read more...]

post

Google Data Center [finally] Revealed

Wired published an article yesterday discussing the layout and operation of Google’s data centers…details that have always been closely guarded by the Internet giant.  Read the article here:

http://www.wired.com/wiredenterprise/2012/10/ff-inside-google-data-center/

My initial thought before reading the article was that I would get some details about the super-efficient, state-of-the-art equipment that Google has the luxury of affording. However, the article reveals much more than this and goes into detail about the unique operational solutions that Google deploys to ensure that its customers always receive the best possible experience. I was pleasantly surprised by these new facts, but why should I be? Google doesn’t build data centers to run its servers; it builds data centers to meet the needs of its customers. Here are some highlights that I enjoyed from the article:

[Read more...]

post

The Farmer’s Dilemma

A story about temperature vs. pressure in cooling management

Once upon a time, long, long ago, in a valley just south of the San Francisco Bay, a large truck pulled up to a farmer’s house and dropped off a thousand boxes, each of which was the same size and weight. The farmer was puzzled to find that in every box was a toaster, each with the same set of instructions. The instructions indicated that the farmer must construct a way to ensure that, when these thousand toasters were turned on, they would not overheat and burn up.

Said the instructions, if even one toaster burns up, the farmer will have failed and be damned for eternity. If, however, the farmer succeeds at this challenge, he will be rewarded and praised in all the land. This was quite an odd request, but the farmer took it seriously and consulted with his smartest group of friends. The group comprised three people: a toaster expert who knew the heat output of every type of toaster ever invented, a building engineer who had designed cooling systems for office buildings, and a PhD who just happened to be the farmer’s neighbor. The team deliberated for a few weeks and came up with a brilliant design.

[Read more...]

post

Day 7 – Summary

For the past 6 days, you have seen the same set of temperature data displayed and organized in different ways. Depending on how the data is presented, new and different types of meaning can be derived. Below is a table showing the actionable pieces of information obtained from the data and which visual most effectively provided each insight.

Summary of Insights

All of the information described in the table above was obtained strictly from looking at the data in different ways. Although none of this is field verified, as a data center operator I now have an improved and easily discernible direction on where to begin looking for potential issues. And, as changes occur to fix these issues, operators can track the effects using the same visuals they used to identify them. Site-specific patterns can be recognized and accounted for. Operators will be able to develop a deeper understanding of their facility using these tools and benefit from them based on their specific needs. If I could come up with 9 useful insights by just looking at this data, imagine what an operator who knows his facility inside and out could discover.

So the next time you have seemingly meaningless data, stop and graph it a few different ways.  I am certain you will obtain something insightful.

Previous Posts

About the Data
Day 1 – From 10,000 Feet
Day 2 – Rack by Rack
Day 3 – Learn to Love Graphs
Day 4 – Wordle
Day 5 – Bubble Charts
Day 6 – Bubble Charts Delta
Today – Summary

post

Day 6 – Bubble Charts Delta

Bubble Chart Showing Rack Top - Rack Bottom

In the bubble chart yesterday, we looked at rack top temperatures over the course of a week. I mentioned that with bubble charts, it’s not easy to compare two parameters such as rack top and rack bottom. However, what you can do is look at the delta temperature (Rack Top Temp – Rack Bottom Temp) for each location. The bubble chart above shows this difference.

The story that this data is telling us is incredible and extremely useful in determining airflow inefficiencies.

Here is some background. This facility has an underfloor air distribution system with a hot aisle/cold aisle layout. Cool air is delivered through perforated tiles to the rack inlet. Generally, the difference in air temperature between the bottom and top of the rack should be around 8°F due to heating as the air rises and the typical mixing that occurs.

Takeaways

  • Racks 6, 8, 9, and 29 have a delta T greater than 10°F. This may indicate areas where lots of mixing is occurring. Most likely, air from the hot aisle is sneaking over the top of or through the racks, affecting temperatures. Blanking panels may help.
  • There are many racks with a negative delta T, meaning that the top of the rack is colder than the bottom of the rack. Potential causes are hot air from the hot aisle infiltrating the cold aisle low to the ground due to missing blanking panels, through holes in the floor, or through the IT equipment itself. The negative delta T may also indicate that IT equipment is installed backwards, exhausting hot air into the cold aisles.
  • Most of the racks have a delta T that changes over the course of the week, as indicated by multiple bubbles on a single vertical line. However, racks 12, 16, 17, and 18 show only one large bubble, meaning that they are operating at exactly the same delta T for the entire time. How could this be?
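The three patterns in the takeaways above (a delta T over 10°F, a negative delta T, and a delta T that never changes) can be checked programmatically. The following is a rough sketch, not the tooling used to produce the chart: the function name `classify_rack`, the rack labels, and the sample readings are all hypothetical, and only the 10°F threshold comes from the text.

```python
from typing import List, Tuple

HIGH_DELTA_F = 10.0  # threshold from the takeaways above


def classify_rack(readings: List[Tuple[float, float]]) -> List[str]:
    """Flag delta T patterns for one rack.

    readings: (top_temp_f, bottom_temp_f) pairs sampled over the period.
    """
    # Delta T = rack top temperature minus rack bottom temperature.
    deltas = [round(top - bottom, 1) for top, bottom in readings]
    flags = []
    if any(d > HIGH_DELTA_F for d in deltas):
        flags.append("high delta T")
    if any(d < 0 for d in deltas):
        flags.append("negative delta T")
    if len(deltas) > 1 and len(set(deltas)) == 1:
        flags.append("constant delta T")
    return flags


# Hypothetical readings, for illustration only.
sample = {
    "rack A": [(78.0, 66.0), (76.0, 67.0)],                # delta T peaks at 12°F
    "rack B": [(72.0, 66.0), (72.0, 66.0), (72.0, 66.0)],  # delta T never changes
    "rack C": [(64.0, 68.0), (66.0, 67.0)],                # top colder than bottom
}

for rack, readings in sample.items():
    print(rack, classify_rack(readings))
```

A rack flagged with a constant delta T is worth a closer look, since real airflow rarely holds perfectly steady for a week.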

Issues

  • The real value of looking at delta T in a bubble graph is to identify airflow issues. This graph would most likely be used when hot or cold spots are identified, to determine the root cause.

Previous Posts
About the Data
Day 1 – From 10,000 Feet
Day 2 – Rack by Rack
Day 3 – Learn to Love Graphs
Day 4 – Wordle
Day 5 – Bubble Charts
Today – Bubble Charts Delta
Tomorrow – A Day of Rest

post

Day 5 – Bubble Charts

Rack Top Temp Bubble Chart

Bubble chart visualization takes accuracy to the next level because it can show three variables at one time. In this case the parameters are rack, top temperature, and time. Each vertical line is a specific rack. The graph above displays the top temperatures. The bubbles on each vertical line represent the temperature of that rack during the time period (the temperatures are rounded to the closest 2°F increment). The size of the bubble indicates how long the rack was at that temperature. For example: the top temperatures at rack 6 were roughly 68°F, 70°F, 72°F, 74°F, and 76°F. The larger bubbles at 72°F and 74°F indicate that the rack operates at those temperatures the majority of the time.
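The binning behind each column of bubbles can be sketched as follows. This is an assumption about how such a chart could be prepared, not the tool used for the chart above: the function name `bubble_sizes` and the sample readings are hypothetical, and samples are assumed to be evenly spaced in time so that a count stands in for duration.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def bubble_sizes(temps_f: Iterable[float], step: int = 2) -> Dict[int, int]:
    """Round each reading to the nearest `step` °F and count readings per bin.

    With evenly spaced samples, the count per bin is proportional to the
    time spent at that temperature, which is what the bubble size encodes.
    """
    bins = Counter(step * math.floor(t / step + 0.5) for t in temps_f)
    return dict(sorted(bins.items()))


# Hypothetical rack top readings in °F, for illustration only.
readings = [67.6, 70.4, 71.8, 72.3, 73.1, 74.2, 73.9, 75.6]

print(bubble_sizes(readings))  # {68: 1, 70: 1, 72: 2, 74: 3, 76: 1}
```

Each key is a bubble position on the rack's vertical line and each value is its relative size.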

Takeaways

  • It is instantly obvious that racks 12, 17, and 20 have persistent hot spots.
  • There appear to be multiple racks that share the same temperature profile. Racks 1 – 4 are all operating in the same temperature range. Racks 5 – 10 are also similar.
  • Rack 15 seems to be an outlier operating cooler than its surrounding sensors.  Possibly an airflow issue?
  • Racks 31 – 34 are the coldest of the bunch. There may be an excessive number of floor tiles in this region.

Issues

  • Bubble charts can only show one measured parameter at a time. In this case, it represents rack top temperatures. Bubble charts are not good for comparing rack top and rack bottom temperatures.
  • Also, bubble charts should be reviewed over a specific time duration. The chart above displays data over the course of a week. You could certainly look at data over the course of a month or longer, but it becomes more difficult if you would like to compare two time periods, such as one week versus another week.

Previous Posts
About the Data
Day 1 – From 10,000 Feet
Day 2 – Rack by Rack
Day 3 – Learn to Love Graphs
Day 4 – Wordle
Today – Bubble Charts
Tomorrow – Bubble Charts Delta