March 11, 2011

Getting good quality I/O throughput data

Filed under: AWR, io, oracle, performance, visualisation — rmoff @ 11:33

This post expands on one I made last year here about sampling frequency (of I/O throughput, but it’s a generic concept).
The background to this is my analysis of the performance and capacity of our data warehouse on Oracle 11g.

Before I get too boring, here’s the fun bit:

Pork Pies per Hour (PP/h)

Jim wants to enter a championship pork-pie eating competition. He’s timed himself practising and over the course of an hour he eats four pork pies. So we might say that his Pork Pies per Hour (PP/h) rate is 4.

The competition lasts for thirty minutes. The world champion can eat eight pork pies in thirty minutes. Does Jim stand a chance?
(let’s pretend he has an insatiable appetite and isn’t going to get full, and all other smart-ass factors)

If his consumption rate was 4 PP/h, and he only has half an hour, then we would predict he’ll consume 4 * 0.5 = 2 pork pies. So Jim’s going to get his ass beat.

Or is he?

What if the rate of 4 PP/h masks the fact that the hour broke down as follows:

  • First 15 minutes: he ate one pork pie
  • 15 – 30 minutes: had a beer
  • 30 – 45 minutes: gobbled down three pork pies
  • 45 – 60 minutes: he had another beer and gently sweated pork fumes?

If that were the case and we had sampled every fifteen minutes, we’d see this:

So what we want to know, which is the maximum rate at which he can consume pork pies, is exposed only when we sample at an appropriate frequency.

Mega bites to Mega bytes

Enough of pork pies, and back to the tasty subject of I/O throughput. The point I am trying to make is that without an appropriate sample size the data that we have becomes less useful.

You can always process your data further to derive conclusions from it over a longer term. Five second samples are going to be fairly unintelligible if considered unprocessed over the timespan of a year.

But what you can’t do is add back in the detail that you lost by sampling with too great a frequency. Once that moment’s passed, it’s gone.

By the nature of metric which is a rate at which something happens, the sample is going to be an average over the sampling period. The problem with this is that it can mask peaks in the throughput. Peaks may (or may not) be hitting a ceiling in your system which an average figure will make you think you are plenty clear of.

System metrics for Oracle are available through AWR, which typically samples every hour. OS-level tools may sample more frequently, but in the context of capacity planning and analysis, periods are often going to be 10s of minutes, or hourly and upwards.

The following illustrates the effect of averaging I/O throughput figures.

The data is the average I/O throughput, sampled every five seconds (through this method). Note that already this is an average, but in the context of hourly samples (for example) we will have to live with five seconds as the starting point.

This first graph shows the original data, with a five minute average drawn over it. For each, the maximum is noted.
It can be seen that the maximum the I/O throughput hit was a shade over 3GB/s. That was the real maximum that we were driving through the I/O pipe over the period of time (ignoring that it’s a 5-second average). Now look at the maximum of the 5 minute average – we lost c240MB/s in our maximum, which is now 2.7GBs.

In the second graph the original sample is shown, with a 30 minute average. It’s clear to see the effect of averaging the data has – the peaks and troughs are smoothed out, giving a more even line. But is this what we want? Our apparent maximum I/O based on a 30 minute average has now almost halved!
Apparently, we only needed 1.6GB/s of I/O throughput during this time period. The graph clearly shows that this is a false statement. But what about now?

Same graph as before, but without the context of the original sample. Given this data – which is what you’ll have if you collect I/O throughput data that’s sampled every 30 minutes – then how would you know what the maximum throughput during that period was? It is impossible to know!

Pushing this point further, the same 30 minute average, over an extended period:

What’s the maximum throughput that was required during this period? When was the I/O throughput approaching capacity?
You can’t know from the averaged data alone!

Here’s the same extended period, with the original 5 second samples. This is just proving the point, that the 30 minute samples have obliterated the peaks particularly around 04:00 – 06:00.

So what?

To be able to plan for a system’s I/O capacity we need to know more than how much I/O it transferred over a relatively long period of time. We need to know what the biggest demand it put on the system was, otherwise we risk unseen bottlenecks. To make this useful, we also need to understand if these big demands were prolonged peaks or not. Particularly in a DW/BI environment, load is generally going to be sporadic. Sure, we run regular batches at predictable times, and may look to cache reports at fixed times, but they’re exceptions not the rule.

If a system has a limit of 3GB/s, and we hit that 3GB/s for a few seconds, what is the implication of that? We need to transfer the same amount of data, so hitting the limit means that the transfer is going to take longer. But a few seconds here and there may not matter — it all comes down to the context.

A report that runs for five minutes which hits I/O throughput limit for a second or two isn’t as much of a concern as one where the I/O hits the limit for minutes on end. There’s plenty written about system capacity and scalability, and it should be clear that if a system is hitting a capacity limit (whether it’s I/O, CPU, or whatever) for prolonged periods then the overall stability is going to suffer. Maybe that five minute report which spends four minutes on bottlenecked I/O doesn’t bother the user, but what about the report that’s supposed to run in a few seconds which is sat waiting for I/O at the same time?

(Ed: I’m pretty sure that I’m straying into the realm of knees and such, but need (knee’d?) to go off an read some more about it first to be sure)

I love twitter

In writing this, I’ve been struggling with some of the concepts which I can instinctively feel but don’t have the understanding to properly articulate. Particularly, if my assertion is that long sample periods are not a good idea, what is a good sample period? It’s no use being a smart-ass and rubbishing the data we collect if I can’t explain how we should collect it.

So, I turned to twitter. Twitter is awesome. (I think of Chet almost everytime I say this because he was one of the main guys who convinced me it was as good as the fuss made out. Check out his presentation all about it here here).

and back the answers came:

twitter is teh awesome

twitter is teh awesome

Amazingly helpful stuff, and focussed on my specific question. Sure, Google has changed our lives when it comes to finding the answers to questions. But (a) there is a lot of crap written on the internet (Blunder On So, anyone?), and (b) you will often find generally interesting things in the area in which you are interested, but for the more specific uncommon questions it’s unlikely you’ll get a direct hit.

Here, I had industry-leading figures directly answering my specific question! I love twitter. And I really appreciate experts taking the time to share their knowledge, experience and expertise.

What next?

Nothing that I’m writing here is new, but I enjoy writing things down to clarify my thoughts.

Courtesy of the good folk of twitter, I have some great links to follow up and digest.

March 9, 2011

Comparing methods for recording I/O – V$SYSSTAT vs HP Measureware

Filed under: io, oracle, rrdtool, unix — rmoff @ 13:38

I wrote last year about Graphing I/O data using gnuplot and Oracle V$SYSSTAT, using a script from Kevin Closson in his article How To Produce Raw, Spreadsheet-Ready Physical I/O Data With PL/SQL. Good For Exadata, Good For Traditional Storage.
Here I’ve got a simple comparison of the data recorded through this script (in essence, Oracle’s V$SYSSTAT), and directly on the OS through HP’s MeasureWare. It’s graphed out with my new favourite tool, rrdtool:

Happily for me, the data marries up almost exactly. One might say that so it ought. But it’s always good to have explicit proof of such a supposition.

March 8, 2011

OBIEE Systems Management – dodgy counter behaviour

Filed under: bi, bug, jmx, mbeans, obiee, systemsmanagement — rmoff @ 10:59

Over the last few months I’ve been doing a lot of exploring of OBIEE Systems Management data, covered in a mini-series of blog posts, Collecting OBIEE systems management data.

There are a vast number of counters exposed, ranging from the very interesting (Active Sessions, Cache Hits, etc) to the less so (Total Query Piggybacks, although for some seriously hardcore performance tuning even this may be of interest).

This short blog post is about a couple of counters which I’ve been monitoring but which looks to not be entirely reliable. Both are in the Oracle BI DB Connection Pool, and are:

  • Current Connection Count – The current number of open connections in the thread pool.
  • Current Busy Connection Count – The current number of connections assigned to process a query or processing a query in the DB Connection pool.

A picture tells a thousand words in this case:

OBIEE Connection Pool data

We can clearly see :

  • Current Busy Connection Count (“The current number of connections assigned to process a query or processing a query in the DB Connection pool.“) goes negative!
  • Current Connection Count (“The current number of open connections in the thread pool.“) accumulates. Either the counter is buggy, or there really are thousands of open connections in the thread pool which sounds worrying in itself.

These two counters reset and correct themselves only on a BI Server restart, which can be seen by the red vertical lines on the graph.

A snapshot of the current figures (via JConsole) backs up these numbers and puts “Current Connection Count” in context next to the ‘Peak’ and ‘Accumulated’ figures:

OBIEE Systems Management data, viewed through JConsole

March 1, 2011

Shiny new geek toys — rrdtool and screen

Filed under: jmx, unix, visualisation — rmoff @ 20:28

I’ve added two new toys to my geek arsenal today. First is one with which I’ve dabbled before, but struggled to master. The second is a revelation to me and which I discovered courtesy of twitter.


rrdtool is a data collection and graphing tool which I’ve been aware of for a while. I wanted to use it when I wrote about Collecting OBIEE systems management data with JMX, but couldn’t get it to work. I’ll not lie to you – it is a bitch to work with at first. Or put a more polite way, it has a steep learning curve. But when you reach the top of the curve and realise its potential…wow. You’ll soon understand why it is so widely used. I plan to write this up soon, but it let me draw nice graphs like this:

Example output from rrdtool of data collected via jmx


The second discovery is a tool called screen. Innocuously named (and a bugger to search for on google), I’d never heard of it until today when Frits Hoogland mentioned it and the subsequent stream of approving tweets from various geeks whose opinions I respect made me go and read up about it.

It is “a full-screen window manager that multiplexes a physical terminal between several processes, typically interactive shells”.
Why’s it good?
Two things:

  1. In one ssh session, you can have multiple ‘windows’. So you can be running one long process whilst working on another. Or developing some code and compiling it in the other. Or anything else clever you can think of.
  2. As important, you can reattach to sessions you’ve lost connectivity to. Running a long sqlplus job that you’ve had to wrap in a nohup’d shell wrapper? Disconnecting your laptop to take home and log on via VPN? Dialled in on a dodgy line that drops connection? All of these are no problem – you can reattach to your live session any time you want.

There’s a good introduction to screen in this article here.

Create a free website or blog at WordPress.com.