rmoff

October 27, 2010

Two excellent OBI presentations from Jeff McQuigg

Filed under: obiee, performance — rmoff @ 08:59

Jeff McQuigg has posted two presentations that he gave at Openworld 2010 on his website here: http://greatobi.wordpress.com/2010/10/26/oow-presos/

They’re full of real content and well worth a read. There’s an excellent level of detail and plenty to think about if you’re involved in OBI or DW development projects.

Does this summarise your system development & support ethos?

Filed under: support, thoughts — rmoff @ 08:57

I heard this on Thinking Allowed and thought how applicable it was to the attitudes that you can sometimes encounter both in systems development and in the support of production systems:

“Each uneventful day that passes reinforces a steadily growing false sense of confidence that everything is all right – that I, we, my group must be OK because the way we did things today resulted in no adverse consequences.”

by Scott Snook (Senior Lecturer in the Organizational Behavior unit at Harvard Business School)

October 26, 2010

Graphing I/O data using gnuplot and Oracle V$SYSSTAT

Filed under: io, oracle, unix — rmoff @ 15:42

Continuing in the beard-scratching theme of Unix-related posts (previously – awk), here’s a way to graph out the I/O profile of your Oracle database via the Oracle metrics in gv$sysstat, and gnuplot. This is only the system I/O as observed by Oracle, so for belts & braces (or to placate a cynical sysadmin ;-)) you may want to cross-reference it with something like sar.
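
For that cross-check, something along these lines (from the sysstat package on Linux; flags and units vary between platforms, so check your man page) gives the OS-level view:

# Block I/O as the OS sees it: 5-second samples, 12 of them
# (on Linux, bread/s and bwrtn/s are in 512-byte blocks)
sar -b 5 12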

First, a pretty picture of what you can get:

Example gnuplot output of Oracle I/O

Why would you want to do this when you’ve got a pretty GUI in EM/Grid and flash graphs? Because the data in EM is averaged out over a relatively large sample (one minute for recent history, and whatever your AWR sample rate is for older history) and is thus lower than the point-in-time I/O being driven through your server. My previous post (The danger of averages) illustrates this. For identifying bottlenecks or for capacity planning, you need to know how much I/O throughput you really use.

Consider this rough example. Using V$SYSMETRIC_HISTORY you can get an average over the last minute. For the first 30 seconds of this sampled minute you ran a query consuming I/O at 100 MB/s; for the last 30 seconds there was no system activity. V$SYSMETRIC_HISTORY shows a figure of 50 MB/s, as this is the average over the minute. You look at your system’s I/O profile in EM, see 50 MB/s, and your hardware guys tell you that the system has a capacity of 100 MB/s throughput, so you think you’re well in the clear for capacity – when in reality you’re already hitting it.

Now, what happens when two instances of this same query – each requiring 100 MB/s – run concurrently? They will take twice as long, because they have to share the available throughput and thus get ~50 MB/s each. It might be that this is acceptable, and that the hardware cost of increasing I/O throughput capacity isn’t justifiable just to make the queries run faster. But it’s important to be aware of the bottlenecks in a system so that they can be mitigated and considered in any capacity planning.
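
To see where that minute-averaged figure comes from, you can query the view directly. A hedged example – the metric name and the 60-second interval filter are from memory, so verify them against your version:

# Minute-averaged read throughput, as EM would chart it
sqlplus -s "/ as sysdba" <<'SQL'
select begin_time,
       round(value/1048576,1) as read_mb_per_sec
  from v$sysmetric_history
 where metric_name = 'Physical Read Total Bytes Per Sec'
   and intsize_csec = 6000  -- just the 60-second samples
 order by begin_time;
SQL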

So, anyway, back to the point of this post:

Kevin Closson’s written a neat script here which will write out IO metrics from gv$sysstat to a flat file on the Oracle host. It looks like this:

2010-10-26-09:09:58|1|1|0|
2010-10-26-09:10:03|0|0|0|
2010-10-26-09:10:08|51|51|0|
2010-10-26-09:10:13|87|87|0|
2010-10-26-09:10:19|108|108|0|
2010-10-26-09:10:24|118|118|0|
2010-10-26-09:10:29|116|117|0|
2010-10-26-09:10:34|451|454|0|
2010-10-26-09:10:39|692|694|0|
2010-10-26-09:10:44|894|895|2|
2010-10-26-09:10:49|875|879|1|
2010-10-26-09:10:54|990|990|2|
2010-10-26-09:10:59|922|920|1|
2010-10-26-09:11:04|768|765|2|
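
The linked script is the one to use, but if you just want the shape of the idea, here’s a rough sketch of my own – not Kevin’s code, so treat the statistic names and column layout as assumptions. Note that it logs two rate columns (read and write) where the sample above has three, so you’d plot column 2 rather than column 3 in the gnuplot step below:

#!/bin/bash
# Hedged sketch of a gv$sysstat I/O sampler - NOT the original script.
# Samples cumulative byte counters every 5 seconds and logs the deltas
# as MB/s. Statistic names and column meanings are assumptions.
SID=${ORACLE_SID:-DBINSTANCE}
LOG=/tmp/io.${SID}.log
INTERVAL=5
prev_rd=""
while true; do
  # One round trip: timestamp, cumulative read bytes, cumulative write bytes
  sample=$(sqlplus -s "/ as sysdba" <<'SQL'
set heading off feedback off pagesize 0
select to_char(sysdate,'YYYY-MM-DD-HH24:MI:SS') || ' ' ||
       sum(decode(name,'physical read total bytes',value,0)) || ' ' ||
       sum(decode(name,'physical write total bytes',value,0))
  from gv$sysstat
 where name in ('physical read total bytes','physical write total bytes');
SQL
)
  set -- $sample
  ts=$1; rd=$2; wr=$3
  if [ -n "$prev_rd" ]; then
    # Byte deltas over the interval, converted to MB/s
    echo "${ts}|$(( (rd-prev_rd)/1048576/INTERVAL ))|$(( (wr-prev_wr)/1048576/INTERVAL ))|" >> "$LOG"
  fi
  prev_rd=$rd; prev_wr=$wr
  sleep $INTERVAL
done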

I wanted a quick way to visualise the data, and also to plot it once it grew beyond the number of rows that Excel will chart at once (32k, I think). gnuplot was the answer, but it’s a bit of a sod to get running fire-and-forget, so here’s some help.
Here’s how it works:

  • For ease of use it copies the file generated by the above script to your local machine. If you don’t want to do that then remove the scp line.
  • It uses sed to convert bar (|) characters to Space ( ), as gnuplot requires whitespace-separated columns. tr would probably do the job too.
  • It feeds gnuplot an inline here-document, but you could move this to a separate config file if you wanted.
  • It plots the graph on screen (assuming you have configured X), and also writes it to a png file
# Download the file from a remote server
# Prompts for password, or use ssh key authentication to make it seamless
scp user@remotehost:/tmp/io.DBINSTANCE.log .
# Convert bar (|) to Space ( )
sed -e 's/|/ /g' io.DBINSTANCE.log > io.log
# Plot a graph (remove --persist if you don't want the window displayed)
gnuplot --persist <<EOF
set title "Read I/O MB/s\nSampled every 5 seconds"
set xdata time
set timefmt "%Y-%m-%d-%H:%M:%S"
set format x "%d %b\n%H:%M"
set ylabel "MB/s"
# You can set the y range to a specific constant based on your IO capacity
set yrange [0:3000]
set xlabel "Date/Time"
unset key
set grid
plot "io.log" using 1:3 with boxes fs
set terminal png font "courier, 10" size 1200,800
set output "io.png"
replot
EOF

This was written on Cygwin, and should presumably work on any ’nix system.


Ob. OBIEE: An alternative to gnuplot would be to graph the data in OBIEE, using the text file as source data for a hacked-together RPD 🙂

October 19, 2010

awk – split a fixed-width file into separate files named on content

Filed under: obiee — rmoff @ 13:31

This post is more of a Unix thing than DW/BI, but I have a beard so I’m semi-qualified…

The requirement was to improve the performance of some ODI processing which, as part of its work, took one huge input file and split it into chunks based on content in the file. To add some (minor) spice, the file was fixed-width with no delimiters, so the easy awk answers that I found on Google weren’t applicable.

Source file, to be split based on cols 14-16:

0000000010069583000A
0000000010083583000A
0000000011600583000B
0000000011936584000D
0000000010101584000E
0000000010903584000G
0000000010517585000Q

Output files:
prefix.583.dat

0000000010069583000A
0000000010083583000A
0000000011600583000B

prefix.584.dat

0000000011936584000D
0000000010101584000E
0000000010903584000G

prefix.585.dat

0000000010517585000Q

So without further ado, my little command-line gem:

awk '{fn=substr($0,14,3);print > ("prefix." fn ".dat")}' file.dat

Where:

  • the output filename is built from the 3 characters starting at column 14 of the input record
  • file.dat is the input filename

For extra brownie points, incorporate components of the input filename in the output filenames:

awk 'BEGIN {dt=substr(FILENAME,6,8)} {fn=substr($0,14,3);print > ("prefix." fn "." dt ".dat")}' file.20101018.dat
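
A caveat of my own to add: awk keeps every output file open, so with many distinct key values you can hit the per-process open-file limit. This variant closes each file after writing (and appends on reopen), trading a little speed for safety:

awk '{fn="prefix." substr($0,14,3) ".dat"; print >> fn; close(fn)}' file.dat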

As a sidenote, this is an example of choosing the right tool for the job – when there are simple command-line tools that can be scripted like this, using a heavyweight tool like ODI is overkill and can indeed land you with performance problems.

[edit]
Ted Dziuba has written a great article here on how unix utilities can often be the right tool for the job: Taco Bell Programming.
[/edit]

October 18, 2010

When is a bug not a bug? When it’s a “design decision”

Filed under: informatica, obia, rant, support — rmoff @ 13:02

Last month I wrote about a problem that Informatica, as part of OBIA, was causing us, wherein an expired database account would bring Oracle down by virtue of multiple connections from Informatica.

I raised an SR with Oracle (under OBIA support), who, after some back-and-forth with Informatica, were told:

This is not a bug. That the two error messages coming back from Oracle are handled differently is the result of a design decision and as such not a product fault.

Is “design decision” the new “undocumented feature”?
