rmoff

October 27, 2010

Does this summarise your system development & support ethos?

Filed under: support, thoughts — rmoff @ 08:57

I heard this on Thinking Allowed and thought how applicable it was to the attitudes that you can sometimes encounter in both systems development, and the support of production systems:

“Each uneventful day that passes reinforces a steadily growing false sense of confidence that everything is all right – that I, we, my group must be OK because the way we did things today resulted in no adverse consequences.”

by Scott Snook (Senior Lecturer in the Organizational Behavior unit at Harvard Business School)

October 18, 2010

When is a bug not a bug? When it’s a “design decision”

Filed under: informatica, obia, rant, support — rmoff @ 13:02

Last month I wrote about a problem that Informatica as part of OBIA was causing us, wherein an expired database account would bring Oracle down by virtue of multiple connections from Informatica.

I raised an SR with Oracle (under OBIA support) who, after some back-and-forth with Informatica, were told:

This is not a bug. That the two error messages coming back from Oracle are handled differently is the result of a design decision and as such not a product fault.

Is “design decision” the new “undocumented feature” ?

September 22, 2010

Better safe than sorry…sanitising DB input

Filed under: metalink, silly, support — rmoff @ 13:41

As Twitter learnt yesterday, you should always sanitise user input. I was amused to see My Oracle Support doing so….recursively 🙂

The apostrophe in “doesn’t” got escaped once, and then again, and then again, and then again, and then again ……
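I don't know how My Oracle Support actually escapes its input, so the following is only a sketch of the general failure mode, assuming HTML-entity escaping applied more than once to the same text. Each extra pass re-escapes the ampersand produced by the previous one, so the apostrophe grows a longer and longer tail.

```python
import html


def escape_repeatedly(text: str, passes: int) -> str:
    """Apply HTML escaping `passes` times, as a buggy pipeline might.

    Illustrative only: the real site's escaping scheme is unknown.
    """
    for _ in range(passes):
        text = html.escape(text, quote=True)
    return text


if __name__ == "__main__":
    # Show how the same word changes with each additional escaping pass.
    for n in range(4):
        print(n, escape_repeatedly("doesn't", n))
```

The fix for this class of bug is to escape exactly once, at the point of output, and to store the raw text.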

March 5, 2010

Who’s been at the cookie jar? EBS-BI authentication and Load Balancers

Filed under: cluster, load balancing, obiee, sawserver, support — rmoff @ 10:44

We hit a very interesting problem in our Production environment recently. We’d made no changes to the configuration for a long time, but all of a sudden users were on the phone complaining. They could log in to BI from EBS, but after logging in, the next link they clicked took them to the OBIEE “You are not logged in” screen.

Our users log in to EBS R12 and then, using EBS authentication, log in to OBIEE (10.1.3.4). Our OBIEE is deployed on OAS, load balanced across two servers by an F5 BIG-IP hardware load balancer.

In the OBIEE NQServer.log we started to see a lot of these errors around the time users started complaining:

[nQSError: 13011] Query for Initialization Block 'EBS Security Context' has failed.
[nQSError: 23006] The session variable, NQ_SESSION.ACF, has no value definition.

The EBS/BI authentication configuration was not done by me, and understanding the theory of it was one of the things on my to-do list but, as is the way, I had never quite got around to it. Here was a good reason to learn very quickly! This posting by Gerard Braat is fantastic and brought me up to speed quickly. There’s also a doc on My Oracle Support, 552735.1, and some more info from Gareth Roberts on the OTN forum here.

We stopped Presentation Services on one of the servers, and suddenly users could use the system again. If we reversed the stopped/started servers, users could use the system. With one Presentation Services server running, the system was fine. With both up, users got “You are not logged in”. What did this demonstrate? That on their own, there was nothing wrong with our Presentation Services instances.

We soon suspected the load balancer. The load balancer sets a cookie on each user’s web browser at the initial connection as they connect to BI. The cookie is used in each subsequent connection to define which application server the user should be routed to. This is because Presentation Services cannot maintain state across instances and so the user must always come through to the same application server that they initially connected to (and therefore authenticated on).

What had happened was that the Load Balancer was issuing cookies with an expiry date already in the past (the clock was set incorrectly on it *facepalm*). This meant that the initial connection from EBS to BI was successful, because authentication was done as expected. But the next time the client came back to the BI server for a new or updated report, the cookie holding the BI app server affinity was invalid (it had already expired), so the Load Balancer sent them to any BI app server. If it wasn’t the one that they had authenticated against then BI tried to authenticate them again, but they didn’t have the acf URL string (which comes through in the initial EBS click-through to BI). Hence the “The session variable, NQ_SESSION.ACF, has no value definition.” error in the NQServer.log and the “You are not logged in” error shown to the user.
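The failure is easy to reproduce in a toy model. The sketch below is not how the F5 works internally: the server names, the one-hour cookie lifetime and the round-robin fallback are all assumptions made for illustration. It shows only the mechanism: an affinity cookie stamped by a clock that is running behind is already expired when the browser next looks at it, so the browser never sends it back and stickiness is lost.

```python
from datetime import datetime, timedelta, timezone
from itertools import cycle

SERVERS = ["bi-app-1", "bi-app-2"]  # hypothetical app server names


class LoadBalancer:
    """Toy load balancer that sets an affinity cookie using its own clock."""

    def __init__(self, clock_skew: timedelta):
        self.clock_skew = clock_skew          # how far the LB clock is off
        self._round_robin = cycle(SERVERS)    # assumed fallback policy

    def route(self, cookie, real_now):
        """Return (server, cookie_to_set); honour an existing cookie."""
        if cookie is not None:
            return cookie["server"], None
        server = next(self._round_robin)
        lb_now = real_now + self.clock_skew
        # Assumed one-hour lifetime, calculated from the LB's (wrong) clock.
        return server, {"server": server, "expires": lb_now + timedelta(hours=1)}


class Browser:
    """Toy browser that discards expired cookies before each request."""

    def __init__(self):
        self.cookie = None

    def request(self, lb, real_now):
        if self.cookie is not None and self.cookie["expires"] <= real_now:
            self.cookie = None  # expired, so it is never sent back
        server, new_cookie = lb.route(self.cookie, real_now)
        if new_cookie is not None:
            self.cookie = new_cookie
        return server


def servers_visited(clock_skew: timedelta, requests: int = 4):
    """List which server handles each of a user's successive requests."""
    lb, browser = LoadBalancer(clock_skew), Browser()
    start = datetime(2010, 3, 5, 10, 0, tzinfo=timezone.utc)
    return [browser.request(lb, start + timedelta(minutes=i)) for i in range(requests)]


if __name__ == "__main__":
    print("correct clock:", servers_visited(timedelta(0)))
    print("clock a day behind:", servers_visited(timedelta(days=-1)))
```

With a correct clock every request lands on the same server. With the clock a day behind, the user bounces between servers, which is exactly when Presentation Services asks them to authenticate again.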

As soon as the date was fixed on the load balancer, cookies were served properly. We brought up both Presentation Services instances, and everything worked again. Phew.

Footnote: I cannot recommend this tool highly enough: Fiddler2. It makes tracing HTTP traffic, request headers, cookies, etc, a piece of cake (cookie?).

January 26, 2010

Identify your OBIEE users by setting Client ID in Oracle connection

Filed under: obiee, oracle, support — rmoff @ 10:37

You get a call from your friendly DBA. He says the production database is up the spout, and it’s “that bee eye thingumy causing it”. What do you do now? All you’ve got to go on is a program name in the Oracle session tables of “nqsserver@MYSERVER (TNS V1-V3)” and the SQL the DBA sent you that if you’re lucky will look as presentable as this:

The username against the SQL is the generic User ID that you had created for connections to the database from OBIEE.

So you turn to Usage Tracking and discover that when that particular SQL ran there were twenty users all running reports. And not only that, but the SQL that’s recorded is the Logical SQL, not the physical SQL.

So how do you identify the originating report that spawned the SQL that broke the database that upset the DBA that phoned you? …

With a large hat-tip to Mark Rittman, here’s one thing you can do to help matters. Within the Connection Pool object in the RPD you can add statements to execute at the beginning of each connection. In this case, we can set the Client ID for the user running the request.

call dbms_session.set_identifier('VALUEOF(NQ_SESSION.USER)')



Now when you look at the queries from OBIEE running on the database you’ll see the Client ID column is populated :


This helps you trace SQL from the database back to the originating user.
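If you would rather check this from a script than from a GUI, the value set by dbms_session.set_identifier shows up in the CLIENT_IDENTIFIER column of V$SESSION. The sketch below holds a query you could run through whatever Oracle client you use, plus a small helper to group the results. The helper and the sample rows are hypothetical, and the program filter assumes the "nqsserver@..." program name mentioned above.

```python
from collections import defaultdict

# CLIENT_IDENTIFIER is the V$SESSION column populated by
# dbms_session.set_identifier. The LIKE filter assumes the BI Server
# sessions report a program name starting with "nqsserver".
SESSION_QUERY = """
SELECT sid, username, program, client_identifier
FROM   v$session
WHERE  program LIKE 'nqsserver%'
"""


def sessions_by_client_id(rows):
    """Group session IDs by Client ID.

    `rows` is an iterable of (sid, username, program, client_identifier)
    tuples, in the column order of SESSION_QUERY.
    """
    grouped = defaultdict(list)
    for sid, _username, _program, client_identifier in rows:
        grouped[client_identifier or "(not set)"].append(sid)
    return dict(grouped)


if __name__ == "__main__":
    # Made-up rows standing in for the result of running SESSION_QUERY.
    sample_rows = [
        (101, "BI_USER", "nqsserver@MYSERVER (TNS V1-V3)", "jsmith"),
        (102, "BI_USER", "nqsserver@MYSERVER (TNS V1-V3)", "jsmith"),
        (103, "BI_USER", "nqsserver@MYSERVER (TNS V1-V3)", None),
    ]
    print(sessions_by_client_id(sample_rows))
```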

My only question about this is with regards to connection pooling. The documentation states that the Execute on Connect is run “…each time a connection is made to the database.” – but if connection pooling is enabled then by definition the connection is re-used so the client ID will only be set for the first user into the connection pool. However this doesn’t seem to be the case as on the database I see different Client IDs against the same session.
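To make the two possibilities concrete, here is a toy model of a single pooled connection. It says nothing about how the BI Server is really implemented. It only shows what you would expect to see on the database under each hypothesis: the statement runs once when the physical connection is opened, or it runs every time the connection is handed to a request.

```python
class PooledConnection:
    """Stands in for one physical database session."""

    def __init__(self):
        self.client_id = None  # what the database would show as Client ID


class Pool:
    """Toy pool with one connection and a configurable 'execute on connect'."""

    def __init__(self, run_on_every_checkout: bool):
        self.run_on_every_checkout = run_on_every_checkout
        self._conn = None

    def checkout(self, user: str) -> PooledConnection:
        if self._conn is None:
            # Physical connect: the statement runs under either hypothesis.
            self._conn = PooledConnection()
            self._conn.client_id = user
        elif self.run_on_every_checkout:
            # Re-use: the statement runs again only under the second hypothesis.
            self._conn.client_id = user
        return self._conn


def observed_client_ids(run_on_every_checkout: bool, users):
    """Return the Client ID seen on the one session as each user runs a query."""
    pool = Pool(run_on_every_checkout)
    return [pool.checkout(user).client_id for user in users]


if __name__ == "__main__":
    users = ["alice", "bob", "carol"]  # hypothetical OBIEE users
    print("once per physical connection:", observed_client_ids(False, users))
    print("on every checkout:", observed_client_ids(True, users))
```

Seeing different Client IDs against the same database session is what the second model predicts, which suggests (but does not prove) that the statement is executed whenever a pooled connection is picked up for a request.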

November 11, 2009

#Fail: My Oracle Support

Filed under: oracle, rant, support — rmoff @ 15:45

Metalink was retired this weekend and made way for the new My Oracle Support system. It didn’t go as smoothly as it could have done.

This post is going to be a bit of a rambling rant.

Ultimately people, including me, don’t like their cheese being moved (not unless there’s a really runny piece of Camembert at the end of it). That makes it a bit more difficult to discuss because some of people’s complaints will just be geeks being stubborn (and boy, can geeks be stubborn). Arguments descend into minutiae of detail and flash vs DHTML – whilst the bigger picture gets lost.

People especially don’t like their cheese being moved (okay, okay, enough of the cheese) when it means change to systems that they depend on to do their job. If it were the migration of a blogging website or somesuch then it’d be a bummer, there’d be grumbling about it, but ultimately people would probably be quite sanguine about it. When it comes to a support website though, it has to be available.

If this were a system that we delivered to our users then we’d (hopefully) get laughed out the building and/or strung up. It stinks, and there’s no denying it. Maybe once upon a time the concept was a good one, but somewhere along the line looks overtook functionality and someone in charge forgot that this wasn’t a beauty contest but a support website relied on by many many people for doing their jobs. Some of the new functionality (and it is there) in MOS is quite neat — but I only discover it by accident because most of the time I’m waiting for the s#dding thing to load or respond to a mouse click.

I can understand a marketing agency designing some krazy kool website to sell junk food to kidz using lots of flash and clever code, and the benefit (whizzy effects impressing target audience) outweighs the disadvantage (lower spec’d PCs can’t display it properly or at reasonable speed). But a support website? C’mon! It’s a support website! It should work in Lynx (maybe not quite). It was apparently tested on a 2 gig / 3Ghz PC – I’d suggest that’s hardly standard fare yet.

I want to go to the Oracle support website and get support. I shouldn’t have to attend training or webinars to use a website. If I do, then the website’s badly designed. Seriously. And enough with the rambling waffly emails already. I get enough emails every day that any communication about Metalink/MOS needs to be clear and concise. It doesn’t need BS in it about “Leveraging the personalized, proactive, and collaborative support capabilities […] reduce the time you spend maintaining Oracle solutions” (literal quotation).

As an OBIEE user I’ve already been using My Oracle Support after Metalink3 was discontinued a few months ago. After that migration I raised several non-technical SRs reporting various problems, and almost always got a response with the implication that I was doing something wrong or needed helping, rather than the impression that I’d reported a bug which needed fixing.
Somehow, and I’d have hoped this would come from within the organisation, bugs reported by customers need to go straight to Dev, rather than the customer fobbed off. And I was fobbed off without a doubt. Next time I shall not bother reporting problems because it’s not worth the time I spend on it.

Sr. Customer Support Manager Chris Warticki at Oracle has blogged about the cutover:

There’s another blog from Support here.

OUG survey

OUG are running a survey until 19th Nov:

Last weekend, Oracle closed the current Metalink service and migrated the users to My Oracle Support.
UKOUG has had reports from its membership and from across EMEA of a number of problems in this migration.
In order to enter into dialogue with Oracle on this, we would appreciate it if you could complete the following very short survey.

You can find the survey here

Footnote – non-flash My Oracle Support

There is a non-flash version of My Oracle Support at http://supporthtml.oracle.com. However from where I am I can’t login directly (see errors below)

You might be able to get in indirectly on this link.

Clicking the Home link when going in on this link, or trying to log in from http://supporthtml.oracle.com, gives a 500 Internal Server Error in IE and “Recursive error in error-page calling for /secure/error.jspx, see the application log for details.” in Firefox.

Looking at http://supporthtml.oracle.com and having used the flash version for a while now the non-flash version looks pretty similar. More effort’s gone into its appearance than I’d expect for a site that’s been knocked out in HTML as a purely-functional alternative to the main flash site.
It’s evidently not fully functional yet but I wonder if someone’s taken the wise idea to do the rewrite in non-flash and will ditch the flash version at some point in the future?

Follow up

It looks like things are stabilising a bit, although I still get inconsistent results when using supporthtml.oracle.com.

Some more blogs about the problems:

September 4, 2009

Metalink 3 followup

Filed under: metalink — rmoff @ 07:47

Kudos to the My Oracle Support blog for taking the time to respond to my comment about searching for Metalink 3 SRs throwing an error.

In essence, if you previously used Metalink 3 you must use https://support.oracle.com. If you use https://metalink.oracle.com/ then you’ll hit the problems I did.

ML3 SR’s are not supported on https://metalink.oracle.com/CSP/ui/index.html. This front end is used for support on legacy server technology, middleware including BEA, and EBusinessSuite

Full details here

August 24, 2009

OBIEE error/message code reference

Filed under: obiee, oracle, support — rmoff @ 06:44

For some reason Oracle haven’t put out a 10.x version of the error & message codes reference guide for OBIEE, but the previous version for Siebel Analytics is still useful:

August 20, 2009

Do you mean (pt II)

Filed under: metalink, silly, support — rmoff @ 08:52

A follow up to my previous post about Metalink’s “Do you mean” feature, this one made me laugh:
[Screenshot: Metalink “Do you mean” suggestion]

I shall miss this kind of thing when Metalink3 merges into My Oracle Support….

Meep meep!

[Image: Road Runner]

July 24, 2009

Metalink 3 – Do You Mean … ?

Filed under: metalink, silly, support — rmoff @ 08:42

One of my little gripes with Metalink is its purporting to be helpful when it’s blatantly not. Here’s one:

Now which is more likely, on Metalink 3; that I’m searching for sawserver (integral component to OBIEE), or sqlserver?!
