Andrew, I have a couple of questions about your problem, in the hope of helping. The first thing I'm curious about is your exact configuration.
Are you using async I/O? Have you tried disabling it? The BUG you are getting makes me suspicious of it. Depending on your environment you may see a minor performance hit, but in most cases it's not huge, and a slightly slower database that stays up would seem better than a crashed one. If disabling it fixed the crashes, that would also give a clue as to the problem.
What filesystem?
I suspect your RAID card is some PERC/4 variant. Are you using megaraid or megaraid2? What version of this driver?
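If it saves some typing, here is a rough Python sketch for collecting the filesystem and driver answers in one go. It assumes the usual /proc/mounts layout (device, mount point, filesystem type as the first three fields) and that the module name is the first field of each /proc/modules line; the function names are just mine. It only shows which megaraid module is loaded, not its version, so that still has to be looked up separately.

```python
def parse_mounts(text):
    """Return (device, mount_point, fs_type) for each line of /proc/mounts-style text."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[0], fields[1], fields[2]))
    return entries


def loaded_modules(text, prefix="megaraid"):
    """Return names of modules in /proc/modules-style text that start with prefix."""
    names = []
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith(prefix):
            names.append(fields[0])
    return names


if __name__ == "__main__":
    # Assumption: both files are readable on the affected box.
    with open("/proc/mounts") as f:
        for device, mount_point, fs_type in parse_mounts(f.read()):
            print(device, mount_point, fs_type)
    with open("/proc/modules") as f:
        print("RAID driver modules:", loaded_modules(f.read()))
```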
An interesting note: Red Hat has a whitepaper on their "crash" utility that includes a case study of debugging a kernel issue that causes a "kernel BUG at pipe.c:120!". The case study doesn't say what the final issue was, but it might be worth an email to the author to find out, and to ask whether the fix has made it into the standard kernel or the SuSE kernel yet.
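Since the pipe.c:120 messages seem to show up well before the hang, it might help to pull them out of the logs and line them up against the crash times. A minimal sketch in Python, assuming the messages are logged in the "kernel BUG at file:line!" form quoted above; the default log path /var/log/messages is my guess for your setup, and the function name is only for illustration.

```python
import re
import sys

# Matches messages such as "kernel BUG at pipe.c:120!"
BUG_PATTERN = re.compile(r"kernel BUG at (\S+):(\d+)!")


def find_kernel_bugs(lines):
    """Return (line_number, source_file, source_line, text) for each BUG message."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = BUG_PATTERN.search(line)
        if match:
            hits.append((number, match.group(1), int(match.group(2)), line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Assumption: syslog goes to /var/log/messages unless a path is given.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/messages"
    with open(path, errors="replace") as log:
        for number, source_file, source_line, text in find_kernel_bugs(log):
            print("%d: %s" % (number, text))
```

The timestamps at the start of each matching line can then be compared by hand with the time of the next hang.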
Have you tried using a custom kernel compiled from stock sources? I know it wouldn't be certified but hey, the certified kernel is giving you a crash every ~72 hours.
You've probably answered these before on the list. I'll look back over the archives, but these were just the things that popped into my head.
Later, Tom
-----Original Message-----
From: McAllister, Andrew [mailto:McAllisterA@(protected)]
Sent: Tue 03/02/2004 4:02 PM
To: Thomas.Fragstein@(protected); suse-oracle@(protected)
Cc:
Subject: RE: [suse-oracle] [OT] Proliant DL740 hangs with sles8 (k_smp 2.4.21-190)

I'm sorry, I don't understand.
Regarding swap: we have 8 gig of swap in 4x2 gig partitions. But it doesn't really matter; our swap use has never gone above 300 megabytes. We had 8 gig of RAM, but removed 4 gig trying to eliminate potential problems. Still, swap use is never above 300 megabytes.
In /etc/sysconfig/oracle:
VM_MAPPED_RATIO=1000
AIO_MAX_SIZE=262144
Andy
> -----Original Message-----
> From: Thomas.Fragstein@(protected)
> [mailto:Thomas.Fragstein@(protected)]
> Sent: Tuesday, March 02, 2004 2:56 PM
> To: McAllister, Andrew; dragos.delcea@(protected); suse-oracle@(protected)
> Subject: AW: [suse-oracle] [OT] Proliant DL740 hangs with
> sles8 (k_smp 2.4.21-190)
>
>
> Please test, when the system has an high load on the kswap
> deamon the can
> you help an new kernel!
>
> I has the problem solved on this way
>
> Sorry for my broken English
>
> Bye
> Thomas
>
> -----Original Message-----
> From: McAllister, Andrew [mailto:McAllisterA@(protected)]
> Sent: Tuesday, March 2, 2004 17:19
> To: Dragos Delcea; suse-oracle@(protected)
> Subject: RE: [suse-oracle] [OT] Proliant DL740 hangs with sles8 (k_smp
> 2.4.21-190)
>
> Welcome to the club.
>
> Our Dell 6650's running Oracle 9.2.0.3 crash every 72-96 hours. Same
> symptom as you, hard hang no error logs, not even to the
> serial port. We
> also see pipe.c:120 bugs in the logs about 12 hours before a crash.
> Also, some shell scripts hang about 24 hours before a crash.
> snip