Re: RE: [suse-oracle] [OT] Proliant DL740 hangs with... 2004-03-06 - By zhuchao
Since you have hit so many problems with SUSE, can you try Red Hat? We have been running Red Hat 2.1 in production on two 6650 nodes for more than a year, with no problems so far.
We also ran Red Hat 2.1 on a 6600 with no problems. Oracle was 9.2.0.1 and was later upgraded to 9.2.0.4.
> Today we upgraded to the most recent Dell BIOS A14 and verified that the
> Megaraid BIOS was up to date as well. Then we installed the latest
> megaraid driver from the LSI site, version 2.10.1. Re-ran our stress
> tests, and 450 loads in, we are still getting the Oracle errors. Our next
> step is to remove the 6650 and replace it with a 2650. We've never had an
> error on the stress test 2650.
>
> We've done a lot of research on the Dell sites and elsewhere. There are
> several people having trouble with 6650s, usually with Linux, some with
> Windows. But I suppose the Windows people think a system hang is normal
> (sorry, couldn't resist). We've already been told by Dell that SuSE isn't
> supported. But my company just bought over 60 servers from Dell, so I'm
> going to bump this up the Dell food chain.
>
> Andy
>
> ________________________________
>
> From: Miquel Colom [mailto:m.colom@(protected)]
> Sent: Fri 3/5/2004 3:56 AM
> To: McAllister, Andrew
> Cc: Michael Hasenstein; suse-oracle@(protected); Sightler, Tom
> Subject: RE: [suse-oracle] [OT] Proliant DL740 hangs with sles8 (k_smp 2.4.21-190)
>
> Hello Andrew
>
> Your assumptions seem correct to me:
>
> 1-Megaraid driver issue.
> 2-MotherBoard issue.
>
> Have you tried asking or researching at the Dell poweredge list in
> lists.us.dell.com?
>
> Best regards
>
> Miquel Colom Piz?
> Technical Area Director
> Systems Dept., Hotelbeds S.L.
>
> c-Joan Muntaner i Bordoy s/n Bjs.
> 07006 - Palma de Mallorca
> Telf. +34 971178839
> Fax. +34 971465062
>
> "McAllister, Andrew" <McAllisterA@(protected)>
> 04/03/2004 23:47
>
> To: <suse-oracle@(protected)>
> cc: "Michael Hasenstein" <mha@(protected)>, "Sightler, Tom"
> <tsightler@(protected)>
> Subject: RE: [suse-oracle] [OT] Proliant DL740 hangs with sles8 (k_smp 2.4.21-190)
>
> Update on this thread...
>
> Our test 2650 has been running flawlessly for over 24 hours with over
> 7000 of our stress test data loads and a load average of 7.5+. No errors
> of any kind.
> Config is as follows:
> Dell 2650, SuSE SLES 8 SP3 fully patched by YOU in automatic mode,
> kernel 2.4.21-190_smp, 2 GB RAM, Adaptec RAID on the motherboard,
> aacraid driver, LVM, reiserfs, Oracle 9.2.0.3, async IO turned on. Max
> open files per user 1024.
>
> Our standby 6650 has been running for 24 hours and is performing VERY
> poorly. Latest change on this box was to disable async IO. Only 170
> stress test loads have finished in the last 20 hours. This compared to
> the 2650 above which has completed 7000 loads. We recently turned OFF
> async IO and performance went into the dumpster.
> Config is as follows:
> Dell 6650, SuSE SLES 8 SP3 fully patched by YOU in automatic mode,
> kernel 2.4.21-190_smp, 4 GB RAM, Perc4/DC (LSI Megaraid 320 dual
> channel), megaraid2 2.00.8, NO LVM, reiserfs, Oracle 9.2.0.3, async IO
> turned OFF. Max open files per user 1024. No errors but this may be
> because the lack of async IO is keeping the box from experiencing "heavy
> load". Two databases running on this box, one dataguard standby of our
> production environment and one stress test database.
>
> Production 6650 has been running for 22 hours. Rebooted last night to
> fix multiple oracle listener hangs and hung shell scripts with hung pipe
> problems. During system reboot we ran an fsck on all 8 of our database
> reiser file systems. There were no corruptions found. Database and
> listener were restarted and our normal data load last night produced two
> "ORA-12599: TNS:cryptographic checksum mismatch" and "ORA-03113:
> end-of-file on communications channel" messages. If history is any
> indication, we will get more and more of these errors each night for the
> next 3 nights, while other system problems arise. Otherwise today
> operations were mostly normal.
> Config is as follows:
> Dell 6650, SuSE SLES 8 SP3 fully patched by YOU in automatic mode,
> kernel 2.4.21-190_smp, 4 GB RAM, Perc4/DC (LSI Megaraid 320 dual
> channel), megaraid2 2.00.8, LVM active, reiserfs, Oracle 9.2.0.3, async
> IO turned ON. Max open files ulimit set to 16384 for the oracle user
> PRIOR to starting the database.
>
> So from this info I am making the following deductions:
> 1) Reiserfs doesn't appear to be a problem. It is running on both
> working and broken servers. Last reboot we fsck'd all mount points on a
> broken system, no corruption found.
> 2) Async IO doesn't appear to be a problem. It is running on broken and
> working servers.
> 3) Logical volume manager doesn't appear to be a problem, it is running
> on broken and working servers.
> 4) Max open files for oracle user at 1024 doesn't appear to be a
> problem. It is set to 1024 on working and non-working systems and 16384
> on a non-working system.
> 5) Broadcom GigE cards may or may not have contributed to the problems.
> 6) Hyperthreading on or off, broken servers are still broken.
>
> Based on this I think that there is a fairly good chance (I HOPE) the
> problem is:
> 1) Dell 6650 chipset/motherboard issues with kernel 2.4.21-190_smp and
> kin.
> 2) Megaraid controller hardware or megaraid2 driver issues (running
> 2.00.8 from SuSE while 2.10.0.1 is current from LSI)
> 3) Some other kernel or driver problem that is not directly related to
> anything above, but eventually causes a resource starvation that does
> affect the components tested. I hope this isn't the problem.
>
> Next steps? I guess our next step is to go off the supported kernel rpm
> and upgrade to the megaraid2 2.10.xx driver from LSI. Anyone else have
> any other suggestions to help diagnose the problems?
>
> Thanks
> Andy
>
> P.S. Michael, Oracle hasn't yet responded to our TAR updated last night
> at 9:20pm.
>
> --
> To unsubscribe, email: suse-oracle-unsubscribe@(protected)
> For additional commands, email: suse-oracle-help@(protected)
> Please see http://www.suse.com/oracle/ before posting
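
Andy's deductions in the quoted message amount to an elimination table: any setting that also runs on a working server is cleared, and what remains is suspect. Here is a minimal Python sketch of that reasoning, using only the three configurations quoted above. The names `SERVERS` and `suspect_factors` are made up for illustration, and treating the standby 6650 as "broken" is an assumption (the thread reports very poor performance but no errors on it; Andy's deduction 4 groups it with the non-working systems).

```python
# Configurations as described in the quoted message. The "working" flag for
# the standby 6650 is an assumption: it showed no errors, only poor throughput.
SERVERS = {
    "test-2650": {
        "working": True, "model": "2650", "kernel": "2.4.21-190_smp",
        "raid_driver": "aacraid", "lvm": True, "filesystem": "reiserfs",
        "async_io": True, "max_open_files": 1024,
    },
    "standby-6650": {
        "working": False, "model": "6650", "kernel": "2.4.21-190_smp",
        "raid_driver": "megaraid2 2.00.8", "lvm": False, "filesystem": "reiserfs",
        "async_io": False, "max_open_files": 1024,
    },
    "production-6650": {
        "working": False, "model": "6650", "kernel": "2.4.21-190_smp",
        "raid_driver": "megaraid2 2.00.8", "lvm": True, "filesystem": "reiserfs",
        "async_io": True, "max_open_files": 16384,
    },
}


def suspect_factors(servers):
    """Return the factors whose values on broken servers never occur on a working one."""
    factors = {key for cfg in servers.values() for key in cfg} - {"working"}
    suspects = []
    for factor in factors:
        working_values = {cfg[factor] for cfg in servers.values() if cfg["working"]}
        broken_values = {cfg[factor] for cfg in servers.values() if not cfg["working"]}
        # A factor is cleared as soon as some broken server shares its value
        # with a working server; otherwise it stays on the suspect list.
        if broken_values and working_values.isdisjoint(broken_values):
            suspects.append(factor)
    return sorted(suspects)


if __name__ == "__main__":
    print(suspect_factors(SERVERS))
```

With these inputs only the server model and the RAID driver stay on the list, which lines up with Andy's first two hypotheses. The sketch compares one factor at a time, so it cannot catch an interaction between components or a starvation problem like his third hypothesis, and the Broadcom cards and hyperthreading are left out because the thread does not say how they are set on each box.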