Re: [rtl] SMP error
Hi Stuart,
I completely agree with you: it is most likely a hardware problem.
It could be:
A. Some kind of malfunction in hardware (bad CPUs, bad memory or ???);
OR
B. Some kind of incompatibility in hardware (buggy BIOS, some special features etc.).
In case A, it is not worth anybody's time. In case B, it might be interesting to find the cause and maybe even a workaround.
I believe case B is more likely. We have two incidents of this SMP problem: on Surya's hardware (dual Pentium-II, PR440FX chipset) and on my hardware (dual Pentium Pro, Intel 82440FX chipset).
My system:
Genuine Intel Buckeye B440FX motherboard, MP V1.4 spec compatible
Dual Pentium Pro 200MHz, 256K L2, same stepping, stepping is compatible with B440FX motherboard, CPUs are not overclocked, same versions of local APICs and IOAPIC (Version 17).
128MB EDO DRAM 60ns, checked with comprehensive Checkit memory test (DOS), never fails on kernel compile (I am not aware of any Linux memory test other than gcc :-).
Regular SMP Linux is very stable on this machine, with both the Red Hat 6.1 pre-built linux-2.2.12smp kernel and a custom-built SMP linux-2.2.13. I have run the SMP tests from http://www.keylabs.com/linux/linux_tools.html (loop5d), and both CPUs are shown to be loaded up to 100%. Regular Linux never crashed. That said, this does not prove much, as I am not running it in a "production" multi-user, heavy-load environment.
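For illustration only, the kind of "load every CPU and watch for trouble" test mentioned above could be sketched roughly as below. This is not loop5d and not any existing tool; the names checksum, worker and run_stress are made up for this example. It starts one process per CPU, has each process recompute the same checksum over and over, and counts results that differ from the first one. A clean run proves very little, while a mismatch would point at the hardware.

```python
import hashlib
import os
from concurrent.futures import ProcessPoolExecutor


def checksum(seed: int, size: int = 1 << 16) -> str:
    """Fill a buffer with a deterministic byte pattern and hash it."""
    data = bytes((seed + i) & 0xFF for i in range(size))
    return hashlib.sha256(data).hexdigest()


def worker(seed: int, rounds: int) -> int:
    """Recompute the same checksum `rounds` times; return the mismatch count."""
    reference = checksum(seed)
    # On healthy hardware every recomputation must equal the reference.
    return sum(1 for _ in range(rounds) if checksum(seed) != reference)


def run_stress(rounds: int = 50) -> int:
    """Run one worker process per CPU; return the total number of mismatches."""
    cpus = os.cpu_count() or 1
    with ProcessPoolExecutor(max_workers=cpus) as pool:
        return sum(pool.map(worker, range(cpus), [rounds] * cpus))


if __name__ == "__main__":
    # Hypothetical usage: raise `rounds` for a longer soak test.
    print("mismatches:", run_stress())
```

Separate processes are used rather than threads so that each CPU really gets its own stream of work; the buffer size and round count are arbitrary choices.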
System "oddities":
CPUs are on a separate board with the "Brittany" connector;
PCI-to-PCI bridge (two PCI buses);
Fault-Resilient-Boot feature (now disabled);
plenty of system monitoring features (Intel high-end servers);
problem with PCI passive release in 440FX (patch applied, seems to be irrelevant).
Please don't get me wrong: I am not blaming rtlinux or rtai for anything. I am a big admirer of these great development efforts, which give me a lot of motivation to learn. I am just posting the report of my problem to the rtl-list in the hope that it might be helpful if somebody else runs into a similar problem.
Thank you,
Sergey.
P.S. By the way, are you aware of any good SMP stress test for linux?
----- Original Message -----
From: Stuart Hughes <sehughes@zentropix.com>
To: Sergey Osechinskiy <sergeyo@worldnet.att.net>
Cc: <yodaiken@fsmlabs.com>; <rtl@rtlinux.org>
Sent: Friday, April 14, 2000 2:30 AM
Subject: Re: [rtl] SMP error
> Hi Sergey,
>
> It sounds like you are onto something when you mentioned that the
> machine itself could have some hardware problems. Motherboard quality
> is highly variable; also, you are supposed to match the stepping on the
> CPUs too (Intel has a stepping compatibility list somewhere on their
> website). I'm not saying that your hardware is 'bad' but I think that
> it is worth considering this as an option. Is SMP rock solid on this
> machine under regular linux ???
>
> Regards, Stuart
>