[CentOS] ECC RAM Error

Thu Oct 11 14:30:48 UTC 2007
Centos <centos at unixplanet.biz>

Thank you Lance,

We will change the memory to see whether that resolves the problem.
That storage device only runs a basic Linux kernel, which unfortunately
does not include memtest86.
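
Where memtest86 is not available, one possible way to watch for memory
errors from a running Linux system is to read the kernel's EDAC counters
in sysfs. The sketch below assumes an EDAC driver for the memory
controller is loaded, which has not been confirmed for this device; the
function name read_edac_counts is made up for illustration.

```python
from pathlib import Path


def read_edac_counts(base="/sys/devices/system/edac/mc"):
    """Return {controller: (corrected, uncorrected)} from EDAC sysfs counters.

    Assumption: the kernel exposes EDAC memory controllers under `base`.
    An empty dict means no controller was found (no EDAC driver loaded).
    """
    counts = {}
    root = Path(base)
    if not root.is_dir():
        return counts
    for mc in sorted(root.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())  # corrected errors
            ue = int((mc / "ue_count").read_text().strip())  # uncorrected errors
        except (OSError, ValueError):
            continue  # skip controllers with unreadable or malformed counters
        counts[mc.name] = (ce, ue)
    return counts


if __name__ == "__main__":
    for name, (ce, ue) in read_edac_counts().items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

Comparing the counters before and after the RAM swap, under the same
write load, would give a rough indication of whether the errors have
stopped.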



Lance Davis wrote:
> On Thu, 11 Oct 2007, Centos wrote:
>
>> Do you think replacing the RAM will solve our problem?
>
> Assuming it is the RAM that has gone faulty and not some other issue,
> then it should.
>
>> How can I make sure it is the RAM?
>
> memtest86?
>
> Regards
> Lance
>
>>
>>
>> Lance Davis wrote:
>>> On Thu, 11 Oct 2007, Centos wrote:
>>>
>>> > The ECC errors only happen when I am transferring data from another
>>> > storage device to this one.
>>> > They only happen when data is being written to it.
>>>
>>> Well that is when it is detected ...
>>>
>>> As I said, ECC RAM errors are concerned with an error in storage,
>>> not an error in transmission.
>>>
>>> Regards
>>> Lance
>>>
>>>
>>>
>>> > Lance Davis wrote:
>>> > > On Thu, 11 Oct 2007, Centos wrote:
>>> > >
>>> > > > I was wondering if it is safe to use the device until we
>>> > > > receive the RAM. That device is our main storage.
>>> > > > Is data retransmitted when ECC errors happen?
>>> > > > I don't want to have data corruption.
>>> > >
>>> > > You are not talking about data transmission, but storage.
>>> > >
>>> > > If two or more bit errors occur then ECC is not able to correct
>>> > > them and you are likely to get data corruption.
>>> > >
>>> > > Regards
>>> > > Lance
>>> > >
>>> > > > Matthew Miller wrote:
>>> > > > > On Thu, Oct 11, 2007 at 09:57:12AM -0300, Centos wrote:
>>> > > > > > Does anyone have any experience with ECC RAM errors?
>>> > > > > > We are seeing ECC fault errors, but I am not sure whether
>>> > > > > > they are related to the RAM itself or to a bad connection
>>> > > > > > and noise.
>>> > > > > > Please let me know if you have a good document regarding
>>> > > > > > ECC errors; in particular, I want to know whether data
>>> > > > > > will be retransmitted when an error happens.
>>> > > > > >
>>> > > > > > 02:00:31, Thursday, 10/11/2007
>>> > > > > > : EXCEPTION: ECC Error Interrupt (Two or more Bit Error)
>>> > > > > > 0C18:00020001 0C68:00000000 Lcause:74630001 Lerr:1C855F82
>>> > > > >
>>> > > > > Change the memory; see if the errors persist.
>>> > > > >
>>> > > > _______________________________________________
>>> > > > CentOS mailing list
>>> > > > CentOS at centos.org
>>> > > > http://lists.centos.org/mailman/listinfo/centos
>
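
The point made in the thread, that ECC can correct a single flipped bit
but can only detect two flipped bits, can be illustrated with a toy
example. The sketch below uses an extended Hamming (8,4) code on a 4-bit
value; this is an assumption chosen for brevity. Real ECC memory usually
protects wider words, and the scheme used by this particular device is
not known. The names encode and decode are made up for illustration.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit extended Hamming (SECDED) codeword.

    Index 0 holds the overall parity bit; indexes 1..7 are the classic
    Hamming(7,4) positions, with parity bits at positions 1, 2 and 4.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2  # makes the parity of the whole word even
    return code


def decode(code):
    """Return (status, nibble); nibble is None when the word is uncorrectable."""
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of set positions is 0 for a valid word
    overall = sum(code) % 2
    fixed = list(code)
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd parity: treat as a single flipped bit at index `syndrome`
        # (index 0 is the overall parity bit itself) and flip it back.
        fixed[syndrome] ^= 1
        status = "corrected"
    else:
        # Even parity but non-zero syndrome: two bits flipped.
        return "uncorrectable", None
    nibble = fixed[3] | (fixed[5] << 1) | (fixed[6] << 2) | (fixed[7] << 3)
    return status, nibble


word = encode(0b1011)

one_flip = list(word)
one_flip[5] ^= 1
print(decode(one_flip))   # single-bit error: corrected

two_flips = list(word)
two_flips[5] ^= 1
two_flips[6] ^= 1
print(decode(two_flips))  # double-bit error: detected, not correctable
```

In this toy code a double-bit error is at least reported, which matches
the "Two or more Bit Error" exception in the log: the hardware knows the
stored word is bad but cannot recover the original data.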