[Devel] Re: megaraid_mbox: garbage in file

James Bottomley James.Bottomley at SteelEye.com
Fri May 5 08:59:22 PDT 2006


On Fri, 2006-05-05 at 09:37 +0400, Vasily Averin wrote:
> The issue is that a SCSI read command that completes successfully returns garbage
> (a repeating 0...127 pattern -- see the hexdump in my first message) instead of the
> correct file content. The "attempt to access beyond end of device" messages occur
> because the same garbage is read from the indirect block. I found this garbage
> already present in the data buffers as early as the megaraid driver functions.
> 
> Note that if I read the same file using dd with bs=1024 or bs=512,
> I get the correct file content.
> 
> When I use a kernel with a 4GB memory limit, the same cat command also returns
> the correct file content, without any garbage.
> 
> The question is: what is this strange garbage? Have you seen it before?
> Could it be a driver-related issue, or is it broken hardware?
> And why can I work around the issue by using only 4GB of memory?

This is really odd ... if the controller can't reach *any* memory above
32 bits, then, on an 8GB machine you'd expect corruption all over the
place since most user pages come from the top of highmem.

The first thing to try, since you have an Opteron system, is to get rid
of highmem entirely and use a 64-bit kernel (just to make sure we're not
running into some annoying dma_addr_t conversion problem).  Then, I
suppose if that doesn't work, try printing out the actual contents of
the sg list to see what the physical memory location of the page
containing the corrupt block is.
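
Once you have the sg entries printed, the check itself is simple
arithmetic. Here is a rough userspace sketch in Python (not driver code)
of what to look for. The names classify_segment, truncated_32 and report
are made up for illustration, and the sample addresses are invented, not
taken from the failing machine. It flags any segment that lies above, or
straddles, the 4GB boundary, and shows what the address would become if
something truncated it to 32 bits.

```python
FOUR_GIB = 1 << 32  # first bus address that no longer fits in 32 bits


def classify_segment(addr, length):
    """Classify one (address, length) segment against the 4 GiB boundary.

    Returns "low" if the whole segment is below 4 GiB, "high" if it starts
    at or above 4 GiB, and "straddles" if it crosses the boundary.
    """
    if length <= 0:
        raise ValueError("segment length must be positive")
    last = addr + length - 1  # address of the last byte in the segment
    if last < FOUR_GIB:
        return "low"
    if addr >= FOUR_GIB:
        return "high"
    return "straddles"


def truncated_32(addr):
    """Address that results if only the low 32 bits survive a conversion."""
    return addr & 0xFFFFFFFF


def report(segments):
    """Build one text line per segment, annotating those not fully below 4 GiB."""
    lines = []
    for i, (addr, length) in enumerate(segments):
        kind = classify_segment(addr, length)
        line = f"sg[{i}] addr=0x{addr:010x} len={length} {kind}"
        if kind != "low":
            line += f" (32-bit truncation -> 0x{truncated_32(addr):08x})"
        lines.append(line)
    return lines


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    sample = [(0x00001000, 4096), (0xFFFFF000, 8192), (0x123456000, 4096)]
    for line in report(sample):
        print(line)
```

If the corrupt blocks always line up with "high" or "straddles" entries,
that points at the addressing path rather than at the disk contents.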

This could also be a firmware problem, I suppose, but I haven't seen any
similar reports.

James




