[Beaglebone] SD Card Corruption on Read Only File System

Not sure that it is of interest to anyone, but the 2 cards have now written over 2 billion blocks.

First of all, how big is each block? You should set the write block to
the size of a filesystem cluster; that way you do not have
fragmentation, and with every write you're guaranteed to modify
the entire block array.

Second, make sure that your write operation flushes to the SD card so
that modified data won't live in the system cache.
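
One way to do this per write, rather than for the whole mount, is to flush and fsync the file from the application. A minimal Python sketch follows; the function name is made up for illustration, and it only shows the kernel-side flush. What the card's controller does after it accepts the data is not visible from here.

```python
import os

def write_and_flush(path, data):
    """Write data to path and ask the kernel to push it to the storage device."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # drain Python's userspace buffer into the kernel
        os.fsync(f.fileno())  # ask the kernel to write its cached pages to the device
```

Calling write_and_flush() with a path on the card returns only after the kernel reports the data as written, which is roughly what a sync mount does for every write.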

Whatever you do, a memory cell usually fails at 200,000 writes. Figure
out the size of your SD card, the sector size, and the cluster size;
basic math will tell you how many cluster writes you will need to
perform until failure.

Thanks for the feedback.

I do not know the size of the blocks. What I was looking to find out
was if there really was wear leveling on these cards (as they fail
after a few weeks, I wanted to check this).

Wear leveling on the card is very hard to detect, in my experience.
It's up to the controller embedded in the card on how the writes take
place physically.

There should not be any fragmentation anyway as I am writing the same
file over and over.

Don't be sure of this, the controller may do a read-copy-write
operation and that may involve garbage collection on the flash itself,
depending on the erase block size and the size of your write. Your
file system may not be fragmented but the data on flash may be.

On flash, you can't simply overwrite a physical erase block, you have to
fully erase it first so usually a controller will copy the existing
good bits out of an erase block, append the new data, and write it
somewhere else. The old erase block now can be erased.
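
The read-copy-write step described above can be sketched as a toy model in Python. Everything here is an illustration: the function name, the list-of-pages representation, and the figure of 8 pages per erase block are assumptions, not how any particular controller works.

```python
def read_copy_write(block, page_index, new_page):
    """Return the contents of the replacement erase block after changing one page.

    `block` is a list of pages standing in for one physical erase block.
    Flash cannot overwrite a page in place, so the whole block is rebuilt
    in a fresh (already erased) block and the old one is queued for erase.
    """
    fresh = list(block)            # copy the still-valid pages out
    fresh[page_index] = new_page   # merge in the new data
    return fresh

PAGES_PER_ERASE_BLOCK = 8  # toy figure; real cards have far more pages per block

old_block = ["old"] * PAGES_PER_ERASE_BLOCK
new_block = read_copy_write(old_block, 3, "new")

logical_pages_written = 1
physical_pages_written = len(new_block)
write_amplification = physical_pages_written / logical_pages_written
print(write_amplification)
```

In this model one logical page write costs a full erase block of physical writes, which is why small writes can wear a card much faster than the byte count suggests.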

See Arnd's article in lwn [1].

[1]:Optimizing Linux with cheap flash drives [LWN.net]

I used sync to mount the card in order to make sure it gets written every
time. Before doing so, iostat would increment only every 5 sec, when
the cache was flushed to the card. After that, iostat would
increment continuously.

Watch out for wear increasing with sync mounts. You're now writing much
more often when not relying on the kernel buffers. Yes, it's safer in
the sense that if power goes away the card has the most up-to-date data on
it, but now you're doing writes (and hence the read-copy-write / garbage
collection above) potentially much more often.

If you want SD/MMC flash cards that do wear leveling, sign an NDA with
the card vendor and get the data sheet / papers on how they do it. Or
buy some really tiny probes and probe a raw die to see what it's doing
(definitely possible but probably not cheap).

-Andrew

One of the 2 cards has failed after a bit more than 2 billion block
writes which corresponds to writing the same file 21 million times.
My block size (reported by fdisk -l) is 512 bytes and the card size
is 4GB with at least 2.8GB free.

fdisk is wrong.

Usual erase block sizes for 4 GB SD cards are 1 to 8 MB. Cards usually
have hundreds or thousands of erase blocks, at most.

Read Arnd's lwn article [1]. Use flashbench [2] to determine what your
card has for erase block size.

[1]:Optimizing Linux with cheap flash drives [LWN.net]
[2]:arnd/flashbench.git - Tool for benchmarking and classifying flash memory drives

Am I wrong to assume the following:
(2.8GB / 512) x 200 000 = max number of block writes (about 1 094 billion)

See above. You're also not accounting for read-copy-write that might
take place depending on what data already exists at the physical erase
block the card decides to write to (possibly prompting garbage
collection which may have more writes than you expect).
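
To put rough numbers on this, here is a small Python sketch comparing the 512-byte estimate with a pessimistic erase-block estimate. The 4 MiB erase block size is an assumption picked from the 1 to 8 MB range mentioned above (flashbench would give the real figure), and the second estimate assumes every write costs one full erase cycle with perfect leveling, which is a model, not a measurement.

```python
CARD_FREE_BYTES = 2_800_000_000      # "at least 2.8GB free", from the post above
SECTOR_BYTES = 512                   # block size reported by fdisk -l
CYCLES_PER_CELL = 200_000            # endurance figure quoted earlier in the thread
ERASE_BLOCK_BYTES = 4 * 1024 * 1024  # ASSUMPTION: 4 MiB, inside the 1 to 8 MB range

# Estimate from the question: every 512-byte sector wears independently.
naive_writes = (CARD_FREE_BYTES // SECTOR_BYTES) * CYCLES_PER_CELL

# Pessimistic estimate: every write costs one full erase-block cycle,
# spread perfectly evenly across all free erase blocks.
erase_blocks = CARD_FREE_BYTES // ERASE_BLOCK_BYTES
worst_case_writes = erase_blocks * CYCLES_PER_CELL

print(f"{naive_writes:,}")       # 1,093,750,000,000
print(f"{erase_blocks:,}")       # 667
print(f"{worst_case_writes:,}")  # 133,400,000
```

Under these assumptions the two estimates differ by roughly the ratio of erase block size to sector size, so the 512-byte figure says little about how long the card should last.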

If the above is correct, the card has failed a lot earlier than
can be expected. Hence something else is killing these cards or the
wear leveling is not working properly.

Most of the cheap cards have poor, if any, wear leveling routines.

-Andrew

I'm wrestling probably the same issues. 2/3 of my uSD cards now cannot
be backed up or restored (using dd if=/dev/mmcblk0 etc.), one doesn't
seem to boot, the other hangs the command line on bone for long periods.
My application writes small files pretty often.

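This page:
Is it true that a SD/MMC Card does wear levelling with its own controller? - Electrical Engineering Stack Exchange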
contains comments pointing out that wear leveling and power cycle tolerance
are conflicting goals, and probably even fewer cards get both right.

And there I was thinking all this mess was sorted out inside the little
plastic wafer.

Andrew, do you happen to be aware of anyone who sells SD cards that *do*
have good leveling/power cycle tolerance and *are* believed to work well
with the bone? I'd be happy to pay.

Thanks,
Britton

>
>> One of the 2 cards has failed after a bit more than 2 billion block
>> writes which corresponds to writing the same file 21 million times.
>> My block size (reported by fdisk -l) is 512 bytes and the card size
>> is 4GB with at least 2.8GB free.
>
> fdisk is wrong.

[snip]

>> If the above is correct, the card has failed a lot earlier than it
>> can be expected. Hence something else is killing these cards or
>> the wear leveling is not working properly.
>
> Most of the cheap cards have poor, if any, wear leveling routines.

I'm wrestling probably the same issues. 2/3 of my uSD cards now
cannot be backed up or restored (using dd if=/dev/mmcblk0 etc.), one
doesn't seem to boot, the other hangs the command line on bone for
long periods. My application writes small files pretty often.

What cards are you using?

This page:
Is it true that a SD/MMC Card does wear levelling with its own controller? - Electrical Engineering Stack Exchange
contains comments pointing out that wear leveling and power cycle
tolerance are conflicting goals, and probably even fewer cards get
both right.

I don't think wear leveling and power cycle tolerance have anything to
do with each other.

Wear leveling is just writing data to erase blocks that have had fewer
writes rather than putting it somewhere else. Advanced modes may
migrate static data out of low write-count blocks in order to even
things out further. The goal is to wear all blocks at the same rate,
such that no single block fails much before any other.
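
As a toy illustration of that goal, the Python sketch below counts erases per block for a stream of rewrites under two policies. Both policies and all names are made up for the example; the "no leveling" case that always reuses block 0 is a deliberately naive baseline, not a claim about what real cards do.

```python
def pick_block(erase_counts):
    """Return the index of the erase block with the fewest erases so far."""
    return min(range(len(erase_counts)), key=erase_counts.__getitem__)

def simulate(num_blocks, num_writes, leveling):
    """Count erases per block for a stream of rewrites of the same data."""
    erase_counts = [0] * num_blocks
    for _ in range(num_writes):
        target = pick_block(erase_counts) if leveling else 0
        erase_counts[target] += 1
    return erase_counts

print(simulate(4, 1000, leveling=False))  # [1000, 0, 0, 0]
print(simulate(4, 1000, leveling=True))   # [250, 250, 250, 250]
```

With leveling, the most-worn block sees a quarter of the erases in this four-block model, so the first failure is pushed out accordingly.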

Power cycle tolerance, as talked about on that stackexchange thread,
isn't quite right. If you are sure you've flushed the kernel buffers
and waited for the card to have written everything, it doesn't really
matter if the data was written in a way that is good at wear leveling
or bad. If you pull power with the kernel buffers having data not yet
written, or only partially written out, then you'll run into fun file
system check issues on next boot, again regardless of wear leveling.

Don't pull power till you're sure the transactions with the card have
completed. Or, simply use a read only file system or boot from a
tmpfs/ramdisk.

And there I was thinking all this mess was sorted out inside the
little plastic wafer.

In the expensive ones, yes, probably it'll wear level at least in a
rudimentary way. In the cheap cards, I wouldn't count on it. SD cards
are super price conscious, if it's not required by the spec (it's not)
many manufacturers of the lower end won't do it.

Cameras write in a nice linear fashion (usually), that's what the SD
spec is written for. These crazy random writes that Linux (or any other
non-camera) does are a brave new world for little SD cards.

Andrew, do you happen to be aware of anyone who sells SD cards that
*do* have good leveling/power cycle tolerance and *are* believed to
work well with the bone? I'd be happy to pay.

Samsung Plus 8 GB uSD cards are generally considered quite good. I
personally like the SanDisk mobile ultra 4 GB uSD cards the best but the
larger sizes of this card are not as good (and 32 GB version has some
major possible issues, avoid those). I've used both Samsung Plus and
SanDisk ultra mobile on bones with no issues.

If you're in the USA, BestBuy.com has 4 GB SanDisk mobile ultra cards
for $4.99 each on sale.

-Andrew

You replied only to me but I'm replying to you and the beagle list.

>
>> > Most of the cheap cards have poor, if any, wear leveling
>> > routines.
>>
>> I'm wrestling probably the same issues. 2/3 of my uSD cards now
>> cannot be backed up or restored (using dd if=/dev/mmcblk0 etc.),
>> one doesn't seem to boot, the other hangs the command line on bone
>> for long periods. My application writes small files pretty often.
>
> What cards are you using?

Kingston 4 GB Class 4: can't dd onto or off of card
PNY 4 GB Class 4: can't dd onto or off of card
Kingston 4 GB Class 4: works, but hasn't seen nearly as many writes

Those are cheap cards. I wouldn't use them.

My rule of thumb is to use SD cards from companies who also own a
semiconductor fab. Usually you'll get decent quality stuff following
that rule. To get good stuff, buy as many cards as you can and test
them to find out what they really do.

None of these cards has seen anywhere near enough writing to get close
to the normal limits of flash memory assuming even slightly sane wear
leveling. Unless Angstrom is doing something insane that I don't know
about (they have considerable uptime).

I doubt any of those cards do any kind of wear leveling, sane or not.
Angstrom, or any other OS, has no control over the wear leveling on an
SD card, the internal controller to the card handles that.

I think maybe bones should ship with a different, better card if
possible.

That's not as easy as it sounds.

>> This page:
>> Is it true that a SD/MMC Card does wear levelling with its own controller? - Electrical Engineering Stack Exchange
>> contains comments pointing out that wear leveling and power cycle
>> tolerance are conflicting goals, and probably even fewer cards get
>> both right.
>
> I don't think wear leveling and power cycle tolerance have anything
> to do with each other.
>
> Wear leveling is just writing data to erase blocks that have had
> fewer writes rather than putting it somewhere else. Advanced modes
> may migrate static data out of low write-count blocks in order to
> even things out further. The goal is to wear all blocks at the
> same rate, such that no single block fails much before any other.

I would hope you would be correct, but if the migration process fails
to correctly use some sort of atomic lock then the posts on that page
saying it could corrupt any data on the disk (not just what's being
written) could be true. This Panasonic ad implies (FUDs?) that many
cards aren't power-cycle tolerant:
http://panasonic.net/avc/sdcard/industrial_sd/function.html

During a garbage collection or other background operations, yes, if
power is lost you can lose data or end up with a disk where the state
is not known even by the controller. That is a risk with any flash based
storage.

Background operations might happen at any time, including garbage
collection or rewriting static data. This doesn't really have anything
to do with wear leveling other than some wear leveling algorithms may
do these background operations as well as other kinds of algorithms.
So my earlier statement about wear leveling and power loss sensitivity
isn't completely right, but rarely will SD cards do intense background
operations or wear leveling schemes like this. It's cost (dollars)
prohibitive to implement such abilities when 99% of customers won't
ever need them.

Unless you can get a data sheet on an SD card that describes the way it
does wear leveling and other background operations, we're all just
stabbing in the dark. Maybe Panasonic's cards do a great job, but I
have no data on that (and I assume if I did I wouldn't be allowed to
share it due to NDA).

I'm not sure where those Panasonic microSD cards can even be purchased.

> Cameras write in a nice linear fashion (usually), that's what the SD
> spec is written for. These crazy random writes that Linux (or any
> other non-camera) does are a brave new world for little SD cards.

Yes it seems like cameras would need to worry about it less. But.
My camera recently silently ate a bunch of pictures. And my droid SD
card has now entirely quit working. All together I've pretty well
lost all belief in SD card reliability unless somebody is ready to
promise otherwise.

No one's ready to promise otherwise :)
At least not me.

>> Andrew, do you happen to be aware of anyone who sells SD cards that
>> *do* have good leveling/power cycle tolerance and *are* believed to
>> work well with the bone? I'd be happy to pay.
>
> Samsung Plus 8 GB uSD cards are generally considered quite good. I
> personally like the SanDisk mobile ultra 4 GB uSD cards the best
> but the larger sizes of this card are not as good (and 32 GB
> version has some major possible issues, avoid those). I've used
> both Samsung Plus and SanDisk ultra mobile on bones with no issues.
>
> If you're in the USA, BestBuy.com has 4 GB SanDisk mobile ultra
> cards for $4.99 each on sale.

Thanks, I'll try these. On a related note I see the BeagleBone Black has 2
GB eMMC which is supposedly more reliable. I wonder if it boots from
there out of the box or is an easy setup.

eMMC should work better than cheap SD cards. There's other benefits of
eMMC, too, mainly that the bus is wider so read throughput can be higher
than SDv2.00 (the write throughput on the eMMC used on the black isn't
stellar[1], assuming the BOM hasn't changed since then). Setup, if
you don't buy a black that boots from eMMC directly, is easy.

[1]:http://lists.linaro.org/pipermail/flashbench-results/2013-January/000353.html

-Andrew

I think that you might have misunderstood my goal.
My goal was to find out if there was some wear leveling on the
kingston card shipped with the beagle bone.
This is why I used sync to write as much as possible to the SD card.

The result is that there is certainly wear leveling on that card as I
have overwritten the same file 21 million times. If there was no
wear leveling, it should have failed much earlier.

The ability to write to the same block multiple times more than you
think you should does not indicate wear leveling. It may simply
indicate that there was a read-copy-write operation. All cards will do
this, it's required by the physics of flash when attempting to write
back to the same block data that won't fit based on the page sizes
(sorry, I'm not good at explaining this).

Wear leveling is doing this activity with a stated goal of optimizing
the write cycles on each erase block. Doing wear leveling intelligently
will lead to longer life. Simply doing read-copy-write operations on a
write to a block does not imply wear leveling.

If the controller was really simply overwriting the same erase block
every time, performance would really suck. You'd have to read out the
entire erase block, erase it, and write back the new data. This would
require quite a large amount of cache (at least 1 erase block's worth,
measured in MB). The controllers in cheap SD cards have a few kB of
RAM, at most, as cost is very critical. Thus, they don't simply write
back to the same erase block but move the useful data along with the new
write to another erase block and then go back and erase the now
no-longer-needed erase block after the operation completes. This gives
decent write performance with small caches. It looks like wear
leveling at a high level but it's not.

-Andrew

>> This page:
>> Is it true that a SD/MMC Card does wear levelling with its own controller? - Electrical Engineering Stack Exchange
>> contains comments pointing out that wear leveling and power cycle
>> tolerance are conflicting goals, and probably even fewer cards get
>> both right.
>
> I don't think wear leveling and power cycle tolerance have anything
> to do with each other.
>
> Wear leveling is just writing data to erase blocks that have had
> fewer writes rather than putting it somewhere else. Advanced modes
> may migrate static data out of low write-count blocks in order to
> even things out further. The goal is to wear all blocks at the
> same rate, such that no single block fails much before any other.

I would hope you would be correct, but if the migration process fails
to correctly use some sort of atomic lock then the posts on that page
saying it could corrupt any data on the disk (not just what's being
written) could be true. This Panasonic ad implies (FUDs?) that many
cards aren't power-cycle tolerant:
http://panasonic.net/avc/sdcard/industrial_sd/function.html

During a garbage collection or other background operations, yes, if
power is lost you can lose data or end up with a disk where the state
is not known even by the controller. That is a risk with any flash based
storage.

Isn't it possible to handle this (at least theoretically) with an atomic
write that certifies that a (possibly larger) recent write has completed?
This is how databases work if I understand right, I would have guessed that
the firmware on the SD cards would do the same sort of thing. Or is this
not possible with flash?

Thanks very much for all your info on this stuff.

Britton

It's possible with flash. I have no way of telling if a controller
does it or not though, at least not without probing the part in
question and taking a logic analyzer to it.

Now you have me interested in doing this... :)

-Andrew

It only makes sense if the wear leveling actually works better and
the erase block sizes stay small. A better bet would be, if you're
going to deploy new images, use any SD card you want but not write to
it. Boot to a ramdisk as the root file system, then all your problems
go away :)

Boot will possibly take longer, some logging ability will be lost (or
will become more convoluted), but graceful powerdown becomes literally
pull the plug.

Regarding deployment, cycle through your customers. Build 25 units or
so and mail them out asking customers to mail the defunct units back
after they receive the new one. That way your customers don't have down
time. Then repair the 25 you get back and repeat.

-Andrew

Well ironically this morning I've had many more pack it in. This is
a bit of a disaster. Unfortunately my remote units are tamper proof
with no one allowed to go into the enclosures. I've started reading
up on Ram disks. I'm using Ubuntu 12.10 on the boards. Do you have
any good references for setting up a RAM disk?

Google has a lot of recommendations:
https://www.google.com/search?q=root+on+ramdisk

-Andrew

Is it enough to only ramdisk the known write activity directories
like /var/log and /tmp or are there other places where write activity
will occur and therefore it is safer to ramdisk the entire root? I'm
wondering because 256M is not very much memory and Ubuntu 12.10 is
not that small.

Running just the highly written directories in a ramdisk or tmpfs is a
good half measure. To be safest, put the full root fs on ram disk.

Use something like Angstrom or BuildRoot to construct your root fs.
Depending on your needs, a root fs of 10s of MB is possible without
much effort using either of those. Ubuntu (and Debian) are bloated
pigs for running root on a ram disk (but they are decent choices when
not).

I'm noticing that Raspberry Pi is also encountering much of the same
problem with SD card corruption.

Did anyone ever figure out why a read-only filesystem is still
causing problems as an earlier poster already remarked?

Not that I know of. Most likely the file system wasn't as read only as
the user thought.
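
One way to check is to look at the mount options the kernel actually reports. The Python sketch below parses text in the /proc/mounts format and lists anything mounted read-write from the card. The function name and the "/dev/mmcblk" prefix are assumptions for illustration; on some kernels the root device shows up under a different name such as /dev/root, so adjust the prefix to match.

```python
def writable_mounts(mounts_text, device_prefix="/dev/mmcblk"):
    """Return (device, mount_point) pairs mounted read-write from the given device."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, _fstype, options = fields[:4]
        if device.startswith(device_prefix) and "rw" in options.split(","):
            found.append((device, mount_point))
    return found

# Hypothetical sample in /proc/mounts format, not taken from a real board.
sample = (
    "/dev/mmcblk0p2 / ext4 rw,relatime 0 0\n"
    "/dev/mmcblk0p1 /boot vfat ro,relatime 0 0\n"
    "tmpfs /tmp tmpfs rw 0 0\n"
)
print(writable_mounts(sample))
```

On a running board, pass it the contents of /proc/mounts; an empty result means no partition matching the prefix is mounted read-write at that moment.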

I've also found at this
link <http://cxcv.de/post/34356721648/fixing-raspberry-pi-sd-card-issues>
mention of a fix for SD card corruption with the Raspberry Pi based
on core frequency. I believe the Beaglebone switches clock speed
based on power from USB or the 5V barrel. Are there situations that
cause the BB to fluctuate clock speed, and is this perhaps causing the
issue with SD card failure?

It shouldn't but if you want to disable freq scaling, it's easy, just
don't install the cpufreq tools and disable freq scaling in your
kernel. I run my bones at 720 MHz all the time, the power savings of
the lower frequencies isn't worth the hassle for me. If that saves SD
cards, too, that's a nice side benefit. I've never tested the impact
to the SD card of frequency scaling.

I'm throwing everything at the wall to see what sticks because I need
to pick a route pronto and figure out how to make the BB bulletproof
within the next 12 hours.

Don't rush a fix. Find what works, sweet talk the customers, and
deploy as quickly as you can, but don't rush something out that's going
to have issues without testing that you've actually fixed anything and
not broken something else.

-Andrew

You might also want to look at the so called industrial rated SD cards, e.g.

http://www.delkinoem.com/secure-digital-industrial.html

or

http://swissbit.com/index.php?option=com_content&view=article&id=194&Itemid=62

I’m investigating these for our datalogger design that’s based on the BB.

Also interested to hear of any others using industrial rated SD cards.

Expect to pay quite a premium for "industrial" SLC cards.
I'm possibly awaiting a few ATP cards to test. If they arrive I'll post
life test and flashbench results.

As a point of reference, one ATP card some of our Windows guys were
looking to get was the ATP 32 GB industrial full size SD [1] and it
priced out around $250 per unit for a few hundred unit order. CDW has
ATP 2 GB industrial cards for $39 [2]. Compare this with 4 GB SanDisk
Mobile Ultra consumer level cards for $5 and you'll get a rough idea of
the price multiplier.

[1]:ATP Electronics | The Global Leader in Specialized Storage and Memory Solutions
[2]:ATP Industrial Grade - flash memory card - 2 GB - SD - AF2GSDI - Flash Memory Cards - CDW.com

-Andrew

I’m not an SD card expert, but I think that if there is ANY writeable partition on an SD card, then they are all, in a sense,
writeable, as the pool of erase blocks is constantly being recycled. So, you THINK that you have a partition that
is read-only, but when another partition is written to, blocks from here and there are gathered and moved to another
place by the wear leveling and/or block erase mechanism, including those from the supposedly read-only partition.

Jon

Don’t forget that the card has to do its internal housekeeping (bad block assignment etc.), so corruption can occur if it loses power whilst doing one of these operations.

This was asked awhile ago but I did not see a clear response. Playing around with the Beaglebone black I have had corruption on the SD card while not on the eMMC.

If I can minimize my app to work purely on the onboard 2 GB, am I guaranteed (nearly) very little corruption, or is it going to have the same failure style just further down the road? When/if this happens it would seem the Beagle black would be ‘bricked’ except for then depending upon external SD cards with periodic replacements.

There's no guarantee.

The black won't be bricked, you can always boot from SD. If you have a
hot air rework station, you can remove the eMMC from the board
(granted, putting a new one down is a tad trickier).

If you write to flash a lot, any kind of flash, eventually erase blocks
will wear out and it won't take writes without corruption. Write less
or if you need to write a lot, get a spinning rust disk or other way to
get data off without hitting flash.

The eMMC on the black is the lowest end device Micron makes. But, that
being said, I'd expect its wear leveling routines and other background
operations to be quite a lot better than Kingston SD cards. If you're
interested in the flashbench results, see [1] (4 bit mode) and [2] (8
bit mode). Erase block size is 2 MiB, so align your partitions to that
and you will see probably slightly better life and performance.

[1]:http://lists.linaro.org/pipermail/flashbench-results/2013-January/000353.html
[2]:http://lists.linaro.org/pipermail/flashbench-results/2013-February/000355.html
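
For the alignment itself, the arithmetic is just rounding the partition start up to a multiple of the erase block size. A small Python sketch, assuming 512-byte logical sectors (the helper name and the example start sector are illustrative only):

```python
ERASE_BLOCK_BYTES = 2 * 1024 * 1024  # 2 MiB, from the flashbench results above
SECTOR_BYTES = 512                   # ASSUMPTION: 512-byte logical sectors

def align_up(sector, alignment_sectors):
    """Round a start sector up to the next multiple of alignment_sectors."""
    return -(-sector // alignment_sectors) * alignment_sectors

sectors_per_erase_block = ERASE_BLOCK_BYTES // SECTOR_BYTES  # 4096

# 63 is the legacy DOS default start sector, used here only as an example.
print(align_up(63, sectors_per_erase_block))  # 4096
```

So with these figures, partitions would start at sector 4096 or a multiple of it.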

Also see Arnd's recommendations [3] in response. Don't use ext3.

[3]:http://lists.linaro.org/pipermail/flashbench-results/2013-February/000360.html

-Andrew