Fragmentation while extracting a big tar file with 81000 files of 4 KB each onto an eMMC partition

Hello Experts,

We are seeing fragmentation and resulting eMMC partition corruption when extracting a tar file containing 81000 files of 4 KB each.

Detailed steps are listed in the attached Test_steps.txt file; the other attached tarballs contain the scripts needed to create the big tar file, plus the other test scripts.

The test is very simple:

  1. We have a tar file containing 81000 files of 4 KB each.
  2. We booted the latest Debian image from BeagleBoard.org from a microSD card.
  3. We partitioned the eMMC into two partitions.
  4. We mounted one of the partitions and started extracting the big tar file onto it.
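For anyone who wants to reproduce this without the attached scripts, here is a minimal Python sketch of the same workload. It is not the attached test script; the function names `build_tar` and `extract_tar` and the `files/file_NNNNNN.bin` naming are made up for illustration, and it assumes Python 3 is available on the board.

```python
import io
import os
import tarfile

FILE_SIZE = 4096  # 4 KB per file, as in the test description


def build_tar(tar_path, count, size=FILE_SIZE):
    """Write an uncompressed tar containing `count` files of `size` bytes each."""
    payload = os.urandom(size)
    with tarfile.open(tar_path, "w") as tar:
        for i in range(count):
            # Hypothetical naming scheme; the real test files may be named differently.
            info = tarfile.TarInfo(name=f"files/file_{i:06d}.bin")
            info.size = size
            tar.addfile(info, io.BytesIO(payload))
    return tar_path


def extract_tar(tar_path, dest):
    """Extract the tar into `dest`, flush to storage, and return the file count."""
    with tarfile.open(tar_path, "r") as tar:
        tar.extractall(dest)
    if hasattr(os, "sync"):
        os.sync()  # push dirty pages out so I/O errors show up now, not later
    return sum(len(files) for _, _, files in os.walk(dest))
```

To mimic the report, one would call `build_tar("big.tar", 81000)` on the PC and then `extract_tar("big.tar", "/mnt/emmc")` on the board, where `/mnt/emmc` stands for wherever the eMMC partition is mounted.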

This results in the errors below.

EXT4-fs error (device mmcblk1p2) in ext4_do_update_inode:4665: Journal has aborted
[12064.194479] EXT4-fs error (device mmcblk1p2) in add_dirent_to_buf:1921: Journal has aborted
[12064.209980] EXT4-fs error (device mmcblk1p2) in ext4_do_update_inode:4665: Journal has aborted
[12064.223765] EXT4-fs error (device mmcblk1p2) in ext4_create:2455: IO failure
[12064.238221] EXT4-fs error (device mmcblk1p2): ext4_journal_check_start:56: Detected aborted journal
[12064.247591] EXT4-fs (mmcblk1p2): Remounting filesystem read-only
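To see at a glance which ext4 functions reported what, the log above can be summarized with a short Python sketch. The regular expression is only an assumption based on the message shapes shown here, not a general dmesg parser, and `summarize` is a made-up helper name.

```python
import re
from collections import Counter

# Log lines copied from the report above.
LOG = r"""
EXT4-fs error (device mmcblk1p2) in ext4_do_update_inode:4665: Journal has aborted
[12064.194479] EXT4-fs error (device mmcblk1p2) in add_dirent_to_buf:1921: Journal has aborted
[12064.209980] EXT4-fs error (device mmcblk1p2) in ext4_do_update_inode:4665: Journal has aborted
[12064.223765] EXT4-fs error (device mmcblk1p2) in ext4_create:2455: IO failure
[12064.238221] EXT4-fs error (device mmcblk1p2): ext4_journal_check_start:56: Detected aborted journal
[12064.247591] EXT4-fs (mmcblk1p2): Remounting filesystem read-only
"""

# Matches both "... (device X) in func:line: msg" and "... (device X): func:line: msg".
PATTERN = re.compile(
    r"EXT4-fs error \(device (?P<dev>\w+)\)(?: in|:) "
    r"(?P<func>\w+):(?P<line>\d+): (?P<msg>.+)"
)


def summarize(log_text):
    """Count EXT4-fs error lines per (function, message) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            counts[(match["func"], match["msg"].strip())] += 1
    return counts


if __name__ == "__main__":
    for (func, msg), n in summarize(LOG).most_common():
        print(f"{n}x {func}: {msg}")
```

All of these messages say the journal was already aborted or report a generic IO failure, so the first failure is probably earlier in dmesg. It may be worth checking for mmc or block-layer errors logged just before the first EXT4-fs line.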

Strangely enough, this test works fine on the Angstrom OS image https://s3.amazonaws.com/angstrom/demo/beaglebone/Angstrom-Cloud9-IDE-GNOME-eglibc-ipk-v2012.12-beaglebone-2013.06.20.img.xz
That image has kernel 3.8, while the latest Debian has kernel 4.4.

I suspect a memory leak in the eDMA driver (edma.c).

We see this issue sporadically on devices in the field, so this is not an unrealistic error.
We came up with this use case to reproduce it.

Any suggestions or patches to solve this issue?

Thank you,

Regards,
Ankur

run_on_bbb.tar.gz (1.82 KB)

run_on_linux_pc.tar.gz (606 Bytes)

Test_steps.txt (2.45 KB)

Hi Mr. Nelson,

If you have any suggestions for this, please let me know.

Thank you,

Regards,
Ankur

Hello Experts,

I have tried the following:

  1. I tried changing various virtual memory parameters in /proc/sys/vm/, but without much success.
  2. I tried ti-processor-sdk-linux-am335x-evm-03.03.00.04; it seems to fail less frequently, but it does fail there as well.
  3. On the 3.12 kernel we had applied the two patches below, which basically free some memory:

https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/drivers/dma/edma.c?id=5ca9e7ce6eebec53362ff779264143860ccf68cd
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/drivers/dma/edma.c?id=e3ddc979465118b7ba46fed7cd10f4421edc3049

     Although these patches solved the problem, extraction became very slow: without the patches it took 4 minutes, and with them it took 1 hour, so we couldn't use them.
  4. Since those patches free memory in edma.c, I feel there could be a memory leak in the eDMA driver, but I am not familiar with the full DMA/eDMA driver, so I couldn't debug it. I watched a video about the EDMA controller on the TI website, but it is still not easy to work through the edma driver.

Could someone suggest how to get to the root cause of this issue?

Thank you,

Regards,
Ankur

Hello

Did you check the number of inodes on your device mmcblk1p2 (df -i)?
The number _must_ be over 81000. You can set the number of inodes
when formatting with -N (see man mkfs.ext4). You should play around with
-b and other parameters as well...

tar.gz is nothing more than a gzipped tar file. Did you try the same
experiment with an uncompressed tar of the same size?
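If df -i is not available on the target, the same numbers can be read through statvfs. Below is a minimal Python sketch; `inode_headroom` is a made-up helper name and the mount point is whatever path the eMMC partition is mounted on.

```python
import os


def inode_headroom(mount_point, files_needed):
    """Report total and free inodes on a mounted filesystem via statvfs."""
    st = os.statvfs(mount_point)
    return {
        "total": st.f_files,   # total inodes on the filesystem
        "free": st.f_ffree,    # free inodes
        "enough": st.f_ffree >= files_needed,
    }


if __name__ == "__main__":
    # "/" is only a placeholder; use the eMMC partition's mount point instead.
    print(inode_headroom("/", 81000))
```

Keep in mind that directories consume inodes too, so the free count should be somewhat above the number of files in the tar.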

Strange enough this test works fine on Angstrom OS image

Maybe the Angstrom mkfs produces a different filesystem with a different number of inodes.

Hi Dieter Wirz,

Thank you for the reply.

I think there are enough inodes available on the partition, as you can see in the tune2fs output below (my busybox df doesn't support the -i option).

`

tune2fs -l /dev/mmcblk0p16 | grep -i inode

Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse
Inode count: 128016
Free inodes: 107941
Inodes per group: 2032
Inode blocks per group: 254
First inode: 11
Inode size: 128
Journal inode: 8
Journal backup: inode blocks
First error inode #: 0
Last error inode #: 0

`
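To make the arithmetic explicit, here is a small Python sketch that parses the relevant lines of the tune2fs output above and compares the free inode count against the 81000 files in the tar. `parse_tune2fs` is a made-up helper that simply splits each line at the first colon.

```python
# Relevant lines copied from the tune2fs output above.
TUNE2FS = """\
Inode count: 128016
Free inodes: 107941
Inodes per group: 2032
Inode size: 128
"""

FILES_IN_TAR = 81000


def parse_tune2fs(text):
    """Turn 'Key: value' lines into a dict of stripped strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


fields = parse_tune2fs(TUNE2FS)
total = int(fields["Inode count"])
free = int(fields["Free inodes"])
used = total - free              # inodes already in use
headroom = free - FILES_IN_TAR   # inodes left after extracting every file

if __name__ == "__main__":
    print(f"used={used} free={free} headroom={headroom}")
```

That gives 20075 inodes in use and 26941 free inodes left over after the 81000 files, so inode exhaustion looks unlikely on this partition. Note that this listing is for /dev/mmcblk0p16 while the error log names mmcblk1p2; the same check applies to whichever partition is under test.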

I tested with tar file as well. Result is same as tar.gz file.

Thank you,

Regards,
Ankur