How to choose min_free_kbytes for a BeagleBone Black (BBB) based custom embedded Linux board

We are using a custom Linux board based on the BeagleBone Black.
It has 256MB of RAM and 4GB of eMMC.
The root filesystem (RFS) of the project currently occupies 163MB, while the RFS partition is 500MB.
For testing, we added 20 large files (10MB each) and started the firmware upgrade process.

During the firmware upgrade we see the following error while the rootfs is being written (kernel log below).

We could solve it by changing `/proc/sys/vm/min_free_kbytes` from 2005 to 4096.

My question now is what the ideal value would be and which factors we should consider when calculating it. The kernel documentation does not give that information;
the one thing I could take from it is that the value must be neither too low nor too high, or the system will break.

Any suggestions or pointers?
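
For context on where a value like 2005 comes from: as far as I know, the kernel derives the boot-time default as roughly the integer square root of (low memory in kB x 16), clamped to a floor of 128 kB and, on kernels of this era, a ceiling of 65536 kB. The sketch below only reproduces that arithmetic. The function name `default_min_free_kbytes` is made up for illustration, and the 251972 kB input is the `managed:` figure from the 4.1 log later in this thread, not a value measured on 3.12.

```shell
#!/bin/sh
# Sketch: approximate the kernel's boot-time default for vm.min_free_kbytes.
# Assumption: default = int(sqrt(lowmem_kB * 16)), clamped to [128, 65536] kB.
# Newer kernels use a larger ceiling; adjust if that applies to you.
default_min_free_kbytes() {
    # $1: low memory managed by the kernel, in kB
    awk -v lowmem_kb="$1" 'BEGIN {
        v = int(sqrt(lowmem_kb * 16))
        if (v < 128)   v = 128
        if (v > 65536) v = 65536
        print v
    }'
}

# "managed:251972kB" is taken from the Normal zone line of the 4.1 log.
default_min_free_kbytes 251972   # prints 2007
```

For this board the result is 2007, within a few kB of the observed 2005. So the default simply scales with RAM size; it does not account for bursts of atomic higher-order allocations. Any larger value is a trade-off: more memory held in reserve for such allocations, less left for page cache and applications.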

[ 6676.674219] mmcqd/1: page allocation failure: order:1, mode:0x200020
[ 6676.674256] CPU: 0 PID: 612 Comm: mmcqd/1 Tainted: P O 3.12.10-005-ts-armv7l #2
[ 6676.674321] [<c0012d24>] (unwind_backtrace+0x0/0xf4) from [<c0011130>] (show_stack+0x10/0x14)
[ 6676.674355] [<c0011130>] (show_stack+0x10/0x14) from [<c0087548>] (warn_alloc_failed+0xe0/0x118)
[ 6676.674383] [<c0087548>] (warn_alloc_failed+0xe0/0x118) from [<c008a3ac>] (__alloc_pages_nodemask+0x74c/0x8f8)
[ 6676.674413] [<c008a3ac>] (__alloc_pages_nodemask+0x74c/0x8f8) from [<c00b2e8c>] (cache_alloc_refill+0x328/0x620)
[ 6676.674436] [<c00b2e8c>] (cache_alloc_refill+0x328/0x620) from [<c00b3224>] (__kmalloc+0xa0/0xe8)
[ 6676.674471] [<c00b3224>] (__kmalloc+0xa0/0xe8) from [<c0212904>] (edma_prep_slave_sg+0x84/0x388)
[ 6676.674505] [<c0212904>] (edma_prep_slave_sg+0x84/0x388) from [<c02ec0a0>] (omap_hsmmc_request+0x414/0x508)
[ 6676.674544] [<c02ec0a0>] (omap_hsmmc_request+0x414/0x508) from [<c02d6748>] (mmc_start_request+0xc4/0xe0)
[ 6676.674568] [<c02d6748>] (mmc_start_request+0xc4/0xe0) from [<c02d7530>] (mmc_start_req+0x2d8/0x38c)
[ 6676.674589] [<c02d7530>] (mmc_start_req+0x2d8/0x38c) from [<c02e4818>] (mmc_blk_issue_rw_rq+0xb4/0x9d8)
[ 6676.674611] [<c02e4818>] (mmc_blk_issue_rw_rq+0xb4/0x9d8) from [<c02e52e0>] (mmc_blk_issue_rq+0x1a4/0x468)
[ 6676.674631] [<c02e52e0>] (mmc_blk_issue_rq+0x1a4/0x468) from [<c02e5c68>] (mmc_queue_thread+0x88/0x118)
[ 6676.674657] [<c02e5c68>] (mmc_queue_thread+0x88/0x118) from [<c004d8b8>] (kthread+0xb4/0xb8)
[ 6676.674681] [<c004d8b8>] (kthread+0xb4/0xb8) from [<c000e298>] (ret_from_fork+0x14/0x3c)
[ 6676.674691] Mem-info:
[ 6676.674700] Normal per-cpu:
[ 6676.674711] CPU 0: hi: 90, btch: 15 usd: 79
[ 6676.674739] active_anon:4889 inactive_anon:13 isolated_anon:0
[ 6676.674739] active_file:8082 inactive_file:43196 isolated_file:0
[ 6676.674739] unevictable:422 dirty:2 writeback:1152 unstable:0
[ 6676.674739] free:3286 slab_reclaimable:1090 slab_unreclaimable:915
[ 6676.674739] mapped:1593 shmem:39 pagetables:181 bounce:0
[ 6676.674739] free_cma:1982
[ 6676.674800] Normal free:13144kB min:2004kB low:2504kB high:3004kB active_anon:19556kB inactive_anon:52kB active_file:32328kB inactive_file:172784kB unevictable:o
[ 6676.674813] lowmem_reserve[]: 0 0 0
[ 6676.674831] Normal: 2584*4kB (UMC) 217*8kB (C) 57*16kB (C) 5*32kB (C) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB = 13144kB
[ 6676.674885] 51661 total pagecache pages
[ 6676.674900] 0 pages in swap cache
[ 6676.674910] Swap cache stats: add 0, delete 0, find 0/0
[ 6676.674918] Free swap = 0kB
[ 6676.674925] Total swap = 0kB
[ 6676.674938] SLAB: Unable to allocate memory on node 0 (gfp=0x20)
[ 6676.674949] cache: kmalloc-8192, object size: 8192, order: 1
[ 6676.674962] node 0: slabs: 3/3, objs: 3/3, free: 0
[ 6676.674984] omap_hsmmc 481d8000.mmc: prep_slave_sg() failed
[ 6676.674997] omap_hsmmc 481d8000.mmc: MMC start dma failure
[ 6676.676181] mmcblk0: unknown error -1 sending read/write command, card status 0x900
[ 6676.676300] end_request: I/O error, dev mmcblk0, sector 27648
[ 6676.676318] Buffer I/O error on device mmcblk0p9, logical block 896
[ 6676.676329] lost page write due to I/O error on mmcblk0p9
[ 6676.676401] end_request: I/O error, dev mmcblk0, sector 27656
[ 6676.676415] Buffer I/O error on device mmcblk0p9, logical block 897
[ 6676.676425] lost page write due to I/O error on mmcblk0p9
[ 6676.676450] end_request: I/O error, dev mmcblk0, sector 27664
[ 6676.676461] Buffer I/O error on device mmcblk0p9, logical block 898
[ 6676.676471] lost page write due to I/O error on mmcblk0p9
[ 6676.676494] end_request: I/O error, dev mmcblk0, sector 27672
[ 6676.676505] Buffer I/O error on device mmcblk0p9, logical block 899
[ 6676.676515] lost page write due to I/O error on mmcblk0p9
[ 6676.676537] end_request: I/O error, dev mmcblk0, sector 27680
[ 6676.676548] Buffer I/O error on device mmcblk0p9, logical block 900
[ 6676.676558] lost page write due to I/O error on mmcblk0p9
[ 6676.676580] end_request: I/O error, dev mmcblk0, sector 27688
[ 6676.676591] Buffer I/O error on device mmcblk0p9, logical block 901
[ 6676.676601] lost page write due to I/O error on mmcblk0p9
[ 6676.676622] end_request: I/O error, dev mmcblk0, sector 27696
[ 6676.676634] Buffer I/O error on device mmcblk0p9, logical block 902
[ 6676.676643] lost page write due to I/O error on mmcblk0p9
[ 6676.676665] end_request: I/O error, dev mmcblk0, sector 27704
[ 6676.676676] Buffer I/O error on device mmcblk0p9, logical block 903
[ 6676.676685] lost page write due to I/O error on mmcblk0p9
[ 6676.676707] end_request: I/O error, dev mmcblk0, sector 27712
[ 6676.676718] Buffer I/O error on device mmcblk0p9, logical block 904
[ 6676.676728] lost page write due to I/O error on mmcblk0p9
[ 6676.676749] end_request: I/O error, dev mmcblk0, sector 27720
[ 6676.678266] mmcqd/1: page allocation failure: order:1, mode:0x200020
[ 6676.678285] CPU: 0 PID: 612 Comm: mmcqd/1 Tainted: P O 3.12.10-005-ts-armv7l #2
[ 6676.678330] [<c0012d24>] (unwind_backtrace+0x0/0xf4) from [<c0011130>] (show_stack+0x10/0x14)
[ 6676.678358] [<c0011130>] (show_stack+0x10/0x14) from [<c0087548>] (warn_alloc_failed+0xe0/0x118)
[ 6676.678385] [<c0087548>] (warn_alloc_failed+0xe0/0x118) from [<c008a3ac>] (__alloc_pages_nodemask+0x74c/0x8f8)
[ 6676.678412] [<c008a3ac>] (__alloc_pages_nodemask+0x74c/0x8f8) from [<c00b2e8c>] (cache_alloc_refill+0x328/0x620)
[ 6676.678434] [<c00b2e8c>] (cache_alloc_refill+0x328/0x620) from [<c00b3224>] (__kmalloc+0xa0/0xe8)
[ 6676.678464] [<c00b3224>] (__kmalloc+0xa0/0xe8) from [<c0212904>] (edma_prep_slave_sg+0x84/0x388)
[ 6676.678493] [<c0212904>] (edma_prep_slave_sg+0x84/0x388) from [<c02ec0a0>] (omap_hsmmc_request+0x414/0x508)
[ 6676.678524] [<c02ec0a0>] (omap_hsmmc_request+0x414/0x508) from [<c02d6748>] (mmc_start_request+0xc4/0xe0)
[ 6676.678547] [<c02d6748>] (mmc_start_request+0xc4/0xe0) from [<c02d7530>] (mmc_start_req+0x2d8/0x38c)
[ 6676.678568] [<c02d7530>] (mmc_start_req+0x2d8/0x38c) from [<c02e4994>] (mmc_blk_issue_rw_rq+0x230/0x9d8)
[ 6676.678589] [<c02e4994>] (mmc_blk_issue_rw_rq+0x230/0x9d8) from [<c02e52e0>] (mmc_blk_issue_rq+0x1a4/0x468)
[ 6676.678608] [<c02e52e0>] (mmc_blk_issue_rq+0x1a4/0x468) from [<c02e5c68>] (mmc_queue_thread+0x88/0x118)
[ 6676.678632] [<c02e5c68>] (mmc_queue_thread+0x88/0x118) from [<c004d8b8>] (kthread+0xb4/0xb8)
[ 6676.678655] [<c004d8b8>] (kthread+0xb4/0xb8) from [<c000e298>] (ret_from_fork+0x14/0x3c)
[ 6676.678664] Mem-info:
[ 6676.678672] Normal per-cpu:
[ 6676.678683] CPU 0: hi: 90, btch: 15 usd: 84
[ 6676.678709] active_anon:4889 inactive_anon:13 isolated_anon:0
[ 6676.678709] active_file:8082 inactive_file:43196 isolated_file:0
[ 6676.678709] unevictable:422 dirty:2 writeback:896 unstable:0
[ 6676.678709] free:3286 slab_reclaimable:1090 slab_unreclaimable:910
[ 6676.678709] mapped:1593 shmem:39 pagetables:181 bounce:0
[ 6676.678709] free_cma:1982
[ 6676.678764] Normal free:13144kB min:2004kB low:2504kB high:3004kB active_anon:19556kB inactive_anon:52kB active_file:32328kB inactive_file:172784kB unevictable:o
[ 6676.678776] lowmem_reserve[]: 0 0 0
[ 6676.678791] Normal: 2584*4kB (UMC) 217*8kB (C) 57*16kB (C) 5*32kB (C) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB = 13144kB
[ 6676.678842] 51661 total pagecache pages
[ 6676.678854] 0 pages in swap cache
[ 6676.678864] Swap cache stats: add 0, delete 0, find 0/0
[ 6676.678871] Free swap = 0kB
[ 6676.678878] Total swap = 0kB
[ 6676.678898] omap_hsmmc 481d8000.mmc: prep_slave_sg() failed
[ 6676.678911] omap_hsmmc 481d8000.mmc: MMC start dma failure
[ 6676.679631] mmcblk0: unknown error -1 sending read/write command, card status 0x900
[ 6676.681433] mmcqd/1: page allocation failure: order:1, mode:0x200020
[ 6676.681455] CPU: 0 PID: 612 Comm: mmcqd/1 Tainted: P O 3.12.10-005-ts-armv7l #2
[ 6676.681494] [<c0012d24>] (unwind_backtrace+0x0/0xf4) from [<c0011130>] (show_stack+0x10/0x14)
[ 6676.681523] [<c0011130>] (show_stack+0x10/0x14) from [<c0087548>] (warn_alloc_failed+0xe0/0x118)
[ 6676.681546] [<c0087548>] (warn_alloc_failed+0xe0/0x118) from [<c008a3ac>] (__alloc_pages_nodemask+0x74c/0x8f8)
[ 6676.681570] [<c008a3ac>] (__alloc_pages_nodemask+0x74c/0x8f8) from [<c00b2e8c>] (cache_alloc_refill+0x328/0x620)
[ 6676.681592] [<c00b2e8c>] (cache_alloc_refill+0x328/0x620) from [<c00b3224>] (__kmalloc+0xa0/0xe8)
[ 6676.681618] [<c00b3224>] (__kmalloc+0xa0/0xe8) from [<c0212904>] (edma_prep_slave_sg+0x84/0x388)
[ 6676.681644] [<c0212904>] (edma_prep_slave_sg+0x84/0x388) from [<c02ec0a0>] (omap_hsmmc_request+0x414/0x508)
[ 6676.681673] [<c02ec0a0>] (omap_hsmmc_request+0x414/0x508) from [<c02d6748>] (mmc_start_request+0xc4/0xe0)
[ 6676.681695] [<c02d6748>] (mmc_start_request+0xc4/0xe0) from [<c02d7530>] (mmc_start_req+0x2d8/0x38c)
[ 6676.681715] [<c02d7530>] (mmc_start_req+0x2d8/0x38c) from [<c02e4994>] (mmc_blk_issue_rw_rq+0x230/0x9d8)
[ 6676.681735] [<c02e4994>] (mmc_blk_issue_rw_rq+0x230/0x9d8) from [<c02e52e0>] (mmc_blk_issue_rq+0x1a4/0x468)
[ 6676.681755] [<c02e52e0>] (mmc_blk_issue_rq+0x1a4/0x468) from [<c02e5c68>] (mmc_queue_thread+0x88/0x118)
[ 6676.681778] [<c02e5c68>] (mmc_queue_thread+0x88/0x118) from [<c004d8b8>] (kthread+0xb4/0xb8)
[ 6676.681800] [<c004d8b8>] (kthread+0xb4/0xb8) from [<c000e298>] (ret_from_fork+0x14/0x3c)
[ 6676.681809] Mem-info:
[ 6676.681816] Normal per-cpu:
[ 6676.681826] CPU 0: hi: 90, btch: 15 usd: 88
[ 6676.681852] active_anon:4889 inactive_anon:13 isolated_anon:0
[ 6676.681852] active_file:8082 inactive_file:43196 isolated_file:0
[ 6676.681852] unevictable:422 dirty:2 writeback:768 unstable:0
[ 6676.681852] free:3286 slab_reclaimable:1090 slab_unreclaimable:906
[ 6676.681852] mapped:1593 shmem:39 pagetables:181 bounce:0
[ 6676.681852] free_cma:1982
[ 6676.681908] Normal free:13144kB min:2004kB low:2504kB high:3004kB active_anon:19556kB inactive_anon:52kB active_file:32328kB inactive_file:172784kB unevictable:o
2016-07-19T06:47:28.562553-04:00 kernel: [ 6676.681920] lowmem_reserve[]: 0 0 0
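
One way to read the `Normal:` line in the report above is to break it up per block size. The sketch below does that with awk. `summarize_buddy_line` is a hypothetical helper name, and the reading of the letters in parentheses (U = unmovable, M = movable, C = CMA) reflects my understanding of the kernel's free-list report rather than anything stated in this thread.

```shell
#!/bin/sh
# Sketch: list the non-empty free lists from a "Normal: N*SIZEkB (TYPES) ..."
# line of an allocation failure report, then recompute the total.
summarize_buddy_line() {
    # Reads one such line on stdin.
    awk '{
        total = 0
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^[0-9]+\*[0-9]+kB$/) {
                star  = index($i, "*")
                count = substr($i, 1, star - 1) + 0
                size  = substr($i, star + 1) + 0      # "8kB" + 0 -> 8
                tag   = ($(i + 1) ~ /^\(/) ? $(i + 1) : "-"
                total += count * size
                if (count > 0) printf "%dkB %d %s\n", size, count, tag
            }
        }
        printf "total_kB %d\n", total
    }'
}

# Line copied from the 3.12 log above.
printf '%s\n' 'Normal: 2584*4kB (UMC) 217*8kB (C) 57*16kB (C) 5*32kB (C) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB = 13144kB' \
    | summarize_buddy_line
```

For the line in the log, every free block of 8 kB or larger is tagged (C) only. If those blocks really are usable only for movable allocations, an atomic order:1 request from the DMA path could fail even with 13144 kB nominally free, which would fit the symptom.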

I’m not sure why you feel compelled to even bother playing with this. In 20-21 years of using Linux I’ve never had to set this manually. But . . .

https://www.google.com/search?q=%2Fproc%2Fsys%2Fvm%2Fmin_free_kbytes&ie=utf-8 ---->
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Performance_Tuning_Guide/s-memory-tunables.html ---->

This actually smells very much like the random mmc issues we saw on "3.8.x"
based images... mmc wasn't really fixed/solid till the 3.14.x timeline...
Not sure how much of that was back-ported to 3.12.10...

Regards,

Thank you for the reply, William Hermans.
I had already gone through the links you provided, and that is the reason I turned to the forums: to understand whether I am doing anything wrong,
or what the right approach is.

Ok, it has been my experience that it is best to let Linux handle such things as it sees fit. Granted this is the first time I’ve actually seen, or heard of this. Which tells me that again, this is something best left to Linux to work out.

I did years ago attempt to manually manage something similar ( in relation to GbE and frame sizes etc ) in an attempt to get better performance from my network adapters . . . and it did not end well. As I ended up making things much worse than what Linux was doing automatically. This option looks very similar in that you can easily end up making things much worse than they already are.

Thank you for the reply, Mr. Nelson.

Were there any patches or fixes that you remember for the 3.12 kernel?
We will move to the 4.1 kernel, but that is still in the pipeline, and the boards we are currently shipping have to work with 3.12.
With that in mind, do you have any suggestions for fixing this issue?

While testing, I also observed that after the error mentioned above, the following error appeared on the partition the board was booting from.

`
[ 194.912834] EXT4-fs error (device mmcblk0p15): ext4_journal_check_start:56: Detected aborted journal
[ 194.922558] EXT4-fs (mmcblk0p15): Remounting filesystem read-only

`

This error went away after rebooting the board. However, I think we must handle this case.
I am thinking of running e2fsck and rebooting the board when an error is reported while mounting the partition.
But what if the error cannot be recovered at all?
Do you have any suggestions on that?

Regards,
Ankur
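
A possible shape for the e2fsck-then-reboot handling described above is sketched here. It relies on e2fsck's documented exit status being a bit mask (1 = errors corrected, 2 = corrected and a reboot is advised, 4 = errors left uncorrected, 8 and above = e2fsck itself failed or was cancelled). The function name `fsck_action` and the three action words are invented for illustration. What "fail" should trigger, for example booting from a second rootfs partition, is a product decision that the tool does not make for you.

```shell
#!/bin/sh
# Sketch: map an e2fsck exit status to a recovery decision.
# Hypothetical actions: continue | reboot | fail
fsck_action() {
    rc=$1
    if [ "$rc" -ge 4 ]; then
        # Bit 4 (uncorrected errors) or any higher bit (operational error,
        # usage error, cancelled, library error) is set.
        echo "fail"
    elif [ $((rc & 2)) -ne 0 ]; then
        # Errors were corrected, but e2fsck asks for a reboot.
        echo "reboot"
    else
        # 0 = clean, 1 = errors corrected, no reboot needed.
        echo "continue"
    fi
}

# Intended use on the board, on an UNMOUNTED partition (not run here):
#   e2fsck -p /dev/mmcblk0p15
#   action=$(fsck_action $?)
fsck_action 0
fsck_action 2
fsck_action 4
```

Keeping the decision in a small function makes the unrecoverable case explicit, so the boot script can count failed attempts and stop rebooting in a loop.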

Thank you for the reply, Mr. Hermans.

I totally agree with you, and I too am unwilling to change that setting (at least not without understanding how to decide its value). However, I also have to fix those errors, which is why I posted my query here.

Regards,
Ankur

Hello Mr. Nelson,

This issue is also reproducible on the 4.1 kernel.
I found a use case in which it reproduces almost every time.

  1. Consider that we have two RFS partitions and the current root is /dev/mmcblk0p15. The other RFS partition, /dev/mmcblk0p16, is not in use.
  2. Create an RFS tar of 390MB (compressed; the uncompressed size is 487MB). I included some big files in it (35 files of 10MB each).
  3. Copy this big tar, e.g. rootfs.tar.gz, to the board.
  4. Mount the unused partition, i.e. /dev/mmcblk0p16, on /mnt/rfs_test.
  5. Delete the contents of /mnt/rfs_test/ using the command

rm -rf /mnt/rfs_test/*

  6. Untar rootfs.tar.gz to /mnt/rfs_test/

tar -xzvf /home/rootfs.tar.gz -C /mnt/rfs_test/

  7. From another ssh terminal run

while true; do dd if=/dev/mmcblk0p16 of=/dev/null bs=1M; done

  8. From yet another terminal observe the syslog or dmesg

while true; do clear; dmesg | tail -n 30; sleep 2; done

  9. Before the tar extraction completes you will see the error reproduced.
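
When repeating these steps many times, it may be easier to count the failures than to watch the tail of dmesg. The helper below is only a sketch: `scan_kernel_log` is a made-up name, and the two patterns are taken from the messages quoted in this thread ("page allocation failure" and "I/O error, dev"), so they may need adjusting for other kernel versions.

```shell
#!/bin/sh
# Sketch: count allocation failures and block-layer I/O errors in kernel
# log text supplied on stdin.
scan_kernel_log() {
    awk '
        /page allocation failure/ { alloc++ }
        /I\/O error, dev /        { io++ }     # end_request / blk_update_request
        END { printf "alloc_failures=%d io_errors=%d\n", alloc, io }
    '
}

# Intended use on the board:  dmesg | scan_kernel_log
# Small self-contained demonstration with lines quoted from the logs:
printf '%s\n' \
    '[ 6676.674219] mmcqd/1: page allocation failure: order:1, mode:0x200020' \
    '[ 6676.676300] end_request: I/O error, dev mmcblk0, sector 27648' \
    '[ 6676.676318] Buffer I/O error on device mmcblk0p9, logical block 896' \
    | scan_kernel_log
```

Running it before and after a change (a different min_free_kbytes, a patched kernel) gives two numbers to compare instead of a visual impression.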

So it doesn't seem to be an mmc driver issue, does it?
Do you have any suggestions or pointers for me?

Thank you,

Regards,
Ankur

I forgot to copy the kernel messages for the 4.1 kernel:
`

[ 4531.561953] mmcqd/1: page allocation failure: order:3, mode:0x204020
[ 4531.561989] CPU: 0 PID: 607 Comm: mmcqd/1 Not tainted 4.1.18-005-ts-armv7l #2
[ 4531.562001] Hardware name: Generic AM33XX (Flattened Device Tree)
[ 4531.562063] [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
[ 4531.562090] [] (show_stack) from [] (warn_alloc_failed+0xe4/0x120)
[ 4531.562115] [] (warn_alloc_failed) from [] (__alloc_pages_nodemask+0x53c/0x83c)
[ 4531.562139] [] (__alloc_pages_nodemask) from [] (cache_alloc_refill+0x2c8/0x53c)
[ 4531.562159] [] (cache_alloc_refill) from [] (__kmalloc+0xa8/0xe8)
[ 4531.562191] [] (__kmalloc) from [] (edma_prep_slave_sg+0x8c/0x2d4)
[ 4531.562221] [] (edma_prep_slave_sg) from [] (omap_hsmmc_request+0x420/0x50c)
[ 4531.562254] [] (omap_hsmmc_request) from [] (mmc_start_request+0xf4/0x11c)
[ 4531.562277] [] (mmc_start_request) from [] (mmc_start_req+0x288/0x394)
[ 4531.562308] [] (mmc_start_req) from [] (mmc_blk_issue_rw_rq+0xb4/0xaac)
[ 4531.562332] [] (mmc_blk_issue_rw_rq) from [] (mmc_blk_issue_rq+0xf8/0x4a8)
[ 4531.562351] [] (mmc_blk_issue_rq) from [] (mmc_queue_thread+0x94/0x130)
[ 4531.562377] [] (mmc_queue_thread) from [] (kthread+0xd4/0xec)
[ 4531.562399] [] (kthread) from [] (ret_from_fork+0x14/0x2c)
[ 4531.562409] Mem-Info:
[ 4531.562437] active_anon:4260 inactive_anon:17 isolated_anon:0
[ 4531.562437] active_file:13673 inactive_file:38058 isolated_file:0
[ 4531.562437] unevictable:422 dirty:1384 writeback:695 unstable:0
[ 4531.562437] slab_reclaimable:1709 slab_unreclaimable:1539
[ 4531.562437] mapped:1154 shmem:39 pagetables:170 bounce:0
[ 4531.562437] free:2420 free_pcp:54 free_cma:1600
[ 4531.562496] Normal free:9680kB min:2004kB low:2504kB high:3004kB active_anon:17040kB inactive_anon:68kB active_file:54692kB inactive_file:152232kB unevictable:1688kB isolated(anon):0kB isolated(file):0kB present:261120kB managed:251972kB mlocked:1688kB dirty:5536kB writeback:2780kB mapped:4616kB shmem:156kB slab_reclaimable:6836kB slab_unreclaimable:6156kB kernel_stack:880kB pagetables:680kB unstable:0kB bounce:0kB free_pcp:216kB local_pcp:216kB free_cma:6400kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 4531.562509] lowmem_reserve[]: 0 0 0
[ 4531.562527] Normal: 606*4kB (UEMRC) 453*8kB (UMRC) 165*16kB (C) 29*32kB (C) 1*64kB (C) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB = 9680kB
[ 4531.562584] 52118 total pagecache pages
[ 4531.562600] 0 pages in swap cache
[ 4531.562609] Swap cache stats: add 0, delete 0, find 0/0
[ 4531.562617] Free swap = 0kB
[ 4531.562624] Total swap = 0kB
[ 4531.562631] 65280 pages RAM
[ 4531.562639] 0 pages HighMem/MovableOnly
[ 4531.562646] 4294965487 pages reserved
[ 4531.562654] 4096 pages cma reserved
[ 4531.562696] edma-dma-engine edma-dma-engine.0: edma_prep_slave_sg: Failed to allocate a descriptor
[ 4531.562714] omap_hsmmc 481d8000.mmc: prep_slave_sg() failed
[ 4531.562726] omap_hsmmc 481d8000.mmc: MMC start dma failure
[ 4531.571575] mmcblk0: unknown error -1 sending read/write command, card status 0x900
[ 4531.571679] mmc1: tried to reset card
[ 4531.571696] blk_update_request: 229 callbacks suppressed
[ 4531.571707] blk_update_request: I/O error, dev mmcblk0, sector 1357784
[ 4531.571734] EXT4-fs warning (device mmcblk0p16): ext4_end_bio:332: I/O error -5 writing to inode 71780 (offset 0 size 0 starting block 678895)
[ 4531.571751] buffer_io_error: 396 callbacks suppressed
[ 4531.571764] Buffer I/O error on device mmcblk0p16, logical block 138220
[ 4531.571775] Buffer I/O error on device mmcblk0p16, logical block 138221
[ 4531.571786] Buffer I/O error on device mmcblk0p16, logical block 138222
[ 4531.572243] blk_update_request: I/O error, dev mmcblk0, sector 1357790
[ 4531.572265] EXT4-fs warning (device mmcblk0p16): ext4_end_bio:332: I/O error -5 writing to inode 71781 (offset 0 size 0 starting block 678898)
[ 4531.572282] Buffer I/O error on device mmcblk0p16, logical block 138223
[ 4531.572293] Buffer I/O error on device mmcblk0p16, logical block 138224
[ 4531.572304] Buffer I/O error on device mmcblk0p16, logical block 138225
[ 4531.572498] blk_update_request: I/O error, dev mmcblk0, sector 1357796
[ 4531.572518] EXT4-fs warning (device mmcblk0p16): ext4_end_bio:332: I/O error -5 writing to inode 71782 (offset 0 size 0 starting block 678901)
[ 4531.572534] Buffer I/O error on device mmcblk0p16, logical block 138226
[ 4531.572545] Buffer I/O error on device mmcblk0p16, logical block 138227
[ 4531.572556] Buffer I/O error on device mmcblk0p16, logical block 138228
[ 4531.572703] blk_update_request: I/O error, dev mmcblk0, sector 1357802
[ 4531.572720] EXT4-fs warning (device mmcblk0p16): ext4_end_bio:332: I/O error -5 writing to inode 71785 (offset 0 size 0 starting block 678904)
[ 4531.572734] Buffer I/O error on device mmcblk0p16, logical block 138229
[ 4531.572863] blk_update_request: I/O error, dev mmcblk0, sector 1357808
[ 4531.572881] EXT4-fs warning (device mmcblk0p16): ext4_end_bio:332: I/O error -5 writing to inode 71786 (offset 0 size 0 starting block 678907)
[ 4531.573010] blk_update_request: I/O error, dev mmcblk0, sector 1357814
[ 4531.573027] EXT4-fs warning (device mmcblk0p16): ext4_end_bio:332: I/O error -5 writing to inode 71787 (offset 0 size 0 starting block 678910)
[ 4531.573155] blk_update_request: I/O error, dev mmcblk0, sector 1357820
[ 4531.573172] EXT4-fs warning (device mmcblk0p16): ext4_end_bio:332: I/O error -5 writing to inode 71788 (offset 0 size 0 starting block 678913)
[ 4531.573299] blk_update_request: I/O error, dev mmcblk0, sector 1357826
[ 4531.573316] EXT4-fs warning (device mmcblk0p16): ext4_end_bio:332: I/O error -5 writing to inode 71789 (offset 0 size 0 starting block 678916)
[ 4531.573442] blk_update_request: I/O error, dev mmcblk0, sector 1357832
[ 4531.573459] EXT4-fs warning (device mmcblk0p16): ext4_end_bio:332: I/O error -5 writing to inode 71790 (offset 0 size 0 starting block 678919)
[ 4531.573583] blk_update_request: I/O error, dev mmcblk0, sector 1357838
[ 4531.573600] EXT4-fs warning (device mmcblk0p16): ext4_end_bio:332: I/O error -5 writing to inode 71791 (offset 0 size 0 starting block 678922)
`
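
A note on the `order:` field in the first line of each report: it is the base-2 logarithm of the number of contiguous pages requested. The helper below is a small sketch of that arithmetic. `order_to_kb` is a hypothetical name, and the 4 kB page size is an assumption, though it is consistent with the log ("65280 pages RAM" against "present:261120kB").

```shell
#!/bin/sh
# Sketch: size of a page allocation of a given order, in kB.
# $1: order as printed in "page allocation failure: order:N"
# $2: page size in kB (assumed 4 if omitted)
order_to_kb() {
    order=$1
    page_kb=${2:-4}
    echo $(( (1 << order) * page_kb ))
}

order_to_kb 1   # 3.12 report, order:1 -> 8  (matches "kmalloc-8192")
order_to_kb 3   # 4.1 report,  order:3 -> 32
```

So the 4.1 kernel is asking for 32 kB of physically contiguous memory in the same code path where 3.12 asked for 8 kB, and larger contiguous requests are harder to satisfy once memory is fragmented.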

Hello Mr. Nelson,

We made some progress. With the two patches below and CMA=64 set, the issue no longer reproduces.

https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/drivers/dma/edma.c?id=5ca9e7ce6eebec53362ff779264143860ccf68cd
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/drivers/dma/edma.c?id=e3ddc979465118b7ba46fed7cd10f4421edc3049

With these changes it seems to work; however, the problem now is that extracting the tar file (460MB) takes 1 hour, as opposed to 4 minutes without the above patches.

The above two patches mostly free memory in the edma driver, so the root cause seems to be a memory leak in the edma driver (kernel version 3.12).
Even though the two patches solve the issue, they cause a major performance hit. Do you know of any other kernel patch for the edma driver on kernel 3.12?

Thank you,

Regards,
Ankur

Sorry Ankur,

I've been focusing on v4.9.x. Does this problem still exist on a v4.9.x
based kernel?

Regards,

I haven’t had a chance to test it on Linux 4.9, and I don’t see a possibility of checking it soon.

Once I test it, I will update the status here.

Regards,
Ankur