3

Most of the user-space processes end up in D-state after the unit has been running for around 3-4 days; the unit runs on an ARM processor. From the top output we can see that the processes in D-state are waiting in the kernel functions "page_fault" and "squashfs_readpage". Ultimately this leads to a watchdog reset. The processes that go into D-state take an unusually long time to recover.
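For reference, one way to confirm where the D-state tasks are blocked in the kernel (a minimal sketch; it assumes a procps-style ps and, for the stack dump, a kernel built with CONFIG_STACKTRACE so that /proc/<pid>/stack exists):

# List tasks in uninterruptible sleep together with the kernel function
# they are waiting in (the WCHAN column).
ps -eo pid,stat,wchan:30,comm | awk '$2 ~ /^D/'

# Dump the full in-kernel stack of one stuck task (replace <pid> with a real PID).
cat /proc/<pid>/stack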

Following is the top output when the system ends up in trouble:

top - 12:00:11 up 3 days,  2:40,  3 users,  load average: 2.77, 1.90, 1.72
Tasks: 250 total,   3 running, 238 sleeping,   0 stopped,   9 zombie
Cpu(s): 10.0% us, 75.5% sy,  0.0% ni,  0.0% id, 10.3% wa,  0.0% hi,  4.2% si
Mem:    191324k total,   188896k used,     2428k free,     2548k buffers
Swap:        0k total,        0k used,        0k free,    87920k cached



 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND

 1003 root      20   0  225m  31m 4044 S 15.2 16.7   0:21.91 user_process_1             
 3745 root      20   0 80776 9476 3196 **D**  9.0  5.0   1:31.79 user_process_2                
  129 root      15  -5     0    0    0 S  7.4  0.0   0:27.65 **mtdblockd**          
 4624 root      20   0  3640  256  160 **D**  6.5  0.1   0:00.20 GetCounters_cus    
    3 root      15  -5     0    0    0 S  3.2  0.0  43:38.73 ksoftirqd/0        
31363 root      20   0  2356 1176  792 R  2.6  0.6  40:09.58 top                
  347 root      30  10     0    0    0 S  1.9  0.0  28:56.04 **jffs2_gcd_mtd3**     
 1169 root      20   0  225m  31m 4044 S  1.9 16.7  39:31.36 user_process_1             
  604 root      20   0     0    0    0 S  1.6  0.0  27:22.76 user_process_3           
 1069 root     -23   0  225m  31m 4044 S  1.3 16.7  20:45.39 user_process_1             
 4545 root      20   0  3640  564  468 S  1.0  0.3   0:00.08 GetCounters_cus    
   64 root      15  -5     0    0    0 **D**  0.3  0.0   0:00.83 **kswapd0**            
  969 root      20   0 20780 1856 1376 S  0.3  1.0  14:18.89 user_process_4             
  973 root      20   0  225m  31m 4044 S  0.3 16.7   3:35.74 user_process_1             
 1070 root     -23   0  225m  31m 4044 S  0.3 16.7  16:41.04 user_process_1             
 1151 root     -81   0  225m  31m 4044 S  0.3 16.7  23:13.05 user_process_1             
 1152 root     -99   0  225m  31m 4044 S  0.3 16.7   8:48.47 user_process_1  

One more interesting observation is that when the system runs into this problem, we consistently see the "mtdblockd" process running in the top output. We have swap disabled on this unit, and there is no apparent memory leak in the unit.

Any idea what the possible reasons could be for the processes getting stuck in D-state?

  • Probably I/O errors on the flash, and bugs in the squashfs error handling code. – CL. Jun 19 '13 at 09:11
  • We have replaced the flash device and tested the scenario again, but there is no change in the behavior. One thing that still remains unanswered is why this always happens after 3 days. I will check the squashfs error handling. – user2500217 Jun 19 '13 at 10:34
  • I have checked for any squashfs-related errors in the log, but there are none. The only problem seen here is that processes are going into D-state because squashfs_readpage is taking more time than usual to complete. Can this happen? – user2500217 Jun 19 '13 at 11:52
  • The squashfs error handling code would log errors if it were working correctly. Do these squashfs_readpage calls ever complete? – CL. Jun 19 '13 at 11:54
  • Yes, they complete, but they take longer than usual. – user2500217 Jun 19 '13 at 12:00

1 Answer

3

D-state means the processes are stuck in the kernel in a TASK_UNINTERRUPTIBLE sleep. This is unlikely to be caused by bugs in the Squashfs error handling code, because if a process exited Squashfs while still holding a mutex, the system would quickly grind to a halt as other processes entered Squashfs and slept forever waiting for that mutex. You would also see a low load average and low system time, as most processes would be sleeping. Furthermore, there is no evidence that Squashfs has hit any I/O errors.
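One way to check this directly (a sketch, not something from the original report; it assumes root and a kernel built with CONFIG_MAGIC_SYSRQ) is to ask the kernel to dump the stacks of all blocked tasks:

echo 1 > /proc/sys/kernel/sysrq     # make sure SysRq is enabled
echo w > /proc/sysrq-trigger        # dump stack traces of all blocked (D-state) tasks
dmesg | tail -n 100                 # the traces appear in the kernel log

If every trace sat in mutex_lock() inside Squashfs and never changed, that would point at a lost mutex; traces that keep changing between samples are what you would expect from slow but progressing I/O and decompression.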

The load average (2.77) and system time (75.5%) are extremely high. Coupled with the fact that a lot of processes are in squashfs_readpage (which is completing, just slowly), this indicates the system is thrashing. There is too little memory, and the system is spending all its time constantly (re-)demand-paging pages from disk. This accounts for the fact that a lot of processes are in squashfs_readpage; system time is extremely high because the system is spending most of its time in Squashfs doing the CPU-intensive work of decompression. The other processes are stuck in Squashfs waiting on the decompressor mutex (only one process can be decompressing at a time, because the decompressor state is shared).
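A simple way to confirm the thrashing (a sketch; it only reads /proc, so it should work even on a minimal BusyBox userland) is to watch how fast the fault and page-in counters climb:

# Print the cumulative major-fault and page-in counters every 5 seconds.
# On a thrashing system the deltas between samples stay large and never settle.
while true; do
    grep -E '^(pgmajfault|pgpgin) ' /proc/vmstat
    sleep 5
done

Sustained large deltas, combined with almost no free memory and very high system time, are the classic signature of the re-read-and-re-decompress cycle described above.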

  • Phil, you have hit the nail on the head!! Your observation and explanation fit the problem we are seeing in our system perfectly. There is a lot of thrashing, and a memory crunch is also being observed. But the irony is that we were not seeing any memory leaks in the system. Yesterday we found out that a daemon is continuously writing to the /tmp directory (tmpfs), which is mounted in RAM. This is continuously eating up memory. I suppose this could be the root cause of this issue. Do you agree? – user2500217 Jun 20 '13 at 08:14
  • Yes, it is highly probable that is the root cause. Tmpfs filesystem content lives entirely in the page cache and is not shrinkable, which will cause exactly the kind of page cache pressure and flushing you are seeing. (A short sketch for checking and capping tmpfs usage follows these comments.) – Phillip Lougher Jun 20 '13 at 15:03
  • Thanks, Phil. We have made a correction in the code and will update with the results. – user2500217 Jun 21 '13 at 07:00
  • This problem is fixed with this change. Thanks a lot! – user2500217 Jul 02 '13 at 12:36
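For completeness, a sketch of how the tmpfs angle can be checked and contained (the 32m size below is purely an illustration, not a value from this discussion):

df -k /tmp                       # how much of the tmpfs is currently filled
grep -i '^Shmem' /proc/meminfo   # tmpfs/shared pages held in RAM (recent kernels only)
mount -o remount,size=32m /tmp   # cap the tmpfs so a runaway writer cannot eat all RAM

Capping the size does not fix a daemon that keeps writing, but it turns silent memory exhaustion into an obvious ENOSPC error in that daemon.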