
I am a beginner with the Linux kernel. I modified the scheduling sub-module of Linux kernel 3.3 and tried to boot the kernel on a Beagleboard, but I hit an "Inconsistent lock state" error. Could anyone help me analyze the following debug info? Thanks!

[    0.163452] =================================
[    0.167999] [ INFO: inconsistent lock state ]
[    0.172576] 3.3.0-rc7-00008-g8bd3d32-dirty #27 Not tainted
[    0.178314] ---------------------------------
[    0.182891] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[    0.189178] swapper/0/0 [HC0[0]:SC0[0]:HE1:SE1] takes:
[    0.194549]  (&rq->lock){?.....}, at: [<c00739b8>] wake_up_new_task+0xb4/0x1d4
[    0.202117] {IN-HARDIRQ-W} state was registered at:
[    0.207214]   [<c008fa7c>] __lock_acquire+0xc4c/0x1e28
[    0.212615]   [<c00912dc>] lock_acquire+0x98/0x100
[    0.217651]   [<c04761bc>] _raw_spin_lock+0x2c/0x3c
[    0.222778]   [<c0072890>] scheduler_tick+0x34/0x134
[    0.227996]   [<c0052738>] update_process_times+0x58/0x68
[    0.233642]   [<c00881ac>] tick_periodic+0x48/0xc4
[    0.238677]   [<c00882c0>] tick_handle_periodic+0x24/0x98
[    0.244323]   [<c00264a0>] omap2_gp_timer_interrupt+0x24/0x34
[    0.250366]   [<c00a1344>] handle_irq_event_percpu+0x5c/0x22c
[    0.256378]   [<c00a1550>] handle_irq_event+0x3c/0x5c
[    0.261657]   [<c00a39f8>] handle_level_irq+0xac/0x138
[    0.267059]   [<c00a0b58>] generic_handle_irq+0x30/0x48
[    0.272521]   [<c0014d18>] handle_IRQ+0x4c/0xac
[    0.277313]   [<c000872c>] omap3_intc_handle_irq+0x44/0x4c
[    0.283050]   [<c0476a64>] __irq_svc+0x44/0x60
[    0.287719]   [<c06286e4>] start_kernel+0x204/0x354
[    0.292846]   [<80008044>] 0x80008044
[    0.296691] irq event stamp: 2802
[    0.300201] hardirqs last  enabled at (2801): [<c0476348>] _raw_write_unlock_irq+0x24/0x2c
[    0.308807] hardirqs last disabled at (2802): [<c0476294>] _raw_spin_lock_irqsave+0x1c/0x58
[    0.317504] softirqs last  enabled at (2756): [<c004a6dc>] irq_exit+0x94/0x9c
[    0.324951] softirqs last disabled at (2751): [<c004a6dc>] irq_exit+0x94/0x9c
[    0.332397] 
[    0.332397] other info that might help us debug this:
[    0.339294]  Possible unsafe locking scenario:
[    0.339294] 
[    0.345581]        CPU0
[    0.348175]        ----
[    0.350769]   lock(&rq->lock);
[    0.354003]   <Interrupt>
[    0.356781]     lock(&rq->lock);
[    0.360198] 
[    0.360198]  *** DEADLOCK ***
[    0.360229] 
[    0.366577] 2 locks held by swapper/0/0:
[    0.370697]  #0:  (&p->pi_lock){+.....}, at: [<c0073920>] wake_up_new_task+0x1c/0x1d4
[    0.378875]  #1:  (&rq->lock){?.....}, at: [<c00739b8>] wake_up_new_task+0xb4/0x1d4
[    0.386871] 
[    0.386871] stack backtrace:
[    0.391571] [<c001b7a8>] (unwind_backtrace+0x0/0xf0) from [<c008e6fc>] (print_usage_bug+0x1d8/0x)
[    0.401062] [<c008e6fc>] (print_usage_bug+0x1d8/0x2c0) from [<c008ebac>] (mark_lock+0x3c8/0x64c)
[    0.410217] [<c008ebac>] (mark_lock+0x3c8/0x64c) from [<c0091c24>] (mark_held_locks+0xb0/0x144)
[    0.419250] [<c0091c24>] (mark_held_locks+0xb0/0x144) from [<c0091d60>] (trace_hardirqs_on_calle)
[    0.429565] [<c0091d60>] (trace_hardirqs_on_caller+0xa8/0x19c) from [<c000f284>] (do_vfp+0x8/0x2)
[   12.759429] BUG: spinlock lockup on CPU#0, swapper/0/0
[   12.764801]  lock: c0de0380, .magic: dead4ead, .owner: swapper/0/0, .owner_cpu: 0
[   12.772613] [<c001b7a8>] (unwind_backtrace+0x0/0xf0) from [<c0261600>] (do_raw_spin_lock+0xa0/0x)
[   12.782135] [<c0261600>] (do_raw_spin_lock+0xa0/0x134) from [<c0072890>] (scheduler_tick+0x34/0x)
[   12.791625] [<c0072890>] (scheduler_tick+0x34/0x134) from [<c0052738>] (update_process_times+0x5)
[   12.801391] [<c0052738>] (update_process_times+0x58/0x68) from [<c00881ac>] (tick_periodic+0x48/)
[   12.811004] [<c00881ac>] (tick_periodic+0x48/0xc4) from [<c00882c0>] (tick_handle_periodic+0x24/)
[   12.820587] [<c00882c0>] (tick_handle_periodic+0x24/0x98) from [<c00264a0>] (omap2_gp_timer_inte)
[   12.831176] [<c00264a0>] (omap2_gp_timer_interrupt+0x24/0x34) from [<c00a1344>] (handle_irq_even)
[   12.842102] [<c00a1344>] (handle_irq_event_percpu+0x5c/0x22c) from [<c00a1550>] (handle_irq_even)
[   12.852325] [<c00a1550>] (handle_irq_event+0x3c/0x5c) from [<c00a39f8>] (handle_level_irq+0xac/0)
[   12.861907] [<c00a39f8>] (handle_level_irq+0xac/0x138) from [<c00a0b58>] (generic_handle_irq+0x3)
[   12.871704] [<c00a0b58>] (generic_handle_irq+0x30/0x48) from [<c0014d18>] (handle_IRQ+0x4c/0xac)
[   12.880828] [<c0014d18>] (handle_IRQ+0x4c/0xac) from [<c000872c>] (omap3_intc_handle_irq+0x44/0x)
[   12.890258] [<c000872c>] (omap3_intc_handle_irq+0x44/0x4c) from [<c0476a64>] (__irq_svc+0x44/0x6)
[   12.899566] Exception stack(0xc0677d98 to 0xc0677de0)
[   12.904846] 7d80:                                                       edd47a1a 00000000
[   12.913360] 7da0: c0076e30 c0692630 c0693ad8 600001d3 c0676050 00000001 00000a00 c0476acc
[   12.921874] 7dc0: c0676000 00000468 c0704b70 c0677de0 c0476ac4 c000f290 60000153 ffffffff
[   12.930419] [<c0476a64>] (__irq_svc+0x44/0x60) from [<c000f290>] (do_vfp+0x14/0x20)
[   12.938385] [<c000f290>] (do_vfp+0x14/0x20) from [<c0476ac4>] (__und_svc+0x44/0x80)
[   12.946380] [<c0476ac4>] (__und_svc+0x44/0x80) from [<c0076e30>] (enqueue_task_fair+0x1dc/0x5e8)
[   12.955505] [<c0076e30>] (enqueue_task_fair+0x1dc/0x5e8) from [<c00702c0>] (enqueue_task+0x64/0x)
[   12.964935] [<c00702c0>] (enqueue_task+0x64/0x74) from [<c00739e0>] (wake_up_new_task+0xdc/0x1d4)
[   12.974182] [<c00739e0>] (wake_up_new_task+0xdc/0x1d4) from [<c0042120>] (do_fork+0xe4/0x328)
[   12.983062] [<c0042120>] (do_fork+0xe4/0x328) from [<c0015068>] (kernel_thread+0x6c/0x7c)
[   12.991577] [<c0015068>] (kernel_thread+0x6c/0x7c) from [<c0464af4>] (rest_init+0x1c/0xd0)
[   13.000183] [<c0464af4>] (rest_init+0x1c/0xd0) from [<c06287c4>] (start_kernel+0x2e4/0x354)
[   13.008880] [<c06287c4>] (start_kernel+0x2e4/0x354) from [<80008044>] (0x80008044)
    What specific modifications did you make to the scheduling module (apparently something involving the spinlock in `scheduler_tick`)? – Vilhelm Gray May 21 '13 at 17:38

1 Answer


Firstly, this is a lockdep warning, which indicates that your kernel may have an inconsistent locking pattern that could lead to a deadlock. lockdep groups locks into logical classes and checks a set of rules against them; if any rule is violated, it warns. lockdep basically has two sorts of rules: (a) single-lock state rules and (b) multi-lock dependency rules. From the warning log you've provided:

[    0.370697]  #0:  (&p->pi_lock){+.....}, at: [<c0073920>] wake_up_new_task+0x1c/0x1d4
[    0.378875]  #1:  (&rq->lock){?.....}, at: [<c00739b8>] wake_up_new_task+0xb4/0x1d4

lockdep treats both p->pi_lock and rq->lock as the same logical class; the first lock is taken in try_to_wake_up() and the latter in __task_rq_lock(). So the whole thing becomes:

    try_to_wake_up() -> p->pi_lock is held
        ttwu_remote()
            __task_rq_lock() -> rq->lock is held

So two locks of the same class are taken along a single path, and lockdep warns about this as part of its multi-lock dependency rules.

NOTE: I don't know what modifications you've made, so I can't tell whether the warning is a false positive or not. All I've tried to do is explain the issue.

  • Thanks for the analysis. I have enabled floating-point usage in kernel space, which could be a reason for the deadlock, as discussed in this link: http://stackoverflow.com/questions/13886338/use-of-floating-point-in-the-linux-kernel. I guess the deadlock is one type of the so-called "kernel can trap into itself" problem. – coder Jun 09 '13 at 12:11
  • Perhaps; you know best what you've modified. I only tried to give an overview of the warning. – rakib_ Jun 09 '13 at 16:11