[PATCH 1/3] ima: Remove inode lock

Roberto Sassu roberto.sassu at huaweicloud.com
Wed Oct 16 09:59:22 UTC 2024


On Fri, 2024-10-11 at 15:30 +0200, Roberto Sassu wrote:
> On Wed, 2024-10-09 at 18:43 +0200, Roberto Sassu wrote:
> > On Wed, 2024-10-09 at 18:25 +0200, Roberto Sassu wrote:
> > > On Wed, 2024-10-09 at 11:37 -0400, Paul Moore wrote:
> > > > On Wed, Oct 9, 2024 at 11:36 AM Paul Moore <paul at paul-moore.com> wrote:
> > > > > On Tue, Oct 8, 2024 at 12:57 PM Roberto Sassu
> > > > > <roberto.sassu at huaweicloud.com> wrote:
> > > > > > 
> > > > > > From: Roberto Sassu <roberto.sassu at huawei.com>
> > > > > > 
> > > > > > Move out the mutex in the ima_iint_cache structure to a new structure
> > > > > > called ima_iint_cache_lock, so that a lock can be taken regardless of
> > > > > > whether or not inode integrity metadata are stored in the inode.
> > > > > > 
> > > > > > Introduce ima_inode_security() to simplify accessing the new structure in
> > > > > > the inode security blob.
> > > > > > 
> > > > > > Move the mutex initialization and annotation in the new function
> > > > > > ima_inode_alloc_security() and introduce ima_iint_lock() and
> > > > > > ima_iint_unlock() to respectively lock and unlock the mutex.
> > > > > > 
> > > > > > Finally, expand the critical region in process_measurement() guarded by
> > > > > > iint->mutex up to where the inode was locked, use only one iint lock in
> > > > > > __ima_inode_hash(), since the mutex is now in the inode security blob, and
> > > > > > replace the inode_lock()/inode_unlock() calls in ima_check_last_writer().
> > > > > > 
> > > > > > Signed-off-by: Roberto Sassu <roberto.sassu at huawei.com>
> > > > > > ---
> > > > > >  security/integrity/ima/ima.h      | 26 ++++++++---
> > > > > >  security/integrity/ima/ima_api.c  |  4 +-
> > > > > >  security/integrity/ima/ima_iint.c | 77 ++++++++++++++++++++++++++-----
> > > > > >  security/integrity/ima/ima_main.c | 39 +++++++---------
> > > > > >  4 files changed, 104 insertions(+), 42 deletions(-)
> > > > > 
> > > > > I'm not an IMA expert, but it looks reasonable to me, although
> > > > > shouldn't this carry a stable CC in the patch metadata?
> > > > > 
> > > > > Reviewed-by: Paul Moore <paul at paul-moore.com>
> > > > 
> > > > Sorry, one more thing ... did you verify this patchset resolves the
> > > > syzbot problem?  I saw at least one reproducer.
> > > 
> > > Uhm, could not reproduce the deadlock with the reproducer. However,
> > > without the patch I have a lockdep warning, and with I don't.
> > > 
> > > I asked syzbot to try the patches. Let's see.
> > 
> > I actually got a different lockdep warning:
> > 
> > [  904.603365] ======================================================
> > [  904.604264] WARNING: possible circular locking dependency detected
> > [  904.605141] 6.12.0-rc2+ #20 Not tainted
> > [  904.605697] ------------------------------------------------------
> 
> I can reproduce by executing the syzbot reproducer and in another
> terminal by logging in with SSH (not all the times).
> 
> If I understood what the lockdep warning means, this is the scenario.
> 
> A task accesses a seq_file which is in the IMA policy, so we enter the
> critical region guarded by the iint lock. But before we get the chance
> to measure the file, a second task calls remap_file_pages() on the same
> seq_file.
> 
> remap_file_pages() takes the mmap lock and, if the event matches the
> IMA policy too, the second task waits for the iint lock to be released.
> 
> Now, the first task starts to measure the seq_file and takes the
> seq_file lock. I don't know if 3 processes must be involved, but I was
> thinking that reading the seq_file from the first task can trigger a
> page fault, which requires to take the mmap lock.
> 
> At this point, we reach a deadlock. The first task waits for the mmap
> lock to be released, and the second waits for the iint lock to be
> released, which both cannot happen.

+ mm, mm folks

(I leave the lockdep warning down for the new people included in the
thread).

I think I understood what the problem is. Any lock that is taken inside
security_file_mmap() would cause lock inversion since:

vm_mmap_pgoff():

	ret = security_mmap_file(file, prot, flag);
	if (!ret) {
		if (mmap_write_lock_killable(mm))



SYSCALL_DEFINE5(remap_file_pages, ...):

	if (mmap_write_lock_killable(mm))
		return -EINTR;

[...]

	file = get_file(vma->vm_file);
	ret = security_mmap_file(vma->vm_file, prot, flags);
	if (ret)
		goto out_fput;


Since I didn't see a good way to change the order of the second, I
changed the order of the first, i.e. I put security_file_mmap() under
the mmap lock.

(I don't know if this is a good idea.)

I cannot reproduce the lockdep warning anymore.

@mm folks, what is your opinion?

Thanks

Roberto

> Roberto
> 
> > [  904.606577] systemd/1 is trying to acquire lock:
> > [  904.607227] ffff88810e5c2580 (&p->lock){+.+.}-{4:4}, at: seq_read_iter+0x62/0x6b0
> > [  904.608290] 
> > [  904.608290] but task is already holding lock:
> > [  904.609105] ffff88810f4abf20 (&ima_iint_lock_mutex_key[depth]){+.+.}-{4:4}, at: ima_iint_lock+0x24/0x40
> > [  904.610429] 
> > [  904.610429] which lock already depends on the new lock.
> > [  904.610429] 
> > [  904.611574] 
> > [  904.611574] the existing dependency chain (in reverse order) is:
> > [  904.612628] 
> > [  904.612628] -> #2 (&ima_iint_lock_mutex_key[depth]){+.+.}-{4:4}:
> > [  904.613681]        __mutex_lock+0xaf/0x760
> > [  904.614266]        mutex_lock_nested+0x27/0x40
> > [  904.614897]        ima_iint_lock+0x24/0x40
> > [  904.615490]        process_measurement+0x176/0xef0
> > [  904.616168]        ima_file_mmap+0x98/0x120
> > [  904.616767]        security_mmap_file+0x408/0x560
> > [  904.617444]        __do_sys_remap_file_pages+0x2fa/0x4c0
> > [  904.618194]        __x64_sys_remap_file_pages+0x29/0x40
> > [  904.618937]        x64_sys_call+0x6e8/0x4550
> > [  904.619546]        do_syscall_64+0x71/0x180
> > [  904.620155]        entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > [  904.620952] 
> > [  904.620952] -> #1 (&mm->mmap_lock){++++}-{4:4}:
> > [  904.621813]        __might_fault+0x6f/0xb0
> > [  904.622400]        _copy_to_iter+0x12e/0xa80
> > [  904.623009]        seq_read_iter+0x593/0x6b0
> > [  904.623629]        proc_reg_read_iter+0x31/0xe0
> > [  904.624276]        vfs_read+0x256/0x3d0
> > [  904.624822]        ksys_read+0x6d/0x160
> > [  904.625372]        __x64_sys_read+0x1d/0x30
> > [  904.625964]        x64_sys_call+0x1068/0x4550
> > [  904.626594]        do_syscall_64+0x71/0x180
> > [  904.627188]        entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > [  904.627975] 
> > [  904.627975] -> #0 (&p->lock){+.+.}-{4:4}:
> > [  904.628787]        __lock_acquire+0x17f3/0x2320
> > [  904.629432]        lock_acquire+0xf2/0x420
> > [  904.630013]        __mutex_lock+0xaf/0x760
> > [  904.630596]        mutex_lock_nested+0x27/0x40
> > [  904.631225]        seq_read_iter+0x62/0x6b0
> > [  904.631831]        kernfs_fop_read_iter+0x1ef/0x2c0
> > [  904.632599]        __kernel_read+0x113/0x350
> > [  904.633206]        integrity_kernel_read+0x23/0x40
> > [  904.633902]        ima_calc_file_hash_tfm+0x14e/0x230
> > [  904.634621]        ima_calc_file_hash+0x97/0x250
> > [  904.635281]        ima_collect_measurement+0x4be/0x530
> > [  904.636008]        process_measurement+0x7c0/0xef0
> > [  904.636689]        ima_file_check+0x65/0x80
> > [  904.637295]        security_file_post_open+0xb1/0x1b0
> > [  904.638008]        path_openat+0x216/0x1280
> > [  904.638605]        do_filp_open+0xab/0x140
> > [  904.639185]        do_sys_openat2+0xba/0x120
> > [  904.639805]        do_sys_open+0x4c/0x80
> > [  904.640366]        __x64_sys_openat+0x23/0x30
> > [  904.640992]        x64_sys_call+0x2575/0x4550
> > [  904.641616]        do_syscall_64+0x71/0x180
> > [  904.642207]        entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > [  904.643003] 
> > [  904.643003] other info that might help us debug this:
> > [  904.643003] 
> > [  904.644149] Chain exists of:
> > [  904.644149]   &p->lock --> &mm->mmap_lock --> &ima_iint_lock_mutex_key[depth]
> > [  904.644149] 
> > [  904.645763]  Possible unsafe locking scenario:
> > [  904.645763] 
> > [  904.646614]        CPU0                    CPU1
> > [  904.647264]        ----                    ----
> > [  904.647909]   lock(&ima_iint_lock_mutex_key[depth]);
> > [  904.648617]                                lock(&mm->mmap_lock);
> > [  904.649479]                                lock(&ima_iint_lock_mutex_key[depth]);
> > [  904.650543]   lock(&p->lock);
> > [  904.650974] 
> > [  904.650974]  *** DEADLOCK ***
> > [  904.650974] 
> > [  904.651826] 1 lock held by systemd/1:
> > [  904.652376]  #0: ffff88810f4abf20 (&ima_iint_lock_mutex_key[depth]){+.+.}-{4:4}, at: ima_iint_lock+0x24/0x40
> > [  904.653759] 
> > [  904.653759] stack backtrace:
> > [  904.654391] CPU: 2 UID: 0 PID: 1 Comm: systemd Not tainted 6.12.0-rc2+ #20
> > [  904.655360] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> > [  904.656497] Call Trace:
> > [  904.656856]  <TASK>
> > [  904.657166]  dump_stack_lvl+0x134/0x1a0
> > [  904.657728]  dump_stack+0x14/0x30
> > [  904.658206]  print_circular_bug+0x38d/0x450
> > [  904.658812]  check_noncircular+0xed/0x120
> > [  904.659396]  ? srso_return_thunk+0x5/0x5f
> > [  904.659972]  ? srso_return_thunk+0x5/0x5f
> > [  904.660569]  __lock_acquire+0x17f3/0x2320
> > [  904.661145]  lock_acquire+0xf2/0x420
> > [  904.661664]  ? seq_read_iter+0x62/0x6b0
> > [  904.662217]  ? srso_return_thunk+0x5/0x5f
> > [  904.662886]  __mutex_lock+0xaf/0x760
> > [  904.663408]  ? seq_read_iter+0x62/0x6b0
> > [  904.663961]  ? seq_read_iter+0x62/0x6b0
> > [  904.664525]  ? srso_return_thunk+0x5/0x5f
> > [  904.665098]  ? mark_lock+0x4e/0x750
> > [  904.665610]  ? mutex_lock_nested+0x27/0x40
> > [  904.666194]  ? find_held_lock+0x3a/0x100
> > [  904.666770]  mutex_lock_nested+0x27/0x40
> > [  904.667337]  seq_read_iter+0x62/0x6b0
> > [  904.667869]  kernfs_fop_read_iter+0x1ef/0x2c0
> > [  904.668536]  __kernel_read+0x113/0x350
> > [  904.669079]  integrity_kernel_read+0x23/0x40
> > [  904.669698]  ima_calc_file_hash_tfm+0x14e/0x230
> > [  904.670349]  ? __lock_acquire+0xc32/0x2320
> > [  904.670937]  ? srso_return_thunk+0x5/0x5f
> > [  904.671525]  ? __lock_acquire+0xfbb/0x2320
> > [  904.672113]  ? srso_return_thunk+0x5/0x5f
> > [  904.672693]  ? srso_return_thunk+0x5/0x5f
> > [  904.673280]  ? lock_acquire+0xf2/0x420
> > [  904.673818]  ? kernfs_iop_getattr+0x4a/0xb0
> > [  904.674424]  ? srso_return_thunk+0x5/0x5f
> > [  904.674997]  ? find_held_lock+0x3a/0x100
> > [  904.675564]  ? srso_return_thunk+0x5/0x5f
> > [  904.676156]  ? srso_return_thunk+0x5/0x5f
> > [  904.676740]  ? srso_return_thunk+0x5/0x5f
> > [  904.677322]  ? local_clock_noinstr+0x9/0xb0
> > [  904.677923]  ? srso_return_thunk+0x5/0x5f
> > [  904.678502]  ? srso_return_thunk+0x5/0x5f
> > [  904.679075]  ? lock_release+0x4e2/0x570
> > [  904.679639]  ima_calc_file_hash+0x97/0x250
> > [  904.680227]  ima_collect_measurement+0x4be/0x530
> > [  904.680901]  ? srso_return_thunk+0x5/0x5f
> > [  904.681496]  ? srso_return_thunk+0x5/0x5f
> > [  904.682070]  ? __kernfs_iattrs+0x4a/0x140
> > [  904.682658]  ? srso_return_thunk+0x5/0x5f
> > [  904.683242]  ? process_measurement+0x7c0/0xef0
> > [  904.683876]  ? srso_return_thunk+0x5/0x5f
> > [  904.684462]  process_measurement+0x7c0/0xef0
> > [  904.685078]  ? srso_return_thunk+0x5/0x5f
> > [  904.685654]  ? srso_return_thunk+0x5/0x5f
> > [  904.686228]  ? _raw_spin_unlock_irqrestore+0x5d/0xd0
> > [  904.686938]  ? srso_return_thunk+0x5/0x5f
> > [  904.687523]  ? srso_return_thunk+0x5/0x5f
> > [  904.688098]  ? srso_return_thunk+0x5/0x5f
> > [  904.688672]  ? local_clock_noinstr+0x9/0xb0
> > [  904.689273]  ? srso_return_thunk+0x5/0x5f
> > [  904.689846]  ? srso_return_thunk+0x5/0x5f
> > [  904.690430]  ? srso_return_thunk+0x5/0x5f
> > [  904.691005]  ? srso_return_thunk+0x5/0x5f
> > [  904.691583]  ? srso_return_thunk+0x5/0x5f
> > [  904.692180]  ? local_clock_noinstr+0x9/0xb0
> > [  904.692841]  ? srso_return_thunk+0x5/0x5f
> > [  904.693419]  ? srso_return_thunk+0x5/0x5f
> > [  904.693990]  ? lock_release+0x4e2/0x570
> > [  904.694544]  ? srso_return_thunk+0x5/0x5f
> > [  904.695115]  ? kernfs_put_active+0x5d/0xc0
> > [  904.695708]  ? srso_return_thunk+0x5/0x5f
> > [  904.696286]  ? kernfs_fop_open+0x376/0x6b0
> > [  904.696872]  ima_file_check+0x65/0x80
> > [  904.697409]  security_file_post_open+0xb1/0x1b0
> > [  904.698058]  path_openat+0x216/0x1280
> > [  904.698589]  do_filp_open+0xab/0x140
> > [  904.699106]  ? srso_return_thunk+0x5/0x5f
> > [  904.699693]  ? lock_release+0x554/0x570
> > [  904.700264]  ? srso_return_thunk+0x5/0x5f
> > [  904.700836]  ? do_raw_spin_unlock+0x76/0x140
> > [  904.701450]  ? srso_return_thunk+0x5/0x5f
> > [  904.702021]  ? _raw_spin_unlock+0x3f/0xa0
> > [  904.702606]  ? srso_return_thunk+0x5/0x5f
> > [  904.703178]  ? alloc_fd+0x1ca/0x3b0
> > [  904.703685]  do_sys_openat2+0xba/0x120
> > [  904.704223]  ? file_free+0x8d/0x110
> > [  904.704729]  do_sys_open+0x4c/0x80
> > [  904.705221]  __x64_sys_openat+0x23/0x30
> > [  904.705784]  x64_sys_call+0x2575/0x4550
> > [  904.706337]  do_syscall_64+0x71/0x180
> > [  904.706863]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > [  904.707587] RIP: 0033:0x7f3462123037
> > [  904.708120] Code: 55 9c 48 89 75 a0 89 7d a8 44 89 55 ac e8 a1 7a f8 ff 44 8b 55 ac 8b 55 9c 41 89 c0 48 8b 75 a0 8b 7d a8 b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 89 45 ac e8 f6 7a f8 ff 8b 45 ac
> > [  904.710744] RSP: 002b:00007ffdd1a79370 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
> > [  904.711821] RAX: ffffffffffffffda RBX: ffffffffffffffff RCX: 00007f3462123037
> > [  904.712829] RDX: 0000000000080100 RSI: 00007ffdd1a79420 RDI: 00000000ffffff9c
> > [  904.713839] RBP: 00007ffdd1a793e0 R08: 0000000000000000 R09: 0000000000000000
> > [  904.714848] R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffdd1a794c8
> > [  904.715855] R13: 00007ffdd1a794b8 R14: 00007ffdd1a79690 R15: 00007ffdd1a79690
> > [  904.716877]  </TASK>
> > 
> > Roberto
> 




More information about the Linux-security-module-archive mailing list