lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes()

Jan Kara jack at suse.cz
Thu Oct 3 16:17:31 UTC 2024


On Thu 03-10-24 23:59:51, Dave Chinner wrote:
> On Thu, Oct 03, 2024 at 02:56:50PM +0200, Jan Kara wrote:
> > On Thu 03-10-24 05:39:23, Christoph Hellwig wrote:
> > > @@ -789,11 +789,23 @@ static bool dispose_list(struct list_head *head)
> > >   */
> > >  static int evict_inode_fn(struct inode *inode, void *data)
> > >  {
> > > +	struct super_block *sb = inode->i_sb;
> > >  	struct list_head *dispose = data;
> > > +	bool post_unmount = !(sb->s_flags & SB_ACTIVE);
> > >  
> > >  	spin_lock(&inode->i_lock);
> > > -	if (atomic_read(&inode->i_count) ||
> > > -	    (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE))) {
> > > +	if (atomic_read(&inode->i_count)) {
> > > +		spin_unlock(&inode->i_lock);
> > > +
> > > +		/* for each watch, send FS_UNMOUNT and then remove it */
> > > +		if (post_unmount && fsnotify_sb_info(sb)) {
> > > +			fsnotify_inode(inode, FS_UNMOUNT);
> > > +			fsnotify_inode_delete(inode);
> > > +		}
> > 
> > This will not work because you are in unsafe iterator holding
> > sb->s_inode_list_lock. To be able to call into fsnotify, you need to do the
> > iget / iput dance and releasing of s_inode_list_lock which does not work
> > when a filesystem has its own inodes iterator AFAICT... That's why I've
> > called it a layering violation.
> 
> The whole point of the iget/iput dance is to stabilise the
> s_inodes list iteration whilst it is unlocked - the actual fsnotify
> calls don't need an inode reference to work correctly.
> 
> IOWs, we don't need to run the fsnotify stuff right here - we can
> defer that like we do with the dispose list for all the inodes we
> mark as I_FREEING here.
> 
> So if we pass a structure:
> 
> struct evict_inode_args {
> 	struct list_head	dispose;
> 	struct list_head	fsnotify;
> };
> 
> If we use __iget() instead of requiring an inode state flag to keep
> the inode off the LRU for the fsnotify cleanup, then the code
> fragment above becomes:
> 
> 	if (atomic_read(&inode->i_count)) {
> 		if (post_unmount && fsnotify_sb_info(sb)) {
> 			__iget(inode);
> 			inode_lru_list_del(inode);
> 			spin_unlock(&inode->i_lock);
> 			list_add(&inode->i_lru, &args->fsnotify);
> 		}

Nit: Need to release i_lock in else branch here.  Otherwise interesting
idea. Yes, something like this could work even in unsafe iterator.

> 		return INO_ITER_DONE;
> 	}
> And then once we return to evict_inodes(), we do this:
> 
> 	while (!list_empty(args->fsnotify)) {
> 		struct inode *inode
> 
> 		inode = list_first_entry(head, struct inode, i_lru);
>                 list_del_init(&inode->i_lru);
> 
> 		fsnotify_inode(inode, FS_UNMOUNT);
> 		fsnotify_inode_delete(inode);
> 		iput(inode);
> 		cond_resched();
> 	}
> 
> And so now all the fsnotify cleanup is done outside the traversal in
> one large batch from evict_inodes().

Yup.

> As for the landlock code, I think it needs to have it's own internal
> tracking mechanism and not search the sb inode list for inodes that
> it holds references to. LSM cleanup should be run before before we
> get to tearing down the inode cache, not after....

Well, I think LSM cleanup could in principle be handled together with the
fsnotify cleanup but I didn't check the details.


								Honza
-- 
Jan Kara <jack at suse.com>
SUSE Labs, CR



More information about the Linux-security-module-archive mailing list