lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes()
Jan Kara
jack at suse.cz
Thu Oct 3 16:17:31 UTC 2024
On Thu 03-10-24 23:59:51, Dave Chinner wrote:
> On Thu, Oct 03, 2024 at 02:56:50PM +0200, Jan Kara wrote:
> > On Thu 03-10-24 05:39:23, Christoph Hellwig wrote:
> > > @@ -789,11 +789,23 @@ static bool dispose_list(struct list_head *head)
> > > */
> > > static int evict_inode_fn(struct inode *inode, void *data)
> > > {
> > > + struct super_block *sb = inode->i_sb;
> > > struct list_head *dispose = data;
> > > + bool post_unmount = !(sb->s_flags & SB_ACTIVE);
> > >
> > > spin_lock(&inode->i_lock);
> > > - if (atomic_read(&inode->i_count) ||
> > > - (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE))) {
> > > + if (atomic_read(&inode->i_count)) {
> > > + spin_unlock(&inode->i_lock);
> > > +
> > > + /* for each watch, send FS_UNMOUNT and then remove it */
> > > + if (post_unmount && fsnotify_sb_info(sb)) {
> > > + fsnotify_inode(inode, FS_UNMOUNT);
> > > + fsnotify_inode_delete(inode);
> > > + }
> >
> > This will not work because you are in unsafe iterator holding
> > sb->s_inode_list_lock. To be able to call into fsnotify, you need to do the
> > iget / iput dance and releasing of s_inode_list_lock which does not work
> > when a filesystem has its own inodes iterator AFAICT... That's why I've
> > called it a layering violation.
>
> The whole point of the iget/iput dance is to stabilise the
> s_inodes list iteration whilst it is unlocked - the actual fsnotify
> calls don't need an inode reference to work correctly.
>
> IOWs, we don't need to run the fsnotify stuff right here - we can
> defer that like we do with the dispose list for all the inodes we
> mark as I_FREEING here.
>
> So if we pass a structure:
>
> struct evict_inode_args {
> struct list_head dispose;
> struct list_head fsnotify;
> };
>
> If we use __iget() instead of requiring an inode state flag to keep
> the inode off the LRU for the fsnotify cleanup, then the code
> fragment above becomes:
>
> if (atomic_read(&inode->i_count)) {
> if (post_unmount && fsnotify_sb_info(sb)) {
> __iget(inode);
> inode_lru_list_del(inode);
> spin_unlock(&inode->i_lock);
> list_add(&inode->i_lru, &args->fsnotify);
> }
Nit: Need to release i_lock in else branch here. Otherwise interesting
idea. Yes, something like this could work even in unsafe iterator.
> return INO_ITER_DONE;
> }
> And then once we return to evict_inodes(), we do this:
>
> while (!list_empty(args->fsnotify)) {
> struct inode *inode
>
> inode = list_first_entry(head, struct inode, i_lru);
> list_del_init(&inode->i_lru);
>
> fsnotify_inode(inode, FS_UNMOUNT);
> fsnotify_inode_delete(inode);
> iput(inode);
> cond_resched();
> }
>
> And so now all the fsnotify cleanup is done outside the traversal in
> one large batch from evict_inodes().
Yup.
> As for the landlock code, I think it needs to have it's own internal
> tracking mechanism and not search the sb inode list for inodes that
> it holds references to. LSM cleanup should be run before before we
> get to tearing down the inode cache, not after....
Well, I think LSM cleanup could in principle be handled together with the
fsnotify cleanup but I didn't check the details.
Honza
--
Jan Kara <jack at suse.com>
SUSE Labs, CR
More information about the Linux-security-module-archive
mailing list