lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes()

Christian Brauner brauner at kernel.org
Tue Oct 8 12:16:04 UTC 2024


On Tue, Oct 08, 2024 at 01:23:44PM GMT, Jan Kara wrote:
> On Tue 08-10-24 10:57:22, Amir Goldstein wrote:
> > On Tue, Oct 8, 2024 at 1:33 AM Dave Chinner <david at fromorbit.com> wrote:
> > >
> > > On Mon, Oct 07, 2024 at 01:37:19PM -0700, Linus Torvalds wrote:
> > > > On Thu, 3 Oct 2024 at 04:57, Jan Kara <jack at suse.cz> wrote:
> > > > >
> > > > > Fair enough. If we go with the iterator variant I've suggested to Dave in
> > > > > [1], we could combine the evict_inodes(), fsnotify_unmount_inodes() and
> > > > > Landlocks hook_sb_delete() into a single iteration relatively easily. But
> > > > > I'd wait with that convertion until this series lands.
> > > >
> > > > Honza, I looked at this a bit more, particularly with an eye of "what
> > > > happens if we just end up making the inode lifetimes subject to the
> > > > dentry lifetimes" as suggested by Dave elsewhere.
> > >
> > > ....
> > >
> > > > which makes the fsnotify_inode_delete() happen when the inode is
> > > > removed from the dentry.
> > >
> > > There may be other inode references being held that make
> > > the inode live longer than the dentry cache. When should the
> > > fsnotify marks be removed from the inode in that case? Do they need
> > > to remain until, e.g, writeback completes?
> > >
> > 
> > fsnotify inode marks remain until explicitly removed or until sb
> > is unmounted (*), so other inode references are irrelevant to
> > inode mark removal.
> > 
> > (*) fanotify has "evictable" inode marks, which do not hold inode
> > reference and go away on inode evict, but those mark evictions
> > do not generate any event (i.e. there is no FAN_UNMOUNT).
> 
> Yes. Amir beat me with the response so let me just add that FS_UMOUNT event
> is for inotify which guarantees that either you get an event about somebody
> unlinking the inode (e.g. IN_DELETE_SELF) or event about filesystem being
> unmounted (IN_UMOUNT) if you place mark on some inode. I also don't see how
> we would maintain this behavior with what Linus proposes.
> 
> > > > Then at umount time, the dentry shrinking will deal with all live
> > > > dentries, and at most the fsnotify layer would send the FS_UNMOUNT to
> > > > just the root dentry inodes?
> > >
> > > I don't think even that is necessary, because
> > > shrink_dcache_for_umount() drops the sb->s_root dentry after
> > > trimming the dentry tree. Hence the dcache drop would cleanup all
> > > inode references, roots included.
> > >
> > > > Wouldn't that make things much cleaner, and remove at least *one* odd
> > > > use of the nasty s_inodes list?
> > >
> > > Yes, it would, but someone who knows exactly when the fsnotify
> > > marks can be removed needs to chime in here...
> 
> So fsnotify needs a list of inodes for the superblock which have marks
> attached and for which we hold inode reference. We can keep it inside
> fsnotify code although it would practically mean another list_head for the
> inode for this list (probably in our fsnotify_connector structure which
> connects list of notification marks to the inode). If we actually get rid
> of i_sb_list in struct inode, this will be a win for the overall system,
> otherwise it is a net loss IMHO. So if we can figure out how to change
> other s_inodes owners we can certainly do this fsnotify change.
> 
> > > > And I wonder if the quota code (which uses the s_inodes list to enable
> > > > quotas on already mounted filesystems) could for all the same reasons
> > > > just walk the dentry tree instead (and remove_dquot_ref similarly
> > > > could just remove it at dentry_unlink_inode() time)?
> > >
> > > I don't think that will work because we have to be able to modify
> > > quota in evict() processing. This is especially true for unlinked
> > > inodes being evicted from cache, but also the dquots need to stay
> > > attached until writeback completes.
> > >
> > > Hence I don't think we can remove the quota refs from the inode
> > > before we call iput_final(), and so I think quotaoff (at least)
> > > still needs to iterate inodes...
> 
> Yeah, I'm not sure how to get rid of the s_inodes use in quota code. One of
> the things we need s_inodes list for is during quotaoff on a mounted
> filesystem when we need to iterate all inodes which are referencing quota
> structures and free them.  In theory we could keep a list of inodes
> referencing quota structures but that would require adding list_head to
> inode structure for filesystems that support quotas. Now for the sake of
> full context I'll also say that enabling / disabling quotas on a mounted
> filesystem is a legacy feature because it is quite easy that quota
> accounting goes wrong with it. So ext4 and f2fs support for quite a few
> years a mode where quota tracking is enabled on mount and disabled on
> unmount (if appropriate fs feature is enabled) and you can only enable /
> disable enforcement of quota limits during runtime.  So I could see us
> deprecating this functionality altogether although jfs never adapted to
> this new way we do quotas so we'd have to deal with that somehow.  But one
> way or another it would take a significant amount of time before we can
> completely remove this so it is out of question for this series.

I still maintain that we don't need to solve the fsnotify and lsm rework
as part of this particular series.



More information about the Linux-security-module-archive mailing list