[PATCH v6 2/5] landlock: Support file truncation

Sun Sep 25 18:09:26 UTC 2022

On Fri, Sep 23, 2022 at 10:53:23PM +0200, Mickaël Salaün wrote:
> On 23/09/2022 13:21, Günther Noack wrote:
> > On Mon, Sep 12, 2022 at 09:41:32PM +0200, Mickaël Salaün wrote:
> > > On 08/09/2022 21:58, Günther Noack wrote:
> > > > Introduce the LANDLOCK_ACCESS_FS_TRUNCATE flag for file truncation.
> > > 
> > > [...]
> > > 
> > > > +/**
> > > > + * get_path_access_rights - Returns the subset of rights in access_request
> > > > + * which are permitted for the given path.
> > > > + *
> > > > + * @domain: The domain that defines the current restrictions.
> > > > + * @path: The path to get access rights for.
> > > > + * @access_request: The rights we are interested in.
> > > > + *
> > > > + * Returns: The access mask of the rights that are permitted on the given path,
> > > > + * which are also a subset of access_request (to save some calculation time).
> > > > + */
> > > > +static inline access_mask_t
> > > > +get_path_access_rights(const struct landlock_ruleset *const domain,
> > > > +		       const struct path *const path,
> > > > +		       access_mask_t access_request)
> > > > +{
> > > > +	layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {};
> > > > +	unsigned long access_bit;
> > > > +	unsigned long access_req;
> > > > +
> > > > +	init_layer_masks(domain, access_request, &layer_masks);
> > > > +	if (!check_access_path_dual(domain, path, access_request, &layer_masks,
> > > > +				    NULL, 0, NULL, NULL)) {
> > > > +		/*
> > > > +		 * Return immediately for successful accesses and for cases
> > > > +		 * where everything is permitted because the path belongs to an
> > > > +		 * internal filesystem.
> > > > +		 */
> > > > +		return access_request;
> > > > +	}
> > > > +
> > > > +	access_req = access_request;
> > > > +	for_each_set_bit(access_bit, &access_req, ARRAY_SIZE(layer_masks)) {
> > > > +		if (layer_masks[access_bit]) {
> > > > +			/* If any layer vetoed the access right, remove it. */
> > > > +			access_request &= ~BIT_ULL(access_bit);
> > > > +		}
> > > > +	}
> > > 
> > > This seems to be redundant with the value returned by init_layer_masks(),
> > > which should be passed to check_access_path_dual() to avoid useless path
> > > walk.
> > 
> > True, I'll use the result of init_layer_masks() to feed it back to
> > check_access_path_dual() to avoid a bit of computation.
> > 
> > Like this:
> > 
> >          effective_access_request =
> > 		init_layer_masks(domain, access_request, &layer_masks);
> > 	if (!check_access_path_dual(domain, path, effective_access_request,
> > 	    &layer_masks, NULL, 0, NULL, NULL)) {
> > 		// ...
> > 	}
> 
> correct
> 
> > 
> > Overall, the approach here is:
> > 
> > * Initialize the layer_masks, so that it has a bit set for every
> >    access right in access_request and layer where that access right is
> >    handled.
> > 
> > * check_access_path_dual() with only the first few parameters -- this
> >    will clear all the bits in layer masks which are actually permitted
> >    according to the individual rules.
> > 
> >    As a special case, this *may* return 0 immediately, in which case we
> >    can (a) save a bit of calculation in the loop below and (b) we might
> >    be in the case where access is permitted because it's a file from a
> >    special file system (even though not all bits are cleared). If
> >    check_access_path_dual() returns 0, we return the full requested
> >    access_request that we received as input. >
> > * In the loop below, if there are any bits left in layer_masks, those
> >    are rights which are not permitted for the given path. We remove
> >    these from access_request and return the modified access_request.
> > 
> > 
> > > This function is pretty similar to check_access_path(). Can't you change it
> > > to use an access_mask_t pointer and get almost the same thing?
> > 
> > I'm shying away from this approach. Many of the existing different use
> > cases are already realized by "doing if checks deep down". I think it
> > would make the code more understandable if we managed to model these
> > differences between use cases already at the layer of function calls.
> > (This is particularly true for check_access_path_dual(), where in
> > order to find out how the "single" case works, you need to disentangle
> > to a large extent how the much more complicated dual case works.)
> 
> I agree that check_access_path_dual() is complex, but I couldn't find a
> better way.

It seems out of the scope of this patch set, but I sometimes find it
OK to just duplicate the code and have a set of tests to demonstrate
that the two variants do the same thing.

check_access_path_dual() is mostly complex because of performance
reasons, as far as I can tell, and it might be possible to check its
results against a parallel implementation of it which runs slower,
uses more memory, but is more obviously correct. (I have used one
myself to check against when developing the truncate patch set.)

> > If you want to unify these two functions, what do you think of the
> > approach of just using get_path_access_rights() instead of
> > check_access_path()?
> > 
> > Basically, it would turn
> > 
> > return check_access_path(dom, path, access_request);
> > 
> > into
> > 
> > if (get_path_access_rights(dom, path, access_request) == access_request)
> > 	return 0;
> > return -EACCES;
> > 
> > This is slightly more verbose in the places where it's called, but it
> > would be more orthogonal, and it would also clarify that -EACCES is
> > the only possible error in the "single" path walk case.
> > 
> > Let me know what you think.
> 
> What about adding an additional argument `access_mask_t *const
> access_allowed` to check_access_path_dual() which returns the set of
> accesses (i.e. access_masked_parent1 & access_masked_parent2) that could
> then be stored to landlock_file(file)->allowed_access? If this argument is
> NULL it should just be ignored. What is left from get_path_access_rights()
> could then be merged into hook_file_open().

IMHO, check_access_path_dual() does not seem like the right place to
add this. This functionality is not needed in any of the "dual path"
cases so far, and I'm not sure what it would mean. The necessary
information can also be easily derived from the resulting layer_masks,
which is already exposed in the check_access_path_dual() interface,
and I also believe that this approach is at least equally fast as
updating it on the fly when changing the layer_masks.

I could be convinced to add a `access_mask_t *const access_allowed`
argument to check_access_path() if you prefer that, but then again, in
that case the returned boolean can be reconstructed from the new
access_allowed variable, and we could as well make check_access_path()
return the access_allowed result instead of the boolean and let
callers check equality with what they expected...? (I admittedly don't
have a good setup to test the performance right now, but it looks like
a negligible difference to me?)

Here are the options we have discussed, in the order that I would
prefer them:

* to keep it as a separate function as it already is,
  slightly duplicating check_access_path(). (I think it's cleaner,
  because the code path for the rest of the hooks other than
  security_file_open() stays simpler.)

* to make check_access_path() return the access_allowed access mask
  and make callers check that it covers the access_request that they
  asked for (see example from my previous mail on this thread). (This
  is equivalent to discarding the existing check_access_path() and
  using the get_path_access() function instead.)

* to add a `access_mask_t *const access_allowed` argument to
  check_access_path(), which is calculated if it's non-NULL based on
  the layer_masks result. It would be used from the security_file_open
  hook.

* to add a `access_mask_t *const access_allowed` argument to
  check_access_path_dual(). This doesn't make much sense, IMHO,
  because an on-the-fly calculation of this result does not look like
  a performance benefit to me, and calculating it based on the two
  resulting layer_masks is already possible now. It's also not clear
  to me what it would mean to calculate an access_allowed on two paths
  at once, and what that would be used for.

Let me know which option you prefer. In the end, I don't feel that
strongly about it and I'm happy to do this either way.

> > > > +	return access_request;
> > > > +}
> > > > +
> > > >    /**
> > > >     * current_check_refer_path - Check if a rename or link action is allowed
> > > >     *
> > > > @@ -1142,6 +1184,11 @@ static int hook_path_rmdir(const struct path *const dir,
> > > >    	return current_check_access_path(dir, LANDLOCK_ACCESS_FS_REMOVE_DIR);
> > > >    }
> > > > +static int hook_path_truncate(const struct path *const path)
> > > > +{
> > > > +	return current_check_access_path(path, LANDLOCK_ACCESS_FS_TRUNCATE);
> > > > +}
> > > > +
> > > >    /* File hooks */
> > > >    static inline access_mask_t get_file_access(const struct file *const file)
> > > > @@ -1159,22 +1206,55 @@ static inline access_mask_t get_file_access(const struct file *const file)
> > > >    	/* __FMODE_EXEC is indeed part of f_flags, not f_mode. */
> > > >    	if (file->f_flags & __FMODE_EXEC)
> > > >    		access |= LANDLOCK_ACCESS_FS_EXECUTE;
> > > > +
> > > >    	return access;
> > > >    }
> > > >    static int hook_file_open(struct file *const file)
> > > >    {
> > > > +	access_mask_t access_req, access_rights;
> > > 
> > > "access_request" is used for access_mask_t, and "access_req" for unsigned
> > > int. I'd like to stick to this convention.
> > 
> > Done.
> > 
> > > > +	const access_mask_t optional_rights = LANDLOCK_ACCESS_FS_TRUNCATE;
> > > 
> > > You use "rights" often and I'm having some trouble to find a rational for
> > > that (compared to "access")…
> > 
> > Done. Didn't realize you already had a different convention here.
> > 
> > I'm renaming get_path_access_rights() to get_path_access() as well
> > then (and I'll rename get_file_access() to
> > get_required_file_open_access() - that's more verbose, but it sounded
> > too similar to get_path_access(), and it might be better to clarify
> > that this is a helper for the file_open hook). Does that sound
> > reasonable?
> 
> I think it is better, but I'm not convinced this helper is useful.
> 
> > 
> > 
> > > >    	const struct landlock_ruleset *const dom =
> > > >    		landlock_get_current_domain();
> > > > -	if (!dom)
> > > > +	if (!dom) {
> > > > +		/* Grant all rights. */
> > > > +		landlock_file(file)->rights = LANDLOCK_MASK_ACCESS_FS;
> > > >    		return 0;
> > > > +	}
> > > > +
> > > >    	/*
> > > >    	 * Because a file may be opened with O_PATH, get_file_access() may
> > > >    	 * return 0.  This case will be handled with a future Landlock
> > > >    	 * evolution.
> > > >    	 */
> > > > -	return check_access_path(dom, &file->f_path, get_file_access(file));
> > > > +	access_req = get_file_access(file);
> > > > +	access_rights = get_path_access_rights(dom, &file->f_path,
> > > > +					       access_req | optional_rights);
> > > > +	if (access_req & ~access_rights)
> > > > +		return -EACCES;
> > > 
> > > We should add a test to make sure this (optional_rights) logic is correct
> > > (and doesn't change), with a matrix of cases involving a ruleset handling
> > > either FS_WRITE, FS_TRUNCATE or both. This should be easy to do with test
> > > variants.
> > 
> > OK, adding one to the selftests.
> > 
> > > > +	/*
> > > > +	 * For operations on already opened files (i.e. ftruncate()), it is the
> > > > +	 * access rights at the time of open() which decide whether the
> > > > +	 * operation is permitted. Therefore, we record the relevant subset of
> > > > +	 * file access rights in the opened struct file.
> > > > +	 */
> > > > +	landlock_file(file)->rights = access_rights;
> > > > +
> > > > +	return 0;
> > > > +}
> > 

--