[PATCH v6 2/5] landlock: Support file truncation

Wed Sep 28 18:32:02 UTC 2022

On 25/09/2022 20:09, Günther Noack wrote:
> On Fri, Sep 23, 2022 at 10:53:23PM +0200, Mickaël Salaün wrote:
>> On 23/09/2022 13:21, Günther Noack wrote:
>>> On Mon, Sep 12, 2022 at 09:41:32PM +0200, Mickaël Salaün wrote:
>>>> On 08/09/2022 21:58, Günther Noack wrote:
>>>>> Introduce the LANDLOCK_ACCESS_FS_TRUNCATE flag for file truncation.
>>>>
>>>> [...]
>>>>
>>>>> +/**
>>>>> + * get_path_access_rights - Returns the subset of rights in access_request
>>>>> + * which are permitted for the given path.
>>>>> + *
>>>>> + * @domain: The domain that defines the current restrictions.
>>>>> + * @path: The path to get access rights for.
>>>>> + * @access_request: The rights we are interested in.
>>>>> + *
>>>>> + * Returns: The access mask of the rights that are permitted on the given path,
>>>>> + * which are also a subset of access_request (to save some calculation time).
>>>>> + */
>>>>> +static inline access_mask_t
>>>>> +get_path_access_rights(const struct landlock_ruleset *const domain,
>>>>> +		       const struct path *const path,
>>>>> +		       access_mask_t access_request)
>>>>> +{
>>>>> +	layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {};
>>>>> +	unsigned long access_bit;
>>>>> +	unsigned long access_req;
>>>>> +
>>>>> +	init_layer_masks(domain, access_request, &layer_masks);
>>>>> +	if (!check_access_path_dual(domain, path, access_request, &layer_masks,
>>>>> +				    NULL, 0, NULL, NULL)) {
>>>>> +		/*
>>>>> +		 * Return immediately for successful accesses and for cases
>>>>> +		 * where everything is permitted because the path belongs to an
>>>>> +		 * internal filesystem.
>>>>> +		 */
>>>>> +		return access_request;
>>>>> +	}
>>>>> +
>>>>> +	access_req = access_request;
>>>>> +	for_each_set_bit(access_bit, &access_req, ARRAY_SIZE(layer_masks)) {
>>>>> +		if (layer_masks[access_bit]) {
>>>>> +			/* If any layer vetoed the access right, remove it. */
>>>>> +			access_request &= ~BIT_ULL(access_bit);
>>>>> +		}
>>>>> +	}
>>>>
>>>> This seems to be redundant with the value returned by init_layer_masks(),
>>>> which should be passed to check_access_path_dual() to avoid useless path
>>>> walk.
>>>
>>> True, I'll use the result of init_layer_masks() to feed it back to
>>> check_access_path_dual() to avoid a bit of computation.
>>>
>>> Like this:
>>>
>>>           effective_access_request =
>>> 		init_layer_masks(domain, access_request, &layer_masks);
>>> 	if (!check_access_path_dual(domain, path, effective_access_request,
>>> 	    &layer_masks, NULL, 0, NULL, NULL)) {
>>> 		// ...
>>> 	}
>>
>> correct
>>
>>>
>>> Overall, the approach here is:
>>>
>>> * Initialize the layer_masks, so that it has a bit set for every
>>>     access right in access_request and layer where that access right is
>>>     handled.
>>>
>>> * check_access_path_dual() with only the first few parameters -- this
>>>     will clear all the bits in layer masks which are actually permitted
>>>     according to the individual rules.
>>>
>>>     As a special case, this *may* return 0 immediately, in which case we
>>>     can (a) save a bit of calculation in the loop below and (b) we might
>>>     be in the case where access is permitted because it's a file from a
>>>     special file system (even though not all bits are cleared). If
>>>     check_access_path_dual() returns 0, we return the full requested
>>>     access_request that we received as input. >
>>> * In the loop below, if there are any bits left in layer_masks, those
>>>     are rights which are not permitted for the given path. We remove
>>>     these from access_request and return the modified access_request.
>>>
>>>
>>>> This function is pretty similar to check_access_path(). Can't you change it
>>>> to use an access_mask_t pointer and get almost the same thing?
>>>
>>> I'm shying away from this approach. Many of the existing different use
>>> cases are already realized by "doing if checks deep down". I think it
>>> would make the code more understandable if we managed to model these
>>> differences between use cases already at the layer of function calls.
>>> (This is particularly true for check_access_path_dual(), where in
>>> order to find out how the "single" case works, you need to disentangle
>>> to a large extent how the much more complicated dual case works.)
>>
>> I agree that check_access_path_dual() is complex, but I couldn't find a
>> better way.
> 
> It seems out of the scope of this patch set, but I sometimes find it
> OK to just duplicate the code and have a set of tests to demonstrate
> that the two variants do the same thing.
> 
> check_access_path_dual() is mostly complex because of performance
> reasons, as far as I can tell, and it might be possible to check its
> results against a parallel implementation of it which runs slower,
> uses more memory, but is more obviously correct. (I have used one
> myself to check against when developing the truncate patch set.)
> 
>>> If you want to unify these two functions, what do you think of the
>>> approach of just using get_path_access_rights() instead of
>>> check_access_path()?
>>>
>>> Basically, it would turn
>>>
>>> return check_access_path(dom, path, access_request);
>>>
>>> into
>>>
>>> if (get_path_access_rights(dom, path, access_request) == access_request)
>>> 	return 0;
>>> return -EACCES;
>>>
>>> This is slightly more verbose in the places where it's called, but it
>>> would be more orthogonal, and it would also clarify that -EACCES is
>>> the only possible error in the "single" path walk case.
>>>
>>> Let me know what you think.
>>
>> What about adding an additional argument `access_mask_t *const
>> access_allowed` to check_access_path_dual() which returns the set of
>> accesses (i.e. access_masked_parent1 & access_masked_parent2) that could
>> then be stored to landlock_file(file)->allowed_access? If this argument is
>> NULL it should just be ignored. What is left from get_path_access_rights()
>> could then be merged into hook_file_open().
> 
> IMHO, check_access_path_dual() does not seem like the right place to
> add this. This functionality is not needed in any of the "dual path"
> cases so far, and I'm not sure what it would mean. The necessary
> information can also be easily derived from the resulting layer_masks,
> which is already exposed in the check_access_path_dual() interface,
> and I also believe that this approach is at least equally fast as
> updating it on the fly when changing the layer_masks.
> 
> I could be convinced to add a `access_mask_t *const access_allowed`
> argument to check_access_path() if you prefer that, but then again, in
> that case the returned boolean can be reconstructed from the new
> access_allowed variable, and we could as well make check_access_path()
> return the access_allowed result instead of the boolean and let
> callers check equality with what they expected...? (I admittedly don't
> have a good setup to test the performance right now, but it looks like
> a negligible difference to me?)

Good idea, let's try to make check_access_path_dual() returns the 
allowed accesses (according to the request) and rename it to 
get_access_path_dual(). unmask_layers() could be changed to return the 
still-denied accesses instead of a boolean, and we could use this values 
(for potential both parents) to return allowed_parent1 & allowed_parent2 
(with access_mask_t types). This would also simplify is_eaccess() and 
its calls could be moved to current_check_refer_path(). This would merge 
get_path_access_rights() into check_access_path_dual() and make the 
errno codes more explicit per hook or defined in check_access_path().

> 
> Here are the options we have discussed, in the order that I would
> prefer them:
> 
> * to keep it as a separate function as it already is,
>    slightly duplicating check_access_path(). (I think it's cleaner,
>    because the code path for the rest of the hooks other than
>    security_file_open() stays simpler.)
> 
> * to make check_access_path() return the access_allowed access mask
>    and make callers check that it covers the access_request that they
>    asked for (see example from my previous mail on this thread). (This
>    is equivalent to discarding the existing check_access_path() and
>    using the get_path_access() function instead.)
> 
> * to add a `access_mask_t *const access_allowed` argument to
>    check_access_path(), which is calculated if it's non-NULL based on
>    the layer_masks result. It would be used from the security_file_open
>    hook.
> 
> * to add a `access_mask_t *const access_allowed` argument to
>    check_access_path_dual(). This doesn't make much sense, IMHO,
>    because an on-the-fly calculation of this result does not look like
>    a performance benefit to me, and calculating it based on the two
>    resulting layer_masks is already possible now. It's also not clear
>    to me what it would mean to calculate an access_allowed on two paths
>    at once, and what that would be used for.
> 
> Let me know which option you prefer. In the end, I don't feel that
> strongly about it and I'm happy to do this either way.