[PATCH v2 22/35] vfs: don't open real

Daniel Walsh dwalsh at redhat.com
Mon May 14 14:03:42 UTC 2018


On 05/11/2018 03:42 PM, Vivek Goyal wrote:
> On Fri, May 11, 2018 at 02:54:30PM -0400, Vivek Goyal wrote:
>> On Mon, May 07, 2018 at 10:37:54AM +0200, Miklos Szeredi wrote:
>>> Let overlayfs do its thing when opening a file.
>>>
>>> This enables stacking and fixes the corner case when a file is opened for
>>> read, modified through a writable open, and data is read from the read-only
>>> file.  After this patch the read-only open will not return stale data even
>>> in this case.
>> [CC Dan, Steven, Paul, linux-security-module list]
>>
>> Hi Miklos,
>>
>> I was running selinux-testsuite and one of the tests seems to fail. I
>> think this is side effect of installing overlay inode in file->f_inode
>> instead of real underlying inode.
>>
>> Following test is failing.
>>
>> sub test_90_1 {
>>      print "Attempting to enter domain with bad entrypoint, should fail.\n";
>>      $result = system(
>> "runcon -t test_overlay_client_t -l s0:c10,c20 $basedir/container1/merged/badentrypoint >/dev/null 2>&1"
>>      );
>>      ok($result);
>>      return;
>> }
> I am wondering, shouldn't do_open_execat() have failed? It should have called
> into inode_permission(MAY_EXEC). And then ovl_inode_permission()
> will in turn call inode_permission(realinode, MAY_EXEC) with mounter's
> creds. Shouldn't selinux_inode_permission() have returned that mounter
> does not have MAY_EXEC permission on inode?
>
> Dan, I am wondering if this is a selinux policy issue? In my testing
> on upstream kernel, do_open_execat() succeeds and it fails much later.
> I am wondering why that's the case. Is that expected?
>
> Thanks
> Vivek
>
>
>> Basically, this test has an executable named "badentrypoint" with selinux
>> label "unconfined_u:object_r:test_overlay_files_ro_t:s0". And we mount
>> overlay with context=unconfined_u:object_r:test_overlay_files_rwx_t:s0:c10,c20
>>
>> So effectively overlay inode of "badentrypoint" now gets the label
>> specified by "context=".
>>
>> I think intent of test is that this file's real label is "...ro_t". That
>> means this file is not supposed to be executed and any attempt to execute
>> it should be denied.
>>
>> Currently test works and execution fails with following avc.
>>
>> AVC avc:  denied  { entrypoint } for  pid=1425 comm="runcon" path="/root/git/selinux-testsuite/tests/overlay/container1/merged/badentrypoint" dev="dm-0" ino=34515261 scontext=unconfined_u:unconfined_r:test_overlay_client_t:s0:c10,c20 tcontext=unconfined_u:object_r:test_overlay_files_ro_t:s0 tclass=file permissive=0
>>
>> But with new patches, this test starts passing.
>>
>> I think currently selinux_bprm_set_creds() returns error. It does
>> checks on inode returned by file_inode() and as of now that inode is
>> real inode and that inode has real label of "...ro_t" and permission
>> to execute that file is denied.
>>
>> But after the patches file_inode() returns overlay inode. Which has
>> the label specified by context= mount option "...rwx_t". And that
>> label allows executing file, so file execution is not blocked by
>> selinux.
>>
>> I feel that even now code is working accidentally. Ideally our theme was
>> that task's credential is checked against overlay inode and mounter's
>> creds are checked against underlying inode to determine if certain
>> permission is allowed. So ideally mounter should not have been allowed
>> to execute a file of type "...ro_t". But we don't have that workflow
>> and VFS calls into selinux and selinux checks the underlying file's
>> label against task.
>>
>> It worked so far but the moment we install overlay inode in file, selinux
>> checks it against overlay inode label and allows permission to execute and
>> mounter is never checked against real inode.
>>
>> I am not sure what's the right solution. So far selinux is not aware of
>> two levels of checks and if two levels of checks are to be performed, it
>> somehow needs to be enforced by overlay and call same hook on two levels.
>>
>> Thought of at least starting a conversation on this.
>>
>> Thanks
>> Vivek
>>
>>
>>> Signed-off-by: Miklos Szeredi <mszeredi at redhat.com>
>>> ---
>>>   fs/open.c | 7 +------
>>>   1 file changed, 1 insertion(+), 6 deletions(-)
>>>
>>> diff --git a/fs/open.c b/fs/open.c
>>> index 6e52fd6fea7c..244cd2ecfefd 100644
>>> --- a/fs/open.c
>>> +++ b/fs/open.c
>>> @@ -897,13 +897,8 @@ EXPORT_SYMBOL(file_path);
>>>   int vfs_open(const struct path *path, struct file *file,
>>>   	     const struct cred *cred)
>>>   {
>>> -	struct dentry *dentry = d_real(path->dentry, NULL, file->f_flags, 0);
>>> -
>>> -	if (IS_ERR(dentry))
>>> -		return PTR_ERR(dentry);
>>> -
>>>   	file->f_path = *path;
>>> -	return do_dentry_open(file, d_backing_inode(dentry), NULL, cred);
>>> +	return do_dentry_open(file, d_backing_inode(path->dentry), NULL, cred);
>>>   }
>>>   
>>>   /**
>>> -- 
>>> 2.14.3
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
>>> the body of a message to majordomo at vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Vivek and I talked, and I believe the current SELinux entrypoint check is
wrong.  We should be checking the overlay context, not the lower-level
label, for entrypoint.


A little background.  The entrypoint check looks at whether the target
domain can be entered via the executable.


For example, we might have labels like apache_t and apache_exec_t, and we
would write rules like:


allow apache_t apache_exec_t:file entrypoint;

allow user_t apache_t:process transition;

allow user_t apache_exec_t:file execute;

allow user_t bin_t:file execute;


These rules say a process running as user_t can execute files labeled
apache_exec_t and bin_t.  They also say that user_t can transition to, or
start a process as, apache_t, BUT since we have the entrypoint rule, the
only file type through which user_t can transition to apache_t is
apache_exec_t.

This would prevent user_t from executing something like

runcon -t apache_t /bin/sh
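
The rules above can be sketched as a toy model.  This is only an
illustration of the logic, not how the kernel's access vector cache
works; the function names are made up for the example, /bin/sh is
assumed to be labeled bin_t, and the execute rule uses apache_exec_t as
the explanation describes.

```python
# Toy model of SELinux type-enforcement allow rules (not the kernel AVC).
# Each rule is (source_type, target_type, object_class, permission).
ALLOW_RULES = {
    ("apache_t", "apache_exec_t", "file", "entrypoint"),
    ("user_t", "apache_t", "process", "transition"),
    ("user_t", "apache_exec_t", "file", "execute"),
    ("user_t", "bin_t", "file", "execute"),
}


def allowed(source, target, tclass, perm):
    """Return True if an allow rule grants perm; everything else is denied."""
    return (source, target, tclass, perm) in ALLOW_RULES


def can_enter_domain(caller, new_domain, file_type):
    """Hypothetical helper: may caller run a file of file_type as new_domain?"""
    return (
        allowed(caller, file_type, "file", "execute")
        and allowed(caller, new_domain, "process", "transition")
        # The entrypoint check is made against the *target* domain.
        and allowed(new_domain, file_type, "file", "entrypoint")
    )


# Executing an apache_exec_t file to become apache_t is permitted.
print(can_enter_domain("user_t", "apache_t", "apache_exec_t"))
# "runcon -t apache_t /bin/sh": bin_t is not an entrypoint for apache_t.
print(can_enter_domain("user_t", "apache_t", "bin_t"))
```

The second call is refused even though user_t may execute bin_t files
and may transition to apache_t, because only the entrypoint rule ties
the domain to a specific file type.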


In the case of these tests, SELinux currently verifies that the mounter
is able to mount a directory with a different label, rwx_t, and the mount
then provides the user with content via this label.  So the entrypoint
check should happen on the new context label, not on the lower label.
We need to fix the SELinux test suite to reflect the new behaviour.  I
think the current test and the current code are actually a bug.
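
To make the difference concrete, here is a small sketch of which label
the entrypoint check ends up looking at before and after the patch.  It
is a simplified model, not kernel code: the class and function names are
hypothetical, and the single entrypoint rule is an assumption standing in
for the test policy (the thread only shows that entrypoint on ro_t is
denied and that rwx_t allows execution).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OverlayFile:
    real_type: str               # type of the file on the lower layer
    context_type: Optional[str]  # type from the context= mount option, if any


# Assumed policy: only the rwx type is an entrypoint for the client domain.
ENTRYPOINTS = {("test_overlay_client_t", "test_overlay_files_rwx_t")}


def checked_type(f, file_inode_is_overlay):
    """Type the check sees, depending on what file_inode() returns."""
    if file_inode_is_overlay and f.context_type is not None:
        return f.context_type
    return f.real_type


def entrypoint_allowed(domain, f, file_inode_is_overlay):
    return (domain, checked_type(f, file_inode_is_overlay)) in ENTRYPOINTS


bad = OverlayFile("test_overlay_files_ro_t", "test_overlay_files_rwx_t")

# Before the patch: file_inode() is the real inode, so ro_t is checked.
print(entrypoint_allowed("test_overlay_client_t", bad, False))
# After the patch: file_inode() is the overlay inode, so rwx_t is checked.
print(entrypoint_allowed("test_overlay_client_t", bad, True))
```

Under these assumptions the first call is denied and the second is
allowed, which matches the change in the test result described in the
thread.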


The entrypoint rule above would say that the apache_t process type can be
entered via files labeled apache_exec_t.

