[PATCH v2] proc: Allow pid_revalidate() during LOOKUP_RCU
stephen.s.brennan at oracle.com
Wed Dec 16 01:05:27 UTC 2020
Casey Schaufler <casey at schaufler-ca.com> writes:
> On 12/15/2020 2:04 PM, Eric W. Biederman wrote:
>> Casey Schaufler <casey at schaufler-ca.com> writes:
>>> On 12/13/2020 3:00 PM, Paul Moore wrote:
>>>> On Sun, Dec 13, 2020 at 11:30 AM Matthew Wilcox <willy at infradead.org> wrote:
>>>>> On Sun, Dec 13, 2020 at 08:22:32AM -0600, Eric W. Biederman wrote:
>>>>>> Matthew Wilcox <willy at infradead.org> writes:
>>>>>>> On Thu, Dec 03, 2020 at 04:02:12PM -0800, Stephen Brennan wrote:
>>>>>>>> -void pid_update_inode(struct task_struct *task, struct inode *inode)
>>>>>>>> +static int do_pid_update_inode(struct task_struct *task, struct inode *inode,
>>>>>>>> + unsigned int flags)
>>>>>>> I'm really nitpicking here, but this function only _updates_ the inode
>>>>>>> if flags says it should. So I was thinking something like this
>>>>>>> (compile tested only).
>>>>>>> I'd really appreocate feedback from someone like Casey or Stephen on
>>>>>>> what they need for their security modules.
>>>>>> Just so we don't have security module questions confusing things
>>>>>> can we please make this a 2 patch series? With the first
>>>>>> patch removing security_task_to_inode?
>>>>>> The justification for the removal is that all security_task_to_inode
>>>>>> appears to care about is the file type bits in inode->i_mode. Something
>>>>>> that never changes. Having this in a separate patch would make that
>>>>>> logical change easier to verify.
>>>>> I don't think that's right, which is why I keep asking Stephen & Casey
>>>>> for their thoughts.
>>>> The SELinux security_task_to_inode() implementation only cares about
>>>> inode->i_mode S_IFMT bits from the inode so that we can set the object
>>>> class correctly. The inode's SELinux label is taken from the
>>>> associated task.
>>>> Casey would need to comment on Smack's needs.
>>> SELinux uses different "class"es on subjects and objects.
>>> Smack does not differentiate, so knows the label it wants
>>> the inode to have when smack_task_to_inode() is called,
>>> and sets it accordingly. Nothing is allocated in the process,
>>> and the new value is coming from the Smack master label list.
>>> It isn't going to go away. It appears that this is the point
>>> of the hook. Am I missing something?
>> security_task_to_inode (strangely named as this is proc specific) is
>> currently called both when the inode is initialized in proc and when
>> pid_revalidate is called and the uid and gid of the proc inode
>> are updated to match the traced task.
>> I am suggesting that the call of security_task_to_inode in
>> pid_revalidate be removed as neither of the two implementations of this
>> security hook smack nor selinux care of the uid or gid changes.
> If you're sure that the only case where pid_revalidate() would matter
> is for the uid/gid cases that would be OK.
>> Removal of the security check will allow proc to be accessed in rcu look
>> mode. AKA give proc go faster stripes.
>> The two implementations are:
>> static void selinux_task_to_inode(struct task_struct *p,
>> struct inode *inode)
>> struct inode_security_struct *isec = selinux_inode(inode);
>> u32 sid = task_sid(p);
>> isec->sclass = inode_mode_to_security_class(inode->i_mode);
>> isec->sid = sid;
>> isec->initialized = LABEL_INITIALIZED;
>> static void smack_task_to_inode(struct task_struct *p, struct inode *inode)
>> struct inode_smack *isp = smack_inode(inode);
>> struct smack_known *skp = smk_of_task_struct(p);
>> isp->smk_inode = skp;
>> isp->smk_flags |= SMK_INODE_INSTANT;
>> I see two questions gating the safe removal of the call of
>> security_task_to_inode from pid_revalidate.
>> 1) Does any of this code care about uids or gids.
>> It appears the answer is no from a quick inspection of the code.
> It looks that way.
>> 2) Does smack_task_to_inode need to be called after exec?
>> - Exec especially suid exec changes the the cred on a task.
>> - Execing of a non-leader thread changes the thread_pid of a task
>> so that it is the pid of the entire thread group.
> I think so. If SMACK64EXEC is set on a binary the label will
> be changed on exec. The /proc inode Smack label would need to
> be changed.
>> If either of those are significant perhaps we can limit calling
>> security_task_to_inode if task->self_exec_id is different.
Given these answers then, it seems like a proper implementation would
leave the security_task_to_inode() call in pid_update_inode(). Then,
pid_revalidate() would drop out of RCU mode whenever some function like
this (drawing on Matthew's idea above) returns true:
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 449204e9f749..02805076c42b 100644
@@ -1820,6 +1820,26 @@ void pid_update_inode(struct task_struct *task, struct inode *inode)
inode->i_mode &= ~(S_ISUID | S_ISGID);
+/* See if we can avoid the above call. Assumes RCU lock held */
+static bool pid_inode_needs_update(struct task_struct *task, struct inode *inode)
+ kuid_t uid;
+ kgid_t gid;
+ u32 exec_id, last_exec_id;
+ if (inode->i_mode & (S_ISUID | S_ISGID))
+ return true;
+ task_dump_owner(task, inode->i_mode, &uid, &gid);
+ if (!uid_eq(uid, inode->i_uid) || !gid_eq(gid, inode->i_gid))
+ return true;
+ last_exec_id = /* find this stored somewhere? */;
+ exec_id = task->self_exec_id;
+ return exec_id != last_exec_id;
* Rewrite the inode's ownerships here because the owning task may have
* performed a setuid(), etc.
Does this make sense?
>> I haven't yet take the time to trace through and see if
>> task_sid(p) or smk_of_task_struct(p) could change based on
>> the security hooks called during exec. Or how bad the races are if
>> such a change can happen.
>> Does that clarify the question that is being asked?
More information about the Linux-security-module-archive