[PATCH v14 22/23] LSM: Add /proc attr entry for full LSM context

Mon Feb 10 18:32:17 UTC 2020

On 2/10/2020 6:55 AM, Stephen Smalley wrote:
> On 2/10/20 8:25 AM, Stephen Smalley wrote:
>> On 2/10/20 6:56 AM, Simon McVittie wrote:
>>> On Mon, 03 Feb 2020 at 13:54:45 -0500, Stephen Smalley wrote:
>>>> The printable ASCII bit is based on what the dbus maintainer requested in
>>>> previous discussions.
>>>
>>> I thought in previous discussions, we had come to the conclusion that
>>> I can't assume it's 7-bit ASCII. (If I *can* assume that for this new
>>> API, that's even better.)
>>>
>>> To be clear, when I say ASCII I mean a sequence of bytes != '\0' with
>>> their high bit unset ((x & 0x7f) == x) and the obvious mapping to/from
>>> Unicode (bytes '\1' to '\x7f' represent codepoints U+0001 to U+007F). Is
>>> that the same thing you mean?
>>
>> I mean the subset of 7-bit ASCII that satisfies isprint() using the "C" locale.  That is already true for SELinux with the existing interfaces. I can't necessarily speak for the others.
>
> Looks like Smack labels are similarly restricted, per Documentation/admin-guide/LSM/Smack.rst.  So I guess the only one that is perhaps unclear is AppArmor, since its labels are typically derived from pathnames?  Can an AppArmor label returned via its getprocattr() hook be any legal pathname?

Because attr/context (and later, SO_PEERCONTEXT) are new interfaces,
there is no need to exactly duplicate what is in attr/current (later
SO_PEERSEC). I already plan to omit the "mode" component of the
AppArmor data in the AppArmor hook, as was discussed earlier. I would
prefer ASCII, but if AppArmor needs bytestrings, that's what we'll
have to do.
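
To make the two candidate constraints concrete, here is a rough userspace
sketch of the checks being discussed. It is only an illustration: the
function names are hypothetical and nothing like this is in the patch set.
The first check is Simon's definition of ASCII (no NUL bytes, high bit
unset). The second is Stephen's stricter one, which I am assuming is
equivalent to the byte range 0x20 through 0x7e, since that is what isprint()
accepts in the "C" locale.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of any kernel or library API.
 * True if every byte is non-NUL and has its high bit unset.
 */
bool ctx_is_ascii(const char *ctx, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		unsigned char c = (unsigned char)ctx[i];

		if (c == 0 || (c & 0x7f) != c)
			return false;
	}
	return true;
}

/*
 * Hypothetical helper, not part of any kernel or library API.
 * True if every byte is printable ASCII, assumed here to mean
 * 0x20 (space) through 0x7e ('~'), matching isprint() in the "C" locale.
 */
bool ctx_is_printable_ascii(const char *ctx, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		unsigned char c = (unsigned char)ctx[i];

		if (c < 0x20 || c > 0x7e)
			return false;
	}
	return true;
}
```

A "bytestring" in the GLib sense would pass neither check in general. The
only thing it promises is the absence of '\0', so a caller could not rely on
either function returning true.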

>
>>> I thought the conclusion we had come to in previous conversations was
>>> that the LSM context is what GLib calls a "bytestring", the same as
>>> filenames and environment variables - an opaque sequence of bytes != '\0',
>>> with no further guarantees, and no specified encoding or mapping to/from
>>> Unicode (most likely some superset of ASCII like UTF-8 or Latin-1,
>>> but nobody knows which one, and they could equally well be some binary
>>> encoding with no Unicode meaning, as long as it avoids '\0').
>>>
>>> If I can safely assume that a new kernel <-> user-space API is constrained
>>> to UTF-8 or a UTF-8 subset like ASCII, then I can provide more friendly
>>> APIs for user-space features built over it. If that isn't possible, the
>>> next best thing is a "bytestring" like filenames, environment variables,
>>> and most kernel <-> user-space strings in general.
>>>
>>>      smcv
>>>
>>
>