[PATCH bpf-next 6/8] libbpf: wire up BPF token support at BPF object level

Tue Dec 12 00:05:25 UTC 2023

On Mon, Dec 11, 2023 at 2:56 PM John Fastabend <john.fastabend at gmail.com> wrote:
>
> Andrii Nakryiko wrote:
> > Add BPF token support to BPF object-level functionality.
> >
> > BPF token is supported by BPF object logic either as an explicitly
> > provided BPF token from outside (through BPF FS path or explicit BPF
> > token FD), or implicitly (unless prevented through
> > bpf_object_open_opts).
> >
> > Implicit mode is assumed to be the most common one for user namespaced
> > unprivileged workloads. The assumption is that privileged container
> > manager sets up default BPF FS mount point at /sys/fs/bpf with BPF token
> > delegation options (delegate_{cmds,maps,progs,attachs} mount options).
> > BPF object during loading will attempt to create BPF token from
> > /sys/fs/bpf location, and pass it for all relevant operations
> > (currently, map creation, BTF load, and program load).
> >
> > In this implicit mode, if BPF token creation fails due to whatever
> > reason (BPF FS is not mounted, or kernel doesn't support BPF token,
> > etc), this is not considered an error. BPF object loading sequence will
> > proceed with no BPF token.
> >
> > In explicit BPF token mode, user provides explicitly either custom BPF
> > FS mount point path or creates BPF token on their own and just passes
> > token FD directly. In such case, BPF object will either dup() token FD
> > (to not require caller to hold onto it for entire duration of BPF object
> > lifetime) or will attempt to create BPF token from provided BPF FS
> > location. If BPF token creation fails, that is considered a critical
> > error and BPF object load fails with an error.
> >
> > Libbpf provides a way to disable implicit BPF token creation, if it
> > causes any troubles (BPF token is designed to be completely optional and
> > shouldn't cause any problems even if provided, but in the world of BPF
> > LSM, custom security logic can be installed that might change outcome
> > dependin on the presence of BPF token). To disable libbpf's default BPF
> > token creation behavior user should provide either invalid BPF token FD
> > (negative), or empty bpf_token_path option.
> >
> > BPF token presence can influence libbpf's feature probing, so if BPF
> > object has associated BPF token, feature probing is instructed to use
> > BPF object-specific feature detection cache and token FD.
> >
> > Signed-off-by: Andrii Nakryiko <andrii at kernel.org>
> > ---
> >  tools/lib/bpf/btf.c             |   7 +-
> >  tools/lib/bpf/libbpf.c          | 120 ++++++++++++++++++++++++++++++--
> >  tools/lib/bpf/libbpf.h          |  28 +++++++-
> >  tools/lib/bpf/libbpf_internal.h |  17 ++++-
> >  4 files changed, 160 insertions(+), 12 deletions(-)
> >
>
> ...
>
> >
> > +static int bpf_object_prepare_token(struct bpf_object *obj)
> > +{
> > +     const char *bpffs_path;
> > +     int bpffs_fd = -1, token_fd, err;
> > +     bool mandatory;
> > +     enum libbpf_print_level level = LIBBPF_DEBUG;
>
> redundant set on level?
>

yep, removed initialization

> > +
> > +     /* token is already set up */
> > +     if (obj->token_fd > 0)
> > +             return 0;
> > +     /* token is explicitly prevented */
> > +     if (obj->token_fd < 0) {
> > +             pr_debug("object '%s': token is prevented, skipping...\n", obj->name);
> > +             /* reset to zero to avoid extra checks during map_create and prog_load steps */
> > +             obj->token_fd = 0;
> > +             return 0;
> > +     }
> > +
> > +     mandatory = obj->token_path != NULL;
> > +     level = mandatory ? LIBBPF_WARN : LIBBPF_DEBUG;
> > +
> > +     bpffs_path = obj->token_path ?: BPF_FS_DEFAULT_PATH;
> > +     bpffs_fd = open(bpffs_path, O_DIRECTORY, O_RDWR);
> > +     if (bpffs_fd < 0) {
> > +             err = -errno;
> > +             __pr(level, "object '%s': failed (%d) to open BPF FS mount at '%s'%s\n",
> > +                  obj->name, err, bpffs_path,
> > +                  mandatory ? "" : ", skipping optional step...");
> > +             return mandatory ? err : 0;
> > +     }
> > +
> > +     token_fd = bpf_token_create(bpffs_fd, 0);
>
> Did this get tested on older kernels? In that case TOKEN_CREATE will
> fail with -EINVAL.

yep, I did actually test, it will generate expected *debug*-level
"failed to create BPF token" message

>
> > +     close(bpffs_fd);
> > +     if (token_fd < 0) {
> > +             if (!mandatory && token_fd == -ENOENT) {
> > +                     pr_debug("object '%s': BPF FS at '%s' doesn't have BPF token delegation set up, skipping...\n",
> > +                              obj->name, bpffs_path);
> > +                     return 0;
> > +             }
>
> Isn't there a case here we should give a warning about?  If BPF_TOKEN_CREATE
> exists and !mandatory, but default BPFFS failed for enomem, or eperm reasons?
> If the user reall/y doesn't want tokens here they should maybe override with
> -1 token? My thought is if you have delegations set up then something on the
> system is trying to configure this and an error might be ok? I'm asking just
> because I paused on it for a bit not sure either way at the moment. I might
> imagine a lazy program not specifying the default bpffs, but also really
> thinking its going to get a valid token.

Interesting perspective! I actually came from the direction that BPF
token is not really all that common and expected thing, and so in
majority of cases (at least for some time) we won't be expecting to
have BPF FS with delegation options. So emitting a warning that
"something something BPF token failed" would be disconcerting to most
users.

What's the worst that would happen if BPF token was expected but we
failed to instantiate it? You'll get a BPF object load failure with
-EPERM, so it will be a pretty clear signal that whatever delegation
was supposed to happen didn't happen.

Also, if a user wants a BPF token for sure, they can explicitly set
bpf_token_path = "/sys/fs/bpf" and then it becomes mandatory.

So tl;dr, my perspective is that most users won't know or care about
BPF tokens. If sysadmin set up BPF FS correctly, it should just work
without the BPF application being aware. But for those rare cases
where a BPF token is expected and necessary, explicit bpf_token_path
or bpf_token_fd is the way to fail early, if something is not set up
the way it is expected.

>
>
> > +             __pr(level, "object '%s': failed (%d) to create BPF token from '%s'%s\n",
> > +                  obj->name, token_fd, bpffs_path,
> > +                  mandatory ? "" : ", skipping optional step...");
> > +             return mandatory ? token_fd : 0;
> > +     }
> > +
> > +     obj->feat_cache = calloc(1, sizeof(*obj->feat_cache));
> > +     if (!obj->feat_cache) {
> > +             close(token_fd);
> > +             return -ENOMEM;
> > +     }
> > +
> > +     obj->token_fd = token_fd;
> > +     obj->feat_cache->token_fd = token_fd;
> > +
> > +     return 0;
> > +}
> > +
> >  static int
> >  bpf_object__probe_loading(struct bpf_object *obj)
> >  {
> > @@ -4601,6 +4664,7 @@ bpf_object__probe_loading(struct bpf_object *obj)
> >               BPF_EXIT_INSN(),
> >       };
> >       int ret, insn_cnt = ARRAY_SIZE(insns);
> > +     LIBBPF_OPTS(bpf_prog_load_opts, opts, .token_fd = obj->token_fd);
> >
> >       if (obj->gen_loader)
> >               return 0;
> > @@ -4610,9 +4674,9 @@ bpf_object__probe_loading(struct bpf_object *obj)
> >               pr_warn("Failed to bump RLIMIT_MEMLOCK (err = %d), you might need to do it explicitly!\n", ret);
> >
> >       /* make sure basic loading works */
> > -     ret = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL", insns, insn_cnt, NULL);
> > +     ret = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL", insns, insn_cnt, &opts);
> >       if (ret < 0)
> > -             ret = bpf_prog_load(BPF_PROG_TYPE_TRACEPOINT, NULL, "GPL", insns, insn_cnt, NULL);
> > +             ret = bpf_prog_load(BPF_PROG_TYPE_TRACEPOINT, NULL, "GPL", insns, insn_cnt, &opts);
> >       if (ret < 0) {
> >               ret = errno;
> >               cp = libbpf_strerror_r(ret, errmsg, sizeof(errmsg));
> > @@ -4635,6 +4699,9 @@ bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id)
> >                */
> >               return true;
> >
> > +     if (obj->token_fd)
> > +             return feat_supported(obj->feat_cache, feat_id);
>
> OK that answers feat_supported() non null from earlier patch. Just
> was reading in order.
>

yep, no worries, that's what I assumed :)

> > +
> >       return feat_supported(NULL, feat_id);
> >  }
>
> ...
>
> >       btf_fd = bpf_object__btf_fd(obj);
> > @@ -7050,10 +7119,10 @@ static int bpf_object_init_progs(struct bpf_object *obj, const struct bpf_object
> >  static struct bpf_object *bpf_object_open(const char *path, const void *obj_buf, size_t obj_buf_sz,
> >                                         const struct bpf_object_open_opts *opts)
> >  {
> > -     const char *obj_name, *kconfig, *btf_tmp_path;
> > +     const char *obj_name, *kconfig, *btf_tmp_path, *token_path;
> >       struct bpf_object *obj;
> >       char tmp_name[64];
> > -     int err;
> > +     int err, token_fd;
> >       char *log_buf;
> >       size_t log_size;
> >       __u32 log_level;
> > @@ -7087,6 +7156,20 @@ static struct bpf_object *bpf_object_open(const char *path, const void *obj_buf,
> >       if (log_size && !log_buf)
> >               return ERR_PTR(-EINVAL);
> >
> > +     token_path = OPTS_GET(opts, bpf_token_path, NULL);
> > +     token_fd = OPTS_GET(opts, bpf_token_fd, -1);
> > +     /* non-empty token path can't be combined with invalid token FD */
> > +     if (token_path && token_path[0] != '\0' && token_fd < 0)
> > +             return ERR_PTR(-EINVAL);
> > +     if (token_path && token_path[0] == '\0') {
> > +             /* empty token path can't be combined with valid token FD */
> > +             if (token_fd > 0)
> > +                     return ERR_PTR(-EINVAL);
> > +             /* empty token_path is equivalent to invalid token_fd */
> > +             token_path = NULL;
> > +             token_fd = -1;
> > +     }
> > +
> >       obj = bpf_object__new(path, obj_buf, obj_buf_sz, obj_name);
> >       if (IS_ERR(obj))
> >               return obj;
> > @@ -7095,6 +7178,23 @@ static struct bpf_object *bpf_object_open(const char *path, const void *obj_buf,
> >       obj->log_size = log_size;
> >       obj->log_level = log_level;
> >
> > +     obj->token_fd = token_fd <= 0 ? token_fd : dup_good_fd(token_fd);
> > +     if (token_fd > 0 && obj->token_fd < 0) {
> > +             err = -errno;
> > +             goto out;
> > +     }
> > +     if (token_path) {
> > +             if (strlen(token_path) >= PATH_MAX) {
>
> small nit, might be cleaner to just have this up where the other sanity
> checks are done? e.g.
>
>    `token_path[0] !=` `\0` && token_path(token_path) < PATH_MAX`
>
> just to abort earlier. But not sure I care much.

yep, makes sense, I'll move ENAMETOOLONG up

>
> > +                     err = -ENAMETOOLONG;
> > +                     goto out;
> > +             }
> > +             obj->token_path = strdup(token_path);
> > +             if (!obj->token_path) {
> > +                     err = -ENOMEM;
> > +                     goto out;
> > +             }
> > +     }
> > +
> >       btf_tmp_path = OPTS_GET(opts, btf_custom_path, NULL);
> >       if (btf_tmp_path) {
> >               if (strlen(btf_tmp_path) >= PATH_MAX) {

[...]