Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"

Eric Biggers ebiggers at kernel.org
Thu Oct 22 20:59:32 UTC 2020


On Thu, Oct 22, 2020 at 10:00:44AM -0700, Nick Desaulniers wrote:
> On Thu, Oct 22, 2020 at 9:40 AM Matthew Wilcox <willy at infradead.org> wrote:
> >
> > On Thu, Oct 22, 2020 at 04:35:17PM +0000, David Laight wrote:
> > > Wait...
> > > readv(2) defines:
> > >       ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
> >
> > It doesn't really matter what the manpage says.  What does the AOSP
> > libc header say?
> 
> Same: https://android.googlesource.com/platform/bionic/+/refs/heads/master/libc/include/sys/uio.h#38
> 
> Theoretically someone could bypass libc to make a system call, right?
> 
> >
> > > But the syscall is defined as:
> > >
> > > SYSCALL_DEFINE3(readv, unsigned long, fd, const struct iovec __user *, vec,
> > >                 unsigned long, vlen)
> > > {
> > >         return do_readv(fd, vec, vlen, 0);
> > > }
> >
> 

FWIW, glibc makes the readv() syscall assuming that fd and vlen are 'int' as
well.  So this problem isn't specific to Android's libc.

From objdump -d /lib/x86_64-linux-gnu/libc.so.6:

	00000000000f4db0 <readv@@GLIBC_2.2.5>:
	   f4db0:       64 8b 04 25 18 00 00    mov    %fs:0x18,%eax
	   f4db7:       00
	   f4db8:       85 c0                   test   %eax,%eax
	   f4dba:       75 14                   jne    f4dd0 <readv@@GLIBC_2.2.5+0x20>
	   f4dbc:       b8 13 00 00 00          mov    $0x13,%eax
	   f4dc1:       0f 05                   syscall
	   ...

There's some code for pthread cancellation, but no zeroing of the upper halves
of the registers holding the fd and vlen arguments, which are passed in %edi
and %edx respectively.  That is consistent with the glibc function prototype,
which uses 'int' for them, not 'unsigned long':
'ssize_t readv(int fd, const struct iovec *iov, int iovcnt);'.

So the high halves of the fd and iovcnt registers can contain garbage.  Or at
least that's what gcc (9.3.0) and clang (9.0.1) assume; they both compile the
following

void g(unsigned int x);

void f(unsigned long x)
{
        g(x);
}

into f() making a tail call to g(), without zeroing the top half of %rdi.

Also note the following program succeeds on Linux 5.9 on x86_64.  On kernels
that have this bug, it should fail.  (I couldn't get it to actually fail, so it
must depend on the compiler and/or the kernel config...)

	#include <fcntl.h>
	#include <stdio.h>
	#include <sys/syscall.h>
	#include <sys/uio.h>
	#include <unistd.h>

	int main()
	{
		int fd = open("/dev/zero", O_RDONLY);
		char buf[1000];
		struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
		long ret;

		ret = syscall(__NR_readv, fd, &iov, 0x100000001);
		if (ret < 0)
			perror("readv failed");
		else
			printf("read %ld bytes\n", ret);
	}

I think the right fix is to change the readv() (and writev(), etc.) syscalls to
take 'unsigned int' rather than 'unsigned long', as that is what the users are
assuming...

- Eric


