Unprivileged filesystem mounts
Theodore Ts'o
tytso at mit.edu
Wed Mar 19 21:25:17 UTC 2025
On Wed, Mar 19, 2025 at 01:44:13PM -0400, Demi Marie Obenour wrote:
> > Note that this won't help if you have malicious hardware that
> > *pretends* to be a USB storage device, but which doesn't behave like
> > an honest storage device. For example, a read of a particular
> > sector might return one set of data at time T, and different data
> > at time T+X, with no intervening writes. There is no real defense
> > against this attack, since there is no way that you can authenticate
> > the external storage device; you could have a registry of USB vendor
> > and model IDs, but a device can always lie about its ID numbers.
>
> This attack can be defended against by sandboxing the filesystem driver
> and copying files to trusted storage before using them. You can
> authenticate devices based on what port they are plugged into, and Qubes
> OS is working on exactly that.
Copying files to trusted storage is not sufficient. The problem is
that an untrustworthy storage device can still play games with the
metadata blocks. If you are willing to copy the entire storage device
to trustworthy storage, then run fsck on the file system, and then
mount it, then *sure*, that would help. But if the storage device is
very large or very slow, this might not be practical.
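To make that concrete, here is a rough sketch (purely illustrative;
not code from any kernel or from Qubes) of the whole-device-copy
defense. The point is that every sector is read from the untrusted
device exactly once, so a device that answers differently at time T+X
never gets asked the same question twice; the paths and the 4K block
size are just placeholder assumptions.

#include <stdio.h>

/* Snapshot an untrusted block device into an image file on trusted
 * storage, reading the device in a single sequential pass. */
static int snapshot_device(const char *untrusted_dev, const char *trusted_img)
{
	unsigned char buf[4096];
	size_t n;
	FILE *in = fopen(untrusted_dev, "rb");
	FILE *out = fopen(trusted_img, "wb");

	if (!in || !out) {
		perror("fopen");
		if (in)
			fclose(in);
		if (out)
			fclose(out);
		return -1;
	}

	/* No sector is ever re-read from the untrusted device, so
	 * time-varying answers for the same sector buy the attacker
	 * nothing once the copy completes. */
	while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
		if (fwrite(buf, 1, n, out) != n) {
			perror("fwrite");
			fclose(in);
			fclose(out);
			return -1;
		}
	}

	fclose(in);
	fclose(out);
	return 0;
}

int main(int argc, char **argv)
{
	if (argc != 3) {
		fprintf(stderr, "usage: %s <untrusted-dev> <trusted-img>\n",
			argv[0]);
		return 1;
	}
	return snapshot_device(argv[1], argv[2]) ? 1 : 0;
}

After the copy you would fsck the *image* and then loop-mount it,
never the raw device. And of course, for a large or slow device this
single sequential pass is precisely the impracticality I mentioned
above.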
> > Like everything else, security and usability and performance and costs
> > are all engineering tradeoffs....
>
> Is the tradeoff fundamental, or is it a consequence of Linux being a
> monolithic kernel? If Linux were a microkernel and every filesystem
> driver ran as a userspace process with no access to anything but the
> device it is accessing, then there would be no tradeoff when it comes to
> filesystems: a compromised filesystem driver would have no more access
> than the device itself would, so compromising a filesystem driver would
> be of much less value to an attacker. There is still the problem that
> plug and play is incompatible with not trusting devices to identify
> themselves, but that's a different concern.
Microkernels have historically been a performance disaster. Yes, you
can invest a *vast* amount of effort into trying to make a microkernel
OS more performant, but in the meantime, the competing monolithic
kernel will have gotten even faster, or added more features, leaving
the microkernel in the dust.
The effort needed to create a new file system from scratch, taking it
all the way from initial design, through implementation, testing, and
performance tuning, to something customers are comfortable depending
on for enterprise workloads, is between 50 and 100 engineer-years.
This estimate came from looking at the development effort needed for
various file systems implemented on monolithic kernels, including
Digital's Advfs (part of Digital Unix and OSF/1), IBM's AIX, and
Sun's ZFS, as well as IBM's GPFS (although that was a cluster file
system, and the effort estimate, from my talking to the engineering
managers and tech leads, was around 200 person-years).
I'm not sure how much harder it would be to make a file system which
is suitable for enterprise workloads from a performance, feature, and
stability perspective, *and* make it secure against storage devices
which are outside the TCB, *and* make it work on a microkernel. But
I'm going to guess it would inflate these effort estimates by at
least 50%, if not more.
Of course, if we're just writing a super-simple file system that is
suitable for backups and file transfers, but not much else, that
would probably take much less effort. But if we need to support file
exchange with storage devices formatted with NTFS or HFS, those
aren't simple file systems. So the VM sandbox approach might still be
the better way to go.
Cheers,
- Ted