[CRIU] OverlayFS Patch

Wed Aug 5 14:36:44 PDT 2015

More specifically, I'm looking at this code:

/**
 * ovl_statfs
 * @sb: The overlayfs super block
 * @buf: The struct kstatfs to fill in with stats
 *
 * Get the filesystem statistics.  As writes always target the upper layer
 * filesystem pass the statfs to the same filesystem.
 */
static int ovl_statfs(struct dentry *dentry, struct kstatfs *buf)
{
        struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
        struct dentry *root_dentry = dentry->d_sb->s_root;
        struct path path;
        int err;

        ovl_path_upper(root_dentry, &path);

        err = vfs_statfs(&path, buf);
        if (!err) {
                buf->f_namelen = max(buf->f_namelen, ofs->lower_namelen);
                buf->f_type = OVERLAYFS_SUPER_MAGIC;
        }

        return err;
}

So, indeed, it's forwarding the call to the upper layer (since it's
where writes happen) but before returning it's overwriting f_type with
OVERLAYFS_SUPER_MAGIC which is the constant with the value I posted
above...

Cheers,
Filipe

On Wed, Aug 5, 2015 at 2:34 PM, Filipe Brandenburger
<filbranden at google.com> wrote:
> Can you try a directory instead of a file?
>
> Either a subdirectory or the root of the filesystem (where it's mounted.)
>
>
> On Wed, Aug 5, 2015 at 2:31 PM, Gabriel Guimaraes
> <gabriellimaguimaraes at gmail.com> wrote:
>> Hi Filipe,
>>
>> I just checked the value of p->fs_type when referring to /tmp/file within a
>> Docker container, using overlayFS. It reports
>>
>> p->fs_type = 0xef53 which I believe is the magic number of ext2/3/4
>>
>> 2015-08-05 14:25 GMT-07:00 Filipe Brandenburger <filbranden at google.com>:
>>>
>>> Hi Gabriel,
>>>
>>> Looking at the code, I believe it f_type in statfs() output will
>>> return this value for overlayfs:
>>>
>>> fs/overlayfs/super.c:#define OVERLAYFS_SUPER_MAGIC 0x794c7630
>>>
>>> You can easily double check that with a test program running it
>>> against a mounted overlayfs.
>>>
>>> Cheers,
>>> Filipe
>>>
>>>
>>> On Wed, Aug 5, 2015 at 2:19 PM, Gabriel Guimaraes
>>> <gabriellimaguimaraes at gmail.com> wrote:
>>> >
>>> >
>>> > 2015-08-05 13:36 GMT-07:00 Pavel Emelyanov <xemul at parallels.com>:
>>> >>
>>> >> On 08/05/2015 08:51 PM, Gabriel Guimaraes wrote:
>>> >> > Hi Pavel,
>>> >> >
>>> >> > Thanks for the review.
>>> >> >
>>> >> > On Wed, Aug 5, 2015 at 4:00 AM, Pavel Emelyanov <xemul at parallels.com
>>> >> > <mailto:xemul at parallels.com>> wrote:
>>> >> >
>>> >> >     Gabriel,
>>> >> >
>>> >> >     Please, find my comments inline.
>>> >> >
>>> >> >     > diff --git a/crtools.c b/crtools.c
>>> >> >     > index 6af6080..521d0ff 100644
>>> >> >     > --- a/crtools.c
>>> >> >     > +++ b/crtools.c
>>> >> >     > @@ -235,6 +235,7 @@ int main(int argc, char *argv[], char
>>> >> > *envp[])
>>> >> >     >               { "enable-fs",
>>> >> > required_argument,
>>> >> > 0, 1065 },
>>> >> >     >               { "enable-external-sharing",    no_argument,
>>> >> > 0, 1066 },
>>> >> >     >               { "enable-external-masters",    no_argument,
>>> >> > 0, 1067 },
>>> >> >     > +             { "overlayfs",                  no_argument,
>>> >> > 0, 1068 },
>>> >> >
>>> >> >     With AUFS workaround we ended up with auto-detection scheme --
>>> >> > when
>>> >> >     meeting an AUFS mount the opts.aufs is set to true. Can we do the
>>> >> >     same with overlayfs?
>>> >> >
>>> >> >
>>> >> > Technically we could always enable overlayfs by default, since we
>>> >> > only
>>> >> > do something if there
>>> >> > is an overlayfs mounted entry in the mountinfo table. Should I enable
>>> >> > it
>>> >> > always?
>>> >>
>>> >> In the fill_fdlink() there's fd_parms structure with fs_type field
>>> >> filled
>>> >> from
>>> >> statfs() syscall. If we check for this value being overlayfs magic, we
>>> >> can
>>> >> silently
>>> >> imply the workaround is on and go calling fixup_overlayfs().
>>> >>
>>> > Per man statfs, I don't think Overlayfs has a magic number. OverlayFS
>>> > just
>>> > overlays two filesystems on top of each other. From statfs' point of
>>> > view, a
>>> > file's filesystem type is the original one, not OverlayFS.
>>> >
>>> > More information here:
>>> > https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt
>>> >>
>>> >> >     > +static struct mount_info *__lookup_overlayfs(struct mount_info
>>> >> > *list, char *rpath,
>>> >> >     > +                                             unsigned int
>>> >> > st_dev,
>>> >> > unsigned int st_ino,
>>> >> >     > +                                             unsigned int
>>> >> > mnt_id)
>>> >> >     > +{
>>> >> >     > +     /*
>>> >> >     > +      * Goes through all entries in the mountinfo table
>>> >> >     > +      * looking for a mount point that contains the file
>>> >> > specified
>>> >> >     > +      * in rpath. Uses the device number st_dev and the inode
>>> >> > number st_ino
>>> >> >     > +      * to make sure the file is correct.
>>> >> >     > +      */
>>> >> >     > +     struct mount_info *mi_ret = NULL;
>>> >> >     > +     struct mount_info *m;
>>> >> >     > +
>>> >> >     > +     for (m = list; m != NULL; m = m->next) {
>>> >> >     > +             /* Case 1: If mnt_id is correct, don't do
>>> >> > anything
>>> >> > */
>>> >> >     > +             if (st_dev == m->s_dev && mnt_id == m->mnt_id)
>>> >> >     > +                     return NULL;
>>> >> >
>>> >> >     Shouldn't this if () be under the next one, so that everything
>>> >> > that
>>> >> > happens
>>> >> >     here applies to overlayfs only?
>>> >> >
>>> >> >
>>> >> > This if () is basically a check to see if the bug exists. We look for
>>> >> > all instances of the device
>>> >> > st_dev and see if any of them match our mnt_id. If that's not the
>>> >> > case,
>>> >> > the bug exists and we want
>>> >> > to proceed to try to fix it.
>>> >>
>>> >> OK
>>> >>
>>> >> > I could have left this as a separate for loop to make this more
>>> >> > clear,
>>> >> > but this way we only use
>>> >> > one for loop for both. I believe it's better if we separate it into
>>> >> > two
>>> >> > for loops since we prevent
>>> >> > some unnecessary stating. I can also move this part to
>>> >> > lookup_overlayfs,
>>> >> > before dealing with
>>> >> > namespaces. What do you think?
>>> >>
>>> >> Yup, makes sense to me.
>>> >>
>>> >
>>
>>