[Devel] Re: [C/R v20][PATCH 37/96] c/r: introduce new 'file_operations': ->checkpoint, ->collect()
Matt Helsley
matthltc at us.ibm.com
Mon Mar 22 03:16:35 PDT 2010
On Mon, Mar 22, 2010 at 05:34:28PM +1100, Nick Piggin wrote:
> On Thu, Mar 18, 2010 at 08:59:46PM -0400, Oren Laadan wrote:
> > While we assume all normal files and directories can be checkpointed,
> > there are, as usual in the VFS, specialized places that will always
> > need an ability to override these defaults. Although we could do this
> > completely in the checkpoint code, that would bitrot quickly.
> >
> > This adds a new 'file_operations' function for checkpointing a file.
> > It is assumed that there should be a dirt-simple way to make something
> > (un)checkpointable that fits in with current code.
> >
> > As you can see in the ext[234] patches down the road, all that we have
> > to do to make something simple be supported is add a single "generic"
> > f_op entry.
> >
> > Also adds a new 'file_operations' function for 'collecting' a file for
> > leak-detection during full-container checkpoint. This is useful for
> > those files that hold references to other "collectable" objects. Two
> > examples are pty files that point to corresponding tty objects, and
> > eventpoll files that refer to the files they are monitoring.
> >
> > Finally, this patch introduces vfs_fcntl() so that it can be called
> > from restart (see patch adding restart of files).
> >
> > Changelog[v17]
> > - Introduce 'collect' method
> > Changelog[v17]
> > - Forward-declare 'ckpt_ctx' et-al, don't use checkpoint_types.h
> >
> > Signed-off-by: Oren Laadan <orenl at cs.columbia.edu>
> > Acked-by: Serge E. Hallyn <serue at us.ibm.com>
> > Tested-by: Serge E. Hallyn <serue at us.ibm.com>
> > ---
> > fs/fcntl.c | 21 +++++++++++++--------
> > include/linux/fs.h | 7 +++++++
> > 2 files changed, 20 insertions(+), 8 deletions(-)
> >
> > diff --git a/fs/fcntl.c b/fs/fcntl.c
> > index 97e01dc..e1f02ca 100644
> > --- a/fs/fcntl.c
> > +++ b/fs/fcntl.c
> > @@ -418,6 +418,18 @@ static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
> > return err;
> > }
> >
> > +int vfs_fcntl(int fd, unsigned int cmd, unsigned long arg, struct file *filp)
> > +{
> > + int err;
> > +
> > + err = security_file_fcntl(filp, cmd, arg);
> > + if (err)
> > + goto out;
> > + err = do_fcntl(fd, cmd, arg, filp);
> > + out:
> > + return err;
> > +}
> > +
> > SYSCALL_DEFINE3(fcntl, unsigned int, fd, unsigned int, cmd, unsigned long, arg)
> > {
> > struct file *filp;
> > @@ -427,14 +439,7 @@ SYSCALL_DEFINE3(fcntl, unsigned int, fd, unsigned int, cmd, unsigned long, arg)
> > if (!filp)
> > goto out;
> >
> > - err = security_file_fcntl(filp, cmd, arg);
> > - if (err) {
> > - fput(filp);
> > - return err;
> > - }
> > -
> > - err = do_fcntl(fd, cmd, arg, filp);
> > -
> > + err = vfs_fcntl(fd, cmd, arg, filp);
> > fput(filp);
> > out:
> > return err;
>
> There is no point combining these two logically distinct patches.
Good point.
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index 6c08df2..65ebec5 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -394,6 +394,7 @@ struct kstatfs;
> > struct vm_area_struct;
> > struct vfsmount;
> > struct cred;
> > +struct ckpt_ctx;
> >
> > extern void __init inode_init(void);
> > extern void __init inode_init_early(void);
> > @@ -1093,6 +1094,8 @@ struct file_lock {
> >
> > #include <linux/fcntl.h>
> >
> > +extern int vfs_fcntl(int fd, unsigned cmd, unsigned long arg, struct file *fp);
> > +
> > extern void send_sigio(struct fown_struct *fown, int fd, int band);
> >
> > #ifdef CONFIG_FILE_LOCKING
> > @@ -1504,6 +1507,8 @@ struct file_operations {
> > ssize_t (*splice_write)(struct pipe_inode_info *, struct file *, loff_t *, size_t, unsigned int);
> > ssize_t (*splice_read)(struct file *, loff_t *, struct pipe_inode_info *, size_t, unsigned int);
> > int (*setlease)(struct file *, long, struct file_lock **);
> > + int (*checkpoint)(struct ckpt_ctx *, struct file *);
> > + int (*collect)(struct ckpt_ctx *, struct file *);
> > };
> >
> > struct inode_operations {
>
> You didn't add any documentation for this (unless it is in a following
> patch, which it shouldn't be).
Another good point -- we should have added that to
Documentation/filesystems/vfs.txt
>
> > @@ -2313,6 +2318,8 @@ void inode_sub_bytes(struct inode *inode, loff_t bytes);
> > loff_t inode_get_bytes(struct inode *inode);
> > void inode_set_bytes(struct inode *inode, loff_t bytes);
> >
> > +#define generic_file_checkpoint NULL
> > +
> > extern int vfs_readdir(struct file *, filldir_t, void *);
> >
> > extern int vfs_stat(char __user *, struct kstat *);
>
> Hmm, what does generic_file_checkpoint mean? A NULL checkpoint op means
> that checkpointing is allowed, and no action is required? Shouldn't it
> be an opt-in operation, where NULL means not allowed?
generic_file_checkpoint is for files that have a seek operation and can be
backed up or restored with a simple copy.
A NULL checkpoint op means "not allowed" as you thought it should. What
gave you the impression it was otherwise? Here's the relevant snippet
from checkpoint/files.c:
/* checkpoint callback for file pointer */
int checkpoint_file(struct ckpt_ctx *ctx, void *ptr)
{
struct file *file = (struct file *) ptr;
int ret;
if (!file->f_op || !file->f_op->checkpoint) {
ckpt_err(ctx, -EBADF, "%(T)%(P)%(V)f_op lacks checkpoint\n",
file, file->f_op);
return -EBADF;
}
> Either way, I don't know if you need to have this #define, provided you
> have sufficient documentation.
We need it (or a suitable replacement) to avoid adding #ifdef around
assignments to the operation in every filesystem. It's used if
CONFIG_CHECKPOINT is not defined.
Thanks for the review.
Cheers,
-Matt Helsley
_______________________________________________
Containers mailing list
Containers at lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
More information about the Devel
mailing list