[CRIU] implementing some kind of --leave-frozen option for c/r in CRIU
Andrew Vagin
avagin at virtuozzo.com
Mon May 16 15:34:59 PDT 2016
On Tue, May 10, 2016 at 11:04:56AM -0600, Tycho Andersen wrote:
> Hi guys,
>
> I'm looking at implementing some kind of --leave-frozen option in
> CRIU, so that we can have a basic UX in LXD where we can wait for the
> restore to be successful before we kill the checkpointed container. I
> know p.haul does this by just using a callback, but it would be sort
> of painful to absorb just the callback part without doing a lot of
> extra engineering. We'll get LXD using p.haul someday, though :)
>
> The actual --leave-frozen patch is not so bad (see attached), but I'm
> not sure what to do about the network locking/unlocking bits.
>
> It seems like it is always safe to do the bits in
> cpt_unlock_tcp_connections() since that's just disabling tcp repair
> mode, but all of the iptables rules seem necessary in order to keep
> the network locked.
>
> So my question is: is there a nice way we can "tag" these rules so
> that something can come by and delete them later? I was thinking about
> having criu add a comment (via -m comment --comment "CRIU-LOCK-RULE")
> to each rule it adds, but I'm not sure if there's a better way, or if
> I've missed something entirely.
>
> Thanks!
We create a separate CRIU table when we dump/restore a network
namespace. Maybe we should do the same for the case where we don't dump
the netns, and collect all of the rules in this table.
Then we only need to drop this table to unlock the network.
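
A rough sketch of that idea, to make it concrete. This is not the actual
CRIU code: the chain name CRIU-LOCK and both helper functions are
hypothetical, and the rule text only follows the general
iptables-restore input format. It has not been tried against a live
system. The point it illustrates is that when every lock rule lives in
one dedicated chain, unlocking is a fixed sequence that needs no
knowledge of the individual rules.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical chain name, used for illustration only. */
#define LOCK_CHAIN "CRIU-LOCK"

/*
 * Fill buf with iptables-restore input that creates LOCK_CHAIN,
 * jumps to it from INPUT and OUTPUT, and drops all traffic there.
 * Returns 0 on success, -1 if buf is too small.
 */
static int build_lock_ruleset(char *buf, size_t size)
{
	int n = snprintf(buf, size,
			 "*filter\n"
			 ":%s - [0:0]\n"
			 "-I INPUT -j %s\n"
			 "-I OUTPUT -j %s\n"
			 "-A %s -j DROP\n"
			 "COMMIT\n",
			 LOCK_CHAIN, LOCK_CHAIN, LOCK_CHAIN, LOCK_CHAIN);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/*
 * Fill buf with iptables-restore input that removes the two jumps,
 * then flushes and deletes LOCK_CHAIN. Nothing here depends on which
 * rules were collected in the chain while the network was locked.
 * Returns 0 on success, -1 if buf is too small.
 */
static int build_unlock_ruleset(char *buf, size_t size)
{
	int n = snprintf(buf, size,
			 "*filter\n"
			 "-D INPUT -j %s\n"
			 "-D OUTPUT -j %s\n"
			 "-F %s\n"
			 "-X %s\n"
			 "COMMIT\n",
			 LOCK_CHAIN, LOCK_CHAIN, LOCK_CHAIN, LOCK_CHAIN);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The resulting text would presumably be piped to
"iptables-restore --noflush". An external tool such as LXD could then
run the unlock half on its own, once it knows the restore on the other
side succeeded.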
Sorry for the late response.
>
> Tycho
> From bf73295d1bc4c50bd25192ce5507e97237ea3d2b Mon Sep 17 00:00:00 2001
> From: Tycho Andersen <tycho.andersen at canonical.com>
> Date: Fri, 6 May 2016 12:51:33 -0500
> Subject: [PATCH] opts: add a --leave-frozen option
>
> In LXD, we want to invoke criu, but leave tasks in the freezer until we are
> sure that the restore on the other side worked, and then kill the tasks.
>
> Signed-off-by: Tycho Andersen <tycho.andersen at canonical.com>
> ---
> criu/cr-dump.c | 7 ++++---
> criu/crtools.c | 9 +++++++++
> criu/include/image.h | 1 +
> criu/ptrace.c | 2 ++
> criu/seize.c | 7 ++++++-
> 5 files changed, 22 insertions(+), 4 deletions(-)
>
> diff --git a/criu/cr-dump.c b/criu/cr-dump.c
> index 5ac9fd0..cae288f 100644
> --- a/criu/cr-dump.c
> +++ b/criu/cr-dump.c
> @@ -1491,7 +1491,7 @@ int cr_pre_dump_tasks(pid_t pid)
> opts.track_mem = true;
> }
>
> - if (opts.final_state == TASK_DEAD) {
> + if (opts.final_state == TASK_DEAD || opts.final_state == TASK_FROZEN) {
> pr_info("Enforcing tasks run after pre-dump.\n");
> opts.final_state = TASK_ALIVE;
> }
> @@ -1594,8 +1594,9 @@ static int cr_dump_finish(int ret)
> * consistency of the FS and other resources, we simply
> * start rollback procedure and cleanup everyhting.
> */
> - if (ret || post_dump_ret || opts.final_state == TASK_ALIVE) {
> - network_unlock();
> + if (ret || post_dump_ret || opts.final_state == TASK_ALIVE || opts.final_state == TASK_FROZEN) {
> + if (opts.final_state != TASK_FROZEN)
> + network_unlock();
> delete_link_remaps();
> }
> pstree_switch_state(root_item,
> diff --git a/criu/crtools.c b/criu/crtools.c
> index 7a0f977..262cd77 100644
> --- a/criu/crtools.c
> +++ b/criu/crtools.c
> @@ -320,6 +320,7 @@ int main(int argc, char *argv[], char *envp[])
> { "extra", no_argument, 0, 1077 },
> { "experimental", no_argument, 0, 1078 },
> { "all", no_argument, 0, 1079 },
> + { "leave-frozen", no_argument, 0, 1080 },
> { },
> };
>
> @@ -630,6 +631,9 @@ int main(int argc, char *argv[], char *envp[])
> case 'h':
> usage_error = false;
> goto usage;
> + case 1080:
> + opts.final_state = TASK_FROZEN;
> + break;
> default:
> goto usage;
> }
> @@ -650,6 +654,11 @@ int main(int argc, char *argv[], char *envp[])
> return 1;
> }
>
> + if (!opts.freeze_cgroup && opts.final_state == TASK_FROZEN) {
> + pr_msg("--leave-frozen requires --freeze-cgroup\n");
> + return 1;
> + }
> +
> if (work_dir == NULL)
> work_dir = imgs_dir;
>
> diff --git a/criu/include/image.h b/criu/include/image.h
> index f141915..eb51c4e 100644
> --- a/criu/include/image.h
> +++ b/criu/include/image.h
> @@ -118,6 +118,7 @@
> #define TASK_STOPPED 0x3
> #define TASK_HELPER 0x4
> #define TASK_THREAD 0x5
> +#define TASK_FROZEN 0x6
>
> #define CR_PARENT_LINK "parent"
>
> diff --git a/criu/ptrace.c b/criu/ptrace.c
> index 25970fc..85513fb 100644
> --- a/criu/ptrace.c
> +++ b/criu/ptrace.c
> @@ -47,6 +47,8 @@ int unseize_task(pid_t pid, int orig_st, int st)
> */
> if (orig_st == TASK_STOPPED)
> kill(pid, SIGSTOP);
> + } else if (st == TASK_FROZEN) {
> + /* don't need to send any signals */
> } else
> pr_err("Unknown final state %d\n", st);
>
> diff --git a/criu/seize.c b/criu/seize.c
> index 0ea7a28..88dd392 100644
> --- a/criu/seize.c
> +++ b/criu/seize.c
> @@ -468,8 +468,13 @@ void pstree_switch_state(struct pstree_item *root_item, int st)
> if (!root_item)
> return;
>
> - if (st != TASK_DEAD)
> + if (st == TASK_FROZEN) {
> + /* force restoring the FROZEN state */
> + freezer_thawed = false;
> freezer_restore_state();
> + } else if (st != TASK_DEAD) {
> + freezer_restore_state();
> + }
>
> /*
> * We need to detach from all processes before waiting the init
> --
> 2.7.4
>
> _______________________________________________
> CRIU mailing list
> CRIU at openvz.org
> https://lists.openvz.org/mailman/listinfo/criu