[CRIU] Looking into checkpoint/restore of ROCm applications

Felix Kuehling felix.kuehling at gmail.com
Tue Jun 16 03:59:08 MSK 2020


Hi all,

I'm investigating the possibility of making CRIU work with ROCm, the AMD
Radeon Open Compute Platform. I need some advice, but I'll give you some
background first.

ROCm uses the /dev/kfd device as well as /dev/dri/renderD*. I'm planning
to do most of the state saving using /dev/kfd with a cr_plugin_dump_file
callback in a plugin. I've spent some time reading the documentation on
criu.org and also the CRIU source code. At this point I believe I have a
fairly good understanding of the low-level details of saving the
kernel-mode state associated with ROCm processes.

I have more trouble with restoring the state. The main issue is the way
KFD maps system memory for device access using HMM (or get_user_pages
and MMU notifiers with DKMS on older kernels). This requires the VMAs to
be at the expected virtual addresses before we try to mirror them into
the GPU page table. Resuming execution on the GPU also needs to be
delayed until after the GPU memory mappings have been restored.

At the time of the cr_plugin_restore_file callback, the VMAs are not at
the right place in the restored process, so this is too early to restore
the GPU memory mappings. I can send the mappings and their properties to
KFD, but KFD would then need to wait for some later trigger event before
it activates the mappings and their MMU notifiers.
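
The deferred activation described above can be pictured as a two-phase
scheme. The following is only a user-space model with hypothetical names
(restore_ctx, stage_mapping, activate_staged); it is not KFD code and does
not reflect any existing KFD interface.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_STAGED 16

/* One mapping recorded at restore_file time but not yet mirrored
 * into the GPU page table. */
struct staged_mapping {
	uint64_t va;     /* expected virtual address of the VMA */
	uint64_t size;   /* length in bytes */
	uint32_t flags;  /* mapping properties (opaque in this model) */
	int active;      /* set once the trigger event has fired */
};

struct restore_ctx {
	struct staged_mapping maps[MAX_STAGED];
	size_t count;
	int triggered;
};

/* Phase 1: record a mapping. Rejected once the trigger has fired or
 * when the table is full. */
static int stage_mapping(struct restore_ctx *ctx, uint64_t va,
			 uint64_t size, uint32_t flags)
{
	if (ctx->triggered || ctx->count == MAX_STAGED)
		return -1;

	ctx->maps[ctx->count++] = (struct staged_mapping){
		.va = va, .size = size, .flags = flags, .active = 0,
	};
	return 0;
}

/* Phase 2: the trigger event. Marks every staged mapping active and
 * returns how many were activated; a second call activates nothing. */
static size_t activate_staged(struct restore_ctx *ctx)
{
	size_t i;

	if (ctx->triggered)
		return 0;

	for (i = 0; i < ctx->count; i++)
		ctx->maps[i].active = 1;
	ctx->triggered = 1;

	return ctx->count;
}
```

In this model, GPU execution would only be resumed after activate_staged()
returns. The open question is what event should call it.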

So this is my question: What would be a good trigger event to indicate
that VMAs have been moved to their proper location by the restorer
parasite code? I have considered two possibilities that will not work.
I'm hoping you can give me some better ideas:

  * cr_plugin_fini
      o Doesn't get called in all the child processes, and I'm not sure
        whether it is synchronized with the child processes' restore
        completion
  * An MMU notifier on the munmap of the restorer parasite blob itself
      o In cr_plugin_restore_file this address is not known yet

I noticed that the child processes are resumed through sigreturn. I'm
not familiar with this mechanism. Does this mean there is some signal I
may be able to intercept just before execution of the child process resumes?

Thank you in advance for your insights.

Best regards,
  Felix

