[CRIU] dump/restore of POSIX named semaphores
Rainhard Driessler
rainhard.driessler at artech.at
Tue Dec 14 15:03:49 MSK 2021
I am trying to dump/restore a process running third-party software that is now using POSIX named semaphores after a recent update.
After initial dump/restores failed due to unlinked, seemingly random /dev/shm/* files, I took a deep dive into how POSIX named semaphores are implemented.
sem_open first creates a randomly named file (mktemp), links it to the desired sem.X name, and then unlinks the random name.
I wrote a simple producer/consumer process pair that increments/decrements a System V shared-memory counter protected by a POSIX named semaphore to
simulate the production software, and tried to dump/restore it using the latest build (criu-devel).
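The create, link, unlink sequence described above can be reproduced without any semaphore code. The following is only a sketch of that sequence, not the libc implementation: the function names, the 32-byte size (taken from the ls output of sem.pcsync) and the demo file name are assumptions made for illustration.

```python
import mmap
import os
import tempfile


def create_like_sem_open(directory, final_name, size=32):
    """Mimic the temp-file -> link -> unlink sequence (illustrative only)."""
    # 1. Create a randomly named file (stand-in for the mktemp step).
    fd, tmp_path = tempfile.mkstemp(prefix="sem.", dir=directory)
    os.ftruncate(fd, size)
    # 2. Map the file while it is only reachable under the random name.
    mapping = mmap.mmap(fd, size)
    os.close(fd)
    # 3. Publish the file under its final name, then drop the random name.
    final_path = os.path.join(directory, final_name)
    os.link(tmp_path, final_path)
    os.unlink(tmp_path)
    return mapping, tmp_path, final_path


def mapped_paths(needle):
    """Return the path column of /proc/self/maps entries that contain needle."""
    if not os.path.exists("/proc/self/maps"):
        return []
    paths = []
    with open("/proc/self/maps") as maps:
        for line in maps:
            fields = line.split(None, 5)
            if len(fields) == 6 and needle in fields[5]:
                paths.append(fields[5].strip())
    return paths


if __name__ == "__main__":
    base = "/dev/shm" if os.path.isdir("/dev/shm") else None
    with tempfile.TemporaryDirectory(dir=base) as workdir:
        mapping, tmp_path, final_path = create_like_sem_open(workdir, "sem.demo")
        print("final name exists:", os.path.exists(final_path))
        print("random name exists:", os.path.exists(tmp_path))
        print("mapping is reported as:", mapped_paths(os.path.basename(tmp_path)))
        mapping.close()
```

The mapping was made while the file was known only by its random name, so the kernel keeps reporting that name with a "(deleted)" suffix even though the same inode is still reachable under the final name.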
After starting the producer process, the named semaphore is available in /dev/shm as expected:
raini@rd-devel:~/Development/semtest$ sudo setsid unshare -i ./producer
raini@rd-devel:~/Development/semtest$ ls -lhai /dev/shm/*
5 -rw-r--r-- 1 root root 32 Dez 13 17:32 /dev/shm/sem.pcsync
The problematic links can already be seen in map_files:
raini@rd-devel:~/Development/semtest$ sudo ls -l /proc/$(pidof producer)/map_files/
...
lrw------- 1 root root 64 Dez 13 17:33 7f599057e000-7f599057f000 -> '/SYSV00000539 (deleted)'
lrw------- 1 root root 64 Dez 13 17:33 7f599057f000-7f5990580000 -> '/dev/shm/sem.4y8Ymq (deleted)'
...
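To spot such mappings before attempting a dump, the same information can be read from /proc/<pid>/maps. The helper below is a hypothetical diagnostic and not part of criu; its name and the /dev/shm/ prefix default are assumptions. It only filters lines whose path column ends in " (deleted)".

```python
import sys


def deleted_shm_mappings(maps_text, prefix="/dev/shm/"):
    """Return (address_range, path) pairs for mappings of unlinked files under prefix."""
    suffix = " (deleted)"
    found = []
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping: no path column
        path = fields[5].strip()
        if path.startswith(prefix) and path.endswith(suffix):
            found.append((fields[0], path[:-len(suffix)]))
    return found


if __name__ == "__main__":
    # Usage: python3 this_script.py <pid>   (defaults to the script itself)
    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    with open(f"/proc/{pid}/maps") as maps:
        for address_range, path in deleted_shm_mappings(maps.read()):
            print(address_range, path)
```

Reading another user's /proc/<pid>/maps typically requires root, as with the map_files listing above.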
Dumping the process fails because of the unlinked temporary file created in sem_open:
raini@rd-devel:~/Development/semtest/dump$ sudo criu dump -t $(pidof producer) --shell-job -D . -vvvv
...
(00.003612) Dumping path for -3 fd via self 12 [/dev/shm/sem.4y8Ymq (deleted)]
(00.003618) Strip ' (deleted)' tag from './dev/shm/sem.4y8Ymq (deleted)'
(00.003624) Error (criu/files-reg.c:991): Can't create link remap for /dev/shm/sem.4y8Ymq. Use link-remap option.
(00.003629) Error (criu/cr-dump.c:1269): Collect mappings (pid: 1895) failed with -1
(00.003672) Unlock network
(00.003677) Unfreezing tasks into 1
(00.003679) Unseizing 1895 into 1
(00.003700) Error (criu/cr-dump.c:1788): Dumping FAILED.
Using the link-remap option as instructed, the dump succeeds and files.img contains both the randomly named semaphore file
and a link remap file, but not the named semaphore file "sem.pcsync" as created by the producer process:
raini@rd-devel:~/Development/semtest/dump$ sudo crit decode --pretty -i files.img | grep shm
"name": "/dev/shm/link_remap.4"
"name": "/dev/shm/sem.4y8Ymq",
Since the semaphore file is never removed (no sem_unlink), it persists, together with the link_remap file, which is not cleaned up:
raini@rd-devel:~/Development/semtest/dump$ ls -lhai /dev/shm/
total 8,0K
1 drwxrwxrwt 2 root root 80 Dez 13 17:42 .
1 drwxr-xr-x 20 root root 4,2K Dez 13 17:13 ..
5 -rw-r--r-- 2 root root 32 Dez 13 17:32 link_remap.4
5 -rw-r--r-- 2 root root 32 Dez 13 17:32 sem.pcsync
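In the listing above, both entries share inode 5 and have a link count of 2, so link_remap.4 is a second hard link to the same file as sem.pcsync. A small sketch for checking this programmatically follows; the function name is hypothetical and the script only inspects a directory.

```python
import os
from collections import defaultdict


def hard_link_groups(directory):
    """Map inode number -> sorted names, for inodes with more than one name in directory."""
    by_inode = defaultdict(list)
    for entry in os.scandir(directory):
        if entry.is_file(follow_symlinks=False):
            by_inode[entry.inode()].append(entry.name)
    return {ino: sorted(names) for ino, names in by_inode.items() if len(names) > 1}


if __name__ == "__main__":
    target = "/dev/shm" if os.path.isdir("/dev/shm") else "."
    for inode, names in hard_link_groups(target).items():
        print(inode, names)
```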
If I try to restore locally directly after dumping, the restore succeeds and semaphore operations (sem_wait/sem_post) still work as expected:
raini@rd-devel:~/Development/semtest/dump$ sudo criu restore -d -D . --shell-job --link-remap -vvvv
...
(00.003590) Found regular file mapping, OK
(00.003592) path: /SYSV00000539 (deleted)
(00.003615) Found regular file mapping, OK
(00.003630) Dumping path for -3 fd via self 12 [/dev/shm/sem.W5abgh (deleted)]
(00.003635) Strip ' (deleted)' tag from './dev/shm/sem.W5abgh (deleted)'
(00.003659) Only file size could be stored for validation for file /dev/shm/sem.W5abgh
(00.003672) Found regular file mapping, OK
(00.003689) Dumping path for -3 fd via self 12 [/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2]
...
(00.011397) Namespaces dump complete
(00.011446) cg: All tasks in criu's cgroups. Nothing to dump.
(00.011449) unix: Dumping external sockets
(00.011469) tty: Unpaired slave 0
(00.011474) Writing image inventory (version 1)
(00.011536) Running post-dump scripts
(00.011541) Unfreezing tasks into 2
(00.011543) Unseizing 3037 into 2
(00.011717) Writing stats
(00.011748) Dumping finished successfully
However, if I repeat the same procedure and restore either on another machine or on the same machine after reboot, the restore fails:
raini@rd-devel:~/Development/semtest/dump$ sudo criu restore -d -D . --shell-job --link-remap -vvvv
...
(00.066370) 3037: Opening 0x007f7659a78000-0x007f7659a79000 0000000000000000 (81) vma
(00.066387) 3037: Warn (criu/files-reg.c:1748): Can't link dev/shm/link_remap.4 -> dev/shm/sem.W5abgh
(00.066415) 3037: Error (criu/files-reg.c:2125): Can't link dev/shm/link_remap.4 -> dev/shm/sem.W5abgh: No such file or directory
(00.066432) 3037: Error (criu/mem.c:1349): `- Can't open vma
(00.066502) Error (criu/cr-restore.c:2470): Restoring FAILED.
I would have assumed that criu itself would restore the link_remap.4 file; instead, the restore fails if the file does not exist, which it would not in a production scenario.
Furthermore, I noticed that even if the link_remap.4 file created by the original dump persists but the /dev/shm/sem.pcsync semaphore file is deleted, sem.pcsync is not recreated by an otherwise successful restore.
How can I get criu to consistently dump/restore POSIX named semaphores?
If it is not supported, what is the problem compared to SysV semaphores/shared memory, and how could I go about adding named semaphore support to criu?