[Users] Issues after updating to 7.0.14 (136)
Jehan Procaccia IMT
jehan.procaccia at imtbs-tsp.eu
Mon Jul 6 21:40:07 MSK 2020
Hello
In case it helps, here is what I have done so far to try to re-enable the dead CTs:
# prlctl stop ldap2
Stopping the CT...
Failed to stop the CT: PRL_ERR_VZCTL_OPERATION_FAILED (Details: Cannot
lock the Container
)
# cat /vz/lock/144dc737-b4e3-4c03-852c-25a6df06cee4.lck
6227
resuming
# ps auwx | grep 6227
root 6227 0.0 0.0 92140 6984 ? S 15:10 0:00
/usr/sbin/vzctl resume 144dc737-b4e3-4c03-852c-25a6df06cee4
# kill -9 6227
Even after killing that process, I still cannot stop the CT (Cannot lock the Container...)
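As a side note, the .lck file's first line appears to hold the PID of the locking process (6227 above). Here is a small sketch for checking whether such a lock is stale -- assuming that layout, which I am inferring from the output above rather than from any documentation:

```shell
#!/bin/sh
# check_ct_lock LOCKFILE
# Reports whether the process named on the first line of a Virtuozzo
# .lck file is still alive. Assumes (unverified) that line 1 is a PID.
check_ct_lock() {
    lck=$1
    [ -f "$lck" ] || { echo "no-lock"; return; }
    pid=$(head -n1 "$lck")
    if kill -0 "$pid" 2>/dev/null; then
        echo "held-by-$pid"
    else
        echo "stale"
    fi
}
```

Usage would be something like `check_ct_lock /vz/lock/144dc737-b4e3-4c03-852c-25a6df06cee4.lck`; if it prints "stale", removing the file by hand may be the only option (at your own risk).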
# df |grep 144dc737-b4e3-4c03-852c-25a6df06cee4
/dev/ploop11432p1 10188052 2546636 7100848 27%
/vz/root/144dc737-b4e3-4c03-852c-25a6df06cee4
none 1048576 0 1048576 0%
/vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/dump/Dump/.criu.cgyard.56I2ls
# umount /dev/ploop11432p1
# ploop check -F
/vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/root.hds
Reopen rw /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/root.hds
Error in ploop_check (check.c:663): Dirty flag is set
# ploop mount
/vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/DiskDescriptor.xml
Error in ploop_mount_image (ploop.c:2495): Image
/vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/root.hds
already used by device /dev/ploop11432
# df -H | grep ploop11432
=> no output
I am lost; any help is appreciated.
Thanks.
On 06/07/2020 at 15:37, Jehan Procaccia IMT wrote:
>
> Hello,
>
> I am back to the initial problem related to that post: since I updated to
> OpenVZ release 7.0.14 (136) | Virtuozzo Linux release 7.8.0 (609),
> I am also seeing CTs in a corrupted state.
>
> I don't see the exact same errors as mentioned by Kevin Drysdale below
> (ploop/fsck), but I am unable to enter certain CTs, nor can I stop them:
>
> [root at olb ~]# prlctl stop trans8
> Stopping the CT...
> Failed to stop the CT: PRL_ERR_VZCTL_OPERATION_FAILED (Details: Cannot lock the Container
> )
>
> [root at olb ~]# prlctl enter trans8
> Unable to get init pid
> enter into CT failed
>
> exited from CT 02faecdd-ddb6-42eb-8103-202508f18256
>
> For those CTs that fail to enter or stop, I noticed that there is a second
> device mounted, with a name ending in /dump/Dump/.criu.cgyard.4EJB8c:
>
> [root at olb ~]# df -H | grep 02faecdd-ddb6-42eb-8103-202508f18256
> /dev/ploop53152p1 11G 2,2G 7,7G 23% /vz/root/02faecdd-ddb6-42eb-8103-202508f18256
> none 537M 0 537M 0% /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/dump/Dump/.criu.cgyard.4EJB8c
>
> [root at olb ~]# prlctl list | grep 02faecdd-ddb6-42eb-8103-202508f18256
> {02faecdd-ddb6-42eb-8103-202508f18256} running 157.159.196.17 CT isptrans8
>
>
> I rebooted the whole hardware node; since the reboot, here is the related
> vzctl.log:
>
> 2020-07-06T15:10:38+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Removing the stale lock file /vz/lock/02faecdd-ddb6-42eb-8103-202508f18256.lck
> 2020-07-06T15:10:38+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Restoring the Container ...
> 2020-07-06T15:10:38+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Mount image: /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd
> 2020-07-06T15:10:38+0200 : Opening delta /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds
> 2020-07-06T15:10:38+0200 : Opening delta /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds
> 2020-07-06T15:10:38+0200 : Opening delta /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds
> 2020-07-06T15:10:38+0200 : Adding delta dev=/dev/ploop53152 img=/vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds (rw)
> 2020-07-06T15:10:39+0200 : Mounted /dev/ploop53152p1 at /vz/root/02faecdd-ddb6-42eb-8103-202508f18256 fstype=ext4 data=',balloon_ino=12'
> 2020-07-06T15:10:39+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Container is mounted
> 2020-07-06T15:10:40+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Setting permissions for image=/vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd
> 2020-07-06T15:10:40+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Configure memguarantee: 0%
> 2020-07-06T15:18:12+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Unable to get init pid
> 2020-07-06T15:18:12+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : enter into CT failed
> 2020-07-06T15:19:49+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Cannot lock the Container
> 2020-07-06T15:25:33+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Unable to get init pid
> 2020-07-06T15:25:33+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : enter into CT failed
>
> On another CT that fails to enter/stop, the same kind of logs appear, plus
> CRIU errors:
>
> 2020-07-06T15:10:38+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Restoring the Container ...
> 2020-07-06T15:10:38+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Mount image: /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd
> 2020-07-06T15:10:38+0200 : Opening delta /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds
> 2020-07-06T15:10:39+0200 : Opening delta /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds
> 2020-07-06T15:10:39+0200 : Opening delta /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds
> 2020-07-06T15:10:39+0200 : Adding delta dev=/dev/ploop36049 img=/vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds (rw)
> 2020-07-06T15:10:41+0200 : Mounted /dev/ploop36049p1 at /vz/root/4ae48335-5b63-475d-8629-c8d742cb0ba0 fstype=ext4 data=',balloon_ino=12'
> 2020-07-06T15:10:41+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Container is mounted
> 2020-07-06T15:10:41+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Setting permissions for image=/vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd
> 2020-07-06T15:10:41+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Configure memguarantee: 0%
> 2020-07-06T15:10:57+0200 vzeventd : Run: /etc/vz/vzevent.d/ve-stop id=4ae48335-5b63-475d-8629-c8d742cb0ba0
> 2020-07-06T15:10:57+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : (03.038774) Error (criu/util.c:666): exited, status=4
> 2020-07-06T15:10:57+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : (14.446513) 1: Error (criu/files.c:230): Empty list on file desc id 0x1f(5)
> 2020-07-06T15:10:57+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : (14.446518) 1: Error (criu/files.c:231): BUG at criu/files.c:231
> 2020-07-06T15:10:57+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : (15.589529) Error (criu/cr-restore.c:1612): 7130 killed by signal 11: Segmentation fault
> 2020-07-06T15:10:57+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : (15.604550) Error (criu/cr-restore.c:2614): Restoring FAILED.
> 2020-07-06T15:10:57+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : The restore log was saved in /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/dump/Dump/restore.log
> 2020-07-06T15:10:57+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : criu exited with rc=17
> 2020-07-06T15:10:57+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Unmount image: /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd (190)
> 2020-07-06T15:10:57+0200 : Unmounting file system at /vz/root/4ae48335-5b63-475d-8629-c8d742cb0ba0
> 2020-07-06T15:11:31+0200 : Opening delta /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds
> 2020-07-06T15:11:31+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Container is unmounted
> 2020-07-06T15:11:31+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Failed to restore the Container
> 2020-07-06T15:11:31+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Restoring the Container ...
> 2020-07-06T15:11:31+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Mount image: /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd
> 2020-07-06T15:11:31+0200 : Opening delta /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds
> 2020-07-06T15:11:31+0200 : Opening delta /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds
> 2020-07-06T15:11:31+0200 : Opening delta /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds
> 2020-07-06T15:11:31+0200 : Adding delta dev=/dev/ploop36049 img=/vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds (rw)
> 2020-07-06T15:11:31+0200 : Mounted /dev/ploop36049p1 at /vz/root/4ae48335-5b63-475d-8629-c8d742cb0ba0 fstype=ext4 data=',balloon_ino=12'
> 2020-07-06T15:11:31+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Container is mounted
> 2020-07-06T15:11:31+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Setting permissions for image=/vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd
> 2020-07-06T15:11:31+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Configure memguarantee: 0%
> 2020-07-06T15:14:18+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Unable to get init pid
> 2020-07-06T15:14:18+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : enter into CT failed
>
> In prl-disp.log:
>
> 07-06 15:10:30.797 F /virtuozzo:4836:4836/ register CT: 4ae48335-5b63-475d-8629-c8d742cb0ba0
> 07-06 15:10:38.717 F /disp:4836:6163/ Processing command 'DspCmdVmStartEx' 1036 for CT uuid='{4ae48335-5b63-475d-8629-c8d742cb0ba0}'
> 07-06 15:10:38.738 I /virtuozzo:4836:6234/ /usr/sbin/vzctl resume 4ae48335-5b63-475d-8629-c8d742cb0ba0
> 07-06 15:10:48.542 I /disp:4836:5196/ vzevent: state=6, envid=4ae48335-5b63-475d-8629-c8d742cb0ba0
> 07-06 15:10:57.364 I /disp:4836:5196/ vzevent: state=8, envid=4ae48335-5b63-475d-8629-c8d742cb0ba0
> 07-06 15:10:57.475 I /disp:4836:5196/ vzevent: state=12, envid=4ae48335-5b63-475d-8629-c8d742cb0ba0
> 07-06 15:11:31.161 F /virtuozzo:4836:6234/ /usr/sbin/vzctl utility failed: /usr/sbin/vzctl resume 4ae48335-5b63-475d-8629-c8d742cb0ba0 [6]
> Mount image: /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd
> Setting permissions for image=/vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd
> Unmount image: /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd (190)
> The restore log was saved in /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/dump/Dump/restore.log
> 07-06 15:11:31.162 I /virtuozzo:4836:6234/ /usr/sbin/vzctl start 4ae48335-5b63-475d-8629-c8d742cb0ba0
>
> Is this related to the update? How can I re-enable those CTs?
>
> Thanks.
>
>
>
>
>>
>>
>> On 29/06/2020 at 12:30, Kevin Drysdale wrote:
>>> Hello,
>>>
>>> After updating one of our OpenVZ VPS hosting nodes at the end of
>>> last week, we've started to have issues with corruption apparently
>>> occurring inside containers. Issues of this nature have never
>>> affected the node previously, and there do not appear to be any
>>> hardware issues that could explain this.
>>>
>>> Specifically, a few hours after updating, we began to see containers
>>> experiencing errors such as this in the logs:
>>>
>>> [90471.678994] EXT4-fs (ploop35454p1): error count since last fsck: 25
>>> [90471.679022] EXT4-fs (ploop35454p1): initial error at time
>>> 1593205255: ext4_ext_find_extent:904: inode 136399
>>> [90471.679030] EXT4-fs (ploop35454p1): last error at time
>>> 1593232922: ext4_ext_find_extent:904: inode 136399
>>> [95189.954569] EXT4-fs (ploop42983p1): error count since last fsck: 67
>>> [95189.954582] EXT4-fs (ploop42983p1): initial error at time
>>> 1593210174: htree_dirblock_to_tree:918: inode 926441: block 3683060
>>> [95189.954589] EXT4-fs (ploop42983p1): last error at time
>>> 1593276902: ext4_iget:4435: inode 1849777
>>> [95714.207432] EXT4-fs (ploop60706p1): error count since last fsck: 42
>>> [95714.207447] EXT4-fs (ploop60706p1): initial error at time
>>> 1593210489: ext4_ext_find_extent:904: inode 136272
>>> [95714.207452] EXT4-fs (ploop60706p1): last error at time
>>> 1593231063: ext4_ext_find_extent:904: inode 136272
>>>
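[Editorial aside: the epoch timestamps in those EXT4 messages can be decoded with GNU date(1); the initial error (1593205255) falls on 2020-06-26 UTC, i.e. right around the "end of last week" update window Kevin describes. A minimal sketch, assuming coreutils date:]

```shell
#!/bin/sh
# Decode the epoch timestamps from the EXT4 error messages above.
# GNU date's "-d @EPOCH" syntax is assumed (coreutils).
for ts in 1593205255 1593232922 1593276902; do
    date -u -d "@$ts" '+%s -> %Y-%m-%d %H:%M:%S UTC'
done
```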
>>> Shutting the containers down and manually mounting and e2fsck'ing
>>> their filesystems did clear these errors, but each of the containers
>>> (which were mostly used for running Plesk) had widespread issues
>>> with corrupt or missing files after the fsck's completed,
>>> necessitating their being restored from backup.
>>>
>>> Concurrently, we also began to see messages like this appearing in
>>> /var/log/vzctl.log, which again have never appeared at any point
>>> prior to this update being installed:
>>>
>>> /var/log/vzctl.log:2020-06-26T21:05:19+0100 : Error in fill_hole
>>> (check.c:240): Warning: ploop image
>>> '/vz/private/8288448/root.hdd/root.hds' is sparse
>>> /var/log/vzctl.log:2020-06-26T21:09:41+0100 : Error in fill_hole
>>> (check.c:240): Warning: ploop image
>>> '/vz/private/8288450/root.hdd/root.hds' is sparse
>>> /var/log/vzctl.log:2020-06-26T21:16:22+0100 : Error in fill_hole
>>> (check.c:240): Warning: ploop image
>>> '/vz/private/8288451/root.hdd/root.hds' is sparse
>>> /var/log/vzctl.log:2020-06-26T21:19:57+0100 : Error in fill_hole
>>> (check.c:240): Warning: ploop image
>>> '/vz/private/8288452/root.hdd/root.hds' is sparse
>>>
>>> The basic procedure we follow when updating our nodes is as follows:
>>>
>>> 1. Update the standby node we keep spare for this process
>>> 2. vzmigrate all containers from the live node being updated to the
>>> standby node
>>> 3. Update the live node
>>> 4. Reboot the live node
>>> 5. vzmigrate the containers from the standby node back to the live
>>> node they originally came from
>>>
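[Editorial aside: the five steps above can be sketched as a dry-run script. The hostnames, CT IDs, and the use of yum/ssh are placeholders, not Kevin's actual tooling; RUN=echo prints the plan instead of executing anything.]

```shell
#!/bin/sh
# Dry-run sketch of the node update procedure described above.
# With RUN=echo every step is printed rather than executed.
RUN=echo
update_node() {
    live=$1; standby=$2; shift 2          # remaining args: CT IDs
    $RUN ssh "$standby" yum update -y     # 1. update the standby node
    for ct in "$@"; do
        $RUN vzmigrate "$standby" "$ct"   # 2. move CTs off the live node
    done
    $RUN ssh "$live" yum update -y        # 3. update the live node
    $RUN ssh "$live" reboot               # 4. reboot the live node
    for ct in "$@"; do
        $RUN vzmigrate "$live" "$ct"      # 5. move CTs back
    done
}
```

For example, `update_node live1 standby1 101 102` would print the migration plan for CTs 101 and 102.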
>>> The only tool which has been used to affect these containers is
>>> 'vzmigrate' itself, so I'm at something of a loss as to how to explain
>>> the root.hdd images for these containers containing sparse gaps. Making
>>> images sparse is something we have never done, as we have always been
>>> aware that OpenVZ does not support their use inside a container's hard
>>> drive image. And the fact that these images have suddenly become sparse
>>> at the same time they have started to exhibit filesystem corruption is
>>> somewhat concerning.
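[Editorial aside: one possible way to confirm whether an image file is sparse is to compare its allocated blocks with its apparent size. A sketch assuming GNU stat, not an official ploop tool:]

```shell
#!/bin/sh
# is_sparse FILE -- succeeds (exit 0) when the file's allocated blocks
# cover fewer bytes than its apparent size, i.e. it contains holes.
# GNU stat: %b = allocated blocks, %B = bytes per block, %s = size.
is_sparse() {
    f=$1
    blocks=$(stat -c %b "$f")
    bsize=$(stat -c %B "$f")
    size=$(stat -c %s "$f")
    [ $((blocks * bsize)) -lt "$size" ]
}
```

For example, `is_sparse /vz/private/8288448/root.hdd/root.hds && echo sparse` would flag the first image from the vzctl.log warnings above.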
>>>
>>> We can restore all affected containers from backups, but I wanted to
>>> get in touch with the list to see if anyone else at any other site
>>> has experienced these or similar issues after applying the 7.0.14
>>> (136) update.
>>>
>>> Thank you,
>>> Kevin Drysdale.
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Users mailing list
>>> Users at openvz.org
>>> https://lists.openvz.org/mailman/listinfo/users
>>
>>
>>
>
>