[CRIU] [PATCH RFC] pram: persistent over-kexec memory file system
Marco Stornelli
marco.stornelli at gmail.com
Sun Jul 28 07:02:12 EDT 2013
Il 28/07/2013 12:05, Vladimir Davydov ha scritto:
> On 07/27/2013 09:37 PM, Marco Stornelli wrote:
>> Il 27/07/2013 19:35, Vladimir Davydov ha scritto:
>>> On 07/27/2013 07:41 PM, Marco Stornelli wrote:
>>>> Il 26/07/2013 14:29, Vladimir Davydov ha scritto:
>>>>> Hi,
>>>>>
>>>>> We want to propose a way to upgrade a kernel on a machine without
>>>>> restarting all the user-space services. This is to be done with CRIU
>>>>> project, but we need help from the kernel to preserve some data in
>>>>> memory while doing kexec.
>>>>>
>>>>> The key point of our implementation is leaving process memory in-place
>>>>> during reboot. This should eliminate most io operations the services
>>>>> would produce during initialization. To achieve this, we have
>>>>> implemented a pseudo file system that preserves its content during
>>>>> kexec. We propose saving CRIU dump files to this file system,
>>>>> kexec'ing
>>>>> and then restoring the processes in the newly booted kernel.
>>>>>
>>>>
>>>> http://pramfs.sourceforge.net/
>>>
>>> AFAIU it's a bit different thing: PRAMFS as well as pstore, which has
>>> already been merged, requires hardware support for over-reboot
>>> persistency, so called non-volatile RAM, i.e. RAM which is not directly
>>> accessible and so is not used by the kernel. On the contrary, what we'd
>>> like to have is preserving usual RAM on kexec. It is possible, because
>>> RAM is not reset during kexec. This would allow leaving applications
>>> working set as well as filesystem caches in place, speeding the reboot
>>> process as a whole and reducing the downtime significantly.
>>>
>>> Thanks.
>>
>> Actually not. You can use normal system RAM reserved at boot with mem
>> parameter without any kernel change. Until an hard reset happens, that
>> area will be "persistent".
>
> Thank you, we'll look at PRAMFS closer, but right now, after trying it I
> have a couple of concerns I'd appreciate if you could clarify:
>
> 1) As you advised, I tried to reserve a range of memory (passing
> memmap=4G$4G at boot) and mounted PRAMFS using the following options:
>
> # mount -t pramfs -o physaddr=0x100000000,init=4G,bs=4096 none /mnt/pramfs
>
> And it turned out that PRAMFS is very slow as compared to ramfs:
>
> # dd if=/dev/zero of=/mnt/pramfs if=/dev/zero of=/mnt/pramfs/dummy
> bs=4096 count=$[100*1024]
> 102400+0 records in
> 102400+0 records out
> 419430400 bytes (419 MB) copied, 9.23498 s, 45.4 MB/s
> # dd if=/dev/zero of=/mnt/pramfs if=/dev/zero of=/mnt/pramfs/dummy
> bs=4096 count=$[100*1024] conv=notrunc
> 102400+0 records in
> 102400+0 records out
> 419430400 bytes (419 MB) copied, 3.04692 s, 138 MB/s
>
> We need it to be as fast as usual RAM, because otherwise the benefit of
> it over hdd disappears. So before diving into the code, I'd like to ask
> you if it's intrinsic to PRAMFS, or can it be fixed? Or, perhaps, I used
> wrong mount/boot/config options (btw, I enabled only CONFIG_PRAMFS)?
>
In x86 you should have the write protection enabled. Turn it off or
mount it with noprotect option.
> 2) To enable saving application dump files in memory using PRAMFS, one
> should reserve half of RAM for it. That's too expensive. While with
> ramfs, once SPLICE_F_MOVE flag is implemented, one could move anonymous
> memory pages to ramfs page cache and after kexec move it back so that
> almost no extra memory space costs would be required. Of course,
> SPLICE_F_MOVE is to be yet implemented, but with PRAMFS significant
> memory costs are inevitable... or am I wrong?
>
> Thanks.
From this point of view you are right. Pramfs (or other solution like
that) are out of page cache, so you can't do any memory transfer. It's
like to have a disk but it's actually a separate piece of RAM. We could
talk about it again when this kind of implementation will be done.
Marco
More information about the CRIU
mailing list