[CRIU] [PATCH 0/5] pipes: support for packetized mode (with O_DIRECT)

Stanislav Kinsburskiy skinsbursky at odin.com
Fri Dec 18 06:25:24 PST 2015



18.12.2015 14:16, Pavel Emelyanov wrote:
> On 12/17/2015 05:38 PM, Stanislav Kinsburskiy wrote:
>>
>> 17.12.2015 15:20, Stanislav Kinsburskiy wrote:
>>>
>>> 17.12.2015 14:16, Pavel Emelyanov wrote:
>>>> On 12/17/2015 03:30 PM, Stanislav Kinsburskiy wrote:
>>>>> 15.12.2015 21:16, Pavel Emelyanov wrote:
>>>>>> On 12/15/2015 11:11 PM, Stanislav Kinsburskiy wrote:
>>>>>>> On 15 Dec 2015 at 20:39, Pavel Emelyanov
>>>>>>> <xemul at parallels.com> wrote:
>>>>>>>> On 12/15/2015 08:55 PM, Stanislav Kinsburskiy wrote:
>>>>>>>>> There is something I would like to discuss.
>>>>>>>>> Currently, only the _write_ end has the O_DIRECT flag. CRIU uses
>>>>>>>>> the _read_ end to tee data from it into a local pipe. The problem
>>>>>>>>> is _how_ to discover whether the pipe is "packetized" or not.
>>>>>>>>> This knowledge is required because splice() can be used only for
>>>>>>>>> non-packetized pipes (otherwise all individual packets will be
>>>>>>>>> merged into one in the image file).
>>>>>>>>>
>>>>>>>>> But I'm afraid that's not enough, because the "packetized" pipe
>>>>>>>>> mode is not stable: it's represented by a file flag. A regular
>>>>>>>>> pipe can be reopened with the O_DIRECT flag or vice versa, but
>>>>>>>>> packets that were sent _while_ the pipe was _"packetized"_ are
>>>>>>>>> already _marked_ as _"non-mergeable"_.
>>>>>>>>> So, probably, in the generic case it's not enough to create a
>>>>>>>>> pipe on restore, fill it with data and set the correct file
>>>>>>>>> flags, because the pipe can contain both "packets" and regular
>>>>>>>>> content.
>>>>>>>> I wouldn't say it's typical use for pipes. Can we support only those
>>>>>>>> that have all ends in the same state -- either packetized or not?
>>>>>>>>
>>>>>>> No, it's not a typical way, for sure.
>>>>>>> But I don't know yet how to distinguish them...
>>>>>>> One of the major problems, as I mentioned, is that only the _write_
>>>>>>> end is marked with O_DIRECT.
>>>>>>> What do we do if we have a bunch of pipe ends (inherited, whatever)?
>>>>>>> Should we gather them somehow on dump and check that all write
>>>>>>> ends are either with O_DIRECT or without?
>>>>>> Yes.
>>>>> I need some help with this. To make the above check, all the pipes have
>>>>> to be collected, somehow organized and checked.
>>>>> To do so, all the processes have to be infected first, then the pipes
>>>>> have to be collected.
>>>>> With the current code, all these pipes are collected during dump.
>>>>> This causes some problems:
>>>>> 1) Pipe data is dumped before all the pipes are collected. Thus, if we
>>>>> have a pipe with O_DIRECT but find the read end first, we can't
>>>>> discover whether it's a packetized pipe or not.
>>>> Can you enlighten us a little? What if we find the read end of a
>>>> packetized pipe and start reading data from it? The data should come in
>>>> packets, right? Even though there's no "packetized" bit on this read end.
>>> Right.
>>>
>>>> Next, what if we tee the data from such pipe into another pipe and
>>>> read data from the latter one? Would data go in packets in this case?
>>> Buffers become packets on write, via a special per-buffer flag.
>>> Thus tee will copy them in the packetized state.
>>>
>>> The problem is with splice to the image file. We can't use it for
>>> packetized pipes.
>>> We can go the strict way:
>>> 1) Throw away splice to the image file.
>>> 2) Read packets one by one, and write each of them with its own pipe
>>> entry.
>>>
>>> This will allow us to dump things properly. But not restore.
>>> On restore we have to create a pipe in either packetized mode or not
>>> before putting packets into it.
>>> If we have only read end, we don't know how to create the pipe.
>>>
>>> The best we can do is to forbid migration of pipes with only a read
>>> end. But that's not backward compatible.
>>>
>> We can try to read many times with a size equal to the pipe size.
>> If we are able to get more than one buffer, then it's a packetized pipe.
> From my perspective we don't have to distinguish a packetized pipe from a
> regular one when dumping the queue. Just write the data in the available
> chunks into the pipe-data image and that's it.

A packetized pipe is expected to be read in chunks of PIPE_BUF size (==
PAGE_SIZE).
In the case of a packetized pipe, read will return only one buffer.
In the case of a normal pipe, it will return as many buffers as fit into
PIPE_BUF.
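
The difference in read() behaviour can be seen with a small standalone
check. This is only a minimal sketch, assuming Linux 3.4 or later (where
pipe2() accepts O_DIRECT); the helper name first_read_size() is made up for
illustration and is not part of CRIU.

```c
#define _GNU_SOURCE		/* for pipe2() and O_DIRECT */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: write two small messages into a fresh pipe and
 * return how many bytes the first read() hands back.
 * pipe_flags is passed to pipe2(): 0 for a regular pipe, O_DIRECT for a
 * packetized one. Returns -1 on error.
 */
static ssize_t first_read_size(int pipe_flags)
{
	int fds[2];
	char buf[4096];
	ssize_t n;

	if (pipe2(fds, pipe_flags) < 0)
		return -1;

	/* Two separate writes: 3 bytes, then 5 bytes. */
	if (write(fds[1], "abc", 3) != 3 || write(fds[1], "defgh", 5) != 5) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}

	/* Ask for much more than either write produced. */
	n = read(fds[0], buf, sizeof(buf));

	close(fds[0]);
	close(fds[1]);
	return n;
}
```

On a kernel with packet mode, first_read_size(0) should give 8 (both writes
come back in one read), while first_read_size(O_DIRECT) should give 3 (only
the first packet).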

>
>> If only one, then on restore:
>> 1) If it's only a read end without a write end, we don't care. We open a
>> regular pipe and write this packet.
>> 2) If it's a full pipe, then we have a write end somewhere, with the
>> O_DIRECT flag or without.
>>
>> Looks like this approach covers all the cases (except one very special
>> case, which we can ignore).
>>
>> But it means, that we have to get rid of splice/vmsplice:
>> 1) On dump, because we have to write all the packets individually.
>> 2) On restore, because vmsplice doesn't work with packetized pipes.
> BTW, I've glanced through the kernel and haven't found why splice doesn't
> work for packetized pipes. Can you point out one?

Sure.
1) Splicing the pipe to the image file on dump. In this case we lose the
whole packet structure and end up with one single data chunk.
2) Vmsplice on restore. Packetized buffers have to be marked with a
special flag, which vmsplice is not aware of. This leads to a situation
where all the packets are returned to a pipe reader as one single data
chunk.
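
To keep the packet structure, dump and restore could go through plain
read()/write() instead. Below is a rough sketch under stated assumptions:
the record layout (32-bit length, then payload) and the names
set_nonblock(), dump_packets() and restore_packets() are hypothetical, not
the CRIU image format or API. The read end has to be non-blocking, and on
restore the write end is expected to carry O_DIRECT so that each write()
becomes a packet again.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>	/* PIPE_BUF */
#include <stdint.h>
#include <unistd.h>

/* Switch fd to non-blocking mode so an empty pipe yields EAGAIN. */
static int set_nonblock(int fd)
{
	int fl = fcntl(fd, F_GETFL);

	return fl < 0 ? -1 : fcntl(fd, F_SETFL, fl | O_NONBLOCK);
}

/* Read exactly len bytes: 1 on success, 0 on clean EOF, -1 on error. */
static int read_exact(int fd, void *dst, size_t len)
{
	char *p = dst;
	size_t done = 0;

	while (done < len) {
		ssize_t n = read(fd, p + done, len - done);

		if (n < 0)
			return -1;
		if (n == 0)
			return done == 0 ? 0 : -1;	/* EOF mid-record */
		done += n;
	}
	return 1;
}

/*
 * Drain pipe_rd (must be non-blocking) one read() at a time and store
 * every chunk as its own length-prefixed record in img_fd.
 * Returns the number of records written, or -1 on error.
 */
static int dump_packets(int pipe_rd, int img_fd)
{
	char buf[PIPE_BUF];
	int count = 0;

	assert(pipe_rd >= 0 && img_fd >= 0);
	for (;;) {
		ssize_t n = read(pipe_rd, buf, sizeof(buf));
		uint32_t len;

		if (n < 0)
			return errno == EAGAIN ? count : -1;	/* pipe is empty */
		if (n == 0)
			return count;				/* no writers left */

		len = (uint32_t)n;
		if (write(img_fd, &len, sizeof(len)) != (ssize_t)sizeof(len) ||
		    write(img_fd, buf, len) != n)
			return -1;
		count++;
	}
}

/*
 * Replay records from img_fd into pipe_wr, one write() per record.
 * Returns the number of records replayed, or -1 on error.
 */
static int restore_packets(int img_fd, int pipe_wr)
{
	char buf[PIPE_BUF];
	uint32_t len;
	int count = 0, ret;

	assert(img_fd >= 0 && pipe_wr >= 0);
	while ((ret = read_exact(img_fd, &len, sizeof(len))) == 1) {
		if (len > sizeof(buf) || read_exact(img_fd, buf, len) != 1)
			return -1;
		if (write(pipe_wr, buf, len) != (ssize_t)len)
			return -1;
		count++;
	}
	return ret < 0 ? -1 : count;
}
```

On dump this would run on the tee'd copy of the pipe, so the original data
stays in place. A buffer of PIPE_BUF bytes is enough for read(), because
larger writes to a packetized pipe are split into several packets.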

>> But I don't know whether it's worth it.
>> So, I'll do what you say: either implement the algorithm above or just
>> drop this feature except for autofs pipes, which must be empty.
>>
>> .
>>
> -- Pavel


