[CRIU] [PATCH 0/7] CRiu Image Tool(crit), v4
Ruslan Kuprieiev
kupruser at gmail.com
Tue Jan 13 09:48:09 PST 2015
On 01/13/2015 04:35 PM, Pavel Emelyanov wrote:
>>>> b) About payload and extra fields. I think we need them, as they
>>>> show the structure of criu images according to one that is
>>>> described in our Images wiki page:
>>>>
>>>> All criu-specific image files begin with 32-bit magic cookie. Images in
>>>> PB format are followed by zero or more entries of the same type (not
>>>> size!), each entry is preceded with 32-bit entry size value (not
>>>> including this 32-bit value itself). Optionally each entry may be
>>>> followed by extra payload which depends on the entry type.
>>>>
>>>> IOW protocol-buffers image files look like
>>>>
>>>> IMAGE_FILE ::= MAGIC { ENTRY }
>>>> ENTRY ::= SIZE PAYLOAD [ EXTRA ]
>>>> PAYLOAD ::= "message encoded in ProtocolBuffer format"
>>>> EXTRA ::= "arbitrary blob, depends on the PAYLOAD contents"
>>>>
>>>> MAGIC ::= "32 bit integer"
>>>> SIZE ::= "32 bit integer, equals the PAYLOAD length"
>>>>
>>>> So the current structure with "payload" and "extra" fields
>>>> makes perfect sense to me. The linear structure that
>>>> we've used previously in our text-format confuses the user,
>>>> just the way I was confused about pagemap\pipes-data image
>>>> structure.
>>> Hm... The wiki page is not a rule written in stone. Let's make the
>>> json representation look human-readable and fix the wiki page. Like
>>> this:
>>>
>>> IMAGE_FILE ::= MAGIC { SIZE ENTRY [PAYLOAD] }
>>> ENTRY ::= "message encoded in PB"
>>> PAYLOAD ::= "arbitrary blob"
>>>
>>> :)
>>>
>>> Too many braces and extra fields make things harder to understand.
>> As I said before, linear { SIZE ENTRY [PAYLOAD] } structure is confusing.
> But this one and what's on wiki are equal.
Yes, this one is a low level description of what's on wiki =).
>> I like the one on wiki better, especially when describing pipes-data and
>> pagemap images. I think that even though there are extra fields and braces
>> involved, current structure looks a lot better and is self-describing.
>> And the crit sources look a lot better when sticking to wiki description.
>> I would really like to stick with wiki description.
> Right now we're arguing in "like" "don't like" terms. With this we won't
> make any decision, but "just do it" one. Thus I propose to stick to the
> only measurable argument we have -- less text is easier to read.
Yes, you are right, sorry for "like/don't like" terms.
> Is it technically possible to make the json text look like this:
>
> {
> "magic": "REG_FILES",
> "entries": [
> {
> "foo": "1",
> "bar": "BAR",
> "extra": {
> /* entry-specific stuff here */
> }
> },
>
> ?
>
> And how bad would the code look like for this?
You mean like add "extra" field to the payload, right?
Yes, it does look better.
>>>> c) I wrap every value in "" as it isn't that easy to convert
>>>> from pb to json, as theirs supported types do not match,
>>>> and wrapping everything into strings is just the easiest
>>>> way to get everything work. I'm currently thinking on
>>>> a way to improve that, but for now it works just fine.
>>> OK. Can we at least keep strings as string, not as string with string?
>>> I mean this:
>>>
>>> "path": "/foo/bar"
>>>
>>> instead of
>>>
>>> "path": "\"/foo/bar\""
>> Yes, we can, but it is a next step, i'm working on it in whole context
>> of impoving pb<->json convertation.
> This step should be accomplished till v1.5 (beginning of March). Can we?
Yes, definitely.
But I'll try to fix it asap.
>>>>> 3. Let's now look at reg-files.img
>>>>>
>>>>> {
>>>>> "magic": "REG_FILES",
>>>>> "entries": [
>>>>> {
>>>>> "payload": {
>>>>> "fown": {
>>>>> "pid_type": "0",
>>>>> "signum": "0",
>>>>> "pid": "0",
>>>>> "uid": "0",
>>>>> "euid": "0"
>>>>> },
>>>>> "flags": "32770",
>>>>> "id": "1",
>>>>> "name": "\"/dev/null\"",
>>>>> "pos": "0"
>>>>> }
>>>>> },
>>>>> {
>>>>> "payload": {
>>>>> "name": "\"/\"",
>>>>> "fown": {
>>>>> "pid_type": "0",
>>>>> "signum": "0",
>>>>> "pid": "0",
>>>>> "uid": "0",
>>>>> "euid": "0"
>>>>> },
>>>>> "pos": "0",
>>>>> "flags": "33793",
>>>>> "id": "2",
>>>>> "size": "0"
>>>>> }
>>>>> },
>>>>>
>>>>> Other than those from inventory, the concerns here are:
>>>>>
>>>>> a) The order of fields differs from entry to entry.
>>>> It is normal for json =).
>>> Yes, but not for human eyes :)
>>>> I thought about it and concluded that it doesn't harm us in any way,
>>>> as we only care about the order in "repeated" fields and we treat
>>>> them properly by converting to an array, so the order is saved as
>>>> the order of elements in json arrays is preserved.
>>> How hard would it be to keep names in the order they are met in pb file?
>> Well, json specification says that the order shouldn't be preserved.
> I agree :)
>
>> Same thing with dicts, as the names are hashable, used to find
>> items and are tossed around .
> Absolutely.
>
>> Hm... But what if we just store protobuf-text format string inside json?
> I don't understand this. How would the text look like?
Here is how reg-files.img would look like:
{
"magic": "REG_FILES",
"entries": [
{
"payload": "
fown: {
pid_type: 0,
signum: 0,
pid: 0,
uid: 0,
euid: 0
},
flags: 32770,
id: 1,
name: "/dev/null",
pos: 0"
},
So the payload is a string that contains pb entry in protobuf text format.
>> It would solve all those problems with convertation and saving the names
>> order, but whould brake integrity check(which is still weak compared to
>> protobuf)...
>>
>> I was also considering packing image into one protobuf message,
>> similar to current json, when it looks like:
>> message criu_image {
>> required string magic = 1;
>> repeated entry entries = 2;
>> }
>>
>> message criu_entry {
>> required criu_payload payload = 1;
>> optional criu_extra extra = 2;
>> }
>>
>> message criu_payload {
>> required string name = 1; // for example "core"
>> optional core_entry core = 1;
>> ....
>> }
>>
>> message criu_extra {
>> required string name = 1; // for example "pagemaps"
>> repeated pagemap_entry pagemaps = 2;
>> ...
>> }
> PB docs say, that the format is good for encoding "small" messages (without
> specification of how small the small is). According to this, keeping all
> that CRIU generates in one entry (with sub-entries) is not good idea.
Yes, but we are using it just to print the images. I mean, we will read them
from *.img, pack in criu_image structure and print using pb text format.
And our images don't usually contain large amounts of data.
>> And it now looks like it fits better, even though it requires
>> a bit more code to write(actually, it can be autogenerated =) ).
>> This way it will be still readable, easy to parse from file
>> and will preserve the names order.
>> What do you think?
>>
>>
>>>>> b) The name of the 2nd entry is just /, but in the output it takes 7 characters to print one.
>>>>>
>>>> Yep, that is a side-effect of that simplified pb<->json convertation.
>>>>
>>>>> 4. Next the pagemap.img
>>>>>
>>>>> {
>>>>> "magic": "PAGEMAP",
>>>>> "entries": [
>>>>> {
>>>>> "payload": {
>>>>> "pages_id": "1"
>>>>> },
>>>>> "extra": [
>>>>> {
>>>>> "nr_pages": "1",
>>>>> "vaddr": "4194304"
>>>>> },
>>>>>
>>>>> Similar to the inventory, field "extra" is obfuscating.
>>>>>
>>>> See 1.
>>> Disagree. The pagemap_entry-s are not "extra". They are entries w/o extra payload.
>> But wiki says that " Images in PB format are followed by zero or more entries of the _SAME TYPE_",
> Wiki is wrong here :) The EXTRA is stated to be "arbitrary blob", but it's
> not such in this case.
Why? It fits perfectly if we state that pagemap_entry's are the extra
for pagemap_head.
>> which makes perfect sense to me. As I mentioned before, wiki description looks a
>> lot better than linear.
> The images we generate are not nice, yes. We even have a page describing how
> exactly they are not nice. But this is what we have, and starting developing
> new format is good task, but too early. Let's try to make crit decode produce
> as nice-looking output as can possibly be with the current images set.
Those messages (criu_image, criu_payload etc) are not for changing our
images =).
They are just to contain and treat images within crit, just to keep them
sorted and
nice-looking and be able to just parse human readable image using standart
ParseFromString protobuf method=).
> Thanks,
> Pavel
>
More information about the CRIU
mailing list