[CRIU] [PATCH 3/3] Issue #360: Anonymize image files
Pavel Emelianov
xemul at virtuozzo.com
Tue Jun 25 12:52:23 MSK 2019
>>> diff --git a/lib/py/strip.py b/lib/py/strip.py
>>> new file mode 100644
>>> index 00000000..4069275c
>>> --- /dev/null
>>> +++ b/lib/py/strip.py
> This file is indented with spaces (IMHO we should be using
> spaces for Python code); however, the rest of the code base uses tabs.
> For consistency, it might be better to use tabs in this file as well?
I've been told many times that true pythonic indentation is spaces, not tabs.
That said -- should we take sed and re-format all the py code to use spaces?
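If we do go that way, here is a rough sketch of the conversion, in Python
rather than sed. It assumes 4 spaces per indentation level and touches only
leading tabs; `retab` is a hypothetical helper name, not something in the tree.

```python
import re

def retab(text, width=4):
    """Replace each leading tab with `width` spaces; tabs elsewhere are kept."""
    def expand(match):
        # One run of leading tabs becomes width * (number of tabs) spaces
        return ' ' * (width * len(match.group(0)))
    # With MULTILINE, ^\t+ matches the run of tabs at the start of every line
    return re.sub(r'^\t+', expand, text, flags=re.MULTILINE)
```

Tabs inside string literals or after the first non-tab character are left
alone, so the result would still need a look before committing.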
>>> @@ -0,0 +1,66 @@
>>> +# This file contains methods to deal with anonymising images.
>>> +#
>>> +# Contents being anonymised can be found at: https://github.com/checkpoint-restore/criu/issues/360
> Could you please add the content that is being anonymised instead of
> providing an external link to the github issue? This will be helpful
> when reading the source code offline.
>>> +#
>>> +# Inorder to anonymise the image files three steps are followed:
> s/Inorder/In order/g
>>> +# - decode the binary image to json
>>> +# - strip the necessary information from the json dict
>>> +# - encode the json dict back to a binary image, which is now anonymised
>>> +
>>> +import sys
>>> +import json
>>> +import random
>>> +
>>> +def files_anon(image):
>>> +    levels = {}
>>> +
>>> +    for e in image['entries']:
>>> +        f_path = e['reg']['name']
> we should handle KeyError: 'reg' or check if the reg key exists.
>>> +        f_path = f_path.split('/')
>>> +
>>> +        lev_num = 0
>>> +        for p in f_path:
>>> +            if p == '':
>>> +                continue
>>> +            if lev_num in levels.keys():
>>> +                if p not in levels[lev_num].keys():
> is .keys() necessary here?
>>> +                    temp = list(p)
>>> +                    random.shuffle(temp)
>> Erm, I'm not 100% sure it's OK to anonymize file paths like that.
> Computing a hash could be another option?
Yes.
I was also thinking of checking whether the 1st level is one of the "known"
names like var, home, usr, etc, etc ( ;) ) and not shuffling those.
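
A minimal sketch of how the two ideas could fit together: hash every path
component, but leave a "known" first-level name readable. The
`e['reg']['name']` layout is taken from the patch; `KNOWN_TOP_LEVEL`,
`anon_component`, `anon_path` and the `salt` parameter are assumptions made up
for illustration, and entries without a 'reg' key are simply skipped.

```python
import hashlib

# Hypothetical set of well-known top-level directories that stay readable
KNOWN_TOP_LEVEL = {'bin', 'etc', 'home', 'lib', 'proc', 'tmp', 'usr', 'var'}

def anon_component(name, salt=''):
    """Map one path component to a short, stable hex digest."""
    digest = hashlib.sha256((salt + name).encode('utf-8')).hexdigest()
    return digest[:16]

def anon_path(path, salt=''):
    """Hash each component of `path`, keeping a known first-level name."""
    out = []
    level = 0
    for part in path.split('/'):
        if part == '':
            out.append(part)  # keep empty parts so a leading '/' survives
            continue
        if level == 0 and part in KNOWN_TOP_LEVEL:
            out.append(part)
        else:
            out.append(anon_component(part, salt))
        level += 1
    return '/'.join(out)

def files_anon(image, salt=''):
    """Anonymise the names of regular-file entries in a decoded image dict."""
    for e in image['entries']:
        reg = e.get('reg')
        if reg is None or 'name' not in reg:
            continue  # not a regular-file entry, nothing to rewrite here
        reg['name'] = anon_path(reg['name'], salt)
    return image
```

Since the same component always hashes to the same digest, no per-level
dictionary is needed to keep paths consistent with each other. An unsalted
hash of a short name can be guessed by hashing a dictionary of likely names,
so a non-empty salt is probably wanted in practice.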
-- Pavel