[CRIU] [PATCH v5 15/33] files: Implement {set, clear, wait}_fds_event()

Kirill Tkhai ktkhai at virtuozzo.com
Thu Jan 19 03:47:25 PST 2017


JFYI: The fix ([PATCH]files: Fix test and set endianess problem) from previous letter
and v6 on top of it:

https://travis-ci.org/tkhai/criu/builds/193343958

On 19.01.2017 14:18, Kirill Tkhai wrote:
> On 19.01.2017 05:46, Andrei Vagin wrote:
>> On Mon, Dec 26, 2016 at 05:27:08PM +0300, Kirill Tkhai wrote:
>>> The idea is symilar to kernel's wake_up() and wait_event().
>>> One task needs some event. It checks the event has not
>>> happened yet (fle hasn't received, unix peer hasn't bound, etc)
>>> and calls get_fds_event(). Other task makes the event
>>> (sends a fle, binds the peer to a name, etc) and calls set_fds_event().
>>>
>>> So, while there is no an event, the first task is sleeping,
>>> and the second wakes it up later:
>>>
>>> Task A:      clear_fds_event();
>>>              if (!socket_bound)
>>>                      wait_fds_event(); /* sleep */
>>>
>>> Task B:      bind_socket();
>>>              set_fds_event();         /* wake up */
>>>
>>> For the details of using see next patches.
>>>
>>> v5: Use bit operations.
>>>     Split clear_fds_event from wait function.
>>>
>>> v2: Do not wait for foreign transport sock is ready,
>>> as it's guarantied by we create it before CR_STATE_FORKING.
>>>
>>> Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
>>> ---
>>>  criu/files.c          |   29 +++++++++++++++++++++++++++++
>>>  criu/include/files.h  |    2 ++
>>>  include/common/lock.h |    6 ++++++
>>>  3 files changed, 37 insertions(+)
>>>
>>> diff --git a/criu/files.c b/criu/files.c
>>> index bff8880bf..87d8fc655 100644
>>> --- a/criu/files.c
>>> +++ b/criu/files.c
>>> @@ -149,6 +149,35 @@ unsigned int find_unused_fd(struct list_head *head, int hint_fd)
>>>  	return fd;
>>>  }
>>>  
>>> +int set_fds_event(pid_t virt)
>>> +{
>>> +	struct pstree_item *item;
>>> +	int old;
>>> +
>>> +	item = pstree_item_by_virt(virt);
>>> +	BUG_ON(!item);
>>> +
>>> +	old = test_and_set_bit(FDS_EVENT_BIT, (unsigned long *)&item->task_st);
>>> +
>>> +	if (!(old & FDS_EVENT))
>>> +		futex_wake(&item->task_st);
>>> +	return 0;
>>> +}
>>> +
>>> +void clear_fds_event(void)
>>> +{
>>> +	futex_t *f = &current->task_st;
>>> +
>>> +	clear_bit(FDS_EVENT_BIT, (unsigned long *)&f->raw.counter);
>>
>> CID 174702 (#1 of 1): Out-of-bounds access (INCOMPATIBLE_CAST)
> 
> JFI, what is CID?
> 
>> incompatible_cast: Pointer &f->raw.counter points to an object whose
>> effective type is int (32 bits, signed) but is dereferenced as a wider
>> unsigned long (64 bits, unsigned). This may lead to memory corruption.
>> [show details]
> 
> [PATCH]files: Fix test and set endianess problem
>     
> Andrew Vagin reported the problem found by a checker:
> 
>     CID 174702 (#1 of 1): Out-of-bounds access (INCOMPATIBLE_CAST)
>     incompatible_cast: Pointer &f->raw.counter points to an object whose
>     effective type is int (32 bits, signed) but is dereferenced as a wider
>     unsigned long (64 bits, unsigned). This may lead to memory corruption.
>     
> It looks like, this points to real problem, which may happen on big-endian
> platforms. In the code I relay on the fact, that FDS_EVENT_BIT has a small
> number and the value, it determines, fits into int type without problems.
> But it's correct only for little-endian.
> 
> In case of big-endian, if the word size is 8 bytes, then FDS_EVENT value
> is in the last bytes, so there is an access to wrong memory.
> 
> To fix the problem, I suggest to use little-endian byte order to work
> with task_st futex. Then, the bits from 0 to 31 will be in the low adresses,
> i.e. in task_st futex. There is new primitives test_and_set_bit_le() and
> set_bit_le() borrowed from the linux kernel for that.
> 
> This fixes the problem, but I suppose, the checker does not see the problem
> so deep, and just compares the types size, so it will fail again.
> So, let's enlarge the bit field size to silence it.
> 
> Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
> 
> diff --git a/criu/files.c b/criu/files.c
> index 23a70deee..d22b280e0 100644
> --- a/criu/files.c
> +++ b/criu/files.c
> @@ -152,30 +152,33 @@ unsigned int find_unused_fd(struct list_head *head, int hint_fd)
>  int set_fds_event(pid_t virt)
>  {
>  	struct pstree_item *item;
> -	int old;
> +	bool is_set;
>  
>  	item = pstree_item_by_virt(virt);
>  	BUG_ON(!item);
>  
> -	old = test_and_set_bit(FDS_EVENT_BIT, (unsigned long *)&item->task_st);
> +	is_set = !!test_and_set_bit_le(FDS_EVENT_BIT, &item->task_st_le_bits);
>  
> -	if (!(old & FDS_EVENT))
> +	if (!is_set)
>  		futex_wake(&item->task_st);
>  	return 0;
>  }
>  
>  void clear_fds_event(void)
>  {
> -	futex_t *f = &current->task_st;
> -
> -	clear_bit(FDS_EVENT_BIT, (unsigned long *)&f->raw.counter);
> +	clear_bit_le(FDS_EVENT_BIT, &current->task_st_le_bits);
>  }
>  
>  void wait_fds_event(void)
>  {
>  	futex_t *f = &current->task_st;
> -
> -	futex_wait_if_cond(f, FDS_EVENT, &);
> +	int value;
> +#if BITS_PER_LONG == 64
> +	value = htole64(FDS_EVENT);
> +#else
> +	value = htole32(FDS_EVENT);
> +#endif
> +	futex_wait_if_cond(f, value, &);
>  	clear_fds_event();
>  }
>  /*
> diff --git a/criu/include/pstree.h b/criu/include/pstree.h
> index 10e391d39..b4dc6b635 100644
> --- a/criu/include/pstree.h
> +++ b/criu/include/pstree.h
> @@ -25,7 +25,10 @@ struct pstree_item {
>  	struct pid		*threads;	/* array of threads */
>  	CoreEntry		**core;
>  	TaskKobjIdsEntry	*ids;
> -	futex_t			task_st;
> +	union {
> +		futex_t		task_st;
> +		unsigned long	task_st_le_bits;
> +	};
>  };
>  
>  enum {
> diff --git a/include/common/bitops.h b/include/common/bitops.h
> index c594a0fad..1e6411243 100644
> --- a/include/common/bitops.h
> +++ b/include/common/bitops.h
> @@ -1,4 +1,23 @@
>  #ifndef __CR_COMMON_BITOPS_H__
>  #define __CR_COMMON_BITOPS_H__
>  #include "common/asm/bitops.h"
> +
> +#include "common/bitsperlong.h"
> +#include <endian.h>
> +
> +#if __BYTE_ORDER == __BIG_ENDIAN
> +#define BITOP_LE_SWIZZLE	((BITS_PER_LONG-1) & ~0x7)
> +#else
> +#define BITOP_LE_SWIZZLE	0
> +#endif
> +
> +static inline int test_and_set_bit_le(int nr, void *addr)
> +{
> +	return test_and_set_bit(nr ^ BITOP_LE_SWIZZLE, addr);
> +}
> +
> +static inline void clear_bit_le(int nr, void *addr)
> +{
> +	clear_bit(nr ^ BITOP_LE_SWIZZLE, addr);
> +}
>  #endif
> 


More information about the CRIU mailing list