[CRIU] [PATCH v5 15/33] files: Implement {set, clear, wait}_fds_event()

Andrei Vagin avagin at virtuozzo.com
Thu Jan 19 11:42:42 PST 2017


On Thu, Jan 19, 2017 at 02:18:01PM +0300, Kirill Tkhai wrote:
> On 19.01.2017 05:46, Andrei Vagin wrote:
> > On Mon, Dec 26, 2016 at 05:27:08PM +0300, Kirill Tkhai wrote:
> >> The idea is symilar to kernel's wake_up() and wait_event().
> >> One task needs some event. It checks the event has not
> >> happened yet (fle hasn't received, unix peer hasn't bound, etc)
> >> and calls get_fds_event(). Other task makes the event
> >> (sends a fle, binds the peer to a name, etc) and calls set_fds_event().
> >>
> >> So, while there is no an event, the first task is sleeping,
> >> and the second wakes it up later:
> >>
> >> Task A:      clear_fds_event();
> >>              if (!socket_bound)
> >>                      wait_fds_event(); /* sleep */
> >>
> >> Task B:      bind_socket();
> >>              set_fds_event();         /* wake up */
> >>
> >> For the details of using see next patches.
> >>
> >> v5: Use bit operations.
> >>     Split clear_fds_event from wait function.
> >>
> >> v2: Do not wait for foreign transport sock is ready,
> >> as it's guarantied by we create it before CR_STATE_FORKING.
> >>
> >> Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
> >> ---
> >>  criu/files.c          |   29 +++++++++++++++++++++++++++++
> >>  criu/include/files.h  |    2 ++
> >>  include/common/lock.h |    6 ++++++
> >>  3 files changed, 37 insertions(+)
> >>
> >> diff --git a/criu/files.c b/criu/files.c
> >> index bff8880bf..87d8fc655 100644
> >> --- a/criu/files.c
> >> +++ b/criu/files.c
> >> @@ -149,6 +149,35 @@ unsigned int find_unused_fd(struct list_head *head, int hint_fd)
> >>  	return fd;
> >>  }
> >>  
> >> +int set_fds_event(pid_t virt)
> >> +{
> >> +	struct pstree_item *item;
> >> +	int old;
> >> +
> >> +	item = pstree_item_by_virt(virt);
> >> +	BUG_ON(!item);
> >> +
> >> +	old = test_and_set_bit(FDS_EVENT_BIT, (unsigned long *)&item->task_st);
> >> +
> >> +	if (!(old & FDS_EVENT))
> >> +		futex_wake(&item->task_st);
> >> +	return 0;
> >> +}
> >> +
> >> +void clear_fds_event(void)
> >> +{
> >> +	futex_t *f = &current->task_st;
> >> +
> >> +	clear_bit(FDS_EVENT_BIT, (unsigned long *)&f->raw.counter);
> > 
> > CID 174702 (#1 of 1): Out-of-bounds access (INCOMPATIBLE_CAST)
> 
> JFI, what is CID?

It is a bug id in https://scan.coverity.com/

https://scan.coverity.com/projects/avagin-criu?tab=overview

> 
> > incompatible_cast: Pointer &f->raw.counter points to an object whose
> > effective type is int (32 bits, signed) but is dereferenced as a wider
> > unsigned long (64 bits, unsigned). This may lead to memory corruption.
> > [show details]
> 
> [PATCH]files: Fix test and set endianess problem
>     
> Andrew Vagin reported the problem found by a checker:
> 
>     CID 174702 (#1 of 1): Out-of-bounds access (INCOMPATIBLE_CAST)
>     incompatible_cast: Pointer &f->raw.counter points to an object whose
>     effective type is int (32 bits, signed) but is dereferenced as a wider
>     unsigned long (64 bits, unsigned). This may lead to memory corruption.
>     
> It looks like, this points to real problem, which may happen on big-endian
> platforms. In the code I relay on the fact, that FDS_EVENT_BIT has a small
> number and the value, it determines, fits into int type without problems.
> But it's correct only for little-endian.
> 
> In case of big-endian, if the word size is 8 bytes, then FDS_EVENT value
> is in the last bytes, so there is an access to wrong memory.
> 
> To fix the problem, I suggest to use little-endian byte order to work
> with task_st futex. Then, the bits from 0 to 31 will be in the low adresses,
> i.e. in task_st futex. There is new primitives test_and_set_bit_le() and
> set_bit_le() borrowed from the linux kernel for that.
> 
> This fixes the problem, but I suppose, the checker does not see the problem
> so deep, and just compares the types size, so it will fail again.
> So, let's enlarge the bit field size to silence it.
> 
> Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
> 
> diff --git a/criu/files.c b/criu/files.c
> index 23a70deee..d22b280e0 100644
> --- a/criu/files.c
> +++ b/criu/files.c
> @@ -152,30 +152,33 @@ unsigned int find_unused_fd(struct list_head *head, int hint_fd)
>  int set_fds_event(pid_t virt)
>  {
>  	struct pstree_item *item;
> -	int old;
> +	bool is_set;
>  
>  	item = pstree_item_by_virt(virt);
>  	BUG_ON(!item);
>  
> -	old = test_and_set_bit(FDS_EVENT_BIT, (unsigned long *)&item->task_st);
> +	is_set = !!test_and_set_bit_le(FDS_EVENT_BIT, &item->task_st_le_bits);
>  
> -	if (!(old & FDS_EVENT))
> +	if (!is_set)
>  		futex_wake(&item->task_st);
>  	return 0;
>  }
>  
>  void clear_fds_event(void)
>  {
> -	futex_t *f = &current->task_st;
> -
> -	clear_bit(FDS_EVENT_BIT, (unsigned long *)&f->raw.counter);
> +	clear_bit_le(FDS_EVENT_BIT, &current->task_st_le_bits);
>  }
>  
>  void wait_fds_event(void)
>  {
>  	futex_t *f = &current->task_st;
> -
> -	futex_wait_if_cond(f, FDS_EVENT, &);
> +	int value;
> +#if BITS_PER_LONG == 64
> +	value = htole64(FDS_EVENT);
> +#else
> +	value = htole32(FDS_EVENT);
> +#endif
> +	futex_wait_if_cond(f, value, &);
>  	clear_fds_event();
>  }
>  /*
> diff --git a/criu/include/pstree.h b/criu/include/pstree.h
> index 10e391d39..b4dc6b635 100644
> --- a/criu/include/pstree.h
> +++ b/criu/include/pstree.h
> @@ -25,7 +25,10 @@ struct pstree_item {
>  	struct pid		*threads;	/* array of threads */
>  	CoreEntry		**core;
>  	TaskKobjIdsEntry	*ids;
> -	futex_t			task_st;
> +	union {
> +		futex_t		task_st;
> +		unsigned long	task_st_le_bits;
> +	};
>  };
>  
>  enum {
> diff --git a/include/common/bitops.h b/include/common/bitops.h
> index c594a0fad..1e6411243 100644
> --- a/include/common/bitops.h
> +++ b/include/common/bitops.h
> @@ -1,4 +1,23 @@
>  #ifndef __CR_COMMON_BITOPS_H__
>  #define __CR_COMMON_BITOPS_H__
>  #include "common/asm/bitops.h"
> +
> +#include "common/bitsperlong.h"
> +#include <endian.h>
> +
> +#if __BYTE_ORDER == __BIG_ENDIAN
> +#define BITOP_LE_SWIZZLE	((BITS_PER_LONG-1) & ~0x7)
> +#else
> +#define BITOP_LE_SWIZZLE	0
> +#endif
> +
> +static inline int test_and_set_bit_le(int nr, void *addr)
> +{
> +	return test_and_set_bit(nr ^ BITOP_LE_SWIZZLE, addr);
> +}
> +
> +static inline void clear_bit_le(int nr, void *addr)
> +{
> +	clear_bit(nr ^ BITOP_LE_SWIZZLE, addr);
> +}
>  #endif


More information about the CRIU mailing list