[CRIU] [PATCH]v2 service: SIGCHLD handler

Thu Oct 31 16:52:29 PDT 2013

On 31.10.2013 20:06, Pavel Emelyanov wrote:
> On 10/31/2013 04:44 AM, Ruslan Kuprieiev wrote:
>> Hi!
>>
>> Now we're ignoring SIGCHLD from forked children.
>> This patch provides SIGCHLD handler, that will wait for child and write some
>> information about it's exit status to logfile.
>>
>> P.S. This is my first signal handler, so please take a good look at this patch.
>>
>> v2: fixed few mistakes. Thanks Andrew Vagin <avagin at parallels.com> for help.
>>
>> Signed-off-by: Ruslan Kuprieiev <kupruser at gmail.com>
>> ---
>>   cr-service.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
>>   1 file changed, 75 insertions(+), 2 deletions(-)
>>
>> diff --git a/cr-service.c b/cr-service.c
>> index a79ebc7..a920824 100644
>> --- a/cr-service.c
>> +++ b/cr-service.c
>> @@ -253,6 +253,53 @@ err:
>>   	return -1;
>>   }
>>   
>> +static void reap_worker(int signo)
>> +{
>> +	int saved_errno;
>> +	int status;
>> +	pid_t pid;
>> +
>> +	saved_errno = errno;
>> +
>> +	/*
>> +	 * Accordingly to signal(7):
>> +	 *   Multiple instances of real-time signals can be queued.  By
>> +	 *   contrast, if multiple instances of a standard signal are
>> +	 *   delivered while that signal is currently blocked, then only one
>> +	 *   instance is queued.
> No need in citing man page in code.
>
>> +	 * So, as we block SIGCHLD, lets wait for every child that has
>> +	 * changed state.
>> +	 */
>> +	while (1) {
>> +		pid = waitpid(-1, &status, WNOHANG|WUNTRACED|WCONTINUED);
> Why just WNOHANG is not enough?

To provide more information.

>> +
>> +		if (pid <= 0) {
>> +			errno = saved_errno;
>> +			return;
>> +		}
>> +
>> +		if (WIFEXITED(status))
>> +			pr_info("Worker(pid %d) exited with %d\n",
>> +				pid, WEXITSTATUS(status));
>> +
>> +		else if (WIFSIGNALED(status)) {
>> +			pr_info("Worker(pid %d) ", pid);
>> +#ifdef WCOREDUMP
>> +			if (WCOREDUMP(status))
>> +				pr_info("prodused a core dump and ");
>> +#endif
>> +			pr_info("was killed by %d\n", WTERMSIG(status));
>> +
>> +		} else if (WIFSTOPPED(status))
>> +			pr_info("Worker(pid %d) was stopped by %d\n",
>> +				pid, WSTOPSIG(status));
>> +
>> +		else if (WIFCONTINUED(status))
>> +			pr_info("Worker(pid %d) was continued\n", pid);
> I'm not sure we need all the events. Exited and killed is more than enough.

I thought more information is better. Especially, when we got all 
information about children from logfile. I also thought, criu has 
parasite and stuff and works with kernel, so it wouldn't do any harm, if 
we'll provide more information. Or would it?

>> +
>> +	}
>> +}
>> +
>>   int cr_service(bool daemon_mode)
>>   {
>>   	int server_fd = -1;
>> @@ -264,6 +311,9 @@ int cr_service(bool daemon_mode)
>>   	socklen_t server_addr_len;
>>   	socklen_t client_addr_len;
>>   
>> +	struct sigaction new_action, old_action;
>> +	sigset_t block_mask;
> No need in dedicated block_mask, you can fill the mask
> right on sigaction.
>
>> +
>>   	server_fd = socket(AF_LOCAL, SOCK_SEQPACKET, 0);
>>   	if (server_fd == -1) {
>>   		pr_perror("Can't initialize service socket.");
>> @@ -323,8 +373,25 @@ int cr_service(bool daemon_mode)
>>   		}
>>   	}
>>   
>> -	/* FIXME Do not ignore children's return values */
>> -	signal(SIGCHLD, SIG_IGN);
>> +	/* Set SIGCHLD handler */
>> +	sigemptyset(&block_mask);
>> +
>> +	/* FIXME May need to block some other signals too */
>> +	sigaddset(&block_mask, SIGCHLD);
>> +	sigaddset(&block_mask, SIGINT);
>> +	sigaddset(&block_mask, SIGQUIT);
>> +	sigaddset(&block_mask, SIGHUP);
>> +	sigaddset(&block_mask, SIGTERM);
>> +	sigaddset(&block_mask, SIGTSTP);
> Any explanation to this choice?

I've opened man and looked for signals, that may wait for handler to 
finish writing to logfile.

>
>> +
>> +	new_action.sa_handler = reap_worker;
>> +	new_action.sa_mask	= block_mask;
>> +	new_action.sa_flags	= SA_RESTART;
>> +
>> +	if (sigaction(SIGCHLD, &new_action, &old_action)) {
>> +		pr_perror("Can't setup SIGCHLD handler");
>> +		return -1;
>> +	}
>>   
>>   	while (1) {
>>   		int sk;
>> @@ -342,6 +409,12 @@ int cr_service(bool daemon_mode)
>>   		if (child_pid == 0) {
>>   			int ret;
>>   
>> +			/* Set default SIGCHLD handler back */
>> +			if (sigaction(SIGCHLD, &old_action, NULL)) {
>> +				pr_perror("Can't setup default SIGCHLD handler back");
>> +				exit(1);
>> +			}
>> +
>>   			close(server_fd);
>>   			ret = cr_service_work(sk);
>>   			close(sk);
>>
>