[CRIU] [PATCH]v2 service: SIGCHLD handler
Pavel Emelyanov
xemul at parallels.com
Fri Nov 1 02:31:58 PDT 2013
On 11/01/2013 03:52 AM, Ruslan Kuprieiev wrote:
> On 31.10.2013 20:06, Pavel Emelyanov wrote:
>> On 10/31/2013 04:44 AM, Ruslan Kuprieiev wrote:
>>> Hi!
>>>
>>> Now we're ignoring SIGCHLD from forked children.
>>> This patch provides SIGCHLD handler, that will wait for child and write some
>>> information about it's exit status to logfile.
>>>
>>> P.S. This is my first signal handler, so please take a good look at this patch.
>>>
>>> v2: fixed few mistakes. Thanks Andrew Vagin <avagin at parallels.com> for help.
>>>
>>> Signed-off-by: Ruslan Kuprieiev <kupruser at gmail.com>
>>> ---
>>> cr-service.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
>>> 1 file changed, 75 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/cr-service.c b/cr-service.c
>>> index a79ebc7..a920824 100644
>>> --- a/cr-service.c
>>> +++ b/cr-service.c
>>> @@ -253,6 +253,53 @@ err:
>>> return -1;
>>> }
>>>
>>> +static void reap_worker(int signo)
>>> +{
>>> + int saved_errno;
>>> + int status;
>>> + pid_t pid;
>>> +
>>> + saved_errno = errno;
>>> +
>>> + /*
>>> + * Accordingly to signal(7):
>>> + * Multiple instances of real-time signals can be queued. By
>>> + * contrast, if multiple instances of a standard signal are
>>> + * delivered while that signal is currently blocked, then only one
>>> + * instance is queued.
>> No need in citing man page in code.
>>
>>> + * So, as we block SIGCHLD, lets wait for every child that has
>>> + * changed state.
>>> + */
>>> + while (1) {
>>> + pid = waitpid(-1, &status, WNOHANG|WUNTRACED|WCONTINUED);
>> Why just WNOHANG is not enough?
>
> To provide more information.
>
>>> +
>>> + if (pid <= 0) {
>>> + errno = saved_errno;
>>> + return;
>>> + }
>>> +
>>> + if (WIFEXITED(status))
>>> + pr_info("Worker(pid %d) exited with %d\n",
>>> + pid, WEXITSTATUS(status));
>>> +
>>> + else if (WIFSIGNALED(status)) {
>>> + pr_info("Worker(pid %d) ", pid);
>>> +#ifdef WCOREDUMP
>>> + if (WCOREDUMP(status))
>>> + pr_info("prodused a core dump and ");
>>> +#endif
>>> + pr_info("was killed by %d\n", WTERMSIG(status));
>>> +
>>> + } else if (WIFSTOPPED(status))
>>> + pr_info("Worker(pid %d) was stopped by %d\n",
>>> + pid, WSTOPSIG(status));
>>> +
>>> + else if (WIFCONTINUED(status))
>>> + pr_info("Worker(pid %d) was continued\n", pid);
>> I'm not sure we need all the events. Exited and killed is more than enough.
>
> I thought more information is better. Especially, when we got all
> information about children from logfile. I also thought, criu has
> parasite and stuff and works with kernel, so it wouldn't do any harm, if
> we'll provide more information. Or would it?
Typically people only need to know about termination.
>>> +
>>> + }
>>> +}
>>> +
>>> int cr_service(bool daemon_mode)
>>> {
>>> int server_fd = -1;
>>> @@ -264,6 +311,9 @@ int cr_service(bool daemon_mode)
>>> socklen_t server_addr_len;
>>> socklen_t client_addr_len;
>>>
>>> + struct sigaction new_action, old_action;
>>> + sigset_t block_mask;
>> No need in dedicated block_mask, you can fill the mask
>> right on sigaction.
>>
>>> +
>>> server_fd = socket(AF_LOCAL, SOCK_SEQPACKET, 0);
>>> if (server_fd == -1) {
>>> pr_perror("Can't initialize service socket.");
>>> @@ -323,8 +373,25 @@ int cr_service(bool daemon_mode)
>>> }
>>> }
>>>
>>> - /* FIXME Do not ignore children's return values */
>>> - signal(SIGCHLD, SIG_IGN);
>>> + /* Set SIGCHLD handler */
>>> + sigemptyset(&block_mask);
>>> +
>>> + /* FIXME May need to block some other signals too */
>>> + sigaddset(&block_mask, SIGCHLD);
>>> + sigaddset(&block_mask, SIGINT);
>>> + sigaddset(&block_mask, SIGQUIT);
>>> + sigaddset(&block_mask, SIGHUP);
>>> + sigaddset(&block_mask, SIGTERM);
>>> + sigaddset(&block_mask, SIGTSTP);
>> Any explanation to this choice?
>
> I've opened man and looked for signals, that may wait for handler to
> finish writing to logfile.
Much more unhandled signals will terminate the app while it's writing
the log file. Let's leave only those necessary for correct child handling.
>>
>>> +
>>> + new_action.sa_handler = reap_worker;
>>> + new_action.sa_mask = block_mask;
>>> + new_action.sa_flags = SA_RESTART;
>>> +
>>> + if (sigaction(SIGCHLD, &new_action, &old_action)) {
>>> + pr_perror("Can't setup SIGCHLD handler");
>>> + return -1;
>>> + }
>>>
>>> while (1) {
>>> int sk;
>>> @@ -342,6 +409,12 @@ int cr_service(bool daemon_mode)
>>> if (child_pid == 0) {
>>> int ret;
>>>
>>> + /* Set default SIGCHLD handler back */
>>> + if (sigaction(SIGCHLD, &old_action, NULL)) {
>>> + pr_perror("Can't setup default SIGCHLD handler back");
>>> + exit(1);
>>> + }
>>> +
>>> close(server_fd);
>>> ret = cr_service_work(sk);
>>> close(sk);
>>>
>>
>
> .
>
More information about the CRIU
mailing list