[CRIU] GSoC: optimize logging engine

Andrei Vagin avagin at gmail.com
Sun Sep 1 20:58:22 MSK 2019


On Mon, Aug 26, 2019 at 05:49:24PM +0300, Anastasia Markina wrote:
> Hello! I've finalized my work for the GSoC project as a patch, attached to
> this message. As before, it is also available in my github repo,
> https://github.com/anastasiamarkina/criu/tree/nastya-flog2
> 
> The next part is a fragment of binlog.txt file included into the patch,
> which explains how to use it:
> 
> USAGE: TURNING ON BINARY LOGGING
> 
> Classical text logging is done by default.
> Binary logging is turned on by the --binlog command line parameter.
> It should be combined with --log-file filename parameter to specify the log
> file name (this second parameter is used with both text and binary logging).
> 
> The example:
> 
> criu --log-file /tmp/sdf.bin --binlog check
> 
> USAGE: READING BINARY LOG FILE
> 
> There is a --print-log parameter which turnes criy into a log reader.
> It should be combined with --log-file filename parameter, which specifies
> the log file name. When these parameters are used, CRIU prints all messages
> from the binary log file on the screen and exits. This looks as follows:
> 
> criu --log-file /tmp/sdf.bin --binlog --print-log
> 
> P.S. Thank you for the possibility to work with the project!

> From 515a0edf4ffdbfc173e237553e29eae993e8aab6 Mon Sep 17 00:00:00 2001
> From: Anastasia Markina <asyamarkina2 at gmail.com>
> Date: Thu, 25 Jul 2019 15:51:04 +0300
> Subject: [PATCH] criu binary logging based on slightly modified flog engine
>  merging flog code into criu and making the first working version with binlog
>  log file name is specified after the --log-file key as usual; added --binlog
>  option to save binary log instead of text; added --print-log option to print
>  messages from binary log file and exit added documentation about binlog
> 
> ---
>  Documentation/binlog.txt  |  52 +++++++++
>  Makefile                  |  13 +++
>  criu/Makefile             |   1 +
>  criu/Makefile.packages    |   2 +-
>  criu/config.c             |   2 +
>  criu/crtools.c            | 111 ++++++++++++-------
>  criu/include/cr_options.h |   2 +
>  criu/include/criu-log.h   |   1 +
>  criu/include/log.h        |  36 +++++-
>  criu/log.c                |  40 +++++--
>  flog/Makefile             |   8 ++
>  flog/include/flog.h       |   9 ++
>  flog/include/uapi/flog.h  | 168 ++++++++++++++++++++++++++++
>  flog/src/flog.c           | 227 ++++++++++++++++++++++++++++++++++++++
>  14 files changed, 620 insertions(+), 52 deletions(-)
>  create mode 100644 Documentation/binlog.txt
>  create mode 100644 flog/Makefile
>  create mode 100644 flog/include/flog.h
>  create mode 100644 flog/include/uapi/flog.h
>  create mode 100644 flog/src/flog.c
> 
> diff --git a/Documentation/binlog.txt b/Documentation/binlog.txt
> new file mode 100644
> index 00000000..1a5c80c1
> --- /dev/null
> +++ b/Documentation/binlog.txt
> @@ -0,0 +1,52 @@
> +CRIU binary logging
> +===================
> +
> +Usage: turning on binary logging
> +--------------------------------
> +
> +Classical text logging is done by default.
> +Binary logging is turned on by the `--binlog` command line parameter.
> +It should be combined with `--log-file filename` parameter to specify
> +the log file name (this second parameter is used with both text and binary logging).
> +
> +The example:
> +
> +`criu --log-file /tmp/sdf.bin --binlog check`
> +
> +Usage: reading binary log file
> +------------------------------
> +
> +There is a `--print-log` parameter which turnes criy into a log reader.
> +It should be combined with `--log-file filename` parameter, which specifies
> +the log file name. When these parameters are used, CRIU prints all messages
> +from the binary log file on the screen and exits. This looks as follows:
> +
> +`criu --log-file /tmp/sdf.bin --binlog --print-log`
> +
> +Binary log file format
> +----------------------
> +
> +Text logging in CRIU is based on the printf functions family, with format string
> +followed by arguments. To make logging in binary form faster, CRIU just stores
> +format string and arguments in binary log without parsing. When binary log file
> +is read with `--print-log` parameter, these data are taken from the binary log
> +records and passed to a printf function.
> +
> +Binary log file format is highly inspired by the [flog experimental project](https://github.com/cyrillos/flog).
> +Each log message is stored as a format string aned a variable number of
> +arguments. The variable size record has following structure:
> +
> +
> +| field   | datatype     |  explanation                    |
> +|---------|--------------|---------------------------------|
> +| magic   | unsigned int | magic code, equal to `0x676f6c66` |
> +| version | unsigned int | binlog file version, equal to `1` |
> +| size    | unsigned int | actual size of the record (is needed because of variable arguments|
> +| nargs   | unsigned int | number of arguments |
> +| mask    | unsigned int | mask, which encodes the argument type, e.g. long integer or string|
> +| fmt     | long         | offset to the format string, starting from the beginning of the record|
> +| args[0] | long         | 1st argument: the value, if it is a number, or the offset to the string, starting from the beginning of the record |
> +| .....   |              | |
> +| args[nargs-1] | long   | last argument: the value, if it is a number, or the offset to the string, starting from the beginning of the record |
> +| .....   |              | format string and all other strings passed as arguments, if any |
> +
> diff --git a/Makefile b/Makefile
> index 0140330e..c2091a88 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -127,6 +127,8 @@ ifeq ($(GMON),1)
>  export GMON GMONLDOPT
>  endif
>  
> +CFLAGS			+= -iquote flog/include/ -iquote flog/include/uapi
> +
>  AFLAGS			+= -D__ASSEMBLY__
>  CFLAGS			+= $(USERCFLAGS) $(WARNINGS) $(DEFINES) -iquote include/
>  HOSTCFLAGS		+= $(WARNINGS) $(DEFINES) -iquote include/
> @@ -218,6 +220,17 @@ soccr/built-in.o: $(CONFIG_HEADER) .FORCE
>  $(SOCCR_A): |soccr/built-in.o
>  criu-deps	+= $(SOCCR_A)
>  
> +#
> +# Fast logging library
> +FLOG_A := flog/libflog.a
> +flog/Makefile: ;
> +flog/%: $(CONFIG_HEADER) .FORCE
> +	$(Q) $(MAKE) $(build)=flog $@
> +flog/built-in.o: $(CONFIG_HEADER) .FORCE
> +	$(Q) $(MAKE) $(build)=flog all
> +$(FLOG_A): | flog/built-in.o
> +criu-deps	+= $(FLOG_A)
> +
>  #
>  # CRIU building done in own directory
>  # with slightly different rules so we
> diff --git a/criu/Makefile b/criu/Makefile
> index 4134e505..ee0aa291 100644
> --- a/criu/Makefile
> +++ b/criu/Makefile
> @@ -70,6 +70,7 @@ PROGRAM-BUILTINS	+= images/built-in.o
>  PROGRAM-BUILTINS	+= $(obj)/built-in.o
>  PROGRAM-BUILTINS	+= $(ARCH-LIB)
>  PROGRAM-BUILTINS	+= soccr/libsoccr.a
> +PROGRAM-BUILTINS	+= flog/libflog.a
>  PROGRAM-BUILTINS	+= $(COMPEL_LIBS)
>  
>  $(obj)/built-in.o: pie
> diff --git a/criu/Makefile.packages b/criu/Makefile.packages
> index f380fa2f..65111f9c 100644
> --- a/criu/Makefile.packages
> +++ b/criu/Makefile.packages
> @@ -32,7 +32,7 @@ REQ-DEB-PKG-NAMES	+= python-future
>  REQ-RPM-PKG-TEST-NAMES	+= $(PYTHON)-pyyaml
>  endif
>  
> -export LIBS		+= -lprotobuf-c -ldl -lnl-3 -lsoccr -Lsoccr/ -lnet
> +export LIBS		+= -lprotobuf-c -ldl -lnl-3 -lsoccr -Lsoccr/ -lnet -lffi
>  
>  check-packages-failed:
>  	$(warning Can not find some of the required libraries)
> diff --git a/criu/config.c b/criu/config.c
> index 3a54afd4..7d79c638 100644
> --- a/criu/config.c
> +++ b/criu/config.c
> @@ -458,6 +458,8 @@ int parse_options(int argc, char **argv, bool *usage_error,
>  		BOOL_OPT(SK_EST_PARAM, &opts.tcp_established_ok),
>  		{ "close",			required_argument,	0, 1043	},
>  		BOOL_OPT("log-pid", &opts.log_file_per_pid),
> +		BOOL_OPT("binlog", &opts.binlog),
> +		BOOL_OPT("print-log", &opts.printlog),
>  		{ "version",			no_argument,		0, 'V'	},
>  		BOOL_OPT("evasive-devices", &opts.evasive_devices),
>  		{ "pidfile",			required_argument,	0, 1046	},
> diff --git a/criu/crtools.c b/criu/crtools.c
> index 97a6d6d6..010d8d11 100644
> --- a/criu/crtools.c
> +++ b/criu/crtools.c
> @@ -48,6 +48,14 @@
>  #include "sysctl.h"
>  #include "img-remote.h"
>  
> +flog_ctx_t flog_ctx;
> +
> +
> +int criu_quit(int ret_val) {
> +	log_fini();	
		    ^^^^ white spaces

This patch contains a lot of white spaces, pls remove all of them.

> +	exit(ret_val);
> +}
> +
>  int main(int argc, char *argv[], char *envp[])
>  {
>  	int ret = -1;
> @@ -55,7 +63,14 @@ int main(int argc, char *argv[], char *envp[])
>  	bool has_exec_cmd = false;
>  	bool has_sub_command;
>  	int state = PARSING_GLOBAL_CONF;
> +	
> +	flog_ctx.readonly=1; 
> +	/* It is not known yet if we will have binlog, but let's not try to 
> +	 * write into it for safety reasons 
> +	 */
> +	 
>  
> +	/* FIXME: where to put flog_fini? */
>  	BUILD_BUG_ON(CTL_32 != SYSCTL_TYPE__CTL_32);
>  	BUILD_BUG_ON(__CTL_STR != SYSCTL_TYPE__CTL_STR);
>  	/* We use it for fd overlap handling in clone_service_fd() */
> @@ -67,20 +82,34 @@ int main(int argc, char *argv[], char *envp[])
>  
>  	cr_pb_init();
>  	setproctitle_init(argc, argv, envp);
> -
> -	if (argc < 2)
> +	
> +	if (argc < 2) {
>  		goto usage;
> -
> +	}

why do we need these brackets?
> +	
>  	init_opts();
>  
> -
>  	ret = parse_options(argc, argv, &usage_error, &has_exec_cmd, state);
>  
> +	if (opts.printlog) {
> +				
> +		if (log_init(opts.output))
> +			return 1;	
> +		/* print and exit */
> +		printf("--- printing log from %s file ---\n", opts.output);
> +		struct stat st;			
> +		stat(opts.output, &st);			
> +		if(st.st_size) 
> +			flog_decode_all(&flog_ctx, STDOUT_FILENO);
> +		printf("--- end of the log ---\n");
> +		criu_quit(0);
> +	}
> +	
>  	if (ret == 1)
> -		return 1;
> +		return 1;	
>  	if (ret == 2)
> -		goto usage;
> -
> +		goto usage;	
> +	
>  	log_set_loglevel(opts.log_level);
>  
>  	if (!strcmp(argv[1], "swrk")) {
> @@ -98,6 +127,7 @@ int main(int argc, char *argv[], char *envp[])
>  
>  	if (check_options()) {
>  		flush_early_log_buffer(STDERR_FILENO);
> +		flog_fini(&flog_ctx);
>  		return 1;
>  	}
>  
> @@ -106,7 +136,7 @@ int main(int argc, char *argv[], char *envp[])
>  
>  	if (opts.work_dir == NULL)
>  		SET_CHAR_OPTS(work_dir, opts.imgs_dir);
> -
> +	
>  	if (optind >= argc) {
>  		pr_msg("Error: command is required\n");
>  		goto usage;
> @@ -131,8 +161,8 @@ int main(int argc, char *argv[], char *envp[])
>  		}
>  
>  		opts.exec_cmd = xmalloc((argc - optind) * sizeof(char *));
> -		if (!opts.exec_cmd)
> -			return 1;
> +		if (!opts.exec_cmd) 
> +			criu_quit(1);
>  		memcpy(opts.exec_cmd, &argv[optind + 1], (argc - optind - 1) * sizeof(char *));
>  		opts.exec_cmd[argc - optind - 1] = NULL;
>  	} else {
> @@ -148,7 +178,7 @@ int main(int argc, char *argv[], char *envp[])
>  	if (strcmp(argv[optind], "service")) {
>  		ret = open_image_dir(opts.imgs_dir);
>  		if (ret < 0)
> -			return 1;
> +			criu_quit(1);
>  	}
>  
>  	/*
> @@ -163,14 +193,14 @@ int main(int argc, char *argv[], char *envp[])
>  
>  	if (chdir(opts.work_dir)) {
>  		pr_perror("Can't change directory to %s", opts.work_dir);
> -		return 1;
> +		criu_quit(1);
>  	}
>  
>  	if (log_init(opts.output))
> -		return 1;
> -
> +		return 1;		
> +	
>  	if (kerndat_init())
> -		return 1;
> +		criu_quit(1);
>  
>  	if (opts.deprecated_ok)
>  		pr_debug("DEPRECATED ON\n");
> @@ -178,7 +208,7 @@ int main(int argc, char *argv[], char *envp[])
>  	if (!list_empty(&opts.inherit_fds)) {
>  		if (strcmp(argv[optind], "restore")) {
>  			pr_err("--inherit-fd is restore-only option\n");
> -			return 1;
> +			criu_quit(1);
>  		}
>  		/* now that log file is set up, print inherit fd list */
>  		inherit_fd_log();
> @@ -190,7 +220,7 @@ int main(int argc, char *argv[], char *envp[])
>  	if (!strcmp(argv[optind], "dump")) {
>  		if (!opts.tree_id)
>  			goto opt_pid_missing;
> -		return cr_dump_tasks(opts.tree_id);
> +		criu_quit(cr_dump_tasks(opts.tree_id));
>  	}
>  
>  	if (!strcmp(argv[optind], "pre-dump")) {
> @@ -199,10 +229,10 @@ int main(int argc, char *argv[], char *envp[])
>  
>  		if (opts.lazy_pages) {
>  			pr_err("Cannot pre-dump with --lazy-pages\n");
> -			return 1;
> +			criu_quit(1);
>  		}
>  
> -		return cr_pre_dump_tasks(opts.tree_id) != 0;
> +		criu_quit(cr_pre_dump_tasks(opts.tree_id) != 0);
>  	}
>  
>  	if (!strcmp(argv[optind], "restore")) {
> @@ -217,39 +247,41 @@ int main(int argc, char *argv[], char *envp[])
>  			ret = 1;
>  		}
>  
> -		return ret != 0;
> +		criu_quit(ret != 0);
>  	}
>  
>  	if (!strcmp(argv[optind], "lazy-pages"))
> -		return cr_lazy_pages(opts.daemon_mode) != 0;
> +		criu_quit(cr_lazy_pages(opts.daemon_mode) != 0);
>  
> -	if (!strcmp(argv[optind], "check"))
> -		return cr_check() != 0;
> +	if (!strcmp(argv[optind], "check")) {
> +		ret = cr_check();		
> +		criu_quit(ret != 0);
> +	}
>  
>  	if (!strcmp(argv[optind], "page-server"))
> -		return cr_page_server(opts.daemon_mode, false, -1) != 0;
> +		criu_quit(cr_page_server(opts.daemon_mode, false, -1) != 0);
>  
>  	if (!strcmp(argv[optind], "image-cache")) {
>  		if (!opts.port)
>  			goto opt_port_missing;
> -		return image_cache(opts.daemon_mode, DEFAULT_CACHE_SOCKET);
> +		criu_quit(image_cache(opts.daemon_mode, DEFAULT_CACHE_SOCKET));
>  	}
>  
>  	if (!strcmp(argv[optind], "image-proxy")) {
>  		if (!opts.addr) {
>  			pr_msg("Error: address not specified\n");
> -			return 1;
> +			criu_quit(1);
>  		}
>  		if (!opts.port)
>  			goto opt_port_missing;
> -		return image_proxy(opts.daemon_mode, DEFAULT_PROXY_SOCKET);
> +		criu_quit(image_proxy(opts.daemon_mode, DEFAULT_PROXY_SOCKET));
>  	}
>  
>  	if (!strcmp(argv[optind], "service"))
> -		return cr_service(opts.daemon_mode);
> +		criu_quit(cr_service(opts.daemon_mode));
>  
>  	if (!strcmp(argv[optind], "dedup"))
> -		return cr_dedup() != 0;
> +		criu_quit(cr_dedup() != 0);
>  
>  	if (!strcmp(argv[optind], "cpuinfo")) {
>  		if (!argv[optind + 1]) {
> @@ -259,22 +291,23 @@ int main(int argc, char *argv[], char *envp[])
>  		if (!strcmp(argv[optind + 1], "dump"))
>  			return cpuinfo_dump();
>  		else if (!strcmp(argv[optind + 1], "check"))
> -			return cpuinfo_check();
> +			criu_quit(cpuinfo_check());
>  	}
>  
>  	if (!strcmp(argv[optind], "exec")) {
>  		pr_msg("The \"exec\" action is deprecated by the Compel library.\n");
> -		return -1;
> +		criu_quit(-1);
>  	}
>  
>  	if (!strcmp(argv[optind], "show")) {
>  		pr_msg("The \"show\" action is deprecated by the CRIT utility.\n");
>  		pr_msg("To view an image use the \"crit decode -i $name --pretty\" command.\n");
> -		return -1;
> +		criu_quit(-1);
>  	}
> -
> +	
>  	pr_msg("Error: unknown command: %s\n", argv[optind]);
>  usage:
> +	printf("usage\n");
>  	pr_msg("\n"
>  "Usage:\n"
>  "  criu dump|pre-dump -t PID [<options>]\n"
> @@ -302,8 +335,8 @@ usage:
>  	);
>  
>  	if (usage_error) {
> -		pr_msg("\nTry -h|--help for more info\n");
> -		return 1;
> +		pr_msg("\nTry -h|--help for more info\n");				
> +		criu_quit(1);
>  	}
>  
>  	pr_msg("\n"
> @@ -425,7 +458,9 @@ usage:
>  "\n"
>  "* Logging:\n"
>  "  -o|--log-file FILE    log file name\n"
> +"     --binlog           use faster binary log instead of slower text log\n"
>  "     --log-pid          enable per-process logging to separate FILE.pid files\n"
> +"     --print-log        print log from the binlog file and exit\n"
>  "  -v[v...]|--verbosity  increase verbosity (can use multiple v)\n"
>  "  -vNUM|--verbosity=NUM set verbosity to NUM (higher level means more output):\n"
>  "                          -v1 - only errors and messages\n"
> @@ -466,13 +501,13 @@ usage:
>  "  -V|--version          show version\n"
>  	);
>  
> -	return 0;
> +	criu_quit(0);
>  
>  opt_port_missing:
>  	pr_msg("Error: port not specified\n");
> -	return 1;
> +	criu_quit(1);
>  
>  opt_pid_missing:
>  	pr_msg("Error: pid not specified\n");
> -	return 1;
> +	criu_quit(1);
>  }
> diff --git a/criu/include/cr_options.h b/criu/include/cr_options.h
> index c519c740..a79bdebc 100644
> --- a/criu/include/cr_options.h
> +++ b/criu/include/cr_options.h
> @@ -81,6 +81,8 @@ struct cr_options {
>  	int			evasive_devices;
>  	int			link_remap_ok;
>  	int			log_file_per_pid;
> +	int			binlog;
> +	int			printlog;
>  	bool			swrk_restore;
>  	char			*output;
>  	char			*root;
> diff --git a/criu/include/criu-log.h b/criu/include/criu-log.h
> index 21ef5430..57792a44 100644
> --- a/criu/include/criu-log.h
> +++ b/criu/include/criu-log.h
> @@ -25,6 +25,7 @@
>  struct timeval;
>  
>  extern int log_init(const char *output);
> +
>  extern void log_fini(void);
>  extern int log_init_by_pid(pid_t pid);
>  extern void log_closedir(void);
> diff --git a/criu/include/log.h b/criu/include/log.h
> index 15787b09..08c57381 100644
> --- a/criu/include/log.h
> +++ b/criu/include/log.h
> @@ -9,6 +9,11 @@
>  #include <errno.h>
>  #include <stdarg.h>
>  
> +#include "flog.h"
> +
> +/* FIXME: Should be in options? */
> +extern flog_ctx_t flog_ctx;
> +
>  extern void vprint_on_level(unsigned int loglevel, const char *format,
>  		va_list params);
>  
> @@ -32,35 +37,53 @@ extern void print_on_level(unsigned int loglevel, const char *format, ...)
>  
>  void flush_early_log_buffer(int fd);
>  
> +#ifdef CR_NOGLIBC
> +#undef flog_encode
> +#define flog_encode(ctx, fmt, ...)
> +#endif
> +
>  #define print_once(loglevel, fmt, ...)					\
>  	do {								\
>  		static bool __printed;					\
>  		if (!__printed) {					\
> +			flog_encode(&flog_ctx, fmt, ##__VA_ARGS__);	\
>  			print_on_level(loglevel, fmt, ##__VA_ARGS__);	\
>  			__printed = 1;					\
>  		}							\
>  	} while (0)
>  
>  #define pr_msg(fmt, ...)						\
> +	do {								\
> +	flog_encode(&flog_ctx, fmt, ##__VA_ARGS__);			\
>  	print_on_level(LOG_MSG,						\
> -		       fmt, ##__VA_ARGS__)
> +		       fmt, ##__VA_ARGS__);				\
> +	} while (0)
>  
>  #define pr_info(fmt, ...)						\
> +	do {								\
> +	flog_encode(&flog_ctx, fmt, ##__VA_ARGS__);			\
>  	print_on_level(LOG_INFO,					\
> -		       LOG_PREFIX fmt, ##__VA_ARGS__)
> +		       LOG_PREFIX fmt, ##__VA_ARGS__);			\
> +	} while (0)
>  
>  #define pr_err(fmt, ...)						\
> +	do {								\
> +	flog_encode(&flog_ctx, fmt, ##__VA_ARGS__);			\
>  	print_on_level(LOG_ERROR,					\
>  		       "Error (%s:%d): " LOG_PREFIX fmt,		\
> -		       __FILE__, __LINE__, ##__VA_ARGS__)
> +		       __FILE__, __LINE__, ##__VA_ARGS__);		\
> +	} while (0)
>  
>  #define pr_err_once(fmt, ...)						\
>  	print_once(LOG_ERROR, fmt, ##__VA_ARGS__)
>  
>  #define pr_warn(fmt, ...)						\
> +	do {								\
> +	flog_encode(&flog_ctx, fmt, ##__VA_ARGS__);			\
>  	print_on_level(LOG_WARN,					\
>  		       "Warn  (%s:%d): " LOG_PREFIX fmt,		\
> -		       __FILE__, __LINE__, ##__VA_ARGS__)
> +		       __FILE__, __LINE__, ##__VA_ARGS__);		\
> +	} while (0)
>  
>  #define pr_warn_once(fmt, ...)						\
>         print_once(LOG_WARN,						\
> @@ -68,8 +91,11 @@ void flush_early_log_buffer(int fd);
>  			__FILE__, __LINE__, ##__VA_ARGS__)
>  
>  #define pr_debug(fmt, ...)						\
> +	do {								\
> +	flog_encode(&flog_ctx, fmt, ##__VA_ARGS__);			\
>  	print_on_level(LOG_DEBUG,					\
> -		       LOG_PREFIX fmt, ##__VA_ARGS__)
> +		       LOG_PREFIX fmt, ##__VA_ARGS__);			\
> +	} while (0)
>  
>  #ifndef CR_NOGLIBC
>  
> diff --git a/criu/log.c b/criu/log.c
> index 8bdf8353..723fe72d 100644
> --- a/criu/log.c
> +++ b/criu/log.c
> @@ -10,6 +10,7 @@
>  #include <sys/time.h>
>  #include <sys/resource.h>
>  #include <sys/utsname.h>
> +#include <sys/stat.h>
>  
>  #include <fcntl.h>
>  
> @@ -207,10 +208,11 @@ void flush_early_log_buffer(int fd)
>  int log_init(const char *output)
>  {
>  	int new_logfd, fd;
> +	int log_file_open_mode, log_file_perm;
>  
>  	gettimeofday(&start, NULL);
>  	reset_buf_off();
> -
> +	memset(&flog_ctx, 0, sizeof(flog_ctx));
>  	if (output && !strncmp(output, "-", 2)) {
>  		new_logfd = dup(STDOUT_FILENO);
>  		if (new_logfd < 0) {
> @@ -218,7 +220,23 @@ int log_init(const char *output)
>  			return -1;
>  		}
>  	} else if (output) {
> -		new_logfd = open(output, O_CREAT|O_TRUNC|O_WRONLY|O_APPEND, 0600);
> +		if (opts.printlog) {
> +			/* read old binlog */
> +			log_file_open_mode = O_RDONLY;
> +			flog_ctx.readonly=1; /* don't try to write into it */
> +			struct stat st;			
> +			stat(opts.output, &st);			
> +			flog_ctx.size=st.st_size;
> +		}
> +		else if (opts.binlog) {
> +			log_file_open_mode = O_RDWR | O_CREAT | O_TRUNC;
> +		}
> +		else {
> +			flog_ctx.readonly=1; /* no binlog, don't try to write into it */
> +			log_file_open_mode = O_CREAT|O_TRUNC|O_WRONLY|O_APPEND;
> +		}
> +		log_file_perm = opts.binlog?0644:0600;
> +		new_logfd = open(output, log_file_open_mode, log_file_perm);
>  		if (new_logfd < 0) {
>  			pr_perror("Can't create log file %s", output);
>  			return -1;
> @@ -234,15 +252,19 @@ int log_init(const char *output)
>  	fd = install_service_fd(LOG_FD_OFF, new_logfd);
>  	if (fd < 0)
>  		goto err;
> -
> +			
> +	if (opts.binlog||opts.printlog) {
> +		
> +		flog_map_buf(fd, &flog_ctx);
> +	}
> +	
> +	
>  	init_done = 1;
>  
>  	/*
>  	 * Once logging is setup this write out all early log messages.
>  	 * Only those messages which have to correct log level are printed.
>  	 */
> -	flush_early_log_buffer(fd);
> -
>  	print_versions();
>  
>  	return 0;
> @@ -277,7 +299,9 @@ int log_init_by_pid(pid_t pid)
>  
>  void log_fini(void)
>  {
> -	close_service_fd(LOG_FD_OFF);
> +	flog_fini(&flog_ctx);
> +	
> +	
>  }
>  
>  static void soccr_print_on_level(unsigned int loglevel, const char *format, ...)
> @@ -392,9 +416,9 @@ void vprint_on_level(unsigned int loglevel, const char *format, va_list params)
>  }
>  
>  void print_on_level(unsigned int loglevel, const char *format, ...)
> -{
> +{	
>  	va_list params;
> -
> +	if (opts.binlog) return;
>  	va_start(params, format);
>  	vprint_on_level(loglevel, format, params);
>  	va_end(params);
> diff --git a/flog/Makefile b/flog/Makefile
> new file mode 100644
> index 00000000..71e398ed
> --- /dev/null
> +++ b/flog/Makefile
> @@ -0,0 +1,8 @@
> +ccflags-y	+= -iquote flog/include
> +ccflags-y	+= -iquote include
> +ccflags-y	+= -fno-strict-aliasing
> +ldflags-y	+= -r
> +
> +lib-name := libflog.a
> +
> +lib-y += src/flog.o
> diff --git a/flog/include/flog.h b/flog/include/flog.h
> new file mode 100644
> index 00000000..f00c2054
> --- /dev/null
> +++ b/flog/include/flog.h
> @@ -0,0 +1,9 @@
> +#ifndef __FLOG_H__
> +#define __FLOG_H__
> +
> +#include <string.h>
> +#include <errno.h>
> +
> +#include "uapi/flog.h"
> +
> +#endif /* __FLOG_H__ */
> diff --git a/flog/include/uapi/flog.h b/flog/include/uapi/flog.h
> new file mode 100644
> index 00000000..8ab3566c
> --- /dev/null
> +++ b/flog/include/uapi/flog.h
> @@ -0,0 +1,168 @@
> +#ifndef __UAPI_FLOG_H__
> +#define __UAPI_FLOG_H__
> +
> +#include <stdbool.h>
> +#include <string.h>
> +#include <errno.h>
> +
> +/*
> + * We work with up to 32 arguments in macros here.
> + * If more provided -- behaviour is undefined.
> + */
> +
> +/*
> + * By Laurent Deniau at https://groups.google.com/forum/#!topic/comp.std.c/d-6Mj5Lko_s
> + */
> +#define FLOG_PP_NARG_(...)			FLOG_PP_ARG_N(__VA_ARGS__)
> +#define FLOG_PP_NARG(...)			FLOG_PP_NARG_(1, ##__VA_ARGS__, FLOG_PP_RSEQ_N())
> +
> +#define FLOG_PP_ARG_N( _0, _1, _2, _3, _4,	\
> +		       _5, _6, _7, _8, _9,	\
> +		      _10,_11,_12,_13,_14,	\
> +		      _15,_16,_17,_18,_19,	\
> +		      _20,_21,_22,_23,_24,	\
> +		      _25,_26,_27,_28,_29,	\
> +		      _30,_31,  N, ...)		N
> +
> +#define FLOG_PP_RSEQ_N()			\
> +		       31, 30, 29, 28, 27,	\
> +		       26, 25, 24, 23, 22,	\
> +		       21, 20, 19, 18, 17,	\
> +		       16, 15, 14, 13, 12,	\
> +		       11, 10,  9,  8,  7,	\
> +		        6,  5,  4,  3,  2,	\
> +		        1,  0
> +
> +#define FLOG_GENMASK_0(N, x)		0
> +#define FLOG_GENMASK_1(N,  op, x, ...)	 (op(N,  0, x))
> +#define FLOG_GENMASK_2(N,  op, x, ...)	((op(N,  1, x)) | FLOG_GENMASK_1(N,  op,  __VA_ARGS__))
> +#define FLOG_GENMASK_3(N,  op, x, ...)	((op(N,  2, x)) | FLOG_GENMASK_2(N,  op,  __VA_ARGS__))
> +#define FLOG_GENMASK_4(N,  op, x, ...)	((op(N,  3, x)) | FLOG_GENMASK_3(N,  op,  __VA_ARGS__))
> +#define FLOG_GENMASK_5(N,  op, x, ...)	((op(N,  4, x)) | FLOG_GENMASK_4(N,  op,  __VA_ARGS__))
> +#define FLOG_GENMASK_6(N,  op, x, ...)	((op(N,  5, x)) | FLOG_GENMASK_5(N,  op,  __VA_ARGS__))
> +#define FLOG_GENMASK_7(N,  op, x, ...)	((op(N,  6, x)) | FLOG_GENMASK_6(N,  op,  __VA_ARGS__))
> +#define FLOG_GENMASK_8(N,  op, x, ...)	((op(N,  7, x)) | FLOG_GENMASK_7(N,  op,  __VA_ARGS__))
> +#define FLOG_GENMASK_9(N,  op, x, ...)	((op(N,  8, x)) | FLOG_GENMASK_8(N,  op,  __VA_ARGS__))
> +#define FLOG_GENMASK_10(N, op, x, ...)	((op(N,  9, x)) | FLOG_GENMASK_9(N,  op,  __VA_ARGS__))
> +#define FLOG_GENMASK_11(N, op, x, ...)	((op(N, 10, x)) | FLOG_GENMASK_10(N, op,  __VA_ARGS__))
> +#define FLOG_GENMASK_12(N, op, x, ...)	((op(N, 11, x)) | FLOG_GENMASK_11(N, op,  __VA_ARGS__))
> +#define FLOG_GENMASK_13(N, op, x, ...)	((op(N, 12, x)) | FLOG_GENMASK_12(N, op,  __VA_ARGS__))
> +#define FLOG_GENMASK_14(N, op, x, ...)	((op(N, 13, x)) | FLOG_GENMASK_13(N, op,  __VA_ARGS__))
> +#define FLOG_GENMASK_15(N, op, x, ...)	((op(N, 14, x)) | FLOG_GENMASK_14(N, op,  __VA_ARGS__))
> +#define FLOG_GENMASK_16(N, op, x, ...)	((op(N, 15, x)) | FLOG_GENMASK_15(N, op,  __VA_ARGS__))
> +#define FLOG_GENMASK_17(N, op, x, ...)	((op(N, 16, x)) | FLOG_GENMASK_16(N, op,  __VA_ARGS__))
> +#define FLOG_GENMASK_18(N, op, x, ...)	((op(N, 17, x)) | FLOG_GENMASK_17(N, op,  __VA_ARGS__))
> +#define FLOG_GENMASK_19(N, op, x, ...)	((op(N, 18, x)) | FLOG_GENMASK_18(N, op,  __VA_ARGS__))
> +#define FLOG_GENMASK_20(N, op, x, ...)	((op(N, 19, x)) | FLOG_GENMASK_19(N, op,  __VA_ARGS__))
> +#define FLOG_GENMASK_21(N, op, x, ...)	((op(N, 20, x)) | FLOG_GENMASK_20(N, op,  __VA_ARGS__))
> +#define FLOG_GENMASK_22(N, op, x, ...)	((op(N, 21, x)) | FLOG_GENMASK_21(N, op,  __VA_ARGS__))
> +#define FLOG_GENMASK_23(N, op, x, ...)	((op(N, 22, x)) | FLOG_GENMASK_22(N, op,  __VA_ARGS__))
> +#define FLOG_GENMASK_24(N, op, x, ...)	((op(N, 23, x)) | FLOG_GENMASK_23(N, op,  __VA_ARGS__))
> +#define FLOG_GENMASK_25(N, op, x, ...)	((op(N, 24, x)) | FLOG_GENMASK_24(N, op,  __VA_ARGS__))
> +#define FLOG_GENMASK_26(N, op, x, ...)	((op(N, 25, x)) | FLOG_GENMASK_25(N, op,  __VA_ARGS__))
> +#define FLOG_GENMASK_27(N, op, x, ...)	((op(N, 26, x)) | FLOG_GENMASK_26(N, op,  __VA_ARGS__))
> +#define FLOG_GENMASK_28(N, op, x, ...)	((op(N, 27, x)) | FLOG_GENMASK_27(N, op,  __VA_ARGS__))
> +#define FLOG_GENMASK_29(N, op, x, ...)	((op(N, 28, x)) | FLOG_GENMASK_28(N, op,  __VA_ARGS__))
> +#define FLOG_GENMASK_30(N, op, x, ...)	((op(N, 29, x)) | FLOG_GENMASK_29(N, op,  __VA_ARGS__))
> +#define FLOG_GENMASK_31(N, op, x, ...)	((op(N, 30, x)) | FLOG_GENMASK_30(N, op,  __VA_ARGS__))
> +#define FLOG_GENMASK_32(N, op, x, ...)	((op(N, 31, x)) | FLOG_GENMASK_31(N, op,  __VA_ARGS__))
> +
> +#define FLOG_CONCAT(arg1, arg2)		FLOG_CONCAT1(arg1, arg2)
> +#define FLOG_CONCAT1(arg1, arg2)	FLOG_CONCAT2(arg1, arg2)
> +#define FLOG_CONCAT2(arg1, arg2)	arg1##arg2
> +
> +#define FLOG_GENMASK_(N, op, ...)	FLOG_CONCAT(FLOG_GENMASK_, N)(N, op, ##__VA_ARGS__)
> +#define FLOG_GENMASK(op, ...)		FLOG_GENMASK_(FLOG_PP_NARG(__VA_ARGS__), op, ##__VA_ARGS__)
> +
> +#define flog_genbit(ord, n, v, ...)					\
> +	_Generic((v),							\
> +									\
> +		 /* Basic types */					\
> +		 char:				0,			\
> +		 signed char:			0,			\
> +		 unsigned char:			0,			\
> +		 signed short int:		0,			\
> +		 unsigned short int:		0,			\
> +		 signed int:			0,			\
> +		 unsigned int:			0,			\
> +		 signed long:			0,			\
> +		 unsigned long:			0,			\
> +		 signed long long:		0,			\
> +		 unsigned long long:		0,			\
> +									\
> +		 /* Not used for a while */				\
> +		 /* float:			12, */			\
> +		 /* double:			13, */			\
> +		 /* long double:		14, */			\
> +									\
> +		 /* Basic poniters */					\
> +		 char *:			(1u << (ord - n - 1)),	\
> +		 signed char *:			(1u << (ord - n - 1)),	\
> +		 unsigned char *:		(1u << (ord - n - 1)),	\
> +		 signed short int *:		0,			\
> +		 unsigned short int *:		0,			\
> +		 signed int *:			0,			\
> +		 unsigned int *:		0,			\
> +		 signed long *:			0,			\
> +		 unsigned long *:		0,			\
> +		 signed long long *:		0,			\
> +		 unsigned long long *:		0,			\
> +		 void *:			0,			\
> +									\
> +		 /* Const basic pointers */				\
> +		 const char *:			(1u << (ord - n - 1)),	\
> +		 const signed char *:		(1u << (ord - n - 1)),	\
> +		 const unsigned char *:		(1u << (ord - n - 1)),	\
> +		 const signed short int *:	0,			\
> +		 const unsigned short int *:	0,			\
> +		 const signed int *:		0,			\
> +		 const unsigned int *:		0,			\
> +		 const signed long *:		0,			\
> +		 const unsigned long *:		0,			\
> +		 const signed long long *:	0,			\
> +		 const unsigned long long *:	0,			\
> +		 const void *:			0,			\
> +									\
> +		 /* Systypes and pointers */				\
> +		 default:			-1)
> +
> +#define FLOG_MAGIC	0x676f6c66
> +#define FLOG_VERSION	1
> +
> +typedef struct {
> +	unsigned int	magic;
> +	unsigned int	version;
> +	unsigned int	size;
> +	unsigned int	nargs;
> +	unsigned int	mask;
> +	long		fmt;
> +	long		args[0];
> +} flog_msg_t;
> +
> +typedef struct {
> +	char		*buf;
> +	char		*pos;
> +	size_t		size;
> +	size_t		left;
> +	int 		readonly;
> +} flog_ctx_t;
> +
> +extern int flog_init(flog_ctx_t *ctx);
> +extern void flog_fini(flog_ctx_t *ctx);
> +
> +extern void flog_decode_all(flog_ctx_t *ctx, int fdout);
> +
> +extern int flog_encode_msg(flog_ctx_t *ctx,
> +			   unsigned int nargs, unsigned int mask,
> +			   const char *format, ...);
> +extern int flog_decode_msg(flog_msg_t *m, int fdout);
> +
> +extern int flog_map_buf(int fdout, flog_ctx_t *flog_ctx);
> +
> +#define flog_encode(ctx, fmt, ...)						\
> +	flog_encode_msg(ctx,							\
> +			FLOG_PP_NARG(__VA_ARGS__),				\
> +			FLOG_GENMASK(flog_genbit, ##__VA_ARGS__),		\
> +			fmt, ##__VA_ARGS__)
> +
> +#endif /* __UAPI_FLOG_H__ */
> diff --git a/flog/src/flog.c b/flog/src/flog.c
> new file mode 100644
> index 00000000..9a538795
> --- /dev/null
> +++ b/flog/src/flog.c
> @@ -0,0 +1,227 @@
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <stdarg.h>
> +#include <unistd.h>
> +#include <stdint.h>
> +
> +#include <sys/param.h>
> +#include <sys/mman.h>
> +
> +#include <ffi.h>
> +
> +#include "common/compiler.h"
> +
> +#include "flog.h"
> +
> +#define BUF_SIZE (1<<20)
> +
> +static char _mbuf[BUF_SIZE];
> +static char *mbuf = _mbuf;
> +static char *fbuf;
> +static uint64_t fsize;
> +static uint64_t mbuf_size = sizeof(_mbuf);

Here should be a comment which explains what all these global variables
mean. Why do we need them if we have flog_ctx?

> +static int binlog_fd;
> +
> +int flog_map_buf(int fdout, flog_ctx_t *flog_ctx)
> +{
> +	uint64_t off = 0;
> +	void *addr;
> +	
> +	mbuf_size = 2 * BUF_SIZE;
> +	
> +	if (flog_ctx->size==0)	flog_ctx->size=BUF_SIZE;
> +
> +	/*
> +	 * Two buffers are mmaped into memory. A new one is mapped when a first
> +	 * one is completly filled.
> +	 */
> +	if (fbuf && (mbuf - fbuf < BUF_SIZE))
> +		return 0;
> +
> +	if (fbuf) {
> +		if (munmap(fbuf, BUF_SIZE * 2)) {
> +			fprintf(stderr, "Unable to unmap a buffer: %m");
> +			return 1;
> +		}
> +		off = mbuf - fbuf - BUF_SIZE;
> +		fbuf = NULL;
> +	}
> +
> +	if (fsize == 0)
> +		fsize += BUF_SIZE;
> +	fsize += BUF_SIZE;
> +
> +	if (!flog_ctx->readonly) {
> +		if (ftruncate(fdout, fsize)) {
> +			fprintf(stderr, "Unable to truncate a file: %m");
> +			return -1;
> +		}
> +	}	
> +	if (!fbuf) {
> +		if (!flog_ctx->readonly) {			
> +			addr = mmap(NULL, BUF_SIZE * 2, PROT_WRITE | PROT_READ,
> +			    MAP_FILE | MAP_SHARED, fdout, fsize - 2 * BUF_SIZE);

I don't understand what fsize - 2 * BUF_SIZE means

> +		}
> +		else {			

pls, follow the linux kernel coding style.

> +			addr = mmap(NULL, flog_ctx->size, PROT_READ,
> +			    MAP_FILE | MAP_SHARED, fdout, 0);
> +			mbuf_size = flog_ctx->size;
> +		}
> +	}
> +	else {
> +		addr = mremap(fbuf + BUF_SIZE, BUF_SIZE,
> +				BUF_SIZE * 2, MREMAP_FIXED, fbuf);

this isn't safe, you can remap something useful.

> +	}
> +	if (addr == MAP_FAILED) {
> +		fprintf(stderr, "Unable to map a buffer: %m");

stderr can't be used if a log file is specified.

> +		return -1;
> +	}
> +	
> +	fbuf = addr;
> +	mbuf = fbuf + off;
> +
> +	binlog_fd=fdout;
> +	flog_init(flog_ctx);
> +	return 0;
> +}
> +
> +int flog_init(flog_ctx_t *ctx)
> +{
> +	ctx->size = ctx->left = mbuf_size; 
> +	ctx->pos = ctx->buf = mbuf;
> +	if (!ctx->buf)
> +		return -ENOMEM;
> +	
> +	return 0;
> +}
> +
> +void flog_fini(flog_ctx_t *ctx)

criu can crash, will we be able to read a log in this case? We need to
check this case.

> +{
> +	if (mbuf == _mbuf)
> +		return;
> +	munmap(ctx->buf, BUF_SIZE * 2);

in flog_init, you call mmap with flog_ctx->size

	munmap(ctx->buf, ctx->size) ?

> +	ctx->size=(size_t) (ctx->pos - ctx->buf);
> +	if (!ctx->readonly) {
> +		if (ftruncate(binlog_fd, ctx->size)) {
> +			fprintf(stderr, "Unable to truncate a file: %m");
> +		}
> +	}
> +	
> +}
> +
> +int flog_decode_msg(flog_msg_t *ro_m, int fdout)
> +{
> +	ffi_type *args[34] = {
> +		[0]		= &ffi_type_sint,
> +		[1]		= &ffi_type_pointer,
> +		[2 ... 33]	= &ffi_type_slong
> +	};
> +	void *values[34];

what does this 34 magic means?

> +	ffi_cif cif;
> +	ffi_arg rc;
> +	flog_msg_t *m;
> +	size_t i, ret = 0;
> +	char *fmt;
> +
> +	m=malloc(ro_m->size);
> +	memcpy(m, ro_m, ro_m->size);
> +	values[0] = (void *)&fdout;
> +	if (m->magic != FLOG_MAGIC) {
> +		return -EINVAL;
> +	}
> +	if (m->version != FLOG_VERSION)	{
> +		return -EINVAL;
> +	}
> +
> +	fmt = (void *)m + m->fmt;
> +	values[1] = &fmt;
> +	
> +	for (i = 0; i < m->nargs; i++) {		
> +		values[i + 2] = (void *)&m->args[i];
> +		if (m->mask & (1u << i)) {
> +			m->args[i] = (long)((void *)m + m->args[i]);
> +		}	
> +		
> +	}
> +	
> +	int sdf=ffi_prep_cif(&cif, FFI_DEFAULT_ABI, m->nargs + 2,
> +			 &ffi_type_sint, args);
> +	if ( sdf == FFI_OK) {
> +		ffi_call(&cif, FFI_FN(dprintf), &rc, values);
> +	} else
> +		ret = -1;
> +	
> +	free(m);
> +	return ret;
> +}
> +
> +void flog_decode_all(flog_ctx_t *ctx, int fdout)
> +{
> +	flog_msg_t *m;
> +	char *pos;
> +	printf("log size is %ld\n", ctx->size);
> +	if (ctx->size == 0)
> +		return;
> +	if (ctx->readonly) ctx->pos=ctx->buf + ctx->size;
> +	for (pos = ctx->buf; pos < ctx->pos; ) {
> +		m = (void *)pos;
> +		flog_decode_msg(m ,fdout);
> +		pos += m->size;
> +	}
> +}
> +
> +int flog_encode_msg(flog_ctx_t *ctx, unsigned int nargs, unsigned int mask, const char *format, ...)
> +{	
> +	if (ctx->readonly) {		
> +		return 0;
> +	}
> +	flog_msg_t *m = (void *)ctx->pos;
> +	char *str_start, *p;
> +	va_list argptr;
> +	size_t i;
> +
> +	m->nargs = nargs;
> +	m->mask = mask;
> +
> +	str_start = (void *)m->args + sizeof(m->args[0]) * nargs;
> +	p = memccpy(str_start, format, 0, ctx->left - (str_start - ctx->pos));
> +	if (!p)
> +		return -ENOMEM;
> +
> +	m->fmt = str_start - ctx->pos;
> +	str_start = p;
> +	va_start(argptr, format);
> +	for (i = 0; i < nargs; i++) {
> +		m->args[i] = (long)va_arg(argptr, long);
> +		/*
> +		 * If we got a string, we should either
> +		 * reference it when in rodata, or make
> +		 * a copy (FIXME implement rodata refs).
> +		 */
> +		if (mask & (1u << i)) {
> +			p = memccpy(str_start, (void *)m->args[i], 0, ctx->left - (str_start - ctx->pos));
> +			if (!p)
> +				return -ENOMEM;
> +			m->args[i] = str_start - ctx->pos;
> +			str_start = p;
> +		}
> +	}
> +	va_end(argptr);
> +	m->size = str_start - ctx->pos;
> +
> +	/*
> +	 * A magic is required to know where we stop writing into a log file,
> +	 * if it was not properly closed.  The file is mapped into memory, so a
> +	 * space in the file is allocated in advance and at the end it can have
> +	 * some unused tail.
> +	 */
> +	m->magic = FLOG_MAGIC;
> +	m->version = FLOG_VERSION;
> +
> +	m->size = round_up(m->size, 8);
> +
> +	/* Advance position and left bytes in context memory */
> +	ctx->left -= m->size;
> +	ctx->pos += m->size;
> +	return 0;
> +}
> -- 
> 2.22.0
> 

> _______________________________________________
> CRIU mailing list
> CRIU at openvz.org
> https://lists.openvz.org/mailman/listinfo/criu



More information about the CRIU mailing list