[Users] ploop, simfs, and disk space

Kir Kolyshkin kir at openvz.org
Fri Apr 6 17:33:15 EDT 2012


I want to summarise some facts about simfs, ploop, and disk quota.

== simfs case ==
If a container is on simfs, it uses the host file system, usually /vz. 
Because many containers share one file system, per-container limits are 
needed for both disk space used and disk i-nodes (roughly the number of 
files/directories) used. These two limits are called vzquota, and are 
controlled by the --diskspace and --diskinodes parameters of the vzctl 
set command.

For both diskspace and diskinodes, there are two values, a soft limit 
and a hard limit, which you can specify using the sss:hhh syntax. For 
example, the "vzctl set 333 --diskspace 10G:11G --save" command sets the 
soft disk space limit to 10G and the hard disk space limit to 11G. The 
difference between the soft and hard limits is that the soft limit can 
be temporarily exceeded, while the hard limit can not be exceeded. Here 
"temporarily" is defined by the third parameter, --quotatime, which sets 
the time (in seconds) during which the soft limit can be exceeded. This 
value is otherwise known as the grace period. Once the grace period 
has expired, the soft limit is enforced as a hard limit.

Example: the admin sets disk limits in the following way:
   vzctl set 333 --diskspace 10G:11G --diskinodes 1M:1.1M --quotatime 3600 --save

Now, the container's root user can use 10G of disk space and have about 
1 million files inside his CT. He can go up to 11G of disk space and 
about 1.1 million files, but for no longer than 1 hour. That is, once he 
uses more than 10G of disk space, he is still able to use the 11th 
gigabyte, but only during the first hour after that.
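
The soft limit / hard limit / grace period rules can be modelled with a 
few lines of Python. This is only a toy sketch of the semantics described 
above, not how vzquota is implemented in the kernel; the Quota class and 
the may_use() function are made-up names for illustration.

```python
from dataclasses import dataclass
from typing import Optional

GIB = 1024 ** 3


@dataclass
class Quota:
    soft: int    # soft limit, in bytes
    hard: int    # hard limit, in bytes
    grace: int   # grace period in seconds (what --quotatime sets)
    # Time at which usage first went above the soft limit, if it has.
    soft_exceeded_at: Optional[float] = None


def may_use(q: Quota, new_usage: int, now: float) -> bool:
    """Return True if usage is allowed to become new_usage at time now."""
    if new_usage > q.hard:
        # The hard limit can never be exceeded.
        return False
    if new_usage <= q.soft:
        # Back under the soft limit: the grace period no longer applies.
        q.soft_exceeded_at = None
        return True
    # Usage is between the soft and the hard limit.
    if q.soft_exceeded_at is None:
        # First time over the soft limit: the grace period starts now.
        q.soft_exceeded_at = now
        return True
    # After the grace period, the soft limit acts as a hard limit.
    return now - q.soft_exceeded_at <= q.grace
```

With Quota(soft=10 * GIB, hard=11 * GIB, grace=3600), a request that 
goes just over 10G starts the clock, requests up to 11G succeed for the 
next hour, and after that anything above 10G is refused until usage 
drops back to 10G or less.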

This system of space, inodes, soft limits, hard limits and grace period 
is nothing new; it's the same as traditional UNIX per-user and per-group 
disk quotas. The only major difference is that in this case quotas are 
per-CT (per simfs mount point).

There is a --diskquota parameter (and a DISK_QUOTA config file 
parameter) which is used to enable/disable per-CT disk quotas. If you 
set DISK_QUOTA=no in /etc/vz/vz.conf, no per-CT disk quotas will be 
initialized. If you set DISK_QUOTA=no in a CT configuration file (e.g. 
/etc/vz/conf/333.conf), no disk quotas for this CT will be initialized.

NOTE that, as with any other disk quota, if you write to the file 
system bypassing the quota (such as directly to VE_PRIVATE, e.g. 
/vz/private/333), the current quota usage values will be incorrect. In 
that case, you need to stop the CT and run vzctl quotainit to 
recalculate quota usage. In some cases (such as after an incorrect 
system shutdown caused by a power outage) quota files are marked dirty, 
and such recalculation happens automatically during CT start.

For the sake of completeness, there are also the vzctl quotaon and vzctl 
quotaoff commands, but usually you don't have to use those two, because 
quotaon is performed during vzctl mount (and vzctl start), and quotaoff 
is performed during vzctl stop (and vzctl umount).

From inside the CT, utilities such as df show those quota limits 
instead of the actual available disk space and inodes (this is 
implemented in the kernel by having a special version of the statfs() 
syscall for simfs which looks into vzquota). Sometimes it gets 
complicated, so if you see something strange in df output, it is either 
incorrect quota values (and you need to recalculate quota usage, see 
above), or perhaps the disk space available on the file system is less 
than the quota limits. For lots of gory details on this stuff, please see 
http://wiki.openvz.org/Disk_quota,_df_and_stat_weird_behaviour
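
To illustrate why df inside a CT can show less free space than the quota 
suggests, here is a deliberately simplified Python model. It assumes the 
reported available space is the quota headroom capped by what the host 
file system really has left; the actual statfs() logic for simfs is more 
involved (see the wiki page above), and df_view() is a hypothetical name.

```python
def df_view(quota_limit: int, quota_used: int, fs_free: int) -> dict:
    """Simplified picture of what df might report inside a simfs CT.

    All values are in the same unit (e.g. 1K blocks).
    """
    # Space the quota would still allow the CT to use.
    quota_free = max(quota_limit - quota_used, 0)
    return {
        "total": quota_limit,
        "used": quota_used,
        # Assumption: the CT can never see more free space than the
        # underlying host file system actually has.
        "avail": min(quota_free, fs_free),
    }
```

So with a limit of 10 units and 4 used, the CT would see 6 available on 
a roomy host, but only 2 if the host file system has just 2 units free.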

Also, you can check /proc/vz/vzquota to see for which containers quota 
is on, as well as its current limits and usage values.

I am leaving more advanced topics such as using vzquota utility directly 
as a (highly optional) exercise for (highly) advanced users.

== ploop case ==

In the ploop case, each CT has its own image file with its own file 
system on top of it, so there is no shared file system and vzquota is 
naturally not required. Therefore, the options --diskquota (and the 
DISK_QUOTA parameter), --diskinodes and --quotatime are silently ignored.

Option --diskspace is not ignored, but instead of changing vzquota disk 
space limit, it initiates the resize of the CT ploop image file and the 
filesystem which resides on top of that image.

NOTE that image and file system resize, especially when the CT is 
running (so-called online resize), is quite tricky, and in the worst 
case can lead to image or filesystem damage that is beyond repair. 
So exercise it a lot in a testing environment, but do not abuse it in 
production*.

NOTE that specifying two values for --diskspace in the ploop case makes 
no sense. Only one value (the hard limit) is used (and, as with other 
parameters, if you only specify one value, the second one becomes equal 
to the first one). So using --diskspace 1G:1.1G is the same as 
--diskspace 1.1G (or --diskspace 0:1.1G). Easy rule: do not use two 
numbers for diskspace, just one.
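
The rule can be spelled out as a small Python sketch. It only mimics the 
sss:hhh handling described above and is not vzctl's actual parser; 
split_limits() and ploop_size() are hypothetical helper names, and the 
values are kept as strings so no assumption is made about units.

```python
from typing import Tuple


def split_limits(spec: str) -> Tuple[str, str]:
    """Split an 'sss:hhh' value into (soft, hard).

    If only one value is given, the second one becomes equal to the first.
    """
    soft, sep, hard = spec.partition(":")
    if not sep:
        hard = soft
    return soft, hard


def ploop_size(spec: str) -> str:
    """For ploop, only the hard limit matters: it is the new image size."""
    return split_limits(spec)[1]
```

Here ploop_size("1G:1.1G"), ploop_size("1.1G") and ploop_size("0:1.1G") 
all give "1.1G", which is why a single number is enough.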

* NOTE ploop is not yet ready for production, and will not be for at 
least a few more months.

