[Users] Cloud Storage for OpenVZ Containers

Edward Konetzko konetzed at gmail.com
Wed Jan 29 14:57:10 PST 2014


I added 10G to each VM and rebooted, but got the same issue. I have
included the same output you asked for last time in case something
obvious stands out; a quick free-space check idea follows the output.


[root@ovz3 ~]# pstorage -c test_cluster stat
connected to MDS#2
Cluster 'test_cluster': healthy
Space: [OK] allocatable 56GB of 63GB, free 60GB of 63GB
MDS nodes: 3 of 3, epoch uptime: 2h 20m
CS nodes:  3 of 3 (3 avail, 0 inactive, 0 offline)
License: [Error] License not loaded, capacity limited to 100Gb
Replication:  1 norm,  1 limit
Chunks: [OK] 1 (100%) healthy,  0 (0%) standby,  0 (0%) degraded,  0 (0%) urgent,
              0 (0%) blocked,  0 (0%) pending,  0 (0%) offline,  0 (0%) replicating,
              0 (0%) overcommitted,  0 (0%) deleting,  0 (0%) void
FS:  1MB in 4 files, 4 inodes,  1 file maps,  1 chunks,  1 chunk replicas
IO:       read     0B/s (  0ops/s), write     0B/s (  0ops/s)
IO total: read       0B (    0ops), write       0B (    0ops)
Repl IO:  read     0B/s, write:     0B/s
Sync rate:   0ops/s, datasync rate:   0ops/s

MDSID STATUS   %CTIME   COMMITS   %CPU    MEM   UPTIME HOST
     1 avail      2.0%       0/s   0.0%    18m   2h 20m ovz1.home.int:2510
M   2 avail      2.4%       0/s   0.1%    18m   2h 20m ovz2.home.int:2510
     3 avail      3.8%       1/s   0.0%    18m   2h 20m ovz3.home.int:2510

  CSID STATUS      SPACE   FREE REPLICAS IOWAIT IOLAT(ms) QDEPTH HOST
  1025 active       21GB   19GB        0     0%       0/0    0.0 ovz1.home.int
  1026 active       21GB   19GB        0     0%       0/0    0.0 ovz2.home.int
  1027 active       21GB   20GB        1     0%       0/0    0.0 ovz3.home.int

  CLID   LEASES     READ    WRITE     RD_OPS     WR_OPS     FSYNCS IOLAT(ms) HOST
  2089      0/1     0B/s     0B/s     0ops/s     0ops/s     0ops/s       0/0 ovz1.home.int
  2090      0/0     0B/s     0B/s     0ops/s     0ops/s     0ops/s       0/0 ovz2.home.int
  2091      0/0     0B/s     0B/s     0ops/s     0ops/s     0ops/s       0/0 ovz3.home.int
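
For reference, a quick way to double-check that the extra space is
actually visible where the chunk servers keep their data would be
something like this on each node (a rough sketch only;
/pstorage/test_cluster-cs is just a guess at the CS repository path for
this test setup, so adjust it to wherever the CS was actually created):

  # cluster-wide view, same command as above
  pstorage -c test_cluster stat

  # local free space on the filesystem backing this node's chunk server;
  # the path below is an assumption about this test setup
  df -h /pstorage/test_cluster-cs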






On 01/29/2014 01:45 PM, Kirill Korotaev wrote:
> Edward, got it - there is a small threshold (10GB) of minimum free
> space on CSes (reserved for various cases, including recovery), and
> you have ~10GB free per CS, so you hit this threshold immediately.
>
> Most likely you are running from inside VMs, right? Just increase the
> disk space available to the CSes then.
>
>
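(If I read that right, the rough arithmetic for my setup would be:

    before the resize:  ~10GB free per CS - 10GB reserved  =  ~0GB usable
    after the resize:   ~19GB free per CS - 10GB reserved  =  ~9GB usable

so with the extra 10G per VM each CS should now have room to spare, which
is why I expected the resize to help.)
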
> On 29 Jan 2014, at 21:04, Edward Konetzko <konetzed at gmail.com 
> <mailto:konetzed at gmail.com>> wrote:
>
>> [konetzed@ovz2 ~]$ sudo pstorage -c test_cluster stat
>> connected to MDS#3
>> Cluster 'test_cluster': healthy
>> Space: [OK] allocatable 28GB of 35GB, free 31GB of 35GB
>> MDS nodes: 3 of 3, epoch uptime: 10h 25m
>> CS nodes:  3 of 3 (3 avail, 0 inactive, 0 offline)
>> License: [Error] License not loaded, capacity limited to 100Gb
>> Replication:  1 norm,  1 limit
>> Chunks: [OK] 1 (100%) healthy,  0 (0%) standby,  0 (0%) degraded,  0 (0%) urgent,
>>              0 (0%) blocked,  0 (0%) pending,  0 (0%) offline,  0 (0%) replicating,
>>              0 (0%) overcommitted,  0 (0%) deleting,  0 (0%) void
>> FS:  10KB in 2 files, 2 inodes,  1 file maps,  1 chunks,  1 chunk replicas
>> IO:       read     0B/s (  0ops/s), write     0B/s ( 0ops/s)
>> IO total: read       0B (    0ops), write       0B (    0ops)
>> Repl IO:  read     0B/s, write:     0B/s
>> Sync rate:   0ops/s, datasync rate:   0ops/s
>>
>> MDSID STATUS   %CTIME   COMMITS   %CPU    MEM   UPTIME HOST
>>     1 avail      3.1%       1/s   0.1%    14m   9h 58m ovz1.home.int:2510
>>     2 avail      2.5%       0/s   0.0%    14m   9h 14m ovz2.home.int:2510
>> M   3 avail      3.0%       1/s   0.3%    15m  10h 25m ovz3.home.int:2510
>>
>>  CSID STATUS      SPACE   FREE REPLICAS IOWAIT IOLAT(ms) QDEPTH HOST
>>  1025 active       11GB   10GB        0     0%       0/0    0.0 ovz1.home.int
>>  1026 active       11GB   10GB        0     0%       0/0    0.0 ovz2.home.int
>>  1027 active       11GB   10GB        1     0%       0/0    0.0 ovz3.home.int
>>
>>  CLID   LEASES     READ    WRITE     RD_OPS     WR_OPS     FSYNCS IOLAT(ms) HOST
>>  2060      0/0     0B/s     0B/s     0ops/s     0ops/s     0ops/s       0/0 ovz3.home.int
>>  2065      0/1     0B/s     0B/s     0ops/s     0ops/s     0ops/s       0/0 ovz1.home.int
>>
>> I do have Skype, but I have meetings all day for work and can't be on
>> a computer afterwards.  I may have time tomorrow if that would work.
>> I am in the Central time zone.
>>
>> Edward
>>
>>
>> On 01/29/2014 03:14 AM, Kirill Korotaev wrote:
>>> Edward,
>>>
>>> can you send me, in a private email, the output of:
>>> # pstorage -c <cluster> stat
>>>
>>> Do you have Skype?
>>>
>>> Thanks,
>>> Kirill
>>>
>>>
>>>
>>> On 29 Jan 2014, at 10:26, Edward Konetzko <konetzed at gmail.com 
>>> <mailto:konetzed at gmail.com>> wrote:
>>>
>>>> On 01/28/2014 09:51 AM, Kir Kolyshkin wrote:
>>>>> On 28 January 2014 02:55, Kirill Korotaev <dev at parallels.com 
>>>>> <mailto:dev at parallels.com>> wrote:
>>>>>
>>>>>     >> On 25 Jan 2014, at 07:38, Rene C. openvz at dokbua.com
>>>>>     <mailto:openvz at dokbua.com> wrote:
>>>>>     >>
>>>>>     >
>>>>>     > Hi,
>>>>>     >
>>>>>     > I read the website about the cloud storage and found some
>>>>>     > things that sound familiar to me.
>>>>>     >
>>>>>     > May I ask which filesystem you use to be able to regularly
>>>>>     > scrub and self-heal the data?
>>>>>     >
>>>>>     > Personally, I have been using zfsonlinux in production for a
>>>>>     > long time now and I am very satisfied with it; based on your
>>>>>     > description, it seems you would need something like that, plus
>>>>>     > something on top of the native filesystem, to get cloud storage.
>>>>>     >
>>>>>     > Or do you use Ceph or a similar "filesystem" which has
>>>>>     > comparable cloud capabilities?
>>>>>
>>>>>     It’s more like Ceph. Data is stored in a distributed way, so
>>>>>     unlike with ZFS you have access to the data even in case of a
>>>>>     node failure (crash, CPU/memory fault, etc.), and access is
>>>>>     available from ANY cluster node.
>>>>>     As such, we store the data and maintain checksums on every node
>>>>>     and can do periodic scrubbing of the data.
>>>>>
>>>>>
>>>>> Just to clarify -- this is Parallels' own distributed/cloud
>>>>> filesystem, not Ceph or GlusterFS, but similar to them. For more
>>>>> info, check the links at
>>>>> https://openvz.org/Parallels_Cloud_Storage#External_links
>>>>>
>>>>>
>>>>>
>>>> I set up a cluster using CentOS 6.5 64-bit, freshly installed in KVM
>>>> instances.  I wanted to test functionality, not actual speed.
>>>>
>>>> All software was the latest as of last night, and I followed the
>>>> quick how-to here: https://openvz.org/Parallels_Cloud_Storage
>>>>
>>>> Everything works great until I try to create an instance using the
>>>> command "vzctl create 101 --layout ploop --ostemplate
>>>> centos-6-x86_64 --private /pcs/containers/101" from the docs.
>>>>
>>>> About one MB of data is written to disk and then it just hangs.  The
>>>> following is the output from dmesg:
>>>>
>>>> [  360.414242] INFO: task vzctl:1646 blocked for more than 120 seconds.
>>>> [  360.414770] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> [  360.415406] vzctl         D ffff88007e444500     0  1646   1611    0 0x00000084
>>>> [  360.415418]  ffff88007ea59a68 0000000000000086 ffff8800ffffffff 000006b62934b8c0
>>>> [  360.415428]  0000000000000000 ffff88007e9f2ad0 0000000000005eaa ffffffffad17694d
>>>> [  360.415437]  000000000ad7ef74 ffffffff81a97b40 ffff88007e444ac8 000000000001eb80
>>>> [  360.415452] Call Trace:
>>>> [  360.415492]  [<ffffffff81517353>] io_schedule+0x73/0xc0
>>>> [  360.415516]  [<ffffffff811f39b3>] wait_on_sync_kiocb+0x53/0x80
>>>> [  360.415537]  [<ffffffffa04dbf47>] fuse_direct_IO+0x167/0x230 [fuse]
>>>> [  360.415558]  [<ffffffff8112e948>] mapping_direct_IO+0x48/0x70
>>>> [  360.415567]  [<ffffffff811301a6>] generic_file_direct_write_iter+0xf6/0x170
>>>> [  360.415576]  [<ffffffff81130c8e>] __generic_file_write_iter+0x32e/0x420
>>>> [  360.415585]  [<ffffffff81130e05>] __generic_file_aio_write+0x85/0xa0
>>>> [  360.415594]  [<ffffffff81130ea8>] generic_file_aio_write+0x88/0x100
>>>> [  360.415605]  [<ffffffffa04da085>] fuse_file_aio_write+0x185/0x430 [fuse]
>>>> [  360.415623]  [<ffffffff811a530a>] do_sync_write+0xfa/0x140
>>>> [  360.415641]  [<ffffffff8109d930>] ? autoremove_wake_function+0x0/0x40
>>>> [  360.415655]  [<ffffffff812902da>] ? strncpy_from_user+0x4a/0x90
>>>> [  360.415664]  [<ffffffff811a55e8>] vfs_write+0xb8/0x1a0
>>>> [  360.415671]  [<ffffffff811a5ee1>] sys_write+0x51/0x90
>>>> [  360.415681]  [<ffffffff8100b102>] system_call_fastpath+0x16/0x1b
>>>>
>>>> Even just trying to create a 10k file with dd causes a task to
>>>> hang:  "dd if=/dev/zero of=/pcs/test.junk bs=1k count=10"
>>>>
>>>>
>>>> Any ideas? Is there any more info you would like for debugging?
>>>
>>
>
