[Devel] Re: [PATCH 5/8] checkpoint/restart of anonymous hugetlb mappings
Nathan Lynch
ntl at pobox.com
Fri Sep 17 13:23:13 PDT 2010
On Thu, 2010-09-16 at 20:44 -0400, Oren Laadan wrote:
>
> On 09/14/2010 04:02 PM, Nathan Lynch wrote:
> > Support checkpoint and restore of both private and shared
> > hugepage-backed mappings established via mmap(MAP_HUGETLB). Introduce
> > APIs for checkpoint and restart of individual huge pages which are to
> > be used by the sysv SHM_HUGETLB c/r code.
> >
> > Signed-off-by: Nathan Lynch <ntl at pobox.com>
>
> The code looks clean, but I need to learn more about HUGETLB
> before I can say much...
>
> Do you also have test-suite for this ?
Included below is a throwaway patch to user-cr's shmem and ipcshm tests
which will cause them to use huge pages. You'll need to configure huge
pages on your system; see Documentation/vm/hugetlbpage.txt in the kernel
source.
>
> [...]
>
> > +static int hugetlb_dump_contents(struct ckpt_ctx *ctx, struct vm_area_struct *vma)
> > +{
> > + struct ckpt_hdr_hpage hdr;
> > + unsigned long pageshift;
> > + unsigned long pagesize;
> > + unsigned long addr;
> > + int ret;
> > +
> > + pageshift = huge_page_shift(hstate_vma(vma));
> > + pagesize = vma_kernel_pagesize(vma);
> > +
> > + ckpt_hdr_hpage_init(&hdr, pageshift);
> > +
> > + for (addr = vma->vm_start; addr < vma->vm_end; addr += pagesize) {
> > + struct page *page = NULL;
> > +
> > + down_read(&vma->vm_mm->mmap_sem);
> > + ret = __get_user_pages(ctx->tsk, vma->vm_mm,
> > + addr, 1, FOLL_DUMP | FOLL_GET,
> > + &page, NULL);
> > + /* FOLL_DUMP gives -EFAULT for holes */
> > + if (ret == -EFAULT)
> > + ret = 0;
>
> With regular pages, this didn't always work, especially after they
> slightly changed the semantics of FOLL_DUMP. So I introduced the
> FOLL_DIRTY flag to detect dirty (non-zero) pages. I wonder if
> something like that may be needed here too ?
I don't think so - huge pages are never used to map regular files (they
are always on hugetlbfs), so they can't get out of sync with a backing
store.
test/ipcshm.c | 7 ++++---
test/shmem.c | 8 ++++++--
2 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/test/ipcshm.c b/test/ipcshm.c
index cf932b4..f4b5e8a 100644
--- a/test/ipcshm.c
+++ b/test/ipcshm.c
@@ -7,6 +7,7 @@
#define OUTFILE "/tmp/cr-test.out"
#define SEG_SIZE (20 * 4096)
+#define HTLB_SEG_SIZE (1024 * 1024 * 16)
#define SEG_KEY1 11
int main(int argc, char *argv[])
@@ -37,7 +38,7 @@ int main(int argc, char *argv[])
exit(1);
}
- id2 = shmget(IPC_PRIVATE, SEG_SIZE, 0700|IPC_CREAT|IPC_EXCL);
+ id2 = shmget(IPC_PRIVATE, HTLB_SEG_SIZE, 0700|IPC_CREAT|IPC_EXCL|SHM_HUGETLB);
if (id2 < 0) {
perror("shmget2");
exit(1);
@@ -63,9 +64,9 @@ int main(int argc, char *argv[])
if (shmdt(seg1) < 0)
perror("shmdt1");
- fprintf(file, "detaches 2nd, sleeping 30\n");
+ fprintf(file, "detaches 2nd, sleeping 120\n");
fflush(file);
- sleep(20);
+ sleep(120);
fprintf(file, "waking up\n");
fflush(file);
diff --git a/test/shmem.c b/test/shmem.c
index 6d7dd8a..cb9fd10 100644
--- a/test/shmem.c
+++ b/test/shmem.c
@@ -5,6 +5,10 @@
#include <math.h>
#include <sys/mman.h>
+#ifndef MAP_HUGETLB
+#define MAP_HUGETLB 0x40000
+#endif
+
#define OUTFILE "/tmp/cr-test.out"
int main(int argc, char *argv[])
@@ -41,7 +45,7 @@ int main(int argc, char *argv[])
}
addr = mmap(NULL, 16384, PROT_READ | PROT_WRITE,
- MAP_ANONYMOUS | MAP_SHARED, 0, 0);
+ MAP_ANONYMOUS | MAP_SHARED | MAP_HUGETLB, 0, 0);
if (addr == MAP_FAILED) {
perror("mmap");
exit(1);
@@ -66,7 +70,7 @@ int main(int argc, char *argv[])
close(pipefd[1]);
}
- for (i = 0; i < 10; i++) {
+ for (i = 0; i < 120; i++) {
sleep(1);
/* make the fpu work -> a = a + i/10 */
a = sqrt(a*a + 2*a*(i/10.0) + i*i/100.0);