realtime:documentation:howto:applications:memory_wip [2017/05/03 02:07] jithu [Dynamic memory allocation and prefaulting] |
— (current) | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Memory for RT applications ====== | ||
- | **(Currently WIP (Work in Progress), Ignore the contents until this line is removed )** | ||
- | |||
- | This page documents what needs to be considered about memory in a real-time application. It covers the following topics: | ||
- | - Memory Locking | ||
- | - Stack Memory for RT threads | ||
- | - Dynamic memory allocation and prefaulting | ||
- | Keep in mind that the [[realtime:documentation:howto:applications:application_base|usual sequence]] is for an application to begin execution as a regular (non-RT) process and then create its RT threads with the appropriate resources and scheduling parameters. | ||
- | |||
- | ===== Memory Locking ===== | ||
- | Through the ''mlockall()'' system call, an application can instruct the kernel to lock the calling process's entire virtual address space, both current and future mappings, into RAM, thereby preventing it from being paged out to swap during the process's lifetime. See the snippet below: | ||
- | <code c> | ||
- | /* Lock all current and future pages to prevent them from being paged out to swap */ | ||
- | if (mlockall( MCL_CURRENT | MCL_FUTURE )) | ||
- | { | ||
- | perror("mlockall failed"); | ||
- | } | ||
- | </code> | ||
- | |||
- | Real-time applications should do this early (i.e. before spawning the real-time threads), because accessing a paged-out address takes far longer than accessing one that is resident in RAM. Without locking, the determinism of the application depends on the memory pressure of the overall system. | ||
- | |||
- | Note that this call applies to all memory areas in the process address space, i.e. globals, stack, heap, code, etc. | ||
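A common reason for ''mlockall()'' to fail is the locked-memory resource limit (''RLIMIT_MEMLOCK''), which applies to processes that lack the ''CAP_IPC_LOCK'' capability. The following is a minimal sketch, not taken from the original page, that reports this limit when locking fails. The helper names ''get_memlock_limit()'' and ''lock_all_memory()'' are hypothetical.

```c
#include <stdio.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Hypothetical helper: fetch the RLIMIT_MEMLOCK soft limit.
 * Returns 0 on success, -1 if getrlimit() fails. */
int get_memlock_limit(rlim_t *soft)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    *soft = rl.rlim_cur;
    return 0;
}

/* Hypothetical helper: lock current and future pages.
 * On failure, print the error and the soft limit that may be the cause.
 * Returns 0 on success, -1 on failure. */
int lock_all_memory(void)
{
    rlim_t soft;

    if (mlockall(MCL_CURRENT | MCL_FUTURE) == 0)
        return 0;

    perror("mlockall failed");
    if (get_memlock_limit(&soft) == 0 && soft != RLIM_INFINITY)
        fprintf(stderr, "RLIMIT_MEMLOCK soft limit: %llu bytes\n",
                (unsigned long long)soft);
    return -1;
}
```

If the reported limit is smaller than the memory the application needs to lock, the limit has to be raised (or the process given the capability) before the RT threads are created.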
- | |||
- | ===== Stack Memory for RT threads ===== | ||
- | |||
- | All threads (RT and non-RT) within a process have their own private stack. The aforementioned ''mlockall()'' is sufficient to pin the entire thread stack in RAM. Note that a thread's stack size can be changed from the default (typically 8 MB), as shown in the snippet. If the process spawns a large number of RT threads, it is advisable to reduce the stack sizes, since these too cannot be paged out. | ||
- | <code c> | ||
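- | #include <limits.h> /* PTHREAD_STACK_MIN */ | ||
- | #include <pthread.h> | ||
- | #include <stdlib.h> | ||
- | |||
- | #define MY_STACK_SIZE (100 * 1024) /* example value: extra stack beyond the minimum */ | ||
- | |||
- | static void *rt_func(void *arg); /* RT thread body, defined elsewhere */ | ||
- | |||
- | static void create_rt_thread(void) | ||
- | { | ||
- | pthread_t thread; | ||
- | pthread_attr_t attr; | ||
- | |||
- | /* init to default values */ | ||
- | if (pthread_attr_init(&attr)) | ||
- | exit(1); | ||
- | /* Set a smaller stack */ | ||
- | if (pthread_attr_setstacksize(&attr, PTHREAD_STACK_MIN + MY_STACK_SIZE)) | ||
- | exit(2); | ||
- | /* And finally start the actual thread */ | ||
- | if (pthread_create(&thread, &attr, rt_func, NULL)) | ||
- | exit(3); | ||
- | /* The attribute object is no longer needed once the thread exists */ | ||
- | pthread_attr_destroy(&attr); | ||
- | } | ||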
- | </code> | ||
- | Details: The entire stack of every thread inside the application is forced into RAM when mlockall(MCL_CURRENT) is called. Threads started after a call to mlockall(MCL_CURRENT | MCL_FUTURE) generate their page faults immediately, since the new stack is forced into RAM at creation time (due to the MCL_FUTURE flag). | ||
- | |||
- | With mlockall(), the page faults for a thread's stack area happen during thread creation //(no explicit additional prefaulting is necessary to avoid page faults on first access)//. All threads therefore need to be created at startup, before RT show time. | ||
- | |||
- | ===== Dynamic memory allocation and prefaulting ===== | ||
- | Dynamic memory allocations result in page faults, which impact the determinism and latency of the real-time thread. It is therefore necessary to pre-allocate the maximum amount of dynamic memory required at startup, before RT show time (//and free() it, as explained later//), so that page faults are not incurred on the real-time critical path. | ||
- | |||
- | Q. Is it always possible to know the required size of dynamic memory beforehand? | ||
- | |||
- | A. While allocating and freeing (malloc()/free()) the maximum required dynamic memory at startup provides a straightforward way of adapting existing source code to real-time, it is debatable whether this size can be assessed accurately beforehand. Keep in mind, however, that a real-time application cannot keep requesting memory without bound; it has to work within a finite, known budget, otherwise there would be no difference between RT and non-RT code. | ||
- | |||
- | How do we go about this: | ||
- | <code c> | ||
- | #include <stdlib.h> | ||
- | #include <stdio.h> | ||
- | #include <sys/mman.h> // Needed for mlockall() | ||
- | #include <unistd.h> // needed for sysconf(int name); | ||
- | #include <malloc.h> | ||
- | #include <sys/time.h> // needed for getrusage | ||
- | #include <sys/resource.h> // needed for getrusage | ||
- | |||
- | |||
- | #define SOMESIZE (100*1024*1024) // 100MB | ||
- | |||
- | |||
- | int main(int argc, char* argv[]) | ||
- | { | ||
- | // Allocate some memory | ||
- | int i, page_size; | ||
- | char* buffer; | ||
- | struct rusage usage; | ||
- | |||
- | |||
- | // Now lock all current and future pages to prevent them from being paged out | ||
- | if (mlockall(MCL_CURRENT | MCL_FUTURE )) | ||
- | { | ||
- | perror("mlockall failed"); | ||
- | } | ||
- | |||
- | |||
- | // Turn off malloc trimming. | ||
- | mallopt (M_TRIM_THRESHOLD, -1); | ||
- | |||
- | |||
- | // Turn off mmap usage. | ||
- | mallopt (M_MMAP_MAX, 0); | ||
- | |||
- | |||
- | page_size = sysconf(_SC_PAGESIZE); | ||
- | buffer = malloc(SOMESIZE); | ||
- | if (buffer == NULL) | ||
- | { | ||
- | perror("malloc failed"); | ||
- | return 1; | ||
- | } | ||
- | |||
- | |||
- | getrusage(RUSAGE_SELF, &usage); | ||
- | printf("Major-pagefaults:%ld, Minor Pagefaults:%ld\n", usage.ru_majflt, usage.ru_minflt); | ||
- | |||
- | |||
- | // Touch page to prove there will be no page fault later | ||
- | for (i=0; i < SOMESIZE; i+=page_size) | ||
- | { | ||
- | // Each write to this buffer will *not* generate a pagefault. | ||
- | // Even if nothing has been written to the newly allocated memory, the physical page | ||
- | // is still provisioned to the process because mlockall() has been called with | ||
- | // the MCL_FUTURE flag | ||
- | buffer[i] = 0; | ||
- | // print the number of major and minor pagefaults this application has triggered | ||
- | getrusage(RUSAGE_SELF, &usage); | ||
- | printf("Major-pagefaults:%ld, Minor Pagefaults:%ld\n", usage.ru_majflt, usage.ru_minflt); | ||
- | } | ||
- | free(buffer); | ||
- | // buffer is now released. As glibc is configured such that it never gives memory back to | ||
- | // the kernel, the memory allocated above stays locked for this process. Subsequent malloc() | ||
- | // and new calls are served from the pool reserved and locked above, as long as they fit in it. | ||
- | // Calling free() or delete does NOT undo this locking. So, with this mechanism we can build | ||
- | // C and C++ applications whose heap use avoids major/minor pagefaults, even with swapping enabled. | ||
- | |||
- | |||
- | //<do your RT-thing> | ||
- | |||
- | |||
- | return 0; | ||
- | } | ||
- | </code> | ||
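The pre-allocation steps in the program above can also be packaged as a reusable routine. The following is a minimal sketch assuming glibc (for ''mallopt()''). The helper names ''minor_faults()'' and ''prefault_heap()'' are hypothetical, and the sketch expects the caller to have already called ''mlockall(MCL_CURRENT | MCL_FUTURE)''.

```c
#include <malloc.h>        /* mallopt(), glibc-specific */
#include <stdlib.h>
#include <sys/resource.h>  /* getrusage() */
#include <unistd.h>        /* sysconf() */

/* Hypothetical helper: minor page faults incurred so far, or -1 on error. */
long minor_faults(void)
{
    struct rusage usage;

    if (getrusage(RUSAGE_SELF, &usage) != 0)
        return -1;
    return usage.ru_minflt;
}

/* Hypothetical helper: grow the heap by `bytes`, touch every page, then
 * free the block so later allocations can reuse the already-mapped pages.
 * Returns 0 on success, -1 on failure. */
int prefault_heap(size_t bytes)
{
    long page_size = sysconf(_SC_PAGESIZE);
    volatile char *p;   /* volatile so the compiler keeps the writes */
    char *buffer;
    size_t i;

    if (page_size <= 0)
        return -1;

    /* Never trim the heap and never use mmap() for allocations.
     * mallopt() returns 1 on success and 0 on error. */
    if (!mallopt(M_TRIM_THRESHOLD, -1) || !mallopt(M_MMAP_MAX, 0))
        return -1;

    buffer = malloc(bytes);
    if (buffer == NULL)
        return -1;

    p = buffer;
    for (i = 0; i < bytes; i += (size_t)page_size)
        p[i] = 0;   /* touch one byte per page */

    free(buffer);
    return 0;
}
```

A possible usage is to call ''prefault_heap()'' once at startup with the application's worst-case heap requirement, then compare ''minor_faults()'' before and after a trial run of the RT code path to check that the count no longer grows.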