Dynamic Memory Allocation API
Dynamic memory allocation (especially deallocation) using the C runtime heap, through malloc, free, strdup, calloc, realloc, etc. can result in performance degradation and out-of-memory conditions.
This is due to fragmentation of the heap. The degraded performance and exhausted memory result from indirect interactions within the heap between unrelated application code. These issues are non-deterministic, and can be very difficult to rectify.
Memory Pools offer a powerful solution. They trade-off a deterministic amount of memory for
- deterministic behaviour,
- O(1) allocation and release performance, and
- built-in memory allocation tracking.
And it brings the power of destructors to C!
Overview
The most basic usage involves:
- Creating a pool (usually done once at process start-up)
- Allocating objects (memory blocks) from a pool
- Releasing objects back to their pool.
Pools generally can't be deleted. You create them when your process starts-up, and use them until your process terminates. It's up to the OS to clean-up the memory pools, along with everything else your process is using, when your process terminates. (Although, if you find yourself really needing to delete pools, Sub-Pools could offer you a solution.)
Pools also support the following advanced features:
- reference counting
- destructors
- statistics
- multi-threading
- sub-pools (pools that can be deleted).
The following sections describe these, beginning with the most basic usage and working up to more advanced topics.
Creating a Pool
Before allocating memory from a pool, the pool must be created using le_mem_CreatePool(), passing it the name of the pool and the size of the objects to be allocated from that pool. This returns a reference to the new pool, which has zero free objects in it.
To populate your new pool with free objects, you call le_mem_ExpandPool()
. This is separated into two functions (rather than having one function with three parameters) to make it virtually impossible to accidentally get the parameters in the wrong order (which would result in nasty bugs that couldn't be caught by the compiler). The ability to expand pools comes in handy (see Managing Pool Sizes).
This code sample defines a class "Point" and a pool "PointPool" used to allocate memory for objects of that class:
#define MAX_POINTS 12 // Maximum number of points that can be handled.typedef struct{int x; // pixel position along x-axisint y; // pixel position along y-axis}Point_t;le_mem_PoolRef_t PointPool;int xx_pt_ProcessStart(void){le_mem_ExpandPool(PointPool, MAX_POINTS);return SUCCESS;}
To make things easier for power-users, le_mem_ExpandPool()
returns the same pool reference that it was given. This allows the xx_pt_ProcessStart() function to be re-implemented as follows:
int xx_pt_ProcessStart(void){return SUCCESS;}
Although this requires a dozen or so fewer keystrokes of typing and occupies one less line of code, it's arguably less readable than the previous example.
For a discussion on how to pick the number of objects to have in your pools, see Managing Pool Sizes.
Allocating From a Pool
Allocating from a pool has multiple options:
le_mem_TryAlloc()
- Quietly return NULL if there are no free blocks in the pool.le_mem_AssertAlloc()
- Log an error and take down the process if there are no free blocks in the pool.le_mem_ForceAlloc()
- If there are no free blocks in the pool, log a warning and automatically expand the pool (or log an error and terminate the calling process there's not enough free memory to expand the pool).
All of these functions take a pool reference and return a pointer to the object allocated from the pool.
The first option, using le_mem_TryAlloc()
, is the closest to the way good old malloc() works. It requires the caller check the return code to see if it's NULL. This can be annoying enough that a lot of people get lazy and don't check the return code (Bad programmer! Bad!). It turns out that this option isn't really what people usually want (but occasionally they do)
The second option, using le_mem_AssertAlloc()
, is only used when the allocation should never fail, by design; a failure to allocate a block is a fatal error. This isn't often used, but can save a lot of boilerplate error checking code.
The third option, using
le_mem_ForceAlloc(), is the one that gets used most often. It allows developers to avoid writing error checking code, because the allocation will essentially never fail because it's handled inside the memory allocator. It also allows developers to defer fine tuning their pool sizes until after they get things working. Later, they check the logs for pool size usage, and then modify their pool sizes accordingly. If a particular pool is continually growing, it's a good indication there's a memory leak. This permits seeing exactly what objects are being leaked. If certain debug options are turned on, they can even find out which line in which file allocated the blocks being leaked.
Releasing Back Into a Pool
Releasing memory back to a pool never fails, so there's no need to check a return code. Also, each object knows which pool it came from, so the code that releases the object doesn't have to care. All it has to do is call le_mem_Release()
and pass a pointer to the object to be released.
The critical thing to remember is that once an object has been released, it must never be accessed again . Here is a very bad code example:
Point_t* pointPtr = le_mem_ForceAlloc(PointPool);pointPtr->x = 5;pointPtr->y = 10;le_mem_Release(pointPtr);printf("Point is at position (%d, %d).\n", pointPtr->x, pointPtr->y);
Reference Counting
Reference counting is a powerful feature of our memory pools. Here's how it works:
- Every object allocated from a pool starts with a reference count of 1.
- Whenever someone calls le_mem_AddRef() on an object, its reference count is incremented by 1.
- When it's released, its reference count is decremented by 1.
- When its reference count reaches zero, it's destroyed (i.e., its memory is released back into the pool.)
This allows one function to:
- create an object.
- work with it.
- increment its reference count and pass a pointer to the object to another function (or thread, data structure, etc.).
- work with it some more.
- release the object without having to worry about when the other function is finished with it.
The other function also releases the object when it's done with it. So, the object will exist until both functions are done.
If there are multiple threads involved, be careful to protect the shared object from race conditions(see the Multi-Threading).
Another great advantage of reference counting is it enables Destructors.
- Note
- le_mem_GetRefCount() can be used to check the current reference count on an object.
Destructors
Destructors are a powerful feature of C++. Anyone who has any non-trivial experience with C++ has used them. Because C was created before object-oriented programming was around, there's no native language support for destructors in C. Object-oriented design is still possible and highly desireable even when the programming is done in C.
In Legato, it's possible to call le_mem_SetDestructor()
to attach a function to a memory pool to be used as a destructor for objects allocated from that pool. If a pool has a destructor, whenever the reference count reaches zero for an object allocated from that pool, the pool's destructor function will pass a pointer to that object. After the destructor returns, the object will be fully destroyed, and its memory will be released back into the pool for later reuse by another object.
Here's a destructor code sample:
static void PointDestructor(void* objPtr){Point_t* pointPtr = objPtr;printf("Destroying point (%d, %d)\n", pointPtr->x, pointPtr->y);@todo Add more to sample.}int xx_pt_ProcessStart(void){PointPool = le_mem_CreatePool(sizeof(Point_t));le_mem_ExpandPool(PointPool, MAX_POINTS);le_mem_SetDestructor(PointPool, PointDestructor);return SUCCESS;}static void DeletePointList(Point_t** pointList, size_t numPoints){size_t i;for (i = 0; i < numPoints; i++){le_mem_Release(pointList[i]);}}
In this sample, when DeletePointList() is called (with a pointer to an array of pointers to Point_t objects with reference counts of 1), each of the objects in the pointList is released. This causes their reference counts to hit 0, which triggers executing PointDestructor() for each object in the pointList, and the "Destroying point..." message will be printed for each.
Statistics
Some statistics are gathered for each memory pool:
- Number of allocations.
- Number of currently free objects.
- Number of overflows (times that le_mem_ForceAlloc() had to expand the pool).
Statistics (and other pool properties) can be checked using functions:
Statistics are fetched together atomically using a single function call. This prevents inconsistencies between them if in a multi-threaded program.
If you don't have a reference to a specified pool, but you have the name of the pool, you can get a reference to the pool using le_mem_FindPool()
.
In addition to programmatically fetching these, they're also available through the "poolstat" console command (unless your process's main thread is blocked).
To reset the pool statistics, use le_mem_ResetStats()
.
Diagnostics
The memory system also supports two different forms of diagnostics. Both are enabled by setting the appropriate KConfig options when building the framework.
The first of these options is MEM_TRACE. When you enable MEM_TRACE every pool is given a tracepoint with the name of the pool on creation.
For instance, the configTree node pool is called, "configTree.nodePool". So to enable a trace of all config tree node creation and deletion one would use the log tool as follows:
$ log trace configTree.nodePool
The second diagnostic build flag is MEM_POOLS. When MEM_POOLS is disabled, the pools are disabled and instead malloc and free are directly used. Thus enabling the use of tools like Valgrind.
Multi-Threading
All functions in this API are thread-safe, but not async-safe . The objects allocated from pools are not inherently protected from races between threads.
Allocating and releasing objects, checking stats, incrementing reference counts, etc. can all be done from multiple threads (excluding signal handlers) without having to worry about corrupting the memory pools' hidden internal data structures.
There's no magical way to prevent different threads from interferring with each other if they both access the contents of the same object at the same time.
The best way to prevent multi-threaded race conditions is simply don't share data between threads. If multiple threads must access the same data structure, then mutexes, semaphores, or similar methods should be used to synchronize threads and avoid data structure corruption or thread misbehaviour.
Although memory pools are thread-safe, they are not async-safe. This means that memory pools can be corrupted if they are accessed by a signal handler while they are being accessed by a normal thread. To be safe, don't call any memory pool functions from within a signal handler.
One problem using destructor functions in a multi-threaded environment is that the destructor function modifies a data structure shared between threads, so it's easy to forget to synchronize calls to le_mem_Release()
with other code accessing the data structure. If a mutex is used to coordinate access to the data structure, then the mutex must be held by the thread that calls le_mem_Release() to ensure there's no other thread accessing the data structure when the destructor runs.
Managing Pool Sizes
We know it's possible to have pools automatically expand when they are exhausted, but we don't really want that to happen normally. Ideally, the pools should be fully allocated to their maximum sizes at start-up so there aren't any surprises later when certain feature combinations cause the system to run out of memory in the field. If we allocate everything we think is needed up-front, then we are much more likely to uncover any memory shortages during testing, before it's in the field.
Choosing the right size for your pools correctly at start-up is easy to do if there is a maximum number of fixed, external things that are being represented by the objects being allocated from the pool. If the pool holds "call objects" representing phone calls over a T1 carrier that will never carry more than 24 calls at a time, then it's obvious that you need to size your call object pool at 24.
Other times, it's not so easy to choose the pool size like code to be reused in different products or different configurations that have different needs. In those cases, you still have a few options:
- At start-up, query the operating environment and base the pool sizes.
- Read a configuration setting from a file or other configuration data source.
- Use a build-time configuration setting.
The build-time configuration setting is the easiest, and generally requires less interaction between components at start-up simplifying APIs and reducing boot times.
If the pool size must be determined at start-up, use le_mem_ExpandPool()
. Perhaps there's a service-provider module designed to allocate objects on behalf of client. It can have multiple clients at the same time, but it doesn't know how many clients or what their resource needs will be until the clients register with it at start-up. We'd want those clients to be as decoupled from each other as possible (i.e., we want the clients know as little as possible about each other); we don't want the clients to get together and add up all their needs before telling the service-provider. We'd rather have the clients independently report their own needs to the service-provider. Also, we don't want each client to have to wait for all the other clients to report their needs before starting to use the services offered by the service-provider. That would add more complexity to the interactions between the clients and the service-provider.
This is what should happen when the service-provider can't wait for all clients to report their needs before creating the pool:
- When the service-provider starts up, it creates an empty pool.
- Whenever a client registers itself with the service-provider, the client can tell the service-provider what its specific needs are, and the service-provider can expand its object pool accordingly.
- Since registrations happen at start-up, pool expansion occurs at start-up, and testing will likely find any pool sizing before going into the field.
Where clients dynamically start and stop during runtime in response to external events (e.g., when someone is using the device's Web UI), we still have a problem because we can't shrink pools or delete pools when clients go away. This is where Sub-Pools is useful.
Sub-Pools
Essentially, a Sub-Pool is a memory pool that gets its blocks from another pool (the super-pool). Sub Pools can be deleted, causing its blocks to be released back into the super-pool.
This is useful when a service-provider module needs to handle clients that dynamically pop into existence and later disappear again. When a client attaches to the service and says it will probably need a maximum of X of the service-provider's resources, the service provider can set aside that many of those resources in a sub-pool for that client. If that client goes over its limit, the sub-pool will log a warning message.
The problem of sizing the super-pool correctly at start-up still exists, so what's the point of having a sub-pool, when all of the resources could just be allocated from the super-pool?
The benefit is really gained in troubleshooting. If client A, B, C, D and E are all behaving nicely, but client F is leaking resources, the sub-pool created on behalf of client F will start warning about the memory leak; time won't have to be wasted looking at clients A through E to rule them out.
A visual side effect of using sub-pools is that their blocks will count as being in use by the super-pool but they will not count as allocations of the super-pool.
To create a sub-pool, call le_mem_CreateSubPool()
. It takes a reference to the super-pool and the number of objects to move to the sub-pool, and it returns a reference to the new sub-pool.
To delete a sub-pool, call le_mem_DeleteSubPool()
. Do not try to use it to delete a pool that was created using le_mem_CreatePool(). It's only for sub-pools created using le_mem_CreateSubPool(). Also, it's not okay to delete a sub-pool while there are still blocks allocated from it, or if it has any sub-pools. You'll see errors in your logs if you do that.
Sub-Pools automatically inherit their parent's destructor function.
Reduced-size pools
One problem that occurs with memory pools is where objects of different sizes need to be stored. A classic example is strings -- the longest string an application needs to be able to handle may be much longer than the typical string size. In this case a lot of memory will be wasted with standard memory pools, since all objects allocated from the pool will be the size of the longest possible object.
The solution is to use reduced-size pools. These are a kind of sub-pool where the size of the object in the sub-pool is different from the size of the object in the super-pool. This way multiple blocks from the sub-pool can be stored in a single block of the super-pool.
Reduced-size pools have some limitations:
- Each block in the super-pool is divided up to create blocks in the subpool. So subpool blocks sizes must be less than half the size of the super-pool block size. An attempt to create a pool with a larger block size will just return a reference to the super-pool.
- Due overhead, each block actually requires 8-80 bytes more space than requested. There is little point in subdividing pools < 4x overhead, or ~300 bytes in the default configuration. Note the exact amount of overhead depends on the size of the guard bands and the size of pointer on your target.
- Blocks used by the reduced pool are permanently moved from the super-pool to the reduced pool. This must be taken into account when sizing the super-pool.
Example 1:
A network service needs to allocate space for packets received over a network. It should Typical packet length is up to 200 bytes, but occasional packets may be up to 1500 bytes. The service needs to be able to queue at least 32 packets, up to 5 of which can be 1500 bytes. Two pools are used: A pool of 12 objects 1500 bytes in size, and a reduced-size pool of 32 objects 200 bytes in size (280 bytes with overhead). Seven objects from the superpool are used to store reduced-size objects. This represents a memory savings of 60% compared with a pool of 32 1500 byte objects.
Example 2:
An application builds file paths which can be from 16-70 bytes. Only three such paths can be created at a time, but they can be any size. In this case it is better to just create a single pool of three 70-byte objects. Most of the potential space gains would be consumed by overhead. Even if this was not the case, the super pool still needs three free objects in addition to the objects required by the subpool, so there is no space savings.
To create a reduced-size pool, use le_mem_CreateReducedPool()
. It takes a reference to the super-pool, the initial number of objects in the sub-pool, and size of an object in the sub-pool compared with the parent pool, and it returns a reference to the new sub-pool.
Reduced-size pools are deleted using le_mem_DeleteSubPool()
like other sub-pools.
To help the programmer pick the right pool to allocate from, reduced-size pools provide le_mem_TryVarAlloc()
, le_mem_AssertVarAlloc()
and le_mem_ForceVarAlloc()
functions. In addition to the pool these take the object size to allocate. If this size is larger than the pool's object size and the pool is a reduced-size pool, and there is a parent pool large enough for this object, it will allocate the object from the parent pool instead. If no parent pool is large enough, the program will exit.
You can call le_mem_GetBlockSize()
to get the actual size of the object returned by one of these functions.
As with other sub-pools, they cannot be deleted while any blocks are allocated from the pool, or it has any sub-pools. Reduced-size pools also automatically inherit their parent's destructor function.
Copyright (C) Sierra Wireless Inc.