Make it possible to disable interval-triggered profile dumping, even if
profiling is enabled. This is useful if the user only wants a single
dump at exit, or if the application manually triggers profile dumps.
If the mean heap sampling interval is larger than one page, simulate
sampled small objects with large objects. This allows profiling context
pointers to be omitted for small objects. As a result, the memory
overhead for sampling decreases as the sampling interval is increased.
Fix a compilation error in the profiling code.
Remove medium size classes, because concurrent dirty page purging is
no longer capable of purging inactive dirty pages inside active runs
(due to recent arena/bin locking changes).
Enhance tcache to support caching large objects, so that the same range
of size classes is still cached, despite the removal of medium size
class support.
Use chains of cached objects, rather than using arrays of pointers.
Since tcache_bin_t is no longer dynamically sized, convert tcache_t's
tbin to an array of structures, rather than an array of pointers. This
implicitly removes tcache_bin_{create,destroy}(), which further
simplifies the fast path for malloc/free.
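The chain-based layout described above can be sketched as an intrusive singly linked list, where each cached object's first word stores the link to the next one. This is a minimal illustration, not the actual jemalloc code: cache_bin_t, cache_bin_push(), and cache_bin_pop() are hypothetical names, and the sketch assumes every cached object is at least pointer-sized and pointer-aligned.

```c
#include <stddef.h>

/* Hypothetical stand-in for a fixed-size cache bin; not jemalloc's tcache_bin_t. */
typedef struct cache_bin_s {
    void *avail;      /* Head of the chain of cached objects. */
    unsigned ncached; /* Number of objects currently on the chain. */
} cache_bin_t;

/* Push a freed object; the link is stored in the object's own first word. */
static inline void cache_bin_push(cache_bin_t *bin, void *obj) {
    *(void **)obj = bin->avail;
    bin->avail = obj;
    bin->ncached++;
}

/* Pop the most recently cached object, or return NULL if the bin is empty. */
static inline void *cache_bin_pop(cache_bin_t *bin) {
    void *obj = bin->avail;
    if (obj == NULL) {
        return NULL;
    }
    bin->avail = *(void **)obj;
    bin->ncached--;
    return obj;
}
```

Because the bin itself has a fixed size regardless of how many objects it holds, an array of such structures can be embedded directly in the per-thread cache.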
Use cacheline alignment for tcache_t allocations.
Remove runtime configuration option for number of tcache bin slots, and
replace it with a boolean option for enabling/disabling tcache.
Limit the number of tcache objects to the lesser of TCACHE_NSLOTS_MAX
and 2X the number of regions per run for the size class.
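As a rough sketch of the limit described above: the value given to TCACHE_NSLOTS_MAX here is an assumption chosen for illustration, and tcache_ncached_max() is a hypothetical helper name.

```c
/* Assumed value, for illustration only. */
#define TCACHE_NSLOTS_MAX 200U

/*
 * Return the maximum number of cached objects for a size class whose runs
 * each hold nregs regions: the lesser of TCACHE_NSLOTS_MAX and 2 * nregs.
 */
static inline unsigned tcache_ncached_max(unsigned nregs) {
    unsigned limit = 2U * nregs;
    return (limit < TCACHE_NSLOTS_MAX) ? limit : TCACHE_NSLOTS_MAX;
}
```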
For GC-triggered flush, discard 3/4 of the objects below the low water
mark, rather than 1/2.
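The arithmetic behind this change can be sketched as follows, assuming low_water is the smallest number of cached objects seen since the previous GC pass (so low_water objects went unused). gc_flush_remainder() is a hypothetical helper, not a function from the source tree.

```c
/*
 * Return how many objects remain cached after a GC-triggered flush.
 * Of the low_water objects that went unused, one quarter (rounded down)
 * is kept and the rest is discarded. Requires low_water <= ncached.
 */
static inline unsigned gc_flush_remainder(unsigned ncached, unsigned low_water) {
    return ncached - low_water + (low_water >> 2);
}
```

For example, with 20 cached objects and a low water mark of 8, 6 objects are discarded and 14 remain.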
Remove all functionality related to tracing. This functionality was
useful for understanding memory fragmentation during early algorithmic
design of jemalloc, but it had little utility for non-trivial
applications, due to the sheer volume of data written to disk.
If a custom small_size2bin table was required due to non-default size
class settings, memory allocation prior to initializing chunk parameters
would cause a crash due to division by 0. The fix re-orders the various
*_boot() function calls.
Bootstrapping is simpler now than it was before the base allocator
started just using the chunk allocator directly. This allows
arena_boot[01]() to be combined.
Add error detection for pthread_atfork() and atexit() function calls.
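A minimal sketch of this kind of check, assuming a POSIX system. Both pthread_atfork() and atexit() return nonzero on failure. The handler functions and register_hooks() are hypothetical placeholders, not the allocator's real hooks.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Placeholder handlers; a real allocator would acquire/release its locks here. */
static void prefork(void) {}
static void postfork_parent(void) {}
static void postfork_child(void) {}
static void exit_hook(void) {}

/* Register fork and exit handlers. Returns 0 on success, 1 on failure. */
static inline int register_hooks(void) {
    if (pthread_atfork(prefork, postfork_parent, postfork_child) != 0) {
        fputs("Error in pthread_atfork()\n", stderr);
        return 1;
    }
    if (atexit(exit_hook) != 0) {
        fputs("Error in atexit()\n", stderr);
        return 1;
    }
    return 0;
}
```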
Replace chunk stats code that was missing locking; this fixes a race
condition that could corrupt chunk statistics.
Convert malloc_stats_print() to use mallctl*().
Add a missing semicolon in the DSS code.
Convert malloc_tcache_flush() to a mallctl.
Convert malloc_swap_enable() to a set of mallctls.
Fix a stats bug in large object curruns accounting.
Replace tcache_bin_fill() with arena_tcache_fill(), and fix a bug in an OOM
error path.
Fix API name mangling to coexist with __attribute__((malloc)).
Enhance bin run deallocation to avoid marking all pages as dirty, since the
dirty bits are already correct for all but the first page, due to the use of
arena_run_rc_{incr,decr}(). This tends to dramatically reduce the number of
pages that are marked dirty.
Modify arena_bin_run_size_calc() to ensure that bin run headers never exceed
one page. In practice, this can't happen unless hard-coded constants (related
to RUN_MAX_OVRHD) are modified, but the dirty page tracking code assumes bin
run headers never extend past the first page, so it seems worth making this a
universally valid assumption.
Use JEMALLOC_ATTR(tls_model("initial-exec")) instead of -ftls-model=initial-exec
so that libjemalloc_pic.a can be directly linked into another library without
needing linker options changes.
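A sketch of the per-variable approach, assuming GCC or Clang. EXAMPLE_ATTR is a hypothetical stand-in for the JEMALLOC_ATTR macro, whose definition is not shown here, and example_counter is an illustrative variable.

```c
/* Hypothetical stand-in for JEMALLOC_ATTR; assumes GCC/Clang attribute syntax. */
#define EXAMPLE_ATTR(s) __attribute__((s))

/*
 * The TLS model is attached to the variable itself, so it does not depend
 * on a -ftls-model flag being passed when the object is compiled.
 */
static __thread int example_counter EXAMPLE_ATTR(tls_model("initial-exec"));

/* Increment and return the calling thread's counter. */
static inline int bump_counter(void) {
    return ++example_counter;
}
```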
Add attributes to malloc, calloc, and posix_memalign, for compatibility with
glibc's declarations.
Add function prototypes for the standard malloc(3) API.
Add the 'G'/'g' and 'H'/'h' MALLOC_OPTIONS flags.
Add the malloc_tcache_flush() function.
Disable thread-specific caching until the application goes multi-threaded.
Add the 'M' and 'm' MALLOC_OPTIONS flags, which control the maximum medium size
class.
Relax the cap on small/medium run size to arena_maxclass.
Reduce arena_run_reg_dalloc() integer division code complexity.
Increase the default chunk size from 1MiB to 4MiB.
implementation, calls free() after calling TSD destructors. This was causing a
crash during thread exit, since the magazine rack was no longer valid for the
thread. Fix this by using a special mag_rack value to indicate that
deallocation should bypass the magazine machinery.