When purging, large allocations are usually the ones that cross the npages_limit
threshold, simply because they are "large". This means we often leave the large
extent around for a while, which has the downsides of: 1) high RSS and 2) more
chance of them getting fragmented. Given that they are not likely to be reused
very soon (LRU), let's over purge by 1 extent (which is often large and not
reused frequently).
Added an upper bound on how many pages we can decay during the current run.
Without this, decay could have unbounded increase in stashed, since other
threads could add new pages into the extents.
This option controls the max size when grow_retained. This is useful when we
have customized extent hooks reserving physical memory (e.g. 1G huge pages).
Without this feature, the default increasing sequence could result in fragmented
and wasted physical memory.
This eliminates the need for the arena stats code to "know" about tcaches; all
that it needs is a cache_bin_array_descriptor_t to tell it where to find
cache_bins whose stats it should aggregate.
This is the first step towards breaking up the tcache and arena (since they
interact primarily at the bin level). It should also make a future arena
caching implementation more straightforward.
Passing is_background_thread down the decay path, so that background thread
itself won't attempt inactivity_check. This fixes an issue with background
thread doing trylock on a mutex it already owns.
Avoid holding arenas_lock and background_thread_lock when creating background
threads, because pthread_create may take internal locks, and potentially cause
deadlock with jemalloc internal locks.
Fix management of extent_grow_next to serialize operations that may grow
retained memory. This assures that the sizes of the newly allocated
extents correspond to the size classes in the intended growth sequence.
Fix management of extent_grow_next to skip size classes if a request is
too large to be satisfied by the next size in the growth sequence. This
avoids the potential for an arbitrary number of requests to bypass
triggering extent_grow_next increases.
This resolves#858.
When # of dirty pages move below npages_limit (e.g. they are reused), we should
not lower number of unpurged pages because that would cause the reused pages to
be double counted in the backlog (as a result, decay happen slower than it
should). Instead, set number of unpurged to the greater of current npages and
npages_limit.
Added an assertion: the ceiling # of pages should be greater than npages_limit.
To avoid background threads sleeping forever with idle arenas, we eagerly check
background threads' sleep time after extents_dalloc, and signal the thread if
necessary.
Added opt.background_thread to enable background threads, which handles purging
currently. When enabled, decay ticks will not trigger purging (which will be
left to the background threads). We limit the max number of threads to NCPUs.
When percpu arena is enabled, set CPU affinity for the background threads as
well.
The sleep interval of background threads is dynamic and determined by computing
number of pages to purge in the future (based on backlog).
This lets us specify whether and how mutexes of the same rank are allowed to be
acquired. Currently, we only allow two polices (only a single mutex at a given
rank at a time, and mutexes acquired in ascending order), but we can plausibly
allow more (e.g. the "release uncontended mutexes before blocking").
Support millisecond resolution for decay times. Among other use cases
this makes it possible to specify a short initial dirty-->muzzy decay
phase, followed by a longer muzzy-->clean decay phase.
This resolves#812.
Instead, always define function pointers for interceptable functions,
but mark them const unless testing, so that the compiler can optimize
out the pointer dereferences.
Add the extent_destroy_t extent destruction hook to extent_hooks_t, and
use it during arena destruction. This hook explicitly communicates to
the callee that the extent must be destroyed or tracked for later reuse,
lest it be permanently leaked. Prior to this change, retained extents
could unintentionally be leaked if extent retention was enabled.
This resolves#560.
Control use of munmap(2) via a run-time option rather than a
compile-time option (with the same per platform default). The old
behavior of --disable-munmap can be achieved with
--with-malloc-conf=munmap:false.
This partially resolves#580.
Simplify configuration by removing the --disable-tcache option, but
replace the testing for that configuration with
--with-malloc-conf=tcache:false.
Fix the thread.arena and thread.tcache.flush mallctls to work correctly
if tcache is disabled.
This partially resolves#580.
Tracking extents is required by arena_reset. To support this, the extent
linkage was used for tracking 1) large allocations, and 2) full slabs. However
modifying the extent linkage could be an expensive operation as it likely incurs
cache misses. Since we forbid arena_reset on auto arenas, let's bypass the
linkage operations for auto arenas.
Rather than using a LIFO queue to track available extent_t structures,
use a red-black tree, and always choose the oldest/lowest available
during reuse.
Previously we had a general detection and support of reentrancy, at the cost of
having branches and inc / dec operations on fast paths. To avoid taxing fast
paths, we move the reentrancy operations onto tsd slow state, and only modify
reentrancy level around external calls (that might trigger reentrancy).
With this change, when profiling is enabled, we avoid doing redundant rtree
lookups. Also changed dalloc_atx_t to alloc_atx_t, as it's now used on
allocation path as well (to speed up profiling).
This is a biggy. jemalloc_internal.h has been doing multiple jobs for a while
now:
- The source of system-wide definitions.
- The catch-all include file.
- The module header file for jemalloc.c
This commit splits up this functionality. The system-wide definitions
responsibility has moved to jemalloc_preamble.h. The catch-all include file is
now jemalloc_internal_includes.h. The module headers for jemalloc.c are now in
jemalloc_internal_[externs|inlines|types].h, just as they are for the other
modules.