Fix management of extent_grow_next to serialize operations that may grow
retained memory. This assures that the sizes of the newly allocated
extents correspond to the size classes in the intended growth sequence.
Fix management of extent_grow_next to skip size classes if a request is
too large to be satisfied by the next size in the growth sequence. This
avoids the potential for an arbitrary number of requests to bypass
triggering extent_grow_next increases.
This resolves#858.
Rather than relying on parallel make to build individual configurations
one at a time, use xargs to build multiple configurations in parallel.
This allows the configure scripts to run in parallel. On a 14-core
system (28 hyperthreads), this increases average CPU utilization from
~20% to ~90%.
An sbrk() caller outside jemalloc can decrease the dss, so add a
separate atomic boolean to explicitly track whether jemalloc is
concurrently calling sbrk(), rather than depending on state outside
jemalloc's full control.
This resolves#802.
Generalize the run_tests.sh and .travis.yml test generation to handle
combinations of arguments to the --with-malloc-conf configure option,
and merge "dss:primary" into the existing "tcache:false" testing.
Drop the base mutex while allocating new base blocks, because extent
allocation can enter code that prohibits holding non-core mutexes, e.g.
the extent_[d]alloc() and extent_purge_forced_wrapper() calls in
extent_alloc_dss().
This partially resolves#802.
When # of dirty pages move below npages_limit (e.g. they are reused), we should
not lower number of unpurged pages because that would cause the reused pages to
be double counted in the backlog (as a result, decay happen slower than it
should). Instead, set number of unpurged to the greater of current npages and
npages_limit.
Added an assertion: the ceiling # of pages should be greater than npages_limit.
To avoid background threads sleeping forever with idle arenas, we eagerly check
background threads' sleep time after extents_dalloc, and signal the thread if
necessary.
Added opt.background_thread to enable background threads, which handles purging
currently. When enabled, decay ticks will not trigger purging (which will be
left to the background threads). We limit the max number of threads to NCPUs.
When percpu arena is enabled, set CPU affinity for the background threads as
well.
The sleep interval of background threads is dynamic and determined by computing
number of pages to purge in the future (based on backlog).
Instead of embedding a lock bit in rtree leaf elements, we associate extents
with a small set of mutexes. This gets us two things:
- We can use the system mutexes. This (hypothetically) protects us from
priority inversion, and lets us stop doing a backoff/sleep loop, instead
opting for precise wakeups from the mutex.
- Cuts down on the number of mutex acquisitions we have to do (from 4 in the
worst case to two).
We end up simplifying most of the rtree code (which no longer has to deal with
locking or concurrency at all), at the cost of additional complexity in the
extent code: since the mutex protecting the rtree leaf elements is determined by
reading the extent out of those elements, the initial read is racy, so that we
may acquire an out of date mutex. We re-check the extent in the leaf after
acquiring the mutex to protect us from this race.
This lets us specify whether and how mutexes of the same rank are allowed to be
acquired. Currently, we only allow two polices (only a single mutex at a given
rank at a time, and mutexes acquired in ascending order), but we can plausibly
allow more (e.g. the "release uncontended mutexes before blocking").
Support millisecond resolution for decay times. Among other use cases
this makes it possible to specify a short initial dirty-->muzzy decay
phase, followed by a longer muzzy-->clean decay phase.
This resolves#812.
Rather than using a manually maintained list of internal symbols to
drive name mangling, add a compilation phase to automatically extract
the list of internal symbols.
This resolves#677.