Specifically, the extent_arena_[g|s]et functions and the address randomization.
These are the only things that tie the extent struct itself to the arena code.
W/o retain, split and merge are disallowed on Windows. Avoid doing first-fit
which needs splitting almost always. Instead, try exact fit only and bail out
early.
extent_register may only fail if the underlying extent and region got stolen /
coalesced before we lock. Avoid doing extent_leak (which purges the region)
since we don't really own the region.
This can only happen on Windows and with opt.retain disabled (which isn't the
default). The solution is suboptimal, however not a common case as retain is
the long term plan for all platforms anyway.
The VirtualAlloc and VirtualFree APIs are different because MEM_DECOMMIT cannot
be used across multiple VirtualAlloc regions. To properly support decommit,
only allow merge / split within the same region -- this is done by tracking the
"is_head" state of extents and not merging cross-region.
Add a new state is_head (only relevant for retain && !maps_coalesce), which is
true for the first extent in each VirtualAlloc region. Determine if two extents
can be merged based on the head state, and use serial numbers for sanity checks.
This option saves a few CPU cycles, but potentially adds a lot of
fragmentation - so much so that there are workarounds like
max_active. Instead, let's just drop it entirely. It only made
a difference in one service I tested (.3% cpu regression), while
many services saw a memory win (also small, less than 1% mem P99)
This is discovered and suggested by @jasone in #1468. When custom extent hooks
are in use, we should ensure page alignment on the extent alloc path, instead of
relying on the user hooks to do so.
The analytics tool is put under experimental.utilization namespace in
mallctl. Input is one pointer or an array of pointers and the output
is a list of memory utilization statistics.
When custom extent_hooks or transparent huge pages are in use, the purging
semantics may change, which means we may not get zeroed pages on repopulating.
Fixing the issue by manually memset for such cases.
We eagerly coalesce large buffers when deallocating, however the previous logic
around this introduced extra lock overhead -- when coalescing we always lock the
neighbors even if they are active, while for active extents nothing can be done.
This commit checks if the neighbor extents are potentially active before
locking, and avoids locking if possible. This speeds up large_dalloc by ~20%.
It also fixes some undesired behavior: we could stop coalescing because a small
buffer was merged, while a large neighbor was ignored on the other side.
When retain is enabled, the default dalloc hook does nothing (since we avoid
munmap). But the overhead preparing the call is high, specifically the extent
de-register and re-register involve locking and extent / rtree modifications.
Bypass the call with retain in this diff.
The global data is mostly only used at initialization, or for easy access to
values we could compute statically. Instead of consuming that space (and
risking TLB misses), we can just pass around a pointer to stack data during
bootstrapping.
The largest small class, smallest large class, and largest large class may all
be needed down fast paths; to avoid the risk of touching another cache line, we
can make them available as constants.
This class removes almost all the dependencies on size_classes.h, accessing the
data there only via the new module sc.h, which does not depend on any
configuration options.
In a subsequent commit, we'll remove the configure-time size class computations,
doing them at boot time, instead.
Before this commit jemalloc produced many warnings when compiled with -Wextra
with both Clang and GCC. This commit fixes the issues raised by these warnings
or suppresses them if they were spurious at least for the Clang and GCC
versions covered by CI.
This commit:
* adds `JEMALLOC_DIAGNOSTIC` macros: `JEMALLOC_DIAGNOSTIC_{PUSH,POP}` are
used to modify the stack of enabled diagnostics. The
`JEMALLOC_DIAGNOSTIC_IGNORE_...` macros are used to ignore a concrete
diagnostic.
* adds `JEMALLOC_FALLTHROUGH` macro to explicitly state that falling
through `case` labels in a `switch` statement is intended
* Removes all UNUSED annotations on function parameters. The warning
-Wunused-parameter is now disabled globally in
`jemalloc_internal_macros.h` for all translation units that include
that header. It is never re-enabled since that header cannot be
included by users.
* locally suppresses some -Wextra diagnostics:
* `-Wmissing-field-initializer` is buggy in older Clang and GCC versions,
where it does not understanding that, in C, `= {0}` is a common C idiom
to initialize a struct to zero
* `-Wtype-bounds` is suppressed in a particular situation where a generic
macro, used in multiple different places, compares an unsigned integer for
smaller than zero, which is always true.
* `-Walloc-larger-than-size=` diagnostics warn when an allocation function is
called with a size that is too large (out-of-range). These are suppressed in
the parts of the tests where `jemalloc` explicitly does this to test that the
allocation functions fail properly.
* adds a new CI build bot that runs the log unit test on CI.
Closes#1196 .
Previously, we would leak the extent and memory associated with a salvageable
portion of an extent that we were trying to split in three, in the case where
the first split attempt succeeded and the second failed.
preserve_lru feature adds lots of complication, for little value.
Removing it means merged extents are re-added to the lru list, and may
take longer to madvise away than they otherwise would.
Canaries after removal seem flat for several services (no change).
"always" marks all user mappings as MADV_HUGEPAGE; while "never" marks all
mappings as MADV_NOHUGEPAGE. The default setting "default" does not change any
settings. Note that all the madvise calls are part of the default extent hooks
by design, so that customized extent hooks have complete control over the
mappings including hugepage settings.
We compute the max size required to satisfy an alignment. However this can be
quite pessimistic, especially with frequent reuse (and combined with state-based
fragmentation). This commit adds one more fit step specific to aligned
allocations, searching in all potential fit size classes.
When purging, large allocations are usually the ones that cross the npages_limit
threshold, simply because they are "large". This means we often leave the large
extent around for a while, which has the downsides of: 1) high RSS and 2) more
chance of them getting fragmented. Given that they are not likely to be reused
very soon (LRU), let's over purge by 1 extent (which is often large and not
reused frequently).
Coalescing is a small price to pay for large allocations since they happen less
frequently. This reduces fragmentation while also potentially improving
locality.
When allocating from dirty extents (which we always prefer if available), large
active extents can get split even if the new allocation is much smaller, in
which case the introduced fragmentation causes high long term damage. This new
option controls the threshold to reuse and split an existing active extent. We
avoid using a large extent for much smaller sizes, in order to reduce
fragmentation. In some workload, adding the threshold improves virtual memory
usage by >10x.