Detect whether chunks start off as THP-capable by default (according to
the state of /sys/kernel/mm/transparent_hugepage/enabled), and use this
as the basis for whether to call pages_nohuge() once per chunk during
first purge of any of the chunk's page runs.
Add the --disable-thp configure option, as well as the the opt.thp
mallctl.
This resolves#541.
Convert CFLAGS to be a concatenation:
CFLAGS := CONFIGURE_CFLAGS SPECIFIED_CFLAGS EXTRA_CFLAGS
This ordering makes it possible to override the flags set by the
configure script both during and after configuration, with CFLAGS and
EXTRA_CFLAGS, respectively.
This resolves#619.
When multiple threads calling stats_print, race could happen as we read the
counters in separate mallctl calls; and the removed assertion could fail when
other operations happened in between the mallctl calls. For simplicity, output
"race" in the utilization field in this case.
This resolves#616.
Fix lg_chunk clamping to take into account cache-oblivious large
allocation. This regression only resulted in incorrect behavior if
!config_fill (false unless --disable-fill specified) and
config_cache_oblivious (true unless --disable-cache-oblivious
specified).
This regression was introduced by
8a03cf039c (Implement cache index
randomization for large allocations.), which was first released in
4.0.0.
This resolves#555.
Implement and test a JSON validation parser. Use the parser to validate
JSON output from malloc_stats_print(), with a significant subset of
supported output options.
This resolves#583.
Fix chunk_alloc_dss() to account for bytes that are not a multiple of
the chunk size. This regression was introduced by
e2bcf037d4 (Make dss operations
lockless.), which was first released in 4.3.0.
In some cases the prof machinery allocates (in order to modify the
bt2gctx hash table), and such operations are synchronized via
bt2gctx_mtx. Rather than asserting that no locks are held on entry
into functions that may call prof_gdump(), make the weaker assertion
that no "core" locks are held. The prof machinery enqueues dumps
triggered by prof_gdump() calls when bt2gctx_mtx is held, so this
weakened assertion avoids false failures in such cases.
This fixes interactions with witness_assert_depth[_to_rank](), which was
added in dad74bd3c8 (Convert
witness_assert_lockless() to witness_assert_lock_depth().).
malloc_conf does not reliably work with MSVC, which complains of
"inconsistent dll linkage", i.e. its inability to support the
application overriding malloc_conf when dynamically linking/loading.
Work around this limitation by adding test harness support for per test
shell script sourcing, and converting all tests to use MALLOC_CONF
instead of malloc_conf.
Synchronize tcaches with tcaches_mtx rather than ctl_mtx. Add missing
synchronization for tcache flushing. This bug was introduced by
1cb181ed63 (Implement explicit tcache
support.), which was first released in 4.0.0.
Introduces gen_travis.py, which generates .travis.yml, and updates .travis.yml
to be the generated version.
The travis build matrix approach doesn't play well with mixing and matching
various different environment settings, so we generate every build explicitly,
rather than letting them do it for us.
To avoid abusing travis resources (and save us time waiting for CI results), we
don't test every possible combination of options; we only check up to 2 unusual
settings at a time.
Avoid the name secure_getenv to avoid redeclaring secure_getenv when
secure_getenv is present but its use is manually disabled via
ac_cv_func_secure_getenv=no.
Some system libraries are using malloc_default_zone() and then using
some of the malloc_zone_* API. Under normal conditions, those functions
check the malloc_zone_t/malloc_introspection_t struct for the values
that are allowed to be NULL, so that a NULL deref doesn't happen.
As of OSX 10.12, malloc_default_zone() doesn't return the actual default
zone anymore, but returns a fake, wrapper zone. The wrapper zone defines
all the possible functions in the malloc_zone_t/malloc_introspection_t
struct (almost), and calls the function from the registered default zone
(jemalloc in our case) on its own. Without checking whether the pointers
are NULL.
This means that a system library that calls e.g.
malloc_zone_batch_malloc(malloc_default_zone(), ...) ends up trying to
call jemalloc_zone.batch_malloc, which is NULL, and crash follows.
So as of OSX 10.12, the default zone is required to have all the
functions available (really, the same as the wrapper zone), even if they
do nothing.
This is arguably a bug in libsystem_malloc in OSX 10.12, but jemalloc
still needs to work in that case.
The SDK jemalloc is built against might be not be the latest for various
reasons, but the resulting binary ought to work on newer versions of
OSX.
In order to ensure this, we need the fullest definitions possible, so
copy what we need from the latest version of malloc/malloc.h available
on opensource.apple.com.
Currently, jemalloc detects sparc64 targets by checking whether
__sparc64__ is defined. However, this definition is used on BSD
targets only. Linux targets define both __sparc__ and __arch64__
for sparc64. Since this also works on BSD, rather use __sparc__
and __arch64__ instead of __sparc64__ to detect sparc64 targets.
The core issue here is the weak linking of the symbol, and in certain
environments--for instance, using the latest Xcode (8.1) with the latest
SDK (10.12)--os_unfair_lock may resolve even though you're compiling on
a host that doesn't support it (10.11).
We can use the availability macros to circumvent this problem, and
detect that we're not compiling for a target that is going to support
them and error out at compile time. The other alternative is to do a
runtime check, but that presents issues for cross-compiling.
Add the pages_[no]huge() functions, which toggle huge page state via
madvise(..., MADV_[NO]HUGEPAGE) calls.
The first time a page run is purged from within an arena chunk, call
pages_nohuge() to tell the kernel to make no further attempts to back
the chunk with huge pages. Upon arena chunk deletion, restore the
associated virtual memory to its original state via pages_huge().
This resolves#243.
Some versions of Android provide a pthreads library without providing
pthread_atfork(), so in practice a separate feature test is necessary
for the latter.
Add feature tests for the MADV_FREE and MADV_DONTNEED flags to
madvise(2), so that MADV_FREE is detected and used for Linux kernel
versions 4.5 and newer. Refactor pages_purge() so that on systems which
support both flags, MADV_FREE is preferred over MADV_DONTNEED.
This resolves#387.
Rather than relying on two's complement negation for alignment mask
generation, use bitwise not and addition. This dodges warnings from
MSVC, and should be strength-reduced by compiler optimization anyway.
Add extent serial numbers and use them where appropriate as a sort key
that is higher priority than address, so that the allocation policy
prefers older extents.
This resolves#147.
2cdf07aba9 (Fix extent_quantize() to
handle greater-than-huge-size extents.) solved a non-problem; the
expression passed in to index2size() was never too large. However the
expression could in principle underflow, so fix the actual (latent) bug
and remove unnecessary complexity.