Commit Graph

611 Commits

Jason Evans
7c124830a1 Fix lg_chunk clamping for config_cache_oblivious.
Fix lg_chunk clamping to take into account cache-oblivious large
allocation.  This regression only resulted in incorrect behavior if
!config_fill (false unless --disable-fill specified) and
config_cache_oblivious (true unless --disable-cache-oblivious
specified).

This regression was introduced by
8a03cf039c (Implement cache index
randomization for large allocations.), which was first released in
4.0.0.

This resolves #555.
2017-02-27 15:19:41 -08:00
Jason Evans
1e2c9ef8d6 Fix huge-aligned allocation.
This regression was caused by
b9408d77a6 (Fix/simplify chunk_recycle()
allocation size computations.).

This resolves #647.
2017-02-27 11:08:19 -08:00
Jason Evans
54d2d697b2 Test JSON output of malloc_stats_print() and fix bugs.
Implement and test a JSON validation parser.  Use the parser to validate
JSON output from malloc_stats_print(), with a significant subset of
supported output options.

This resolves #583.
2017-02-26 11:04:58 -08:00
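
A minimal sketch of capturing the JSON output for validation via the
documented malloc_stats_print() write callback; "J" selects JSON, and the
other option letters ("m", "a", "b"), which omit merged-arena, per-arena,
and per-size-class detail, are taken from the documented stats options.

    #include <stdio.h>
    #include <string.h>
    #include <jemalloc/jemalloc.h>

    /* Accumulate emitted JSON fragments into one buffer. */
    typedef struct {
        char buf[65536];
        size_t len;
    } json_buf_t;

    static void
    write_cb(void *opaque, const char *s)
    {
        json_buf_t *b = (json_buf_t *)opaque;
        size_t n = strlen(s);

        if (b->len + n < sizeof(b->buf)) {
            memcpy(&b->buf[b->len], s, n);
            b->len += n;
        }
    }

    int
    main(void)
    {
        static json_buf_t b;

        malloc_stats_print(write_cb, (void *)&b, "Jmab");
        printf("captured %zu bytes of JSON\n", b.len);
        /* A validation parser, as in the commit above, would consume
         * b.buf here. */
        return 0;
    }
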
Jason Evans
61d26425e5 Fix JSON-mode output for !config_stats and/or !config_prof cases.
These bugs were introduced by b599b32280
(Add "J" (JSON) support to malloc_stats_print().), which was first
released in 4.3.0.

This resolves #615.
2017-02-26 11:04:58 -08:00
Jason Evans
adae7cfc4a Fix chunk_alloc_dss() regression.
Fix chunk_alloc_dss() to account for bytes that are not a multiple of
the chunk size.  This regression was introduced by
e2bcf037d4 (Make dss operations
lockless.), which was first released in 4.3.0.
2017-02-26 10:53:26 -08:00
Jason Evans
08c24e7c1a Relax witness assertions related to prof_gdump().
In some cases the prof machinery allocates (in order to modify the
bt2gctx hash table), and such operations are synchronized via
bt2gctx_mtx.  Rather than asserting that no locks are held on entry
into functions that may call prof_gdump(), make the weaker assertion
that no "core" locks are held.  The prof machinery enqueues dumps
triggered by prof_gdump() calls when bt2gctx_mtx is held, so this
weakened assertion avoids false failures in such cases.
2017-02-23 10:08:42 -08:00
Jason Evans
f56cb9a68e Add witness_assert_depth[_to_rank]().
This makes it possible to assert precisely which locks are held.
2017-02-23 10:08:42 -08:00
Jason Evans
3ecc3c8486 Fix/refactor tcaches synchronization.
Synchronize tcaches with tcaches_mtx rather than ctl_mtx.  Add missing
synchronization for tcache flushing.  This bug was introduced by
1cb181ed63 (Implement explicit tcache
support.), which was first released in 4.0.0.
2017-02-23 10:06:15 -08:00
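
For context, the explicit tcache machinery that tcaches_mtx now protects
is driven through the documented mallctl interface; a minimal usage
sketch (error handling trimmed):

    #include <jemalloc/jemalloc.h>

    int
    main(void)
    {
        unsigned tc;
        size_t sz = sizeof(tc);

        /* Create an explicit tcache and obtain its identifier. */
        if (mallctl("tcache.create", (void *)&tc, &sz, NULL, 0) != 0)
            return 1;

        /* Allocate and deallocate through the explicit tcache. */
        void *p = mallocx(4096, MALLOCX_TCACHE(tc));
        if (p != NULL)
            dallocx(p, MALLOCX_TCACHE(tc));

        /* Flush (the path whose synchronization the commit fixes),
         * then destroy. */
        mallctl("tcache.flush", NULL, NULL, (void *)&tc, sizeof(tc));
        mallctl("tcache.destroy", NULL, NULL, (void *)&tc, sizeof(tc));
        return 0;
    }
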
Jason Evans
fdba5ad5cc Repair file permissions.
This regression was caused by 8f61fdedb9
(Uniformly cast mallctl[bymib]() oldp/newp arguments to (void *).).

This resolves #538.
2017-02-22 00:24:32 -08:00
Tamir Duberstein
b973ec7975 Avoid redeclaring glibc's secure_getenv
Avoid using the name secure_getenv, so that secure_getenv is not
redeclared when it is present but its use has been manually disabled via
ac_cv_func_secure_getenv=no.
2017-01-25 11:22:28 -08:00
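
A hedged sketch of the kind of wrapper involved; JEMALLOC_HAVE_SECURE_GETENV
stands in for whatever macro the configure test defines (an assumption),
and the point is simply that the fallback must not itself be named
secure_getenv.

    #define _GNU_SOURCE
    #include <stdlib.h>
    #include <unistd.h>

    /* Fallback: only honor the environment when not running setuid/setgid. */
    static char *
    je_secure_getenv(const char *name)
    {
    #ifdef JEMALLOC_HAVE_SECURE_GETENV
        return secure_getenv(name);
    #else
        if (getuid() != geteuid() || getgid() != getegid())
            return NULL;
        return getenv(name);
    #endif
    }
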
Jason Evans
b49c649bc1 Fix lock order reversal during gdump. 2017-01-24 12:50:06 -08:00
Jason Evans
dad74bd3c8 Convert witness_assert_lockless() to witness_assert_lock_depth().
This makes it possible to assert precisely which locks are held.
2017-01-24 12:50:06 -08:00
Mike Hommey
c6943acb3c Add dummy implementations for most remaining OSX zone allocator functions
Some system libraries are using malloc_default_zone() and then using
some of the malloc_zone_* API. Under normal conditions, those functions
check the malloc_zone_t/malloc_introspection_t struct for the values
that are allowed to be NULL, so that a NULL deref doesn't happen.

As of OSX 10.12, malloc_default_zone() doesn't return the actual default
zone anymore, but returns a fake, wrapper zone. The wrapper zone defines
all the possible functions in the malloc_zone_t/malloc_introspection_t
struct (almost), and calls the function from the registered default zone
(jemalloc in our case) on its own, without checking whether the pointers
are NULL.

This means that a system library that calls e.g.
malloc_zone_batch_malloc(malloc_default_zone(), ...) ends up trying to
call jemalloc_zone.batch_malloc, which is NULL, and a crash follows.

So as of OSX 10.12, the default zone is required to have all the
functions available (really, the same as the wrapper zone), even if they
do nothing.

This is arguably a bug in libsystem_malloc in OSX 10.12, but jemalloc
still needs to work in that case.
2017-01-17 20:12:24 -08:00
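
A Darwin-only sketch of the shape of the fix.  The batch_malloc/batch_free
prototypes follow the malloc_zone_t declaration in <malloc/malloc.h>;
treat the exact signatures and the helper below as illustrative
assumptions rather than jemalloc's actual zone code.

    #include <malloc/malloc.h>

    /* Dummies: never satisfy requests, never free anything, so the
     * 10.12 wrapper zone's unconditional calls are harmless. */
    static unsigned
    zone_batch_malloc(struct _malloc_zone_t *zone, size_t size,
        void **results, unsigned num_requested)
    {
        (void)zone; (void)size; (void)results; (void)num_requested;
        return 0;    /* Zero allocations satisfied; caller falls back. */
    }

    static void
    zone_batch_free(struct _malloc_zone_t *zone, void **to_be_freed,
        unsigned num_to_be_freed)
    {
        (void)zone; (void)to_be_freed; (void)num_to_be_freed;
    }

    /* During zone registration, fill in every entry point the wrapper
     * zone may call, e.g.: */
    static void
    zone_init_batch_hooks(malloc_zone_t *jemalloc_zone)
    {
        jemalloc_zone->batch_malloc = zone_batch_malloc;
        jemalloc_zone->batch_free = zone_batch_free;
        /* ...and similarly for the remaining optional members. */
    }
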
Mike Hommey
c68bb41793 Don't rely on OSX SDK malloc/malloc.h for malloc_zone struct definitions
The SDK jemalloc is built against might not be the latest for various
reasons, but the resulting binary ought to work on newer versions of
OSX.

In order to ensure this, we need the fullest definitions possible, so
copy what we need from the latest version of malloc/malloc.h available
on opensource.apple.com.
2017-01-17 20:12:24 -08:00
Jason Evans
145f3cd173 Add --disable-syscall.
This resolves #517.
2016-12-03 16:56:19 -08:00
Jason Evans
34a7e37a71 Fix pages_purge() when using MADV_DONTNEED.
This fixes a regression caused by
e98a620c59 (Mark partially purged arena
chunks as non-hugepage.).
2016-12-03 16:06:19 -08:00
Jason Evans
e98a620c59 Mark partially purged arena chunks as non-hugepage.
Add the pages_[no]huge() functions, which toggle huge page state via
madvise(..., MADV_[NO]HUGEPAGE) calls.

The first time a page run is purged from within an arena chunk, call
pages_nohuge() to tell the kernel to make no further attempts to back
the chunk with huge pages.  Upon arena chunk deletion, restore the
associated virtual memory to its original state via pages_huge().

This resolves #243.
2016-11-24 00:15:55 -08:00
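
A minimal Linux sketch of the pages_[no]huge() idea described above; the
names mirror the commit, the bodies are illustrative, and returning true
on failure follows the convention of jemalloc's pages_*() helpers.

    #include <stdbool.h>
    #include <stddef.h>
    #include <sys/mman.h>

    /* Ask the kernel to stop backing [addr, addr+size) with huge pages,
     * e.g. after the first purge within an arena chunk. */
    static bool
    pages_nohuge(void *addr, size_t size)
    {
    #ifdef MADV_NOHUGEPAGE
        return madvise(addr, size, MADV_NOHUGEPAGE) != 0;
    #else
        (void)addr; (void)size;
        return true;    /* Unsupported. */
    #endif
    }

    /* Restore the default hugepage-eligible state, e.g. at chunk deletion. */
    static bool
    pages_huge(void *addr, size_t size)
    {
    #ifdef MADV_HUGEPAGE
        return madvise(addr, size, MADV_HUGEPAGE) != 0;
    #else
        (void)addr; (void)size;
        return true;
    #endif
    }
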
Jason Evans
949a27fc32 Add pthread_atfork(3) feature test.
Some versions of Android provide a pthreads library without providing
pthread_atfork(), so in practice a separate feature test is necessary
for the latter.
2016-11-17 15:16:27 -08:00
Jason Evans
62f2d84e7a Refactor madvise(2) configuration.
Add feature tests for the MADV_FREE and MADV_DONTNEED flags to
madvise(2), so that MADV_FREE is detected and used for Linux kernel
versions 4.5 and newer.  Refactor pages_purge() so that on systems which
support both flags, MADV_FREE is preferred over MADV_DONTNEED.

This resolves #387.
2016-11-17 10:37:48 -08:00
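
A hedged sketch of the preference order; jemalloc detects the flags with
configure-time feature tests, for which plain #ifdef stands in here, and
the return value follows the "unzeroed" convention of pages_purge().

    #include <stdbool.h>
    #include <stddef.h>
    #include <sys/mman.h>

    static bool
    pages_purge(void *addr, size_t size)
    {
    #if defined(MADV_FREE)
        /* Preferred (Linux 4.5+): lazy freeing, cheap if reused soon. */
        madvise(addr, size, MADV_FREE);
        return true;     /* Contents may survive, i.e. unzeroed. */
    #elif defined(MADV_DONTNEED)
        /* Fallback: pages read back as zeros after the call. */
        madvise(addr, size, MADV_DONTNEED);
        return false;    /* Zeroed. */
    #else
        (void)addr; (void)size;
        return true;     /* No purging available. */
    #endif
    }
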
Jason Evans
0d6a472db9 Avoid gcc tautological-compare warnings. 2016-11-16 18:53:59 -08:00
Jason Evans
3ea838d2a2 Avoid gcc type-limits warnings. 2016-11-16 18:32:24 -08:00
Jason Evans
6468dd52f3 Fix an MSVC compiler warning. 2016-11-15 21:08:28 -08:00
Jason Evans
8f61fdedb9 Uniformly cast mallctl[bymib]() oldp/newp arguments to (void *).
This avoids warnings in some cases, and is otherwise generally good
hygiene.
2016-11-15 15:00:28 -08:00
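
An example of the convention, using documented mallctl names; the
explicit (void *) casts on the oldp/newp arguments are what this commit
applies uniformly.

    #include <stdbool.h>
    #include <stdio.h>
    #include <jemalloc/jemalloc.h>

    int
    main(void)
    {
        unsigned narenas;
        size_t sz = sizeof(narenas);
        bool enabled = false;

        /* Read-only query: oldp cast to (void *). */
        if (mallctl("arenas.narenas", (void *)&narenas, &sz, NULL, 0) == 0)
            printf("narenas: %u\n", narenas);

        /* Write: newp cast to (void *). */
        mallctl("thread.tcache.enabled", NULL, NULL, (void *)&enabled,
            sizeof(enabled));
        return 0;
    }
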
Jason Evans
2379479225 Consistently use size_t rather than uint64_t for extent serial numbers. 2016-11-15 13:47:22 -08:00
Jason Evans
5c77af98b1 Add extent serial numbers.
Add extent serial numbers and use them where appropriate as a sort key
that is higher priority than address, so that the allocation policy
prefers older extents.

This resolves #147.
2016-11-15 13:33:40 -08:00
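
A toy comparator illustrating the sort key: the serial number takes
priority over the address, so older extents win ties within a size
class.  The struct and field names are illustrative, not jemalloc's
actual extent layout.

    #include <stddef.h>
    #include <stdint.h>

    typedef struct {
        size_t sn;      /* Serial number: lower means older. */
        void *addr;
    } extent_rec_t;

    static int
    extent_rec_comp(const extent_rec_t *a, const extent_rec_t *b)
    {
        if (a->sn != b->sn)
            return (a->sn < b->sn) ? -1 : 1;
        if (a->addr != b->addr)
            return ((uintptr_t)a->addr < (uintptr_t)b->addr) ? -1 : 1;
        return 0;
    }
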
Jason Evans
1aeea0f391 Simplify extent_quantize().
2cdf07aba9 (Fix extent_quantize() to
handle greater-than-huge-size extents.) solved a non-problem; the
expression passed in to index2size() was never too large.  However, the
expression could in principle underflow, so fix the actual (latent) bug
and remove unnecessary complexity.
2016-11-11 22:46:55 -08:00
Jason Evans
b9408d77a6 Fix/simplify chunk_recycle() allocation size computations.
Remove outer CHUNK_CEILING(s2u(...)) from alloc_size computation, since
s2u() may overflow (and return 0), and CHUNK_CEILING() is only needed
around the alignment portion of the computation.

This fixes a regression caused by
5707d6f952 (Quantize szad trees by size
class.) and first released in 4.0.0.

This resolves #497.
2016-11-11 22:18:39 -08:00
Jason Evans
2cdf07aba9 Fix extent_quantize() to handle greater-than-huge-size extents.
Allocation requests can't directly create extents that exceed
HUGE_MAXCLASS, but extent merging can create them.

This fixes a regression caused by
8a03cf039c (Implement cache index
randomization for large allocations.) and first released in 4.0.0.

This resolves #497.
2016-11-11 22:17:27 -08:00
Jason Evans
5d6cb6eb66 Refactor prng to not use 64-bit atomics on 32-bit platforms.
This resolves #495.
2016-11-07 11:50:59 -08:00
Jason Evans
a4e83e8593 Fix run leak.
Fix arena_run_first_best_fit() to search all potentially non-empty
runs_avail heaps, rather than ignoring the heap that contains runs
larger than large_maxclass but smaller than chunksize.

This fixes a regression caused by
f193fd80cf (Refactor runs_avail.).

This resolves #493.
2016-11-07 09:43:39 -08:00
Jason Evans
28b7e42e44 Fix arena data structure size calculation.
Fix paren placement so that QUANTUM_CEILING() applies to the correct
portion of the expression that computes how much memory to base_alloc().
In practice this bug had no impact.  This was caused by
5d8db15db9 (Simplify run quantization.),
which in turn fixed an over-allocation regression caused by
3c4d92e82a (Add per size class huge
allocation statistics.).
2016-11-04 15:00:08 -07:00
Jason Evans
32896a902b Fix large allocation to search optimal size class heap.
Fix arena_run_alloc_large_helper() to not convert size to usize when
searching for the first best fit via arena_run_first_best_fit().  This
allows the search to consider the optimal quantized size class, so that
e.g. allocating and deallocating 40 KiB in a tight loop can reuse the
same memory.

This regression was nominally caused by
5707d6f952 (Quantize szad trees by size
class.), but it did not commonly cause problems until
8a03cf039c (Implement cache index
randomization for large allocations.).  These regressions were first
released in 4.0.0.

This resolves #487.
2016-11-03 22:36:30 -07:00
Jason Evans
e9012630ac Fix chunk_alloc_cache() to support decommitted allocation.
Fix chunk_alloc_cache() to support decommitted allocation, and use this
ability in arena_chunk_alloc_internal() and arena_stash_dirty(), so that
chunks don't get permanently stuck in a hybrid state.

This resolves #487.
2016-11-03 22:36:30 -07:00
Dave Watson
6c56e194b0 Check for existence of CPU_COUNT macro before using it.
This resolves #485.
2016-11-02 19:54:19 -07:00
Jason Evans
da206df10b Do not use syscall(2) on OS X 10.12 (deprecated). 2016-11-02 19:35:12 -07:00
Jason Evans
3f2b8d9cfa Add os_unfair_lock support.
OS X 10.12 deprecated OSSpinLock; os_unfair_lock is the recommended
replacement.
2016-11-02 19:35:12 -07:00
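
The replacement API is simple to adopt; a minimal sketch for macOS
10.12+ (a real build would additionally gate this on an availability or
configure check).

    #include <os/lock.h>    /* macOS 10.12+ */

    static os_unfair_lock counter_lock = OS_UNFAIR_LOCK_INIT;
    static unsigned long counter;

    static void
    counter_inc(void)
    {
        os_unfair_lock_lock(&counter_lock);
        counter++;
        os_unfair_lock_unlock(&counter_lock);
    }
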
Jason Evans
a99e0fa2d2 Fix/refactor zone allocator integration code.
Fix zone_force_unlock() to reinitialize, rather than unlocking mutexes,
since OS X 10.12 cannot tolerate a child unlocking mutexes that were
locked by its parent.

Refactor; this was a side effect of experimenting with zone
{de,re}registration during fork(2).
2016-11-02 19:35:09 -07:00
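
A sketch of the underlying pattern, independent of the zone code: in the
child after fork(2), reinitialize mutexes taken by the prefork handler
instead of unlocking them, mirroring what zone_force_unlock() now does
for jemalloc's mutexes.

    #include <pthread.h>

    static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

    static void prefork(void)         { pthread_mutex_lock(&big_lock); }
    static void postfork_parent(void) { pthread_mutex_unlock(&big_lock); }

    static void
    postfork_child(void)
    {
        /* Do not unlock: the lock was acquired pre-fork.  Reinitialize
         * instead, which OS X 10.12 tolerates. */
        pthread_mutex_init(&big_lock, NULL);
    }

    static void
    install_fork_handlers(void)
    {
        pthread_atfork(prefork, postfork_parent, postfork_child);
    }
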
Jason Evans
b599b32280 Add "J" (JSON) support to malloc_stats_print().
This resolves #474.
2016-11-01 15:32:37 -07:00
Jason Evans
1d57c03e33 Use CLOCK_MONOTONIC_COARSE rather than CLOCK_MONOTONIC_RAW.
The raw clock variant is slow (even relative to plain CLOCK_MONOTONIC),
whereas the coarse clock variant is faster than CLOCK_MONOTONIC, and its
coarser resolution (~1ms) is still adequate for our purposes.

This resolves #479.
2016-10-29 22:59:42 -07:00
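
A minimal Linux sketch showing the clock in use; CLOCK_MONOTONIC_COARSE
is Linux-specific, and its resolution can be checked with clock_getres().

    #include <stdio.h>
    #include <time.h>

    int
    main(void)
    {
        struct timespec res, now;

        /* ~1ms resolution, but cheaper to read than CLOCK_MONOTONIC. */
        clock_getres(CLOCK_MONOTONIC_COARSE, &res);
        clock_gettime(CLOCK_MONOTONIC_COARSE, &now);
        printf("resolution: %ld ns\n", res.tv_nsec);
        printf("now: %lld.%09ld s\n", (long long)now.tv_sec, now.tv_nsec);
        return 0;
    }
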
Jason Evans
c443b67561 Use syscall(2) rather than {open,read,close}(2) during boot.
Some applications wrap various system calls, and if they call the
allocator in their wrappers, unexpected reentry can result.  This is not
a general solution (many other syscalls are spread throughout the code),
but this resolves a bootstrapping issue that is apparently common.

This resolves #443.
2016-10-29 22:46:52 -07:00
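
A hedged sketch of the pattern (Linux-specific; SYS_open does not exist
on some newer architectures, which only provide SYS_openat, so the exact
syscall numbers here are assumptions).

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Read a file without going through possibly-wrapped libc I/O. */
    static long
    raw_read_file(const char *path, char *buf, size_t len)
    {
        long fd = syscall(SYS_open, path, O_RDONLY);
        if (fd < 0)
            return -1;
        long nread = syscall(SYS_read, (int)fd, buf, len);
        syscall(SYS_close, (int)fd);
        return nread;
    }
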
Jason Evans
e46f8f97bc Do not mark malloc_conf as weak on Windows.
This works around malloc_conf not being properly initialized by at least
the cygwin toolchain.  Prior build system changes to use
-Wl,--[no-]whole-archive may be necessary for malloc_conf resolution to
work properly as a non-weak symbol (not tested).
2016-10-29 00:16:30 -07:00
Jason Evans
35799a5030 Do not mark malloc_conf as weak for unit tests.
This is generally correct (no need for weak symbols since no jemalloc
library is involved in the link phase), and avoids linking problems
(apparently uninitialized non-NULL malloc_conf) when using cygwin with
gcc.
2016-10-28 23:21:14 -07:00
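
For context, malloc_conf is the documented symbol that jemalloc normally
declares weak so that an application-supplied definition overrides it; a
minimal application-side example (the option values are arbitrary):

    /* In the application, not in jemalloc: compile-time configuration
     * that jemalloc reads during bootstrap. */
    const char *malloc_conf = "narenas:4,lg_chunk:22";

The two commits above only change whether jemalloc's own default
definition is marked weak on Windows and in the unit tests; this
application-side usage is unchanged.
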
Dave Watson
ed84764a2a Support static linking of jemalloc with glibc
glibc defines its malloc implementation with several weak and strong
symbols:

strong_alias (__libc_calloc, __calloc) weak_alias (__libc_calloc, calloc)
strong_alias (__libc_free, __cfree) weak_alias (__libc_free, cfree)
strong_alias (__libc_free, __free) strong_alias (__libc_free, free)
strong_alias (__libc_malloc, __malloc) strong_alias (__libc_malloc, malloc)

The issue is not with the weak symbols, but that other parts of glibc
depend on __libc_malloc explicitly.  Defining them in terms of jemalloc
APIs allows the linker to drop glibc's malloc.o completely from the link,
and static linking no longer results in symbol collisions.

Another wrinkle: during initialization jemalloc calls sysconf to get
the number of CPUs.  glibc allocates for the first time before setting
up its isspace (and other related) tables, which are used by sysconf.
Instead, use the pthread API to get the number of CPUs when running on
glibc, which seems to work.

This resolves #442.
2016-10-28 15:10:19 -07:00
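
A sketch of the glibc-friendly CPU count query described above (requires
_GNU_SOURCE; falls back to sysconf elsewhere).  The CPU_COUNT guard also
matches the existence check added by 6c56e194b0, listed above.

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>
    #include <unistd.h>

    static unsigned
    ncpus(void)
    {
    #if defined(__GLIBC__) && defined(CPU_COUNT)
        /* Avoid sysconf(), which relies on glibc tables that may not be
         * set up yet when jemalloc bootstraps during glibc's first
         * allocation. */
        cpu_set_t set;
        if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) == 0)
            return (unsigned)CPU_COUNT(&set);
    #endif
        long n = sysconf(_SC_NPROCESSORS_ONLN);
        return n > 0 ? (unsigned)n : 1;
    }
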
Jason Evans
dc553d52d8 Fix over-sized allocation of rtree leaf nodes.
Use the correct level metadata when allocating child nodes so that leaf
nodes don't end up over-sized (2^16 elements vs 2^4 elements).
2016-10-28 00:41:15 -07:00
Jason Evans
962a2979e3 Do not (recursively) allocate within tsd_fetch().
Refactor tsd so that tsdn_fetch() does not trigger allocation, since
allocation could cause infinite recursion.

This resolves #458.
2016-10-21 00:27:37 -07:00
Jason Evans
e2bcf037d4 Make dss operations lockless.
Rather than protecting dss operations with a mutex, use atomic
operations.  This has negligible impact on synchronization overhead
during typical dss allocation, but is a substantial improvement for
chunk_in_dss() and the newly added chunk_dss_mergeable(), which can be
called multiple times during chunk deallocations.

This change also has the advantage of avoiding tsd in deallocation paths
associated with purging, which resolves potential deadlocks during
thread exit due to attempted tsd resurrection.

This resolves #425.
2016-10-13 15:33:56 -07:00
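
A toy C11-atomics sketch of the lockless idea; jemalloc uses its own
atomic wrappers and real sbrk(2)-based dss state, so everything below is
illustrative.

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdint.h>

    /* Upper bound of memory ever obtained from the (simulated) dss. */
    static _Atomic(uintptr_t) dss_max;

    /* Lock-free membership test usable on deallocation paths: no mutex,
     * no tsd. */
    static bool
    chunk_in_dss_sketch(void *chunk, uintptr_t dss_base)
    {
        uintptr_t max = atomic_load_explicit(&dss_max,
            memory_order_acquire);
        return (uintptr_t)chunk >= dss_base && (uintptr_t)chunk < max;
    }

    /* Extension: publish a new upper bound only if nobody raced us. */
    static bool
    dss_extend_sketch(uintptr_t old_max, uintptr_t new_max)
    {
        return atomic_compare_exchange_strong_explicit(&dss_max,
            &old_max, new_max, memory_order_release, memory_order_relaxed);
    }
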
Jason Evans
9737685943 Add/use adaptive spinning.
Add spin_t and spin_{init,adaptive}(), which provide a simple
abstraction for adaptive spinning.

Adaptively spin during busy waits in bootstrapping and rtree node
initialization.
2016-10-13 14:58:38 -07:00
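
A toy version of the abstraction; spin_t and spin_adaptive() are the
names from the commit, while the exact policy below (exponentially more
busy-wait iterations, then yielding) is an assumption about the general
technique rather than a copy of jemalloc's code.

    #include <sched.h>
    #include <stdint.h>

    typedef struct {
        unsigned iteration;
    } spin_t;

    #define SPIN_INITIALIZER {0}

    static void
    spin_adaptive(spin_t *spin)
    {
        if (spin->iteration < 5) {
            /* Busy-wait briefly, backing off exponentially. */
            for (uint64_t i = 0; i < ((uint64_t)1 << spin->iteration); i++)
                __asm__ volatile("" ::: "memory");  /* compiler barrier */
            spin->iteration++;
        } else {
            /* Spinning stopped paying off; give up the CPU. */
            sched_yield();
        }
    }

A busy wait then becomes: spin_t s = SPIN_INITIALIZER; while (!ready)
spin_adaptive(&s);
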
Jason Evans
a2539fab95 Disallow 0x5a junk filling when running in Valgrind.
Explicitly disallow junk:true and junk:free runtime settings when
running in Valgrind, since deallocation-time junk filling and redzone
validation cause false positive Valgrind reports.

This resolves #470.
2016-10-12 22:58:40 -07:00
Jason Evans
d419bb09ef Fix and simplify decay-based purging.
Simplify decay-based purging so that purge attempts are only triggered
when the epoch advances, rather than every time purgeable memory
increases.
In a correctly functioning system (not previously the case; see below),
this only causes a behavior difference if during subsequent purge
attempts the least recently used (LRU) purgeable memory extent is
initially too large to be purged, but that memory is reused between
attempts and one or more of the next LRU purgeable memory extents are
small enough to be purged.  In practice this is an arbitrary behavior
change that is within the set of acceptable behaviors.

As for the purging fix, ensure that arena->decay.ndirty is recorded
*after* the epoch advance and associated purging occur.  Prior to this
fix, purging during the epoch advance could make the difference
(arena->ndirty - arena->decay.ndirty) substantially underrepresentative,
i.e. the number of dirty pages attributed to the current epoch was too
low, and a series of unintended purges could result.  This fix is also
relevant in the context of the simplification described above, but the
bug's impact would be limited to over-purging at epoch advances.
2016-10-11 15:50:05 -07:00
Jason Evans
45a5bf6772 Do not advance decay epoch when time goes backwards.
Instead, move the epoch backward in time.  Additionally, add
nstime_monotonic() and use it in debug builds to assert that time only
goes backward if nstime_update() is using a non-monotonic time source.
2016-10-10 22:31:37 -07:00