Commit Graph

2542 Commits

Author SHA1 Message Date
Jason Evans
ff4db5014e Remove rtree leading 0 bit optimization.
A subsequent change instead ignores insignificant high bits.
2017-02-08 18:50:03 -08:00
Jason Evans
cdc240d501 Make non-essential inline rtree functions static functions. 2017-02-08 18:50:03 -08:00
Jason Evans
c511a44e99 Split rtree_elm_lookup_hard() out of rtree_elm_lookup().
Anything but a hit in the first element of the lookup cache is
expensive enough to negate the benefits of inlining.
2017-02-08 18:50:03 -08:00
Jason Evans
4a346f5593 Replace rtree path cache with LRU cache.
Rework rtree_ctx_t to encapsulate an rtree leaf LRU lookup cache rather
than a single-path element lookup cache.  The replacement is logically
much simpler, as well as slightly faster in the fast path case and less
prone to degraded performance during non-trivial sequences of lookups.
2017-02-08 18:50:03 -08:00
Jason Evans
0ecf692726 Optimize a branch out of rtree_read() if !dependent. 2017-02-08 18:50:03 -08:00
Jason Evans
3bd6d8e41d Conditianalize lg_tcache_max use on JEMALLOC_TCACHE. 2017-02-07 12:15:36 -08:00
Jason Evans
5177995530 Fix extent_record().
Read adjacent rtree elements while holding element locks, since the
extents mutex only protects against relevant like-state extent mutation.

Fix management of the 'coalesced' loop state variable to merge
forward/backward results, rather than overwriting the result of forward
coalescing if attempting to coalesce backward.  In practice this caused
no correctness issues, but could cause extra iterations in rare cases.

These regressions were introduced by
d27f29b468 (Disentangle arena and extent
locking.).
2017-02-06 20:05:49 -08:00
Jason Evans
6737d5f61e Fix a race in extent_grow_retained().
Set extent as active prior to registration so that other threads can't
modify it in the absence of locking.

This regression was introduced by
d27f29b468 (Disentangle arena and extent
locking.), via non-obvious means.  Removal of extents_mtx protection
during extent_grow_retained() execution opened up the race, but in the
presence of that locking, the code was safe.

This resolves #599.
2017-02-04 12:15:13 -08:00
Jason Evans
1bac516aaa Optimize compute_size_with_overflow().
Do not check for overflow unless it is actually a possibility.
2017-02-03 19:13:05 -08:00
Jason Evans
767ffa2b5f Fix compute_size_with_overflow().
Fix compute_size_with_overflow() to use a high_bits mask that has the
high bits set, rather than the low bits.  This regression was introduced
by 5154ff32ee (Unify the allocation
paths).
2017-02-03 19:13:05 -08:00
Jason Evans
d27f29b468 Disentangle arena and extent locking.
Refactor arena and extent locking protocols such that arena and
extent locks are never held when calling into the extent_*_wrapper()
API.  This requires extra care during purging since the arena lock no
longer protects the inner purging logic.  It also requires extra care to
protect extents from being merged with adjacent extents.

Convert extent_t's 'active' flag to an enumerated 'state', so that
retained extents are explicitly marked as such, rather than depending on
ring linkage state.

Refactor the extent collections (and their synchronization) for cached
and retained extents into extents_t.  Incorporate LRU functionality to
support purging.  Incorporate page count accounting, which replaces
arena->ndirty and arena->stats.retained.

Assert that no core locks are held when entering any internal
[de]allocation functions.  This is in addition to existing assertions
that no locks are held when entering external [de]allocation functions.

Audit and document synchronization protocols for all arena_t fields.

This fixes a potential deadlock due to recursive allocation during
gdump, in a similar fashion to b49c649bc1
(Fix lock order reversal during gdump.), but with a necessarily much
broader code impact.
2017-02-01 16:43:46 -08:00
Jason Evans
1b6e43507e Fix/refactor tcaches synchronization.
Synchronize tcaches with tcaches_mtx rather than ctl_mtx.  Add missing
synchronization for tcache flushing.  This bug was introduced by
1cb181ed63 (Implement explicit tcache
support.), which was first released in 4.0.0.
2017-02-01 16:43:46 -08:00
Jason Evans
d0e93ada51 Add witness_assert_depth[_to_rank]().
This makes it possible to make lock state assertions about precisely
which locks are held.
2017-02-01 16:43:46 -08:00
Jason Evans
ace679ce74 Synchronize extent_grow_next accesses.
This should have been part of 411697adcd
(Use exponential series to size extents.), which introduced
extent_grow_next.
2017-02-01 16:43:46 -08:00
Jason Evans
5033a9176a Call prof_gctx_create() without owing bt2gctx_mtx.
This reduces the probability of allocating (and thereby indirectly
making a system call) while owning bt2gctx_mtx.  Unfortunately it is an
incomplete solution, because ckh insertion/deletion can also
allocate/deallocate, which requires more extensive changes to address.
2017-02-01 16:43:46 -08:00
Jason Evans
397f54aa46 Conditionalize prof fork handling on config_prof.
This allows the compiler to completely remove dead code.
2017-02-01 16:43:46 -08:00
Qi Wang
bbff6ca674 Handle race in stats_arena_bins_print
When multiple threads calling stats_print, race could happen as we read the
counters in separate mallctl calls; and the removed assertion could fail when
other operations happened in between the mallctl calls. For simplicity, output
"race" in the utilization field in this case.
2017-02-01 15:17:39 -08:00
Jason Evans
190f81c6d5 Silence harmless warnings discovered via run_tests.sh. 2017-02-01 11:29:12 -08:00
David Goldblatt
449b7f4867 CI: Run --enable-debug builds on windows
This will hopefully catch some windows-specific bugs.
2017-01-31 17:23:30 -08:00
David Goldblatt
5260d9c12f Introduce scripts to run all possible tests
In 6e7d0890 we added better travis continuous integration tests. This is nice,
but has two problems:
- We run only a subset of interesting tests.
- The travis builds can take hours to give us back results (especially on OS X).

This adds scripts/gen_run_tests.py, and its output, run_tests.sh, which builds
and runs a larger portion of possible configurations on the local machine.

While a travis run takes several hours to complete , I can run these scripts on
my (OS X) latop and (Linux) devserve, and get a more exhaustive set of results
back in around 10 minutes.
2017-01-30 17:51:57 -08:00
David Goldblatt
6e7d0890cb Beef up travis CI integration testing
Introduces gen_travis.py, which generates .travis.yml, and updates .travis.yml
to be the generated version.

The travis build matrix approach doesn't play well with mixing and matching
various different environment settings, so we generate every build explicitly,
rather than letting them do it for us.

To avoid abusing travis resources (and save us time waiting for CI results), we
don't test every possible combination of options; we only check up to 2 unusual
settings at a time.
2017-01-26 21:31:21 -08:00
David Goldblatt
85d2841818 Fix a bug in which a potentially invalid usize replaced size
In the refactoring that unified the allocation paths, usize was substituted for
size. This worked fine under the default test configuration, but triggered
asserts when we started beefing up our CI testing.

This change fixes the issue, and clarifies the comment describing the argument
selection that it got wrong.
2017-01-25 15:50:59 -08:00
Tamir Duberstein
0874b648e0 Avoid redeclaring glibc's secure_getenv
Avoid the name secure_getenv to avoid redeclaring secure_getenv when
secure_getenv is present but its use is manually disabled via
ac_cv_func_secure_getenv=no.
2017-01-25 11:24:32 -08:00
Tamir Duberstein
b973ec7975 Avoid redeclaring glibc's secure_getenv
Avoid the name secure_getenv to avoid redeclaring secure_getenv when
secure_getenv is present but its use is manually disabled via
ac_cv_func_secure_getenv=no.
2017-01-25 11:22:28 -08:00
Jason Evans
b49c649bc1 Fix lock order reversal during gdump. 2017-01-24 12:50:06 -08:00
Jason Evans
dad74bd3c8 Convert witness_assert_lockless() to witness_assert_lock_depth().
This makes it possible to make lock state assertions about precisely
which locks are held.
2017-01-24 12:50:06 -08:00
Jason Evans
c0cc5db871 Replace tabs following #define with spaces.
This resolves #564.
2017-01-20 21:45:53 -08:00
Jason Evans
f408643a4c Remove extraneous parens around return arguments.
This resolves #540.
2017-01-20 21:43:07 -08:00
Jason Evans
c4c2592c83 Update brace style.
Add braces around single-line blocks, and remove line breaks before
function-opening braces.

This resolves #537.
2017-01-20 21:43:07 -08:00
David Goldblatt
5154ff32ee Unify the allocation paths
This unifies the allocation paths for malloc, posix_memalign, aligned_alloc,
calloc, memalign, valloc, and mallocx, so that they all share common code where
they can.

There's more work that could be done here, but I think this is the smallest
discrete change in this direction.
2017-01-20 12:15:53 -08:00
Jason Evans
9eb1b1c881 Fix --disable-stats support.
Fix numerous regressions that were exposed by --disable-stats, both in
the core library and in the tests.
2017-01-19 18:31:07 -08:00
Jason Evans
66bf773ef2 Test JSON output of malloc_stats_print() and fix bugs.
Implement and test a JSON validation parser.  Use the parser to validate
JSON output from malloc_stats_print(), with a significant subset of
supported output options.

This resolves #551.
2017-01-19 14:05:00 -08:00
Jason Evans
7a61ebe71f Remove -Werror=declaration-after-statement.
This partially resolves #536.
2017-01-19 11:07:42 -08:00
Qi Wang
58424e679d Added stats about number of bytes cached in tcache currently. 2017-01-18 10:55:21 -08:00
Mike Hommey
12ab4383e9 Add dummy implementations for most remaining OSX zone allocator functions
Some system libraries are using malloc_default_zone() and then using
some of the malloc_zone_* API. Under normal conditions, those functions
check the malloc_zone_t/malloc_introspection_t struct for the values
that are allowed to be NULL, so that a NULL deref doesn't happen.

As of OSX 10.12, malloc_default_zone() doesn't return the actual default
zone anymore, but returns a fake, wrapper zone. The wrapper zone defines
all the possible functions in the malloc_zone_t/malloc_introspection_t
struct (almost), and calls the function from the registered default zone
(jemalloc in our case) on its own. Without checking whether the pointers
are NULL.

This means that a system library that calls e.g.
malloc_zone_batch_malloc(malloc_default_zone(), ...) ends up trying to
call jemalloc_zone.batch_malloc, which is NULL, and crash follows.

So as of OSX 10.12, the default zone is required to have all the
functions available (really, the same as the wrapper zone), even if they
do nothing.

This is arguably a bug in libsystem_malloc in OSX 10.12, but jemalloc
still needs to work in that case.
2017-01-17 20:13:28 -08:00
Mike Hommey
0f7376eb62 Don't rely on OSX SDK malloc/malloc.h for malloc_zone struct definitions
The SDK jemalloc is built against might be not be the latest for various
reasons, but the resulting binary ought to work on newer versions of
OSX.

In order to ensure this, we need the fullest definitions possible, so
copy what we need from the latest version of malloc/malloc.h available
on opensource.apple.com.
2017-01-17 20:13:28 -08:00
Mike Hommey
c6943acb3c Add dummy implementations for most remaining OSX zone allocator functions
Some system libraries are using malloc_default_zone() and then using
some of the malloc_zone_* API. Under normal conditions, those functions
check the malloc_zone_t/malloc_introspection_t struct for the values
that are allowed to be NULL, so that a NULL deref doesn't happen.

As of OSX 10.12, malloc_default_zone() doesn't return the actual default
zone anymore, but returns a fake, wrapper zone. The wrapper zone defines
all the possible functions in the malloc_zone_t/malloc_introspection_t
struct (almost), and calls the function from the registered default zone
(jemalloc in our case) on its own. Without checking whether the pointers
are NULL.

This means that a system library that calls e.g.
malloc_zone_batch_malloc(malloc_default_zone(), ...) ends up trying to
call jemalloc_zone.batch_malloc, which is NULL, and crash follows.

So as of OSX 10.12, the default zone is required to have all the
functions available (really, the same as the wrapper zone), even if they
do nothing.

This is arguably a bug in libsystem_malloc in OSX 10.12, but jemalloc
still needs to work in that case.
2017-01-17 20:12:24 -08:00
Mike Hommey
c68bb41793 Don't rely on OSX SDK malloc/malloc.h for malloc_zone struct definitions
The SDK jemalloc is built against might be not be the latest for various
reasons, but the resulting binary ought to work on newer versions of
OSX.

In order to ensure this, we need the fullest definitions possible, so
copy what we need from the latest version of malloc/malloc.h available
on opensource.apple.com.
2017-01-17 20:12:24 -08:00
Jason Evans
1ff09534b5 Fix prof_realloc() regression.
Mostly revert the prof_realloc() changes in
498856f44a (Move slabs out of chunks.) so
that prof_free_sampled_object() is called when appropriate.  Leave the
prof_tctx_[re]set() optimization in place, but add an assertion to
verify that all eight cases are correctly handled.  Add a comment to
make clear the code ordering, so that the regression originally fixed by
ea8d97b897 (Fix
prof_{malloc,free}_sample_object() call order in prof_realloc().) is not
repeated.

This resolves #499.
2017-01-17 15:16:37 -08:00
Jason Evans
de5e1aff2a Formatting/comment fixes. 2017-01-17 15:16:37 -08:00
Jason Evans
8115f05b26 Add nullptr support to sized delete operators. 2017-01-17 14:30:15 -08:00
Jason Evans
41aa41853c Fix style nits. 2017-01-17 14:30:15 -08:00
Qi Wang
e8990dc7c7 Remove redundent stats-merging logic when destroying tcache.
The removed stats merging logic is already taken care of by tcache_flush.
2017-01-17 09:42:39 -08:00
Jason Evans
ffbb7dac3d Remove leading blank lines from function bodies.
This resolves #535.
2017-01-13 14:49:24 -08:00
Jason Evans
87e81e609b Fix indentation. 2017-01-13 14:49:24 -08:00
John Paul Adrian Glaubitz
9389335b86 Use better pre-processor defines for sparc64
Currently, jemalloc detects sparc64 targets by checking whether
__sparc64__ is defined. However, this definition is used on BSD
targets only. Linux targets define both __sparc__ and __arch64__
for sparc64. Since this also works on BSD, rather use __sparc__
and __arch64__ instead of __sparc64__ to detect sparc64 targets.
2017-01-13 09:01:33 -08:00
David Goldblatt
77cccac8cd Break up headers into constituent parts
This is part of a broader change to make header files better represent the
dependencies between one another (see
https://github.com/jemalloc/jemalloc/issues/533). It breaks up component headers
into smaller parts that can be made to have a simpler dependency graph.

For the autogenerated headers (smoothstep.h and size_classes.h), no splitting
was necessary, so I didn't add support to emit multiple headers.
2017-01-12 15:43:51 -08:00
David Goldblatt
94c5d22a4d Remove mb.h, which is unused 2017-01-11 13:24:30 -08:00
John Paul Adrian Glaubitz
77de5f27d8 Use better pre-processor defines for sparc64
Currently, jemalloc detects sparc64 targets by checking whether
__sparc64__ is defined. However, this definition is used on BSD
targets only. Linux targets define both __sparc__ and __arch64__
for sparc64. Since this also works on BSD, rather use __sparc__
and __arch64__ instead of __sparc64__ to detect sparc64 targets.
2017-01-10 17:39:54 -08:00
Jason Evans
edf1bafb2b Implement arena.<i>.destroy .
Add MALLCTL_ARENAS_DESTROYED for accessing destroyed arena stats as an
analogue to MALLCTL_ARENAS_ALL.

This resolves #382.
2017-01-06 18:58:46 -08:00