Commit Graph

685 Commits

Author SHA1 Message Date
Mike Hommey
0f7376eb62 Don't rely on OSX SDK malloc/malloc.h for malloc_zone struct definitions
The SDK jemalloc is built against might be not be the latest for various
reasons, but the resulting binary ought to work on newer versions of
OSX.

In order to ensure this, we need the fullest definitions possible, so
copy what we need from the latest version of malloc/malloc.h available
on opensource.apple.com.
2017-01-17 20:13:28 -08:00
Jason Evans
1ff09534b5 Fix prof_realloc() regression.
Mostly revert the prof_realloc() changes in
498856f44a (Move slabs out of chunks.) so
that prof_free_sampled_object() is called when appropriate.  Leave the
prof_tctx_[re]set() optimization in place, but add an assertion to
verify that all eight cases are correctly handled.  Add a comment to
make clear the code ordering, so that the regression originally fixed by
ea8d97b897 (Fix
prof_{malloc,free}_sample_object() call order in prof_realloc().) is not
repeated.

This resolves #499.
2017-01-17 15:16:37 -08:00
Jason Evans
de5e1aff2a Formatting/comment fixes. 2017-01-17 15:16:37 -08:00
Jason Evans
8115f05b26 Add nullptr support to sized delete operators. 2017-01-17 14:30:15 -08:00
Jason Evans
41aa41853c Fix style nits. 2017-01-17 14:30:15 -08:00
Qi Wang
e8990dc7c7 Remove redundent stats-merging logic when destroying tcache.
The removed stats merging logic is already taken care of by tcache_flush.
2017-01-17 09:42:39 -08:00
Jason Evans
ffbb7dac3d Remove leading blank lines from function bodies.
This resolves #535.
2017-01-13 14:49:24 -08:00
Jason Evans
87e81e609b Fix indentation. 2017-01-13 14:49:24 -08:00
Jason Evans
edf1bafb2b Implement arena.<i>.destroy .
Add MALLCTL_ARENAS_DESTROYED for accessing destroyed arena stats as an
analogue to MALLCTL_ARENAS_ALL.

This resolves #382.
2017-01-06 18:58:46 -08:00
Jason Evans
dc2125cf95 Replace the arenas.initialized mallctl with arena.<i>.initialized . 2017-01-06 18:58:46 -08:00
Jason Evans
6edbedd916 Range-check mib[1] --> arena_ind casts. 2017-01-06 18:58:46 -08:00
Jason Evans
c0a05e6aba Move static ctl_epoch variable into ctl_stats_t (as epoch). 2017-01-06 18:58:45 -08:00
Jason Evans
d778dd2afc Refactor ctl_stats_t.
Refactor ctl_stats_t to be a demand-zeroed non-growing data structure.
To keep the size from being onerous (~60 MiB) on 32-bit systems, convert
the arenas field to contain pointers rather than directly embedded
ctl_arena_stats_t elements.
2017-01-06 18:58:45 -08:00
Jason Evans
0f04bb1d6f Rename the arenas.extend mallctl to arenas.create. 2017-01-06 18:58:45 -08:00
Jason Evans
3dc4e83ccb Add MALLCTL_ARENAS_ALL.
Add the MALLCTL_ARENAS_ALL cpp macro as a fixed index for use
in accessing the arena.<i>.{purge,decay,dss} and stats.arenas.<i>.*
mallctls, and deprecate access via the arenas.narenas index (to be
removed in 6.0.0).
2017-01-06 18:58:45 -08:00
Jason Evans
d0a3129b88 Fix locking in arena_dirty_count().
This was a latent bug, since the function is (intentionally) not used.
2017-01-06 18:58:45 -08:00
Jason Evans
363629df88 Fix allocated_large stats with respect to sampled small allocations. 2017-01-06 18:58:45 -08:00
Jason Evans
5c5ff8d121 Fix arena_large_reset_stats_cancel().
Decrement ndalloc_large rather than incrementing, in order to cancel out
the increment in arena_large_dalloc_stats_update().
2017-01-04 20:26:30 -08:00
Jason Evans
a0dd3a4483 Implement per arena base allocators.
Add/rename related mallctls:
- Add stats.arenas.<i>.base .
- Rename stats.arenas.<i>.metadata to stats.arenas.<i>.internal .
- Add stats.arenas.<i>.resident .

Modify the arenas.extend mallctl to take an optional (extent_hooks_t *)
argument so that it is possible for all base allocations to be serviced
by the specified extent hooks.

This resolves #463.
2016-12-26 18:08:28 -08:00
Jason Evans
a6e86810d8 Refactor purging and splitting/merging.
Split purging into lazy and forced variants.  Use the forced variant for
zeroing dss.

Add support for NULL function pointers as an opt-out mechanism for the
dalloc, commit, decommit, purge_lazy, purge_forced, split, and merge
fields of extent_hooks_t.

Add short-circuiting checks in large_ralloc_no_move_{shrink,expand}() so
that no attempt is made if splitting/merging is not supported.

This resolves #268.
2016-12-26 18:08:16 -08:00
Jason Evans
884fa22b8c Rename arena_decay_t's ndirty to nunpurged. 2016-12-26 17:59:43 -08:00
Jason Evans
411697adcd Use exponential series to size extents.
If virtual memory is retained, allocate extents such that their sizes
form an exponentially growing series.  This limits the number of
disjoint virtual memory ranges so that extent merging can be effective
even if multiple arenas' extent allocation requests are highly
interleaved.

This resolves #462.
2016-12-26 17:59:42 -08:00
Jason Evans
c1baa0a9b7 Add huge page configuration and pages_[no}huge().
Add the --with-lg-hugepage configure option, but automatically configure
LG_HUGEPAGE even if it isn't specified.

Add the pages_[no]huge() functions, which toggle huge page state via
madvise(..., MADV_[NO]HUGEPAGE) calls.
2016-12-26 17:59:34 -08:00
Jason Evans
eab3b180e5 Fix JSON-mode output for !config_stats and/or !config_prof cases.
These bugs were introduced by 0ba5b9b618
(Add "J" (JSON) support to malloc_stats_print().), which was backported
as b599b32280 (with the same bugs except
the inapplicable "metatata" misspelling) and first released in 4.3.0.
2016-12-23 11:15:44 -08:00
Jason Evans
bacb6afc6c Simplify arena_slab_regind().
Rewrite arena_slab_regind() to provide sufficient constant data for
the compiler to perform division strength reduction.  This replaces
more general manual strength reduction that was implemented before
arena_bin_info was compile-time-constant.  It would be possible to
slightly improve on the compiler-generated division code by taking
advantage of range limits that the compiler doesn't know about.
2016-12-23 10:34:34 -08:00
Dave Watson
2319152d9f jemalloc cpp new/delete bindings
Adds cpp bindings for jemalloc, along with necessary autoconf settings.
This is mostly to add sized deallocation support, which can't be added
from C directly.  Sized deallocation is ~10% microbench improvement.

* Import ax_cxx_compile_stdcxx.m4 from the autoconf repo, seems like the
  easiest way to get c++14 detection.
* Adds various other changes, like CXXFLAGS, to configure.ac.
* Adds new rules to Makefile.in for src/jemalloc-cpp.cpp, and a basic
  unittest.
* Both new and delete are overridden, to ensure jemalloc is used for
  both.
* TODO future enhancement of avoiding extra PLT thunks for new and
  delete - sdallocx and malloc are publicly exported jemalloc symbols,
  using an alias would link them directly.  Unfortunately, was having
  trouble getting it to play nice with jemalloc's namespace support.

Testing:
Tested gcc 4.8, gcc 5, gcc 5.2, clang 4.0.  Only gcc >= 5 has sized
deallocation support, verified that the rest build correctly.

Tested mac osx and Centos.

Tested --with-jemalloc-prefix and --without-export.

This resolves #202.
2016-12-12 18:36:06 -08:00
Jason Evans
acb7b1f53e Add --disable-syscall.
This resolves #517.
2016-12-03 16:50:58 -08:00
Jason Evans
5234be2133 Add pthread_atfork(3) feature test.
Some versions of Android provide a pthreads library without providing
pthread_atfork(), so in practice a separate feature test is necessary
for the latter.
2016-11-17 15:14:57 -08:00
Jason Evans
a64123ce13 Refactor madvise(2) configuration.
Add feature tests for the MADV_FREE and MADV_DONTNEED flags to
madvise(2), so that MADV_FREE is detected and used for Linux kernel
versions 4.5 and newer.  Refactor pages_purge() so that on systems which
support both flags, MADV_FREE is preferred over MADV_DONTNEED.

This resolves #387.
2016-11-17 10:31:57 -08:00
Jason Evans
aec5a051e8 Avoid gcc type-limits warnings. 2016-11-16 18:28:38 -08:00
Maks Naumov
95974c0440 Remove size_t -> unsigned -> size_t conversion. 2016-11-16 11:23:31 -08:00
Jason Evans
8a4528bdd1 Uniformly cast mallctl[bymib]() oldp/newp arguments to (void *).
This avoids warnings in some cases, and is otherwise generally good
hygiene.
2016-11-15 15:01:03 -08:00
Jason Evans
a38acf716e Add extent serial numbers.
Add extent serial numbers and use them where appropriate as a sort key
that is higher priority than address, so that the allocation policy
prefers older extents.

This resolves #147.
2016-11-15 13:08:33 -08:00
Jason Evans
c0a667112c Fix arena_reset() crashing bug.
This regression was caused by 498856f44a
(Move slabs out of chunks.).
2016-11-15 10:34:02 -08:00
Jason Evans
cda59f9970 Rename atomic_*_{uint32,uint64,u}() to atomic_*_{u32,u64,zu}().
This change conforms to naming conventions throughout the codebase.
2016-11-07 11:27:48 -08:00
Jason Evans
04b463546e Refactor prng to not use 64-bit atomics on 32-bit platforms.
This resolves #495.
2016-11-07 10:52:44 -08:00
Jason Evans
a967fae362 Fix/simplify extent_recycle() allocation size computations.
Do not call s2u() during alloc_size computation, since any necessary
ceiling increase is taken care of later by extent_first_best_fit() -->
extent_size_quantize_ceil(), and the s2u() call may erroneously cause a
higher quantization result.

Remove an overly strict overflow check that was added in
4a7852137d (Fix extent_recycle()'s
cache-oblivious padding support.).
2016-11-03 23:49:21 -07:00
Jason Evans
4a7852137d Fix extent_recycle()'s cache-oblivious padding support.
Add padding *after* computing the size class, so that the optimal size
class isn't skipped during search for a usable extent.  This regression
was caused by b46261d58b (Implement
cache-oblivious support for huge size classes.).
2016-11-03 22:33:35 -07:00
Jason Evans
ea9961acdb Fix psz/pind edge cases.
Add an "over-size" extent heap in which to store extents which exceed
the maximum size class (plus cache-oblivious padding, if enabled).
Remove psz2ind_clamp() and use psz2ind() instead so that trying to
allocate the maximum size class can in principle succeed.  In practice,
this allows assertions to hold so that OOM errors can be successfully
generated.
2016-11-03 22:33:34 -07:00
Jason Evans
8dd5ea87ca Fix extent_alloc_cache[_locked]() to support decommitted allocation.
Fix extent_alloc_cache[_locked]() to support decommitted allocation, and
use this ability in arena_stash_dirty(), so that decommitted extents are
not needlessly committed during purging.  In practice this does not
happen on any currently supported systems, because both extent merging
and decommit must be implemented; all supported systems implement one
xor the other.
2016-11-03 22:33:23 -07:00
Dave Watson
25f7bbcf28 Fix long spinning in rtree_node_init
rtree_node_init spinlocks the node, allocates, and then sets the node.
This is under heavy contention at the top of the tree if many threads
start to allocate at the same time.

Instead, take a per-rtree sleeping mutex to reduce spinning.  Tested
both pthreads and osx OSSpinLock, and both reduce spinning adequately

Previous benchmark time:
./ttest1 500 100
~15s

New benchmark time:
./ttest1 500 100
.57s
2016-11-02 20:30:53 -07:00
Dave Watson
712fde79fd Check for existance of CPU_COUNT macro before using it.
This resolves #485.
2016-11-02 20:05:40 -07:00
Jason Evans
d82f2b3473 Do not use syscall(2) on OS X 10.12 (deprecated). 2016-11-02 19:18:33 -07:00
Jason Evans
795f6689de Add os_unfair_lock support.
OS X 10.12 deprecated OSSpinLock; os_unfair_lock is the recommended
replacement.
2016-11-02 18:09:45 -07:00
Jason Evans
d9f7b2a430 Fix/refactor zone allocator integration code.
Fix zone_force_unlock() to reinitialize, rather than unlocking mutexes,
since OS X 10.12 cannot tolerate a child unlocking mutexes that were
locked by its parent.

Refactor; this was a side effect of experimenting with zone
{de,re}registration during fork(2).
2016-11-02 18:06:40 -07:00
Jason Evans
7b0a8b74f0 malloc_stats_print() fixes/cleanups.
Fix and clean up various malloc_stats_print() issues caused by
0ba5b9b618 (Add "J" (JSON) support to
malloc_stats_print().).
2016-11-01 15:26:35 -07:00
Jason Evans
0ba5b9b618 Add "J" (JSON) support to malloc_stats_print().
This resolves #474.
2016-10-31 22:30:49 -07:00
Jason Evans
b93f63b3eb Fix extent_rtree acquire() to release element on error.
This resolves #480.
2016-10-31 16:32:33 -07:00
Jason Evans
6c80321aed Use CLOCK_MONOTONIC_COARSE rather than COARSE_MONOTONIC_RAW.
The raw clock variant is slow (even relative to plain CLOCK_MONOTONIC),
whereas the coarse clock variant is faster than CLOCK_MONOTONIC, but
still has resolution (~1ms) that is adequate for our purposes.

This resolves #479.
2016-10-29 22:58:18 -07:00
Jason Evans
d87037a62c Use syscall(2) rather than {open,read,close}(2) during boot.
Some applications wrap various system calls, and if they call the
allocator in their wrappers, unexpected reentry can result.  This is not
a general solution (many other syscalls are spread throughout the code),
but this resolves a bootstrapping issue that is apparently common.

This resolves #443.
2016-10-29 22:41:04 -07:00