Commit Graph

85 Commits

Author SHA1 Message Date
Qi Wang
fb56766ca9 Eagerly purge oversized merged extents.
This change improves memory usage slightly, at virtually no CPU cost.
2019-03-14 17:34:55 -07:00
Qi Wang
e3db480f6f Rename huge_threshold to oversize_threshold.
The keyword huge tend to remind people of huge pages which is not relevent to
the feature.
2019-01-25 13:15:45 -08:00
Qi Wang
f459454afe Avoid potential issues on extent zero-out.
When custom extent_hooks or transparent huge pages are in use, the purging
semantics may change, which means we may not get zeroed pages on repopulating.
Fixing the issue by manually memset for such cases.
2019-01-11 19:16:12 -08:00
Dave Watson
ac34afb403 drop bump_empty_alloc option. Size class lookup support used instead. 2018-10-17 08:50:58 -07:00
Tyler Etzel
b664bd7935 Add logging for sampled allocations
- prof_opt_log flag starts logging automatically at runtime
- prof_log_{start,stop} mallctl for manual control
2018-08-01 13:27:11 -07:00
David Goldblatt
55e5cc1341 SC: Make some key size classes static.
The largest small class, smallest large class, and largest large class may all
be needed down fast paths; to avoid the risk of touching another cache line, we
can make them available as constants.
2018-07-12 20:53:06 -07:00
David Goldblatt
e904f813b4 Hide size class computation behind a layer of indirection.
This class removes almost all the dependencies on size_classes.h, accessing the
data there only via the new module sc.h, which does not depend on any
configuration options.

In a subsequent commit, we'll remove the configure-time size class computations,
doing them at boot time, instead.
2018-07-12 20:53:06 -07:00
gnzlbg
3d29d11ac2 Clean compilation -Wextra
Before this commit jemalloc produced many warnings when compiled with -Wextra
with both Clang and GCC. This commit fixes the issues raised by these warnings
or suppresses them if they were spurious at least for the Clang and GCC
versions covered by CI.

This commit:

* adds `JEMALLOC_DIAGNOSTIC` macros: `JEMALLOC_DIAGNOSTIC_{PUSH,POP}` are
  used to modify the stack of enabled diagnostics. The
  `JEMALLOC_DIAGNOSTIC_IGNORE_...` macros are used to ignore a concrete
  diagnostic.

* adds `JEMALLOC_FALLTHROUGH` macro to explicitly state that falling
  through `case` labels in a `switch` statement is intended

* Removes all UNUSED annotations on function parameters. The warning
  -Wunused-parameter is now disabled globally in
  `jemalloc_internal_macros.h` for all translation units that include
  that header. It is never re-enabled since that header cannot be
  included by users.

* locally suppresses some -Wextra diagnostics:

  * `-Wmissing-field-initializer` is buggy in older Clang and GCC versions,
    where it does not understanding that, in C, `= {0}` is a common C idiom
    to initialize a struct to zero

  * `-Wtype-bounds` is suppressed in a particular situation where a generic
    macro, used in multiple different places, compares an unsigned integer for
    smaller than zero, which is always true.

  * `-Walloc-larger-than-size=` diagnostics warn when an allocation function is
    called with a size that is too large (out-of-range). These are suppressed in
    the parts of the tests where `jemalloc` explicitly does this to test that the
    allocation functions fail properly.

* adds a new CI build bot that runs the log unit test on CI.

Closes #1196 .
2018-07-09 21:40:42 -07:00
Qi Wang
94a88c26f4 Implement huge arena: opt.huge_threshold.
The feature allows using a dedicated arena for huge allocations.  We want the
addtional arena to separate huge allocation because: 1) mixing small extents
with huge ones causes fragmentation over the long run (this feature reduces VM
size significantly); 2) with many arenas, huge extents rarely get reused across
threads; and 3) huge allocations happen way less frequently, therefore no
concerns for lock contention.
2018-06-29 10:35:02 -07:00
Qi Wang
0fadf4a2e3 Add UNUSED to avoid compiler warnings. 2018-04-16 13:50:21 -07:00
David T. Goldblatt
4bf4a1c4ea Pull out arena_bin_info_t and arena_bin_t into their own file.
In the process, kill arena_bin_index, which is unused.  To follow are several
diffs continuing this separation.
2017-12-18 16:29:10 -08:00
David Goldblatt
8261e581be Header refactoring: Pull size helpers out of jemalloc module. 2017-05-31 13:08:45 -07:00
David Goldblatt
44f9bd147a Header refactoring: unify and de-catchall rtree module. 2017-05-31 13:08:45 -07:00
David Goldblatt
18ecbfa89e Header refactoring: unify and de-catchall mutex module 2017-05-24 15:27:30 -07:00
Qi Wang
b693c7868e Implementing opt.background_thread.
Added opt.background_thread to enable background threads, which handles purging
currently.  When enabled, decay ticks will not trigger purging (which will be
left to the background threads).  We limit the max number of threads to NCPUs.
When percpu arena is enabled, set CPU affinity for the background threads as
well.

The sleep interval of background threads is dynamic and determined by computing
number of pages to purge in the future (based on backlog).
2017-05-23 12:26:20 -07:00
David Goldblatt
31b43219db Header refactoring: size_classes module - remove from the catchall 2017-04-24 10:33:21 -07:00
David Goldblatt
bf2dc7e678 Header refactoring: ticker module - remove from the catchall and unify. 2017-04-24 10:33:21 -07:00
David Goldblatt
4d2e4bf5eb Get rid of most of the various inline macros. 2017-04-24 10:33:21 -07:00
David Goldblatt
7ebc83894f Header refactoring: move jemalloc_internal_types.h out of the catch-all 2017-04-18 18:35:03 -07:00
Qi Wang
ccfe68a916 Pass alloc_ctx down profiling path.
With this change, when profiling is enabled, we avoid doing redundant rtree
lookups. Also changed dalloc_atx_t to alloc_atx_t, as it's now used on
allocation path as well (to speed up profiling).
2017-04-12 13:55:39 -07:00
Qi Wang
f35213bae4 Pass dalloc_ctx down the sdalloc path.
This avoids redundant rtree lookups.
2017-04-12 13:55:39 -07:00
Qi Wang
bfa530b75b Pass dealloc_ctx down free() fast path.
This gets rid of the redundent rtree lookup down fast path.
2017-04-11 09:58:12 -07:00
Jason Evans
32e7cf51cd Further specialize arena_[s]dalloc() tcache fast path.
Use tsd_rtree_ctx() rather than tsdn_rtree_ctx() when tcache is
non-NULL, in order to avoid an extra branch (and potentially extra stack
space) in the fast path.
2017-03-22 18:33:32 -07:00
Jason Evans
5e67fbc367 Push down iealloc() calls.
Call iealloc() as deep into call chains as possible without causing
redundant calls.
2017-03-22 18:33:32 -07:00
Jason Evans
51a2ec92a1 Remove extent dereferences from the deallocation fast paths. 2017-03-22 18:33:32 -07:00
Jason Evans
4f341412e5 Remove extent arg from isalloc() and arena_salloc(). 2017-03-22 18:33:32 -07:00
Jason Evans
99d68445ef Incorporate szind/slab into rtree leaves.
Expand and restructure the rtree API such that all common operations can
be achieved with minimal work, regardless of whether the rtree leaf
fields are independent versus packed into a single atomic pointer.
2017-03-22 18:33:32 -07:00
Jason Evans
f50d6009fe Remove binind field from arena_slab_data_t.
binind is now redundant; the containing extent_t's szind field always
provides the same value.
2017-03-22 18:33:32 -07:00
Jason Evans
e8921cf2eb Convert extent_t's usize to szind.
Rather than storing usize only for large (and prof-promoted)
allocations, store the size class index for allocations that reside
within the extent, such that the size class index is valid for all
extents that contain extant allocations, and invalid otherwise (mainly
to make debugging simpler).
2017-03-22 18:33:32 -07:00
Jason Evans
64e458f5cd Implement two-phase decay-based purging.
Split decay-based purging into two phases, the first of which uses lazy
purging to convert dirty pages to "muzzy", and the second of which uses
forced purging, decommit, or unmapping to convert pages to clean or
destroy them altogether.  Not all operating systems support lazy
purging, yet the application may provide extent hooks that implement
lazy purging, so care must be taken to dynamically omit the first phase
when necessary.

The mallctl interfaces change as follows:
- opt.decay_time --> opt.{dirty,muzzy}_decay_time
- arena.<i>.decay_time --> arena.<i>.{dirty,muzzy}_decay_time
- arenas.decay_time --> arenas.{dirty,muzzy}_decay_time
- stats.arenas.<i>.pdirty --> stats.arenas.<i>.p{dirty,muzzy}
- stats.arenas.<i>.{npurge,nmadvise,purged} -->
  stats.arenas.<i>.{dirty,muzzy}_{npurge,nmadvise,purged}

This resolves #521.
2017-03-15 13:13:47 -07:00
Jason Evans
f8fee6908d Synchronize arena->decay with arena->decay.mtx.
This removes the last use of arena->lock.
2017-02-16 09:39:46 -08:00
Jason Evans
f408643a4c Remove extraneous parens around return arguments.
This resolves #540.
2017-01-20 21:43:07 -08:00
Jason Evans
c4c2592c83 Update brace style.
Add braces around single-line blocks, and remove line breaks before
function-opening braces.

This resolves #537.
2017-01-20 21:43:07 -08:00
Jason Evans
ffbb7dac3d Remove leading blank lines from function bodies.
This resolves #535.
2017-01-13 14:49:24 -08:00
David Goldblatt
77cccac8cd Break up headers into constituent parts
This is part of a broader change to make header files better represent the
dependencies between one another (see
https://github.com/jemalloc/jemalloc/issues/533). It breaks up component headers
into smaller parts that can be made to have a simpler dependency graph.

For the autogenerated headers (smoothstep.h and size_classes.h), no splitting
was necessary, so I didn't add support to emit multiple headers.
2017-01-12 15:43:51 -08:00