Commit Graph

169 Commits

Author SHA1 Message Date
Qi Wang
e422fa8e7e Add arena.i.retain_grow_limit
This option controls the max size when grow_retained.  This is useful when we
have customized extent hooks reserving physical memory (e.g. 1G huge pages).
Without this feature, the default increasing sequence could result in fragmented
and wasted physical memory.
2017-11-03 13:53:33 -07:00
Qi Wang
e55c3ca267 Add stats for metadata_thp.
Report number of THPs used in arena and aggregated stats.
2017-08-30 16:47:32 -07:00
Qi Wang
47b20bb654 Change opt.metadata_thp to [disabled,auto,always].
To avoid the high RSS caused by THP + low usage arena (i.e. THP becomes a
significant percentage), added a new "auto" option which will only start using
THP after a base allocator used up the first THP region.  Starting from the
second hugepage (in a single arena), "auto" behaves the same as "always",
i.e. madvise hugepage right away.
2017-08-30 16:47:32 -07:00
Qi Wang
8fdd9a5797 Implement opt.metadata_thp
This option enables transparent huge page for base allocators (require
MADV_HUGEPAGE support).
2017-08-11 14:51:20 -07:00
Qi Wang
57beeb2fcb Switch ctl to explicitly use tsd instead of tsdn. 2017-06-23 13:27:53 -07:00
Jason Evans
d49ac4c709 Fix assertion typos.
Reported by Conrad Meyer.
2017-06-23 11:48:00 -07:00
Qi Wang
ae93fb08e2 Pass tsd to tcache_flush(). 2017-06-15 17:55:53 -07:00
Qi Wang
a4d6fe73cf Only abort on dlsym when necessary.
If neither background_thread nor lazy_lock is in use, do not abort on dlsym
errors.
2017-06-14 13:27:41 -07:00
Qi Wang
394df9519d Combine background_thread started / paused into state. 2017-06-12 08:56:14 -07:00
Qi Wang
464cb60490 Move background thread creation to background_thread_0.
To avoid complications, avoid invoking pthread_create "internally", instead rely
on thread0 to launch new threads, and also terminating threads when asked.
2017-06-12 08:56:14 -07:00
Qi Wang
73713fbb27 Drop high rank locks when creating threads.
Avoid holding arenas_lock and background_thread_lock when creating background
threads, because pthread_create may take internal locks, and potentially cause
deadlock with jemalloc internal locks.
2017-06-08 10:02:18 -07:00
Qi Wang
3a813946fb Take background thread lock when setting extent hooks. 2017-06-05 10:56:25 -07:00
Qi Wang
340071f0cf Set isthreaded when enabling background_thread. 2017-06-01 17:34:49 -07:00
Jason Evans
b511232fcd Refactor/fix background_thread/percpu_arena bootstrapping.
Refactor bootstrapping such that dlsym() is called during the
bootstrapping phase that can tolerate reentrant allocation.
2017-06-01 08:55:27 -07:00
David Goldblatt
8261e581be Header refactoring: Pull size helpers out of jemalloc module. 2017-05-31 13:08:45 -07:00
David Goldblatt
98774e64a4 Header refactoring: unify and de-catchall extent_mmap module. 2017-05-31 13:08:45 -07:00
David Goldblatt
93284bb53d Header refactoring: unify and de-catchall extent_dss. 2017-05-31 13:08:45 -07:00
Jason Evans
c606a87d2a Add the --disable-thp option to support cross compiling.
This resolves #669.
2017-05-30 11:30:54 -07:00
Qi Wang
d5ef5ae934 Add opt.stats_print_opts.
The value is passed to atexit(3)-triggered malloc_stats_print() calls.
2017-05-29 11:54:00 -07:00
Qi Wang
b86d271cbf Added opt_abort_conf: abort on invalid config options. 2017-05-26 21:14:28 -07:00
David Goldblatt
18ecbfa89e Header refactoring: unify and de-catchall mutex module 2017-05-24 15:27:30 -07:00
Qi Wang
5f5ed2198e Add profiling for the background thread mutex. 2017-05-23 12:26:20 -07:00
Qi Wang
2bee0c6251 Add background thread related stats. 2017-05-23 12:26:20 -07:00
Qi Wang
b693c7868e Implementing opt.background_thread.
Added opt.background_thread to enable background threads, which handles purging
currently.  When enabled, decay ticks will not trigger purging (which will be
left to the background threads).  We limit the max number of threads to NCPUs.
When percpu arena is enabled, set CPU affinity for the background threads as
well.

The sleep interval of background threads is dynamic and determined by computing
number of pages to purge in the future (based on backlog).
2017-05-23 12:26:20 -07:00
David Goldblatt
26c792e61a Allow mutexes to take a lock ordering enum at construction.
This lets us specify whether and how mutexes of the same rank are allowed to be
acquired.  Currently, we only allow two polices (only a single mutex at a given
rank at a time, and mutexes acquired in ascending order), but we can plausibly
allow more (e.g. the "release uncontended mutexes before blocking").
2017-05-19 14:21:27 -07:00
Jason Evans
6e62c62862 Refactor *decay_time into *decay_ms.
Support millisecond resolution for decay times.  Among other use cases
this makes it possible to specify a short initial dirty-->muzzy decay
phase, followed by a longer muzzy-->clean decay phase.

This resolves #812.
2017-05-18 11:33:45 -07:00
Qi Wang
baf3e294e0 Add stats: arena uptime. 2017-05-18 10:04:28 -07:00
Jason Evans
b9ab04a191 Refactor !opt.munmap to opt.retain. 2017-04-29 09:24:12 -07:00
David Goldblatt
89e2d3c12b Header refactoring: ctl - unify and remove from catchall.
In order to do this, we introduce the mutex_prof module, which breaks a circular
dependency between ctl and prof.
2017-04-25 09:51:38 -07:00
Jason Evans
c67c3e4a63 Replace --disable-munmap with opt.munmap.
Control use of munmap(2) via a run-time option rather than a
compile-time option (with the same per platform default).  The old
behavior of --disable-munmap can be achieved with
--with-malloc-conf=munmap:false.

This partially resolves #580.
2017-04-24 20:37:16 -07:00
David Goldblatt
31b43219db Header refactoring: size_classes module - remove from the catchall 2017-04-24 10:33:21 -07:00
David Goldblatt
4d2e4bf5eb Get rid of most of the various inline macros. 2017-04-24 10:33:21 -07:00
Jason Evans
b2a8453a3f Remove --disable-tls.
This option is no longer useful, because TLS is correctly configured
automatically on all supported platforms.

This partially resolves #580.
2017-04-21 11:12:29 -07:00
Jason Evans
4403c9ab44 Remove --disable-tcache.
Simplify configuration by removing the --disable-tcache option, but
replace the testing for that configuration with
--with-malloc-conf=tcache:false.

Fix the thread.arena and thread.tcache.flush mallctls to work correctly
if tcache is disabled.

This partially resolves #580.
2017-04-21 10:06:12 -07:00
Qi Wang
5aa46f027d Bypass extent tracking for auto arenas.
Tracking extents is required by arena_reset.  To support this, the extent
linkage was used for tracking 1) large allocations, and 2) full slabs.  However
modifying the extent linkage could be an expensive operation as it likely incurs
cache misses.  Since we forbid arena_reset on auto arenas, let's bypass the
linkage operations for auto arenas.
2017-04-21 00:29:18 -07:00
David Goldblatt
418d96a86c Header refactoring: unify nstime.h and move it out of the catch-all 2017-04-18 18:35:03 -07:00
David Goldblatt
d9ec36e22d Header refactoring: move assert.h out of the catch-all 2017-04-18 18:35:03 -07:00
David Goldblatt
f692e6c214 Header refactoring: move util.h out of the catchall 2017-04-18 18:35:03 -07:00
Jason Evans
881fbf762f Prefer old/low extent_t structures during reuse.
Rather than using a LIFO queue to track available extent_t structures,
use a red-black tree, and always choose the oldest/lowest available
during reuse.
2017-04-17 14:47:45 -07:00
David Goldblatt
743d940dc3 Header refactoring: Split up jemalloc_internal.h
This is a biggy.  jemalloc_internal.h has been doing multiple jobs for a while
now:
- The source of system-wide definitions.
- The catch-all include file.
- The module header file for jemalloc.c

This commit splits up this functionality.  The system-wide definitions
responsibility has moved to jemalloc_preamble.h.  The catch-all include file is
now jemalloc_internal_includes.h.  The module headers for jemalloc.c are now in
jemalloc_internal_[externs|inlines|types].h, just as they are for the other
modules.
2017-04-11 11:52:30 -07:00
Qi Wang
fde3e20cc0 Integrate auto tcache into TSD.
The embedded tcache is initialized upon tsd initialization.  The avail arrays
for the tbins will be allocated / deallocated accordingly during init / cleanup.

With this change, the pointer to the auto tcache will always be available, as
long as we have access to the TSD.  tcache_available() (called in tcache_get())
is provided to check if we should use tcache.
2017-04-07 09:55:14 -07:00
Qi Wang
362e356675 Profile per arena base mutex, instead of just a0. 2017-03-23 00:03:28 -07:00
Qi Wang
d3fde1c124 Refactor mutex profiling code with x-macros. 2017-03-23 00:03:28 -07:00
Qi Wang
f6698ec1e6 Switch to nstime_t for the time related fields in mutex profiling. 2017-03-23 00:03:28 -07:00
Qi Wang
20b8c70e9f Added extents_dirty / _muzzy mutexes, as well as decay_dirty / _muzzy. 2017-03-23 00:03:28 -07:00
Qi Wang
64c5f5c174 Added "stats.mutexes.reset" mallctl to reset all mutex stats.
Also switched from the term "lock" to "mutex".
2017-03-23 00:03:28 -07:00
Qi Wang
bd2006a41b Added JSON output for lock stats.
Also added option 'x' to malloc_stats() to bypass lock section.
2017-03-23 00:03:28 -07:00
Qi Wang
ca9074deff Added lock profiling and output for global locks (ctl, prof and base). 2017-03-23 00:03:28 -07:00
Qi Wang
0fb5c0e853 Add arena lock stats output. 2017-03-23 00:03:28 -07:00
Qi Wang
a4f176af57 Output bin lock profiling results to malloc_stats.
Two counters are included for the small bins: lock contention rate, and
max lock waiting time.
2017-03-23 00:03:28 -07:00