Commit Graph

752 Commits

Author SHA1 Message Date
Qi Wang
cb032781bd Add extent_grow_mtx in pre_ / post_fork handlers.
This fixed the issue that could cause the child process to stuck after fork.
2017-06-29 17:01:18 -07:00
Qi Wang
57beeb2fcb Switch ctl to explicitly use tsd instead of tsdn. 2017-06-23 13:27:53 -07:00
Qi Wang
425463a446 Check arena in current context in pre_reentrancy. 2017-06-23 13:27:53 -07:00
Qi Wang
d6eb8ac8f3 Set reentrancy when invoking customized extent hooks.
Customized extent hooks may malloc / free thus trigger reentry.  Support this
behavior by adding reentrancy on hook functions.
2017-06-23 13:27:53 -07:00
Qi Wang
a3f4977217 Add thread name for background threads. 2017-06-23 10:54:54 -07:00
Qi Wang
52fc887b49 Avoid inactivity_check within background threads.
Passing is_background_thread down the decay path, so that background thread
itself won't attempt inactivity_check.  This fixes an issue with background
thread doing trylock on a mutex it already owns.
2017-06-22 16:53:58 -07:00
Jason Evans
37f3fa0941 Mask signals during background thread creation.
This prevents signals from being inadvertently delivered to background
threads.
2017-06-20 17:47:38 -07:00
Qi Wang
9b1befabbb Add minimal initialized TSD.
We use the minimal_initilized tsd (which requires no cleanup) for free()
specifically, if tsd hasn't been initialized yet.

Any other activity will transit the state from minimal to normal.  This is to
workaround the case where a thread has no malloc calls in its lifetime until
during thread termination, free() happens after tls destructors.
2017-06-15 17:55:53 -07:00
Qi Wang
ae93fb08e2 Pass tsd to tcache_flush(). 2017-06-15 17:55:53 -07:00
Qi Wang
a4d6fe73cf Only abort on dlsym when necessary.
If neither background_thread nor lazy_lock is in use, do not abort on dlsym
errors.
2017-06-14 13:27:41 -07:00
Qi Wang
394df9519d Combine background_thread started / paused into state. 2017-06-12 08:56:14 -07:00
Qi Wang
464cb60490 Move background thread creation to background_thread_0.
To avoid complications, avoid invoking pthread_create "internally", instead rely
on thread0 to launch new threads, and also terminating threads when asked.
2017-06-12 08:56:14 -07:00
Jason Evans
13685ab1b7 Normalize background thread configuration.
Also fix a compilation error #ifndef JEMALLOC_PTHREAD_CREATE_WRAPPER.
2017-06-08 23:01:26 -07:00
Jason Evans
faaf458bad Remove redundant typedefs.
Pre-C11 compilers do not support typedef redefinition.
2017-06-08 13:28:57 -07:00
Qi Wang
5642f03cae Add internal tsd for background_thread. 2017-06-08 10:02:18 -07:00
Qi Wang
73713fbb27 Drop high rank locks when creating threads.
Avoid holding arenas_lock and background_thread_lock when creating background
threads, because pthread_create may take internal locks, and potentially cause
deadlock with jemalloc internal locks.
2017-06-08 10:02:18 -07:00
Qi Wang
00869e39a3 Make tsd no-cleanup during tsd reincarnation.
Since tsd cleanup isn't guaranteed when reincarnated, we set up tsd in a way
that needs no cleanup, by making it going through slow path instead.
2017-06-07 11:03:49 -07:00
Qi Wang
3a813946fb Take background thread lock when setting extent hooks. 2017-06-05 10:56:25 -07:00
Qi Wang
340071f0cf Set isthreaded when enabling background_thread. 2017-06-01 17:34:49 -07:00
Jason Evans
b511232fcd Refactor/fix background_thread/percpu_arena bootstrapping.
Refactor bootstrapping such that dlsym() is called during the
bootstrapping phase that can tolerate reentrant allocation.
2017-06-01 08:55:27 -07:00
David Goldblatt
8261e581be Header refactoring: Pull size helpers out of jemalloc module. 2017-05-31 13:08:45 -07:00
David Goldblatt
041e041e1f Header refactoring: unify and de-catchall mutex_pool. 2017-05-31 13:08:45 -07:00
David Goldblatt
98774e64a4 Header refactoring: unify and de-catchall extent_mmap module. 2017-05-31 13:08:45 -07:00
David Goldblatt
93284bb53d Header refactoring: unify and de-catchall extent_dss. 2017-05-31 13:08:45 -07:00
David Goldblatt
44f9bd147a Header refactoring: unify and de-catchall rtree module. 2017-05-31 13:08:45 -07:00
Jason Evans
c606a87d2a Add the --disable-thp option to support cross compiling.
This resolves #669.
2017-05-30 11:30:54 -07:00
Jason Evans
168793a1c1 Fix extent_grow_next management.
Fix management of extent_grow_next to serialize operations that may grow
retained memory.  This assures that the sizes of the newly allocated
extents correspond to the size classes in the intended growth sequence.

Fix management of extent_grow_next to skip size classes if a request is
too large to be satisfied by the next size in the growth sequence.  This
avoids the potential for an arbitrary number of requests to bypass
triggering extent_grow_next increases.

This resolves #858.
2017-05-29 17:27:18 -07:00
Qi Wang
d5ef5ae934 Add opt.stats_print_opts.
The value is passed to atexit(3)-triggered malloc_stats_print() calls.
2017-05-29 11:54:00 -07:00
Qi Wang
b86d271cbf Added opt_abort_conf: abort on invalid config options. 2017-05-26 21:14:28 -07:00
Qi Wang
927239b910 Cleanup smoothstep.sh / .h.
h_step_sum was used to compute moving sum.  Not in use anymore.
2017-05-25 16:52:10 -07:00
David Goldblatt
18ecbfa89e Header refactoring: unify and de-catchall mutex module 2017-05-24 15:27:30 -07:00
David Goldblatt
9f822a1fd7 Header refactoring: unify and de-catchall witness code. 2017-05-24 15:27:30 -07:00
Jason Evans
36195c8f4d Disable percpu_arena by default. 2017-05-23 15:32:50 -07:00
Qi Wang
eeefdf3ce8 Fix # of unpurged pages in decay algorithm.
When # of dirty pages move below npages_limit (e.g. they are reused), we should
not lower number of unpurged pages because that would cause the reused pages to
be double counted in the backlog (as a result, decay happen slower than it
should).  Instead, set number of unpurged to the greater of current npages and
npages_limit.

Added an assertion: the ceiling # of pages should be greater than npages_limit.
2017-05-23 13:48:30 -07:00
Qi Wang
0eae838b0d Check for background thread inactivity on extents_dalloc.
To avoid background threads sleeping forever with idle arenas, we eagerly check
background threads' sleep time after extents_dalloc, and signal the thread if
necessary.
2017-05-23 12:26:20 -07:00
Qi Wang
5f5ed2198e Add profiling for the background thread mutex. 2017-05-23 12:26:20 -07:00
Qi Wang
2bee0c6251 Add background thread related stats. 2017-05-23 12:26:20 -07:00
Qi Wang
b693c7868e Implementing opt.background_thread.
Added opt.background_thread to enable background threads, which handles purging
currently.  When enabled, decay ticks will not trigger purging (which will be
left to the background threads).  We limit the max number of threads to NCPUs.
When percpu arena is enabled, set CPU affinity for the background threads as
well.

The sleep interval of background threads is dynamic and determined by computing
number of pages to purge in the future (based on backlog).
2017-05-23 12:26:20 -07:00
David Goldblatt
3f685e8824 Protect the rtree/extent interactions with a mutex pool.
Instead of embedding a lock bit in rtree leaf elements, we associate extents
with a small set of mutexes.  This gets us two things:

- We can use the system mutexes.  This (hypothetically) protects us from
  priority inversion, and lets us stop doing a backoff/sleep loop, instead
  opting for precise wakeups from the mutex.
- Cuts down on the number of mutex acquisitions we have to do (from 4 in the
  worst case to two).

We end up simplifying most of the rtree code (which no longer has to deal with
locking or concurrency at all), at the cost of additional complexity in the
extent code: since the mutex protecting the rtree leaf elements is determined by
reading the extent out of those elements, the initial read is racy, so that we
may acquire an out of date mutex.  We re-check the extent in the leaf after
acquiring the mutex to protect us from this race.
2017-05-19 14:21:27 -07:00
David Goldblatt
26c792e61a Allow mutexes to take a lock ordering enum at construction.
This lets us specify whether and how mutexes of the same rank are allowed to be
acquired.  Currently, we only allow two polices (only a single mutex at a given
rank at a time, and mutexes acquired in ascending order), but we can plausibly
allow more (e.g. the "release uncontended mutexes before blocking").
2017-05-19 14:21:27 -07:00
Jason Evans
6e62c62862 Refactor *decay_time into *decay_ms.
Support millisecond resolution for decay times.  Among other use cases
this makes it possible to specify a short initial dirty-->muzzy decay
phase, followed by a longer muzzy-->clean decay phase.

This resolves #812.
2017-05-18 11:33:45 -07:00
Qi Wang
baf3e294e0 Add stats: arena uptime. 2017-05-18 10:04:28 -07:00
Jason Evans
18a83681cf Refactor (MALLOCX_ARENA_MAX + 1) to be MALLOCX_ARENA_LIMIT.
This resolves #673.
2017-05-14 10:14:23 -07:00
Jason Evans
909f0482e4 Automatically generate private symbol name mangling macros.
Rather than using a manually maintained list of internal symbols to
drive name mangling, add a compilation phase to automatically extract
the list of internal symbols.

This resolves #677.
2017-05-11 23:06:54 -07:00
Jason Evans
a4ae9707da Remove unused private_unnamespace infrastructure. 2017-05-11 23:06:54 -07:00
Jason Evans
a268af5085 Stop depending on JEMALLOC_N() for function interception during testing.
Instead, always define function pointers for interceptable functions,
but mark them const unless testing, so that the compiler can optimize
out the pointer dereferences.
2017-05-11 23:06:54 -07:00
Jason Evans
81ef365622 Avoid compiler warnings on Windows. 2017-05-11 18:06:20 -07:00
Jason Evans
11d2f39d96 Remove mutex_prof_data_t redeclaration.
Redeclaration causes compilations failures with e.g. gcc 4.2.1 on
FreeBSD.  This regression was introduced by
89e2d3c12b (Header refactoring: ctl -
unify and remove from catchall.).
2017-05-11 10:49:43 -07:00
Jason Evans
0798fe6e70 Fix rtree_leaf_elm_szind_slab_update().
Re-read the leaf element when atomic CAS fails due to a race with
another thread that has locked the leaf element, since
atomic_compare_exchange_strong_p() overwrites the expected value with
the actual value on failure.  This regression was introduced by
0ee0e0c155 (Implement compact rtree leaf
element representation.).

This resolves #798.
2017-05-03 08:52:33 -07:00
Jason Evans
344dd342dd rtree_leaf_elm_extent_write() --> rtree_leaf_elm_extent_lock_write()
Refactor rtree_leaf_elm_extent_write() as
rtree_leaf_elm_extent_lock_write(), so that whether the leaf element is
currently acquired is separate from what lock state to write.  This
allows for a relaxed atomic read when releasing the lock.
2017-05-03 08:52:33 -07:00