Commit Graph

25 Commits

Author SHA1 Message Date
Qi Wang
37b8913925 Add support for sharded bins within an arena.
This makes it possible to have multiple set of bins in an arena, which improves
arena scalability because the bins (especially the small ones) are always the
limiting factor in production workload.

A bin shard is picked on allocation; each extent tracks the bin shard id for
deallocation.  The shard size will be determined using runtime options.
2018-12-03 17:17:03 -08:00
Dave Watson
b23336af96 mutex: fix trylock spin wait contention
If there are 3 or more threads spin-waiting on the same mutex,
there will be excessive exclusive cacheline contention because
pthread_trylock() immediately tries to CAS in a new value, instead
of first checking if the lock is locked.

This diff adds a 'locked' hint flag, and we will only spin wait
without trylock()ing while set.  I don't know of any other portable
way to get the same behavior as pthread_mutex_lock().

This is pretty easy to test via ttest, e.g.

./ttest1 500 3 10000 1 100

Throughput is nearly 3x as fast.

This blames to the mutex profiling changes, however, we almost never
have 3 or more threads contending in properly configured production
workloads, but still worth fixing.
2018-11-28 15:17:02 -08:00
Qi Wang
43f3b1ad0c Deprecate OSSpinLock. 2018-11-14 08:44:05 -08:00
David Carlier
0771ff2cea FreeBSD build changes and allow to run the tests. 2018-08-09 10:41:20 -07:00
gnzlbg
3d29d11ac2 Clean compilation -Wextra
Before this commit jemalloc produced many warnings when compiled with -Wextra
with both Clang and GCC. This commit fixes the issues raised by these warnings
or suppresses them if they were spurious at least for the Clang and GCC
versions covered by CI.

This commit:

* adds `JEMALLOC_DIAGNOSTIC` macros: `JEMALLOC_DIAGNOSTIC_{PUSH,POP}` are
  used to modify the stack of enabled diagnostics. The
  `JEMALLOC_DIAGNOSTIC_IGNORE_...` macros are used to ignore a concrete
  diagnostic.

* adds `JEMALLOC_FALLTHROUGH` macro to explicitly state that falling
  through `case` labels in a `switch` statement is intended

* Removes all UNUSED annotations on function parameters. The warning
  -Wunused-parameter is now disabled globally in
  `jemalloc_internal_macros.h` for all translation units that include
  that header. It is never re-enabled since that header cannot be
  included by users.

* locally suppresses some -Wextra diagnostics:

  * `-Wmissing-field-initializer` is buggy in older Clang and GCC versions,
    where it does not understanding that, in C, `= {0}` is a common C idiom
    to initialize a struct to zero

  * `-Wtype-bounds` is suppressed in a particular situation where a generic
    macro, used in multiple different places, compares an unsigned integer for
    smaller than zero, which is always true.

  * `-Walloc-larger-than-size=` diagnostics warn when an allocation function is
    called with a size that is too large (out-of-range). These are suppressed in
    the parts of the tests where `jemalloc` explicitly does this to test that the
    allocation functions fail properly.

* adds a new CI build bot that runs the log unit test on CI.

Closes #1196 .
2018-07-09 21:40:42 -07:00
David Goldblatt
18ecbfa89e Header refactoring: unify and de-catchall mutex module 2017-05-24 15:27:30 -07:00
David Goldblatt
77cccac8cd Break up headers into constituent parts
This is part of a broader change to make header files better represent the
dependencies between one another (see
https://github.com/jemalloc/jemalloc/issues/533). It breaks up component headers
into smaller parts that can be made to have a simpler dependency graph.

For the autogenerated headers (smoothstep.h and size_classes.h), no splitting
was necessary, so I didn't add support to emit multiple headers.
2017-01-12 15:43:51 -08:00
Jason Evans
795f6689de Add os_unfair_lock support.
OS X 10.12 deprecated OSSpinLock; os_unfair_lock is the recommended
replacement.
2016-11-02 18:09:45 -07:00
Jason Evans
e75e9be130 Add rtree element witnesses. 2016-06-03 12:27:41 -07:00
Jason Evans
73d3d58dc2 Optimize witness fast path.
Short-circuit commonly called witness functions so that they only
execute in debug builds, and remove equivalent guards from mutex
functions.  This avoids pointless code execution in
witness_assert_lockless(), which is typically called twice per
allocation/deallocation function invocation.

Inline commonly called witness functions so that optimized builds can
completely remove calls as dead code.
2016-05-11 15:38:06 -07:00
Jason Evans
c1e00ef2a6 Resolve bootstrapping issues when embedded in FreeBSD libc.
b2c0d6322d (Add witness, a simple online
locking validator.) caused a broad propagation of tsd throughout the
internal API, but tsd_fetch() was designed to fail prior to tsd
bootstrapping.  Fix this by splitting tsd_t into non-nullable tsd_t and
nullable tsdn_t, and modifying all internal APIs that do not critically
rely on tsd to take nullable pointers.  Furthermore, add the
tsd_booted_get() function so that tsdn_fetch() can probe whether tsd
bootstrapping is complete and return NULL if not.  All dangerous
conversions of nullable pointers are tsdn_tsd() calls that assert-fail
on invalid conversion.
2016-05-10 22:51:33 -07:00
Jason Evans
b6e07d2389 Fix malloc_mutex_assert_[not_]owner() for --enable-lazy-lock case. 2016-04-18 15:42:09 -07:00
Jason Evans
1b5830178f Fix malloc_mutex_[un]lock() to conditionally check witness.
Also remove tautological cassert(config_debug) calls.
2016-04-17 13:44:59 -07:00
Jason Evans
b2c0d6322d Add witness, a simple online locking validator.
This resolves #358.
2016-04-14 02:09:28 -07:00
Matthijs
a1aaf949a5 Optimizations for Windows
- Set opt_lg_chunk based on run-time OS setting
- Verify LG_PAGE is compatible with run-time OS setting
- When targeting Windows Vista or newer, use SRWLOCK instead of CRITICAL_SECTION
- When targeting Windows Vista or newer, statically initialize init_lock
2015-06-25 22:53:58 +02:00
Eric Wong
4dcf04bfc0 correctly detect adaptive mutexes in pthreads
PTHREAD_MUTEX_ADAPTIVE_NP is an enum on glibc and not a macro,
we must test for their existence by attempting compilation.
2014-09-29 16:10:40 -07:00
Mike Hommey
a19e87fbad Add support for Mingw 2012-04-21 21:27:46 -07:00
Jason Evans
bedceea2a8 Fix isthreaded-related build breakage. 2012-04-20 14:12:30 -07:00
Jason Evans
633aaff967 Postpone mutex initialization on FreeBSD.
Postpone mutex initialization on FreeBSD until after base allocation is
safe.
2012-04-03 19:25:30 -07:00
Jason Evans
1e6138c88c Remove malloc_mutex_trylock().
Remove malloc_mutex_trylock(); it has not been used for quite some time.
2012-03-24 19:36:27 -07:00
Jason Evans
41b6afb834 Port to FreeBSD.
Use FreeBSD-specific functions (_pthread_mutex_init_calloc_cb(),
_malloc_{pre,post}fork()) to avoid bootstrapping issues due to
allocation in libc and libthr.

Add malloc_strtoumax() and use it instead of strtoul().  Disable
validation code in malloc_vsnprintf() and malloc_strtoumax() until
jemalloc is initialized.  This is necessary because locale
initialization causes allocation for both vsnprintf() and strtoumax().

Force the lazy-lock feature on in order to avoid pthread_self(),
because it causes allocation.

Use syscall(SYS_write, ...) rather than write(...), because libthr wraps
write() and causes allocation.  Without this workaround, it would not be
possible to print error messages in malloc_conf_init() without
substantially reworking bootstrapping.

Fix choose_arena_hard() to look at how many threads are assigned to the
candidate choice, rather than checking whether the arena is
uninitialized.  This bug potentially caused more arenas to be
initialized than necessary.
2012-02-02 23:09:53 -08:00
Jason Evans
6da5418ded Remove ephemeral mutexes.
Remove ephemeral mutexes from the prof machinery, and remove
malloc_mutex_destroy().  This simplifies mutex management on systems
that call malloc()/free() inside pthread_mutex_{create,destroy}().

Add atomic_*_u() for operation on unsigned values.

Fix prof_printf() to call malloc_vsnprintf() rather than
malloc_snprintf().
2012-03-23 18:05:51 -07:00
Jason Evans
4e2e3dd9cf Fix fork-related bugs.
Acquire/release arena bin locks as part of the prefork/postfork.  This
bug made deadlock in the child between fork and exec a possibility.

Split jemalloc_postfork() into jemalloc_postfork_{parent,child}() so
that the child can reinitialize mutexes rather than unlocking them.  In
practice, this bug tended not to cause problems.
2012-03-13 16:31:41 -07:00
Jason Evans
7372b15a31 Reduce cpp conditional logic complexity.
Convert configuration-related cpp conditional logic to use static
constant variables, e.g.:

  #ifdef JEMALLOC_DEBUG
    [...]
  #endif

becomes:

  if (config_debug) {
    [...]
  }

The advantage is clearer, more concise code.  The main disadvantage is
that data structures no longer have conditionally defined fields, so
they pay the cost of all fields regardless of whether they are used.  In
practice, this is only a minor concern; config_stats will go away in an
upcoming change, and config_prof is the only other major feature that
depends on more than a few special-purpose fields.
2012-02-10 20:22:09 -08:00
Jason Evans
7427525c28 Move repo contents in jemalloc/ to top level. 2011-03-31 20:36:17 -07:00