pthread_mutex_lock() can call malloc() on OS X (!!!), which causes
deadlock. Work around this by using spinlocks that are built of more
primitive stuff.
The previous free list implementation, which embedded singly linked
lists in available regions, had the unfortunate side effect of causing
many cache misses during thread cache fills. Fix this in two places:
- arena_run_t: Use a new bitmap implementation to track which regions
are available. Furthermore, revert to preferring the
lowest available region (as jemalloc did with its old
bitmap-based approach).
- tcache_t: Move read-only tcache_bin_t metadata into
tcache_bin_info_t, and add a contiguous array of pointers
to tcache_t in order to track cached objects. This
substantially increases the size of tcache_t, but results
in much higher data locality for common tcache operations.
As a side benefit, it is again possible to efficiently
flush the least recently used cached objects, so this
change changes flushing from MRU to LRU.
The new bitmap implementation uses a multi-level summary approach to
make finding the lowest available region very fast. In practice,
bitmaps only have one or two levels, though the implementation is
general enough to handle extremely large bitmaps, mainly so that large
page sizes can still be entertained.
Fix tcache_bin_flush_large() to always flush statistics, in the same way
that tcache_bin_flush_small() was recently fixed.
Use JEMALLOC_DEBUG rather than NDEBUG.
Add dassert(), and use it for debug-only asserts.
Clean up configuration for backtracing when profiling is enabled, and
document the configuration logic in INSTALL.
Disable libgcc-based backtracing except on x64 (where it is known to
work).
Add the --disable-prof-gcc option.
Compile with -fvisibility=hidden rather than -fvisibility=internal, in
order to avoid PLT lookups for internal functions. Also fix a
regression that caused the -fvisibility flag to be omitted, due to:
Port to Mac OS X.
2dbecf1f62
If mremap(2) is available and supports MREMAP_FIXED, use it for huge
realloc().
Initialize rtree later during bootstrapping, so that --enable-debug
--enable-dss works.
Fix a minor swap_avail stats bug.
Convert the man page source from roff to DocBook, and generate html and
roff output. Modify the build system such that the documentation can be
built as part of the release process, so that users need not have
DocBook tools installed.
Replace the single-character run-time flags with key/value pairs, which
can be set via the malloc_conf global, /etc/malloc.conf, and the
MALLOC_CONF environment variable.
Replace the JEMALLOC_PROF_PREFIX environment variable with the
"opt.prof_prefix" option.
Replace umax2s() with u2s().
Re-organize code for --enable-prof-libgcc so that configure doesn't
report both libgcc and libunwind support as being configured in. This
change has no impact on how jemalloc is actually configured/built.
Add test/jemalloc_test.h.in, which is processed to include
jemalloc/jemalloc@install_suffix@.h, so that test programs can include
it without worrying about the install suffix.
Don't build with -march=native by default, because the generated code
may perform especially poorly on ABI-compatible, but internally
different, systems.
Don't look for a shared libunwind if --with-static-libunwind is
specified.
Set SONAME when linking the shared libjemalloc.
Add DESTDIR support.
Add install_{include,lib/man} build targets.
Clean up compiler flag configuration.
Remove all functionality related to tracing. This functionality was
useful for understanding memory fragmentation during early algorithmic
design of jemalloc, but it had little utility for non-trivial
applications, due to the sheer volume of data written to disk.
Add the --disable-prof-libgcc configure option, and add backtracing
based on libgcc, which is used by default.
Fix a bug in hash().
Fix various configuration-dependent compilation errors.
Fix a stats bug in large object curruns accounting.
Replace tcache_bin_fill() with arena_tcache_fill(), and fix a bug in an OOM
error path.
Fix API name mangling to coexist with __attribute__((malloc)).
Use JEMALLOC_ATTR(tls_model("initial-exec)) instead of -ftls-model=initial-exec
so that libjemalloc_pic.a can be directly linked into another library without
needing linker options changes.
Add attributes to malloc, calloc, and posix_memalign, for compatibility with
glibc's declarations.
Add function prototypes for the standard malloc(3) API.
Add the 'G'/'g' and 'H'/'h' MALLOC_OPTIONS flags.
Add the malloc_tcache_flush() function.
Disable thread-specific caching until the application goes multi-threaded.
jemalloc is configured.
Modify arena_malloc() API to avoid unnecessary choose_arena() calls. Remove
unnecessary code from choose_arena().
Enable lazy-lock by default, now that choose_arena() is both faster and out of
the critical path.
Implement objdir support in the build system.
Implement minimal Makefile.
Make compile-time-optional jemalloc features controllable via configure
options (debug, stats, tiny, mag, balance, dss).
Conditionally exclude most of the opt_* run-time options, based on configure
options (fill, xmalloc, sysv).
Implement optional --enable-dynamic-page-shift.
Implement optional --enable-lazy-lock.
Re-order malloc_init_hard() and use the malloc_initializer variable to support
recursive allocation in malloc_ncpus().
Add mag_rack_tsd in order to receive notifications of thread termination.
Add jemalloc.h.