Add the --disable-munmap option, remove the configure test that
attempted to detect the VM allocation quirk known to exist on Linux
x86[_64], and make --disable-munmap implicit on Linux.
Add a configure test to determine whether common mmap()/munmap()
patterns cause VM map holes, and only use munmap() to discard unused
chunks if the problem does not exist.
Unify the chunk caching for mmap and dss.
Fix options processing to limit lg_chunk to be large enough that
redzones will always fit.
Implement Valgrind support, as well as the redzone and quarantine
features, which help Valgrind detect memory errors. Redzones are only
implemented for small objects because the changes necessary to support
redzones around large and huge objects are complicated by in-place
reallocation, to the point that it isn't clear that the maintenance
burden is worth the incremental improvement to Valgrind support.
Merge arena_salloc() and arena_salloc_demote().
Refactor i[v]salloc() to expose the 'demote' option.
Force the lazy-lock feature on FreeBSD in order to avoid pthread_self(),
because it causes allocation. (This change was mistakenly omitted from
41b6afb834b1f5250223678c52bd4f013d4234f6.)
These functions may be available as inlines or as libgcc functions. In the
former case, a __GCC_HAVE_SYNC_COMPARE_AND_SWAP_n macro is defined. But we
still want to use these functions in the latter case, when we don't have
our own implementation.
Use FreeBSD-specific functions (_pthread_mutex_init_calloc_cb(),
_malloc_{pre,post}fork()) to avoid bootstrapping issues due to
allocation in libc and libthr.
Add malloc_strtoumax() and use it instead of strtoul(). Disable
validation code in malloc_vsnprintf() and malloc_strtoumax() until
jemalloc is initialized. This is necessary because locale
initialization causes allocation for both vsnprintf() and strtoumax().
Force the lazy-lock feature on in order to avoid pthread_self(),
because it causes allocation.
Use syscall(SYS_write, ...) rather than write(...), because libthr wraps
write() and causes allocation. Without this workaround, it would not be
possible to print error messages in malloc_conf_init() without
substantially reworking bootstrapping.
Fix choose_arena_hard() to look at how many threads are assigned to the
candidate choice, rather than checking whether the arena is
uninitialized. This bug potentially caused more arenas to be
initialized than necessary.
Forcibly disable TLS on OS X. gcc and llvm-gcc on OS X do not support
TLS, but clang does. Unfortunately, the implementation calls malloc()
internally during TLS initialization, which causes an unresolvable
bootstrapping issue.
Implement tsd, which is a TLS/TSD abstraction that uses one or both
internally. Modify bootstrapping such that no tsd's are utilized until
allocation is safe.
Remove malloc_[v]tprintf(), and use malloc_snprintf() instead.
Fix %p argument size handling in malloc_vsnprintf().
Fix a long-standing statistics-related bug in the "thread.arena"
mallctl that could cause crashes due to linked list corruption.
I tested a build from 10.7 run on 10.7 and 10.6, and a build from 10.6
run on 10.6. The AC_COMPILE_IFELSE limbo is to avoid running a program
during configure, which presumably makes it work when cross compiling
for iOS.
If there is no libpthread, look for pthreads functionality in libc
before failing to configure pthreads. This is necessary on at least the
Android platform.
Reported by Mike Hommey.
Implement aligned_alloc(), which was added in the C11 standard. The
function is weakly specified to the point that a minimally compliant
implementation would be painful to use (size must be an integral
multiple of alignment!), which in practice makes posix_memalign() a
safer choice.
Revert JE_COMPILABLE() so that it detects link errors. Cross-compiling
should still work as long as a valid configure cache is provided.
Clean up some comments/whitespace.
Fix --with-mangling to remove mangled symbols from the set of functions
to apply a prefix to. Prior to this change, the interaction was correct
with autoconf 2.59, but incorrect with autoconf 2.65.
Implement malloc_vsnprintf() (a subset of vsnprintf(3)) as well as
several other printing functions based on it, so that formatted printing
can be relied upon without concern for inducing a dependency on floating
point runtime support. Replace malloc_write() calls with
malloc_*printf() where doing so simplifies the code.
Add name mangling for library-private symbols in the data and BSS
sections. Adjust CONF_HANDLE_*() macros in malloc_conf_init() to expose
all opt_* variable use to cpp so that proper mangling occurs.
Add the --with-mangling configure option, which can be used to specify
name mangling on a per public symbol basis that takes precedence over
--with-jemalloc-prefix.
Expose the memalign() and valloc() overrides even if
--with-jemalloc-prefix is specified. This change does no real harm, and
simplifies the code.
Program-generate small size class tables for all valid combinations of
LG_TINY_MIN, LG_QUANTUM, and PAGE_SHIFT. Use the appropriate table to generate
all relevant data structures, and remove the distinction between
tiny/quantum/cacheline/subpage bins.
Remove --enable-dynamic-page-shift. This option didn't prove useful in
practice, and it prevented optimizations.
Add Tilera architecture support.
When tiny size class support was first added, it was intended to support
truly tiny size classes (even 2 bytes). However, this wasn't very
useful in practice, so the minimum tiny size class has been limited to
sizeof(void *) for a long time now. This is too small to be standards
compliant, but other commonly used malloc implementations do not even
bother using a 16-byte quantum on systems with vector units (SSE2+,
AltiVEC, etc.). As such, it is safe in practice to support an 8-byte
tiny size class on 64-bit systems that support 16-byte types.
Do not enable lazy locking by default, because:
- It's fragile (applications can subvert detection of multi-threaded
mode).
- Thread caching amortizes locking overhead in the default
configuration.
Convert configuration-related cpp conditional logic to use static
constant variables, e.g.:
#ifdef JEMALLOC_DEBUG
[...]
#endif
becomes:
if (config_debug) {
[...]
}
The advantage is clearer, more concise code. The main disadvantage is
that data structures no longer have conditionally defined fields, so
they pay the cost of all fields regardless of whether they are used. In
practice, this is only a minor concern; config_stats will go away in an
upcoming change, and config_prof is the only other major feature that
depends on more than a few special-purpose fields.
Refactor the SO and REV such that they are set via autoconf variables,
@so@ and @rev@. These variables are both needed by the jemalloc.sh
script, so this unifies their definitions.