It turns out some OSX system libraries (like CoreGraphics on 10.6) like
to call malloc_zone_* functions, but giving them pointers that weren't
allocated with the zone they are using.
Possibly, they do malloc_zone_malloc(malloc_default_zone()) before we
register the jemalloc zone, and malloc_zone_realloc(malloc_default_zone())
after. malloc_default_zone() returning a different value in both cases.
These functions may be available as inlines or as libgcc functions. In the
former case, a __GCC_HAVE_SYNC_COMPARE_AND_SWAP_n macro is defined. But we
still want to use these functions in the latter case, when we don't have
our own implementation.
Use FreeBSD-specific functions (_pthread_mutex_init_calloc_cb(),
_malloc_{pre,post}fork()) to avoid bootstrapping issues due to
allocation in libc and libthr.
Add malloc_strtoumax() and use it instead of strtoul(). Disable
validation code in malloc_vsnprintf() and malloc_strtoumax() until
jemalloc is initialized. This is necessary because locale
initialization causes allocation for both vsnprintf() and strtoumax().
Force the lazy-lock feature on in order to avoid pthread_self(),
because it causes allocation.
Use syscall(SYS_write, ...) rather than write(...), because libthr wraps
write() and causes allocation. Without this workaround, it would not be
possible to print error messages in malloc_conf_init() without
substantially reworking bootstrapping.
Fix choose_arena_hard() to look at how many threads are assigned to the
candidate choice, rather than checking whether the arena is
uninitialized. This bug potentially caused more arenas to be
initialized than necessary.
Remove ephemeral mutexes from the prof machinery, and remove
malloc_mutex_destroy(). This simplifies mutex management on systems
that call malloc()/free() inside pthread_mutex_{create,destroy}().
Add atomic_*_u() for operation on unsigned values.
Fix prof_printf() to call malloc_vsnprintf() rather than
malloc_snprintf().
Forcibly disable TLS on OS X. gcc and llvm-gcc on OS X do not support
TLS, but clang does. Unfortunately, the implementation calls malloc()
internally during TLS initialization, which causes an unresolvable
bootstrapping issue.
Implement tsd, which is a TLS/TSD abstraction that uses one or both
internally. Modify bootstrapping such that no tsd's are utilized until
allocation is safe.
Remove malloc_[v]tprintf(), and use malloc_snprintf() instead.
Fix %p argument size handling in malloc_vsnprintf().
Fix a long-standing statistics-related bug in the "thread.arena"
mallctl that could cause crashes due to linked list corruption.
I tested a build from 10.7 run on 10.7 and 10.6, and a build from 10.6
run on 10.6. The AC_COMPILE_IFELSE limbo is to avoid running a program
during configure, which presumably makes it work when cross compiling
for iOS.
If there is no libpthread, look for pthreads functionality in libc
before failing to configure pthreads. This is necessary on at least the
Android platform.
Reported by Mike Hommey.
Acquire/release arena bin locks as part of the prefork/postfork. This
bug made deadlock in the child between fork and exec a possibility.
Split jemalloc_postfork() into jemalloc_postfork_{parent,child}() so
that the child can reinitialize mutexes rather than unlocking them. In
practice, this bug tended not to cause problems.
Modify malloc_vsnprintf() validation code to verify that output is
identical to vsnprintf() output, even if both outputs are truncated due
to buffer exhaustion.
Implement aligned_alloc(), which was added in the C11 standard. The
function is weakly specified to the point that a minimally compliant
implementation would be painful to use (size must be an integral
multiple of alignment!), which in practice makes posix_memalign() a
safer choice.
Revert JE_COMPILABLE() so that it detects link errors. Cross-compiling
should still work as long as a valid configure cache is provided.
Clean up some comments/whitespace.
Fix --with-mangling to remove mangled symbols from the set of functions
to apply a prefix to. Prior to this change, the interaction was correct
with autoconf 2.59, but incorrect with autoconf 2.65.
Implement malloc_vsnprintf() (a subset of vsnprintf(3)) as well as
several other printing functions based on it, so that formatted printing
can be relied upon without concern for inducing a dependency on floating
point runtime support. Replace malloc_write() calls with
malloc_*printf() where doing so simplifies the code.
Add name mangling for library-private symbols in the data and BSS
sections. Adjust CONF_HANDLE_*() macros in malloc_conf_init() to expose
all opt_* variable use to cpp so that proper mangling occurs.
Remove the lg_tcache_gc_sweep option, because it is no longer
very useful. Prior to the addition of dynamic adjustment of tcache fill
count, it was possible for fill/flush overhead to be a problem, but this
problem no longer occurs.
Add the --with-mangling configure option, which can be used to specify
name mangling on a per public symbol basis that takes precedence over
--with-jemalloc-prefix.
Expose the memalign() and valloc() overrides even if
--with-jemalloc-prefix is specified. This change does no real harm, and
simplifies the code.
Add nallocm(), which computes the real allocation size that would result
from the corresponding allocm() call. nallocm() is a functional
superset of OS X's malloc_good_size(), in that it takes alignment
constraints into account.
When jemalloc is used as a libc malloc replacement (i.e. not prefixed),
some particular setups may end up inconsistently calling malloc from
libc and free from jemalloc, or the other way around.
glibc provides hooks to make its functions use alternative
implementations. Use them.
Submitted by Karl Tomlinson and Mike Hommey.
Do not enforce minimum alignment in memalign(). This is a non-standard
function, and there is disagreement over whether to enforce minimum
alignment. Solaris documentation (whence memalign() originated) says
that minimum alignment is required:
The value of alignment must be a power of two and must be greater than
or equal to the size of a word.
However, Linux's manual page says in its NOTES section:
memalign() may not check that the boundary parameter is correct.
This is descriptive rather than prescriptive, but applications with
bad assumptions about memalign() exist, so be as forgiving as possible.
Reported by Mike Hommey.