Remove ephemeral mutexes from the prof machinery, and remove
malloc_mutex_destroy(). This simplifies mutex management on systems
that call malloc()/free() inside pthread_mutex_{init,destroy}().
Add atomic_*_u() for operations on unsigned values.
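
As a rough illustration of what atomic operations on unsigned values can
look like, here is a minimal sketch built on C11 <stdatomic.h>. The type
name atomic_u_t, the helper names, and the choice to return the new
value are assumptions made for this example; they are not taken from the
jemalloc sources.

```c
#include <stdatomic.h>

/* Hypothetical type and helpers, named after the atomic_*_u() pattern. */
typedef _Atomic unsigned atomic_u_t;

/* Atomically add x to *p and return the resulting value. */
static inline unsigned atomic_add_u(atomic_u_t *p, unsigned x) {
    /* atomic_fetch_add() returns the old value, so add x again. */
    return atomic_fetch_add(p, x) + x;
}

/* Atomically subtract x from *p and return the resulting value. */
static inline unsigned atomic_sub_u(atomic_u_t *p, unsigned x) {
    return atomic_fetch_sub(p, x) - x;
}
```
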
Fix prof_printf() to call malloc_vsnprintf() rather than
malloc_snprintf().
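
The general shape of this kind of bug can be sketched with the standard
library equivalents. A variadic wrapper has to pass its va_list to the
v-variant of the formatter; passing it to the non-v variant compiles but
formats garbage. The wrapper name log_format() below is hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical variadic wrapper; the name is illustrative only. */
int log_format(char *buf, size_t size, const char *format, ...) {
    va_list ap;
    int ret;

    va_start(ap, format);
    /*
     * Correct: forward the va_list to vsnprintf(). Calling
     * snprintf(buf, size, format, ap) here would treat ap itself as the
     * first format argument, which is undefined behavior.
     */
    ret = vsnprintf(buf, size, format, ap);
    va_end(ap);
    return ret;
}
```
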
Forcibly disable TLS on OS X. gcc and llvm-gcc on OS X do not support
TLS, but clang does. Unfortunately, the implementation calls malloc()
internally during TLS initialization, which causes an unresolvable
bootstrapping issue.
Implement tsd, which is a TLS/TSD abstraction that uses one or both
internally. Modify bootstrapping such that no tsd instances are used
until allocation is safe.
Remove malloc_[v]tprintf(), and use malloc_snprintf() instead.
Fix %p argument size handling in malloc_vsnprintf().
Fix a long-standing statistics-related bug in the "thread.arena"
mallctl that could cause crashes due to linked list corruption.
I tested a build from 10.7 run on 10.7 and 10.6, and a build from 10.6
run on 10.6. The AC_COMPILE_IFELSE limbo is to avoid running a program
during configure, which presumably makes it work when cross compiling
for iOS.
If there is no libpthread, look for pthreads functionality in libc
before failing to configure pthreads. This is necessary on at least the
Android platform.
Reported by Mike Hommey.
Acquire/release arena bin locks as part of the prefork/postfork
operations. Failing to do so made deadlock in the child between fork and
exec a possibility.
Split jemalloc_postfork() into jemalloc_postfork_{parent,child}() so
that the child can reinitialize mutexes rather than unlocking them. In
practice, this bug tended not to cause problems.
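
The parent/child split can be sketched with pthread_atfork(): the
prefork handler takes the lock, the parent handler unlocks it, and the
child handler reinitializes it instead of unlocking. This is a minimal
sketch with a single hypothetical lock (demo_lock) and hypothetical
handler names, not the allocator's actual code.

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical lock standing in for an allocator-internal mutex. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Runs before fork(): hold the lock so no other thread owns it. */
static void demo_prefork(void) { pthread_mutex_lock(&demo_lock); }

/* Runs in the parent after fork(): simply release the lock. */
static void demo_postfork_parent(void) { pthread_mutex_unlock(&demo_lock); }

/* Runs in the child after fork(): reinitialize rather than unlock. */
static void demo_postfork_child(void) { pthread_mutex_init(&demo_lock, NULL); }

/* Register the three handlers; returns 0 on success. */
int demo_install_fork_handlers(void) {
    return pthread_atfork(demo_prefork, demo_postfork_parent,
        demo_postfork_child);
}

/* Fork, use the lock in the child, and return the child's exit status. */
int demo_fork_and_lock(void) {
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        pthread_mutex_lock(&demo_lock);
        pthread_mutex_unlock(&demo_lock);
        _exit(0);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```
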
Modify malloc_vsnprintf() validation code to verify that output is
identical to vsnprintf() output, even if both outputs are truncated due
to buffer exhaustion.
Implement aligned_alloc(), which was added in the C11 standard. The
function is weakly specified to the point that a minimally compliant
implementation would be painful to use (size must be an integral
multiple of alignment!), which in practice makes posix_memalign() a
safer choice.
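
For callers that still want aligned_alloc(), one way to satisfy the
size constraint is to round the request up to a multiple of the
alignment first. The helper names below are hypothetical, and the sketch
assumes the alignment is a nonzero power of two; the behavior for a
zero-byte request is left to the underlying implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Round size up to a multiple of alignment (a nonzero power of two). */
size_t round_up_to_alignment(size_t size, size_t alignment) {
    return (size + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical wrapper that pads the size so aligned_alloc() accepts it. */
void *aligned_alloc_any_size(size_t alignment, size_t size) {
    size_t padded = round_up_to_alignment(size, alignment);

    if (padded < size)
        return NULL; /* The rounding overflowed size_t. */
    return aligned_alloc(alignment, padded);
}
```
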
Revert JE_COMPILABLE() so that it detects link errors. Cross-compiling
should still work as long as a valid configure cache is provided.
Clean up some comments/whitespace.
Fix --with-mangling so that mangled symbols are removed from the set of
functions to which a prefix is applied. Prior to this change, the
interaction was correct with autoconf 2.59, but incorrect with autoconf
2.65.
Implement malloc_vsnprintf() (a subset of vsnprintf(3)) as well as
several other printing functions based on it, so that formatted printing
can be relied upon without concern for inducing a dependency on floating
point runtime support. Replace malloc_write() calls with
malloc_*printf() where doing so simplifies the code.
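
To give a feel for how small an integer-only formatter can be, here is a
toy sketch that handles only %s, %u, and %%, and that mimics the
vsnprintf() convention of returning the untruncated length. The names
tiny_vsnprintf() and tiny_snprintf() are hypothetical; this is not the
malloc_vsnprintf() implementation and supports far fewer conversions.

```c
#include <stdarg.h>
#include <stddef.h>

/* Store c if it fits, but always count it so truncation is reported. */
static void put_char(char *buf, size_t size, size_t *pos, char c) {
    if (*pos + 1 < size)
        buf[*pos] = c;
    (*pos)++;
}

/* Toy formatter: supports %s, %u, and %% only; no floating point. */
int tiny_vsnprintf(char *buf, size_t size, const char *fmt, va_list ap) {
    size_t pos = 0;

    for (const char *f = fmt; *f != '\0'; f++) {
        if (*f != '%') {
            put_char(buf, size, &pos, *f);
            continue;
        }
        f++;
        if (*f == 's') {
            const char *s = va_arg(ap, const char *);
            while (*s != '\0')
                put_char(buf, size, &pos, *s++);
        } else if (*f == 'u') {
            unsigned v = va_arg(ap, unsigned);
            char digits[sizeof(unsigned) * 3 + 1];
            size_t n = 0;
            /* Generate digits least significant first, then reverse. */
            do {
                digits[n++] = (char)('0' + v % 10);
                v /= 10;
            } while (v != 0);
            while (n > 0)
                put_char(buf, size, &pos, digits[--n]);
        } else if (*f == '\0') {
            /* Lone trailing '%': emit it and let the loop terminate. */
            put_char(buf, size, &pos, '%');
            f--;
        } else {
            /* "%%" emits '%'; unknown conversions emit the character. */
            put_char(buf, size, &pos, *f);
        }
    }
    if (size > 0)
        buf[pos < size ? pos : size - 1] = '\0';
    return (int)pos;
}

/* Variadic front end that forwards its arguments as a va_list. */
int tiny_snprintf(char *buf, size_t size, const char *fmt, ...) {
    va_list ap;
    int ret;

    va_start(ap, fmt);
    ret = tiny_vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return ret;
}
```
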
Add name mangling for library-private symbols in the data and BSS
sections. Adjust CONF_HANDLE_*() macros in malloc_conf_init() to expose
all opt_* variable use to cpp so that proper mangling occurs.
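
The basic mechanism behind this kind of mangling can be sketched with
two preprocessor macros: once a variable name is defined as a macro,
every use of the plain name that cpp sees is rewritten to the prefixed
symbol. The macro DEMO_PRIVATE(), the demolib_ prefix, and the variable
opt_verbose are all hypothetical and only illustrate the technique.

```c
/* Hypothetical mangling macro; the real headers use different names. */
#define DEMO_PRIVATE(n) demolib_##n

/* From here on, cpp rewrites opt_verbose to demolib_opt_verbose. */
#define opt_verbose DEMO_PRIVATE(opt_verbose)

/* Defines the data symbol demolib_opt_verbose, not opt_verbose. */
int opt_verbose = 1;

/* Code keeps using the short name; only the emitted symbol changes. */
int demo_get_opt_verbose(void) {
    return opt_verbose;
}
```
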
Remove the lg_tcache_gc_sweep option, because it is no longer
very useful. Prior to the addition of dynamic adjustment of tcache fill
count, it was possible for fill/flush overhead to be a problem, but this
problem no longer occurs.
Add the --with-mangling configure option, which can be used to specify
name mangling for individual public symbols, and which takes precedence
over --with-jemalloc-prefix.
Expose the memalign() and valloc() overrides even if
--with-jemalloc-prefix is specified. This change does no real harm, and
simplifies the code.
Add nallocm(), which computes the real allocation size that would result
from the corresponding allocm() call. nallocm() is a functional
superset of OS X's malloc_good_size(), in that it takes alignment
constraints into account.
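
The idea of computing a real allocation size without allocating can be
illustrated with a toy model. The sketch below assumes, purely for
illustration, that size classes are powers of two and that each class is
naturally aligned to its own size; neither assumption describes
jemalloc's actual size classes, and toy_nalloc() is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Toy model: return the smallest power-of-two size that holds `size`
 * bytes and satisfies `alignment`. Returns 0 for invalid input or if
 * the result would overflow size_t.
 */
size_t toy_nalloc(size_t size, size_t alignment) {
    size_t usable = 1;

    /* Reject zero sizes and alignments that are not powers of two. */
    if (size == 0 || alignment == 0 || (alignment & (alignment - 1)) != 0)
        return 0;
    while (usable < size || usable < alignment) {
        if (usable > SIZE_MAX / 2)
            return 0;
        usable <<= 1;
    }
    return usable;
}
```

In this model a 10-byte request with 64-byte alignment reports 64 usable
bytes, which is the kind of answer a size query that ignores alignment
cannot give.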
When jemalloc is used as a libc malloc replacement (i.e. not prefixed),
some particular setups may end up inconsistently calling malloc from
libc and free from jemalloc, or the other way around.
glibc provides hooks to make its functions use alternative
implementations. Use them.
Submitted by Karl Tomlinson and Mike Hommey.
Do not enforce minimum alignment in memalign(). This is a non-standard
function, and there is disagreement over whether to enforce minimum
alignment. Solaris documentation (whence memalign() originated) says
that minimum alignment is required:
The value of alignment must be a power of two and must be greater than
or equal to the size of a word.
However, Linux's manual page says in its NOTES section:
memalign() may not check that the boundary parameter is correct.
This is descriptive rather than prescriptive, but applications with
bad assumptions about memalign() exist, so be as forgiving as possible.
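
The difference in strictness can be observed from user code. The sketch
below assumes a Linux-style libc that declares memalign() in <malloc.h>;
the helper names are hypothetical. posix_memalign() is specified to
reject an alignment that is not a power-of-two multiple of
sizeof(void *), whereas a forgiving memalign() accepts a small alignment
such as 2.

```c
#include <errno.h>
#include <malloc.h> /* memalign(): non-standard; assumes a Linux libc. */
#include <stdlib.h>

/* Return the posix_memalign() status, freeing the result on success. */
int posix_memalign_status(size_t alignment, size_t size) {
    void *p = NULL;
    int ret = posix_memalign(&p, alignment, size);

    if (ret == 0)
        free(p);
    return ret;
}

/* Return 1 if memalign() succeeds for this alignment, 0 otherwise. */
int memalign_succeeds(size_t alignment, size_t size) {
    void *p = memalign(alignment, size);

    if (p == NULL)
        return 0;
    free(p);
    return 1;
}
```
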
Reported by Mike Hommey.
Programmatically generate small size class tables for all valid
combinations of LG_TINY_MIN, LG_QUANTUM, and PAGE_SHIFT. Use the
appropriate table to generate
all relevant data structures, and remove the distinction between
tiny/quantum/cacheline/subpage bins.
Remove --enable-dynamic-page-shift. This option didn't prove useful in
practice, and it prevented optimizations.
Add Tilera architecture support.
Remove opt.lg_prof_bt_max, and hard code it to 7. The original
intention of this option was to enable faster backtracing by limiting
backtrace depth. However, truncated backtraces make graphical pprof
output very difficult to interpret. In practice, decreasing sampling
frequency is a better mechanism for limiting profiling overhead.