Don't assume Bourne shell is in /bin/sh when running size_classes.sh .
Consider __sparcv9 a synonym for __sparc64__ when defining LG_QUANTUM.
This resolves#275.
Fix TLS configuration such that it is enabled by default for platforms
on which it works correctly. This regression was introduced by
ac5db02034 (Make --enable-tls and
--enable-lazy-lock take precedence over configure.ac-hardcoded
defaults).
Replace JEMALLOC_ATTR(format(printf, ...). with
JEMALLOC_FORMAT_PRINTF(), so that configuration feature tests can
omit the attribute if it would cause extraneous compilation warnings.
Add various function attributes to the exported functions to give the
compiler more information to work with during optimization, and also
specify throw() when compiling with C++ on Linux, in order to adequately
match what __THROW does in glibc.
This resolves#237.
Fix size class overflow handling for malloc(), posix_memalign(),
memalign(), calloc(), and realloc() when profiling is enabled.
Remove an assertion that erroneously caused arena_sdalloc() to fail when
profiling was enabled.
This resolves#232.
Extract szad size quantization into {extent,run}_quantize(), and .
quantize szad run sizes to the union of valid small region run sizes and
large run sizes.
Refactor iteration in arena_run_first_fit() to use
run_quantize{,_first,_next(), and add support for padded large runs.
For large allocations that have no specified alignment constraints,
compute a pseudo-random offset from the beginning of the first backing
page that is a multiple of the cache line size. Under typical
configurations with 4-KiB pages and 64-byte cache lines this results in
a uniform distribution among 64 page boundary offsets.
Add the --disable-cache-oblivious option, primarily intended for
performance testing.
This resolves#13.
This rename avoids installation collisions with the upstream gperftools.
Additionally, jemalloc's per thread heap profile functionality
introduced an incompatible file format, so it's now worthwhile to
clearly distinguish jemalloc's version of this script from the upstream
version.
This resolves#229.
under some compiler (gcc 4.8.4 in particular), the auto-detection of TLS
don't work properly.
force tls to be disabled.
the testsuite pass under gcc (4.8.4) and gcc (4.2.1)
However, unlike before it was removed do not force --enable-ivsalloc
when Darwin zone allocator integration is enabled, since the zone
allocator code uses ivsalloc() regardless of whether
malloc_usable_size() and sallocx() do.
This resolves#211.
Migrate all centralized data structures related to huge allocations and
recyclable chunks into arena_t, so that each arena can manage huge
allocations and recyclable virtual memory completely independently of
other arenas.
Add chunk node caching to arenas, in order to avoid contention on the
base allocator.
Use chunks_rtree to look up huge allocations rather than a red-black
tree. Maintain a per arena unsorted list of huge allocations (which
will be needed to enumerate huge allocations during arena reset).
Remove the --enable-ivsalloc option, make ivsalloc() always available,
and use it for size queries if --enable-debug is enabled. The only
practical implications to this removal are that 1) ivsalloc() is now
always available during live debugging (and the underlying radix tree is
available during core-based debugging), and 2) size query validation can
no longer be enabled independent of --enable-debug.
Remove the stats.chunks.{current,total,high} mallctls, and replace their
underlying statistics with simpler atomically updated counters used
exclusively for gdump triggering. These statistics are no longer very
useful because each arena manages chunks independently, and per arena
statistics provide similar information.
Simplify chunk synchronization code, now that base chunk allocation
cannot cause recursive lock acquisition.
It often happens that code changes introduce mixed declarations, that then
break building with Visual Studio. Since the code style is to not use
mixed declarations anyways, we might as well enforce it with -Werror.
Add:
--with-lg-page
--with-lg-page-sizes
--with-lg-size-class-group
--with-lg-quantum
Get rid of STATIC_PAGE_SHIFT, in favor of directly setting LG_PAGE.
Fix various edge conditions exposed by the configure options.
Revert 6716aa8352 (Force use of TLS if
heap profiling is enabled.). No existing tests indicate that this is
necessary, nor does code inspection uncover any potential issues. Most
likely the original commit covered up a bug related to tsd-internal
allocation that has since been fixed.
It has an unused variable, so it was always failing (at least with gcc
4.9.1). Alternatively, the `-Werror` flag could be removed if it isn't
strictly necessary.
This adds a new `sdallocx` function to the external API, allowing the
size to be passed by the caller. It avoids some extra reads in the
thread cache fast path. In the case where stats are enabled, this
avoids the work of calculating the size from the pointer.
An assertion validates the size that's passed in, so enabling debugging
will allow users of the API to debug cases where an incorrect size is
passed in.
The performance win for a contrived microbenchmark doing an allocation
and immediately freeing it is ~10%. It may have a different impact on a
real workload.
Closes#28
Relax the "are we in a git repo?" check to succeed even if the top level
jemalloc directory is not at the top level of the git repo.
Add git tag filtering so that only version triplets match when
generating VERSION.
Add fallback bogus VERSION creation, so that in the worst case, rather
than generating empty values for e.g. JEMALLOC_VERSION_MAJOR,
configuration ends up generating useless constants.
__*_hook() is glibc, but on at least one glibc platform (homebrew),
the __GLIBC__ define isn't set correctly and we miss being able to
use these hooks.
Do a feature test for it during configuration so that we enable it
anywhere the hooks are actually available.
On Windows, srcroot may start with "drive:", which confuses autoconf's
AC_CONFIG_* macros. The macros works equally well without ${srcroot},
provided some adjustment to Makefile.in.
Now this code is more portable and now people can use faster shells than
Bash such as Dash.
To use a faster shell with autoconf set the CONFIG_SHELL environment
variable to the shell and run the configure script with the shell.
When building with -O0, GCC doesn't use builtins for ffs and ffsl calls,
and uses library function calls instead. But the Android NDK doesn't have
those functions exported from any library, leading to build failure.
However, using __builtin_ffs* uses the builtin inlines.