---
Motivation:
This new experimental memory-allocaction API returns a pointer to
the allocation as well as the usable size of the allocated memory
region.
The `s` in `smallocx` stands for `sized`-`mallocx`, attempting to
convey that this API returns the size of the allocated memory region.
It should allow C++ P0901r0 [0] and Rust Alloc::alloc_excess to make
use of it.
The main purpose of these APIs is to improve telemetry. It is more accurate
to register `smallocx(size, flags)` than `smallocx(nallocx(size), flags)`,
for example. The latter will always line up perfectly with the existing
size classes, causing a loss of telemetry information about the internal
fragmentation induced by potentially poor size-classes choices.
Instrumenting `nallocx` does not help much since user code can cache its
result and use it repeatedly.
---
Implementation:
The implementation adds a new `usize` option to `static_opts_s` and an `usize`
variable to `dynamic_opts_s`. These are then used to cache the result of
`sz_index2size` and similar functions in the code paths in which they are
unconditionally invoked. In the code-paths in which these functions are not
unconditionally invoked, `smallocx` calls, as opposed to `mallocx`, these
functions explicitly.
---
[0]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0901r0.html
Before this commit jemalloc produced many warnings when compiled with -Wextra
with both Clang and GCC. This commit fixes the issues raised by these warnings
or suppresses them if they were spurious at least for the Clang and GCC
versions covered by CI.
This commit:
* adds `JEMALLOC_DIAGNOSTIC` macros: `JEMALLOC_DIAGNOSTIC_{PUSH,POP}` are
used to modify the stack of enabled diagnostics. The
`JEMALLOC_DIAGNOSTIC_IGNORE_...` macros are used to ignore a concrete
diagnostic.
* adds `JEMALLOC_FALLTHROUGH` macro to explicitly state that falling
through `case` labels in a `switch` statement is intended
* Removes all UNUSED annotations on function parameters. The warning
-Wunused-parameter is now disabled globally in
`jemalloc_internal_macros.h` for all translation units that include
that header. It is never re-enabled since that header cannot be
included by users.
* locally suppresses some -Wextra diagnostics:
* `-Wmissing-field-initializer` is buggy in older Clang and GCC versions,
where it does not understanding that, in C, `= {0}` is a common C idiom
to initialize a struct to zero
* `-Wtype-bounds` is suppressed in a particular situation where a generic
macro, used in multiple different places, compares an unsigned integer for
smaller than zero, which is always true.
* `-Walloc-larger-than-size=` diagnostics warn when an allocation function is
called with a size that is too large (out-of-range). These are suppressed in
the parts of the tests where `jemalloc` explicitly does this to test that the
allocation functions fail properly.
* adds a new CI build bot that runs the log unit test on CI.
Closes#1196 .
This patch allows to override the lg-vaddr values, which
are defined by the build machine's CPUID information (x86_64)
or default values (other architectures like aarch64).
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
Instead of setting a fix value of 48 allowed VA bits,
we distiguish between LP64 and ILP32.
Testsuite result with LP64:
Test suite summary: pass: 13/13, skip: 0/13, fail: 0/13
Testsuit result with ILP32:
Test suite summary: pass: 13/13, skip: 0/13, fail: 0/13
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
Reviewed-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Right now we always make our TLS use the initial-exec model if the compiler
supports it. This change allows configure-time disabling of this setting, which
can be helpful when dynamically loading jemalloc is the only option.
On glibc and Android's bionic, strerror_r returns char* when
_GNU_SOURCE is defined.
Add a configure check for this rather than assume glibc is the
only libc that behaves this way.
All the invocations of AC_COMPILE_IFELSE inside JE_CXXFLAGS_ADD were
running 'the compiler and compilation flags of the current language'
which was always the C compiler and the CXXFLAGS were never being tested
against a C++ compiler. This patch fixes this issue by temporarily
changing the chosen compiler to C++ by pushing it over the stack and
popping it immediately after the compilation check.
On x86 Linux, we define our own MADV_FREE if madvise(2) is available, but no
MADV_FREE is detected. This allows the feature to be built in and enabled with
runtime detection.
Quoting from https://github.com/jemalloc/jemalloc/issues/761 :
[...] reading the Power ISA documentation[1], the assembly in [the CPU_SPINWAIT
macro] isn't correct anyway (as @marxin points out): the setting of the
program-priority register is "sticky", and we never undo the lowering.
We could do something similar, but given that we don't have testing here in the
first place, I'm inclined to simply not try. I'll put something up reverting the
problematic commit tomorrow.
[1] Book II, chapter 3 of the 2.07B or 3.0B ISA documents.
The configure.ac seciton right now is the same for Linux and kFreeBSD,
which results into an incorrect configuration of e.g. defining
JEMALLOC_PROC_SYS_VM_OVERCOMMIT_MEMORY instead of FreeBSD's
JEMALLOC_SYSCTL_VM_OVERCOMMIT.
GNU/kFreeBSD is really a glibc + FreeBSD kernel system, so it needs its
own entry which has a mixture of configuration options from Linux and
FreeBSD.
Currently, the log macro requires at least one argument after the format string,
because of the way the preprocessor handles varargs macros. We can hide some of
that irritation by pushing the extra arguments into a varargs function.
Added opt.background_thread to enable background threads, which handles purging
currently. When enabled, decay ticks will not trigger purging (which will be
left to the background threads). We limit the max number of threads to NCPUs.
When percpu arena is enabled, set CPU affinity for the background threads as
well.
The sleep interval of background threads is dynamic and determined by computing
number of pages to purge in the future (based on backlog).
Rather than using a manually maintained list of internal symbols to
drive name mangling, add a compilation phase to automatically extract
the list of internal symbols.
This resolves#677.
Add the extent_destroy_t extent destruction hook to extent_hooks_t, and
use it during arena destruction. This hook explicitly communicates to
the callee that the extent must be destroyed or tracked for later reuse,
lest it be permanently leaked. Prior to this change, retained extents
could unintentionally be leaked if extent retention was enabled.
This resolves#560.
Control use of munmap(2) via a run-time option rather than a
compile-time option (with the same per platform default). The old
behavior of --disable-munmap can be achieved with
--with-malloc-conf=munmap:false.
This partially resolves#580.
The explicit compiler warning suppression controlled by this option is
universally desirable, so remove the ability to disable suppression.
This partially resolves#580.
Four size classes per size doubling has proven to be a universally good
choice for the entire 4.x release series, so there's little point to
preserving this configurability.
This partially resolves#580.
This can catch bugs in which one header defines a numeric constant, and another
uses it without including the defining header. Undefined preprocessor symbols
expand to '0', so that this will compile fine, silently doing the math wrong.
Continue to use ivsalloc() when --enable-debug is specified (and add
assertions to guard against 0 size), but stop providing a documented
explicit semantics-changing band-aid to dodge undefined behavior in
sallocx() and malloc_usable_size(). ivsalloc() remains compiled in,
unlike when #211 restored --enable-ivsalloc, and if
JEMALLOC_FORCE_IVSALLOC is defined during compilation, sallocx() and
malloc_usable_size() will still use ivsalloc().
This partially resolves#580.
Simplify configuration by removing the --disable-tcache option, but
replace the testing for that configuration with
--with-malloc-conf=tcache:false.
Fix the thread.arena and thread.tcache.flush mallctls to work correctly
if tcache is disabled.
This partially resolves#580.
This is a biggy. jemalloc_internal.h has been doing multiple jobs for a while
now:
- The source of system-wide definitions.
- The catch-all include file.
- The module header file for jemalloc.c
This commit splits up this functionality. The system-wide definitions
responsibility has moved to jemalloc_preamble.h. The catch-all include file is
now jemalloc_internal_includes.h. The module headers for jemalloc.c are now in
jemalloc_internal_[externs|inlines|types].h, just as they are for the other
modules.
Hyper-threaded CPUs may need a special instruction inside spin loops in
order to yield to another virtual CPU. The 'pause' instruction that is
available for x86 is not supported on Power.
Apparently the extended mnemonics like yield, mdoio, and mdoom are not
actually implemented on POWER8, although mentioned in the ISA 2.07
document. The recommended magic bits are an 'or 31,31,31'.