server-skynet-source-3rd-jemalloc

project-base/server-skynet-source-3rd-jemalloc

Author	SHA1	Message	Date
Qi Wang	b75822bc6e	Implement use-after-free detection using junk and stash. On deallocation, sampled pointers (specially aligned) get junked and stashed into tcache (to prevent immediate reuse). The expected behavior is to have read-after-free corrupted and stopped by the junk-filling, while write-after-free is checked when flushing the stashed pointers.	2021-12-29 14:44:43 -08:00
David CARLIER	cf9724531a	Darwin malloc_size override support proposal. Darwin has similar api than Linux/FreeBSD's malloc_usable_size.	2021-10-01 14:32:40 -07:00
Qi Wang	deb8e62a83	Implement guard pages. Adding guarded extents, which are regular extents surrounded by guard pages (mprotected). To reduce syscalls, small guarded extents are cached as a separate eset in ecache, and decay through the dirty / muzzy / retained pipeline as usual.	2021-09-26 16:30:15 -07:00
David Goldblatt	4452a4812f	Add opt.experimental_infallible_new. This allows a guarantee that operator new never throws. Fix the .gitignore rules to include test/integration/cpp while we're here.	2021-06-24 12:22:51 -07:00
David CARLIER	35a8552605	Mac OS: Tag mapped pages. This can be used to help profiling tools (e.g. vmmap) identify the sources of mappings more specifically.	2021-02-03 15:05:53 -08:00
Jin Qian	26c1dc5a3a	Support AutoConf for posix_madvise and POSIX_MADV_DONTNEED	2020-12-18 10:05:59 -08:00
David Carlier	520b75fa2d	utrace support with label based signature.	2020-11-30 11:43:00 -08:00
David Carlier	95f0a77fde	Detect pthread_getname_np explicitly. At least one libc (musl) defines pthread_setname_np without defining pthread_getname_np. Detect the presence of each individually, rather than inferring both must be defined if set is.	2020-11-11 17:31:22 -08:00
David Carlier	d2d941017b	MADV_DO[NOT]DUMP support equivalence on FreeBSD.	2020-11-02 09:15:15 -08:00
David Goldblatt	7ad2f78663	Avoid a -Wundef warning on LG_SLAB_MAXREGS.	2020-09-17 10:05:40 -07:00
Hao Liu	1541ffc765	configure: add --with-lg-slab-maxregs configure option. Specify the maximum number of regions in a slab, which is (<lg-page> - <lg-tiny-min>) by default. This increases the limit of slab sizes specified by "slab_sizes" in malloc_conf. This should never be less than the default value. The max value of this option is related to LG_BITMAP_MAXBITS (see more in bitmap.h). For example, on a 4k page size system, if we: 1) configure jemalloc with with --with-lg-slab-maxregs=12. 2) export MALLOC_CONF="slab_sizes:9-16:4" The slab size of 16 bytes is set to 4 pages. Previously, the default lg-slab-maxregs is 9 (i.e. 12 - 3). The max slab size of 16 bytes is 2 pages (i.e. (1<<9) * 16 bytes). By increasing the value from 9 to 12, the max slab size can be set by MALLOC_CONF is 16 pages (i.e. (1<<12) * 16 bytes).	2020-09-16 13:58:38 -07:00
David Goldblatt	eaed1e39be	Add sized-delete size-checking functionality. The existing checks are good at finding such issues (on tcache flush), but not so good at pinpointing them. Debug mode can find them, but sometimes debug mode slows down a program so much that hard-to-hit bugs can take a long time to crash. This commit adds functionality to keep programs mostly on their fast paths, while also checking every sized delete argument they get.	2020-08-05 19:34:05 -07:00
David Carlier	00f06c9beb	enabling mpss on solaris/illumos. reusing slighty linux configuration as possible, aligning the address range to HUGEPAGE.	2020-07-06 09:59:10 -07:00
Jon Haslam	4aea743279	High Resolution Timestamps for Profiling	2020-06-15 12:12:49 -07:00
David Goldblatt	f4d24f05e1	Move extra size checks behind a config flag. This will let us turn that flag into a generic "turn on runtime checks" flag that guards other functionality we have planned.	2019-04-15 16:48:12 -07:00
Qi Wang	f6c30cbafa	Remove some unused comments.	2019-03-14 17:34:55 -07:00
Qi Wang	06f0850427	Detect if 8-bit atomics are available. In some rare cases (older compiler, e.g. gcc 4.2 w/ MIPS), 8-bit atomics might be unavailable. Detect such cases so that we can workaround.	2019-03-09 12:52:06 -08:00
Jason Evans	775fe302a7	Remove JE_FORCE_SYNC_COMPARE_AND_SWAP_[48]. These macros have been unused since `d4ac7582f3` (Introduce a backport of C11 atomics).	2019-02-22 14:22:16 -08:00
Qi Wang	e13400c919	Sanity check szind on tcache flush. This adds some overhead to the tcache flush path (which is one of the popular paths). Guard it behind a config option.	2019-02-01 12:31:34 -08:00
Qi Wang	43f3b1ad0c	Deprecate OSSpinLock.	2018-11-14 08:44:05 -08:00
Dave Watson	13c237c7ef	Add a fastpath for arena_slab_reg_alloc_batch Also adds a configure.ac check for __builtin_popcount, which is used in the new fastpath.	2018-11-14 07:09:11 -08:00
gnzlbg	08260a6b94	Add experimental API: smallocx_return_t smallocx(size, flags) --- Motivation: This new experimental memory-allocaction API returns a pointer to the allocation as well as the usable size of the allocated memory region. The `s` in `smallocx` stands for `sized`-`mallocx`, attempting to convey that this API returns the size of the allocated memory region. It should allow C++ P0901r0 [0] and Rust Alloc::alloc_excess to make use of it. The main purpose of these APIs is to improve telemetry. It is more accurate to register `smallocx(size, flags)` than `smallocx(nallocx(size), flags)`, for example. The latter will always line up perfectly with the existing size classes, causing a loss of telemetry information about the internal fragmentation induced by potentially poor size-classes choices. Instrumenting `nallocx` does not help much since user code can cache its result and use it repeatedly. --- Implementation: The implementation adds a new `usize` option to `static_opts_s` and an `usize` variable to `dynamic_opts_s`. These are then used to cache the result of `sz_index2size` and similar functions in the code paths in which they are unconditionally invoked. In the code-paths in which these functions are not unconditionally invoked, `smallocx` calls, as opposed to `mallocx`, these functions explicitly. --- [0]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0901r0.html	2018-10-17 07:12:28 -07:00
David Goldblatt	e8ec9528ab	Allow the use of readlinkat over readlink. This can be useful in situations where readlink is disallowed.	2018-08-03 14:04:32 -07:00
Christopher Ferris	f78d4ca3fb	Modify configure to determine return value of strerror_r. On glibc and Android's bionic, strerror_r returns char* when _GNU_SOURCE is defined. Add a configure check for this rather than assume glibc is the only libc that behaves this way.	2018-01-10 21:01:18 -08:00
David Goldblatt	ccd09050aa	Add configure-time detection for madvise(..., MADV_DO[NT]DUMP)	2017-10-16 15:35:49 -07:00
Qi Wang	31ab38be5f	Define MADV_FREE on our own when needed. On x86 Linux, we define our own MADV_FREE if madvise(2) is available, but no MADV_FREE is detected. This allows the feature to be built in and enabled with runtime detection.	2017-10-11 15:49:22 -07:00
David Goldblatt	1245faae90	Power: disable the CPU_SPINWAIT macro. Quoting from https://github.com/jemalloc/jemalloc/issues/761 : [...] reading the Power ISA documentation[1], the assembly in [the CPU_SPINWAIT macro] isn't correct anyway (as @marxin points out): the setting of the program-priority register is "sticky", and we never undo the lowering. We could do something similar, but given that we don't have testing here in the first place, I'm inclined to simply not try. I'll put something up reverting the problematic commit tomorrow. [1] Book II, chapter 3 of the 2.07B or 3.0B ISA documents.	2017-10-04 18:37:23 -07:00
Qi Wang	8fdd9a5797	Implement opt.metadata_thp This option enables transparent huge page for base allocators (require MADV_HUGEPAGE support).	2017-08-11 14:51:20 -07:00
David T. Goldblatt	9761b449c8	Add a logging facility. This sets up a hierarchical logging facility, so that we can add logging statements liberally, and turn them on in a fine-grained manner.	2017-07-20 17:58:37 -07:00
Qi Wang	a3f4977217	Add thread name for background threads.	2017-06-23 10:54:54 -07:00
Jason Evans	13685ab1b7	Normalize background thread configuration. Also fix a compilation error #ifndef JEMALLOC_PTHREAD_CREATE_WRAPPER.	2017-06-08 23:01:26 -07:00
Jason Evans	c606a87d2a	Add the --disable-thp option to support cross compiling. This resolves #669.	2017-05-30 11:30:54 -07:00
Qi Wang	b693c7868e	Implementing opt.background_thread. Added opt.background_thread to enable background threads, which handles purging currently. When enabled, decay ticks will not trigger purging (which will be left to the background threads). We limit the max number of threads to NCPUs. When percpu arena is enabled, set CPU affinity for the background threads as well. The sleep interval of background threads is dynamic and determined by computing number of pages to purge in the future (based on backlog).	2017-05-23 12:26:20 -07:00
Jason Evans	909f0482e4	Automatically generate private symbol name mangling macros. Rather than using a manually maintained list of internal symbols to drive name mangling, add a compilation phase to automatically extract the list of internal symbols. This resolves #677.	2017-05-11 23:06:54 -07:00
Jason Evans	b9ab04a191	Refactor !opt.munmap to opt.retain.	2017-04-29 09:24:12 -07:00
Jason Evans	c67c3e4a63	Replace --disable-munmap with opt.munmap. Control use of munmap(2) via a run-time option rather than a compile-time option (with the same per platform default). The old behavior of --disable-munmap can be achieved with --with-malloc-conf=munmap:false. This partially resolves #580.	2017-04-24 20:37:16 -07:00
Jason Evans	e2cc6280ed	Remove --enable-code-coverage. This option hasn't been particularly useful since the original pre-3.0.0 push to broaden test coverage. This partially resolves #580.	2017-04-24 16:33:04 -07:00
Jason Evans	0f63396b23	Remove --disable-cc-silence. The explicit compiler warning suppression controlled by this option is universally desirable, so remove the ability to disable suppression. This partially resolves #580.	2017-04-24 15:02:45 -07:00
Jason Evans	af76f0e5d2	Remove --with-lg-tiny-min. This option isn't useful in practice. This partially resolves #580.	2017-04-24 11:48:28 -07:00
David Goldblatt	425253e2cd	Enable -Wundef, when supported. This can catch bugs in which one header defines a numeric constant, and another uses it without including the defining header. Undefined preprocessor symbols expand to '0', so that this will compile fine, silently doing the math wrong.	2017-04-21 17:03:56 -07:00
Jason Evans	3823effe12	Remove --enable-ivsalloc. Continue to use ivsalloc() when --enable-debug is specified (and add assertions to guard against 0 size), but stop providing a documented explicit semantics-changing band-aid to dodge undefined behavior in sallocx() and malloc_usable_size(). ivsalloc() remains compiled in, unlike when #211 restored --enable-ivsalloc, and if JEMALLOC_FORCE_IVSALLOC is defined during compilation, sallocx() and malloc_usable_size() will still use ivsalloc(). This partially resolves #580.	2017-04-21 14:34:35 -07:00
Jason Evans	4403c9ab44	Remove --disable-tcache. Simplify configuration by removing the --disable-tcache option, but replace the testing for that configuration with --with-malloc-conf=tcache:false. Fix the thread.arena and thread.tcache.flush mallctls to work correctly if tcache is disabled. This partially resolves #580.	2017-04-21 10:06:12 -07:00
Jason Evans	7cbcd2e2b7	Fix pages_purge_forced() to discard pages on non-Linux systems. madvise(..., MADV_DONTNEED) only causes demand-zeroing on Linux, so fall back to overlaying a new mapping.	2017-03-13 18:19:57 -07:00
Qi Wang	ec532e2c5c	Implement per-CPU arena. The new feature, opt.percpu_arena, determines thread-arena association dynamically based CPU id. Three modes are supported: "percpu", "phycpu" and disabled. "percpu" uses the current core id (with help from sched_getcpu()) directly as the arena index, while "phycpu" will assign threads on the same physical CPU to the same arena. In other words, "percpu" means # of arenas == # of CPUs, while "phycpu" has # of arenas == 1/2 * (# of CPUs). Note that no runtime check on whether hyper threading is enabled is added yet. When enabled, threads will be migrated between arenas when a CPU change is detected. In the current design, to reduce overhead from reading CPU id, each arena tracks the thread accessed most recently. When a new thread comes in, we will read CPU id and update arena if necessary.	2017-03-08 23:19:01 -08:00
David Goldblatt	d4ac7582f3	Introduce a backport of C11 atomics This introduces a backport of C11 atomics. It has four implementations; ranked in order of preference, they are: - GCC/Clang __atomic builtins - GCC/Clang __sync builtins - MSVC _Interlocked builtins - C11 atomics, from <stdatomic.h> The primary advantages are: - Close adherence to the standard API gives us a defined memory model. - Type safety: atomic objects are now separate types from non-atomic ones, so that it's impossible to mix up atomic and non-atomic updates (which is undefined behavior that compilers are starting to take advantage of). - Efficiency: we can specify ordering for operations, avoiding fences and atomic operations on strongly ordered architectures (example: `atomic_write_u32(ptr, val);` involves a CAS loop, whereas `atomic_store(ptr, val, ATOMIC_RELEASE);` is a plain store. This diff leaves in the current atomics API (implementing them in terms of the backport). This lets us transition uses over piecemeal. Testing: This is by nature hard to test. I've manually tested the first three options on Linux on gcc by futzing with the #defines manually, on freebsd with gcc and clang, on MSVC, and on OS X with clang. All of these were x86 machines though, and we don't have any test infrastructure set up for non-x86 platforms.	2017-03-03 13:40:59 -08:00
Jason Evans	f5cf9b19c8	Determine rtree levels at compile time. Rather than dynamically building a table to aid per level computations, define a constant table at compile time. Omit both high and low insignificant bits. Use one to three tree levels, depending on the number of significant bits.	2017-02-08 18:50:03 -08:00
Jason Evans	c0cc5db871	Replace tabs following #define with spaces. This resolves #564.	2017-01-20 21:45:53 -08:00
Mike Hommey	0f7376eb62	Don't rely on OSX SDK malloc/malloc.h for malloc_zone struct definitions The SDK jemalloc is built against might be not be the latest for various reasons, but the resulting binary ought to work on newer versions of OSX. In order to ensure this, we need the fullest definitions possible, so copy what we need from the latest version of malloc/malloc.h available on opensource.apple.com.	2017-01-17 20:13:28 -08:00
Jason Evans	c1baa0a9b7	Add huge page configuration and pages_[no}huge(). Add the --with-lg-hugepage configure option, but automatically configure LG_HUGEPAGE even if it isn't specified. Add the pages_[no]huge() functions, which toggle huge page state via madvise(..., MADV_[NO]HUGEPAGE) calls.	2016-12-26 17:59:34 -08:00
Jason Evans	acb7b1f53e	Add --disable-syscall. This resolves #517.	2016-12-03 16:50:58 -08:00

1 2

90 Commits