server-skynet-source-3rd-jemalloc


Author	SHA1	Message	Date
Amaury Séchet	f743690739	Remove unused mutex from hpa_central	2023-03-10 11:25:47 -08:00
Qi Wang	c7805f1eb5	Add a header in HPA stats for the nonfull slabs.	2023-02-17 13:31:27 -08:00
Qi Wang	b6125120ac	Add an explicit name to the dedicated oversize arena.	2023-02-17 13:31:09 -08:00
Qi Wang	5fd55837bb	Fix thread_name updating for heap profiling. The current thread name reading path updates the name every time, which requires both alloc and dalloc -- and the temporary NULL value in the middle causes races where the prof dump read path gets NULLed in the middle. Minimize the changes in this commit to isolate the bugfix testing; will also refactor the whole thread name paths later.	2023-02-15 17:49:40 -08:00
Qi Wang	8580c65f81	Implement prof sample hooks "experimental.hooks.prof_sample(_free)". The added hooks hooks.prof_sample and hooks.prof_sample_free are intended to allow advanced users to track additional information, to enable new ways of profiling on top of the jemalloc heap profile and sample features. The sample hook is invoked after the allocation and backtracing, and forwards both the allocation and backtrace to the user hook; the sample_free hook happens before the actual deallocation, and forwards only the ptr and usz to the hook.	2022-12-07 16:06:49 -08:00
Guangli Dai	e8f9f13811	Inline free and sdallocx into operator delete	2022-11-21 11:14:05 -08:00
Qi Wang	481bbfc990	Add a configure option --enable-force-getenv. Allows the use of getenv() rather than secure_getenv() to read MALLOC_CONF. This helps in situations where hosts are under full control, and setting MALLOC_CONF is needed while also setuid. Disabled by default.	2022-11-04 13:37:14 -07:00
Qi Wang	143e9c4a2f	Enable fast thread locals for dealloc-only threads. Previously if a thread does only deallocations, it stays on the slow path / minimal initialized state forever. However, dealloc-only is a valid pattern for dedicated reclamation threads -- this means thread cache is disabled (no batched flush) for them, which causes high overhead and contention. Added the condition to fully initialize TSD when a fair amount of dealloc activities are observed.	2022-10-25 09:54:38 -07:00
David Carlier	4c95c953e2	fix build for non linux/BSD platforms.	2022-10-03 10:42:09 -07:00
Guangli Dai	ba19d2cb78	Add arena-level name. An arena-level name can help identify manual arenas.	2022-09-16 15:04:59 -07:00
Guangli Dai	a0734fd6ee	Making jemalloc max stack depth a runtime option	2022-09-12 13:56:22 -07:00
Abael He	56ddbea270	error: implicit declaration of function 'pthread_create_fptr_init' is invalid in C99 ./autogen.sh \ && ./configure --prefix=/usr/local --enable-static --enable-autogen --enable-xmalloc --with-static-libunwind=/usr/local/lib/libunwind.a --enable-lazy-lock --with-jemalloc-prefix='' \ && make -j16 ... gcc -std=gnu11 -Werror=unknown-warning-option -Wall -Wextra -Wshorten-64-to-32 -Wsign-compare -Wundef -Wno-format-zero-length -Wpointer-arith -Wno-missing-braces -Wno-missing-field-initializers -pipe -g3 -Wimplicit-fallthrough -O3 -funroll-loops -fPIC -DPIC -c -D_REENTRANT -Iinclude -Iinclude -DJEMALLOC_NO_PRIVATE_NAMESPACE -o src/edata_cache.sym.o src/edata_cache.c src/background_thread.c:768:6: error: implicit declaration of function 'pthread_create_fptr_init' is invalid in C99 [-Werror,-Wimplicit-function-declaration] pthread_create_fptr_init()) { ^ src/background_thread.c:768:6: note: did you mean 'pthread_create_wrapper_init'? src/background_thread.c:34:1: note: 'pthread_create_wrapper_init' declared here pthread_create_wrapper_init(void) { ^ 1 error generated. make: *** [src/background_thread.sym.o] Error 1 make: *** Waiting for unfinished jobs....	2022-09-07 11:56:41 -07:00
Guangli Dai	ce29b4c3d9	Refactor the remote / cross thread cache bin stats reading Refactored cache_bin.h so that only one function is racy.	2022-09-06 19:41:19 -07:00
Ivan Zaitsev	36366f3c4c	Add double free detection in thread cache for debug build Add new runtime option `debug_double_free_max_scan` that specifies the max number of stack entries to scan in the cache bin when trying to detect the double free bug (currently debug build only).	2022-08-04 16:58:22 -07:00
David Carlier	58478412be	OpenBSD build fix. still no cpu affinity. - enabling pthread_get/pthread_set_name_np api. - disabling per thread cpu affinity handling, unsupported on this platform.	2022-07-19 13:20:11 -07:00
Qi Wang	a1c7d9c046	Add the missing opt.cache_oblivious handling.	2022-07-14 22:41:27 -07:00
Azat Khuzhin	cb578bbe01	Fix possible "nmalloc >= ndalloc" assertion In arena_stats_merge() nmalloc was read first, and ndalloc after it. However, with this order it is possible for some thread to increment ndalloc in between, so that nmalloc < ndalloc and the assertion fails, as found again by ClickHouse CI [1] (even after #2234). [1]: https://github.com/ClickHouse/ClickHouse/issues/31531 Swap the order to avoid the possible assertion failure. Cc: @interwq Follow-up for: #2234	2022-07-11 15:27:51 -07:00
David Carlier	4fc5c4fbac	New configure option '--enable-pageid' for Linux The option makes jemalloc use prctl with PR_SET_VMA to tag memory mappings with "jemalloc_pg" or "jemalloc_pg_overcommit". This makes it easy to identify jemalloc's mappings in /proc/<pid>/maps. PR_SET_VMA is only available in Linux 5.17 and above.	2022-06-09 18:54:08 -07:00
Alex Lapenkou	5b1f2cc5d7	Implement pvalloc replacement Despite being an obsolete function, pvalloc is still present in GLIBC and should work correctly when jemalloc replaces libc allocator.	2022-05-18 17:01:09 -07:00
Qi Wang	cd5aaf308a	Improve the failure message upon opt_experimental_infallible_new.	2022-05-17 16:07:40 -07:00
Qi Wang	8cb814629a	Make the default option of zero realloc match the system allocator.	2022-05-05 17:11:18 -07:00
Qi Wang	391bad4b95	Avoid abort() in test/integration/cpp/infallible_new_true. Allow setting the safety check abort hook through mallctl, which avoids abort() and core dumps.	2022-04-25 11:29:32 -07:00
cuishuang	9a242f16d9	fix some typos Signed-off-by: cuishuang <imcusg@gmail.com>	2022-04-25 11:29:00 -07:00
Qi Wang	0e29ad4efa	Rename zero_realloc option "strict" to "alloc". With realloc(ptr, 0) being UB per C23, the option name "strict" makes less sense now. Rename to "alloc" which describes the behavior.	2022-04-20 10:27:25 -07:00
Alex Lapenkou	a93931537e	Do not disable SEC by default for 64k pages platforms Default SEC max_alloc option value was 32k, disabling SEC for platforms with lg-page=16. This change enables SEC for all platforms, making minimum max_alloc value equal to PAGE.	2022-03-24 22:05:35 -07:00
Charles	eaaa368bab	Add comments and use meaningful vars in sz_psz2ind.	2022-03-24 16:56:59 -07:00
Alex Lapenkou	5bf03f8ce5	Implement PAGE_FLOOR macro	2022-03-22 17:45:55 -07:00
Alex Lapenkou	52631c90f6	Fix size class calculation for sec Due to a bug in sec initialization, the number of cached size classes was equal to 198. The bug caused the creation of more than a hundred unused bins, although it didn't affect the caching logic.	2022-03-22 17:45:55 -07:00
Alex Lapenkov	eb65d1b078	Fix FreeBSD system jemalloc TSD cleanup Before this commit, in case FreeBSD libc jemalloc was overridden by another jemalloc, the proper thread shutdown callback was invoked only for the overriding jemalloc. A call to _malloc_thread_cleanup from libthr would be redirected to user jemalloc, leaving data about dead threads hanging in system jemalloc. This change tackles the issue in two ways. First, for current and old system jemallocs, which we can not modify, the overriding jemalloc would locate and invoke the system cleanup routine. For upcoming jemalloc integrations, the cleanup registering function will also be redirected to user jemalloc, which means that system jemalloc's cleanup routine will be registered in user's jemalloc and a single call to _malloc_thread_cleanup will be sufficient to invoke both callbacks.	2022-03-02 10:10:27 -08:00
Azat Khuzhin	78b58379c8	Fix possible "nmalloc >= ndalloc" assertion. It is possible that ndalloc will be updated before nmalloc in arena_large_ralloc_stats_update(); fix this by reordering those calls. It was found by ClickHouse CI, which periodically hits this assertion [1]. [1]: https://github.com/ClickHouse/ClickHouse/issues/31531 That issue contains lots of examples, with core dump and some gdb output [2]. [2]: https://s3.amazonaws.com/clickhouse-test-reports/34951/96390a9263cb5af3d6e42a84988239c9ae87ce32/stress_test__debug__actions_.html Here you can find binaries for that particular report [3]; you need the clickhouse debug build [4]. [3]: https://s3.amazonaws.com/clickhouse-builds/34951/96390a9263cb5af3d6e42a84988239c9ae87ce32/clickhouse_build_check_(actions)/report.html [4]: https://s3.amazonaws.com/clickhouse-builds/34951/96390a9263cb5af3d6e42a84988239c9ae87ce32/package_debug/clickhouse Brief info from that report: 2 0x000000002ad6dbfe in arena_stats_merge (tsdn=0x7f2399abdd20, arena=0x7f241ce01080, nthreads=0x7f24e4360958, dss=0x7f24e4360960, dirty_decay_ms=0x7f24e4360968, muzzy_decay_ms=0x7f24e4360970, nactive=0x7f24e4360978, ndirty=0x7f24e43 e4360988, astats=0x7f24e4360998, bstats=0x7f24e4363310, lstats=0x7f24e4364990, estats=0x7f24e4366e50, hpastats=0x7f24e43693a0, secstats=0x7f24e436a020) at ../contrib/jemalloc/src/arena.c:138 ndalloc = 226 nflush = 0 curlextents = 0 nmalloc = 225 nrequests = 0 Here you can see that they differ only by 1. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-03-01 12:28:28 -08:00
Alex Lapenkou	ca709c3139	Fix failed assertion due to racy memory access While calculating the number of stashed pointers, multiple variables potentially modified by a concurrent thread were used for the calculation. This led to some inconsistencies, correctly detected by the assertions. The change eliminates some possible inconsistencies by using unmodified variables and reading the concurrently modified one only once. The assertions are also omitted for the cases where we acknowledge potential inconsistencies.	2022-02-17 09:35:52 -08:00
Qi Wang	8c59c44ffa	Add a dependency checking step at the end of malloc_conf_init. Currently only prof_leak_error and prof_final are checked.	2022-01-26 17:17:48 -08:00
Qi Wang	efc539c040	Initialize prof_leak during prof init. Otherwise, prof_leak may get set after prof_leak_error, and disagree with each other.	2022-01-26 17:17:48 -08:00
yunxu	b798fabdf7	Add prof_leak_error option The option makes the process exit with error code 1 if a memory leak is detected. This is useful for implementing automated tools that rely on leak detection.	2022-01-21 16:24:20 -08:00
Charles	eb196815d6	Avoid calculating size of size class twice & delete sc_data_global.	2022-01-18 11:54:12 -08:00
Qi Wang	ddb170b1d9	Simplify arena_migrate() to take arena_t* instead of indices. This makes debugging slightly easier and avoids the confusion of "should we create new arenas" here.	2022-01-11 16:59:22 -08:00
Qi Wang	d66162e032	Fix the extent state checking on the merge error path. With DSS as primary, the default merge impl will (correctly) decline to merge when one of the extents is non-dss. The error path should tolerate the not-merged extent being in a merging state.	2022-01-11 16:58:47 -08:00
Qi Wang	61978bbe69	Purge all if the last thread migrated away from an arena.	2022-01-06 19:02:26 -08:00
Yuriy Chernyshov	c91e62dd37	#include <features.h> as requested	2022-01-05 18:45:27 -08:00
Yuriy Chernyshov	18510020e7	Fix symbol conflict with musl libc `__libc` prefixed functions are used by musl libc as non-replaceable malloc stubs. Fix this conflict by checking if we are linking against glibc.	2022-01-05 18:45:27 -08:00
Qi Wang	f509703af5	Fix two conversion warnings in tcache.	2022-01-04 13:55:06 -08:00
Qi Wang	8b34a788b5	Fix a used-uninitialized warning (false positive).	2021-12-29 14:44:43 -08:00
Qi Wang	e491cef9ab	Add stats for stashed bytes in tcache.	2021-12-29 14:44:43 -08:00
Qi Wang	b75822bc6e	Implement use-after-free detection using junk and stash. On deallocation, sampled pointers (specially aligned) get junked and stashed into tcache (to prevent immediate reuse). The expected behavior is to have read-after-free corrupted and stopped by the junk-filling, while write-after-free is checked when flushing the stashed pointers.	2021-12-29 14:44:43 -08:00
Qi Wang	06aac61c4b	Split the core logic of tcache flush into a separate function. The core function takes a ptr array as input (containing items to be flushed), which will be reused to flush sanitizer-stashed items.	2021-12-29 14:44:43 -08:00
Qi Wang	d038160f3b	Fix shadowed variable usage. Verified with EXTRA_CFLAGS=-Wshadow.	2021-12-23 10:55:08 -08:00
Qi Wang	60b9637cc0	Only invoke malloc_cpu_count_is_deterministic() when necessary. Also refactor the handling of the non-deterministic case. Notably allow the case with narenas set to proceed w/o warnings, to not affect existing valid use cases.	2021-12-22 13:52:12 -08:00
Qi Wang	837b37c4ce	Fix the time-since computation in HPA. nstime module guarantees monotonic clock update within a single nstime_t. This means, if two separate nstime_t variables are read and updated separately, nstime_subtract between them may result in underflow. Fixed by switching to the time since utility provided by nstime.	2021-12-21 23:37:22 -08:00
Qi Wang	310af725b0	Add nstime_ns_since which obtains the duration since the input time.	2021-12-21 23:37:22 -08:00
Azat Khuzhin	cafe9a3158	Disable percpu arena in case of non deterministic CPU count A deterministic number of CPUs is important for percpu arena to work correctly, since it uses the CPU index from sched_getcpu(); if the index is greater than the number of CPUs, bad things will happen, or an assertion will fail in a debug build: <jemalloc>: ../contrib/jemalloc/src/jemalloc.c:321: Failed assertion: "ind <= narenas_total_get()" Aborted (core dumped) The number of CPUs can be obtained from the following places: - sched_getaffinity() - sysconf(_SC_NPROCESSORS_ONLN) - sysconf(_SC_NPROCESSORS_CONF) For sched_getaffinity(), you may simply use taskset(1) to run the program on a different CPU, and if it is not the first one, percpu will work incorrectly, i.e.: $ taskset --cpu-list $(( $(getconf _NPROCESSORS_ONLN)-1 )) <your_program> _SC_NPROCESSORS_ONLN uses /sys/devices/system/cpu/online, LXD/LXC virtualize the /sys/devices/system/cpu/online file [1], and so when you run a container with limited limits.cpus it will bind a randomly selected CPU to it [1]: https://github.com/lxc/lxcfs/issues/301 _SC_NPROCESSORS_CONF uses /sys/devices/system/cpu/cpu*, and AFAIK nobody is playing with dentries there. So if all three of these are equal, percpu arenas should work correctly. A small note regarding _SC_NPROCESSORS_ONLN/_SC_NPROCESSORS_CONF: musl uses sched_getaffinity() for both, so this will also increase the entropy. Also note that you can check whether percpu arena is really applied using abort_conf:true. Refs: https://github.com/jemalloc/jemalloc/pull/1939 Refs: https://github.com/ClickHouse/ClickHouse/issues/32806 v2: move malloc_cpu_count_is_deterministic() into malloc_init_hard_recursible() since _SC_NPROCESSORS_CONF does allocations for readdir() v3: - mark cpu_count_is_deterministic static - check only if percpu arena is enabled - check narenas	2021-12-21 11:53:09 -08:00


1827 Commits