project-base/server-skynet-source-3rd-jemalloc

Author	SHA1	Message	Date
Qi Wang	d131331310	Avoid eager purging on the dedicated oversize arena when using bg thds. We have observed new workload patterns (namely ML training type) that cycle through oversized allocations frequently, because 1) the dataset might be sparse which is faster to go through, and 2) GPU accelerated. As a result, the eager purging from the oversize arena becomes a bottleneck. To offer an easy solution, allow normal purging of the oversized extents when background threads are enabled.	2023-06-27 11:57:41 -07:00
Qi Wang	d4a2b8bab1	Add the prof_sys_thread_name feature in the prof_recent unit test. This tests the combination of the prof_recent and thread_name features. Verified that it catches the issue being fixed in this PR. Also explicitly set thread name in test/unit/prof_recent. This fixes the name testing when no default thread name is set (e.g. FreeBSD).	2023-05-11 09:10:57 -07:00
Kevin Svetlitski	70344a2d38	Make eligible functions `static`. The codebase is already very disciplined about making `static` any function that can be, but a few appear to have slipped through the cracks.	2023-05-08 15:00:02 -07:00
Qi Wang	434a68e221	Disallow decay during reentrancy. Decay should not be triggered during reentrant calls (may cause lock order reversal / deadlocks). Added a delay_trigger flag to the tickers to bypass decay when reentrancy_level is not zero.	2023-04-05 10:16:37 -07:00
Qi Wang	ce0b7ab6c8	Inline the storage for thread name in prof_tdata_t. The previous approach managed the thread name in a separate buffer, which causes races because the thread name update (triggered by new samples) can happen at the same time as prof dumping (which reads the thread names) -- these two operations are under separate locks to avoid blocking each other. Implemented the thread name storage as part of the tdata struct, which resolves the lifetime issue and also avoids internal alloc / dalloc during prof_sample.	2023-04-05 10:03:12 -07:00
Qi Wang	6cab460a45	Add a multithreaded test for prof_sys_thread_name. Verified that this catches the issue being fixed in 5fd5583.	2023-04-05 10:03:12 -07:00
Qi Wang	8b64be3441	Explicit arena assignment in test_tcache_max. Otherwise the associated arena could change with percpu arena enabled.	2023-03-22 15:16:43 -07:00
Qi Wang	8e7353a19b	Explicit arena assignment in test_thread_idle. Otherwise the associated arena could change with percpu arena enabled.	2023-03-22 15:16:43 -07:00
Qi Wang	71bc1a3d91	Avoid assuming the arena id in test when percpu_arena is used.	2023-03-13 10:50:10 -07:00
Qi Wang	97b313c7d4	More conservative setting for /test/unit/background_thread_enable. Lower the thread and arena count to avoid resource exhaustion on 32-bit.	2023-02-16 14:42:21 -08:00
Qi Wang	8580c65f81	Implement prof sample hooks "experimental.hooks.prof_sample(_free)". The added hooks hooks.prof_sample and hooks.prof_sample_free are intended to allow advanced users to track additional information, to enable new ways of profiling on top of the jemalloc heap profile and sample features. The sample hook is invoked after the allocation and backtracing, and forwards both the allocation and backtrace to the user hook; the sample_free hook happens before the actual deallocation, and forwards only the ptr and usz to the hook.	2022-12-07 16:06:49 -08:00
Qi Wang	143e9c4a2f	Enable fast thread locals for dealloc-only threads. Previously if a thread did only deallocations, it stayed on the slow path / minimal initialized state forever. However, dealloc-only is a valid pattern for dedicated reclamation threads -- this means thread cache is disabled (no batched flush) for them, which causes high overhead and contention. Added the condition to fully initialize TSD when a fair amount of dealloc activities are observed.	2022-10-25 09:54:38 -07:00
Guangli Dai	ba19d2cb78	Add arena-level name. An arena-level name can help identify manual arenas.	2022-09-16 15:04:59 -07:00
Guangli Dai	a0734fd6ee	Making jemalloc max stack depth a runtime option	2022-09-12 13:56:22 -07:00
Guangli Dai	42daa1ac44	Add double free detection using slab bitmap for debug build. Add a sanity check for double frees in the arena in case the tcache has been flushed.	2022-09-06 12:54:21 -07:00
Ivan Zaitsev	36366f3c4c	Add double free detection in thread cache for debug build. Add new runtime option `debug_double_free_max_scan` that specifies the max number of stack entries to scan in the cache bin when trying to detect the double free bug (currently debug build only).	2022-08-04 16:58:22 -07:00
Alex Lapenkou	5b1f2cc5d7	Implement pvalloc replacement. Despite being an obsolete function, pvalloc is still present in GLIBC and should work correctly when jemalloc replaces the libc allocator.	2022-05-18 17:01:09 -07:00
Qi Wang	66c889500a	Make test/unit/background_thread_enable more conservative. To avoid resource exhaustion on 32-bit platforms.	2022-05-04 15:32:57 -07:00
cuishuang	9a242f16d9	fix some typos Signed-off-by: cuishuang <imcusg@gmail.com>	2022-04-25 11:29:00 -07:00
Qi Wang	0e29ad4efa	Rename zero_realloc option "strict" to "alloc". With realloc(ptr, 0) being UB per C23, the option name "strict" makes less sense now. Rename to "alloc" which describes the behavior.	2022-04-20 10:27:25 -07:00
Charles	eaaa368bab	Add comments and use meaningful vars in sz_psz2ind.	2022-03-24 16:56:59 -07:00
Alex Lapenkou	52631c90f6	Fix size class calculation for sec. Due to a bug in sec initialization, the number of cached size classes was equal to 198. The bug caused the creation of more than a hundred unused bins, although it didn't affect the caching logic.	2022-03-22 17:45:55 -07:00
Qi Wang	20f9802e4f	Avoid overflow warnings in test/unit/safety_check.	2022-01-27 10:29:54 -08:00
yunxu	b798fabdf7	Add prof_leak_error option. The option makes the process exit with error code 1 if a memory leak is detected. This is useful for implementing automated tools that rely on leak detection.	2022-01-21 16:24:20 -08:00
Qi Wang	648b3b9f76	Lower the num_threads in the stress test of test/unit/prof_recent. This takes a fair amount of resources. Under high concurrency it was causing resource exhaustion such as pthread_create and mmap failures.	2022-01-11 16:58:56 -08:00
Qi Wang	6230cc88b6	Add background thread sleep retry in test/unit/hpa_background_thread. Under high concurrency / heavy test load (e.g. using run_tests.sh), the background thread may not get scheduled for a longer period of time. Retry 100 times max before bailing out.	2022-01-07 10:28:28 -08:00
Qi Wang	d660683d3d	Fix test config of lg_san_uaf_align. The option may be configure-disabled, which resulted in the invalid options output from the tests.	2022-01-04 11:03:51 -08:00
Qi Wang	dfdd7562f5	Rename san_enabled() to san_guard_enabled().	2021-12-29 14:44:43 -08:00
Qi Wang	e491cef9ab	Add stats for stashed bytes in tcache.	2021-12-29 14:44:43 -08:00
Qi Wang	b75822bc6e	Implement use-after-free detection using junk and stash. On deallocation, sampled pointers (specially aligned) get junked and stashed into tcache (to prevent immediate reuse). The expected behavior is to have read-after-free corrupted and stopped by the junk-filling, while write-after-free is checked when flushing the stashed pointers.	2021-12-29 14:44:43 -08:00
Qi Wang	d038160f3b	Fix shadowed variable usage. Verified with EXTRA_CFLAGS=-Wshadow.	2021-12-23 10:55:08 -08:00
Qi Wang	bd70d8fc0f	Make the profiling settings for tests explicit. Many profiling related tests make assumptions on the profiling settings, e.g. opt_prof is off by default, and prof_active is default on when opt_prof is on. However the default settings can be changed via --with-malloc-conf at build time. Fixing the tests by adding the assumed settings explicitly.	2021-12-22 20:10:28 -08:00
Qi Wang	837b37c4ce	Fix the time-since computation in HPA. nstime module guarantees monotonic clock update within a single nstime_t. This means, if two separate nstime_t variables are read and updated separately, nstime_subtract between them may result in underflow. Fixed by switching to the time since utility provided by nstime.	2021-12-21 23:37:22 -08:00
Qi Wang	310af725b0	Add nstime_ns_since which obtains the duration since the input time.	2021-12-21 23:37:22 -08:00
mweisgut	bb5052ce90	Fix base_ehooks_get_for_metadata	2021-12-20 15:37:53 -08:00
Alex Lapenkou	800ce49c19	San: Bump alloc frequently reused guarded allocations. To utilize a separate retained area for guarded extents, use bump alloc to allocate those extents.	2021-12-15 10:39:17 -08:00
Alex Lapenkou	f56f5b9930	Pass 'frequent_reuse' hint to PAI. Currently used only for guarding purposes, the hint is used to determine if the allocation is supposed to be frequently reused. For example, it might urge the allocator to ensure the allocation is cached.	2021-12-15 10:39:17 -08:00
Alex Lapenkou	2c70e8d351	Rename 'arena_decay' to 'arena_util'. While this file initially contained helper functions for one particular test, its usage has since spread across different test files. Its purpose has shifted towards a collection of handy arena ctl wrappers.	2021-12-15 10:39:17 -08:00
Alex Lapenkou	0f6da1257d	San: Implement bump alloc. The new allocator will be used to allocate guarded extents used as slabs for guarded small allocations.	2021-12-15 10:39:17 -08:00
Alex Lapenkou	34b00f8969	San: Avoid running san tests with prof enabled. With prof enabled, the number of page-aligned allocations doesn't match the number of slab "ends" because prof allocations skew the addresses. This leads to 'pages' array overflow and hard-to-debug failures.	2021-12-15 10:39:17 -08:00
Alex Lapenkou	62f9c54d2a	San: Rename 'guard' to 'san'. This prepares the foundation for more sanitizer-related work in the future.	2021-12-15 10:39:17 -08:00
Qi Wang	400c59895a	Fix uninitialized nstime reading / updating on the stack in hpa. In order for nstime_update to handle non-monotonic clocks, it requires the input nstime to be initialized -- when reading for the first time, zero init has to be done. Otherwise random stack value may be seen as clocks and returned.	2021-11-16 16:54:12 -08:00
Qi Wang	4d56aaeca5	Optimize away the tsd_fast() check on free fastpath. To ensure that the free fastpath can tolerate uninitialized tsd, improved the static initializer for rtree_ctx in tsd.	2021-10-28 10:05:59 -07:00
Alex Lapenkou	8daac7958f	Redefine functions with test hooks only for tests. The Android build has issues with these defines; this allows the build to succeed if it doesn't need to build the tests.	2021-10-15 15:25:36 -07:00
Alex Lapenkou	c9ebff0fd6	Initialize deferred_work_generated. As the code evolves, some code paths that have previously assigned deferred_work_generated may cease being reached. This would leave the value uninitialized. This change initializes the value for safety.	2021-10-07 11:50:38 -07:00
David CARLIER	cf9724531a	Darwin malloc_size override support proposal. Darwin has an API similar to Linux/FreeBSD's malloc_usable_size.	2021-10-01 14:32:40 -07:00
Qi Wang	83f3294027	Small refactors around 7bb05e0.	2021-09-27 16:05:13 -07:00
Qi Wang	deb8e62a83	Implement guard pages. Adding guarded extents, which are regular extents surrounded by guard pages (mprotected). To reduce syscalls, small guarded extents are cached as a separate eset in ecache, and decay through the dirty / muzzy / retained pipeline as usual.	2021-09-26 16:30:15 -07:00
Piotr Balcer	7bb05e04be	Add experimental.arenas_create_ext mallctl. This mallctl accepts an arena_config_t structure which can be used to customize the behavior of the arena. Right now it contains extent_hooks and a new option, metadata_use_hooks, which controls whether the extent hooks are also used for metadata allocation. The metadata_use_hooks option has two main use cases: 1. In heterogeneous memory systems, to avoid metadata being placed on potentially slower memory. 2. Avoiding virtual memory from being leaked as a result of metadata allocation failure originating in an extent hook.	2021-09-24 13:43:18 -07:00
Alex Lapenkou	a9031a0970	Allow setting a dump hook. If users want to be notified when a heap dump occurs, they can set this hook.	2021-09-22 15:04:01 -07:00

607 Commits