server-skynet-source-3rd-jemalloc

Author	SHA1	Message	Date
Kevin Svetlitski	6d4aa33753	Extract the calculation of psset heap assignment for an hpdata into a common function This is in preparation for upcoming changes I plan to make to this logic. Extracting it into a common function will make this easier and less error-prone, and cleans up the existing code regardless.	2023-05-31 11:44:04 -07:00
Qi Wang	d577e9b588	Explicitly cast to unsigned for MALLOCX_ARENA and _TCACHE defines.	2023-05-26 11:52:42 -07:00
Qi Wang	a2259f9fa6	Fix the include path of "jemalloc_internal_overrides.h".	2023-05-25 15:22:02 -07:00
Kevin Svetlitski	4e6f1e9208	Allow overriding `LG_PAGE` This is useful for our internal builds where we override the configuration in the header files generated by autoconf.	2023-05-17 13:55:38 -07:00
Kevin Svetlitski	3e2ba7a651	Remove dead stores detected by static analysis None of these are harmful, and they are almost certainly optimized away by the compiler. The motivation for fixing them anyway is that we'd like to enable static analysis as part of CI, and the first step towards that is resolving the warnings it produces at present.	2023-05-11 20:27:49 -07:00
Qi Wang	6ea8a7e928	Add config detection for JEMALLOC_HAVE_PTHREAD_SET_NAME_NP. and use it on the background thread name setting.	2023-05-11 09:10:57 -07:00
Kevin Svetlitski	70344a2d38	Make eligible functions `static` The codebase is already very disciplined in making any function which can be `static`, but there are a few that appear to have slipped through the cracks.	2023-05-08 15:00:02 -07:00
Kevin Svetlitski	6841110bd6	Make `edata_cmp_summary_comp` 30% faster `edata_cmp_summary_comp` is one of the very hottest functions, taking up 3% of all time spent inside Jemalloc. I noticed that all existing callsites rely only on the sign of the value returned by this function, so I came up with this equivalent branchless implementation which preserves this property. After empirical measurement, I have found that this implementation is 30% faster, therefore representing a 1% speed-up to the allocator as a whole. At @interwq's suggestion, I've applied the same optimization to `edata_esnead_comp` in case this function becomes hotter in the future.	2023-05-04 09:59:17 -07:00
Amaury Séchet	f2b28906e6	Some nits in cache_bin.h	2023-05-01 10:21:17 -07:00
guangli-dai	5f64ad60cd	Remove locked flag set in malloc_mutex_trylock As a hint flag of the lock, parameter locked should be set only when the lock is gained or freed.	2023-04-06 10:57:04 -07:00
Qi Wang	434a68e221	Disallow decay during reentrancy. Decay should not be triggered during reentrant calls (may cause lock order reversal / deadlocks). Added a delay_trigger flag to the tickers to bypass decay when reentrancy_level is not zero.	2023-04-05 10:16:37 -07:00
Qi Wang	e62aa478c7	Rearrange the bools in prof_tdata_t to save some bytes. This lowered the sizeof(prof_tdata_t) from 200 to 192 which is a round size class. Afterwards the tdata_t size remains unchanged with the last commit, which effectively inlined the storage of thread names for free.	2023-04-05 10:03:12 -07:00
Qi Wang	ce0b7ab6c8	Inline the storage for thread name in prof_tdata_t. The previous approach managed the thread name in a separate buffer, which causes races because the thread name update (triggered by new samples) can happen at the same time as prof dumping (which reads the thread names) -- these two operations are under separate locks to avoid blocking each other. Implemented the thread name storage as part of the tdata struct, which resolves the lifetime issue and also avoids internal alloc / dalloc during prof_sample.	2023-04-05 10:03:12 -07:00
Amaury Séchet	5266152d79	Simplify the logic in ph_remove	2023-03-31 14:35:31 -07:00
Amaury Séchet	be6da4f663	Do not maintain root->prev in ph_remove.	2023-03-31 14:34:57 -07:00
Amaury Séchet	543e2d61e6	Simplify the logic in ph_insert Also fixes what looks like an off by one error in the lazy aux list merge part of the code that previously never touched the last node in the aux list.	2023-03-31 14:34:24 -07:00
guangli-dai	31e01a98f1	Fix the rdtscp detection bug and add prefix for the macro.	2023-03-23 11:16:19 -07:00
Amaury Séchet	f743690739	Remove unused mutex from hpa_central	2023-03-10 11:25:47 -08:00
guangli-dai	09e4b38fb1	Use asm volatile during benchmarks.	2023-02-24 11:17:48 -08:00
Qi Wang	8580c65f81	Implement prof sample hooks "experimental.hooks.prof_sample(_free)". The added hooks hooks.prof_sample and hooks.prof_sample_free are intended to allow advanced users to track additional information, to enable new ways of profiling on top of the jemalloc heap profile and sample features. The sample hook is invoked after the allocation and backtracing, and forwards both the allocation and backtrace to the user hook; the sample_free hook happens before the actual deallocation, and forwards only the ptr and usz to the hook.	2022-12-07 16:06:49 -08:00
Guangli Dai	e8f9f13811	Inline free and sdallocx into operator delete	2022-11-21 11:14:05 -08:00
Qi Wang	481bbfc990	Add a configure option --enable-force-getenv. Allows the use of getenv() rather than secure_getenv() to read MALLOC_CONF. This helps in situations where hosts are under full control, and setting MALLOC_CONF is needed while also setuid. Disabled by default.	2022-11-04 13:37:14 -07:00
Qi Wang	143e9c4a2f	Enable fast thread locals for dealloc-only threads. Previously if a thread does only deallocations, it stays on the slow path / minimal initialized state forever. However, dealloc-only is a valid pattern for dedicated reclamation threads -- this means thread cache is disabled (no batched flush) for them, which causes high overhead and contention. Added the condition to fully initialize TSD when a fair amount of dealloc activities are observed.	2022-10-25 09:54:38 -07:00
Paul Smith	be65438f20	jemalloc_internal_types.h: Use alloca if __STDC_NO_VLA__ is defined No currently-available version of Visual Studio C compiler supports variable length arrays, even if it defines __STDC_VERSION__ >= C99. As far as I know Microsoft has no plans to ever support VLAs in MSVC. The C11 standard requires that the __STDC_NO_VLA__ macro be defined if the compiler doesn't support VLAs, so fall back to alloca() if so.	2022-10-14 15:48:32 -07:00
divanorama	1897f185d2	Fix safety_check segfault in double free test	2022-10-03 10:55:10 -07:00
David Carlier	4c95c953e2	fix build for non linux/BSD platforms.	2022-10-03 10:42:09 -07:00
Guangli Dai	ba19d2cb78	Add arena-level name. An arena-level name can help identify manual arenas.	2022-09-16 15:04:59 -07:00
Guangli Dai	a0734fd6ee	Making jemalloc max stack depth a runtime option	2022-09-12 13:56:22 -07:00
Guangli Dai	ce29b4c3d9	Refactor the remote / cross thread cache bin stats reading Refactored cache_bin.h so that only one function is racy.	2022-09-06 19:41:19 -07:00
Guangli Dai	42daa1ac44	Add double free detection using slab bitmap for debug build Add a sanity check for double free issue in the arena in case that the tcache has been flushed.	2022-09-06 12:54:21 -07:00
Ivan Zaitsev	36366f3c4c	Add double free detection in thread cache for debug build Add new runtime option `debug_double_free_max_scan` that specifies the max number of stack entries to scan in the cache bin when trying to detect the double free bug (currently debug build only).	2022-08-04 16:58:22 -07:00
David CARLIER	4e12d21c8d	enabled percpu_arena settings on macOS. follow-up on #2280	2022-07-19 13:23:08 -07:00
David Carlier	58478412be	OpenBSD build fix. still no cpu affinity. - enabling pthread_get/pthread_set_name_np api. - disabling per thread cpu affinity handling, unsupported on this platform.	2022-07-19 13:20:11 -07:00
David Carlier	4fc5c4fbac	New configure option '--enable-pageid' for Linux The option makes jemalloc use prctl with PR_SET_VMA to tag memory mappings with "jemalloc_pg" or "jemalloc_pg_overcommit". This allows to easily identify jemalloc's mappings in /proc/<pid>/maps. PR_SET_VMA is only available in Linux 5.17 and above.	2022-06-09 18:54:08 -07:00
David Carlier	df8f7d10af	Implement malloc_getcpu for amd64 and arm64 macOS This enables per CPU arena on MacOS	2022-06-08 15:13:55 -07:00
barracuda156	70e3735f3a	jemalloc: fix PowerPC definitions in quantum.h	2022-05-26 10:51:10 -07:00
Alex Lapenkou	5b1f2cc5d7	Implement pvalloc replacement Despite being an obsolete function, pvalloc is still present in GLIBC and should work correctly when jemalloc replaces libc allocator.	2022-05-18 17:01:09 -07:00
Yuriy Chernyshov	70d4102f48	Fix compiling edata.h with MSVC At the time, an attempt to compile jemalloc 5.3.0 with MSVC 2019 results in the following error message: > jemalloc/include/jemalloc/internal/edata.h:660: error C4576: a parenthesized type followed by an initializer list is a non-standard explicit type conversion syntax	2022-05-09 14:51:07 -07:00
Qi Wang	8cb814629a	Make the default option of zero realloc match the system allocator.	2022-05-05 17:11:18 -07:00
Qi Wang	391bad4b95	Avoid abort() in test/integration/cpp/infallible_new_true. Allow setting the safety check abort hook through mallctl, which avoids abort() and core dumps.	2022-04-25 11:29:32 -07:00
cuishuang	9a242f16d9	fix some typos Signed-off-by: cuishuang <imcusg@gmail.com>	2022-04-25 11:29:00 -07:00
Qi Wang	0e29ad4efa	Rename zero_realloc option "strict" to "alloc". With realloc(ptr, 0) being UB per C23, the option name "strict" makes less sense now. Rename to "alloc" which describes the behavior.	2022-04-20 10:27:25 -07:00
Alex Lapenkou	a93931537e	Do not disable SEC by default for 64k pages platforms Default SEC max_alloc option value was 32k, disabling SEC for platforms with lg-page=16. This change enables SEC for all platforms, making minimum max_alloc value equal to PAGE.	2022-03-24 22:05:35 -07:00
Charles	eaaa368bab	Add comments and use meaningful vars in sz_psz2ind.	2022-03-24 16:56:59 -07:00
Alex Lapenkou	5bf03f8ce5	Implement PAGE_FLOOR macro	2022-03-22 17:45:55 -07:00
Alex Lapenkov	eb65d1b078	Fix FreeBSD system jemalloc TSD cleanup Before this commit, in case FreeBSD libc jemalloc was overridden by another jemalloc, proper thread shutdown callback was invoked only for the overriding jemalloc. A call to _malloc_thread_cleanup from libthr would be redirected to user jemalloc, leaving data about dead threads hanging in system jemalloc. This change tackles the issue in two ways. First, for current and old system jemallocs, which we can not modify, the overriding jemalloc would locate and invoke system cleanup routine. For upcoming jemalloc integrations, the cleanup registering function will also be redirected to user jemalloc, which means that system jemalloc's cleanup routine will be registered in user's jemalloc and a single call to _malloc_thread_cleanup will be sufficient to invoke both callbacks.	2022-03-02 10:10:27 -08:00
Alex Lapenkou	ca709c3139	Fix failed assertion due to racy memory access While calculating the number of stashed pointers, multiple variables potentially modified by a concurrent thread were used for the calculation. This led to some inconsistencies, correctly detected by the assertions. The change eliminates some possible inconsistencies by using unmodified variables and only once a concurrently modified one. The assertions are omitted for the cases where we acknowledge potential inconsistencies too.	2022-02-17 09:35:52 -08:00
yunxu	b798fabdf7	Add prof_leak_error option The option makes the process to exit with error code 1 if a memory leak is detected. This is useful for implementing automated tools that rely on leak detection.	2022-01-21 16:24:20 -08:00
Qi Wang	ddb170b1d9	Simplify arena_migrate() to take arena_t* instead of indices. This makes debugging slightly easier and avoids the confusion of "should we create new arenas" here.	2022-01-11 16:59:22 -08:00
Qi Wang	067c2da074	Fix unnecessary returns in san_(un)guard_pages_two_sided.	2022-01-04 13:55:06 -08:00
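
The message for commit 6841110bd6 describes replacing a comparison function with a branchless equivalent, relying on the fact that callers only inspect the sign of the result. The following is a minimal sketch of that general idea, not the code from the commit: the `summary_t` type, its `sn` and `addr` fields, and every function name here are hypothetical stand-ins chosen for illustration. It assumes a two-key ordering (a serial number first, an address as tie-breaker) and weights the first comparison by 2 so that it dominates the sign whenever it is nonzero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-key summary; not jemalloc's actual type. */
typedef struct {
	uint64_t sn;    /* primary key, e.g. a serial number */
	uintptr_t addr; /* secondary key, used only to break ties */
} summary_t;

/* Straightforward version: compare sn, fall back to addr on a tie. */
static int
summary_cmp_branchy(summary_t a, summary_t b) {
	if (a.sn != b.sn) {
		return a.sn < b.sn ? -1 : 1;
	}
	if (a.addr != b.addr) {
		return a.addr < b.addr ? -1 : 1;
	}
	return 0;
}

/*
 * Sign-preserving version without explicit branches in the source.
 * Each (x > y) - (x < y) term is -1, 0 or 1. Doubling the sn term
 * means it decides the sign unless it is 0, in which case the addr
 * term decides. The result lies in [-3, 3]; only its sign matters.
 */
static int
summary_cmp_branchless(summary_t a, summary_t b) {
	int sn_cmp = (a.sn > b.sn) - (a.sn < b.sn);
	int addr_cmp = (a.addr > b.addr) - (a.addr < b.addr);
	return 2 * sn_cmp + addr_cmp;
}

/* Reduce any int to -1, 0 or 1. */
static int
sign_of(int v) {
	return (v > 0) - (v < 0);
}

/* Check that both versions agree in sign on a few sample pairs. */
static void
summary_cmp_selfcheck(void) {
	const summary_t samples[] = {
		{1, 0x1000}, {1, 0x2000}, {2, 0x1000}, {2, 0x2000}
	};
	const int n = (int)(sizeof(samples) / sizeof(samples[0]));
	for (int i = 0; i < n; i++) {
		for (int j = 0; j < n; j++) {
			int slow = summary_cmp_branchy(samples[i], samples[j]);
			int fast = summary_cmp_branchless(samples[i], samples[j]);
			assert(sign_of(slow) == sign_of(fast));
		}
	}
}
```

The two versions are interchangeable only for callers that test the result against zero. A caller that expects exactly -1, 0 or 1 would need to normalize the branchless result first. Whether the compiler actually emits branch-free machine code, and how much faster it runs, depends on the target and optimization level, so any speed-up should be measured rather than assumed.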

1524 Commits