server-skynet-source-3rd-jemalloc

project-base/server-skynet-source-3rd-jemalloc

Author	SHA1	Message	Date
guangli-dai	867eedfc58	Fix the bug in dalloc promoted allocations. A sufficiently small allocation is promoted so that it does not share an extent with others. However, on dalloc, such an allocation may not be deallocated as a promoted one if nbins < SC_NBINS. This commit fixes the bug.	2023-10-17 14:53:23 -07:00
guangli-dai	630f7de952	Add mallctl to set and get ncached_max of each cache_bin. 1. `thread_tcache_ncached_max_read_sizeclass` allows users to get the ncached_max of the bin with the input sizeclass, passed in through oldp (rounded up to the next bin size if an exact bin size is not given). 2. `thread_tcache_ncached_max_write` takes in a char array representing the settings for bins in the tcache.	2023-10-17 14:53:23 -07:00
guangli-dai	6b197fdd46	Pre-generate ncached_max for all bins for better tcache_max tuning experience.	2023-10-17 14:53:23 -07:00
guangli-dai	a442d9b895	Enable per-tcache tcache_max 1. add tcache_max and nhbins into tcache_t so that they are per-tcache, with one auto tcache per thread, it's also per-thread; 2. add mallctl for each thread to set its own tcache_max (of its auto tcache); 3. store the maximum number of items in each bin instead of using a global storage; 4. add tests for the modifications above. 5. Rename `nhbins` and `tcache_maxclass` to `global_do_not_change_nhbins` and `global_do_not_change_tcache_maxclass`.	2023-09-06 10:47:14 -07:00
Kevin Svetlitski	4f50f782fa	Use compiler-provided assume builtins when available There are several benefits to this: 1. It's cleaner and more reliable to use the builtin to inform the compiler of assumptions instead of hoping that the optimizer understands your intentions. 2. `clang` will warn you if any of your assumptions would produce side-effects (which the compiler will discard). [This blog post](https://fastcompression.blogspot.com/2019/01/compiler-checked-contracts.html) by Yann Collet highlights that a hazard of using the `unreachable()`-based method of signaling assumptions is that it can sometimes result in additional instructions being generated (see [this Godbolt link](https://godbolt.org/z/lKNMs3) from the blog post for an example).	2023-08-08 14:59:36 -07:00
Kevin Svetlitski	3e82f357bb	Fix all optimization-inhibiting integer-to-pointer casts Following from PR #2481, we replace all integer-to-pointer casts [which hide pointer provenance information (and thus inhibit optimizations)](https://clang.llvm.org/extra/clang-tidy/checks/performance/no-int-to-ptr.html) with equivalent operations that preserve this information. I have enabled the corresponding clang-tidy check in our static analysis CI so that we do not get bitten by this again in the future.	2023-07-24 14:40:42 -07:00
Kevin Svetlitski	7e54dd1ddb	Define `PROF_TCTX_SENTINEL` instead of using magic numbers This makes the code more readable on its own, and also sets the stage for more cleanly handling the pointer provenance lints in a following commit.	2023-07-24 14:40:42 -07:00
Kevin Svetlitski	41e0b857be	Make headers self-contained by fixing `#include`s Header files are now self-contained, which makes the relationships between the files clearer, and crucially allows LSP tools like `clangd` to function correctly in all of our header files. I have verified that the headers are self-contained (aside from the various Windows shims) by compiling them as if they were C files – in a follow-up commit I plan to add this to CI to ensure we don't regress on this front.	2023-07-14 09:06:32 -07:00
Kevin Svetlitski	5a858c64d6	Reduce the memory overhead of sampled small allocations Previously, small allocations which were sampled as part of heap profiling were rounded up to `SC_LARGE_MINCLASS`. This additional memory usage becomes problematic when the page size is increased, as noted in #2358. Small allocations are now rounded up to the nearest multiple of `PAGE` instead, reducing the memory overhead by a factor of 4 in the most extreme cases.	2023-07-03 16:19:06 -07:00
Kevin Svetlitski	210f0d0b2b	Fix read of uninitialized data in `prof_free` In #2433, I inadvertently introduced a regression which causes the use of uninitialized data. Namely, the control path I added for the safety check in `arena_prof_info_get` neglected to set `prof_info->alloc_tctx` when the check fails, resulting in `prof_info.alloc_tctx` being uninitialized [when it is read at the end of `prof_free`](`90176f8a87/include/jemalloc/internal/prof_inlines.h (L272)`).	2023-06-15 18:30:05 -07:00
Kevin Svetlitski	90176f8a87	Fix segfault in rb `_tree_remove` Static analysis flagged this. It's possible to segfault in the `_tree_remove` function generated by `rb_gen`, as `nodep` may still be `NULL` after the initial for loop. I can confirm from reviewing the fleetwide coredump data that this was in fact being hit in production, primarily through `tctx_tree_remove`, and much more rarely through `gctx_tree_remove`.	2023-06-07 14:48:41 -07:00
Qi Wang	86eb49b478	Fix the arena selection for oversized allocations. Use the per-arena oversize_threshold, instead of the global setting.	2023-06-06 15:03:13 -07:00
Qi Wang	434a68e221	Disallow decay during reentrancy. Decay should not be triggered during reentrant calls (may cause lock order reversal / deadlocks). Added a delay_trigger flag to the tickers to bypass decay when reentrancy_level is not zero.	2023-04-05 10:16:37 -07:00
divanorama	1897f185d2	Fix safety_check segfault in double free test	2022-10-03 10:55:10 -07:00
Guangli Dai	42daa1ac44	Add double free detection using slab bitmap for debug build Add a sanity check for double frees in the arena in case the tcache has been flushed.	2022-09-06 12:54:21 -07:00
Qi Wang	deb8e62a83	Implement guard pages. Adding guarded extents, which are regular extents surrounded by guard pages (mprotected). To reduce syscalls, small guarded extents are cached as a separate eset in ecache, and decay through the dirty / muzzy / retained pipeline as usual.	2021-09-26 16:30:15 -07:00
Qi Wang	041145c272	Report the correct and wrong sizes on sized dealloc bug detection.	2021-02-08 14:42:27 -08:00
Qi Wang	f3b2668b32	Report the offending pointer on sized dealloc bug detection.	2021-02-08 14:42:27 -08:00
David Goldblatt	c259323ab3	Use ticker_geom_t for arena tcache decay.	2021-02-04 14:10:43 -08:00
David Goldblatt	3967329813	Arena: share bin offsets in a global. This saves us a cache miss when looking up the arena bin offset in a remote arena during tcache flush. All arenas share the base offset, and so we don't need to look it up repeatedly for each arena. Secondarily, it shaves 288 bytes off the arena on, e.g., x86-64.	2021-02-04 14:10:43 -08:00
David Goldblatt	229994a204	Tcache flush: keep common path state in registers. By carefully force-inlining the division constants and the operation sum count, we can eliminate redundant operations in the arena-level dalloc function. Do so.	2021-02-04 14:10:43 -08:00
Yinan Zhang	afa489c3c5	Record request size in prof info	2021-01-07 20:39:49 -08:00
Qi Wang	3de19ba401	Eagerly detect double free and sized dealloc bugs for large sizes.	2020-10-15 10:03:16 -07:00
David Goldblatt	db211eefbf	PAC: Move in decay.	2020-07-09 13:41:04 -07:00
David Goldblatt	294b276fc7	PA: Parameterize emap. Move emap_global to arena. This lets us test the PA module without interfering with the global emap used by the real allocator (the one not under test).	2020-04-10 13:12:47 -07:00
David Goldblatt	e77f47a85a	Move arena decay getters to PA.	2020-04-10 13:12:47 -07:00
David Goldblatt	7b62885476	Introduce decay module and put decay objects in PA	2020-04-10 13:12:47 -07:00
David Goldblatt	22a0a7b93a	Move arena_decay_extent to extent module.	2020-04-10 13:12:47 -07:00
David Goldblatt	70d12ffa05	PA: Move mapped into pa stats.	2020-04-10 13:12:47 -07:00
David Goldblatt	1ad368c8b7	PA: Move in decay stats.	2020-04-10 13:12:47 -07:00
David Goldblatt	356aaa7dc6	Introduce lockedint module. This pulls out the various abstractions where some stats counter is sometimes an atomic, sometimes a plain variable, sometimes always protected by a lock, sometimes protected by reads but not writes, etc. With this change, these cases are treated consistently, and access patterns tagged. In the process, we fix a few missed-update bugs (where one caller assumes "protected-by-a-lock" semantics and another does not).	2020-04-10 13:12:47 -07:00
David Goldblatt	585f925055	Move cache index randomization out of extent. This is logically at a higher level of the stack; extent should just allocate things at the page-level; it shouldn't care exactly why the caller wants a given number of pages.	2020-04-10 13:12:47 -07:00
David Goldblatt	7e6c8a7286	Emap: Standardize naming. Namespace everything under emap_, always specify what it is we're looking up (emap_lookup -> emap_edata_lookup), and use "ctx" over "info".	2020-02-17 10:50:51 -08:00
David Goldblatt	ac50c1e44b	Emap: Remove direct access to emap internals. In the process, we do a few local cleanups and optimizations. In particular, the size safety check on tcache flush no longer does a redundant load.	2020-02-17 10:50:51 -08:00
David Goldblatt	9b5d105fc3	Emap: Move in iealloc. This is logically scoped to the emap.	2020-02-17 10:50:51 -08:00
David Goldblatt	01f255161c	Add emap, for tracking extent locking.	2020-02-17 10:50:51 -08:00
Qi Wang	0f552ed673	Don't purge huge extents when decay is off.	2020-01-30 14:40:38 -08:00
Yinan Zhang	9a60cf54ec	Last-N profiling mode	2019-12-30 15:58:57 -08:00
Yinan Zhang	e98ddf7987	Fix unlikely condition in arena_prof_info_get()	2019-12-30 15:58:57 -08:00
David Goldblatt	a7862df616	Rename extent_t to edata_t. This frees us up from the unfortunate extent/extent2 naming collision.	2019-12-20 10:18:40 -08:00
David Goldblatt	ae0d8e8591	Move extent ehook calls into ehooks	2019-12-20 10:18:40 -08:00
David Goldblatt	9f6eb09585	Extents: Eagerly initialize extent hooks. When deferred initialization was added, initializing required copying sizeof(extent_hooks_t) bytes after a pointer chase. Today, it's just a single pointer loaded from the base_t. In subsequent diffs, we'll get rid of even that.	2019-12-20 10:18:40 -08:00
David Goldblatt	4278f84603	Move extent hook getters/setters to arena.c This is where they're logically scoped; they access arena data.	2019-12-20 10:18:40 -08:00
Yinan Zhang	4afd709d1f	Restructure setters for profiling info Explicitly define three setters: - `prof_tctx_reset()`: set `prof_tctx` to `1U`, if we don't know in advance whether the allocation is large or not; - `prof_tctx_reset_sampled()`: set `prof_tctx` to `1U`, if we already know in advance that the allocation is large; - `prof_info_set()`: set a real `prof_tctx`, and also set other profiling info e.g. the allocation time. Code structure wise, the prof level is kept as a thin wrapper, the large level only provides low level setter APIs, and the arena level carries out the main logic.	2019-12-17 10:01:28 -08:00
Yinan Zhang	45836d7fd3	Pass nstime_t pointer for profiling	2019-12-11 11:38:16 -08:00
Yinan Zhang	aa1d71fb7a	Rename prof_tctx to alloc_tctx in prof_info_t	2019-12-06 09:47:51 -08:00
Yinan Zhang	5e0b090992	No need to pass usize to prof_tctx_set()	2019-12-06 09:47:51 -08:00
Yinan Zhang	6945371778	Change tsdn to tsd for profiling code path	2019-11-22 16:31:56 -08:00
Yinan Zhang	b55419f9b9	Restructure profiling Develop new data structure and code logic for holding profiling related information stored in the extent that may be needed after the extent is released, which in particular is the case for the reallocation code path (e.g. in `rallocx()` and `xallocx()`). The data structure is a generalization of `prof_tctx_t`: we previously only copied out the `prof_tctx` before the extent was released, but we may need additional fields. Currently the only additional field is the allocation time field, but there may be more fields in the future. The restructuring also resolved a bug: `prof_realloc()` mistakenly passed the new `ptr` to `prof_free_sampled_object()`, but passing in the `old_ptr` would crash because it's already been released. Now the essential profiling information is collectively copied out early and safely passed to `prof_free_sampled_object()` after the extent is released.	2019-11-22 16:31:56 -08:00
Yinan Zhang	4fbbc817c1	Simplify time setting and getting for prof log	2019-10-16 09:24:52 -07:00
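
Note on 3e82f357bb (integer-to-pointer casts): the idea behind that commit can be illustrated with a minimal sketch. The helper names below are hypothetical and are not jemalloc's actual macros or functions. The sketch only contrasts an align-down written as an integer-to-pointer cast with an equivalent one that derives the result from the original pointer, which is the kind of rewrite the linked clang-tidy check (`performance-no-int-to-ptr`) asks for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only; jemalloc's real code uses
 * its own macros, which are not reproduced here.
 */

/*
 * Cast-based version: the result is produced from an integer, so the
 * compiler can no longer tell which object the pointer refers to.
 */
static inline void *
align_down_via_int(void *ptr, size_t alignment) {
	/* alignment must be a nonzero power of two. */
	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
	return (void *)((uintptr_t)ptr & ~((uintptr_t)alignment - 1));
}

/*
 * Provenance-preserving version: compute the misalignment as an integer,
 * then step back from the original pointer using pointer arithmetic.
 */
static inline void *
align_down_keep_provenance(void *ptr, size_t alignment) {
	/* alignment must be a nonzero power of two. */
	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
	size_t misalignment =
	    (size_t)((uintptr_t)ptr & ((uintptr_t)alignment - 1));
	return (void *)((unsigned char *)ptr - misalignment);
}
```

Both helpers return the same address for a given input; the second one simply never converts an integer back into a pointer, so the optimizer keeps the link between the result and the object `ptr` points into.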


90 Commits