server-skynet-source-3rd-jemalloc

Author	SHA1	Message	Date
Kevin Svetlitski	589c63b424	Make eligible global variables `static` and/or `const` For better or worse, Jemalloc has a significant number of global variables. Making all eligible global variables `static` and/or `const` at least makes it slightly easier to reason about them, as these qualifications communicate to the programmer restrictions on their use without having to `grep` the whole codebase.	2023-07-06 14:15:12 -07:00
Qi Wang	e249d1a2a1	Remove unreachable code.	2023-07-06 12:06:06 -07:00
Qi Wang	602edd7566	Enabled -Wstrict-prototypes and fixed warnings.	2023-07-06 12:00:02 -07:00
Kevin Svetlitski	ebd7e99f5c	Add a test-case for small profiled allocations Validate that small allocations (i.e. those with `size <= SC_SMALL_MAXCLASS`) which are sampled for profiling maintain the expected invariants even though they now take up less space.	2023-07-03 16:19:06 -07:00
Kevin Svetlitski	5a858c64d6	Reduce the memory overhead of sampled small allocations Previously, small allocations which were sampled as part of heap profiling were rounded up to `SC_LARGE_MINCLASS`. This additional memory usage becomes problematic when the page size is increased, as noted in #2358. Small allocations are now rounded up to the nearest multiple of `PAGE` instead, reducing the memory overhead by a factor of 4 in the most extreme cases.	2023-07-03 16:19:06 -07:00
Kevin Svetlitski	e1338703ef	Address compiler warnings in the unit tests	2023-07-03 16:06:35 -07:00
Qi Wang	d131331310	Avoid eager purging on the dedicated oversize arena when using bg thds. We have observed new workload patterns (namely ML training type) that cycle through oversized allocations frequently, because 1) the dataset might be sparse which is faster to go through, and 2) GPU accelerated. As a result, the eager purging from the oversize arena becomes a bottleneck. To offer an easy solution, allow normal purging of the oversized extents when background threads are enabled.	2023-06-27 11:57:41 -07:00
Kevin Svetlitski	46e464a26b	Fix downloading LLVM in GitHub Action It turns out LLVM does not include a build for every platform in the assets for every release, just some of them. As such, I've pinned us to the latest release version with a corresponding build.	2023-06-23 14:30:49 -07:00
Kevin Svetlitski	f2e00d2fd3	Remove trailing whitespace Additionally, added a GitHub Action to ensure no more trailing whitespace will creep in again in the future. I'm excluding Markdown files from this check, since trailing whitespace is significant there, and also excluding `build-aux/install-sh` because there is significant trailing whitespace on the line that sets `defaultIFS`.	2023-06-23 11:58:18 -07:00
Kevin Svetlitski	05385191d4	Add GitHub action which runs static analysis Now that all of the various issues that static analysis uncovered have been fixed (#2431, #2432, #2433, #2436, #2437, #2446), I've added a GitHub action which will run static analysis for every PR going forward. When static analysis detects issues with your code, the GitHub action provides a link to download its findings in a form tailored for human consumption. Take a look at [this demonstration of what it looks like when static analysis issues are found](https://github.com/Svetlitski/jemalloc/actions/runs/5010245602) on my fork for an example (make sure to follow the instructions in the error message to download and inspect the results).	2023-06-23 11:55:43 -07:00
Kevin Svetlitski	bb0333e745	Fix remaining static analysis warnings Fix or suppress the remaining warnings generated by static analysis. This is a necessary step before we can incorporate static analysis into CI. Where possible, I've preferred to modify the code itself instead of just disabling the warning with a magic comment, so that if we decide to use different static analysis tools in the future we will be covered against them raising similar warnings.	2023-06-23 11:50:29 -07:00
Kevin Svetlitski	210f0d0b2b	Fix read of uninitialized data in `prof_free` In #2433, I inadvertently introduced a regression which causes the use of uninitialized data. Namely, the control path I added for the safety check in `arena_prof_info_get` neglected to set `prof_info->alloc_tctx` when the check fails, resulting in `prof_info.alloc_tctx` being uninitialized [when it is read at the end of `prof_free`](`90176f8a87/include/jemalloc/internal/prof_inlines.h (L272)`).	2023-06-15 18:30:05 -07:00
Kevin Svetlitski	90176f8a87	Fix segfault in rb `_tree_remove` Static analysis flagged this. It's possible to segfault in the `_tree_remove` function generated by `rb_gen`, as `nodep` may still be `NULL` after the initial for loop. I can confirm from reviewing the fleetwide coredump data that this was in fact being hit in production, primarily through `tctx_tree_remove`, and much more rarely through `gctx_tree_remove`.	2023-06-07 14:48:41 -07:00
Qi Wang	86eb49b478	Fix the arena selection for oversized allocations. Use the per-arena oversize_threshold, instead of the global setting.	2023-06-06 15:03:13 -07:00
Christos Zoulas	5832ef6589	Use a local variable to set the alignment for this particular allocation instead of changing mmap_flags which makes the change permanent. This was enforcing large alignments for allocations that did not need it causing fragmentation. Reported by Andreas Gustafsson.	2023-05-31 14:44:24 -07:00
Kevin Svetlitski	6d4aa33753	Extract the calculation of psset heap assignment for an hpdata into a common function This is in preparation for upcoming changes I plan to make to this logic. Extracting it into a common function will make this easier and less error-prone, and cleans up the existing code regardless.	2023-05-31 11:44:04 -07:00
Arne Welzel	c1d3ad4674	Prune je_malloc_default and do_rallocx in jeprof When running simple Ruby and Python programs, je_malloc_default and do_rallocx() show up in the resulting SVG / text output. Prune these, too. MALLOC_CONF='stats_print:true,lg_prof_sample:8,prof:true,prof_final:true' \ python3 -c '[x for x in range(10000000)]' MALLOC_CONF='stats_print:true,lg_prof_sample:8,prof:true,prof_final:true' \ ruby -e 'puts (0..1000).map{"0"}.join(" ")'	2023-05-31 11:41:09 -07:00
Arne Welzel	d59e30cbc9	Rename fallback_impl to fallbackNewImpl and prune in jeprof The existing fallback_impl name seemed a bit generic and given it's static probably okay to rename. Closes #2451	2023-05-31 11:41:09 -07:00
Qi Wang	d577e9b588	Explicitly cast to unsigned for MALLOCX_ARENA and _TCACHE defines.	2023-05-26 11:52:42 -07:00
Qi Wang	a2259f9fa6	Fix the include path of "jemalloc_internal_overrides.h".	2023-05-25 15:22:02 -07:00
Kevin Svetlitski	9c32689e57	Fix bug where hpa_shard was not being destroyed It appears that this was a simple mistake where `hpa_shard_disable` was being called instead of `hpa_shard_destroy`. At present `hpa_shard_destroy` is not called anywhere at all outside of test-cases, which further suggests that this is a bug. @davidtgoldblatt noted however that since HPA is disabled for manual arenas and we don't support destruction for auto arenas that presently there is no way to actually trigger this bug. Nonetheless, it should be fixed.	2023-05-18 14:17:38 -07:00
Kevin Svetlitski	4e6f1e9208	Allow overriding `LG_PAGE` This is useful for our internal builds where we override the configuration in the header files generated by autoconf.	2023-05-17 13:55:38 -07:00
Kevin Svetlitski	3e2ba7a651	Remove dead stores detected by static analysis None of these are harmful, and they are almost certainly optimized away by the compiler. The motivation for fixing them anyway is that we'd like to enable static analysis as part of CI, and the first step towards that is resolving the warnings it produces at present.	2023-05-11 20:27:49 -07:00
Kevin Svetlitski	0288126d9c	Fix possible `NULL` pointer dereference from `mallctl("prof.prefix", ...)` Static analysis flagged this issue. Here is a minimal program which causes a segfault within Jemalloc: ``` #include <jemalloc/jemalloc.h> const char *malloc_conf = "prof:true"; int main() { mallctl("prof.prefix", NULL, NULL, NULL, 0); } ``` Fixed by checking if `prefix` is `NULL`.	2023-05-11 14:47:50 -07:00
Qi Wang	d4a2b8bab1	Add the prof_sys_thread_name feature in the prof_recent unit test. This tests the combination of the prof_recent and thread_name features. Verified that it catches the issue being fixed in this PR. Also explicitly set thread name in test/unit/prof_recent. This fixes the name testing when no default thread name is set (e.g. FreeBSD).	2023-05-11 09:10:57 -07:00
Qi Wang	94ace05832	Fix the prof thread_name reference in prof_recent dump. As pointed out in #2434, the thread_name in prof_tdata_t was changed in #2407. This also requires an update for the prof_recent dump, specifically the emitter expects a "char **" which is fixed in this commit.	2023-05-11 09:10:57 -07:00
Qi Wang	6ea8a7e928	Add config detection for JEMALLOC_HAVE_PTHREAD_SET_NAME_NP and use it for the background thread name setting.	2023-05-11 09:10:57 -07:00
auxten	5bac384970	If ptr present check if alloc_ctx.edata == NULL	2023-05-10 17:18:22 -07:00
auxten	019cccc293	Make arenas_lookup_ctl triable	2023-05-10 17:18:22 -07:00
Kevin Svetlitski	dc0a184f8d	Fix possible `NULL` pointer dereference in `VERIFY_READ` Static analysis flagged this. Fixed by simply checking `oldlenp` before dereferencing it.	2023-05-09 10:57:09 -07:00
Kevin Svetlitski	12311fe6c3	Fix segfault in `extent_try_coalesce_impl` Static analysis flagged this. `extent_record` was passing `NULL` as the value for `coalesced` to `extent_try_coalesce`, which in turn passes that argument to `extent_try_coalesce_impl`, where it is written to without checking if it is `NULL`. I can confirm from reviewing the fleetwide coredump data that this was in fact being hit in production.	2023-05-09 10:55:44 -07:00
Kevin Svetlitski	70344a2d38	Make eligible functions `static` The codebase is already very disciplined in making `static` any function which can be, but there are a few that appear to have slipped through the cracks.	2023-05-08 15:00:02 -07:00
Kevin Svetlitski	6841110bd6	Make `edata_cmp_summary_comp` 30% faster `edata_cmp_summary_comp` is one of the very hottest functions, taking up 3% of all time spent inside Jemalloc. I noticed that all existing callsites rely only on the sign of the value returned by this function, so I came up with this equivalent branchless implementation which preserves this property. After empirical measurement, I have found that this implementation is 30% faster, therefore representing a 1% speed-up to the allocator as a whole. At @interwq's suggestion, I've applied the same optimization to `edata_esnead_comp` in case this function becomes hotter in the future.	2023-05-04 09:59:17 -07:00
Amaury Séchet	f2b28906e6	Some nits in cache_bin.h	2023-05-01 10:21:17 -07:00
Kevin Svetlitski	fc680128e0	Remove errant `assert` in `arena_extent_alloc_large` This codepath may generate deferred work when the HPA is enabled. See also [@davidtgoldblatt's relevant comment on the PR which introduced this](https://github.com/jemalloc/jemalloc/pull/2107#discussion_r699770967) which prevented a similarly incorrect `assert` from being added elsewhere.	2023-05-01 10:00:30 -07:00
Eric Mueller	521970fb2e	Check for equality instead of assigning in asserts in hpa_from_pai. It appears that a simple typo means we're unconditionally overwriting some fields in hpa_from_pai when asserts are enabled. From hpa_shard_init, it looks like these fields have these values anyway, so this shouldn't cause bugs, but if something is wrong it seems better to have these asserts in place. See issue #2412.	2023-04-17 20:57:48 -07:00
guangli-dai	5f64ad60cd	Remove locked flag set in malloc_mutex_trylock As a hint flag of the lock, parameter locked should be set only when the lock is gained or freed.	2023-04-06 10:57:04 -07:00
Qi Wang	434a68e221	Disallow decay during reentrancy. Decay should not be triggered during reentrant calls (may cause lock order reversal / deadlocks). Added a delay_trigger flag to the tickers to bypass decay when reentrancy_level is not zero.	2023-04-05 10:16:37 -07:00
Qi Wang	e62aa478c7	Rearrange the bools in prof_tdata_t to save some bytes. This lowered the sizeof(prof_tdata_t) from 200 to 192 which is a round size class. Afterwards the tdata_t size remains unchanged with the last commit, which effectively inlined the storage of thread names for free.	2023-04-05 10:03:12 -07:00
Qi Wang	ce0b7ab6c8	Inline the storage for thread name in prof_tdata_t. The previous approach managed the thread name in a separate buffer, which causes races because the thread name update (triggered by new samples) can happen at the same time as prof dumping (which reads the thread names) -- these two operations are under separate locks to avoid blocking each other. Implemented the thread name storage as part of the tdata struct, which resolves the lifetime issue and also avoids internal alloc / dalloc during prof_sample.	2023-04-05 10:03:12 -07:00
Qi Wang	6cab460a45	Add a multithreaded test for prof_sys_thread_name. Verified that this catches the issue being fixed in 5fd5583.	2023-04-05 10:03:12 -07:00
Amaury Séchet	5266152d79	Simplify the logic in ph_remove	2023-03-31 14:35:31 -07:00
Amaury Séchet	be6da4f663	Do not maintain root->prev in ph_remove.	2023-03-31 14:34:57 -07:00
Amaury Séchet	543e2d61e6	Simplify the logic in ph_insert Also fixes what looks like an off by one error in the lazy aux list merge part of the code that previously never touched the last node in the aux list.	2023-03-31 14:34:24 -07:00
guangli-dai	31e01a98f1	Fix the rdtscp detection bug and add prefix for the macro.	2023-03-23 11:16:19 -07:00
Qi Wang	8b64be3441	Explicit arena assignment in test_tcache_max. Otherwise the associated arena could change with percpu arena enabled.	2023-03-22 15:16:43 -07:00
Qi Wang	8e7353a19b	Explicit arena assignment in test_thread_idle. Otherwise the associated arena could change with percpu arena enabled.	2023-03-22 15:16:43 -07:00
Marvin Schmidt	45249cf5a9	Fix exception specification error for hosts using musl libc It turns out that the previous commit did not suffice since the JEMALLOC_SYS_NOTHROW definition also causes the same exception specification errors as JEMALLOC_USE_CXX_THROW did: ``` x86_64-pc-linux-musl-cc -std=gnu11 -Werror=unknown-warning-option -Wall -Wextra -Wshorten-64-to-32 -Wsign-compare -Wundef -Wno-format-zero-length -Wpointer-arith -Wno-missing-braces -Wno-missing-field-initializers -pipe -g3 -fvisibility=hidden -Wimplicit-fallthrough -O3 -funroll-loops -march=native -O2 -pipe -c -march=native -O2 -pipe -D_GNU_SOURCE -D_REENTRANT -Iinclude -Iinclude -o src/background_thread.o src/background_thread.c In file included from src/jemalloc_cpp.cpp:9: In file included from include/jemalloc/internal/jemalloc_preamble.h:27: include/jemalloc/internal/../jemalloc.h:254:32: error: exception specification in declaration does not match previous declaration void JEMALLOC_SYS_NOTHROW je_malloc(size_t size) ^ include/jemalloc/internal/../jemalloc.h:75:21: note: expanded from macro 'je_malloc' ^ /usr/x86_64-pc-linux-musl/include/stdlib.h:40:7: note: previous declaration is here void malloc (size_t); ^ ``` On systems using the musl C library we have to omit the exception specification on malloc function family like it's done for MacOS, FreeBSD and OpenBSD.	2023-03-16 12:11:40 -07:00
Marvin Schmidt	aba1645f2d	configure: Handle -linux-musl hosts properly This is the same as the `--linux*` case with the two exceptions that we don't set glibc=1 and don't define JEMALLOC_USE_CXX_THROW	2023-03-16 12:11:40 -07:00
Qi Wang	d503d72129	Add the missing descriptions in AC_DEFINE	2023-03-14 16:47:00 -07:00
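
Commit 6841110bd6 above describes replacing a comparison function with a branchless equivalent, relying on the fact that callers only inspect the sign of the result. The sketch below illustrates that general technique; it is not the code from the commit. The struct `cmp_summary_t`, its fields `sn` and `addr`, the ordering (serial number first, then address), and all function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a comparison summary: a serial number and an address. */
typedef struct {
	uint64_t sn;
	uintptr_t addr;
} cmp_summary_t;

/* Straightforward version: compare by sn, then by addr, using branches. */
static inline int
cmp_summary_branchy(cmp_summary_t a, cmp_summary_t b) {
	if (a.sn != b.sn) {
		return (a.sn > b.sn) ? 1 : -1;
	}
	if (a.addr != b.addr) {
		return (a.addr > b.addr) ? 1 : -1;
	}
	return 0;
}

/*
 * Branchless version. Each (x > y) - (x < y) term is -1, 0, or 1.
 * Weighting the sn term by 2 means the addr term can never flip the
 * sign when the serial numbers differ, so the sign of the result
 * matches the branchy version. Only the sign is meaningful.
 */
static inline int
cmp_summary_branchless(cmp_summary_t a, cmp_summary_t b) {
	int sn_cmp = (a.sn > b.sn) - (a.sn < b.sn);
	int addr_cmp = (a.addr > b.addr) - (a.addr < b.addr);
	return 2 * sn_cmp + addr_cmp;
}

/* Reduce any int to -1, 0, or 1 so the two versions can be compared. */
static inline int
sign_of(int v) {
	return (v > 0) - (v < 0);
}

/* Assert that both implementations agree in sign for a given pair. */
static inline void
check_same_sign(cmp_summary_t a, cmp_summary_t b) {
	assert(sign_of(cmp_summary_branchy(a, b)) ==
	    sign_of(cmp_summary_branchless(a, b)));
}
```

The branchless form returns values in the range -3 to 3 rather than -1 to 1, so it is only a drop-in replacement where callers test the result against zero, which is the property the commit message says all existing call sites satisfy.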

3378 Commits