server-skynet-source-3rd-jemalloc

project-base/server-skynet-source-3rd-jemalloc

Author	SHA1	Message	Date
David Goldblatt	6f41ba55ee	Mutex: Make spin count configurable. Don't document it since we don't want to support this as a "real" setting, but it's handy for testing.	2021-08-05 10:13:53 -07:00
David Goldblatt	d93eef2f40	HPA: Introduce a redesigned hpa_central_t. For now, this only handles allocating virtual address space to shards, with no reuse. This is framework, though; it will change over time.	2021-07-23 21:59:59 -07:00
David Goldblatt	6630c59896	HPA: Hugification hysteresis. We wait a while after deciding a huge extent should get hugified to see if it gets purged before long. This avoids hugifying extents that might shortly get dehugified for purging. Rename and use the hpa_dehugification_threshold option support code for this, since it's now ignored.	2021-07-12 17:59:18 -07:00
David Goldblatt	113938b6f4	HPA: Pull out a hooks type. For now, this is a no-op change. In a subsequent commit, it will be useful for testing.	2021-07-12 17:59:18 -07:00
David Goldblatt	1d4a7666d5	HPA: Do deferred operations on background threads.	2021-07-12 17:59:18 -07:00
David Goldblatt	583284f2d9	Add HPA deferral functionality.	2021-07-12 17:59:18 -07:00
David Goldblatt	d202218e86	HPA: Fix typos with big performance implications. This fixes two simple but significant typos in the HPA: - The conf string parsing accidentally set a min value of PAGE for hpa_sec_batch_fill_extra; i.e. allocating 4096 extra pages every time we attempted to allocate a single page. This puts us over the SEC flush limit, so we then immediately flush all but one of them (probably triggering purging). - The HPA was using the default PAI batch alloc implementation, which meant it did not actually get any locking advantages. This snuck by because I did all the performance testing without using the PAI interface or config settings. When I cleaned it up and put everything behind nice interfaces, I only did correctness checks, and didn't try any performance ones.	2021-06-24 16:26:55 -07:00
David Goldblatt	4452a4812f	Add opt.experimental_infallible_new. This allows a guarantee that operator new never throws. Fix the .gitignore rules to include test/integration/cpp while we're here.	2021-06-24 12:22:51 -07:00
David Goldblatt	36c6bfb963	SEC: Allow arbitrarily many shards, cached sizes.	2021-05-22 08:17:41 -07:00
David Goldblatt	fb327368db	SEC: Expand option configurability. This change pulls the SEC options into a struct, which simplifies their handling across various modules (e.g. PA needs to forward on SEC options from the malloc_conf string, but it doesn't really need to know their names). While we're here, make some of the fixed constants configurable, and unify naming from the configuration options to the internals.	2021-02-19 15:10:54 -08:00
Qi Wang	a11be50332	Implement opt.cache_oblivious. Keep config.cache_oblivious for now to remain backward-compatible.	2021-02-11 11:32:01 -08:00
Qi Wang	041145c272	Report the correct and wrong sizes on sized dealloc bug detection.	2021-02-08 14:42:27 -08:00
Qi Wang	f3b2668b32	Report the offending pointer on sized dealloc bug detection.	2021-02-08 14:42:27 -08:00
David Goldblatt	edbfe6912c	Inline malloc fastpath into operator new. This saves a small but non-negligible amount of CPU in C++ programs.	2021-02-08 14:17:47 -08:00
David Goldblatt	79f81a3732	HPA: Make dirty_mult configurable.	2021-02-04 20:58:31 -08:00
David Goldblatt	32dd153796	HPA: Make dehugification threshold configurable.	2021-02-04 20:58:31 -08:00
David Goldblatt	4790db15ed	HPA: make the hugification threshold configurable.	2021-02-04 20:58:31 -08:00
David Goldblatt	b3df80bc79	Pull HPA options into a containing struct. Currently that just means max_alloc, but we're about to add more. While we're touching these lines anyways, tweak things to be more in line with testing.	2021-02-04 20:58:31 -08:00
David Goldblatt	c259323ab3	Use ticker_geom_t for arena tcache decay.	2021-02-04 14:10:43 -08:00
Azat Khuzhin	a943172b73	Add runtime detection for MADV_DONTNEED zeroes pages (mostly for qemu). qemu does not support this, yet [1], and you can get very tricky assert if you will run program with jemalloc in use under qemu: <jemalloc>: ../contrib/jemalloc/src/extent.c:1195: Failed assertion: "p[i] == 0" [1]: https://patchwork.kernel.org/patch/10576637/ Here is a simple example that shows the problem [2]: // Gist to check possible issues with MADV_DONTNEED // For example it does not supported by qemu user // There is a patch for this [1], but it hasn't been applied. // [1]: https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg05422.html #include <sys/mman.h> #include <stdio.h> #include <stddef.h> #include <assert.h> #include <string.h> int main(int argc, char **argv) { void *addr = mmap(NULL, 1<<16, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); if (addr == MAP_FAILED) { perror("mmap"); return 1; } memset(addr, 'A', 1<<16); if (!madvise(addr, 1<<16, MADV_DONTNEED)) { puts("MADV_DONTNEED does not return error. Check memory."); for (int i = 0; i < 1<<16; ++i) { assert(((unsigned char *)addr)[i] == 0); } } else { perror("madvise"); } if (munmap(addr, 1<<16)) { perror("munmap"); return 1; } return 0; } ### unpatched qemu $ qemu-x86_64-static /tmp/test-MADV_DONTNEED MADV_DONTNEED does not return error. Check memory. test-MADV_DONTNEED: /tmp/test-MADV_DONTNEED.c:19: main: Assertion `((unsigned char *)addr)[i] == 0' failed. qemu: uncaught target signal 6 (Aborted) - core dumped Aborted (core dumped) ### patched qemu (by returning ENOSYS error) $ qemu-x86_64 /tmp/test-MADV_DONTNEED madvise: Success ### patch for qemu to return ENOSYS diff --git a/linux-user/syscall.c b/linux-user/syscall.c index 897d20c076..5540792e0e 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -11775,7 +11775,7 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1, turns private file-backed mappings into anonymous mappings. This will break MADV_DONTNEED. This is a hint, so ignoring and returning success is ok. */ - return 0; + return ENOSYS; #endif #ifdef TARGET_NR_fcntl64 case TARGET_NR_fcntl64: [2]: https://gist.github.com/azat/12ba2c825b710653ece34dba7f926ece v2: - review fixes - add opt_dont_trust_madvise v3: - review fixes - rename opt_dont_trust_madvise to opt_trust_madvise	2021-01-20 20:08:30 -08:00
Yinan Zhang	40fa4d29d3	Track per size class internal fragmentation	2021-01-07 20:39:49 -08:00
David Goldblatt	f9bb8dedef	Un-force-inline do_rallocx. The additional overhead of the function-call setup and flags checking is relatively small, but costs us the replication of the entire realloc pathway in terms of size.	2021-01-04 14:55:49 -08:00
Yinan Zhang	ea013d8fa4	Enforce realloc sizing stability	2020-12-18 11:41:52 -08:00
David Goldblatt	1e3b8636ff	HPA: Remove unused malloc_conf options.	2020-12-08 12:10:48 -08:00
David Goldblatt	fffcefed33	malloc_conf: Clarify HPA options.	2020-12-07 06:21:08 -08:00
David Goldblatt	43af63fff4	HPA: Manage whole hugepages at a time. This redesigns the HPA implementation to allow us to manage hugepages all at once, locally, without relying on a global fallback.	2020-12-07 06:21:08 -08:00
David Goldblatt	d438296b1f	narenas_ratio: Accept fractional values. With recent scalability improvements to the HPA, we're experimenting with much lower arena counts; this gets annoying when trying to test across different hardware configurations using only the narenas setting.	2020-12-04 23:48:19 -08:00
David Carlier	520b75fa2d	utrace support with label based signature.	2020-11-30 11:43:00 -08:00
Yinan Zhang	92e189be8b	Add some comments to the batch allocation logic flow	2020-11-16 20:58:01 -08:00
Yinan Zhang	d96e4525ad	Route batch allocation of small batch size to tcache	2020-11-16 20:58:01 -08:00
DC	ef6d51ed44	DragonFlyBSD build support.	2020-10-27 12:35:19 -07:00
David Goldblatt	6599651aee	PA: Use an SEC in front of the HPA shard.	2020-10-23 11:14:34 -07:00
David Goldblatt	534504d4a7	HPA: add size-exclusion functionality. I.e. only allowing allocations under or over certain sizes.	2020-10-23 11:14:34 -07:00
David Goldblatt	bf025d2ec8	HPA: Make slab sizes and maxes configurable. This allows easy experimentation with them as tuning parameters.	2020-10-23 11:14:34 -07:00
David Goldblatt	1c7da33317	HPA: Tie components into a PAI implementation.	2020-10-23 11:14:34 -07:00
Qi Wang	c8209150f9	Switch from opt.lg_tcache_max to opt.tcache_max. Though for convenience, keep parsing lg_tcache_max.	2020-10-22 20:40:41 -07:00
Qi Wang	3de19ba401	Eagerly detect double free and sized dealloc bugs for large sizes.	2020-10-15 10:03:16 -07:00
Qi Wang	a9aa6f6d0f	Fix the alloc_ctx check in free_fastpath. The sanity check requires a functional TSD, which free_fastpath only guarantees after the threshold branch. Move the check function to afterwards.	2020-10-12 19:02:27 -07:00
David Goldblatt	b971f7c4dd	Add "default" option to slab sizes. This comes in handy when overriding earlier settings to test alternate ones. We don't really include tests for this, but I claim that's OK here: - It's fairly straightforward - It's fairly hard to test well - This entire code path is undocumented and mostly for our internal experimentation in the first place. - I tested manually.	2020-10-07 12:54:29 -07:00
David Goldblatt	ab274a23b9	Add narenas_ratio. This allows setting arenas per cpu dynamically, rather than forcing the user to know the number of CPUs in advance if they want a particular CPU/space tradeoff.	2020-08-12 16:41:57 -07:00
David Goldblatt	eaed1e39be	Add sized-delete size-checking functionality. The existing checks are good at finding such issues (on tcache flush), but not so good at pinpointing them. Debug mode can find them, but sometimes debug mode slows down a program so much that hard-to-hit bugs can take a long time to crash. This commit adds functionality to keep programs mostly on their fast paths, while also checking every sized delete argument they get.	2020-08-05 19:34:05 -07:00
David Goldblatt	60993697d8	Prof: Add prof_unbias. This gives more accurate attribution of bytes and counts to stack traces, without introducing backwards incompatibilities in heap-profile parsing tools. We track the ideal reported (to the end user) number of bytes more carefully inside core jemalloc. When dumping heap profiles, instead of outputting our counts directly, we output counts that will cause parsing tools to give a result close to the value we want. We retain the old version as an opt setting, to let users who are tracking values on a per-component basis keep their metrics stable until they decide to switch.	2020-08-05 18:33:55 -07:00
Yinan Zhang	978f830ee3	Add batch allocation API	2020-07-31 09:16:50 -07:00
David Carlier	00f06c9beb	Enable MPSS on Solaris/illumos, reusing the Linux configuration as much as possible and aligning the address range to HUGEPAGE.	2020-07-06 09:59:10 -07:00
Yinan Zhang	d460333efb	Improve naming for prof system thread name option	2020-06-24 14:32:01 -07:00
Yinan Zhang	24bbf376ce	Unify arena flag reading and selection	2020-06-19 11:06:05 -07:00
Yinan Zhang	e128b170a0	Do not fallback to auto arena when manual arena is requested	2020-06-19 11:06:05 -07:00
Yinan Zhang	95a59d2f72	Unify tcache flag reading and selection	2020-06-19 11:06:05 -07:00
Yinan Zhang	4b0c008489	Unify zero flag reading and setting	2020-06-19 11:06:05 -07:00
Yinan Zhang	2a84f9b8fc	Unify alignment flag reading and computation	2020-06-19 11:06:05 -07:00


480 Commits