server-skynet-source-3rd-jemalloc

Author	SHA1	Message	Date
David Goldblatt	68a1666e91	hpdata: Rename "dirty" to "touched". This matches the usage in the rest of the codebase.	2021-02-04 20:58:31 -08:00
David Goldblatt	be0d7a53f3	HPA: Don't track inactive pages. This is really only useful for human consumption. Correspondingly, emit it only in the human-readable stats, and let everybody else compute from the hugepage size and nactive.	2021-02-04 20:58:31 -08:00
David Goldblatt	55e0f60ca1	psset stats: Simplify handling. We can treat the huge and nonhuge cases uniformly using huge state as an array index.	2021-02-04 20:58:31 -08:00
David Goldblatt	94cd9444c5	HPA: Some minor reformattings.	2021-02-04 20:58:31 -08:00
David Goldblatt	b25ee5d88e	HPA: Add purge stats.	2021-02-04 20:58:31 -08:00
David Goldblatt	746ea3de6f	HPA stats: Allow some derived stats. However, we put them in their own struct, to avoid the messiness that the arena has (mixing derived and non-derived stats in the arena_stats_t).	2021-02-04 20:58:31 -08:00
David Goldblatt	30b9e8162b	HPA: Generalize purging. Previously, we would purge a hugepage only when it's completely empty. With this change, we can purge even when only partially empty. Although the heuristic here is still fairly primitive, this infrastructure can scale to become more advanced.	2021-02-04 20:58:31 -08:00
David Goldblatt	70692cfb13	hpdata: Add state changing helpers. We're about to allow hugepage subextent purging; get as much of our metadata handling ready as possible.	2021-02-04 20:58:31 -08:00
David Goldblatt	2ae966222f	hpdata: track per-page dirty state.	2021-02-04 20:58:31 -08:00
David Goldblatt	ff4086aa6b	hpdata: count active pages instead of free ones. This will be more consistent with later naming choices.	2021-02-04 20:58:31 -08:00
David Goldblatt	20140629b4	Bin: Move stats closer to the mutex. This is a slight cache locality optimization.	2021-02-04 14:10:43 -08:00
David Goldblatt	c259323ab3	Use ticker_geom_t for arena tcache decay.	2021-02-04 14:10:43 -08:00
David Goldblatt	8edfc5b170	Add ticker_geom_t. This lets a single ticker object drive events across a large number of different tick streams while sharing state.	2021-02-04 14:10:43 -08:00
David Goldblatt	3967329813	Arena: share bin offsets in a global. This saves us a cache miss when looking up the arena bin offset in a remote arena during tcache flush. All arenas share the base offset, and so we don't need to look it up repeatedly for each arena. Secondarily, it shaves 288 bytes off the arena on, e.g., x86-64.	2021-02-04 14:10:43 -08:00
David Goldblatt	2fcbd18115	Cache bin: Don't reverse flush order. The items we pick to flush matter a lot, but the order in which they get flushed doesn't; just use forward scans. This simplifies the accessing code, both in terms of the C and the generated assembly (i.e. this speeds up the flush pathways).	2021-02-04 14:10:43 -08:00
David Goldblatt	4c46e11365	Cache an arena's index in the arena. This saves us a pointer hop down some perf-sensitive paths.	2021-02-04 14:10:43 -08:00
David Goldblatt	229994a204	Tcache flush: keep common path state in registers. By carefully force-inlining the division constants and the operation sum count, we can eliminate redundant operations in the arena-level dalloc function. Do so.	2021-02-04 14:10:43 -08:00
David Goldblatt	31a629c3de	Tcache flush: prefetch edata contents. This frontloads more of the miss latency. It also moves it to a pathway where we have not yet acquired any locks, so that it should (hopefully) reduce hold times.	2021-02-04 14:10:43 -08:00
David Goldblatt	9f9247a62e	Tcache flushing: increase cache miss parallelism. In practice, many rtree_leaf_elm accesses are cache misses. By restructuring, we can make it more likely that these misses occur without blocking us from starting later lookups, taking more of those misses in parallel.	2021-02-04 14:10:43 -08:00
David Goldblatt	181ba7fd4d	Tcache flush: Add an emap "batch lookup" path. For now this is a no-op; but the interface is a little more flexible for our purposes.	2021-02-04 14:10:43 -08:00
David Goldblatt	c007c537ff	Tcache flush: Unify edata lookup path.	2021-02-04 14:10:43 -08:00
David CARLIER	35a8552605	Mac OS: Tag mapped pages. This can be used to help profiling tools (e.g. vmmap) identify the sources of mappings more specifically.	2021-02-03 15:05:53 -08:00
Yinan Zhang	f6699803e2	Fix duration in prof log	2021-01-25 16:38:38 -08:00
Azat Khuzhin	a943172b73	Add runtime detection for MADV_DONTNEED zeroes pages (mostly for qemu) qemu does not support this, yet [1], and you can get very tricky assert if you will run program with jemalloc in use under qemu: <jemalloc>: ../contrib/jemalloc/src/extent.c:1195: Failed assertion: "p[i] == 0" [1]: https://patchwork.kernel.org/patch/10576637/ Here is a simple example that shows the problem [2]: // Gist to check possible issues with MADV_DONTNEED // For example it does not supported by qemu user // There is a patch for this [1], but it hasn't been applied. // [1]: https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg05422.html #include <sys/mman.h> #include <stdio.h> #include <stddef.h> #include <assert.h> #include <string.h> int main(int argc, char **argv) { void *addr = mmap(NULL, 1<<16, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); if (addr == MAP_FAILED) { perror("mmap"); return 1; } memset(addr, 'A', 1<<16); if (!madvise(addr, 1<<16, MADV_DONTNEED)) { puts("MADV_DONTNEED does not return error. Check memory."); for (int i = 0; i < 1<<16; ++i) { assert(((unsigned char *)addr)[i] == 0); } } else { perror("madvise"); } if (munmap(addr, 1<<16)) { perror("munmap"); return 1; } return 0; } ### unpatched qemu $ qemu-x86_64-static /tmp/test-MADV_DONTNEED MADV_DONTNEED does not return error. Check memory. test-MADV_DONTNEED: /tmp/test-MADV_DONTNEED.c:19: main: Assertion `((unsigned char *)addr)[i] == 0' failed. qemu: uncaught target signal 6 (Aborted) - core dumped Aborted (core dumped) ### patched qemu (by returning ENOSYS error) $ qemu-x86_64 /tmp/test-MADV_DONTNEED madvise: Success ### patch for qemu to return ENOSYS diff --git a/linux-user/syscall.c b/linux-user/syscall.c index 897d20c076..5540792e0e 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -11775,7 +11775,7 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1, turns private file-backed mappings into anonymous mappings. This will break MADV_DONTNEED. 
This is a hint, so ignoring and returning success is ok. */ - return 0; + return ENOSYS; #endif #ifdef TARGET_NR_fcntl64 case TARGET_NR_fcntl64: [2]: https://gist.github.com/azat/12ba2c825b710653ece34dba7f926ece v2: - review fixes - add opt_dont_trust_madvise v3: - review fixes - rename opt_dont_trust_madvise to opt_trust_madvise	2021-01-20 20:08:30 -08:00
David Goldblatt	a011c4c22d	cache_bin: Separate out local and remote accesses. This fixes an incorrect debug-mode assert: - T1 starts an arena stats update and reads stack_head from another thread's cache bin, when that cache bin has 1 item in it. - T2 allocates from that cache bin. The cache_bin's stack_head now points to a NULL pointer, since the cache bin is empty. - T1 Re-reads the cache_bin's stack_head to perform an assertion check (since it previously saw that the bin was empty, whatever stack_head points to should be non-NULL).	2021-01-08 14:18:08 -08:00
Yinan Zhang	14d689c0f9	Add prof stats mutex stats	2021-01-07 20:39:49 -08:00
Yinan Zhang	9f71b5779b	Output prof stats in stats print	2021-01-07 20:39:49 -08:00
Yinan Zhang	1f1a0231ed	Split macros for initializing stats headers	2021-01-07 20:39:49 -08:00
Yinan Zhang	54f3351f1f	Add mallctl for prof stats fetching	2021-01-07 20:39:49 -08:00
Yinan Zhang	40fa4d29d3	Track per size class internal fragmentation	2021-01-07 20:39:49 -08:00
Yinan Zhang	afa489c3c5	Record request size in prof info	2021-01-07 20:39:49 -08:00
David Goldblatt	f9bb8dedef	Un-force-inline do_rallocx. The additional overhead of the function-call setup and flags checking is relatively small, but costs us the replication of the entire realloc pathway in terms of size.	2021-01-04 14:55:49 -08:00
David Goldblatt	a9fa2defdb	Add JEMALLOC_COLD, and mark some functions cold. This hints to the compiler that it should care more about space than CPU (among other things). In cases where the compiler lacks profile-guided information, this can be a substantial space savings. For now, we mark the mallctl or atexit driven profiling and stats functions that take up the most space.	2021-01-04 14:55:49 -08:00
David Goldblatt	5d8e70ab26	prof_recent: cassert(config_prof) more often. This tells the compiler that these functions are never called, which lets them be optimized away in builds where profiling is disabled.	2021-01-04 14:55:49 -08:00
David Goldblatt	83cad746ae	prof_log: cassert(config_prof) in public functions. This lets the compiler infer that the code is dead in builds where profiling is disabled, saving on space there.	2021-01-04 14:55:49 -08:00
David Goldblatt	526180b76d	Extent.c: Avoid an rtree NULL-check. The edge case in which pages_map returns (void *)PAGE can trigger an incorrect assertion failure. Avoid it.	2021-01-04 14:50:49 -08:00
Yinan Zhang	b35ac00d58	Do not bump to large size for page aligned request	2020-12-29 17:09:58 -08:00
Yinan Zhang	8a56d6b636	Add last-N mutex stats	2020-12-29 09:44:19 -08:00
Yinan Zhang	22d62d8cbd	Handle ending gap properly for HPA stats	2020-12-18 16:40:57 -08:00
Yinan Zhang	6c5a3a24dd	Omit bin stats rows with no data	2020-12-18 16:40:57 -08:00
Yinan Zhang	ea013d8fa4	Enforce realloc sizing stability	2020-12-18 11:41:52 -08:00
Yinan Zhang	74bd63b203	Optimize stats print using partial name-to-mib	2020-12-18 10:39:58 -08:00
Yinan Zhang	4557c0a67d	Enable ctl on partial mib and partial name	2020-12-18 10:39:58 -08:00
Yinan Zhang	006dd0414e	Add partial name-to-mib functionality	2020-12-18 10:39:58 -08:00
Yinan Zhang	f2e1a5be77	Do not fail on partial ctl path for ctl_nametomib() We do not fail on partial ctl path when the given `mib` array is shorter than the given name, and we should keep the behavior the same in the reverse case, which I feel is also the more natural way.	2020-12-18 10:39:58 -08:00
Yinan Zhang	6ab181d2b7	Extract node lookup given mib input	2020-12-18 10:39:58 -08:00
Yinan Zhang	3a627b9674	No need to record all nodes in ctl_lookup()	2020-12-18 10:39:58 -08:00
Yinan Zhang	91e006c4c2	Enable ctl_lookup() to start from arbitrary node	2020-12-18 10:39:58 -08:00
Jin Qian	4e3fe218e9	Use posix_madvise to purge pages when available	2020-12-18 10:05:59 -08:00
David Goldblatt	1e3b8636ff	HPA: Remove unused malloc_conf options.	2020-12-08 12:10:48 -08:00
Aditya Kumar	9522ae41d6	Move n_search outside of assert as reported by static analyzer	2020-12-07 06:49:27 -08:00
David Goldblatt	a559caf74a	hpdata: Strengthen assertions. Now that we have flat bitmap bit counting functions, we can easily assert that nfree is always correct. While we're tightening up this code, enforce consistency on API boundaries as well.	2020-12-07 06:21:08 -08:00
David Goldblatt	3ed0b4e8a3	HPA: Add an nevictions counter. I.e. the number of times we've purged a hugepage-sized region.	2020-12-07 06:21:08 -08:00
David Goldblatt	fffcefed33	malloc_conf: Clarify HPA options.	2020-12-07 06:21:08 -08:00
David Goldblatt	f7cf23aa4d	psset: Relegate alloc/dalloc to test code. This is no longer part of the "core" functionality; we only need the stub implementations as an end-to-end test of hpdata + psset interactions when metadata is being modified. Treat them accordingly.	2020-12-07 06:21:08 -08:00
David Goldblatt	f9299ca572	HPA: Use psset fit/insert/remove. This will let us remove alloc_new and alloc_reuse functions from the psset.	2020-12-07 06:21:08 -08:00
David Goldblatt	0971e1e4e3	hpdata: Use addr/size instead of begin/npages. This is easier for the users of the hpdata.	2020-12-07 06:21:08 -08:00
David Goldblatt	5228d869ee	psset: Use fit/insert/remove as basis functions. All other functionality can be implemented in terms of these; doing so (while retaining the same API) will be convenient for subsequent refactors.	2020-12-07 06:21:08 -08:00
David Goldblatt	089f8fa442	Move hpdata bitmap logic out of the psset.	2020-12-07 06:21:08 -08:00
David Goldblatt	ca30b5db2b	Introduce hpdata_t. Using an edata_t both for hugepages and the allocations within those hugepages was convenient at first, but has outlived its usefulness. Representing hugepages explicitly, with their own data structure, will make future development easier.	2020-12-07 06:21:08 -08:00
David Goldblatt	43af63fff4	HPA: Manage whole hugepages at a time. This redesigns the HPA implementation to allow us to manage hugepages all at once, locally, without relying on a global fallback.	2020-12-07 06:21:08 -08:00
David Goldblatt	c1b2a77933	psset: Move in stats. A later change will benefit from having these functions pulled into a psset-module set of functions.	2020-12-07 06:21:08 -08:00
David Goldblatt	d0a991d47b	psset: Add insert/remove functions. These will allow us to (for instance) move pageslabs from a psset dedicated to not-yet-hugeified pages to one dedicated to hugeified ones.	2020-12-07 06:21:08 -08:00
David Goldblatt	d438296b1f	narenas_ratio: Accept fractional values. With recent scalability improvements to the HPA, we're experimenting with much lower arena counts; this gets annoying when trying to test across different hardware configurations using only the narenas setting.	2020-12-04 23:48:19 -08:00
David Goldblatt	ecd39418ac	Add fxp: A fixed-point math library. This will be used in the next commit to allow non-integer values for narenas_ratio.	2020-12-04 23:48:19 -08:00
David Carlier	520b75fa2d	utrace support with label based signature.	2020-11-30 11:43:00 -08:00
Yinan Zhang	92e189be8b	Add some comments to the batch allocation logic flow	2020-11-16 20:58:01 -08:00
Yinan Zhang	d96e4525ad	Route batch allocation of small batch size to tcache	2020-11-16 20:58:01 -08:00
Yinan Zhang	566c4a8594	Slight changes to cache bin internal functions	2020-11-16 20:58:01 -08:00
Yinan Zhang	9545c2cd36	Add sample interval to prof last-N dump	2020-11-13 15:33:27 -08:00
David Goldblatt	cf2549a149	Add a per-arena oversize_threshold. This can let manual arenas trade off memory and CPU the way auto arenas do.	2020-11-13 13:45:35 -08:00
David Goldblatt	4ca3d91e96	Rename geom_grow -> exp_grow. This was promised in the review of the introduction of geom_grow, but would have been painful to do there because of the series that introduced it. Now that those are committed, renaming is easier.	2020-11-13 13:42:33 -08:00
David Goldblatt	b4c37a6e81	Rename edata_tree_t -> edata_avail_t. This isn't a tree any more, and it mildly irritates me any time I see it.	2020-11-13 13:42:11 -08:00
David Carlier	95f0a77fde	Detect pthread_getname_np explicitly. At least one libc (musl) defines pthread_setname_np without defining pthread_getname_np. Detect the presence of each individually, rather than inferring both must be defined if set is.	2020-11-11 17:31:22 -08:00
David Goldblatt	589638182a	Use the edata_cache_small_t in the HPA.	2020-11-05 12:34:43 -08:00
David Goldblatt	03a6047111	Edata cache small: rewrite. In previous designs, this was intended to be a sort of cache that couldn't fail. In the current design, we want to use it just as a contention reduction mechanism. Rewrite it with those goals in mind.	2020-11-05 12:34:43 -08:00
David Goldblatt	c9757d9e3b	HPA: Don't disable shards that were never started.	2020-11-05 12:34:43 -08:00
David Goldblatt	1b3ee75667	Add experimental.thread.activity_callback. This (experimental, undocumented) functionality can be used by users to track various statistics of interest at a finer level of granularity than the thread.	2020-11-05 12:33:25 -08:00
David Carlier	d2d941017b	MADV_DO[NOT]DUMP support equivalence on FreeBSD.	2020-11-02 09:15:15 -08:00
DC	ef6d51ed44	DragonFlyBSD build support.	2020-10-27 12:35:19 -07:00
Qi Wang	bf72188f80	Allow opt.tcache_max to accept small size classes. Previously all the small size classes were cached. However this has downsides -- particularly when page size is greater than 4K (e.g. iOS), which will result in much higher SMALL_MAXCLASS. This change allows tcache_max to be set to lower values, to better control resources taken by tcache.	2020-10-24 20:43:44 -07:00
David Goldblatt	ea32060f9c	SEC: Implement thread affinity. For now, just have every thread pick a shard once and stick with it.	2020-10-23 11:14:34 -07:00
David Goldblatt	d16849c91d	psset: Do first-fit based on slab age. This functions more like the serial number strategy of the ecache and hpa_central_t. Longer-lived slabs are more likely to continue to live for longer in the future.	2020-10-23 11:14:34 -07:00
David Goldblatt	634ec6f50a	Edata: add an "age" field.	2020-10-23 11:14:34 -07:00
David Goldblatt	6599651aee	PA: Use an SEC in front of the HPA shard.	2020-10-23 11:14:34 -07:00
David Goldblatt	ea51e97bb8	Add SEC module: a small extent cache. This can be used to take pressure off a more centralized, worse-sharded allocator without requiring a full break of the arena abstraction.	2020-10-23 11:14:34 -07:00
David Goldblatt	1964b08394	HPA: Add stats for the hpa_shard.	2020-10-23 11:14:34 -07:00
David Goldblatt	534504d4a7	HPA: add size-exclusion functionality. I.e. only allowing allocations under or over certain sizes.	2020-10-23 11:14:34 -07:00
David Goldblatt	484f04733e	HPA: Add central mutex contention stats.	2020-10-23 11:14:34 -07:00
David Goldblatt	bf025d2ec8	HPA: Make slab sizes and maxes configurable. This allows easy experimentation with them as tuning parameters.	2020-10-23 11:14:34 -07:00
David Goldblatt	1c7da33317	HPA: Tie components into a PAI implementation.	2020-10-23 11:14:34 -07:00
Qi Wang	c8209150f9	Switch from opt.lg_tcache_max to opt.tcache_max Though for convenience, keep parsing lg_tcache_max.	2020-10-22 20:40:41 -07:00
Yinan Zhang	5ba861715a	Add thread name in prof last-N records	2020-10-20 15:58:24 -07:00
Qi Wang	5e41ff9b74	Add a hard limit on tcache max size class. For locality reasons, tcache bins are integrated in TSD. Allowing all size classes to be cached has little benefit, but takes up much thread local storage. In addition, it complicates the layout which we try hard to optimize.	2020-10-16 13:49:51 -07:00
Qi Wang	3de19ba401	Eagerly detect double free and sized dealloc bugs for large sizes.	2020-10-15 10:03:16 -07:00
David Goldblatt	be9548f2be	Tcaches: Fix a subtle race condition. Without a lock held continuously between checking tcaches_past and incrementing it, it's possible for two threads to go down manual creation path simultaneously. If the number of tcaches is one less than the maximum, it's possible for both to create a tcache and increment tcaches_past, with the second thread returning a value larger than TCACHES_MAX.	2020-10-13 15:06:16 -07:00
Qi Wang	a9aa6f6d0f	Fix the alloc_ctx check in free_fastpath. The sanity check requires a functional TSD, which free_fastpath only guarantees after the threshold branch. Move the check function to afterwards.	2020-10-12 19:02:27 -07:00
David Goldblatt	b971f7c4dd	Add "default" option to slab sizes. This comes in handy when overriding earlier settings to test alternate ones. We don't really include tests for this, but I claim that's OK here: - It's fairly straightforward - It's fairly hard to test well - This entire code path is undocumented and mostly for our internal experimentation in the first place. - I tested manually.	2020-10-07 12:54:29 -07:00
David Goldblatt	21b70cb540	Add hpa_central module This will be the centralized component of the coming hugepage allocator; the source of larger chunks of memory from which smaller ones can be obtained.	2020-10-05 19:55:57 -07:00
David Goldblatt	1ed7ec369f	Emap: Add emap_assert_not_mapped. The counterpart to emap_assert_mapped, it lets callers check that some edata is not already in the emap.	2020-10-05 19:55:57 -07:00
David Goldblatt	259c5e3e8f	psset: Add stats	2020-09-18 12:39:25 -07:00
David Goldblatt	018b162d67	Add psset: a set of pageslabs. This introduces a new sort of edata_t; a pageslab, and a set to manage them. This is part of a series of commits to implement a hugepage allocator; the pageset will be per-arena, and track small page allocation requests within a larger extent allocated from a centralized hugepage allocator.	2020-09-18 12:39:25 -07:00
David Goldblatt	e034500698	Edata: rename "ranged" bit to "pai". This better represents its intended purpose; the hugepage allocator design evolved away from needing contiguity of hugepage virtual address space.	2020-09-18 12:39:25 -07:00
Yinan Zhang	b549389e4a	Correct usize in prof last-N record	2020-09-09 13:31:35 -07:00
Yinan Zhang	202f01d4f8	Fix szind computation in profiling	2020-08-27 15:52:25 -07:00
Yinan Zhang	20f2479ed7	Do not create size class tables for non-prof builds	2020-08-24 20:10:02 -07:00
Yinan Zhang	8efcdc3f98	Move unbias data to prof_data	2020-08-24 20:10:02 -07:00
David Goldblatt	5e90fd006e	Geom_grow: Don't keep the mutex internal. We're about to use it in ways that will have external synchronization.	2020-08-19 16:53:21 -07:00
David Goldblatt	c57494879f	Geom_grow: Don't take tsdn at init. It's never used.	2020-08-19 16:53:21 -07:00
David Goldblatt	ffe552223c	Geom_grow: Move in advancing logic.	2020-08-19 16:53:21 -07:00
David Goldblatt	131b1b5338	Rename ecache_grow -> geom_grow. We're about to start using it outside of the ecaches, in the HPA central allocator.	2020-08-19 16:53:21 -07:00
David Goldblatt	7b187360e9	IO: Support 0-padding for unsigned numbers.	2020-08-13 10:03:15 -07:00
David Goldblatt	ab274a23b9	Add narenas_ratio. This allows setting arenas per cpu dynamically, rather than forcing the user to know the number of CPUs in advance if they want a particular CPU/space tradeoff.	2020-08-12 16:41:57 -07:00
Yinan Zhang	743021b63f	Fix size miscalculation bug in reallocation	2020-08-11 11:56:43 -07:00
David Goldblatt	eaed1e39be	Add sized-delete size-checking functionality. The existing checks are good at finding such issues (on tcache flush), but not so good at pinpointing them. Debug mode can find them, but sometimes debug mode slows down a program so much that hard-to-hit bugs can take a long time to crash. This commit adds functionality to keep programs mostly on their fast paths, while also checking every sized delete argument they get.	2020-08-05 19:34:05 -07:00
David Goldblatt	53084cc5c2	Safety check: Don't directly abort. The sized dealloc checks called the generic safety_check_fail, and then called abort. This means the failure case isn't mockable, hence not testable. Fix it in anticipation of a coming diff.	2020-08-05 19:34:05 -07:00
David Goldblatt	60993697d8	Prof: Add prof_unbias. This gives more accurate attribution of bytes and counts to stack traces, without introducing backwards incompatibilities in heap-profile parsing tools. We track the ideal reported (to the end user) number of bytes more carefully inside core jemalloc. When dumping heap profiles, instead of outputting our counts directly, we output counts that will cause parsing tools to give a result close to the value we want. We retain the old version as an opt setting, to let users who are tracking values on a per-component basis keep their metrics stable until they decide to switch.	2020-08-05 18:33:55 -07:00
David Goldblatt	81c2f841e5	Add a simple utility to detect profiling bias.	2020-08-05 18:33:55 -07:00
Yinan Zhang	f6cf5eb388	Add mallctl for batch allocation API	2020-07-31 09:16:50 -07:00
Yinan Zhang	978f830ee3	Add batch allocation API	2020-07-31 09:16:50 -07:00
Yinan Zhang	f805468957	Add zero option to arena batch allocation	2020-07-31 09:16:50 -07:00
Yinan Zhang	49e5c2fe7d	Add batch allocation from fresh slabs	2020-07-31 09:16:50 -07:00
Yinan Zhang	f28cc2bc87	Extract bin shard selection out of bin locking	2020-07-31 09:16:50 -07:00
David Goldblatt	1ed0288d9c	bit_util: Change ffs functions indexing. Making these 0-based instead of 1-based makes calling code simpler and will be more consistent with functions introduced in subsequent diffs.	2020-07-30 15:25:23 -07:00
Yinan Zhang	fb347dc618	Verify output space before doing heavy work in mallctl	2020-07-27 09:48:35 -07:00
Yinan Zhang	f5fb4e5a97	Modify mallctl output length when needed This is the only reason why `oldlenp` was designed to be in the form of a pointer.	2020-07-27 09:48:35 -07:00
Yinan Zhang	4258402047	Corrections for prof_log_start()	2020-07-22 13:34:49 -07:00
Yinan Zhang	e6cb7a1c9b	Shorten wait time for peak events	2020-07-14 09:00:33 -07:00
David Goldblatt	6107857b7b	PA->PAC: Move in PAI implementation.	2020-07-09 13:41:04 -07:00
David Goldblatt	6041aaba97	PA -> PAC: Move in destruction functions.	2020-07-09 13:41:04 -07:00
David Goldblatt	cbf096b05e	Arena: remove redundant bg inactivity check.	2020-07-09 13:41:04 -07:00
David Goldblatt	471eb5913c	PAC: Move in decay rate setting.	2020-07-09 13:41:04 -07:00
David Goldblatt	6a2774719f	PA->PAC: Move in decay functions.	2020-07-09 13:41:04 -07:00
David Goldblatt	4ee75be3a3	PA -> PAC: Move in decay_purge enum.	2020-07-09 13:41:04 -07:00
David Goldblatt	72435b0aba	PA->PAC: Make extent.c forget about PA.	2020-07-09 13:41:04 -07:00
David Goldblatt	dee5d1c42d	PA->PAC: Move in extent_sn.	2020-07-09 13:41:04 -07:00
David Goldblatt	7391382349	PA->PAC: Move in stats.	2020-07-09 13:41:04 -07:00
David Goldblatt	db211eefbf	PAC: Move in decay.	2020-07-09 13:41:04 -07:00
David Goldblatt	c81e389996	PAC: Move in ecache_grow.	2020-07-09 13:41:04 -07:00
David Goldblatt	65803171a7	PAC: move in emap	2020-07-09 13:41:04 -07:00
David Goldblatt	7efcb946c4	PAC: Add an init function.	2020-07-09 13:41:04 -07:00
David Goldblatt	722652222a	PAC: Move in edata_cache accesses.	2020-07-09 13:41:04 -07:00
David Goldblatt	777b0ba965	Add PAC: Page allocator classic. For now, this is just a stub containing the ecaches, with no surrounding code changed. Eventually all the core allocator bits will be moved in, in the subsequent stack of commits.	2020-07-09 13:41:04 -07:00
David Goldblatt	1b5f632e0f	Introduce PAI: Page allocator interface	2020-07-09 13:41:04 -07:00
David Goldblatt	f1f4ec315a	Tcache: Tweak nslots_max tuning parameter. In making these settings configurable, `634afc4124` unintentionally changed a tuning parameter (reducing the "goal" max by a factor of 4). This commit undoes that change.	2020-07-09 08:58:05 -07:00
David Goldblatt	392f645f4d	Edata: split up different list linkage uses.	2020-07-08 13:20:59 -07:00
David Carlier	00f06c9beb	Enabling MPSS on Solaris/illumos, reusing the Linux configuration as much as possible and aligning the address range to HUGEPAGE.	2020-07-06 09:59:10 -07:00
Yinan Zhang	c2e7a06392	No need to intercept prof_dump_header() in tests	2020-06-29 14:27:50 -07:00
Yinan Zhang	f58ebdff7a	Generalize prof_cnt_all() for testing	2020-06-29 14:27:50 -07:00
Yinan Zhang	80d18c18c9	Pass prof dump parameters explicitly in prof_sys	2020-06-29 14:27:50 -07:00
Yinan Zhang	d4259ea53b	Simplify signatures for prof dump functions	2020-06-29 14:27:50 -07:00
Yinan Zhang	5d823f3a91	Consolidate struct definitions for prof dump parameters	2020-06-29 14:27:50 -07:00
Yinan Zhang	1f5fe3a3e3	Pass write callback explicitly in prof_data	2020-06-29 14:27:50 -07:00
Yinan Zhang	4556d3c0c8	Define structures for prof dump parameters	2020-06-29 14:27:50 -07:00
Yinan Zhang	1c6742e6a0	Migrate prof dumping to use buffered writer	2020-06-29 14:27:50 -07:00
Yinan Zhang	dad821bb22	Move unwind to prof_sys	2020-06-29 14:27:50 -07:00
Yinan Zhang	d128efcb6a	Relocate a few prof utilities to the right modules	2020-06-29 14:27:50 -07:00
Yinan Zhang	4736fb4fc9	Move file handling logic in prof_data to prof_sys	2020-06-29 14:27:50 -07:00
Yinan Zhang	767a2e1790	Move file handling logic in prof to prof_sys	2020-06-29 14:27:50 -07:00
Yinan Zhang	03ae509f32	Create prof_sys module for reading system thread name	2020-06-29 14:27:50 -07:00
Yinan Zhang	adfd9d7b1d	Change tsdn to tsd for thread name allocation	2020-06-29 14:27:50 -07:00
Yinan Zhang	841af2b426	Move thread name handling to prof_data module	2020-06-29 14:27:50 -07:00
Yinan Zhang	c8683bee80	Unify printing for prof counts object	2020-06-29 14:27:50 -07:00
Yinan Zhang	5d292b5660	Push error handling logic out of core dumping logic	2020-06-29 14:27:50 -07:00
Yinan Zhang	354183b10d	Define prof dump buffer size centrally	2020-06-29 14:27:50 -07:00
Yinan Zhang	7455813e57	Make dump file writing replaceable in test	2020-06-29 14:27:50 -07:00
Yinan Zhang	21e44c45d9	Make maps file opening replaceable in test	2020-06-29 14:27:50 -07:00
Yinan Zhang	4bb4037dbe	Extract utility function for opening maps file	2020-06-29 14:27:50 -07:00
Yinan Zhang	f307b25804	Only replace the dump file opening function in test	2020-06-29 14:27:50 -07:00
Yinan Zhang	d460333efb	Improve naming for prof system thread name option	2020-06-24 14:32:01 -07:00
Yinan Zhang	092fcac0b4	Remove unnecessary source files	2020-06-19 12:15:44 -07:00
Yinan Zhang	a795b19327	Remove beginning define in source files ``` sed -i "/^#define JEMALLOC_[A-Z_]*_C_$/d" src/*.c; ```	2020-06-19 12:15:44 -07:00
Yinan Zhang	24bbf376ce	Unify arena flag reading and selection	2020-06-19 11:06:05 -07:00
Yinan Zhang	e128b170a0	Do not fallback to auto arena when manual arena is requested	2020-06-19 11:06:05 -07:00
Yinan Zhang	95a59d2f72	Unify tcache flag reading and selection	2020-06-19 11:06:05 -07:00
Yinan Zhang	4b0c008489	Unify zero flag reading and setting	2020-06-19 11:06:05 -07:00
Yinan Zhang	2a84f9b8fc	Unify alignment flag reading and computation	2020-06-19 11:06:05 -07:00
Yinan Zhang	b7858abfc0	Expose prof testing internal functions	2020-06-19 09:16:51 -07:00
Yinan Zhang	40fa6674a9	Fix prof timestamp conf reading	2020-06-17 16:02:51 -07:00
David Goldblatt	40672b0b78	Remove duplicate logging in malloc.	2020-06-16 10:33:55 -07:00
Jon Haslam	4aea743279	High Resolution Timestamps for Profiling	2020-06-15 12:12:49 -07:00
David Goldblatt	d82a164d0d	Add thread.peak.[read|reset] mallctls. These can be used to track net allocator activity on a per-thread basis.	2020-06-11 13:54:22 -07:00
Yinan Zhang	3e19ebd2ea	Add lock to protect prof last-N dumping	2020-06-09 17:03:05 -07:00
Yinan Zhang	a835d9cf85	Make prof last-N dumping non-blocking	2020-06-09 17:03:05 -07:00
Yinan Zhang	fc8bc4b5c0	Increase dump buffer for prof last-N list	2020-06-09 17:03:05 -07:00
Yinan Zhang	264d89d641	Extract restore and async cleanup functions for prof last-N list	2020-06-09 17:03:05 -07:00
Yinan Zhang	857ebd3daf	Make edata pointer on prof recent record an atomic fence	2020-06-09 17:03:05 -07:00
Yinan Zhang	730658f72f	Extract alloc/dalloc utility for last-N nodes	2020-06-09 17:03:05 -07:00
Yinan Zhang	035be44867	Separate out dumping for each prof recent record	2020-06-09 17:03:05 -07:00
David Goldblatt	8da0896b79	Tcache: Make an integer conversion explicit.	2020-05-28 15:52:40 -07:00
David Goldblatt	6cdac3c573	Tcache: Make flush fractions configurable.	2020-05-16 13:34:23 -07:00
David Goldblatt	7503b5b33a	Stats, CTL: Expose new tcache settings.	2020-05-16 13:34:23 -07:00
David Goldblatt	ee72bf1cfd	Tcache: Add tcache gc delay option. This can reduce flushing frequency for small size classes.	2020-05-16 13:34:23 -07:00
David Goldblatt	d338dd45d7	Tcache: Make incremental gc bytes configurable.	2020-05-16 13:34:23 -07:00
David Goldblatt	ec0b579563	Tcache: Privatize opt_lg_tcache_max default.	2020-05-16 13:34:23 -07:00
David Goldblatt	181093173d	Tcache: make slot sizing configurable.	2020-05-16 13:34:23 -07:00
David Goldblatt	b58dea8d1b	Cache bin: expose ncached_max publicly.	2020-05-16 13:34:23 -07:00
David Goldblatt	634afc4124	Tcache: Make size computation configurable.	2020-05-16 13:34:23 -07:00
David Goldblatt	eda9c2858f	Edata: zero stack edatas before initializing. This avoids some UB. No compilers take advantage of it for now, but no sense in tempting fate.	2020-05-14 10:30:20 -07:00
David Goldblatt	5dead37a9d	Allow narenas:default. This can be useful when you know you want to override some lower-priority configuration setting with its default value, but don't know what that value would be.	2020-05-14 10:30:08 -07:00
Yinan Zhang	75dae934a1	Always initialize TE counters in TSD init	2020-05-12 09:16:16 -07:00
Yinan Zhang	b06dfb9ccc	Push event handlers to constituent modules	2020-05-12 09:16:16 -07:00
Yinan Zhang	381c97caa4	Treat postponed prof sample event as new event	2020-05-12 09:16:16 -07:00
Yinan Zhang	abd4674931	Extract out per event postponed wait time fetching	2020-05-12 09:16:16 -07:00
Yinan Zhang	f72014d097	Only compute thread event threshold once per trigger	2020-05-12 09:16:16 -07:00
Yinan Zhang	7324c4f85f	Break down event init and handler functions	2020-05-12 09:16:16 -07:00
Yinan Zhang	6de77799de	Move thread event wait time update to local	2020-05-12 09:16:16 -07:00
Yinan Zhang	733ae918f0	Extract out per event new wait time fetching	2020-05-12 09:16:16 -07:00
Yinan Zhang	1e2524e15a	Do not reset sample wait time when re-initing tdata	2020-05-12 09:16:16 -07:00
Yinan Zhang	fc052ff728	Migrate counter to use locked int	2020-05-12 08:23:15 -07:00
Yinan Zhang	f533ab6da6	Add forking handling for stats	2020-05-11 15:35:06 -07:00
Yinan Zhang	508303077b	Add forking handling for prof idump counter	2020-05-11 15:35:06 -07:00
Yinan Zhang	4d970f8bfc	Add forking handling for counter module	2020-05-11 15:35:06 -07:00
Yinan Zhang	2097e1945b	Unify write callback signature	2020-05-11 14:51:24 -07:00
Yinan Zhang	8be5584494	Initialize prof idump counter once rather than once per arena	2020-05-11 12:24:56 -07:00
Yinan Zhang	e10e5059e8	Make prof_idump_accum() non-inline	2020-05-11 12:24:56 -07:00
Yinan Zhang	039bfd4e30	Do not rollback prof idump counter in arena_prof_promote()	2020-05-11 12:24:56 -07:00
Yinan Zhang	0295aa38a2	Deduplicate entries in witness error message	2020-05-11 12:04:02 -07:00
David Goldblatt	f1f8a75496	Let opt.zero propagate to core allocation. I.e. set dopts->zero early on if opt.zero is true, rather than leaving it set by the entry-point function (malloc, calloc, etc.) and then memsetting. This avoids situations where we zero once in the large-alloc pathway and then again via memset.	2020-05-04 12:36:45 -07:00
David Goldblatt	46471ea327	SC: Name the max lookup constant.	2020-05-04 12:27:07 -07:00
David Goldblatt	cd29ebefd0	Tcache: treat small and large cache bins uniformly	2020-04-14 15:20:19 -07:00
David Goldblatt	a13fbad374	Tcache: split up fast and slow path data.	2020-04-14 15:20:19 -07:00
David Goldblatt	7099c66205	Arena: fill in terms of cache_bins.	2020-04-14 15:20:19 -07:00
David Goldblatt	40e7aed59e	TSD: Move in some of the tcache fields. We had put these in the tcache for cache optimization reasons. After the previous diff, these no longer apply.	2020-04-14 15:20:19 -07:00
David Goldblatt	3589571bfd	SC: use SC_LG_NGROUP instead of its value. This magic constant introduces inconsistencies. We should be able to change its value solely by adjusting the definition in the header.	2020-04-13 10:01:30 -07:00
David Goldblatt	79ae7f9211	Rtree: Remove the per-field accessors. We instead split things into "edata" and "metadata".	2020-04-10 13:12:47 -07:00
David Goldblatt	bb6a418523	Emap: Drop szind/slab splitting parameters. After the previous diff, these are constants.	2020-04-10 13:12:47 -07:00
David Goldblatt	50289750b3	Extent: Remove szind/slab knowledge.	2020-04-10 13:12:47 -07:00
David Goldblatt	dc26b30094	Rtree: Clean up compact/non-compact split.	2020-04-10 13:12:47 -07:00
David Goldblatt	93b99dd140	Extent: Stop passing an edata_cache everywhere. We already pass the pa_shard_t around everywhere; we can just use that.	2020-04-10 13:12:47 -07:00
David Goldblatt	a4759a1911	Ehooks: avoid touching arena_emap_global in tests. That breaks our ability to test custom emaps in isolation.	2020-04-10 13:12:47 -07:00
David Goldblatt	11c47cb133	Extent: Take "bool zero" over "bool *zero".	2020-04-10 13:12:47 -07:00
David Goldblatt	1a1124462e	PA: Take zero as a bool rather than as a bool *. Now that we've moved junking to a higher level of the allocation stack, we don't care about this performance optimization (which only occurred in debug modes).	2020-04-10 13:12:47 -07:00
David Goldblatt	294b276fc7	PA: Parameterize emap. Move emap_global to arena. This lets us test the PA module without interfering with the global emap used by the real allocator (the one not under test).	2020-04-10 13:12:47 -07:00
David Goldblatt	f730577277	Eset: Parameterize last globals accesses. I.e. opt_retain and maps_coalesce.	2020-04-10 13:12:47 -07:00
David Goldblatt	7bb6e2dc0d	Eset: take opt_lg_max_active_fit as a parameter. This breaks its dependence on the global.	2020-04-10 13:12:47 -07:00
David Goldblatt	883ab327cc	Emap: Move out last edata state touching.	2020-04-10 13:12:47 -07:00
David Goldblatt	0c96a2f03b	Emap: Move out remaining edata modifications.	2020-04-10 13:12:47 -07:00
David Goldblatt	dfef0df71a	Emap: Move edata modification out of emap_remap.	2020-04-10 13:12:47 -07:00
David Goldblatt	12eb888e54	Edata: Add a ranged bit. We steal the dumpable bit, which we ended up not needing.	2020-04-10 13:12:47 -07:00
David Goldblatt	bd4fdf295e	Rtree: Pull leaf contents into their own struct.	2020-04-10 13:12:47 -07:00
David Goldblatt	faec7219b2	PA: Move in decay initialization.	2020-04-10 13:12:47 -07:00
David Goldblatt	45671e4a27	PA: Move in retain growth limit setting.	2020-04-10 13:12:47 -07:00
David Goldblatt	daefde88fe	PA: Move in mutex stats reading.	2020-04-10 13:12:47 -07:00
David Goldblatt	07675840a5	PA: Move in some more internals accesses.	2020-04-10 13:12:47 -07:00
David Goldblatt	238f3c7430	PA: Move in full stats merging.	2020-04-10 13:12:47 -07:00
David Goldblatt	81c6027592	Arena stats: Give it its own "mapped". This distinguishes it from the PA mapped stat, which is now named "pa_mapped" to avoid confusion. The (derived) arena stat includes base memory, and the PA stat is no longer partially derived.	2020-04-10 13:12:47 -07:00
David Goldblatt	506d907e40	PA: Move in basic stats merging.	2020-04-10 13:12:47 -07:00
David Goldblatt	f29f6090f5	PA: Add pa_extra.c and put PA forking there.	2020-04-10 13:12:47 -07:00
David Goldblatt	8164fad404	Stats: Fix edata_cache size merging. Previously, we assigned to the output rather than incrementing it.	2020-04-10 13:12:47 -07:00

1873 Commits