server-skynet-source-3rd-jemalloc


Author	SHA1	Message	Date
Dave Watson	8b14f3abc0	background_thread: add max thread count config. Looking at the thread counts in our services, jemalloc's background thread is useful, but mostly idle. Add a config option to tune down the number of threads.	2018-04-10 14:01:45 -07:00
Qi Wang	4be74d5112	Consolidate the two memory loads in rtree_szind_slab_read(). szind and slab bits are read on fast path, where compiler generated two memory loads separately for them before this diff. Manually operate on the bits to avoid the extra memory load.	2018-04-10 10:18:46 -07:00
Qi Wang	d3e0976a2c	Fix type warning on Windows. Add cast since read / write has unsigned return type on windows.	2018-04-09 16:50:30 -07:00
Qi Wang	2dccf45640	Control idump and gdump with prof_active.	2018-04-09 16:35:14 -07:00
David Goldblatt	86c61d4a57	Stats printing: Move global mutex stats to use emitter.	2018-03-09 11:47:17 -08:00
David Goldblatt	ebe0b5f828	Emitter: Add support for row-based output in table mode. This is needed for things like mutex stats in table mode.	2018-03-09 11:47:17 -08:00
David Goldblatt	27a8fe6780	Introduce the emitter module. The emitter can be used to produce structured json or tabular output. For now it has no uses; in subsequent commits, I'll begin transitioning stats printing code over.	2018-03-09 11:47:17 -08:00
Qi Wang	e4f090e8df	Add opt.thp which allows explicit hugepage usage. "always" marks all user mappings as MADV_HUGEPAGE; while "never" marks all mappings as MADV_NOHUGEPAGE. The default setting "default" does not change any settings. Note that all the madvise calls are part of the default extent hooks by design, so that customized extent hooks have complete control over the mappings including hugepage settings.	2018-03-08 13:08:06 -08:00
Qi Wang	efa40532dc	Remove config.thp which wasn't in use.	2018-03-08 13:08:06 -08:00
David T. Goldblatt	dd7e283b6f	Tweak the ticker paths to help GCC generate better code. GCC on its own isn't quite able to turn the ticker subtract into a memory operation followed by a js.	2018-02-21 16:04:23 -08:00
rustyx	83aa9880b7	Make generated headers usable in both x86 and x64 mode in Visual Studio	2018-01-30 13:11:41 -08:00
Christopher Ferris	f78d4ca3fb	Modify configure to determine return value of strerror_r. On glibc and Android's bionic, strerror_r returns char* when _GNU_SOURCE is defined. Add a configure check for this rather than assume glibc is the only libc that behaves this way.	2018-01-10 21:01:18 -08:00
Qi Wang	41790f4fa4	Check tsdn_null before reading reentrancy level.	2018-01-05 13:05:17 -08:00
Qi Wang	91b247d311	In iallocztm, check lock rank only when not in reentrancy.	2018-01-05 13:05:17 -08:00
Rajeev Misra	72bdbc35e3	extent_t bitpacking logic refactoring	2018-01-04 11:11:04 -08:00
Rajeev Misra	f47e39d11a	handle 32 bit mutex counters	2018-01-04 11:08:17 -08:00
David Goldblatt	21f7c13d0b	Add the div module, which allows fast division by dynamic values.	2017-12-21 14:25:43 -08:00
David T. Goldblatt	7f1b02e3fa	Split up and standardize naming of stats code. The arena-associated stats are now all prefixed with arena_stats_, and live in their own file. Likewise, malloc_bin_stats_t -> bin_stats_t, also in its own file.	2017-12-18 16:29:10 -08:00
David T. Goldblatt	901d94a2b0	Rename cache_alloc_easy to cache_bin_alloc_easy. This lives in the cache_bin module; just a typo.	2017-12-18 16:29:10 -08:00
David T. Goldblatt	8aafa270fd	Move bin stats code from arena to bin module.	2017-12-18 16:29:10 -08:00
David T. Goldblatt	48bb4a056b	Move bin forking code from arena to bin module.	2017-12-18 16:29:10 -08:00
David T. Goldblatt	a8dd8876fb	Move bin initialization from arena module to bin module.	2017-12-18 16:29:10 -08:00
David T. Goldblatt	4bf4a1c4ea	Pull out arena_bin_info_t and arena_bin_t into their own file. In the process, kill arena_bin_index, which is unused. To follow are several diffs continuing this separation.	2017-12-18 16:29:10 -08:00
Qi Wang	740bdd68b1	Over purge by 1 extent always. When purging, large allocations are usually the ones that cross the npages_limit threshold, simply because they are "large". This means we often leave the large extent around for a while, which has the downsides of: 1) high RSS and 2) more chance of them getting fragmented. Given that they are not likely to be reused very soon (LRU), let's over purge by 1 extent (which is often large and not reused frequently).	2017-12-18 12:57:07 -08:00
Ed Schouten	749caf14ae	Also use __riscv to detect builds for RISC-V CPUs. According to the RISC-V toolchain conventions, __riscv__ is the old spelling of this definition. __riscv should be used going forward. https://github.com/riscv/riscv-toolchain-conventions#cc-preprocessor-definitions	2017-12-09 10:10:42 -08:00
Qi Wang	eb1b08daae	Fix an extent coalesce bug. When coalescing, we should take both extents off the LRU list; otherwise decay can grab the existing outer extent through extents_evict.	2017-11-16 15:32:02 -08:00
Qi Wang	fac706836f	Add opt.lg_extent_max_active_fit. When allocating from dirty extents (which we always prefer if available), large active extents can get split even if the new allocation is much smaller, in which case the introduced fragmentation causes high long term damage. This new option controls the threshold to reuse and split an existing active extent. We avoid using a large extent for much smaller sizes, in order to reduce fragmentation. In some workloads, adding the threshold improves virtual memory usage by >10x.	2017-11-16 15:32:02 -08:00
Dave Watson	d6feed6e66	Use tsd offset_state instead of atomic. While working on #852, I noticed the prng state is atomic. This is the only atomic use of prng in all of jemalloc. Instead, use a threadlocal prng state if possible to avoid unnecessary cache line contention.	2017-11-14 08:58:18 -08:00
Qi Wang	cb3b72b975	Fix base allocator THP auto mode locking and stats. Added proper synchronization for switching to using THP in auto mode. Also fixed stats for number of THPs used.	2017-11-09 16:14:12 -08:00
Qi Wang	b5d071c266	Fix unbounded increase in stash_decayed. Added an upper bound on how many pages we can decay during the current run. Without this, decay could have unbounded increase in stashed, since other threads could add new pages into the extents.	2017-11-08 16:33:30 -08:00
Qi Wang	e422fa8e7e	Add arena.i.retain_grow_limit. This option controls the max size when grow_retained. This is useful when we have customized extent hooks reserving physical memory (e.g. 1G huge pages). Without this feature, the default increasing sequence could result in fragmented and wasted physical memory.	2017-11-03 13:53:33 -07:00
Qi Wang	58eba024c0	metadata_thp: auto mode adjustment for a0. We observed that arena 0 can have much more metadata allocated comparing to other arenas. Tune the auto mode to only switch to huge page on the 5th block (instead of 3 previously) for a0.	2017-11-01 13:52:06 -07:00
David Goldblatt	d14bbf8d81	Add a "dumpable" bit to the extent state. Currently, this is unused (i.e. all extents are always marked dumpable). In the future, we'll begin using this functionality.	2017-10-16 15:35:49 -07:00
David Goldblatt	bbaa72422b	Add pages_dontdump and pages_dodump. This will, eventually, enable us to avoid dumping eden regions.	2017-10-16 15:35:49 -07:00
David Goldblatt	ccd09050aa	Add configure-time detection for madvise(..., MADV_DO[NT]DUMP)	2017-10-16 15:35:49 -07:00
Qi Wang	31ab38be5f	Define MADV_FREE on our own when needed. On x86 Linux, we define our own MADV_FREE if madvise(2) is available, but no MADV_FREE is detected. This allows the feature to be built in and enabled with runtime detection.	2017-10-11 15:49:22 -07:00
David Goldblatt	1245faae90	Power: disable the CPU_SPINWAIT macro. Quoting from https://github.com/jemalloc/jemalloc/issues/761 : [...] reading the Power ISA documentation[1], the assembly in [the CPU_SPINWAIT macro] isn't correct anyway (as @marxin points out): the setting of the program-priority register is "sticky", and we never undo the lowering. We could do something similar, but given that we don't have testing here in the first place, I'm inclined to simply not try. I'll put something up reverting the problematic commit tomorrow. [1] Book II, chapter 3 of the 2.07B or 3.0B ISA documents.	2017-10-04 18:37:23 -07:00
Dave Watson	7c6c99b829	Use ph instead of rb tree for extents_avail_. There does not seem to be any overlap between usage of extent_avail and extent_heap, so we can use the same hook. The only remaining usage of rb trees is in the profiling code, which has some 'interesting' iteration constraints. Fixes #888	2017-10-04 12:23:03 -07:00
David Goldblatt	8a7ee3014c	Logging: capitalize log macro. Dodge a name-conflict with the math.h logarithm function. D'oh.	2017-10-02 20:44:43 -07:00
David Goldblatt	7a8bc7172b	ARM: Don't extend bit LG_VADDR to compute high address bits. In userspace ARM on Linux, zero-ing the high bits is the correct way to do this. This doesn't fix the fact that we currently set LG_VADDR to 48 on ARM, when in fact larger virtual address sizes are coming soon. We'll cross that bridge when we come to it.	2017-10-02 14:54:46 -07:00
Qi Wang	3959a9fe19	Avoid left shift by negative values. Fix warnings on -Wshift-negative-value.	2017-09-25 15:38:58 -07:00
Qi Wang	56f0e57844	Add "falls through" comment explicitly. Fix warnings by -Wimplicit-fallthrough.	2017-09-25 15:38:58 -07:00
Qi Wang	d60f3bac12	Add missing field in initializer for rtree cache. Fix a warning by -Wmissing-field-initializers.	2017-09-21 12:18:10 -07:00
Qi Wang	a315688be0	Relax constraints on reentrancy for extent hooks. If we guarantee no malloc activity in extent hooks, it's possible to make customized hooks working on arena 0. Remove the non-a0 assertion to enable such use cases.	2017-08-31 11:03:34 -07:00
Qi Wang	e55c3ca267	Add stats for metadata_thp. Report number of THPs used in arena and aggregated stats.	2017-08-30 16:47:32 -07:00
Qi Wang	47b20bb654	Change opt.metadata_thp to [disabled,auto,always]. To avoid the high RSS caused by THP + low usage arena (i.e. THP becomes a significant percentage), added a new "auto" option which will only start using THP after a base allocator used up the first THP region. Starting from the second hugepage (in a single arena), "auto" behaves the same as "always", i.e. madvise hugepage right away.	2017-08-30 16:47:32 -07:00
David Goldblatt	ea91dfa58e	Document the ialloc function abbreviations. In the jemalloc_internal_inlines files, we have a lot of somewhat terse function names. This commit adds some documentation to aid in translation.	2017-08-16 17:48:44 -07:00
David Goldblatt	9c0549007d	Make arena stats collection go through cache bins. This eliminates the need for the arena stats code to "know" about tcaches; all that it needs is a cache_bin_array_descriptor_t to tell it where to find cache_bins whose stats it should aggregate.	2017-08-16 17:48:44 -07:00
David Goldblatt	f3170baa30	Pull out caching for a bin into its own file. This is the first step towards breaking up the tcache and arena (since they interact primarily at the bin level). It should also make a future arena caching implementation more straightforward.	2017-08-16 17:48:44 -07:00
Faidon Liambotis	82d1a3fb31	Add support for m68k, nios2, SH3 architectures. Add minimum alignment for three more architectures, as requested by Debian users or porters (see Debian bugs #807554, #816236, #863424).	2017-08-11 16:35:44 -07:00
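
Commit 21f7c13d0b above adds a div module for "fast division by dynamic values". The commit message does not describe the technique, so what follows is only a minimal sketch of one common way to do this: precompute ceil(2^32 / d) once, then replace each later division with a multiply and a shift. The names fast_div_t, fast_div_init and fast_div_exact are hypothetical and are not taken from the jemalloc source. The sketch assumes that n is an exact multiple of d and that both fit in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type for illustration; not jemalloc's actual API. */
typedef struct {
    uint64_t magic; /* ceil(2^32 / d), kept in 64 bits so d == 1 also fits */
} fast_div_t;

/* Precompute the multiplier for a divisor that is only known at run time. */
static void fast_div_init(fast_div_t *fd, uint32_t d) {
    assert(d != 0);
    uint64_t two_to_32 = (uint64_t)1 << 32;
    fd->magic = two_to_32 / d;
    if (two_to_32 % d != 0) {
        fd->magic++; /* round up to get the ceiling */
    }
}

/*
 * Compute n / d with one multiply and one shift.
 * Assumption: n is an exact multiple of the d passed to fast_div_init.
 * For other n the result may differ from n / d.
 */
static uint32_t fast_div_exact(const fast_div_t *fd, uint32_t n) {
    return (uint32_t)(((uint64_t)n * fd->magic) >> 32);
}
```

A caller would run fast_div_init once per divisor (for example, once per size class) and then call fast_div_exact on the hot path, trading a hardware divide for a multiply.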

809 Commits