server-skynet-source-3rd-jemalloc

Author	SHA1	Message	Date
Qi Wang	e422fa8e7e	Add arena.i.retain_grow_limit This option controls the max size when grow_retained. This is useful when we have customized extent hooks reserving physical memory (e.g. 1G huge pages). Without this feature, the default increasing sequence could result in fragmented and wasted physical memory.	2017-11-03 13:53:33 -07:00
Edward Tomasz Napierala	9f455e2786	Try to use sysctl(3) instead of sysctlbyname(3). This attempts to use the VM_OVERCOMMIT OID - newly introduced in -CURRENT a few days ago, specifically for this purpose - instead of querying the sysctl by its string name. Due to how sysctlbyname(3) works, this means we do one syscall during binary startup instead of two. Signed-off-by: Edward Tomasz Napierala <trasz@FreeBSD.org>	2017-11-03 08:25:39 -07:00
Edward Tomasz Napierala	d591df05c8	Use getpagesize(3) under FreeBSD. This avoids sysctl(2) syscall during binary startup, using the value passed in the ELF aux vector instead. Signed-off-by: Edward Tomasz Napierala <trasz@FreeBSD.org>	2017-11-03 08:25:39 -07:00
Qi Wang	58eba024c0	metadata_thp: auto mode adjustment for a0. We observed that arena 0 can have much more metadata allocated compared to other arenas. Tune the auto mode to only switch to huge pages on the 5th block (instead of the 3rd previously) for a0.	2017-11-01 13:52:06 -07:00
Qi Wang	47203d5f42	Output all counters for bin mutex stats. The saved space is not worth the trouble of missing counters.	2017-10-19 16:31:54 -07:00
David Goldblatt	d14bbf8d81	Add a "dumpable" bit to the extent state. Currently, this is unused (i.e. all extents are always marked dumpable). In the future, we'll begin using this functionality.	2017-10-16 15:35:49 -07:00
David Goldblatt	bbaa72422b	Add pages_dontdump and pages_dodump. This will, eventually, enable us to avoid dumping eden regions.	2017-10-16 15:35:49 -07:00
David Goldblatt	211b1f3c7d	Factor out extent-splitting core from extent lifetime management. Before this commit, extent_recycle_split intermingles the splitting of an extent and the return of parts of that extent to a given extents_t. After it, that logic is separated. This will enable splitting extents that don't live in any extents_t (as the grow retained region soon will).	2017-10-16 15:35:49 -07:00
David Goldblatt	5bad01c38e	Document some of the internal extent functions.	2017-10-16 15:35:49 -07:00
Qi Wang	31ab38be5f	Define MADV_FREE on our own when needed. On x86 Linux, we define our own MADV_FREE if madvise(2) is available, but no MADV_FREE is detected. This allows the feature to be built in and enabled with runtime detection.	2017-10-11 15:49:22 -07:00
Qi Wang	7e74093c96	Set isthreaded manually. Avoid relying on pthread_once, which creates a dependency during init.	2017-10-05 22:57:56 -07:00
Qi Wang	a2e6eb2c22	Delay background_thread_ctl_init to right before thread creation. ctl_init sets isthreaded, which means it should be done without holding any locks.	2017-10-05 22:57:56 -07:00
Qi Wang	79e83451ff	Enable a0 metadata thp on the 3rd base block. Since we allocate rtree nodes from a0's base, it's pushed to over 1 block on initialization right away, which makes the auto thp mode less effective on a0. We change a0 to make the switch on the 3rd block instead.	2017-10-05 13:39:03 -07:00
David Goldblatt	1245faae90	Power: disable the CPU_SPINWAIT macro. Quoting from https://github.com/jemalloc/jemalloc/issues/761 : [...] reading the Power ISA documentation[1], the assembly in [the CPU_SPINWAIT macro] isn't correct anyway (as @marxin points out): the setting of the program-priority register is "sticky", and we never undo the lowering. We could do something similar, but given that we don't have testing here in the first place, I'm inclined to simply not try. I'll put something up reverting the problematic commit tomorrow. [1] Book II, chapter 3 of the 2.07B or 3.0B ISA documents.	2017-10-04 18:37:23 -07:00
Dave Watson	7c6c99b829	Use ph instead of rb tree for extents_avail_ There does not seem to be any overlap between usage of extent_avail and extent_heap, so we can use the same hook. The only remaining usage of rb trees is in the profiling code, which has some 'interesting' iteration constraints. Fixes #888	2017-10-04 12:23:03 -07:00
David Goldblatt	8a7ee3014c	Logging: capitalize log macro. Dodge a name-conflict with the math.h logarithm function. D'oh.	2017-10-02 20:44:43 -07:00
Qi Wang	0720192a32	Add runtime detection of lazy purging support. It's possible to build with lazy purge enabled but deploy to systems without such support. In this case, rely on the boot time detection instead of repeatedly making unnecessary madvise calls (which all return EINVAL).	2017-09-26 17:26:22 -07:00
Qi Wang	eaa58a5026	Put static keyword first. Fix a warning by -Wold-style-declaration.	2017-09-21 12:18:10 -07:00
Qi Wang	9b20a4bf70	Clear cache bin ql postfork. This fixes a regression in `9c05490`, which introduced the new cache bin ql. The list needs to be cleaned up after fork, same as tcache_ql.	2017-09-12 16:16:12 -07:00
Qi Wang	a315688be0	Relax constraints on reentrancy for extent hooks. If we guarantee no malloc activity in extent hooks, it's possible to make customized hooks work on arena 0. Remove the non-a0 assertion to enable such use cases.	2017-08-31 11:03:34 -07:00
Qi Wang	e55c3ca267	Add stats for metadata_thp. Report number of THPs used in arena and aggregated stats.	2017-08-30 16:47:32 -07:00
Qi Wang	47b20bb654	Change opt.metadata_thp to [disabled,auto,always]. To avoid the high RSS caused by THP + low usage arena (i.e. THP becomes a significant percentage), added a new "auto" option which will only start using THP after a base allocator used up the first THP region. Starting from the second hugepage (in a single arena), "auto" behaves the same as "always", i.e. madvise hugepage right away.	2017-08-30 16:47:32 -07:00
David Goldblatt	9c0549007d	Make arena stats collection go through cache bins. This eliminates the need for the arena stats code to "know" about tcaches; all that it needs is a cache_bin_array_descriptor_t to tell it where to find cache_bins whose stats it should aggregate.	2017-08-16 17:48:44 -07:00
David Goldblatt	f3170baa30	Pull out caching for a bin into its own file. This is the first step towards breaking up the tcache and arena (since they interact primarily at the bin level). It should also make a future arena caching implementation more straightforward.	2017-08-16 17:48:44 -07:00
Qi Wang	3ec279ba1c	Fix test/unit/pages. As part of the metadata_thp support, we now have a separate switch (JEMALLOC_HAVE_MADVISE_HUGE) for MADV_HUGEPAGE availability. Use that instead of JEMALLOC_THP (which doesn't guard pages_huge anymore) in tests.	2017-08-11 15:57:12 -07:00
Qi Wang	8fdd9a5797	Implement opt.metadata_thp. This option enables transparent huge pages for base allocators (requires MADV_HUGEPAGE support).	2017-08-11 14:51:20 -07:00
Ryan Libby	048c6679cd	Remove external linkage for spin_adaptive. The external linkage for spin_adaptive was not used, and the inline declaration of spin_adaptive that was used caused a problem on FreeBSD where CPU_SPINWAIT is implemented as a call to a static procedure for x86 architectures.	2017-08-08 10:30:21 -07:00
Qi Wang	1ab2ab294c	Only read szind if ptr is not page aligned in sdallocx. If ptr is not page aligned, we know the allocation was not sampled. In this case use the size passed into sdallocx directly w/o accessing rtree. This improves sdallocx efficiency in the common case (not sampled && small allocation).	2017-07-31 15:47:48 -07:00
Qi Wang	3800e55a2c	Bypass extent_alloc_wrapper_hard for no_move_expand. When retain is enabled, we should not attempt mmap for in-place expansion (large_ralloc_no_move), because it's virtually impossible to succeed, and causes unnecessary syscalls (which can cause lock contention under load).	2017-07-31 14:04:17 -07:00
David Goldblatt	e6aeceb606	Logging: log using the log var names directly. Currently we have to log by writing something like: static log_var_t log_a_b_c = LOG_VAR_INIT("a.b.c"); log (log_a_b_c, "msg"); This is sort of annoying. Let's just write: log("a.b.c", "msg");	2017-07-24 14:55:54 -07:00
Qinfan Wu	b28f31e7ed	Split out cold code path in newImpl I noticed that the whole newImpl is inlined. Since OOM handling code is rarely executed, we should only inline the hot path.	2017-07-24 13:37:02 -07:00
David Goldblatt	a9f7732d45	Logging: allow logging with empty varargs. Currently, the log macro requires at least one argument after the format string, because of the way the preprocessor handles varargs macros. We can hide some of that irritation by pushing the extra arguments into a varargs function.	2017-07-22 09:38:19 -07:00
Y. T. Chung	aa6c282137	Validates fd before calling fcntl	2017-07-22 07:46:30 -07:00
David T. Goldblatt	e215a7bc18	Add entry and exit logging to all core functions. I.e. malloc, free, the allocx API, the posix extensions.	2017-07-20 17:58:37 -07:00
David T. Goldblatt	9761b449c8	Add a logging facility. This sets up a hierarchical logging facility, so that we can add logging statements liberally, and turn them on in a fine-grained manner.	2017-07-20 17:58:37 -07:00
Y. T. Chung	0975b88dfd	Fall back to FD_CLOEXEC when O_CLOEXEC is unavailable. Older Linux systems don't have O_CLOEXEC. If that's the case, we fcntl immediately after open, to minimize the length of the racy period in which an operation in another thread can leak a file descriptor to a child.	2017-07-20 14:13:33 -07:00
David Goldblatt	0a4f5a7eea	Fix deadlock in multithreaded fork in OS X. On OS X, we rely on the zone machinery to call our prefork and postfork handlers. In zone_force_unlock, we call jemalloc_postfork_child, reinitializing all our mutexes regardless of state, since the mutex implementation will assert if the tid of the unlocker is different from that of the locker. This has the effect of unlocking the mutexes, but also fails to wake any threads waiting on them in the parent. To fix this, we track whether or not we're the parent or child after the fork, and unlock or reinit as appropriate. This resolves #895.	2017-07-10 18:17:12 -07:00
Qi Wang	cb032781bd	Add extent_grow_mtx in pre_ / post_fork handlers. This fixes an issue that could cause the child process to get stuck after fork.	2017-06-29 17:01:18 -07:00
Qi Wang	aa363f9388	Fix pthread_sigmask() usage to block all signals.	2017-06-26 11:27:21 -07:00
Qi Wang	57beeb2fcb	Switch ctl to explicitly use tsd instead of tsdn.	2017-06-23 13:27:53 -07:00
Qi Wang	425463a446	Check arena in current context in pre_reentrancy.	2017-06-23 13:27:53 -07:00
Qi Wang	d6eb8ac8f3	Set reentrancy when invoking customized extent hooks. Customized extent hooks may malloc / free and thus trigger reentry. Support this behavior by adding reentrancy on hook functions.	2017-06-23 13:27:53 -07:00
Jason Evans	d49ac4c709	Fix assertion typos. Reported by Conrad Meyer.	2017-06-23 11:48:00 -07:00
Qi Wang	a3f4977217	Add thread name for background threads.	2017-06-23 10:54:54 -07:00
Qi Wang	52fc887b49	Avoid inactivity_check within background threads. Passing is_background_thread down the decay path, so that background thread itself won't attempt inactivity_check. This fixes an issue with background thread doing trylock on a mutex it already owns.	2017-06-22 16:53:58 -07:00
Jason Evans	37f3fa0941	Mask signals during background thread creation. This prevents signals from being inadvertently delivered to background threads.	2017-06-20 17:47:38 -07:00
Qi Wang	d35c037e03	Clear tcache_ql after fork in child.	2017-06-19 21:53:07 -07:00
Qi Wang	9b1befabbb	Add minimal initialized TSD. We use the minimal_initialized tsd (which requires no cleanup) for free() specifically, if tsd hasn't been initialized yet. Any other activity will transition the state from minimal to normal. This is to work around the case where a thread has no malloc calls in its lifetime until thread termination, when free() happens after tls destructors.	2017-06-15 17:55:53 -07:00
Qi Wang	ae93fb08e2	Pass tsd to tcache_flush().	2017-06-15 17:55:53 -07:00
Qi Wang	84f6c2cae0	Log decay->nunpurged before purging. During purging, we may unlock decay->mtx. Therefore we should finish logging decay related counters before attempting to purge.	2017-06-14 20:18:02 -07:00
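
Commits 0975b88dfd and aa6c282137 describe falling back to FD_CLOEXEC when O_CLOEXEC is unavailable, and validating the fd before calling fcntl. The following is a minimal sketch of that pattern, not jemalloc's actual code: the helper name open_cloexec is hypothetical, and the read-only open mode is an assumption chosen for illustration.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a jemalloc function): open a file read-only
 * with the close-on-exec flag set.
 */
static int
open_cloexec(const char *path) {
	int fd;
#ifdef O_CLOEXEC
	/* Preferred path: the flag is set atomically by open(2). */
	fd = open(path, O_RDONLY | O_CLOEXEC);
#else
	/*
	 * Fallback path: set FD_CLOEXEC immediately after open(2) to keep
	 * the racy window (during which another thread could fork and leak
	 * the descriptor) as short as possible.
	 */
	fd = open(path, O_RDONLY);
	if (fd != -1) {
		/* Only touch the descriptor flags if open() succeeded. */
		int flags = fcntl(fd, F_GETFD);
		if (flags != -1) {
			fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
		}
	}
#endif
	return fd;
}
```

In either branch the caller sees the same contract: -1 on failure, otherwise a descriptor that is not inherited across exec. The fallback branch cannot fully close the race, which is why the commit message speaks only of minimizing it.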

1100 Commits