server-skynet-source-3rd-jemalloc

project-base/server-skynet-source-3rd-jemalloc

Author	SHA1	Message	Date
Jason Evans	650c070e10	Remove rtree support for 0 (NULL) keys. NULL can never actually be inserted in practice, and removing support allows a branch to be removed from the fast path.	2017-02-08 18:50:03 -08:00
Jason Evans	f5cf9b19c8	Determine rtree levels at compile time. Rather than dynamically building a table to aid per level computations, define a constant table at compile time. Omit both high and low insignificant bits. Use one to three tree levels, depending on the number of significant bits.	2017-02-08 18:50:03 -08:00
Jason Evans	ff4db5014e	Remove rtree leading 0 bit optimization. A subsequent change instead ignores insignificant high bits.	2017-02-08 18:50:03 -08:00
Jason Evans	cdc240d501	Make non-essential inline rtree functions static functions.	2017-02-08 18:50:03 -08:00
Jason Evans	c511a44e99	Split rtree_elm_lookup_hard() out of rtree_elm_lookup(). Anything but a hit in the first element of the lookup cache is expensive enough to negate the benefits of inlining.	2017-02-08 18:50:03 -08:00
Jason Evans	c0cc5db871	Replace tabs following #define with spaces. This resolves #564.	2017-01-20 21:45:53 -08:00
Jason Evans	f408643a4c	Remove extraneous parens around return arguments. This resolves #540.	2017-01-20 21:43:07 -08:00
Jason Evans	c4c2592c83	Update brace style. Add braces around single-line blocks, and remove line breaks before function-opening braces. This resolves #537.	2017-01-20 21:43:07 -08:00
Jason Evans	ffbb7dac3d	Remove leading blank lines from function bodies. This resolves #535.	2017-01-13 14:49:24 -08:00
Jason Evans	a0dd3a4483	Implement per arena base allocators. Add/rename related mallctls: - Add stats.arenas.<i>.base . - Rename stats.arenas.<i>.metadata to stats.arenas.<i>.internal . - Add stats.arenas.<i>.resident . Modify the arenas.extend mallctl to take an optional (extent_hooks_t *) argument so that it is possible for all base allocations to be serviced by the specified extent hooks. This resolves #463.	2016-12-26 18:08:28 -08:00
Dave Watson	25f7bbcf28	Fix long spinning in rtree_node_init rtree_node_init spinlocks the node, allocates, and then sets the node. This is under heavy contention at the top of the tree if many threads start to allocate at the same time. Instead, take a per-rtree sleeping mutex to reduce spinning. Tested both pthreads and osx OSSpinLock, and both reduce spinning adequately Previous benchmark time: ./ttest1 500 100 ~15s New benchmark time: ./ttest1 500 100 .57s	2016-11-02 20:30:53 -07:00
Jason Evans	68e14c9884	Fix over-sized allocation of rtree leaf nodes. Use the correct level metadata when allocating child nodes so that leaf nodes don't end up over-sized (2^16 elements vs 2^4 elements).	2016-10-28 00:16:55 -07:00
Jason Evans	e5effef428	Add/use adaptive spinning. Add spin_t and spin_{init,adaptive}(), which provide a simple abstraction for adaptive spinning. Adaptively spin during busy waits in bootstrapping and rtree node initialization.	2016-10-13 14:55:39 -07:00
Jason Evans	6f29a83924	Add rtree lookup path caching. rtree-based extent lookups remain more expensive than chunk-based run lookups, but with this optimization the fast path slowdown is ~3 CPU cycles per metadata lookup (on Intel Core i7-4980HQ), versus ~11 cycles prior. The path caching speedup tends to degrade gracefully unless allocated memory is spread far apart (as is the case when using a mixture of sbrk() and mmap()).	2016-06-05 20:59:57 -07:00
Jason Evans	7be2ebc23f	Make tsd cleanup functions optional, remove noop cleanup functions.	2016-06-05 20:42:24 -07:00
Jason Evans	e75e9be130	Add rtree element witnesses.	2016-06-03 12:27:41 -07:00
Jason Evans	8c9be3e837	Refactor rtree to always use base_alloc() for node allocation.	2016-06-03 12:27:41 -07:00
Jason Evans	2d2b4e98c9	Add element acquire/release capabilities to rtree. This makes it possible to acquire short-term "ownership" of rtree elements so that it is possible to read an extent pointer and read the extent's contents with a guarantee that the element will not be modified until the ownership is released. This is intended as a mechanism for resolving rtree read/write races rather than as a way to lock extents.	2016-06-03 12:27:33 -07:00
Jason Evans	6c460ad91b	Optimize rtree_get(). Specialize fast path to avoid code that cannot execute for dependent loads. Manually unroll.	2016-03-22 17:54:35 -07:00
Jason Evans	fbd8d773ad	Fix unsigned comparison underflow. These bugs only affected tests and debug builds.	2015-03-11 23:14:50 -07:00
Jason Evans	8d0e04d42f	Refactor rtree to be lock-free. Recent huge allocation refactoring associates huge allocations with arenas, but it remains necessary to quickly look up huge allocation metadata during reallocation/deallocation. A global radix tree remains a good solution to this problem, but locking would have become the primary bottleneck after (upcoming) migration of chunk management from global to per arena data structures. This lock-free implementation uses double-checked reads to traverse the tree, so that in the steady state, each read or write requires only a single atomic operation. This implementation also assures that no more than two tree levels actually exist, through a combination of careful virtual memory allocation which makes large sparse nodes cheap, and skipping the root node on x64 (possible because the top 16 bits are all 0 in practice).	2015-02-04 16:51:53 -08:00
Jason Evans	5460aa6f66	Convert all tsd variables to reside in a single tsd structure.	2014-09-23 02:36:08 -07:00
Richard Diamond	9c3a10fdf6	Try to use __builtin_ffsl if ffsl is unavailable. Some platforms (like those using Newlib) don't have ffs/ffsl. This commit adds a check to configure.ac for __builtin_ffsl if ffsl isn't found. __builtin_ffsl performs the same function as ffsl, and has the added benefit of being available on any platform utilizing Gcc-compatible compiler. This change does not address the used of ffs in the MALLOCX_ARENA() macro.	2014-06-02 07:44:50 -07:00
Jason Evans	b954bc5d3a	Convert rtree from (void *) to (uint8_t) storage. Reduce rtree memory usage by storing booleans (1 byte each) rather than pointers. The rtree code is only used to record whether jemalloc manages a chunk of memory, so there's no need to store pointers in the rtree. Increase rtree node size to 64 KiB in order to reduce tree depth from 13 to 3 on 64-bit systems. The conversion to more compact leaf nodes was enough by itself to make the rtree depth 1 on 32-bit systems; due to the fact that root nodes are smaller than the specified node size if possible, the node size change has no impact on 32-bit systems (assuming default chunk size).	2014-01-02 17:36:38 -08:00
Jason Evans	b980cc774a	Add rtree unit tests.	2014-01-02 16:17:15 -08:00
Jason Evans	20f1fc95ad	Fix fork(2)-related deadlocks. Add a library constructor for jemalloc that initializes the allocator. This fixes a race that could occur if threads were created by the main thread prior to any memory allocation, followed by fork(2), and then memory allocation in the child process. Fix the prefork/postfork functions to acquire/release the ctl, prof, and rtree mutexes. This fixes various fork() child process deadlocks, but one possible deadlock remains (intentionally) unaddressed: prof backtracing can acquire runtime library mutexes, so deadlock is still possible if heap profiling is enabled during fork(). This deadlock is known to be a real issue in at least the case of libgcc-based backtracing. Reported by tfengjun.	2012-10-09 15:21:46 -07:00
Jason Evans	7427525c28	Move repo contents in jemalloc/ to top level.	2011-03-31 20:36:17 -07:00

27 Commits