server-skynet-source-3rd-jemalloc

project-base/server-skynet-source-3rd-jemalloc

Author	SHA1	Message	Date
David Goldblatt	bf2dc7e678	Header refactoring: ticker module - remove from the catchall and unify.	2017-04-24 10:33:21 -07:00
David Goldblatt	fa3ad730c4	Header refactoring: prng module - remove from the catchall and unify.	2017-04-24 10:33:21 -07:00
David Goldblatt	4d2e4bf5eb	Get rid of most of the various inline macros.	2017-04-24 10:33:21 -07:00
David Goldblatt	425253e2cd	Enable -Wundef, when supported. This can catch bugs in which one header defines a numeric constant, and another uses it without including the defining header. Undefined preprocessor symbols expand to '0', so that this will compile fine, silently doing the math wrong.	2017-04-21 17:03:56 -07:00
Jason Evans	3823effe12	Remove --enable-ivsalloc. Continue to use ivsalloc() when --enable-debug is specified (and add assertions to guard against 0 size), but stop providing a documented explicit semantics-changing band-aid to dodge undefined behavior in sallocx() and malloc_usable_size(). ivsalloc() remains compiled in, unlike when #211 restored --enable-ivsalloc, and if JEMALLOC_FORCE_IVSALLOC is defined during compilation, sallocx() and malloc_usable_size() will still use ivsalloc(). This partially resolves #580.	2017-04-21 14:34:35 -07:00
Jason Evans	b2a8453a3f	Remove --disable-tls. This option is no longer useful, because TLS is correctly configured automatically on all supported platforms. This partially resolves #580.	2017-04-21 11:12:29 -07:00
Jim Chen	ae248a2160	Use openat syscall if available. Some architectures like AArch64 may not have the open syscall because it was superseded by the openat syscall, so check and use SYS_openat if SYS_open is not available. Additionally, Android headers for AArch64 define SYS_open to __NR_open, even though __NR_open is undefined. Undefine SYS_open in that case so SYS_openat is used.	2017-04-21 10:58:42 -07:00
Jason Evans	4403c9ab44	Remove --disable-tcache. Simplify configuration by removing the --disable-tcache option, but replace the testing for that configuration with --with-malloc-conf=tcache:false. Fix the thread.arena and thread.tcache.flush mallctls to work correctly if tcache is disabled. This partially resolves #580.	2017-04-21 10:06:12 -07:00
Qi Wang	5aa46f027d	Bypass extent tracking for auto arenas. Tracking extents is required by arena_reset. To support this, the extent linkage was used for tracking 1) large allocations, and 2) full slabs. However modifying the extent linkage could be an expensive operation as it likely incurs cache misses. Since we forbid arena_reset on auto arenas, let's bypass the linkage operations for auto arenas.	2017-04-21 00:29:18 -07:00
Jason Evans	fed9a880c8	Trim before commit in extent_recycle(). This avoids creating clean committed pages as a side effect of aligned allocation. For configurations that decommit memory, purged pages are decommitted, and decommitted extents cannot be coalesced with committed extents. Unless the clean committed pages happen to be selected during allocation, they cause unnecessary permanent extent fragmentation. This resolves #766.	2017-04-19 21:05:12 -07:00
Qi Wang	acf4c8ae33	Output 4 counters for bin mutexes instead of just 2.	2017-04-19 14:53:32 -07:00
Jason Evans	da4cff0279	Support --with-lg-page values larger than system page size. All mappings continue to be PAGE-aligned, even if the system page size is smaller. This change is primarily intended to provide a mechanism for supporting multiple page sizes with the same binary; smaller page sizes work better in conjunction with jemalloc's design. This resolves #467.	2017-04-18 19:01:04 -07:00
Jason Evans	45f087eb03	Revert "Remove BITMAP_USE_TREE." Some systems use a native 64 KiB page size, which means that the bitmap for the smallest size class can be 8192 bits, not just 512 bits as when the page size is 4 KiB. Linear search in bitmap_{sfu,ffu}() is unacceptably slow for such large bitmaps. This reverts commit 7c00f04ff40a34627e31488d02ff1081c749c7ba.	2017-04-18 19:01:04 -07:00
David Goldblatt	38e847c1c5	Header refactoring: unify spin.h and move it out of the catch-all.	2017-04-18 18:35:03 -07:00
David Goldblatt	418d96a86c	Header refactoring: unify nstime.h and move it out of the catch-all	2017-04-18 18:35:03 -07:00
David Goldblatt	7ebc83894f	Header refactoring: move jemalloc_internal_types.h out of the catch-all	2017-04-18 18:35:03 -07:00
David Goldblatt	d9ec36e22d	Header refactoring: move assert.h out of the catch-all	2017-04-18 18:35:03 -07:00
David Goldblatt	f692e6c214	Header refactoring: move util.h out of the catchall	2017-04-18 18:35:03 -07:00
David Goldblatt	54373be084	Header refactoring: move malloc_io.h out of the catchall	2017-04-18 18:35:03 -07:00
David Goldblatt	22366518b7	Move CPP_PROLOGUE and CPP_EPILOGUE to the .cpp. This lets us avoid having to specify them in every C file.	2017-04-18 18:35:03 -07:00
Qi Wang	855c127348	Remove the function alignment of prof_backtrace. This was an attempt to avoid triggering the slow path in libunwind, but it turned out to be ineffective.	2017-04-17 16:19:32 -07:00
Jason Evans	881fbf762f	Prefer old/low extent_t structures during reuse. Rather than using a LIFO queue to track available extent_t structures, use a red-black tree, and always choose the oldest/lowest available during reuse.	2017-04-17 14:47:45 -07:00
Jason Evans	76b35f4b2f	Track extent structure serial number (esn) in extent_t. This enables stable sorting of extent_t structures.	2017-04-17 14:47:45 -07:00
Jason Evans	69aa552809	Allocate increasingly large base blocks. Limit the total number of base blocks by leveraging the exponential size class sequence, similarly to extent_grow_retained().	2017-04-17 14:47:45 -07:00
Jason Evans	675701660c	Update base_unmap() to match extent_dalloc_wrapper(). Reverse the order of forced versus lazy purging attempts in base_unmap(), in order to match the order in extent_dalloc_wrapper(), which was reversed by 64e458f5cdd64f9b67cb495f177ef96bf3ce4e0e (Implement two-phase decay-based purging.).	2017-04-17 14:47:45 -07:00
Qi Wang	3c9c41edb2	Improve rtree cache with a two-level cache design. Two levels of rcache are implemented: a direct-mapped cache as L1, combined with an LRU cache as L2. The L1 cache offers low cost on a cache hit, but can suffer collisions under some circumstances. This is complemented by the L2 LRU cache, which is slower on cache access (overhead from linear search + reordering), but resolves L1 collisions rather well.	2017-04-17 12:05:23 -07:00
Qi Wang	c2fcf9c2cf	Switch to fine-grained reentrancy support. Previously we had a general detection and support of reentrancy, at the cost of having branches and inc / dec operations on fast paths. To avoid taxing fast paths, we move the reentrancy operations onto tsd slow state, and only modify reentrancy level around external calls (that might trigger reentrancy).	2017-04-14 19:48:06 -07:00
Qi Wang	b348ba29bb	Bundle 3 branches on fast path into tsd_state. Added tsd_state_nominal_slow, which on fast path malloc() incorporates tcache_enabled check, and on fast path free() bundles both malloc_slow and tcache_enabled branches.	2017-04-14 16:58:08 -07:00
Qi Wang	ccfe68a916	Pass alloc_ctx down profiling path. With this change, when profiling is enabled, we avoid doing redundant rtree lookups. Also changed dalloc_ctx_t to alloc_ctx_t, as it's now used on allocation path as well (to speed up profiling).	2017-04-12 13:55:39 -07:00
Qi Wang	f35213bae4	Pass dalloc_ctx down the sdalloc path. This avoids redundant rtree lookups.	2017-04-12 13:55:39 -07:00
David Goldblatt	e709fae1d7	Header refactoring: move atomic.h out of the catch-all	2017-04-11 11:52:30 -07:00
David Goldblatt	743d940dc3	Header refactoring: Split up jemalloc_internal.h. This is a biggy. jemalloc_internal.h has been doing multiple jobs for a while now: the source of system-wide definitions; the catch-all include file; and the module header file for jemalloc.c. This commit splits up this functionality. The system-wide definitions responsibility has moved to jemalloc_preamble.h. The catch-all include file is now jemalloc_internal_includes.h. The module headers for jemalloc.c are now in jemalloc_internal_[externs|inlines|types].h, just as they are for the other modules.	2017-04-11 11:52:30 -07:00
David Goldblatt	2f00ce4da7	Header refactoring: break out ph.h dependencies	2017-04-11 11:52:30 -07:00
Qi Wang	bfa530b75b	Pass dealloc_ctx down free() fast path. This gets rid of the redundant rtree lookup down fast path.	2017-04-11 09:58:12 -07:00
Qi Wang	04ef218d87	Move reentrancy_level to the beginning of TSD.	2017-04-07 16:25:43 -07:00
David Goldblatt	b407a65401	Add basic reentrancy-checking support, and allow arena_new to reenter. This checks whether or not we're reentrant using thread-local data, and, if we are, moves certain internal allocations to use arena 0 (which should be properly initialized after bootstrapping). The immediate thing this allows is spinning up threads in arena_new, which will enable spinning up background threads there.	2017-04-07 14:10:27 -07:00
David Goldblatt	0a0fcd3e6a	Add hooking functionality. This allows us to hook chosen functions and do interesting things there (in particular: reentrancy checking).	2017-04-07 14:10:27 -07:00
Qi Wang	36bd90b962	Optimizing TSD and thread cache layout. 1) Re-organize TSD so that frequently accessed fields are closer to the beginning and more compact. Assuming 64-bit, the first 2.5 cachelines now contain everything needed on tcache fast path, except the tcache struct itself. 2) Re-organize tcache and tbins. Take lg_fill_div out of tbin, and reduce tbin to 24 bytes (down from 32). Split tbins into tbins_small and tbins_large, and place tbins_small close to the beginning.	2017-04-07 14:06:17 -07:00
Qi Wang	4dec507546	Bypass witness_fork in TSD when !config_debug. With the tcache change, we plan to leave some blank space when !config_debug (unused tbins, witnesses) at the end of the tsd. Let's not touch the memory.	2017-04-07 14:06:17 -07:00
Qi Wang	0fba57e579	Get rid of tcache_enabled_t as we have runtime init support.	2017-04-07 10:42:29 -07:00
Qi Wang	fde3e20cc0	Integrate auto tcache into TSD. The embedded tcache is initialized upon tsd initialization. The avail arrays for the tbins will be allocated / deallocated accordingly during init / cleanup. With this change, the pointer to the auto tcache will always be available, as long as we have access to the TSD. tcache_available() (called in tcache_get()) is provided to check if we should use tcache.	2017-04-07 09:55:14 -07:00
David Goldblatt	074f2256ca	Make prof's cum_gctx a C11-style atomic	2017-04-05 16:25:37 -07:00
David Goldblatt	5dcc13b342	Make the mutex n_waiting_thds field a C11-style atomic	2017-04-05 16:25:37 -07:00
David Goldblatt	492a941f49	Convert extent module to use C11-style atomics	2017-04-05 16:25:37 -07:00
David Goldblatt	30d74db08e	Convert accumbytes in prof_accum_t to C11 atomics, when possible	2017-04-05 16:25:37 -07:00
David Goldblatt	55d992c48c	Make extent_dss use C11-style atomics	2017-04-05 16:25:37 -07:00
David Goldblatt	92aafb0efe	Make base_t's extent_hooks field C11-atomic	2017-04-05 16:25:37 -07:00
David Goldblatt	56b72c7b17	Transition arena struct fields to C11 atomics	2017-04-05 16:25:37 -07:00
David Goldblatt	bc32ec3503	Move arena-tracking atomics in jemalloc.c to C11-style	2017-04-05 16:25:37 -07:00
David Goldblatt	7da04a6b09	Convert prng module to use C11-style atomics	2017-04-04 16:45:52 -07:00

933 Commits
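
The -Wundef commit (425253e2cd) in the log above guards against a failure mode that is easy to reproduce. The following is a minimal sketch, not code from jemalloc: `HYPOTHETICAL_LG_PAGE` and `hypothetical_uses_large_pages()` are made-up names, and the macro is deliberately left undefined to stand in for a numeric constant whose defining header was never included.

```c
#include <assert.h>

/*
 * HYPOTHETICAL_LG_PAGE is an illustrative name (an assumption, not a
 * jemalloc macro). It is intentionally never defined in this file, as if
 * the header that defines it had not been included.
 */
#if HYPOTHETICAL_LG_PAGE >= 16
/* Only reached if the macro is defined and at least 16. */
#define HYPOTHETICAL_LARGE_PAGES 1
#else
/* An undefined identifier in #if evaluates to 0, so this branch is taken. */
#define HYPOTHETICAL_LARGE_PAGES 0
#endif

/* Returns the value the preprocessor silently selected. */
int hypothetical_uses_large_pages(void) {
    return HYPOTHETICAL_LARGE_PAGES;
}
```

Without `-Wundef` this compiles without complaint and `hypothetical_uses_large_pages()` returns 0, regardless of what the missing header would have defined. With `-Wundef`, GCC and Clang warn at the `#if` line, which is the class of bug the commit message describes.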