server-skynet-source-3rd-jemalloc


Author	SHA1	Message	Date
Jason Evans	1a4ad3c0fa	Refactor out arena_compute_npurge(). Refactor out arena_compute_npurge() by integrating its logic into arena_stash_dirty() as an incremental computation.	2016-02-19 20:32:37 -08:00
Jason Evans	4985dc681e	Refactor arena_ralloc_no_move(). Refactor early return logic in arena_ralloc_no_move() to return early on failure rather than on success.	2016-02-19 20:32:37 -08:00
Jason Evans	578cd16581	Refactor arena_malloc_hard() out of arena_malloc().	2016-02-19 20:32:32 -08:00
Jason Evans	34676d3369	Refactor prng* from cpp macros into inline functions. Remove 32-bit variant, convert prng64() to prng_lg_range(), and add prng_range().	2016-02-19 20:29:06 -08:00
Qi Wang	f4a0f32d34	Fast-path improvement: reduce # of branches and unnecessary operations. - Combine multiple runtime branches into a single malloc_slow check. - Avoid calling arena_choose / size2index / index2size on fast path. - A few micro optimizations.	2015-11-10 14:28:34 -08:00
Joshua Kahn	13b4015531	Allow const keys for lookup. Signed-off-by: Steve Dougherty <sdougherty@barracuda.com>. This resolves #281.	2015-11-09 15:48:05 -08:00
Mike Hommey	f97298bfc1	Remove arena_run_dalloc_decommit(). This resolves #284.	2015-11-09 15:38:30 -08:00
Jason Evans	a784e411f2	Fix a xallocx(..., MALLOCX_ZERO) bug. Fix xallocx(..., MALLOCX_ZERO) to zero the last full trailing page of large allocations that have been randomly assigned an offset of 0 when the --enable-cache-oblivious configure option is enabled. This addresses a special case missed in `d260f442ce` (Fix xallocx(..., MALLOCX_ZERO) bugs.).	2015-09-24 22:21:55 -07:00
Jason Evans	d260f442ce	Fix xallocx(..., MALLOCX_ZERO) bugs. Zero all trailing bytes of large allocations when --enable-cache-oblivious configure option is enabled. This regression was introduced by `8a03cf039c` (Implement cache index randomization for large allocations.). Zero trailing bytes of huge allocations when resizing from/to a size class that is not a multiple of the chunk size.	2015-09-24 16:38:45 -07:00
Jason Evans	e56b24e3a2	Make arena_dalloc_large_locked_impl() static.	2015-09-20 09:58:10 -07:00
Jason Evans	9a505b768c	Centralize xallocx() size[+extra] overflow checks.	2015-09-15 14:39:58 -07:00
Jason Evans	676df88e48	Rename arena_maxclass to large_maxclass. arena_maxclass is no longer an appropriate name, because arenas also manage huge allocations.	2015-09-11 20:50:20 -07:00
Jason Evans	560a4e1e01	Fix xallocx() bugs. Fix xallocx() bugs related to the 'extra' parameter when specified as non-zero.	2015-09-11 20:40:34 -07:00
Dmitry-Me	a306a60651	Reduce variable scope	2015-09-04 10:42:33 -07:00
Jason Evans	d01fd19755	Rename index_t to szind_t to avoid an existing type on Solaris. This resolves #256.	2015-08-19 15:21:32 -07:00
Jason Evans	5ef33a9f2b	Don't bitshift by negative amounts. Don't bitshift by negative amounts when encoding/decoding run sizes in chunk header maps. This affected systems with page sizes greater than 8 KiB. Reported by Ingvar Hagelund <ingvar@redpill-linpro.com>.	2015-08-19 14:16:30 -07:00
Jason Evans	1f27abc1b1	Refactor arena_mapbits_{small,large}_set() to not preserve unzeroed. Fix arena_run_split_large_helper() to treat newly committed memory as zeroed.	2015-08-11 16:45:47 -07:00
Jason Evans	45186f0c07	Refactor arena_mapbits unzeroed flag management. Only set the unzeroed flag when initializing the entire mapbits entry, rather than mutating just the unzeroed bit. This simplifies the possible mapbits state transitions.	2015-08-10 23:03:34 -07:00
Jason Evans	de249c8679	Arena chunk decommit cleanups and fixes. Decommit arena chunk header during chunk deallocation if the rest of the chunk is decommitted.	2015-08-10 17:13:59 -07:00
Jason Evans	8fadb1a8c2	Implement chunk hook support for page run commit/decommit. Cascade from decommit to purge when purging unused dirty pages, so that it is possible to decommit cleaned memory rather than just purging. For non-Windows debug builds, decommit runs rather than purging them, since this causes access of deallocated runs to segfault. This resolves #251.	2015-08-07 00:50:58 -07:00
Jason Evans	5716d97f75	Fix an in-place growing large reallocation regression. Fix arena_ralloc_large_grow() to properly account for large_pad, so that in-place large reallocation succeeds when possible, rather than always failing. This regression was introduced by `8a03cf039c` (Implement cache index randomization for large allocations.)	2015-08-06 23:45:45 -07:00
Jason Evans	b49a334a64	Generalize chunk management hooks. Add the "arena.<i>.chunk_hooks" mallctl, which replaces and expands on the "arena.<i>.chunk.{alloc,dalloc,purge}" mallctls. The chunk hooks allow control over chunk allocation/deallocation, decommit/commit, purging, and splitting/merging, such that the application can rely on jemalloc's internal chunk caching and retaining functionality, yet implement a variety of chunk management mechanisms and policies. Merge the chunks_[sz]ad_{mmap,dss} red-black trees into chunks_[sz]ad_retained. This slightly reduces how hard jemalloc tries to honor the dss precedence setting; prior to this change the precedence setting was also consulted when recycling chunks. Fix chunk purging. Don't purge chunks in arena_purge_stashed(); instead deallocate them in arena_unstash_purged(), so that the dirty memory linkage remains valid until after the last time it is used. This resolves #176 and #201.	2015-08-03 21:49:02 -07:00
Jason Evans	50883deb6e	Change arena_palloc_large() parameter from size to usize. This change merely documents that arena_palloc_large() always receives usize as its argument.	2015-07-23 17:13:18 -07:00
Jason Evans	5fae7dc1b3	Fix MinGW-related portability issues. Create and use FMT* macros that are equivalent to the PRI* macros that inttypes.h defines. This allows uniform use of the Unix-specific format specifiers, e.g. "%zu", as well as avoiding Windows-specific definitions of e.g. PRIu64. Add ffs()/ffsl() support for compiling with gcc. Extract compatibility definitions of ENOENT, EINVAL, EAGAIN, EPERM, ENOMEM, and ENORANGE into include/msvc_compat/windows_extra.h and use the file for tests as well as for core jemalloc code.	2015-07-23 13:56:25 -07:00
Jason Evans	aa2826621e	Revert to first-best-fit run/chunk allocation. This effectively reverts `97c04a9383` (Use first-fit rather than first-best-fit run/chunk allocation.). In some pathological cases, first-fit search dominates allocation time, and it also tends not to converge as readily on a steady state of memory layout, since precise allocation order has a bigger effect than for first-best-fit.	2015-07-15 17:15:19 -07:00
Jason Evans	0313607e66	Fix MinGW build warnings. Conditionally define ENOENT, EINVAL, etc. (was unconditional). Add/use PRIzu, PRIzd, and PRIzx for use in malloc_printf() calls. gcc issued (harmless) warnings since e.g. "%zu" should be "%Iu" on Windows, and the alternative to this workaround would have been to disable the function attributes which cause gcc to look for type mismatches in formatted printing function calls.	2015-07-07 20:10:28 -07:00
Jason Evans	bce61d61bb	Move a variable declaration closer to its use.	2015-07-07 09:32:05 -07:00
Jason Evans	0a9f9a4d51	Convert arena_maybe_purge() recursion to iteration. This resolves #235.	2015-06-22 18:50:58 -07:00
Jason Evans	5154175cf1	Fix performance regression in arena_palloc(). Pass large allocation requests to arena_malloc() when possible. This regression was introduced by `155bfa7da1` (Normalize size classes.).	2015-05-19 17:42:31 -07:00
Jason Evans	8a03cf039c	Implement cache index randomization for large allocations. Extract szad size quantization into {extent,run}_quantize(), and quantize szad run sizes to the union of valid small region run sizes and large run sizes. Refactor iteration in arena_run_first_fit() to use run_quantize{,_first,_next}(), and add support for padded large runs. For large allocations that have no specified alignment constraints, compute a pseudo-random offset from the beginning of the first backing page that is a multiple of the cache line size. Under typical configurations with 4-KiB pages and 64-byte cache lines this results in a uniform distribution among 64 page boundary offsets. Add the --disable-cache-oblivious option, primarily intended for performance testing. This resolves #13.	2015-05-06 13:27:39 -07:00
Jason Evans	65db63cf3f	Fix in-place shrinking huge reallocation purging bugs. Fix the shrinking case of huge_ralloc_no_move_similar() to purge the correct number of pages, at the correct offset. This regression was introduced by `8d6a3e8321` (Implement dynamic per arena control over dirty page purging.). Fix huge_ralloc_no_move_shrink() to purge the correct number of pages. This bug was introduced by `9673983443` (Purge/zero sub-chunk huge allocations as necessary.).	2015-03-25 19:10:06 -07:00
Jason Evans	562d266511	Add the "stats.arenas.<i>.lg_dirty_mult" mallctl.	2015-03-24 16:41:38 -07:00
Jason Evans	bd16ea49c3	Fix signed/unsigned comparison in arena_lg_dirty_mult_valid().	2015-03-24 15:59:28 -07:00
Jason Evans	8d6a3e8321	Implement dynamic per arena control over dirty page purging. Add mallctls: - arenas.lg_dirty_mult is initialized via opt.lg_dirty_mult, and can be modified to change the initial lg_dirty_mult setting for newly created arenas. - arena.<i>.lg_dirty_mult controls an individual arena's dirty page purging threshold, and synchronously triggers any purging that may be necessary to maintain the constraint. - arena.<i>.chunk.purge allows the per arena dirty page purging function to be replaced. This resolves #93.	2015-03-18 18:55:33 -07:00
Jason Evans	bc45d41d23	Fix a declaration-after-statement regression.	2015-03-11 16:50:40 -07:00
Jason Evans	f5c8f37259	Normalize rdelm/rd structure field naming.	2015-03-10 18:29:49 -07:00
Jason Evans	38e42d311c	Refactor dirty run linkage to reduce sizeof(extent_node_t).	2015-03-10 18:15:40 -07:00
Jason Evans	97c04a9383	Use first-fit rather than first-best-fit run/chunk allocation. This tends to more effectively pack active memory toward low addresses. However, additional tree searches are required in many cases, so whether this change stands the test of time will depend on real-world benchmarks.	2015-03-06 20:21:41 -08:00
Jason Evans	5707d6f952	Quantize szad trees by size class. Treat sizes that round down to the same size class as size-equivalent in trees that are used to search for first best fit, so that there are only as many "firsts" as there are size classes. This comes closer to the ideal of first fit.	2015-03-06 20:21:41 -08:00
Jason Evans	99bd94fb65	Fix chunk cache races. These regressions were introduced by `ee41ad409a` (Integrate whole chunks into unused dirty page purging machinery.).	2015-02-18 16:40:53 -08:00
Jason Evans	738e089a2e	Rename "dirty chunks" to "cached chunks". Rename "dirty chunks" to "cached chunks", in order to avoid overloading the term "dirty". Fix the regression caused by `339c2b23b2` (Fix chunk_unmap() to propagate dirty state.), and actually address what that change attempted, which is to only purge chunks once, and propagate whether zeroed pages resulted into chunk_record().	2015-02-18 01:15:50 -08:00
Jason Evans	339c2b23b2	Fix chunk_unmap() to propagate dirty state. Fix chunk_unmap() to propagate whether a chunk is dirty, and modify dirty chunk purging to record this information so it can be passed to chunk_unmap(). Since the broken version of chunk_unmap() claimed that all chunks were clean, this resulted in potential memory corruption for purging implementations that do not zero (e.g. MADV_FREE). This regression was introduced by `ee41ad409a` (Integrate whole chunks into unused dirty page purging machinery.).	2015-02-17 22:25:56 -08:00
Jason Evans	47701b22ee	arena_chunk_dirty_node_init() --> extent_node_dirty_linkage_init()	2015-02-17 22:23:10 -08:00
Jason Evans	a4e1888d1a	Simplify extent_node_t and add extent_node_init().	2015-02-17 15:13:52 -08:00
Jason Evans	ee41ad409a	Integrate whole chunks into unused dirty page purging machinery. Extend per arena unused dirty page purging to manage unused dirty chunks in addition to unused dirty runs. Rather than immediately unmapping deallocated chunks (or purging them in the --disable-munmap case), store them in a separate set of trees, chunks_[sz]ad_dirty. Preferentially allocate dirty chunks. When excessive unused dirty pages accumulate, purge runs and chunks in integrated LRU order (and unmap chunks in the --enable-munmap case). Refactor extent_node_t to provide accessor functions.	2015-02-16 21:02:17 -08:00
Jason Evans	2195ba4e1f	Normalize _link and link_ fields to all be *_link.	2015-02-15 16:43:52 -08:00
Jason Evans	88fef7ceda	Refactor huge_() calls into arena internals. Make redirects to the huge_() API the arena code's responsibility, since arenas now take responsibility for all allocation sizes.	2015-02-12 14:06:37 -08:00
Jason Evans	cbf3a6d703	Move centralized chunk management into arenas. Migrate all centralized data structures related to huge allocations and recyclable chunks into arena_t, so that each arena can manage huge allocations and recyclable virtual memory completely independently of other arenas. Add chunk node caching to arenas, in order to avoid contention on the base allocator. Use chunks_rtree to look up huge allocations rather than a red-black tree. Maintain a per arena unsorted list of huge allocations (which will be needed to enumerate huge allocations during arena reset). Remove the --enable-ivsalloc option, make ivsalloc() always available, and use it for size queries if --enable-debug is enabled. The only practical implications to this removal are that 1) ivsalloc() is now always available during live debugging (and the underlying radix tree is available during core-based debugging), and 2) size query validation can no longer be enabled independent of --enable-debug. Remove the stats.chunks.{current,total,high} mallctls, and replace their underlying statistics with simpler atomically updated counters used exclusively for gdump triggering. These statistics are no longer very useful because each arena manages chunks independently, and per arena statistics provide similar information. Simplify chunk synchronization code, now that base chunk allocation cannot cause recursive lock acquisition.	2015-02-12 00:15:56 -08:00
Jason Evans	1cb181ed63	Implement explicit tcache support. Add the MALLOCX_TCACHE() and MALLOCX_TCACHE_NONE macros, which can be used in conjunction with the *allocx() API. Add the tcache.create, tcache.flush, and tcache.destroy mallctls. This resolves #145.	2015-02-09 17:44:48 -08:00
Mike Hommey	6505733012	Make opt.lg_dirty_mult work as documented. The documentation for opt.lg_dirty_mult says: "Per-arena minimum ratio (log base 2) of active to dirty pages. Some dirty unused pages may be allowed to accumulate, within the limit set by the ratio (or one chunk worth of dirty pages, whichever is greater) (...)". The restriction in parentheses currently doesn't happen. This makes jemalloc aggressively madvise(), which in turn increases the number of page faults significantly. For instance, this resulted in several(!) hundred(!) milliseconds startup regression on Firefox for Android. This may require further tweaking, but starting with actually doing what the documentation says is a good start.	2015-02-04 07:16:55 +09:00


145 Commits