server-skynet-source-3rd-jemalloc

Author	SHA1	Message	Date
David Goldblatt	81c6027592	Arena stats: Give it its own "mapped". This distinguishes it from the PA mapped stat, which is now named "pa_mapped" to avoid confusion. The (derived) arena stat includes base memory, and the PA stat is no longer partially derived.	2020-04-10 13:12:47 -07:00
David Goldblatt	506d907e40	PA: Move in basic stats merging.	2020-04-10 13:12:47 -07:00
David Goldblatt	f29f6090f5	PA: Add pa_extra.c and put PA forking there.	2020-04-10 13:12:47 -07:00
David Goldblatt	565045ef71	Arena: Make more derived stats non-atomic/locked.	2020-04-10 13:12:47 -07:00
David Goldblatt	d0c43217b5	Arena stats: Move retained to PA, use plain ints. Retained is a property of the allocated pages. The derived fields no longer require any locking; they're computed on demand.	2020-04-10 13:12:47 -07:00
David Goldblatt	e2cf3fb1a3	PA: Move in all modifications of mapped.	2020-04-10 13:12:47 -07:00
David Goldblatt	436789ad96	PA: Make mapped stat atomic. We always have atomic_zu_t, and mapped/unmapped transitions are always expensive enough that trying to piggyback on a lock is a waste of time.	2020-04-10 13:12:47 -07:00
David Goldblatt	3c28aa6f17	PA: Move edata_avail stat in, make it non-atomic.	2020-04-10 13:12:47 -07:00
David Goldblatt	f6bfa3dcca	Move extent stats to the PA module. While we're at it, make them non-atomic -- they are purely derived statistics (and in fact aren't even in the arena_t or pa_shard_t).	2020-04-10 13:12:47 -07:00
David Goldblatt	527dd4cdb8	PA: Move in nactive counter.	2020-04-10 13:12:47 -07:00
David Goldblatt	c075fd0bcb	PA: Minor cleanups and comment fixes.	2020-04-10 13:12:47 -07:00
David Goldblatt	46a9d7fc0b	PA: Move in rest of purging.	2020-04-10 13:12:47 -07:00
David Goldblatt	2d6eec7b5c	PA: Move in decay-all pathway.	2020-04-10 13:12:47 -07:00
David Goldblatt	65698b7f2e	PA: Remove public visibility of some internals.	2020-04-10 13:12:47 -07:00
David Goldblatt	f012c43be0	PA: Move in decay_to_limit	2020-04-10 13:12:47 -07:00
David Goldblatt	3034f4a508	PA: Move in decay_stashed.	2020-04-10 13:12:47 -07:00
David Goldblatt	aef28b2f8f	PA: Move in stash_decayed.	2020-04-10 13:12:47 -07:00
David Goldblatt	71fc0dc968	PA: Move in remaining page allocation functions.	2020-04-10 13:12:47 -07:00
David Goldblatt	74958567a4	PA: have expand take sizes instead of new usize. This avoids involving usize, which makes some of the stats modifications more intuitively correct.	2020-04-10 13:12:47 -07:00
David Goldblatt	5bcc2c2ab9	PA: Have expand take szind and slab. This isn't really necessary, but having a uniform API will help us later.	2020-04-10 13:12:47 -07:00
David Goldblatt	0880c2ab97	PA: Have large expands use it.	2020-04-10 13:12:47 -07:00
David Goldblatt	9f93625c14	PA: Move in arena large allocation functionality.	2020-04-10 13:12:47 -07:00
David Goldblatt	7624043a41	PA: Add ehook-getting support.	2020-04-10 13:12:47 -07:00
David Goldblatt	eba35e2e48	Remove extent knowledge of arena.	2020-04-10 13:12:47 -07:00
David Goldblatt	e77f47a85a	Move arena decay getters to PA.	2020-04-10 13:12:47 -07:00
David Goldblatt	f77cec311e	Decay: Take current time as an argument. This better facilitates testing.	2020-04-10 13:12:47 -07:00
David Goldblatt	d1d7e1076b	Decay: move in some background_thread accesses.	2020-04-10 13:12:47 -07:00
David Goldblatt	cdb916ed3f	Decay: Add comments for the public API.	2020-04-10 13:12:47 -07:00
David Goldblatt	8f2193dc8d	Decay: Move in arena decay functions.	2020-04-10 13:12:47 -07:00
David Goldblatt	7b62885476	Introduce decay module and put decay objects in PA	2020-04-10 13:12:47 -07:00
David Goldblatt	497836dbc8	Arena stats: mark edata_avail as derived. The true number is in the edata_cache itself.	2020-04-10 13:12:47 -07:00
David Goldblatt	3192d6b77d	Extents: Have extent_dalloc_gap take ehooks. We're almost to the point where the extent code doesn't know about arenas at all. In that world, we shouldn't pull them out of the arena.	2020-04-10 13:12:47 -07:00
David Goldblatt	22a0a7b93a	Move arena_decay_extent to extent module.	2020-04-10 13:12:47 -07:00
David Goldblatt	70d12ffa05	PA: Move mapped into pa stats.	2020-04-10 13:12:47 -07:00
David Goldblatt	6ca918d0cf	PA: Add a stats comment.	2020-04-10 13:12:47 -07:00
David Goldblatt	ce8c0d6c09	PA: Move in arena extent_sn counter. Just another step towards making PA self-contained.	2020-04-10 13:12:47 -07:00
David Goldblatt	1ad368c8b7	PA: Move in decay stats.	2020-04-10 13:12:47 -07:00
David Goldblatt	356aaa7dc6	Introduce lockedint module. This pulls out the various abstractions where some stats counter is sometimes an atomic, sometimes a plain variable, sometimes always protected by a lock, sometimes protected by reads but not writes, etc. With this change, these cases are treated consistently, and access patterns tagged. In the process, we fix a few missed-update bugs (where one caller assumes "protected-by-a-lock" semantics and another does not).	2020-04-10 13:12:47 -07:00
David Goldblatt	acd0bf6a26	PA: move in ecache_grow.	2020-04-10 13:12:47 -07:00
David Goldblatt	32cb7c2f0b	PA: Add a stats type.	2020-04-10 13:12:47 -07:00
David Goldblatt	688fb3eb89	PA: Move in the arena edata_cache.	2020-04-10 13:12:47 -07:00
David Goldblatt	8433ad84ea	PA: move in shard initialization.	2020-04-10 13:12:47 -07:00
David Goldblatt	a24faed569	PA: Move in the ecache_t objects.	2020-04-10 13:12:47 -07:00
David Goldblatt	585f925055	Move cache index randomization out of extent. This is logically at a higher level of the stack; extent should just allocate things at the page-level; it shouldn't care exactly why the caller wants a given number of pages.	2020-04-10 13:12:47 -07:00
David Goldblatt	12be9f5727	Add a stub PA module -- a page allocator.	2020-04-10 13:12:47 -07:00
Yinan Zhang	c4e9ea8cc6	Get rid of locks in prof recent test	2020-04-07 17:22:24 -07:00
Yinan Zhang	2deabac079	Get rid of custom iterator for last-N records	2020-04-07 17:22:24 -07:00
Yinan Zhang	a5ddfa7d91	Use ql for prof last-N list	2020-04-07 17:22:24 -07:00
Yinan Zhang	ce17af4221	Better structure ql module	2020-04-06 09:50:27 -07:00
Yinan Zhang	4b66297ea0	Add move constructor to ql module	2020-04-06 09:50:27 -07:00
Yinan Zhang	a62b7ed928	Add emptiness checking to ql module	2020-04-06 09:50:27 -07:00
Yinan Zhang	1dd24ca6d2	Add rotate functionality to ql module	2020-04-06 09:50:27 -07:00
Yinan Zhang	0dc95a882f	Add concat and split functionality to ql module	2020-04-06 09:50:27 -07:00
Yinan Zhang	1ad06aa53b	deduplicate insert and delete logic in qr module	2020-04-06 09:50:27 -07:00
Yinan Zhang	c9d56cddf2	Optimize meld in qr module. The goal of `qr_meld()` is to change the following four fields `(a->prev, a->prev->next, b->prev, b->prev->next)` from the values `(a->prev, a, b->prev, b)` to `(b->prev, b, a->prev, a)`. This commit changes ``` a->prev->next = b; b->prev->next = a; temp = a->prev; a->prev = b->prev; b->prev = temp; ``` to ``` temp = a->prev; a->prev = b->prev; b->prev = temp; a->prev->next = a; b->prev->next = b; ``` The benefit is that we can use `b->prev->next` for `temp`, and so there's no need to pass in `a_type`. The restriction is that `b` cannot be a `qr_next()` macro, so users of `qr_meld()` must pay attention. (Before this change, neither `a` nor `b` could be a `qr_next()` macro.)	2020-04-06 09:50:27 -07:00
Yinan Zhang	f9aad7a49b	Add piping API to buffered writer	2020-04-01 09:41:20 -07:00
Yinan Zhang	09cd79495f	Encapsulate buffer allocation failure in buffered writer	2020-04-01 09:41:20 -07:00
David Goldblatt	3b4a03b92b	Mac: don't declare system functions as nothrow. This contradicts the system headers, which can lead to breakages.	2020-03-26 14:11:24 -07:00
Yinan Zhang	2256ef8961	Add option to fetch system thread name on each prof sample	2020-03-24 21:39:57 -07:00
Yinan Zhang	a5780598b3	Remove thread_event_rollback()	2020-03-12 13:55:00 -07:00
Yinan Zhang	ba783b3a0f	Remove prof -> thread_event dependency	2020-03-12 13:55:00 -07:00
Yinan Zhang	441d88d1c7	Rewrite profiling thread event	2020-03-12 13:55:00 -07:00
David Goldblatt	99b1291d17	Edata cache: add edata_cache_small_t. This can be used to amortize the synchronization costs of edata_cache accesses.	2020-03-12 11:58:09 -07:00
David Goldblatt	92485032b2	Cache bin: improve comments.	2020-03-12 11:54:19 -07:00
David Goldblatt	d701a085c2	Fast path: allow low-water mark changes. This lets us put more allocations on an "almost as fast" path after a flush. This results in around a 4% reduction in malloc cycles in prod workloads (corresponding to about a 0.1% reduction in overall cycles).	2020-03-12 11:54:19 -07:00
David Goldblatt	397da03865	Cache bin: rewrite to track more state. With this, we track all of the empty, full, and low water states together. This simplifies a lot of the tracking logic, since we now don't need the cache_bin_info_t for state queries (except for some debugging).	2020-03-12 11:54:19 -07:00
David Goldblatt	0a2fcfac01	Tcache: Hold cache bin allocation explicitly.	2020-03-12 11:54:19 -07:00
David Goldblatt	d498a4bb08	Cache bin: Add an emptiness assertion.	2020-03-12 11:54:19 -07:00
David Goldblatt	6a7aa46ef7	Cache bin: Add a debug method for init checking.	2020-03-12 11:54:19 -07:00
David Goldblatt	370c1ea007	Cache bin: Write the unit test in terms of the API. I.e., stop allowing the unit test to have secret access to implementation internals.	2020-03-12 11:54:19 -07:00
David Goldblatt	7f5ebd211c	Cache bin: set low-water internally.	2020-03-12 11:54:19 -07:00
David Goldblatt	60113dfe3b	Cache bin: Move in initialization code.	2020-03-12 11:54:19 -07:00
David Goldblatt	44529da852	Cache-bin: Make flush modifications internal. I.e., the tcache code just calls a cache-bin function to finish flush (and move pointers around, etc.). It doesn't directly access the cache-bin's owned memory any more.	2020-03-12 11:54:19 -07:00
David Goldblatt	ff6acc6ed5	Cache bin: simplify names and argument ordering. We always start with the cache bin, then its info (if necessary).	2020-03-12 11:54:19 -07:00
David Goldblatt	e1dcc557d6	Cache bin: Only take the relevant cache_bin_info_t. Previously, we took an array of cache_bin_info_ts and an index, and dereferenced ourselves. But infos for other cache_bins aren't relevant to any particular cache bin, so that should be the caller's job.	2020-03-12 11:54:19 -07:00
David Goldblatt	1b00d808d7	cache_bin: Don't let arena see empty position.	2020-03-12 11:54:19 -07:00
David Goldblatt	d303f30796	cache_bin nflush -> n. We're going to use it on the fill pathway as well.	2020-03-12 11:54:19 -07:00
David Goldblatt	74d36d78ef	Cache bin: Make ncached_max a query on the info_t.	2020-03-12 11:54:19 -07:00
David Goldblatt	b66c0973cc	cache_bin: Don't allow direct internals access.	2020-03-12 11:54:19 -07:00
David Goldblatt	da68f73296	Move percpu_arena_update. It's not really part of the API of the arena; it changes which arena we're using that API on.	2020-03-12 11:54:19 -07:00
David Goldblatt	909c501b07	Cache_bin: Shouldn't know about tcache. Instead, have it take the cache_bin_info_ts to use by pointer. While we're here, add a src file for the cache bin.	2020-03-12 11:54:19 -07:00
David Goldblatt	79f1ee2fc0	Move junking out of arena/tcache code. This is debug only and we keep it off the fast path. Moving it here simplifies the internal logic. This never tries to junk on regions that were shrunk via xallocx. I think this is fine for two reasons: - The shrunk-with-xallocx case is rare. - We don't always do that anyway before this diff (it depends on the opt settings and extent hooks in effect).	2020-03-12 11:54:19 -07:00
David T. Goldblatt	6c3491ad31	Tcache: Unify bin flush logic. The small and large pathways share most of their logic, even if some of the individual operations are different. We pull out the common logic into a force-inlined function, and then specialize twice, once for each value of "small".	2020-02-25 10:21:03 -08:00
David T. Goldblatt	9f4fc27389	Ehooks: Fix a build warning. We wrote `return some_void_func()` in a function returning void, which is confusing and triggers warnings on MSVC.	2020-02-25 10:21:03 -08:00
David T. Goldblatt	162c2bcf31	Background thread: take base as a parameter.	2020-02-18 11:22:09 -08:00
David T. Goldblatt	29436fa056	Break prof and tcache knowledge of b0.	2020-02-18 11:22:09 -08:00
David T. Goldblatt	a0c1f4ac57	Rtree: take the base allocator as a parameter. This facilitates better testing by avoiding mixing of the "real" base with the base used by the rtree under test.	2020-02-18 11:22:09 -08:00
David T. Goldblatt	7013716aaa	Emap: Take (and propagate) a zeroed parameter. Rtree needs this, and we should really treat them similarly.	2020-02-18 11:22:09 -08:00
David T. Goldblatt	182192f83c	Base: Pull into a single header.	2020-02-18 11:22:09 -08:00
David T. Goldblatt	34b7165fde	Put szind_t, pszind_t in sz.h.	2020-02-18 11:22:09 -08:00
David Goldblatt	7e6c8a7286	Emap: Standardize naming. Namespace everything under emap_, always specify what it is we're looking up (emap_lookup -> emap_edata_lookup), and use "ctx" over "info".	2020-02-17 10:50:51 -08:00
David Goldblatt	ac50c1e44b	Emap: Remove direct access to emap internals. In the process, we do a few local cleanups and optimizations. In particular, the size safety check on tcache flush no longer does a redundant load.	2020-02-17 10:50:51 -08:00
David Goldblatt	06e42090f7	Make jemalloc.c use the emap interface. While we're here, we'll also clean up some style nits.	2020-02-17 10:50:51 -08:00
David Goldblatt	f7d9c6c42d	Emap: Move in alloc_ctx lookup functionality.	2020-02-17 10:50:51 -08:00
David Goldblatt	65a54d7714	Emap: Move in szind and slab modifications.	2020-02-17 10:50:51 -08:00
David Goldblatt	9b5d105fc3	Emap: Move in iealloc. This is logically scoped to the emap.	2020-02-17 10:50:51 -08:00
David Goldblatt	1d449bd9a6	Emap: Internal rtree context setting. Sharing an rtree context across extent operations is a no-op except when tsd is unavailable. But this happens only in situations like thread death or initialization, and we don't care about shaving off every possible cycle in such scenarios.	2020-02-17 10:50:51 -08:00
David Goldblatt	08eb1e6c31	Emap: Comments and cleanup. Document some of the public interface, and hide the functions that are no longer used outside of the emap module.	2020-02-17 10:50:51 -08:00
David Goldblatt	231d1477e5	Rename emap_split_prepare_t -> emap_prepare_t. Both the split and merge functions use it.	2020-02-17 10:50:51 -08:00
David Goldblatt	0586a56f39	Emap: Move in merge functionality.	2020-02-17 10:50:51 -08:00
David Goldblatt	040eac77cc	Tell edatas their creation arena immediately. This avoids having to pass it in anywhere else.	2020-02-17 10:50:51 -08:00
David Goldblatt	7c7b702064	Emap: Move over metadata splitting logic.	2020-02-17 10:50:51 -08:00
David Goldblatt	44f5f53605	Emap: Move over deregistration functions.	2020-02-17 10:50:51 -08:00
David Goldblatt	6513d9d923	Emap: Move over deregistration boundary functions.	2020-02-17 10:50:51 -08:00
David Goldblatt	9b5ca0b09d	Emap: Move in slab interior registration.	2020-02-17 10:50:51 -08:00
David Goldblatt	d05b61db4a	Emap: Move extent boundary registration in.	2020-02-17 10:50:51 -08:00
David Goldblatt	ca21ce4071	Emap: Move in write_acquired from extent.	2020-02-17 10:50:51 -08:00
David Goldblatt	01f255161c	Add emap, for tracking extent locking.	2020-02-17 10:50:51 -08:00
Qi Wang	ba0e35411c	Rework the bin locking around tcache refill / flush. Previously, tcache fill/flush (as well as small alloc/dalloc on the arena) may potentially drop the bin lock for slab_alloc and slab_dalloc. This commit refactors the logic so that the slab calls happen in the same function / level as the bin lock / unlock. The main purpose is to be able to use flat combining without having to keep track of stack state. In the meantime, this change reduces the locking, especially for slab_dalloc calls, where nothing happens after the call.	2020-02-13 23:31:54 -08:00
Kamil Rytarowski	7fd22f7b2e	Fix Undefined Behavior in hash.h. hash.h:200:27, left shift of 250 by 24 places cannot be represented in type 'int'	2020-02-13 12:25:26 -08:00
Yinan Zhang	9cac3fa8f5	Encapsulate buffer allocation in buffered writer	2020-02-04 13:21:58 -08:00
Yinan Zhang	bdc08b5158	Better naming buffered writer	2020-02-04 13:21:58 -08:00
Qi Wang	c6bfe55857	Update the tsd description.	2020-02-04 13:07:05 -08:00
Qi Wang	e896522616	Abbreviate thread-event to te.	2020-02-04 13:07:05 -08:00
Qi Wang	5e500523a0	Remove thread_event_boot().	2020-02-04 00:18:15 -08:00
Qi Wang	97dd79db6c	Implement deallocation events. Make the event module accept two event types, and pass around the event context. Use bytes-based events to trigger tcache GC on deallocation, and get rid of the tcache ticker.	2020-02-04 00:18:15 -08:00
Qi Wang	974222c626	Add safety check on sdallocx slow / sampled path.	2020-01-31 00:04:22 -08:00
Qi Wang	88d9eca848	Enforce page alignment for sampled allocations. This allows sampled allocations to be checked through alignment, and therefore enables sized deallocation regardless of cache_oblivious.	2020-01-31 00:04:22 -08:00
Qi Wang	0f552ed673	Don't purge huge extents when decay is off.	2020-01-30 14:40:38 -08:00
Qi Wang	38a48e5741	Set reentrancy to 1 for tsd_state_purgatory. Reentrancy is already set for other non-nominal tsd states (reincarnated and minimal_initialized). Add purgatory to be safe and consistent.	2020-01-30 13:55:20 -08:00
Qi Wang	88b0e03a4e	Implement opt.stats_interval and the _opts options. Add options stats_interval and stats_interval_opts to allow interval based stats printing. This provides an easy way to collect stats without code changes, because opt.stats_print may not work (some binaries never exit).	2020-01-29 09:57:55 -08:00
Qi Wang	d71a145ec1	Change prof_accum_t to counter_accum_t for general purpose.	2020-01-29 09:57:55 -08:00
Yinan Zhang	f81341a48b	Fallback to unbuffered printing if OOM	2020-01-21 17:09:44 -08:00
David Goldblatt	bd3be8e0b1	Remove commit parameter to ecache functions. No caller ever wants uncommitted memory.	2020-01-17 10:54:56 -08:00
Qi Wang	dab81bd315	Rework and fix the assertions on malloc fastpath. The first half of the malloc fastpath may execute before malloc_init. Make the assertions work in that case.	2020-01-14 15:00:41 -08:00
Yinan Zhang	2b604a3016	Record request size in prof recent entries	2020-01-10 12:01:01 -08:00
Yinan Zhang	40a391408c	Define constructor for buffered writer argument	2020-01-10 11:59:02 -08:00
Yinan Zhang	6d8e616902	Make buffered writer an independent module	2020-01-10 11:59:02 -08:00
Yinan Zhang	6b6b4709b3	Unify buffered writer naming	2020-01-09 14:31:31 -08:00
Yinan Zhang	9a60cf54ec	Last-N profiling mode	2019-12-30 15:58:57 -08:00
Yinan Zhang	7a27a05940	Delete tdata states used for cleanup	2019-12-30 15:58:57 -08:00
Yinan Zhang	e98ddf7987	Fix unlikely condition in arena_prof_info_get()	2019-12-30 15:58:57 -08:00
Yinan Zhang	3fa142cf39	Remove _externs from prof internal header names	2019-12-23 11:14:15 -08:00
Yinan Zhang	112dc36dd5	Handle log_mtx during forking	2019-12-20 17:17:48 -08:00
Yinan Zhang	ea42174d07	Refactor profiling headers	2019-12-20 17:17:48 -08:00
David Goldblatt	6342da0970	Ehooks: Further optimize default merge case. This avoids the cost of an iealloc in cases where the user uses the default merge hook without using the default extent hooks.	2019-12-20 10:18:40 -08:00
David Goldblatt	e210ccc57e	Move extent2 -> extent. Eventually, we may fully break off the extent module; but not for some time. If it's going to live on in a non-transitory state, it might as well have the nicer name.	2019-12-20 10:18:40 -08:00
David Goldblatt	2f4fa80414	Rename extents -> ecache.	2019-12-20 10:18:40 -08:00
David Goldblatt	56cc56b692	Break extent split dependence on arena.	2019-12-20 10:18:40 -08:00
David Goldblatt	0aa9769fb0	Break commit functions' arena dependence	2019-12-20 10:18:40 -08:00
David Goldblatt	576d7047ab	Ecache: Should know its arena_ind. What we call an arena_ind is really the index associated with some particular set of ehooks; the arena is just the user-visible portion of that. Making this explicit, and reframing checks in terms of that, makes the code simpler and cleaner, and helps us avoid passing the arena itself all throughout extent code. This lets us put back an arena-specific assert.	2019-12-20 10:18:40 -08:00
David Goldblatt	372042a082	Remove merge dependence on the arena.	2019-12-20 10:18:40 -08:00
David Goldblatt	9cad5639ff	Ehooks: remove arena_ind parameter. This lives within the ehooks_t now, so that callers don't need to know it.	2019-12-20 10:18:40 -08:00
David Goldblatt	57fe99d4be	Move relevant index into the ehooks_t itself. It's always passed into the ehooks; keeping it colocated lets us avoid passing the arena everywhere.	2019-12-20 10:18:40 -08:00
David Goldblatt	c792f3e4ab	edata_cache: Remember the associated base_t. This will save us some trouble down the line when we stop passing arena pointers everywhere; we won't have to pass around a base_t pointer either.	2019-12-20 10:18:40 -08:00
David Goldblatt	ae23e5f426	Unify extent_alloc_wrapper with the other wrappers. Previously, it was really more like extents_alloc (it looks in an ecache for an extent to reuse as its primary allocation pathway). Make that pathway more explicitly like extents_alloc, and rename extent_alloc_wrapper_hard accordingly.	2019-12-20 10:18:40 -08:00
David Goldblatt	d8b0b66c6c	Put extent_state_t into ecache as well as eset.	2019-12-20 10:18:40 -08:00
David Goldblatt	98eb40e563	Move delay_coalesce from the eset to the ecache.	2019-12-20 10:18:40 -08:00
David Goldblatt	bb70df8e5b	Extent refactor: Introduce ecache module. This will eventually completely wrap the eset, and handle concurrency, allocation, and deallocation. For now, we only pull out the mutex from the eset.	2019-12-20 10:18:40 -08:00
David Goldblatt	0704516245	Ehooks: Add head tracking.	2019-12-20 10:18:40 -08:00
David Goldblatt	09475bf8ac	extent_may_dalloc -> ehooks_dalloc_will_fail	2019-12-20 10:18:40 -08:00
David Goldblatt	7859184179	Pull out edata_t caching into its own module.	2019-12-20 10:18:40 -08:00
David Goldblatt	a7862df616	Rename extent_t to edata_t. This frees us up from the unfortunate extent/extent2 naming collision.	2019-12-20 10:18:40 -08:00
David Goldblatt	865debda22	Rename extent.h -> edata.h. This name is slightly pithier; a full-on rename will come shortly.	2019-12-20 10:18:40 -08:00
David Goldblatt	a738a66b5c	Ehooks: Add some debug zero and addr checks. These help make sure that the ehooks return properly zeroed memory when required to.	2019-12-20 10:18:40 -08:00
David Goldblatt	4b2e5ee8b9	Ehooks: Add a "zero" ehook. This is the first API expansion. It lets the hooks pick where and how to purge within themselves.	2019-12-20 10:18:40 -08:00
David Goldblatt	403f2d1664	Extents: Split out introspection functionality. This isn't really part of the core extent allocation facilities. Especially as this module grows, having it in its own place may come in handy.	2019-12-20 10:18:40 -08:00
David Goldblatt	92a511d385	Make extent module hermetic. In the form of extent2.h. The naming leaves something to be desired, but I'll leave that for a later diff.	2019-12-20 10:18:40 -08:00
David Goldblatt	39fdc690a0	Ehooks comments and cleanup.	2019-12-20 10:18:40 -08:00
David Goldblatt	c8dae890c8	Extent -> Ehooks: Move over default hooks.	2019-12-20 10:18:40 -08:00
David Goldblatt	2fe5108263	Extent -> Ehooks: Move merge hook.	2019-12-20 10:18:40 -08:00
David Goldblatt	1fff4d2ee3	Extent -> Ehooks: Move split hook.	2019-12-20 10:18:40 -08:00
David Goldblatt	a5b42a1a10	Extent -> Ehooks: Move purge_forced hook.	2019-12-20 10:18:40 -08:00
David Goldblatt	368baa42ef	Extent -> Ehooks: Move purge_lazy hook.	2019-12-20 10:18:40 -08:00
David Goldblatt	d78fe241ac	Extent -> Ehooks: Move commit and decommit hooks.	2019-12-20 10:18:40 -08:00
David Goldblatt	5459ec9dae	Extent -> Ehooks: Move destroy hook.	2019-12-20 10:18:40 -08:00
David Goldblatt	bac8e2e5a6	Extent -> Ehooks: Move dalloc hook.	2019-12-20 10:18:40 -08:00
David Goldblatt	dc8b4e6e13	Extent -> Ehooks: Move alloc hook.	2019-12-20 10:18:40 -08:00
David Goldblatt	703fbc0ff5	Introduce unsafe reentrancy guards. We have to work to circumvent the safety checks in pre_reentrancy when going down extent hook pathways. Instead, let's explicitly have checked and unchecked guards.	2019-12-20 10:18:40 -08:00
David Goldblatt	ae0d8e8591	Move extent ehook calls into ehooks	2019-12-20 10:18:40 -08:00
David Goldblatt	ba8b9ecbcb	Add ehooks module	2019-12-20 10:18:40 -08:00
David Goldblatt	837119a948	base_structs.h: Remove some mid-line tabs.	2019-12-20 10:18:40 -08:00
David Goldblatt	9f6eb09585	Extents: Eagerly initialize extent hooks. When deferred initialization was added, initializing required copying sizeof(extent_hooks_t) bytes after a pointer chase. Today, it's just a single pointer loaded from the base_t. In subsequent diffs, we'll get rid of even that.	2019-12-20 10:18:40 -08:00
David Goldblatt	4278f84603	Move extent hook getters/setters to arena.c. This is where they're logically scoped; they access arena data.	2019-12-20 10:18:40 -08:00
Qi Wang	d5031ea824	Allow dallocx and sdallocx after tsd destruction. After a thread turns into purgatory / reincarnated state, still allow dallocx and sdallocx to function normally.	2019-12-19 11:17:03 -08:00
Yinan Zhang	4afd709d1f	Restructure setters for profiling info. Explicitly define three setters: - `prof_tctx_reset()`: set `prof_tctx` to `1U`, if we don't know in advance whether the allocation is large or not; - `prof_tctx_reset_sampled()`: set `prof_tctx` to `1U`, if we already know in advance that the allocation is large; - `prof_info_set()`: set a real `prof_tctx`, and also set other profiling info e.g. the allocation time. Code structure wise, the prof level is kept as a thin wrapper, the large level only provides low level setter APIs, and the arena level carries out the main logic.	2019-12-17 10:01:28 -08:00
Yinan Zhang	1d01e4c770	Initialization utilities for nstime	2019-12-16 16:08:56 -08:00
Qi Wang	dd649c9485	Optimize away the tsd_fast() check on fastpath. Fold the tsd_state check onto the event threshold check. The fast threshold is set to 0 when tsd switches to non-nominal. The fast_threshold can be reset by remote threads, to reflect the non-nominal tsd state change.	2019-12-11 23:44:20 -08:00
Yinan Zhang	45836d7fd3	Pass nstime_t pointer for profiling	2019-12-11 11:38:16 -08:00
Yinan Zhang	7d2bac5a38	Refactor destroy code path for prof_tctx	2019-12-10 16:31:05 -08:00
Yinan Zhang	055478cca8	Threshold is no longer updated before prof_realloc()	2019-12-10 16:31:05 -08:00
Yinan Zhang	7e3671911f	Get rid of old indentation style for prof	2019-12-06 09:47:51 -08:00
Yinan Zhang	dfdd46f6c1	Refactor prof_tctx_t creation	2019-12-06 09:47:51 -08:00
Yinan Zhang	aa1d71fb7a	Rename prof_tctx to alloc_tctx in prof_info_t	2019-12-06 09:47:51 -08:00
Yinan Zhang	5e0b090992	No need to pass usize to prof_tctx_set()	2019-12-06 09:47:51 -08:00
David Goldblatt	1b1e76acfe	Disable some spuriously-triggering warnings	2019-12-04 13:45:17 -08:00
Yinan Zhang	6945371778	Change tsdn to tsd for profiling code path	2019-11-22 16:31:56 -08:00
Yinan Zhang	b55419f9b9	Restructure profiling. Develop new data structure and code logic for holding profiling related information stored in the extent that may be needed after the extent is released, which in particular is the case for the reallocation code path (e.g. in `rallocx()` and `xallocx()`). The data structure is a generalization of `prof_tctx_t`: we previously only copied out the `prof_tctx` before the extent is released, but we may be in need of additional fields. Currently the only additional field is the allocation time field, but there may be more fields in the future. The restructuring also resolved a bug: `prof_realloc()` mistakenly passed the new `ptr` to `prof_free_sampled_object()`, but passing in the `old_ptr` would crash because it's already been released. Now the essential profiling information is collectively copied out early and safely passed to `prof_free_sampled_object()` after the extent is released.	2019-11-22 16:31:56 -08:00
Mark Santaniello	8b2c2a596d	Support C++17 over-aligned allocation. Summary: Add support for C++17 over-aligned allocation: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0035r4.html Supporting all 10 operators means we avoid thunking thru libstdc++-v3/libsupc++ and just call jemalloc directly. It's also worth noting that there is now an aligned and sized operator delete: ``` void operator delete(void* ptr, std::size_t size, std::align_val_t al) noexcept; ``` If JeMalloc did not provide this, the default implementation would ignore the size parameter entirely: https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/libsupc%2B%2B/del_opsa.cc#L30-L33 (I must also update ax_cxx_compile_stdcxx.m4 to a newer version with C++17 support.) Test Plan: Wrote a simple test that allocates and then deletes an over-aligned type: ``` struct alignas(32) Foo {}; Foo *f; int main() { f = new Foo; delete f; } ``` Before this change, both new and delete go thru PLT, and we end up calling regular old free: ``` (gdb) disassemble Dump of assembler code for function main(): ... 0x00000000004029b7 <+55>: call 0x4022d0 <_ZnwmSt11align_val_t@plt> ... 0x00000000004029d5 <+85>: call 0x4022e0 <_ZdlPvmSt11align_val_t@plt> ... (gdb) s free (ptr=0x7ffff6408020) at /home/engshare/third-party2/jemalloc/master/src/jemalloc.git-trunk/src/jemalloc.c:2842 2842 if (!free_fastpath(ptr, 0, false)) { ``` After this change, we directly call new/delete and ultimately call sdallocx: ``` (gdb) disassemble Dump of assembler code for function main(): ... 0x0000000000402b77 <+55>: call 0x496ca0 <operator new(unsigned long, std::align_val_t)> ... 0x0000000000402b95 <+85>: call 0x496e60 <operator delete(void*, unsigned long, std::align_val_t)> ... (gdb) s 116 je_sdallocx_noflags(ptr, size); ```	2019-11-22 10:14:16 -08:00
Qi Wang	9a7ae3c97f	Reduce footprint of bin_t. Avoid storing mutex_prof_data_t in bin_t. Added bin_stats_data_t which is used for reporting bin stats.	2019-11-21 11:08:36 -08:00
Yinan Zhang	73510dfd15	Revert "Fix bug in prof_realloc". This reverts commit `3b5eecf102`.	2019-11-15 15:13:39 -08:00
Yinan Zhang	3b5eecf102	Fix bug in prof_realloc. We should pass in `old_ptr` rather than the new `ptr` to `prof_free_sampled_object()` when `old_ptr` points to a sampled allocation.	2019-11-15 13:28:33 -08:00
Leonardo Santagada	c462753cc8	Use __forceinline for JEMALLOC_ALWAYS_INLINE on msvc	2019-11-12 13:50:25 -08:00
Qi Wang	da50d8ce87	Refactor and optimize prof sampling initialization. Makes the prof sample prng use the tsd prng_state. This allows us to properly initialize the sample interval event, without having to create tdata. As a result, tdata will be created on demand (when a thread reaches the sample interval bytes allocated), instead of on the first allocation.	2019-11-11 10:35:37 -08:00
Qi Wang	bc774a3519	Rename tsd->offset_state to tsd->prng_state.	2019-11-11 10:35:37 -08:00
Qi Wang	19a51abf33	Avoid arena->offset_state when tsd not available for prng. Use stack locals and remove the offset_state in arena.	2019-11-11 10:35:37 -08:00
Nick Desaulniers	d01b425e5d	Add -Wimplicit-fallthrough checks if supported. Clang since r369414 (clang-10) can now check -Wimplicit-fallthrough for C code, and use the GNU C style attribute to denote fallthrough. Move the test from header only to autoconf. The previous test used brittle version detection which did not work for newer clang that supported this feature. The attribute has to be its own statement, hence the added `;`. It also can only precede case statements, so the final cases should be explicitly terminated with break statements. Fixes commit `3d29d11ac2` ("Clean compilation -Wextra") Link: `1e0affb6e5` Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>	2019-11-08 13:03:03 -08:00
Yinan Zhang	43f0ce92d8	Define general purpose tsd_thread_event_init()	2019-11-04 16:07:56 -08:00
Yinan Zhang	97f93fa0f2	Pull tcache GC events into thread event handler	2019-11-04 16:07:56 -08:00
Yinan Zhang	198f02e797	Pull prof_accumbytes into thread event handler	2019-11-04 15:21:16 -08:00
Yinan Zhang	152c0ef954	Build a general purpose thread event handler	2019-11-04 11:15:50 -08:00
David T. Goldblatt	de81a4eada	Add stats counters for number of zero reallocs	2019-10-29 17:48:44 -07:00
David T. Goldblatt	9cfa805947	Realloc: Make behavior of realloc(ptr, 0) configurable.	2019-10-29 17:48:44 -07:00
Yinan Zhang	05681e387a	Optimize cache_bin_alloc_easy for malloc fast path `tcache_bin_info` is not accessed on malloc fast path but the compiler reserves a register for it, as well as an additional register for `tcache_bin_info[ind].stack_size`. The optimization gets rid of the need for the two registers.	2019-10-21 16:43:45 -07:00
Yinan Zhang	4fe50bc7d0	Fix amd64 MSVC warning	2019-10-18 10:16:29 -07:00
Yinan Zhang	4fbbc817c1	Simplify time setting and getting for prof log	2019-10-16 09:24:52 -07:00
Yinan Zhang	66e07f986d	Suppress tdata creation in reentrancy This change suppresses tdata initialization and prof sample threshold update in interrupting malloc calls. Interrupting calls have no need for tdata. Delaying tdata creation aligns better with our lazy tdata creation principle, and it also helps us gain control back from interrupting calls more quickly and reduces any risk of delegating tdata creation to an interrupting call.	2019-10-04 08:52:50 -07:00
Yinan Zhang	beb7c16e94	Guard prof_active reset by opt_prof Set `prof_active` to read-only when `opt_prof` is turned off.	2019-10-02 11:42:53 -07:00
David T. Goldblatt	3d84bd57f4	Arena: Add helper function arena_get_from_extent.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	c97d255752	Eset: Remove temporary declaration.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	ce5b128f10	Remove the undefined extent_size_quantize declarations.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	821dd53a1d	Extent -> Eset: Rename arena members.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	e144b21e4b	Extent -> Eset: Move fork handling.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	77bbb35a92	Extent -> Eset: Move extent fit functions.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	1210af9a4e	Extent -> Eset: Move insertion and removal.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	a42861540e	Extents -> Eset: Convert some stats getters.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	820f070c6b	Move page quantization to sz module.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	63d1b7a7a7	Extents -> Eset: move extents_state_get.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	b416b96a39	Extents -> Eset: rename/move extents_init.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	4e5e43f22e	Rename extents_t -> eset_t.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	723ccc6c27	Extents: Split out extent struct.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	41187bdfb0	Extents: Break extent-struct/arena interactions Specifically, the extent_arena_[g|s]et functions and the address randomization. These are the only things that tie the extent struct itself to the arena code.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	529cfe2abc	Arena: rename arena_structs_b.h -> arena_structs.h arena_structs_a.h was removed in the previous commit.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	e7cf84a8dd	Rearrange slab data and constants The constants logically belong in the sc module. The slab data bitmap isn't really scoped to an arena; move it to its own module.	2019-09-23 23:06:27 -07:00
zhxchen17	b7c7df24ba	Add max_per_bg_thd stats for per background thread mutexes. Added a new stats row to aggregate the maximum value of mutex counters for each background thread. Given that the per bg thd mutex is not expected to be contended, this counter is mainly for sanity check / debugging.	2019-09-13 09:23:57 -07:00
zhxchen17	4b76c684bb	Add "prof.dump_prefix" to override filename prefixes for dumps.	2019-09-12 22:26:03 -07:00
zhxchen17	242af439b8	Rename "prof_dump_seq_mtx" to "prof_dump_filename_mtx".	2019-09-12 22:26:03 -07:00
Yinan Zhang	93d6151800	Pass tsd down to prof_backtrace()	2019-09-05 10:57:43 -07:00
Qi Wang	785b84e603	Make cache_bin_sz_t unsigned. The bin size type was made signed only because the low_water could go -1, which was already removed.	2019-09-04 13:37:07 -07:00
Qi Wang	23dc7a7fba	Fix index type for cache_bin_alloc_easy.	2019-09-04 13:37:07 -07:00
Yinan Zhang	57b81c078e	Pull thread_(de)allocated out of config_stats	2019-08-26 11:56:41 -07:00
Qi Wang	0043e68d4c	Track low_water == -1 case explicitly. The -1 value of low_water indicates if the cache has been depleted and refilled. Track the status explicitly in the tcache struct. This allows the fast path to check if (cur_ptr > low_water), instead of >=, which avoids reaching slow path when the last item is allocated.	2019-08-21 16:00:38 -07:00
Qi Wang	937ca1db9f	Store ncached_max * ptr_size in tcache_bin_info. With the cache bin metadata switched to pointers, ncached_max is usually accessed and multiplied by sizeof(ptr). Store the results in tcache_bin_info for direct access, and add a helper function for the ncached_max value.	2019-08-19 12:23:24 -07:00
Qi Wang	7599c82d48	Redesign the cache bin metadata for fast path. Implement the pointer-based metadata for tcache bins: 3 pointers are maintained to represent each bin; 2 of the pointers are compressed on 64-bit; is_full / is_empty done through pointer comparison. Compared to the previous counter-based design: fast-path speed up ~15% in benchmarks; direct pointer comparison and de-reference; no need to access tcache_bin_info in common case.	2019-08-19 12:21:44 -07:00
Qi Wang	e2c7584361	Simplify / refactor tcache_dalloc_large.	2019-08-14 13:08:23 -07:00
Qi Wang	9c5c2a2c86	Unify the signature of tcache_flush small and large.	2019-08-14 13:08:23 -07:00
Yinan Zhang	8c8466fa6e	Add compact json option for emitter JSON format is largely meant for machine-machine communication, so adding the option to the emitter. According to local testing, the savings in terms of bytes outputted is around 50% for stats printing and around 25% for prof log printing.	2019-08-09 09:53:41 -07:00
Yinan Zhang	7fc6b1b259	Add buffered writer The buffered writer adopts a signature identical to `write_cb`, so that it can be plugged into anywhere `write_cb` appears.	2019-08-09 09:44:29 -07:00
Yinan Zhang	39343555d6	Report stats for tdatas_mtx and prof_dump_mtx	2019-08-09 09:24:16 -07:00
Yinan Zhang	07ce2434bf	Refactor profiling Refactored core profiling codebase into two logical parts: (a) `prof_data.c`: core internal data structure managing & dumping; (b) `prof.c`: mutexes & outward-facing APIs. Some internal functions had to be exposed out, but there are not that many of them if the modularization is (hopefully) clean enough.	2019-08-07 19:48:28 -07:00
Yinan Zhang	56126d0d2d	Refactor prof log Prof logging is conceptually separate from core profiling, so split it out as a module of its own. There are a few internal functions that had to be exposed but I think it is a fair trade-off.	2019-08-07 13:53:45 -07:00
Yinan Zhang	56c8ecffc1	Correct tsd layout graph Augmented the tsd layout graph so that the two recently added fields, `offset_state` and `bytes_until_sample`, are properly reflected. As is shown, the cache footprint is 16 bytes larger than before.	2019-08-05 15:30:20 -07:00
Yinan Zhang	9344d25488	Workaround to address g++ unused variable warnings g++ 5.5.0+ complained `parameter ‘expected’ set but not used [-Werror=unused-but-set-parameter]` (despite that `expected` is in fact used).	2019-07-30 11:37:56 -07:00
Qi Wang	5742473cc8	Revert "Refactor prof log" This reverts commit `7618b0b8e4`.	2019-07-29 14:10:15 -07:00
Qi Wang	1a0503367b	Revert "Refactor profiling" This reverts commit `0b462407ae`.	2019-07-29 14:10:15 -07:00
Yinan Zhang	0b462407ae	Refactor profiling Refactored core profiling codebase into two logical parts: (a) `prof_data.c`: core internal data structure managing & dumping; (b) `prof.c`: mutexes & outward-facing APIs. Some internal functions had to be exposed out, but there are not that many of them if the modularization is (hopefully) clean enough.	2019-07-29 13:55:00 -07:00
Yinan Zhang	7618b0b8e4	Refactor prof log `prof.c` is growing too long, so trying to modularize it. There are a few internal functions that had to be exposed but I think it is a fair trade-off.	2019-07-29 13:55:00 -07:00
Qi Wang	a3fa597921	Refactor arena_dalloc() / _sdalloc().	2019-07-24 18:30:54 -07:00
Qi Wang	bc0998a905	Invoke arena_dalloc_promoted() properly w/o tcache. When tcache was disabled, the dalloc promoted case was missing.	2019-07-24 18:30:54 -07:00
Qi Wang	4e36ce34c1	Track the leaked VM space via the abandoned_vm counter. The counter is 0 unless metadata allocation failed (indicates OOM), and is mainly for sanity checking.	2019-07-24 11:24:22 -07:00
Qi Wang	9a86c65abc	Implement retain on Windows. The VirtualAlloc and VirtualFree APIs are different because MEM_DECOMMIT cannot be used across multiple VirtualAlloc regions. To properly support decommit, only allow merge / split within the same region -- this is done by tracking the "is_head" state of extents and not merging cross-region. Add a new state is_head (only relevant for retain && !maps_coalesce), which is true for the first extent in each VirtualAlloc region. Determine if two extents can be merged based on the head state, and use serial numbers for sanity checks.	2019-07-23 22:18:55 -07:00
Yinan Zhang	a2a693e722	Remove prof_accumbytes in arena `prof_accumbytes` was supposed to be replaced by `prof_accum` in https://github.com/jemalloc/jemalloc/pull/623.	2019-07-16 15:18:52 -07:00
Yinan Zhang	d26636d566	Fix logic in printing `cbopaque` can now be overridden without overriding `write_cb` in the first place. (Otherwise there would be no need to have the `cbopaque` parameter in `malloc_message`.)	2019-07-16 14:54:23 -07:00
Yinan Zhang	7720b6e385	Fix redzone setting and checking	2019-07-11 20:51:29 -07:00
Yinan Zhang	c92ac30601	Add confirm_conf option If the confirm_conf option is set, when the program starts, each of the four malloc_conf strings will be printed, and each option will be printed when being set.	2019-05-22 09:38:39 -07:00
Qi Wang	07c44847c2	Track nfills and nflushes for arenas.i.small / large. Small is added purely for convenience. Large flushes weren't tracked before and can be useful in analysis. Large fill simply reports nmalloc, since there is no batch fill for large currently.	2019-05-15 10:05:09 -07:00
Doron Roberts-Kedes	7fc4f2a32c	Add nonfull_slabs to bin_stats_t. When config_stats is enabled track the size of bin->slabs_nonfull in the new nonfull_slabs counter in bin_stats_t. This metric should be useful for establishing an upper ceiling on the savings possible by meshing.	2019-04-29 13:35:02 -07:00
Yinan Zhang	ae124b8684	Improve size class header Mainly fixing typos. The only non-trivial change is in the computation for SC_NPSIZES, though the result wouldn't be any different when SC_NGROUP = 4 as is always the case at the moment.	2019-04-24 10:45:12 -07:00
Qi Wang	1aabab5fdc	Enforce TLS_MODEL attribute. Caught by @zoulasc in #1460. The attribute needs to be added in the headers as well.	2019-04-16 11:07:15 -07:00
David Goldblatt	33e1dad680	Safety checks: Add a redzoning feature.	2019-04-15 16:48:12 -07:00
David Goldblatt	b92c9a1a81	Safety checks: Indirect through a function. This will let us share code on failure pathways.	2019-04-15 16:48:12 -07:00
David Goldblatt	f4d24f05e1	Move extra size checks behind a config flag. This will let us turn that flag into a generic "turn on runtime checks" flag that guards other functionality we have planned.	2019-04-15 16:48:12 -07:00
zoulasc	7f7935cf78	Add an autoconf feature test for format_arg and a jemalloc-specific macro for it.	2019-04-15 15:14:46 -07:00
zoulasc	14e4176758	Fix incorrect macro use. Compiling with warnings produces missing prototype warnings.	2019-04-15 15:14:46 -07:00
zoulasc	020b5dc7ac	Convert the format generator function to an annotated format function, so that the generated formats can be checked by the compiler.	2019-04-15 15:14:46 -07:00
mgrice	d3d7a8ef09	remove compare and branch in fast path for c++ operator delete[] Summary: sdallocx is checking a flag that will never be set (at least in the provided C++ destructor implementation). This branch will probably only rarely be mispredicted however it removes two instructions in sdallocx and one at the callsite (to zero out flags).	2019-04-08 10:59:05 -07:00
Yinan Zhang	9aab3f2be0	Add memory utilization analytics to mallctl The analytics tool is put under experimental.utilization namespace in mallctl. Input is one pointer or an array of pointers and the output is a list of memory utilization statistics.	2019-04-04 13:48:39 -07:00
Qi Wang	fb56766ca9	Eagerly purge oversized merged extents. This change improves memory usage slightly, at virtually no CPU cost.	2019-03-14 17:34:55 -07:00
Qi Wang	f6c30cbafa	Remove some unused comments.	2019-03-14 17:34:55 -07:00
Qi Wang	b804d0f019	Fallback to 32-bit when 8-bit atomics are missing for TSD. When it happens, this might cause a slowdown on the fast path operations. However such case is very rare.	2019-03-09 12:52:06 -08:00
Qi Wang	06f0850427	Detect if 8-bit atomics are available. In some rare cases (older compiler, e.g. gcc 4.2 w/ MIPS), 8-bit atomics might be unavailable. Detect such cases so that we can work around them.	2019-03-09 12:52:06 -08:00
Jason Evans	14d3686c9f	Do not use #pragma GCC diagnostic with gcc < 4.6. This regression was introduced by `3d29d11ac2` (Clean compilation -Wextra).	2019-03-09 12:10:30 -08:00
Jason Evans	775fe302a7	Remove JE_FORCE_SYNC_COMPARE_AND_SWAP_[48]. These macros have been unused since `d4ac7582f3` (Introduce a backport of C11 atomics).	2019-02-22 14:22:16 -08:00
Jason Evans	dca7060d5e	Avoid redefining tsd_t. This fixes a build failure when integrating with FreeBSD's libc. This regression was introduced by `d1e11d48d4` (Move tsd link and in_hook after tcache.).	2019-02-20 20:27:55 -08:00
Qi Wang	8e9a613122	Disable muzzy decay by default.	2019-02-04 14:38:54 -08:00
Qi Wang	e13400c919	Sanity check szind on tcache flush. This adds some overhead to the tcache flush path (which is one of the popular paths). Guard it behind a config option.	2019-02-01 12:31:34 -08:00
Qi Wang	e3db480f6f	Rename huge_threshold to oversize_threshold. The keyword huge tends to remind people of huge pages, which is not relevant to the feature.	2019-01-25 13:15:45 -08:00
Qi Wang	350809dc5d	Set huge_threshold to 8M by default. This feature uses a dedicated arena to handle huge requests, which significantly improves VM fragmentation. In the production workloads we tested, it often reduces VM size by >30%.	2019-01-24 13:29:23 -08:00
Qi Wang	bbe8e6a909	Avoid creating bg thds for huge arena alone. For low arena count settings, the huge threshold feature may trigger an unwanted bg thd creation. Given that the huge arena does eager purging by default, bypass bg thd creation when initializing the huge arena.	2019-01-15 16:00:34 -08:00
Qi Wang	f459454afe	Avoid potential issues on extent zero-out. When custom extent_hooks or transparent huge pages are in use, the purging semantics may change, which means we may not get zeroed pages on repopulating. Fixing the issue by manually memset for such cases.	2019-01-11 19:16:12 -08:00
Leonardo Santagada	daa0e436ba	implement malloc_getcpu for windows	2019-01-08 14:34:45 -08:00
Qi Wang	7241bf5b74	Only read arena index from extent on the tcache flush path. Add extent_arena_ind_get() to avoid loading the actual arena ptr in case we just need to check arena matching.	2018-12-18 15:19:30 -08:00
Alexander Zinoviev	36de5189c7	Add rate counters to stats	2018-12-18 09:59:41 -08:00
Qi Wang	98b56ab23d	Store the bin shard selection in TSD. This avoids having to choose bin shard on the fly, also will allow flexible bin binding for each thread.	2018-12-03 17:17:03 -08:00
Qi Wang	3f9f2833f6	Add opt.bin_shards to specify number of bin shards. The option uses the same format as "slab_sizes" to specify number of shards for each bin size.	2018-12-03 17:17:03 -08:00
Qi Wang	37b8913925	Add support for sharded bins within an arena. This makes it possible to have multiple sets of bins in an arena, which improves arena scalability because the bins (especially the small ones) are always the limiting factor in production workload. A bin shard is picked on allocation; each extent tracks the bin shard id for deallocation. The shard size will be determined using runtime options.	2018-12-03 17:17:03 -08:00
Dave Watson	b23336af96	mutex: fix trylock spin wait contention If there are 3 or more threads spin-waiting on the same mutex, there will be excessive exclusive cacheline contention because pthread_mutex_trylock() immediately tries to CAS in a new value, instead of first checking if the lock is locked. This diff adds a 'locked' hint flag, and we will only spin wait without trylock()ing while set. I don't know of any other portable way to get the same behavior as pthread_mutex_lock(). This is pretty easy to test via ttest, e.g. ./ttest1 500 3 10000 1 100 Throughput is nearly 3x as fast. This blames to the mutex profiling changes; however, we almost never have 3 or more threads contending in properly configured production workloads, but it is still worth fixing.	2018-11-28 15:17:02 -08:00
Qi Wang	c4063ce439	Set the default number of background threads to 4. The setting has been tested in production for a while. No negative effect while we were able to reduce number of threads per process.	2018-11-16 09:35:12 -08:00
Qi Wang	43f3b1ad0c	Deprecate OSSpinLock.	2018-11-14 08:44:05 -08:00
Dave Watson	13c237c7ef	Add a fastpath for arena_slab_reg_alloc_batch Also adds a configure.ac check for __builtin_popcount, which is used in the new fastpath.	2018-11-14 07:09:11 -08:00
Dave Watson	17aa470760	add extent_nfree_sub	2018-11-14 07:09:11 -08:00
Qi Wang	1f56115704	Fix tcache_flush (follow up `cd2931a`). Also catch invalid tcache id.	2018-11-13 08:54:09 -08:00
Dave Watson	e2ab215324	refactor tcache_dalloc_small Add a cache_bin_dalloc_easy (to match the alloc_easy function), and use it in tcache_dalloc_small. It will also be used in the new free fastpath.	2018-11-12 13:20:37 -08:00
Dave Watson	5e795297b3	rtree: add rtree_szind_slab_read_fast For a free fastpath, we want something that will not make additional calls. Assume most free() calls will hit the L1 cache, and use a custom rtree function for this. Additionally, roll the ptr=NULL check in to the rtree cache check.	2018-11-12 13:20:37 -08:00
Justin Hibbits	be0749f591	Restrict lwsync to powerpc64 only Nearly all 32-bit powerpc hardware treats lwsync as sync, and some cores (Freescale e500) trap lwsync as an illegal instruction, which then gets emulated in the kernel. To avoid unnecessary traps on the e500, use sync on all 32-bit powerpc. This pessimizes 32-bit software running on 64-bit hardware, but those numbers should be slim.	2018-10-24 11:18:55 -07:00
Edward Tomasz Napierala	ceba1dde27	Make use of pthread_set_name_np(3) on FreeBSD.	2018-10-24 10:06:37 -07:00
Dave Watson	936bc2aa15	prof: Fix memory regression The diff 'refactor prof accum...' moved the bytes_until_sample subtraction before the load of tdata. If tdata is null, tdata_get(true) will overwrite bytes_until_sample, but we still sample the current allocation. Instead, do the subtraction and check logic again, to keep the previous behavior. blame-rev: `0ac524308d`	2018-10-23 12:39:57 -07:00
Dave Watson	0ec656eb71	ticker: add ticker_trytick For the fastpath, we want to tick, but undo the tick and jump to the slowpath if ticker would fire.	2018-10-18 08:32:19 -07:00
Dave Watson	ac34afb403	drop bump_empty_alloc option. Size class lookup support used instead.	2018-10-17 08:50:58 -07:00
Dave Watson	4edbb7c64c	sz: Support 0 size in size2index lookup/compute	2018-10-17 08:50:58 -07:00
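
The spin-wait change described in b23336af96 above ("mutex: fix trylock spin wait contention") can be illustrated with a minimal sketch. It is not jemalloc's code: the `hint_lock_t` type, the function names, and the spin budget are all hypothetical, and C11 atomics stand in for the pthread mutex plus 'locked' hint flag that the commit message describes. The point it shows is that waiters poll the flag with plain loads and only attempt the exclusive read-modify-write once the lock looks free.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration; not jemalloc's malloc_mutex_t. */
typedef struct {
    atomic_bool locked; /* doubles as the lock word and the "locked" hint */
} hint_lock_t;

/* Arbitrary spin budget; an assumption, not a value taken from jemalloc. */
#define HINT_LOCK_MAX_SPIN 250

static void hint_lock_init(hint_lock_t *l) {
    atomic_init(&l->locked, false);
}

/*
 * One acquisition attempt. The exchange is a read-modify-write, so it pulls
 * the cache line in exclusive mode even when the attempt fails.
 */
static bool hint_lock_try(hint_lock_t *l) {
    return !atomic_exchange_explicit(&l->locked, true, memory_order_acquire);
}

/*
 * Spin-wait that checks the hint first: while the lock is observed held,
 * only plain loads are issued, so waiters can share the cache line. The
 * exchange is attempted only after the lock is seen free.
 * Returns true if the lock was acquired within the spin budget.
 */
static bool hint_lock_spin(hint_lock_t *l) {
    for (int i = 0; i < HINT_LOCK_MAX_SPIN; i++) {
        if (!atomic_load_explicit(&l->locked, memory_order_relaxed) &&
            hint_lock_try(l)) {
            return true;
        }
    }
    return false; /* a real caller would fall back to a blocking lock here */
}

static void hint_lock_unlock(hint_lock_t *l) {
    assert(atomic_load_explicit(&l->locked, memory_order_relaxed));
    atomic_store_explicit(&l->locked, false, memory_order_release);
}
```

A caller would try `hint_lock_spin()` first and fall back to a blocking acquisition when it returns false. How much this helps depends on the number of contending threads; the commit message reports the effect for 3 or more spinners.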


1438 Commits