This pulls out the various abstractions where some stats counter is sometimes an
atomic, sometimes a plain variable, sometimes always protected by a lock,
sometimes protected by reads but not writes, etc. With this change, these cases
are treated consistently, and access patterns tagged.
In the process, we fix a few missed-update bugs (where one caller assumes
"protected-by-a-lock" semantics and another does not).
This is logically at a higher level of the stack; extent should just allocate
things at the page-level; it shouldn't care exactly why the callers wants a
given number of pages.
Previously, large allocations in tcaches would have their sizes reduced during
stats estimation. Added a test, which fails before this change but passes now.
This fixes a bug introduced in 5934846612, which
was itself fixing a bug introduced in 9c0549007d.
This lets us put more allocations on an "almost as fast" path after a flush.
This results in around a 4% reduction in malloc cycles in prod workloads
(corresponding to about a 0.1% reduction in overall cycles).
With this, we track all of the empty, full, and low water states together. This
simplifies a lot of the tracking logic, since we now don't need the
cache_bin_info_t for state queries (except for some debugging).
I.e. the tcache code just calls a cache-bin function to finish flush (and move
pointers around, etc.). It doesn't directly access the cache-bin's owned memory
any more.
Previously, we took an array of cache_bin_info_ts and an index, and dereferenced
ourselves. But infos for other cache_bins aren't relevant to any particular
cache bin, so that should be the caller's job.
This is debug only and we keep it off the fast path. Moving it here simplifies
the internal logic.
This never tries to junk on regions that were shrunk via xallocx. I think this
is fine for two reasons:
- The shrunk-with-xallocx case is rare.
- We don't always do that anyway before this diff (it depends on the opt
settings and extent hooks in effect).
The small and large pathways share most of their logic, even if some of the
individual operations are different. We pull out the common logic into a
force-inlined function, and then specialize twice, once for each value of
"small".
The only time sharing an rtree context saves across extent operations isn't a
no-op is when tsd is unavailable. But this happens only in situations like
thread death or initialization, and we don't care about shaving off every
possible cycle in such scenarios.
Previously, tcache fill/flush (as well as small alloc/dalloc on the arena) may
potentially drop the bin lock for slab_alloc and slab_dalloc. This commit
refactors the logic so that the slab calls happen in the same function / level
as the bin lock / unlock. The main purpose is to be able to use flat combining
without having to keep track of stack state.
In the meantime, this change reduces the locking, especially for slab_dalloc
calls, where nothing happens after the call.
Check the is_head state before merging two extents. Disallow the merge if it's
crossing two separate mmap regions. This enforces first-fit (by not losing the
SN) at a very small cost.
Make the event module to accept two event types, and pass around the event
context. Use bytes-based events to trigger tcache GC on deallocation, and get
rid of the tcache ticker.
- NetBSD overcommits
- When mapping pages, use the maximum of the alignment requested and the
compiled-in PAGE constant which might be greater than the current kernel
pagesize, since we compile binaries with the maximum page size supported
by the architecture (so that they work with all kernels).
Add options stats_interval and stats_interval_opts to allow interval based stats
printing. This provides an easy way to collect stats without code changes,
because opt.stats_print may not work (some binaries never exit).