Commit Graph

1750 Commits

Author SHA1 Message Date
Alex Lapenkou
6e848a005e Remove opt_background_thread_hpa_interval_max_ms
Now that HPA can communicate the time until its deferred work should be done,
this option is not used anymore.
2021-09-17 16:56:41 -07:00
Alex Lapenkou
8229cc77c5 Wake up background threads on demand
This change allows every allocator conforming to PAI communicate that it
deferred some work for the future. Without it if a background thread goes into
indefinite sleep, there is no way to notify it about upcoming deferred work.
2021-09-17 16:56:41 -07:00
Alex Lapenkou
97da57c13a HPA: Add min_purge_interval_ms option
This rate limiting option is required to avoid purging too often.
2021-09-17 16:56:41 -07:00
Alex Lapenkou
b8b8027f19 Allow PAI to calculate time until deferred work
Previously the calculation of sleep time between wakeups was implemented within
background_thread. This resulted in some parts of decay and hpa specific
logic mixing with background thread implementation. In this change, background
thread delegates this calculation to arena and it, in turn, delegates it to PAI.
The next step is to implement the actual calculation of time until deferred work
in HPA.
2021-09-17 16:56:41 -07:00
Qi Wang
8b24cb8fdf Don't assume initialized arena in the default alloc hook.
Specifically, this change allows the default alloc hook to used during
arenas.create.  One use case is to invoke the default alloc hook in a customized
hook arena, i.e. the default hooks can be read out of a default arena, then
create customized ones based on these hooks.  Note that mixing the default with
customized hooks is not recommended, and should only be considered when the
customization is simple and straightforward.
2021-08-25 14:19:25 -07:00
Qi Wang
5884a076fb Rename prof.dump_prefix to prof.prefix
This better aligns with our naming convention.  The option has not been included
in any upstream release yet.
2021-08-12 23:04:29 -07:00
Alex Lapenkou
f58064b932 Verify that HPA is used before calling its functions
This change eliminates the possibility of PA calling functions of uninitialized
HPA.
2021-08-05 16:43:28 -07:00
David Goldblatt
27f71242b7 Mutex: Tweak internal spin count.
The recent pairing heap optimizations flattened the lock hold time profile.
This was a win for raw cycle counts, but ended up causing us to "just miss"
acquiring the mutex before sleeping more often.  Bump those counts.
2021-08-05 14:33:16 -07:00
David Goldblatt
6f41ba55ee Mutex: Make spin count configurable.
Don't document it since we don't want to support this as a "real" setting, but
it's handy for testing.
2021-08-05 10:13:53 -07:00
David Goldblatt
dcb7b83fac Eset: Cache summary information for heap edatas.
This lets us do a single array scan to find first fits, instead of taking a
cache miss per examined size class.
2021-08-02 15:02:49 -07:00
David Goldblatt
252e0942d0 Eset: Pull per-pszind data into structs.
We currently have one for stats and one for the data.  The data struct is just a
wrapper around the edata_heap_t, but this will change shortly.
2021-08-02 15:02:49 -07:00
David Goldblatt
08a4cc0969 Pairing heap: inline functions instead of macros.
By force-inlining everything that would otherwise be a macro, we get the same
effect (it's not clear in the first place that this is actually a good idea, but
it avoids making any changes to the existing performance profile).

This makes the code more maintainable (in anticipation of subsequent changes),
as well as making performance profiles and debug info more readable (we get
"real" line numbers, instead of making everything point to the macro definition
of all associated functions).
2021-08-02 15:02:49 -07:00
David Goldblatt
92a1e38f52 edata_cache: Allow unbounded fast caching.
The edata_cache_small had a fill/flush heuristic.  In retrospect, this was a
premature optimization; more testing indicates that an unbounded cache is
effectively fine here, and moreover we spend a nontrivial amount of time doing
unnecessary filling/flushing.

As the HPA takes on a larger and larger fraction of all allocations, any
theoretical differences in allocation patterns should shrink.  The HPA is more
efficient with its metadata in general, so it still comes out ahead on metadata
usage anyways.
2021-07-26 15:14:37 -07:00
David Goldblatt
d93eef2f40 HPA: Introduce a redesigned hpa_central_t.
For now, this only handles allocating virtual address space to shards, with no
reuse.  This is framework, though; it will change over time.
2021-07-23 21:59:59 -07:00
David Goldblatt
e09eac1d4e Remove hpa_central.
This is now dead code.
2021-07-23 21:59:59 -07:00
Alex Lapenkou
aaea4fd1e6 Add more documentation to decay.c
It took me a while to understand why some things are implemented the way they
are, so hopefully it will help future readers.
2021-07-22 23:19:09 -07:00
Alex Lapenkou
4b633b9a81 Clean up background thread sleep computation
Isolate the computation of purge interval from background thread logic and
move into more suitable file.
2021-07-22 23:19:09 -07:00
David Goldblatt
6630c59896 HPA: Hugification hysteresis.
We wait a while after deciding a huge extent should get hugified to see if it
gets purged before long.  This avoids hugifying extents that might shortly get
dehugified for purging.

Rename and use the hpa_dehugification_threshold option support code for this,
since it's now ignored.
2021-07-12 17:59:18 -07:00
David Goldblatt
113938b6f4 HPA: Pull out a hooks type.
For now, this is a no-op change.  In a subsequent commit, it will be useful for
testing.
2021-07-12 17:59:18 -07:00
David Goldblatt
1d4a7666d5 HPA: Do deferred operations on background threads. 2021-07-12 17:59:18 -07:00
David Goldblatt
583284f2d9 Add HPA deferral functionality. 2021-07-12 17:59:18 -07:00
David Goldblatt
ace329d11b HPA batch dalloc: Just do one deferred work check.
We only need to do one check per batch dalloc, not one check per dalloc in the
batch.
2021-07-12 17:59:18 -07:00
David Goldblatt
47d8a7e6b0 psset: Purge empty slabs first.
These are particularly good candidates for purging (listed in the diff).
2021-07-12 17:59:18 -07:00
David Goldblatt
41fd56605e HPA: Purge across retained extents.
This lets us cut down on the number of expensive system calls we perform.
2021-07-12 17:59:18 -07:00
David Goldblatt
d202218e86 HPA: Fix typos with big performance implications.
This fixes two simple but significant typos in the HPA:
- The conf string parsing accidentally set a min value of PAGE for
  hpa_sec_batch_fill_extra; i.e. allocating 4096 extra pages every time we
  attempted to allocate a single page.  This puts us over the SEC flush limit,
  so we then immediately flush all but one of them (probably triggering
  purging).
- The HPA was using the default PAI batch alloc implementation, which meant it
  did not actually get any locking advantages.

This snuck by because I did all the performance testing without using the PAI
interface or config settings.  When I cleaned it up and put everything behind
nice interfaces, I only did correctness checks, and didn't try any performance
ones.
2021-06-24 16:26:55 -07:00
David Goldblatt
4452a4812f Add opt.experimental_infallible_new.
This allows a guarantee that operator new never throws.

Fix the .gitignore rules to include test/integration/cpp while we're here.
2021-06-24 12:22:51 -07:00
David Goldblatt
0689448b1e Travis: Unbreak the builds.
In the hopes of future-proofing as much as possible, jump to the latest
distribution Travis supports.
2021-06-24 07:40:28 -07:00
David Goldblatt
36c6bfb963 SEC: Allow arbitrarily many shards, cached sizes. 2021-05-22 08:17:41 -07:00
David Goldblatt
aea91b8c33 Clean up some minor data structure inconsistencies
Namely, unify the include guard styling with the majority of the project, and do
flat_bitmap -> fb, to match its naming convention.
2021-05-12 11:14:23 -07:00
David Goldblatt
1f688490e1 Stats: Fix a printing bug when hpa_dirty_mult = -1
Missed a layer of indirection.
2021-05-05 19:45:25 -07:00
David Goldblatt
4f7cb3a413 Sized deallocation: fix a typo.
dealloction -> deallocation.
2021-05-04 16:46:15 -07:00
Qi Wang
9b523c6c15 Refactor the locking in extent_recycle().
Hold the ecache lock across extent_recycle_extract() and extent_recycle_split(),
so that the extent_deactivate after split can avoid re-take the ecache mutex.
2021-03-31 14:42:33 -07:00
Qi Wang
ce68f326b0 Avoid the release & re-acquire of the ecache locks around the merge hook. 2021-03-31 14:42:33 -07:00
Qi Wang
7dc77527ba Delete the mutex_pool module. 2021-03-29 17:19:53 -07:00
Qi Wang
03d95cba88 Remove the unnecessary arena_ind_set in base_alloc_edata().
All edata alloc sites are already followed with proper edata_init().
2021-03-29 17:19:53 -07:00
Qi Wang
3093d9455e Move the edata mergeability related functions to extent.h. 2021-03-29 17:19:53 -07:00
Qi Wang
7c964b0352 Add rtree_write_range(): writing the same content to multiple leaf elements.
Apply to emap_(de)register_interior which became noticeable in perf profiles.
2021-03-29 17:19:53 -07:00
Qi Wang
add636596a Stop checking head state in the merge hook.
Now that all merging go through try_acquire_edata_neighbor, the mergeablility
checks (including head state checking) are done before reaching the merge hook.
In other words, merge hook will never be called if the head state doesn't agree.
2021-03-29 17:19:53 -07:00
Qi Wang
49b7d7f0a4 Passing down the original edata on the expand path.
Instead of passing down the new_addr, pass down the active edata which allows us
to always use a neighbor-acquiring semantic.  In other words, this tells us both
the original edata and neighbor address.  With this change, only neighbors of a
"known" edata can be acquired, i.e. acquiring an edata based on an arbitrary
address isn't possible anymore.
2021-03-29 17:19:53 -07:00
Qi Wang
1784939688 Use rtree tracked states to protect edata outside of ecache locks.
This avoids the addr-based mutexes (i.e. the mutex_pool), and instead relies on
the metadata tracked in rtree leaf: the head state and extent_state.  Before
trying to access the neighbor edata (e.g. for coalescing), the states will be
verified first -- only neighbor edatas from the same arena and with the same
state will be accessed.
2021-03-29 17:19:53 -07:00
Qi Wang
4d8c22f9a5 Store edata->state in rtree leaf and make edata_t 128B aligned.
Verified that this doesn't result in any real increase of edata_t bytes
allocated.
2021-03-29 17:19:53 -07:00
Qi Wang
70d1541c5b Track extent is_head state in rtree leaf. 2021-03-29 17:19:53 -07:00
Qi Wang
862219e461 Add quiescence sync before deleting base during arena_destroy. 2021-03-29 17:19:53 -07:00
Qi Wang
61afb6a405 Fix locking on arena_i_destroy_ctl().
The ctl_mtx should be held to protect against concurrent arenas.create.
2021-03-22 23:18:52 -07:00
Qi Wang
3913077146 Mark head state during dss alloc.
Specifically, the extent_dalloc_gap relies on the correct head state to
coalesce.
2021-03-12 19:17:25 -08:00
Qi Wang
22be724af4 Set is_head in extent_alloc_wrapper w/ retain.
When retain is on, when extent_grow_retained failed (e.g. due to split hook
failures), we'll try extent_alloc_wrapper as the last resort.  Set the is_head
bit in that case to be consistent.  The allocated extent in that case will be
retained properly, but not merged with other extents.
2021-03-12 10:20:08 -08:00
David Goldblatt
73ca4b8ef8 HPA: Use dirtiest-first purging.
This seems to be practically beneficial, despite some pathological corner cases.
2021-02-19 15:10:54 -08:00
David Goldblatt
0f6c420f83 HPA: Make purging/hugifying more principled.
Before this change, purge/hugify decisions had several sharp edges that could
lead to pathological behavior if tuning parameters weren't carefully chosen.
It's the first of a series; this introduces basic "make every hugepage with
dirty pages purgeable" functionality, and the next commit expands that
functionality to have a smarter policy for picking hugepages to purge.

Previously, the dehugify logic would *never* dehugify a hugepage unless it was
dirtier than the dehugification threshold.  This can lead to situations in which
these pages (which themselves could never be purged) would push us above the
maximum allowed dirty pages in the shard.  This forces immediate purging of any
pages deallocated in non-hugified hugepages, which in turn places nonobvious
practical limitations on the relationships between various config settings.

Instead, we make our preference not to dehugify to purge a soft one rather than
a hard one.  We'll avoid purging them, but only so long as we can do so by
purging non-hugified pages.  If we need to purge them to satisfy our dirty page
limits, or to hugify other, more worthy candidates, we'll still do so.
2021-02-19 15:10:54 -08:00
David Goldblatt
6bddb92ad6 psset: Rename "bitmap" to "pageslab_bitmap".
It tracks pageslabs.  Soon, we'll have another bitmap (to track dirty pages)
that we want to disambiguate.

While we're here, fix an out-of-date comment.
2021-02-19 15:10:54 -08:00
David Goldblatt
154aa5fcc1 Use the flat bitmap for eset and psset bitmaps.
This is simpler (note that the eset field comment was actually incorrect!), and
slightly faster.
2021-02-19 15:10:54 -08:00