Commit Graph

1599 Commits

Author SHA1 Message Date
David Goldblatt
ff4086aa6b hpdata: count active pages instead of free ones.
This will be more consistent with later naming choices.
2021-02-04 20:58:31 -08:00
David Goldblatt
3624dd42ff hpdata: Add a comment for hpdata_consistent. 2021-02-04 20:58:31 -08:00
David Goldblatt
20140629b4 Bin: Move stats closer to the mutex.
This is a slight cache locality optimization.
2021-02-04 14:10:43 -08:00
David Goldblatt
c259323ab3 Use ticker_geom_t for arena tcache decay. 2021-02-04 14:10:43 -08:00
David Goldblatt
8edfc5b170 Add ticker_geom_t.
This lets a single ticker object drive events across a large number of different
tick streams while sharing state.
2021-02-04 14:10:43 -08:00
David Goldblatt
3967329813 Arena: share bin offsets in a global.
This saves us a cache miss when lookup up the arena bin offset in a remote
arena during tcache flush.  All arenas share the base offset, and so we don't
need to look it up repeatedly for each arena.  Secondarily, it shaves 288 bytes
off the arena on, e.g., x86-64.
2021-02-04 14:10:43 -08:00
David Goldblatt
2fcbd18115 Cache bin: Don't reverse flush order.
The items we pick to flush matter a lot, but the order in which they get flushed
doesn't; just use forward scans.  This simplifies the accessing code, both in
terms of the C and the generated assembly (i.e. this speeds up the flush
pathways).
2021-02-04 14:10:43 -08:00
David Goldblatt
4c46e11365 Cache an arena's index in the arena.
This saves us a pointer hop down some perf-sensitive paths.
2021-02-04 14:10:43 -08:00
David Goldblatt
229994a204 Tcache flush: keep common path state in registers.
By carefully force-inlining the division constants and the operation sum count,
we can eliminate redundant operations in the arena-level dalloc function.  Do
so.
2021-02-04 14:10:43 -08:00
David Goldblatt
31a629c3de Tcache flush: prefetch edata contents.
This frontloads more of the miss latency.  It also moves it to a pathway where
we have not yet acquired any locks, so that it should (hopefully) reduce hold
times.
2021-02-04 14:10:43 -08:00
David Goldblatt
9f9247a62e Tcache fluhing: increase cache miss parallelism.
In practice, many rtree_leaf_elm accesses are cache misses.  By restructuring,
we can make it more likely that these misses occur without blocking us from
starting later lookups, taking more of those misses in parallel.
2021-02-04 14:10:43 -08:00
David Goldblatt
181ba7fd4d Tcache flush: Add an emap "batch lookup" path.
For now this is a no-op; but the interface is a little more flexible for our
purposes.
2021-02-04 14:10:43 -08:00
David CARLIER
35a8552605 Mac OS: Tag mapped pages.
This can be used to help profiling tools (e.g. vmmap) identify the
sources of mappings more specifically.
2021-02-03 15:05:53 -08:00
Azat Khuzhin
a943172b73 Add runtime detection for MADV_DONTNEED zeroes pages (mostly for qemu)
qemu does not support this, yet [1], and you can get very tricky assert
if you will run program with jemalloc in use under qemu:

    <jemalloc>: ../contrib/jemalloc/src/extent.c:1195: Failed assertion: "p[i] == 0"

  [1]: https://patchwork.kernel.org/patch/10576637/

Here is a simple example that shows the problem [2]:

    // Gist to check possible issues with MADV_DONTNEED
    // For example it does not supported by qemu user
    // There is a patch for this [1], but it hasn't been applied.
    //   [1]: https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg05422.html

    #include <sys/mman.h>
    #include <stdio.h>
    #include <stddef.h>
    #include <assert.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        void *addr = mmap(NULL, 1<<16, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
        if (addr == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        memset(addr, 'A', 1<<16);

        if (!madvise(addr, 1<<16, MADV_DONTNEED)) {
            puts("MADV_DONTNEED does not return error. Check memory.");
            for (int i = 0; i < 1<<16; ++i) {
                assert(((unsigned char *)addr)[i] == 0);
            }
        } else {
            perror("madvise");
        }

        if (munmap(addr, 1<<16)) {
            perror("munmap");
            return 1;
        }

        return 0;
    }

  ### unpatched qemu

      $ qemu-x86_64-static /tmp/test-MADV_DONTNEED
      MADV_DONTNEED does not return error. Check memory.
      test-MADV_DONTNEED: /tmp/test-MADV_DONTNEED.c:19: main: Assertion `((unsigned char *)addr)[i] == 0' failed.
      qemu: uncaught target signal 6 (Aborted) - core dumped
      Aborted (core dumped)

  ### patched qemu (by returning ENOSYS error)

      $ qemu-x86_64 /tmp/test-MADV_DONTNEED
      madvise: Success

  ### patch for qemu to return ENOSYS

      diff --git a/linux-user/syscall.c b/linux-user/syscall.c
      index 897d20c076..5540792e0e 100644
      --- a/linux-user/syscall.c
      +++ b/linux-user/syscall.c
      @@ -11775,7 +11775,7 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
                  turns private file-backed mappings into anonymous mappings.
                  This will break MADV_DONTNEED.
                  This is a hint, so ignoring and returning success is ok.  */
      -        return 0;
      +        return ENOSYS;
       #endif
       #ifdef TARGET_NR_fcntl64
           case TARGET_NR_fcntl64:

  [2]: https://gist.github.com/azat/12ba2c825b710653ece34dba7f926ece

v2:
- review fixes
- add opt_dont_trust_madvise
v3:
- review fixes
- rename opt_dont_trust_madvise to opt_trust_madvise
2021-01-20 20:08:30 -08:00
David Goldblatt
a011c4c22d cache_bin: Separate out local and remote accesses.
This fixes an incorrect debug-mode assert:
- T1 starts an arena stats update and reads stack_head from another thread's
  cache bin, when that cache bin has 1 item in it.
- T2 allocates from that cache bin.  The cache_bin's stack_head now points to a
  NULL pointer, since the cache bin is empty.
- T1 Re-reads the cache_bin's stack_head to perform an assertion check (since it
  previously saw that the bin was empty, whatever stack_head points to should be
  non-NULL).
2021-01-08 14:18:08 -08:00
Yinan Zhang
14d689c0f9 Add prof stats mutex stats 2021-01-07 20:39:49 -08:00
Yinan Zhang
40fa4d29d3 Track per size class internal fragmentation 2021-01-07 20:39:49 -08:00
Yinan Zhang
afa489c3c5 Record request size in prof info 2021-01-07 20:39:49 -08:00
David Goldblatt
a9fa2defdb Add JEMALLOC_COLD, and mark some functions cold.
This hints to the compiler that it should care more about space than CPU (among
other things).  In cases where the compiler lacks profile-guided information,
this can be a substantial space savings.

For now, we mark the mallctl or atexit driven profiling and stats functions that
take up the most space.
2021-01-04 14:55:49 -08:00
Yinan Zhang
b35ac00d58 Do not bump to large size for page aligned request 2020-12-29 17:09:58 -08:00
Yinan Zhang
8a56d6b636 Add last-N mutex stats 2020-12-29 09:44:19 -08:00
Yinan Zhang
74bd63b203 Optimize stats print using partial name-to-mib 2020-12-18 10:39:58 -08:00
Yinan Zhang
4557c0a67d Enable ctl on partial mib and partial name 2020-12-18 10:39:58 -08:00
Yinan Zhang
006dd0414e Add partial name-to-mib functionality 2020-12-18 10:39:58 -08:00
Jin Qian
26c1dc5a3a Support AutoConf for posix_madvise and POSIX_MADV_DONTNEED 2020-12-18 10:05:59 -08:00
Jin Qian
96a59c3bb5 Fix recursive malloc during bootstrap on QNX
pthread_key_create on QNX triggers recursive allocation during tsd
bootstrapping. Using tsd_init_check_recursion to detect that.

Before pthread_key_create, the address of tsd_boot_wrapper is returned
from tsd_get_wrapper instead of using TLS to store the pointer.
tsd_set_wrapper becomes a no-op. After that, the address of
tsd_boot_wrapper is written to TLS and bootstrap continues as before.

Signed-off-by: Jin Qian <jqian@aurora.tech>
2020-12-18 10:05:59 -08:00
David Goldblatt
1e3b8636ff HPA: Remove unused malloc_conf options. 2020-12-08 12:10:48 -08:00
David Goldblatt
a559caf74a hpdata: Strengthen assertions.
Now that we have flat bitmap bit counting functions, we can easily assert that
nfree is always correct.  While we're tightening up this code, enforce
consistency on API boundaries as well.
2020-12-07 06:21:08 -08:00
David Goldblatt
54c94c1679 flat bitmap: add scount / ucount functions.
These can compute the number or set or unset bits in a subrange of the bitmap.
2020-12-07 06:21:08 -08:00
David Goldblatt
e6c057ad35 fb: implement assign in terms of a visitor.
We'll reuse this visitor in the next commit.
2020-12-07 06:21:08 -08:00
David Goldblatt
734e72ce8f bit_util: Guarantee popcount's presence.
Implement popcount generically, so that we can rely on it being present.
2020-12-07 06:21:08 -08:00
David Goldblatt
d9f7e6c668 hpdata: Add a test.
We're about to make the functionality here more complicated; testing hpdata
directly (rather than relying on user's tests) will make debugging easier.
2020-12-07 06:21:08 -08:00
David Goldblatt
3ed0b4e8a3 HPA: Add an nevictions counter.
I.e. the number of times we've purged a hugepage-sized region.
2020-12-07 06:21:08 -08:00
David Goldblatt
f7cf23aa4d psset: Relegate alloc/dalloc to test code.
This is no longer part of the "core" functionality; we only need the stub
implementations as an end-to-end test of hpdata + psset interactions when
metadata is being modified.  Treat them accordingly.
2020-12-07 06:21:08 -08:00
David Goldblatt
0971e1e4e3 hpdata: Use addr/size instead of begin/npages.
This is easier for the users of the hpdata.
2020-12-07 06:21:08 -08:00
David Goldblatt
5228d869ee psset: Use fit/insert/remove as basis functions.
All other functionality can be implemented in terms of these; doing so (while
retaining the same API) will be convenient for subsequent refactors.
2020-12-07 06:21:08 -08:00
David Goldblatt
089f8fa442 Move hpdata bitmap logic out of the psset. 2020-12-07 06:21:08 -08:00
David Goldblatt
ca30b5db2b Introduce hpdata_t.
Using an edata_t both for hugepages and the allocations within those hugepages
was convenient at first, but has outlived its usefulness.  Representing
hugepages explicitly, with their own data structure, will make future
development easier.
2020-12-07 06:21:08 -08:00
David Goldblatt
43af63fff4 HPA: Manage whole hugepages at a time.
This redesigns the HPA implementation to allow us to manage hugepages all at
once, locally, without relying on a global fallback.
2020-12-07 06:21:08 -08:00
David Goldblatt
63677dde63 Pages: Statically detect if pages_huge may succeed 2020-12-07 06:21:08 -08:00
David Goldblatt
c1b2a77933 psset: Move in stats.
A later change will benefit from having these functions pulled into a
psset-module set of functions.
2020-12-07 06:21:08 -08:00
David Goldblatt
d0a991d47b psset: Add insert/remove functions.
These will allow us to (for instance) move pageslabs from a psset dedicated to
not-yet-hugeified pages to one dedicated to hugeified ones.
2020-12-07 06:21:08 -08:00
David Goldblatt
ecd39418ac Add fxp: A fixed-point math library.
This will be used in the next commit to allow non-integer values for
narenas_ratio.
2020-12-04 23:48:19 -08:00
David Carlier
520b75fa2d utrace support with label based signature. 2020-11-30 11:43:00 -08:00
Yinan Zhang
be5e49f4fa Add a batch mode for cache_bin_alloc() 2020-11-16 20:58:01 -08:00
Yinan Zhang
566c4a8594 Slight changes to cache bin internal functions 2020-11-16 20:58:01 -08:00
David Goldblatt
cf2549a149 Add a per-arena oversize_threshold.
This can let manual arenas trade off memory and CPU the way auto arenas do.
2020-11-13 13:45:35 -08:00
David Goldblatt
4ca3d91e96 Rename geom_grow -> exp_grow.
This was promised in the review of the introduction of geom_grow, but would have
been painful to do there because of the series that introduced it.  Now that
those are comitted, renaming is easier.
2020-11-13 13:42:33 -08:00
David Goldblatt
b4c37a6e81 Rename edata_tree_t -> edata_avail_t.
This isn't a tree any more, and it mildly irritates me any time I see it.
2020-11-13 13:42:11 -08:00
David Carlier
95f0a77fde Detect pthread_getname_np explicitly.
At least one libc (musl) defines pthread_setname_np without defining
pthread_getname_np. Detect the presence of each individually, rather than
inferring both must be defined if set is.
2020-11-11 17:31:22 -08:00
David Goldblatt
589638182a Use the edata_cache_small_t in the HPA. 2020-11-05 12:34:43 -08:00
David Goldblatt
03a6047111 Edata cache small: rewrite.
In previous designs, this was intended to be a sort of cache that couldn't fail.
In the current design, we want to use it just as a contention reduction
mechanism.  Rewrite it with those goals in mind.
2020-11-05 12:34:43 -08:00
David Goldblatt
1b3ee75667 Add experimental.thread.activity_callback.
This (experimental, undocumented) functionality can be used by users to track
various statistics of interest at a finer level of granularity than the thread.
2020-11-05 12:33:25 -08:00
David Carlier
d2d941017b MADV_DO[NOT]DUMP support equivalence on FreeBSD. 2020-11-02 09:15:15 -08:00
DC
ef6d51ed44 DragonFlyBSD build support. 2020-10-27 12:35:19 -07:00
Qi Wang
bf72188f80 Allow opt.tcache_max to accept small size classes.
Previously all the small size classes were cached.  However this has downsides
-- particularly when page size is greater than 4K (e.g. iOS), which will result
in much higher SMALL_MAXCLASS.

This change allows tcache_max to be set to lower values, to better control
resources taken by tcache.
2020-10-24 20:43:44 -07:00
David Goldblatt
ea32060f9c SEC: Implement thread affinity.
For now, just have every thread pick a shard once and stick with it.
2020-10-23 11:14:34 -07:00
David Goldblatt
d16849c91d psset: Do first-fit based on slab age.
This functions more like the serial number strategy of the ecache and
hpa_central_t.  Longer-lived slabs are more likely to continue to live for
longer in the future.
2020-10-23 11:14:34 -07:00
David Goldblatt
634ec6f50a Edata: add an "age" field. 2020-10-23 11:14:34 -07:00
David Goldblatt
6599651aee PA: Use an SEC in fron of the HPA shard. 2020-10-23 11:14:34 -07:00
David Goldblatt
ea51e97bb8 Add SEC module: a small extent cache.
This can be used to take pressure off a more centralized, worse-sharded
allocator without requiring a full break of the arena abstraction.
2020-10-23 11:14:34 -07:00
David Goldblatt
1964b08394 HPA: Add stats for the hpa_shard. 2020-10-23 11:14:34 -07:00
David Goldblatt
534504d4a7 HPA: add size-exclusion functionality.
I.e. only allowing allocations under or over certain sizes.
2020-10-23 11:14:34 -07:00
David Goldblatt
484f04733e HPA: Add central mutex contention stats. 2020-10-23 11:14:34 -07:00
David Goldblatt
bf025d2ec8 HPA: Make slab sizes and maxes configurable.
This allows easy experimentation with them as tuning parameters.
2020-10-23 11:14:34 -07:00
David Goldblatt
1c7da33317 HPA: Tie components into a PAI implementation. 2020-10-23 11:14:34 -07:00
Qi Wang
c8209150f9 Switch from opt.lg_tcache_max to opt.tcache_max
Though for convenience, keep parsing lg_tcache_max.
2020-10-22 20:40:41 -07:00
Qi Wang
5e41ff9b74 Add a hard limit on tcache max size class.
For locality reasons, tcache bins are integrated in TSD.  Allowing all size
classes to be cached has little benefit, but takes up much thread local storage.
In addition, it complicates the layout which we try hard to optimize.
2020-10-16 13:49:51 -07:00
Qi Wang
3de19ba401 Eagerly detect double free and sized dealloc bugs for large sizes. 2020-10-15 10:03:16 -07:00
David Goldblatt
21b70cb540 Add hpa_central module
This will be the centralized component of the coming hugepage allocator; the
source of larger chunks of memory from which smaller ones can be obtained.
2020-10-05 19:55:57 -07:00
David Goldblatt
1ed7ec369f Emap: Add emap_assert_not_mapped.
The counterpart to emap_assert_mapped, it lets callers check that some edata is
not already in the emap.
2020-10-05 19:55:57 -07:00
David Goldblatt
9e6aa77ab9 PRNG: Remove atomic functionality.
These had no uses and complicated the API.  As a rule we now expect to only use
thread-local randomization for contention-reduction reasons, so we only pay the
API costs and never get the functionality benefits.
2020-10-05 19:55:57 -07:00
David Goldblatt
0513047170 PRNG: Allow a a range argument of 1.
This is convenient when the range argument itself is generated from some
computation whose value we don't know in advance.
2020-10-05 19:55:57 -07:00
David Goldblatt
259c5e3e8f psset: Add stats 2020-09-18 12:39:25 -07:00
David Goldblatt
018b162d67 Add psset: a set of pageslabs.
This introduces a new sort of edata_t; a pageslab, and a set to manage them.
This is part of a series of a commits to implement a hugepage allocator; the
pageset will be per-arena, and track small page allocations requests within a
larger extent allocated from a centralized hugepage allocator.
2020-09-18 12:39:25 -07:00
David Goldblatt
ed99d300b9 Flat bitmap: Add longest-range computation.
This will come in handy in the (upcoming) page-slab set assertions.
2020-09-18 12:39:25 -07:00
David Goldblatt
e034500698 Edata: rename "ranged" bit to "pai".
This better represents its intended purpose; the hugepage allocator design
evolved away from needing contiguity of hugepage virtual address space.
2020-09-18 12:39:25 -07:00
David Goldblatt
7ad2f78663 Avoid a -Wundef warning on LG_SLAB_MAXREGS. 2020-09-17 10:05:40 -07:00
Hao Liu
1541ffc765 configure: add --with-lg-slab-maxregs configure option.
Specify the maximum number of regions in a slab, which is
(<lg-page> - <lg-tiny-min>) by default. This increases the limit of slab sizes
specified by "slab_sizes" in malloc_conf. This should never be less than
the default value. The max value of this option is related to LG_BITMAP_MAXBITS
(see more in bitmap.h).

For example, on a 4k page size system, if we:
  1) configure jemalloc with with --with-lg-slab-maxregs=12.
  2) export MALLOC_CONF="slab_sizes:9-16:4"
The slab size of 16 bytes is set to 4 pages. Previously, the default
lg-slab-maxregs is 9 (i.e. 12 - 3). The max slab size of 16 bytes is 2 pages
(i.e. (1<<9) * 16 bytes). By increasing the value from 9 to 12, the max slab
size can be set by MALLOC_CONF is 16 pages (i.e. (1<<12) * 16 bytes).
2020-09-16 13:58:38 -07:00
Yinan Zhang
b549389e4a Correct usize in prof last-N record 2020-09-09 13:31:35 -07:00
Yinan Zhang
866231fc61 Do not repeat reentrancy test in profiling 2020-08-25 16:49:32 -07:00
Yinan Zhang
20f2479ed7 Do not create size class tables for non-prof builds 2020-08-24 20:10:02 -07:00
Yinan Zhang
8efcdc3f98 Move unbias data to prof_data 2020-08-24 20:10:02 -07:00
David Goldblatt
5e90fd006e Geom_grow: Don't keep the mutex internal.
We're about to use it in ways that will have external synchronization.
2020-08-19 16:53:21 -07:00
David Goldblatt
c57494879f Geom_grow: Don't take tsdn at init.
It's never used.
2020-08-19 16:53:21 -07:00
David Goldblatt
ffe552223c Geom_grow: Move in advancing logic. 2020-08-19 16:53:21 -07:00
David Goldblatt
131b1b5338 Rename ecache_grow -> geom_grow.
We're about to start using it outside of the ecaches, in the HPA central
allocator.
2020-08-19 16:53:21 -07:00
David Goldblatt
9e18ae639f Config: safety checks don't imply size checks.
The commit introducing size checks accidentally enabled them whenever any safety
checks were on.  This ends up causing the regression that splitting up the
features was intended to avoid.  Fix the issue.
2020-08-12 13:00:19 -07:00
David Goldblatt
eaed1e39be Add sized-delete size-checking functionality.
The existing checks are good at finding such issues (on tcache flush), but not
so good at pinpointing them.  Debug mode can find them, but sometimes debug mode
slows down a program so much that hard-to-hit bugs can take a long time to
crash.

This commit adds functionality to keep programs mostly on their fast paths,
while also checking every sized delete argument they get.
2020-08-05 19:34:05 -07:00
David Goldblatt
60993697d8 Prof: Add prof_unbias.
This gives more accurate attribution of bytes and counts to stack traces,
without introducing backwards incompatibilities in heap-profile parsing tools.
We track the ideal reported (to the end user) number of bytes more carefully
inside core jemalloc.  When dumping heap profiles, insteading of outputting our
counts directly, we output counts that will cause parsing tools to give a result
close to the value we want.

We retain the old version as an opt setting, to let users who are tracking
values on a per-component basis to keep their metrics stable until they decide
to switch.
2020-08-05 18:33:55 -07:00
David Goldblatt
81c2f841e5 Add a simple utility to detect profiling bias. 2020-08-05 18:33:55 -07:00
Yinan Zhang
978f830ee3 Add batch allocation API 2020-07-31 09:16:50 -07:00
Yinan Zhang
c6f59e9bb4 Add surplus reading API for thread event lookahead 2020-07-31 09:16:50 -07:00
Yinan Zhang
f805468957 Add zero option to arena batch allocation 2020-07-31 09:16:50 -07:00
Yinan Zhang
49e5c2fe7d Add batch allocation from fresh slabs 2020-07-31 09:16:50 -07:00
Yinan Zhang
2bb8060d57 Add empty test and concat for typed list 2020-07-31 09:16:50 -07:00
Yinan Zhang
f28cc2bc87 Extract bin shard selection out of bin locking 2020-07-31 09:16:50 -07:00
David Goldblatt
ddb8dc4ad0 FB: Add range iteration support. 2020-07-30 15:25:23 -07:00
David Goldblatt
ceee823519 Add flat_bitmap.
The flat_bitmap module offers an extended API, at the cost of decreased
performance in the case of very large bitmaps.
2020-07-30 15:25:23 -07:00
David Goldblatt
22da836094 bit_util: Add fls_ functions; "find last set".
These simplify a lot of the bit_util module, which had grown bits and pieces of
this functionality across a variety of places over the years.

While we're here, kill off BIT_UTIL_INLINE and don't do reentrancy testing for
bit_util.
2020-07-30 15:25:23 -07:00
David Goldblatt
1ed0288d9c bit_util: Change ffs functions indexing.
Making these 0-based instead of 1-based makes calling code simpler and will be
more consistent with functions introduced in subsequent diffs.
2020-07-30 15:25:23 -07:00
David Goldblatt
6107857b7b PA->PAC: Move in PAI implementation. 2020-07-09 13:41:04 -07:00
David Goldblatt
6041aaba97 PA -> PAC: Move in destruction functions. 2020-07-09 13:41:04 -07:00
David Goldblatt
471eb5913c PAC: Move in decay rate setting. 2020-07-09 13:41:04 -07:00
David Goldblatt
6a2774719f PA->PAC: Move in decay functions. 2020-07-09 13:41:04 -07:00
David Goldblatt
4ee75be3a3 PA -> PAC: Move in decay_purge enum. 2020-07-09 13:41:04 -07:00
David Goldblatt
72435b0aba PA->PAC: Make extent.c forget about PA. 2020-07-09 13:41:04 -07:00
David Goldblatt
dee5d1c42d PA->PAC: Move in extent_sn. 2020-07-09 13:41:04 -07:00
David Goldblatt
7391382349 PA->PAC: Move in stats. 2020-07-09 13:41:04 -07:00
David Goldblatt
db211eefbf PAC: Move in decay. 2020-07-09 13:41:04 -07:00
David Goldblatt
c81e389996 PAC: Move in ecache_grow. 2020-07-09 13:41:04 -07:00
David Goldblatt
65803171a7 PAC: move in emap 2020-07-09 13:41:04 -07:00
David Goldblatt
7efcb946c4 PAC: Add an init function. 2020-07-09 13:41:04 -07:00
David Goldblatt
722652222a PAC: Move in edata_cache accesses. 2020-07-09 13:41:04 -07:00
David Goldblatt
777b0ba965 Add PAC: Page allocator classic.
For now, this is just a stub containing the ecaches, with no surrounding code
changed.  Eventually all the core allocator bits will be moved in, in the
subsequent stack of commits.
2020-07-09 13:41:04 -07:00
David Goldblatt
1b5f632e0f Introduce PAI: Page allocator interface 2020-07-09 13:41:04 -07:00
David Goldblatt
3cf19c6e5e atomic: add atomic_load_sub_store 2020-07-09 13:41:04 -07:00
David Goldblatt
ae541d3fab Edata: Reserve some space for hugepages. 2020-07-08 13:20:59 -07:00
David Goldblatt
392f645f4d Edata: split up different list linkage uses. 2020-07-08 13:20:59 -07:00
David Goldblatt
129b727058 Add typed-list module.
This gives some named convenience wrappers.
2020-07-08 13:20:59 -07:00
David Carlier
00f06c9beb enabling mpss on solaris/illumos.
reusing slighty linux configuration as possible, aligning the
 address range to HUGEPAGE.
2020-07-06 09:59:10 -07:00
Yinan Zhang
c2e7a06392 No need to intercept prof_dump_header() in tests 2020-06-29 14:27:50 -07:00
Yinan Zhang
f58ebdff7a Generalize prof_cnt_all() for testing 2020-06-29 14:27:50 -07:00
Yinan Zhang
d4259ea53b Simplify signatures for prof dump functions 2020-06-29 14:27:50 -07:00
Yinan Zhang
1f5fe3a3e3 Pass write callback explicitly in prof_data 2020-06-29 14:27:50 -07:00
Yinan Zhang
dad821bb22 Move unwind to prof_sys 2020-06-29 14:27:50 -07:00
Yinan Zhang
d128efcb6a Relocate a few prof utilities to the right modules 2020-06-29 14:27:50 -07:00
Yinan Zhang
4736fb4fc9 Move file handling logic in prof_data to prof_sys 2020-06-29 14:27:50 -07:00
Yinan Zhang
767a2e1790 Move file handling logic in prof to prof_sys 2020-06-29 14:27:50 -07:00
Yinan Zhang
03ae509f32 Create prof_sys module for reading system thread name 2020-06-29 14:27:50 -07:00
Yinan Zhang
adfd9d7b1d Change tsdn to tsd for thread name allocation 2020-06-29 14:27:50 -07:00
Yinan Zhang
841af2b426 Move thread name handling to prof_data module 2020-06-29 14:27:50 -07:00
Yinan Zhang
8118056c03 Expose prof_data testing internals only in prof tests 2020-06-29 14:27:50 -07:00
Yinan Zhang
f43ac8543e Correct prof header macro namings 2020-06-29 14:27:50 -07:00
Yinan Zhang
5d292b5660 Push error handling logic out of core dumping logic 2020-06-29 14:27:50 -07:00
Yinan Zhang
f541871f5d Reduce prof dump buffer size in debug build 2020-06-29 14:27:50 -07:00
Yinan Zhang
354183b10d Define prof dump buffer size centrally 2020-06-29 14:27:50 -07:00
Yinan Zhang
7455813e57 Make dump file writing replaceable in test 2020-06-29 14:27:50 -07:00
Yinan Zhang
21e44c45d9 Make maps file opening replaceable in test 2020-06-29 14:27:50 -07:00
Yinan Zhang
f307b25804 Only replace the dump file opening function in test 2020-06-29 14:27:50 -07:00
Yinan Zhang
d460333efb Improve naming for prof system thread name option 2020-06-24 14:32:01 -07:00
David T. Goldblatt
25e43c6022 Witness: Make ranks an enum.
This lets us avoid having to increment a bunch of values manually every time we
add a new sort of lock.
2020-06-19 18:05:08 -07:00
Yinan Zhang
b7858abfc0 Expose prof testing internal functions 2020-06-19 09:16:51 -07:00
Jon Haslam
4aea743279 High Resolution Timestamps for Profiling 2020-06-15 12:12:49 -07:00
David Goldblatt
d82a164d0d Add thread.peak.[read|reset] mallctls.
These can be used to track net allocator activity on a per-thread basis.
2020-06-11 13:54:22 -07:00
David Goldblatt
fe7108305a Add peak_t, for tracking allocator net max. 2020-06-11 13:54:22 -07:00
Yinan Zhang
3e19ebd2ea Add lock to protect prof last-N dumping 2020-06-09 17:03:05 -07:00
Yinan Zhang
857ebd3daf Make edata pointer on prof recent record an atomic fence 2020-06-09 17:03:05 -07:00
David Goldblatt
6cdac3c573 Tcache: Make flush fractions configurable. 2020-05-16 13:34:23 -07:00
David Goldblatt
ee72bf1cfd Tcache: Add tcache gc delay option.
This can reduce flushing frequency for small size classes.
2020-05-16 13:34:23 -07:00
David Goldblatt
d338dd45d7 Tcache: Make incremental gc bytes configurable. 2020-05-16 13:34:23 -07:00
David Goldblatt
ec0b579563 Tcache: Privatize opt_lg_tcache_max default. 2020-05-16 13:34:23 -07:00
David Goldblatt
10b96f6351 Tcache: Remove some unused gc constants. 2020-05-16 13:34:23 -07:00
David Goldblatt
181093173d Tcache: make slot sizing configurable. 2020-05-16 13:34:23 -07:00
David Goldblatt
b58dea8d1b Cache bin: expose ncached_max publicly. 2020-05-16 13:34:23 -07:00
David Goldblatt
634afc4124 Tcache: Make size computation configurable. 2020-05-16 13:34:23 -07:00
Brooks Davis
27f29e424b LQ_QUANTUM should be 4 on mips64 hardware.
This matches the ABI stack alignment requirements.
2020-05-14 10:30:37 -07:00
David Goldblatt
eda9c2858f Edata: zero stack edatas before initializing.
This avoids some UB. No compilers take advantage of it for now, but no sense in
tempting fate.
2020-05-14 10:30:20 -07:00
Yinan Zhang
dcea2c0f8b Get rid of TSD -> thread event dependency 2020-05-12 09:16:16 -07:00
Yinan Zhang
b06dfb9ccc Push event handlers to constituent modules 2020-05-12 09:16:16 -07:00
Yinan Zhang
abd4674931 Extract out per event postponed wait time fetching 2020-05-12 09:16:16 -07:00
Yinan Zhang
f72014d097 Only compute thread event threshold once per trigger 2020-05-12 09:16:16 -07:00
Yinan Zhang
6de77799de Move thread event wait time update to local 2020-05-12 09:16:16 -07:00
Yinan Zhang
733ae918f0 Extract out per event new wait time fetching 2020-05-12 09:16:16 -07:00
Yinan Zhang
1e2524e15a Do not reset sample wait time when re-initing tdata 2020-05-12 09:16:16 -07:00
Yinan Zhang
855d20f6f3 Remove outdated comments in thread event 2020-05-12 09:16:16 -07:00
Yinan Zhang
fc052ff728 Migrate counter to use locked int 2020-05-12 08:23:15 -07:00
Yinan Zhang
b543c20a94 Minor update to locked int 2020-05-12 08:23:15 -07:00
Yinan Zhang
f533ab6da6 Add forking handling for stats 2020-05-11 15:35:06 -07:00
Yinan Zhang
4d970f8bfc Add forking handling for counter module 2020-05-11 15:35:06 -07:00
Yinan Zhang
2097e1945b Unify write callback signature 2020-05-11 14:51:24 -07:00
Yinan Zhang
fef9abdcc0 Cleanup tcache allocation logic
The logic in tcache allocation no longer involves profiling or
filling.
2020-05-11 12:24:56 -07:00
Yinan Zhang
e6cb6919c0 Consolidate prof inline function headers
The prof inline functions are no longer involved in a circular
dependency, so consolidate the two headers into one.
2020-05-11 12:24:56 -07:00
Yinan Zhang
d454af90f1 Remove unused prof_accum field from arena 2020-05-11 12:24:56 -07:00
Yinan Zhang
8be5584494 Initialize prof idump counter once rather than once per arena 2020-05-11 12:24:56 -07:00
Yinan Zhang
e10e5059e8 Make prof_idump_accum() non-inline 2020-05-11 12:24:56 -07:00
Yinan Zhang
039bfd4e30 Do not rollback prof idump counter in arena_prof_promote() 2020-05-11 12:24:56 -07:00
David Goldblatt
46471ea327 SC: Name the max lookup constant. 2020-05-04 12:27:07 -07:00
David Goldblatt
79dd0c04ed SC: Simplify SC_NPSIZES computation.
Rather than taking all the sizes and subtracting out those that don't fit, we
instead just add up all the ones that do.
2020-05-04 12:27:07 -07:00
David Goldblatt
4f8efba824 TSD: Make rtree_ctx a slow-path field.
Performance-sensitive users will use sized deallocation facilities, so that
actually touching the rtree_ctx is unnecessary.  We make it the last element of
the slow data, so that it is for practical purposes almost-fast.
2020-04-14 15:20:19 -07:00
David Goldblatt
cd29ebefd0 Tcache: treat small and large cache bins uniformly 2020-04-14 15:20:19 -07:00
David Goldblatt
a13fbad374 Tcache: split up fast and slow path data. 2020-04-14 15:20:19 -07:00
David Goldblatt
7099c66205 Arena: fill in terms of cache_bins. 2020-04-14 15:20:19 -07:00
David Goldblatt
40e7aed59e TSD: Move in some of the tcache fields.
We had put these in the tcache for cache optimization reasons.  After the
previous diff, these no longer apply.
2020-04-14 15:20:19 -07:00
David Goldblatt
58a00df238 TSD: Put all fast-path data together. 2020-04-14 15:20:19 -07:00
David Goldblatt
877af247a8 QL, QR: Add documentation. 2020-04-11 10:32:11 -07:00
David Goldblatt
79ae7f9211 Rtree: Remove the per-field accessors.
We instead split things into "edata" and "metadata".
2020-04-10 13:12:47 -07:00
David Goldblatt
bb6a418523 Emap: Drop szind/slab splitting parameters.
After the previous diff, these are constants.
2020-04-10 13:12:47 -07:00
David Goldblatt
50289750b3 Extent: Remove szind/slab knowledge. 2020-04-10 13:12:47 -07:00
David Goldblatt
dc26b30094 Rtree: Clean up compact/non-compact split. 2020-04-10 13:12:47 -07:00
David Goldblatt
93b99dd140 Extent: Stop passing an edata_cache everywhere.
We already pass the pa_shard_t around everywhere; we can just use that.
2020-04-10 13:12:47 -07:00
David Goldblatt
11c47cb133 Extent: Take "bool zero" over "bool *zero". 2020-04-10 13:12:47 -07:00
David Goldblatt
1a1124462e PA: Take zero as a bool rather than as a bool *.
Now that we've moved junking to a higher level of the allocation stack, we don't
care about this performance optimization (which only occurred in debug modes).
2020-04-10 13:12:47 -07:00
David Goldblatt
294b276fc7 PA: Parameterize emap. Move emap_global to arena.
This lets us test the PA module without interfering with the global emap used by
the real allocator (the one not under test).
2020-04-10 13:12:47 -07:00
David Goldblatt
f730577277 Eset: Parameterize last globals accesses.
I.e. opt_retain and maps_coalesce.
2020-04-10 13:12:47 -07:00
David Goldblatt
7bb6e2dc0d Eset: take opt_lg_max_active_fit as a parameter.
This breaks its dependence on the global.
2020-04-10 13:12:47 -07:00
David Goldblatt
883ab327cc Emap: Move out last edata state touching. 2020-04-10 13:12:47 -07:00
David Goldblatt
12eb888e54 Edata: Add a ranged bit.
We steal the dumpable bit, which we ended up not needing.
2020-04-10 13:12:47 -07:00
David Goldblatt
bd4fdf295e Rtree: Pull leaf contents into their own struct. 2020-04-10 13:12:47 -07:00
David Goldblatt
faec7219b2 PA: Move in decay initialization. 2020-04-10 13:12:47 -07:00
David Goldblatt
45671e4a27 PA: Move in retain growth limit setting. 2020-04-10 13:12:47 -07:00
David Goldblatt
daefde88fe PA: Move in mutex stats reading. 2020-04-10 13:12:47 -07:00
David Goldblatt
07675840a5 PA: Move in some more internals accesses. 2020-04-10 13:12:47 -07:00
David Goldblatt
238f3c7430 PA: Move in full stats merging. 2020-04-10 13:12:47 -07:00
David Goldblatt
81c6027592 Arena stats: Give it its own "mapped".
This distinguishes it from the PA mapped stat, which is now named "pa_mapped" to
avoid confusion. The (derived) arena stat includes base memory, and the PA stat
is no longer partially derived.
2020-04-10 13:12:47 -07:00
David Goldblatt
506d907e40 PA: Move in basic stats merging. 2020-04-10 13:12:47 -07:00
David Goldblatt
f29f6090f5 PA: Add pa_extra.c and put PA forking there. 2020-04-10 13:12:47 -07:00
David Goldblatt
565045ef71 Arena: Make more derived stats non-atomic/locked. 2020-04-10 13:12:47 -07:00
David Goldblatt
d0c43217b5 Arena stats: Move retained to PA, use plain ints.
Retained is a property of the allocated pages.  The derived fields no longer
require any locking; they're computed on demand.
2020-04-10 13:12:47 -07:00
David Goldblatt
e2cf3fb1a3 PA: Move in all modifications of mapped. 2020-04-10 13:12:47 -07:00
David Goldblatt
436789ad96 PA: Make mapped stat atomic.
We always have atomic_zu_t, and mapped/unmapped transitions are always expensive
enough that trying to piggyback on a lock is a waste of time.
2020-04-10 13:12:47 -07:00
David Goldblatt
3c28aa6f17 PA: Move edata_avail stat in, make it non-atomic. 2020-04-10 13:12:47 -07:00
David Goldblatt
f6bfa3dcca Move extent stats to the PA module.
While we're at it, make them non-atomic -- they are purely derived statistics
(and in fact aren't even in the arena_t or pa_shard_t).
2020-04-10 13:12:47 -07:00
David Goldblatt
527dd4cdb8 PA: Move in nactive counter. 2020-04-10 13:12:47 -07:00
David Goldblatt
c075fd0bcb PA: Minor cleanups and comment fixes. 2020-04-10 13:12:47 -07:00
David Goldblatt
46a9d7fc0b PA: Move in rest of purging. 2020-04-10 13:12:47 -07:00
David Goldblatt
2d6eec7b5c PA: Move in decay-all pathway. 2020-04-10 13:12:47 -07:00
David Goldblatt
65698b7f2e PA: Remove public visibility of some internals. 2020-04-10 13:12:47 -07:00
David Goldblatt
f012c43be0 PA: Move in decay_to_limit 2020-04-10 13:12:47 -07:00
David Goldblatt
3034f4a508 PA: Move in decay_stashed. 2020-04-10 13:12:47 -07:00
David Goldblatt
aef28b2f8f PA: Move in stash_decayed. 2020-04-10 13:12:47 -07:00
David Goldblatt
71fc0dc968 PA: Move in remaining page allocation functions. 2020-04-10 13:12:47 -07:00
David Goldblatt
74958567a4 PA: have expand take sizes instead of new usize.
This avoids involving usize, which makes some of the stats modifications more
intuitively correct.
2020-04-10 13:12:47 -07:00
David Goldblatt
5bcc2c2ab9 PA: Have expand take szind and slab.
This isn't really necessary, but having a uniform API will help us later.
2020-04-10 13:12:47 -07:00
David Goldblatt
0880c2ab97 PA: Have large expands use it. 2020-04-10 13:12:47 -07:00
David Goldblatt
9f93625c14 PA: Move in arena large allocation functionality. 2020-04-10 13:12:47 -07:00
David Goldblatt
7624043a41 PA: Add ehook-getting support. 2020-04-10 13:12:47 -07:00
David Goldblatt
eba35e2e48 Remove extent knowledge of arena. 2020-04-10 13:12:47 -07:00
David Goldblatt
e77f47a85a Move arena decay getters to PA. 2020-04-10 13:12:47 -07:00
David Goldblatt
f77cec311e Decay: Take current time as an argument.
This better facilitates testing.
2020-04-10 13:12:47 -07:00
David Goldblatt
d1d7e1076b Decay: move in some background_thread accesses. 2020-04-10 13:12:47 -07:00
David Goldblatt
cdb916ed3f Decay: Add comments for the public API. 2020-04-10 13:12:47 -07:00
David Goldblatt
8f2193dc8d Decay: Move in arena decay functions. 2020-04-10 13:12:47 -07:00
David Goldblatt
7b62885476 Introduce decay module and put decay objects in PA 2020-04-10 13:12:47 -07:00
David Goldblatt
497836dbc8 Arena stats: mark edata_avail as derived.
The true number is in the edata_cache itself.
2020-04-10 13:12:47 -07:00
David Goldblatt
3192d6b77d Extents: Have extent_dalloc_gap take ehooks.
We're almost to the point where the extent code doesn't know about arenas at
all.  In that world, we shouldn't pull them out of the arena.
2020-04-10 13:12:47 -07:00
David Goldblatt
22a0a7b93a Move arena_decay_extent to extent module. 2020-04-10 13:12:47 -07:00
David Goldblatt
70d12ffa05 PA: Move mapped into pa stats. 2020-04-10 13:12:47 -07:00
David Goldblatt
6ca918d0cf PA: Add a stats comment. 2020-04-10 13:12:47 -07:00
David Goldblatt
ce8c0d6c09 PA: Move in arena extent_sn counter.
Just another step towards making PA self-contained.
2020-04-10 13:12:47 -07:00
David Goldblatt
1ad368c8b7 PA: Move in decay stats. 2020-04-10 13:12:47 -07:00
David Goldblatt
356aaa7dc6 Introduce lockedint module.
This pulls out the various abstractions where some stats counter is sometimes an
atomic, sometimes a plain variable, sometimes always protected by a lock,
sometimes protected by reads but not writes, etc.  With this change, these cases
are treated consistently, and access patterns tagged.

In the process, we fix a few missed-update bugs (where one caller assumes
"protected-by-a-lock" semantics and another does not).
2020-04-10 13:12:47 -07:00
David Goldblatt
acd0bf6a26 PA: move in ecache_grow. 2020-04-10 13:12:47 -07:00
David Goldblatt
32cb7c2f0b PA: Add a stats type. 2020-04-10 13:12:47 -07:00
David Goldblatt
688fb3eb89 PA: Move in the arena edata_cache. 2020-04-10 13:12:47 -07:00
David Goldblatt
8433ad84ea PA: move in shard initialization. 2020-04-10 13:12:47 -07:00
David Goldblatt
a24faed569 PA: Move in the ecache_t objects. 2020-04-10 13:12:47 -07:00
David Goldblatt
585f925055 Move cache index randomization out of extent.
This is logically at a higher level of the stack; extent should just allocate
things at the page-level; it shouldn't care exactly why the callers wants a
given number of pages.
2020-04-10 13:12:47 -07:00
David Goldblatt
12be9f5727 Add a stub PA module -- a page allocator. 2020-04-10 13:12:47 -07:00
Yinan Zhang
c4e9ea8cc6 Get rid of locks in prof recent test 2020-04-07 17:22:24 -07:00