Commit Graph

790 Commits

Author SHA1 Message Date
David Goldblatt
54c94c1679 flat bitmap: add scount / ucount functions.
These can compute the number of set or unset bits in a subrange of the bitmap.
2020-12-07 06:21:08 -08:00
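As a rough illustration of what counting set or unset bits in a subrange means, here is a minimal sketch in C. The names `range_scount` and `range_ucount`, the 64-bit word layout, and the bit-at-a-time loop are assumptions made for this example; they are not taken from the commit, which may well use a faster word-at-a-time approach.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read bit i of a bitmap stored as 64-bit words. */
static bool
bitmap_get(const uint64_t *words, size_t i) {
    return (words[i / 64] >> (i % 64)) & 1;
}

/* Count the set bits in the half-open range [start, start + len). */
static size_t
range_scount(const uint64_t *words, size_t nbits, size_t start, size_t len) {
    assert(start + len <= nbits);
    size_t count = 0;
    for (size_t i = start; i < start + len; i++) {
        count += bitmap_get(words, i) ? 1 : 0;
    }
    return count;
}

/* Count the unset bits in the same range: everything that is not set. */
static size_t
range_ucount(const uint64_t *words, size_t nbits, size_t start, size_t len) {
    return len - range_scount(words, nbits, start, len);
}
```

The range may cross a word boundary; the per-bit helper handles that without special cases, at the cost of speed.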
David Goldblatt
734e72ce8f bit_util: Guarantee popcount's presence.
Implement popcount generically, so that we can rely on it being present.
2020-12-07 06:21:08 -08:00
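A generic popcount is one that needs no compiler builtin or CPU instruction. The sketch below uses the well-known parallel bit-summing technique; the function name `popcount_u64_generic` is hypothetical, and the commit's actual implementation may differ.

```c
#include <stdint.h>

/*
 * Portable population count: sum bits in pairs, then nibbles, then bytes,
 * and finally add all eight byte sums together with one multiplication.
 */
static unsigned
popcount_u64_generic(uint64_t x) {
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
    return (unsigned)((x * 0x0101010101010101ULL) >> 56);
}
```

A build could prefer a compiler builtin when one is available and fall back to code like this otherwise, which is one way to guarantee that the function is always present.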
David Goldblatt
d9f7e6c668 hpdata: Add a test.
We're about to make the functionality here more complicated; testing hpdata
directly (rather than relying on users' tests) will make debugging easier.
2020-12-07 06:21:08 -08:00
David Goldblatt
f7cf23aa4d psset: Relegate alloc/dalloc to test code.
This is no longer part of the "core" functionality; we only need the stub
implementations as an end-to-end test of hpdata + psset interactions when
metadata is being modified.  Treat them accordingly.
2020-12-07 06:21:08 -08:00
David Goldblatt
ca30b5db2b Introduce hpdata_t.
Using an edata_t both for hugepages and the allocations within those hugepages
was convenient at first, but has outlived its usefulness.  Representing
hugepages explicitly, with their own data structure, will make future
development easier.
2020-12-07 06:21:08 -08:00
David Goldblatt
4a15008cfb HPA unit test: skip if unsupported.
Previously, we replicated the logic in hpa_supported in the test as well.
2020-12-07 06:21:08 -08:00
David Goldblatt
43af63fff4 HPA: Manage whole hugepages at a time.
This redesigns the HPA implementation to allow us to manage hugepages all at
once, locally, without relying on a global fallback.
2020-12-07 06:21:08 -08:00
David Goldblatt
c1b2a77933 psset: Move in stats.
A later change will benefit from having these functions pulled into a
psset-module set of functions.
2020-12-07 06:21:08 -08:00
David Goldblatt
d0a991d47b psset: Add insert/remove functions.
These will allow us to (for instance) move pageslabs from a psset dedicated to
not-yet-hugeified pages to one dedicated to hugeified ones.
2020-12-07 06:21:08 -08:00
David Goldblatt
ecd39418ac Add fxp: A fixed-point math library.
This will be used in the next commit to allow non-integer values for
narenas_ratio.
2020-12-04 23:48:19 -08:00
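To show why fixed-point math helps with non-integer option values, here is a minimal sketch of a 16.16 format (16 integer bits, 16 fractional bits) used to scale an integer count by a ratio such as 1.5. The type `fxp16_t`, the function names, and the 16.16 layout are assumptions for illustration, not the interface of the fxp module.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16.16 fixed-point value: the real number is raw / 65536. */
typedef uint32_t fxp16_t;

/* Build a fixed-point value from the fraction num / den (rounds down). */
static fxp16_t
fxp16_from_ratio(uint32_t num, uint32_t den) {
    assert(den != 0);
    return (fxp16_t)(((uint64_t)num << 16) / den);
}

/* Multiply an integer by a fixed-point value, rounding the result down. */
static uint32_t
fxp16_mul_round_down(uint32_t n, fxp16_t f) {
    return (uint32_t)(((uint64_t)n * f) >> 16);
}
```

Everything stays in integer arithmetic, so no floating-point support is needed; the 64-bit intermediates avoid overflow for the small values used here.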
Yinan Zhang
d96e4525ad Route batch allocation of small batch size to tcache 2020-11-16 20:58:01 -08:00
Yinan Zhang
ac480136d7 Split out locality checking in batch allocation tests 2020-11-16 20:58:01 -08:00
Yinan Zhang
be5e49f4fa Add a batch mode for cache_bin_alloc() 2020-11-16 20:58:01 -08:00
Yinan Zhang
4a65f34930 Fix a cache bin test 2020-11-16 20:58:01 -08:00
Yinan Zhang
9545c2cd36 Add sample interval to prof last-N dump 2020-11-13 15:33:27 -08:00
David Goldblatt
cf2549a149 Add a per-arena oversize_threshold.
This can let manual arenas trade off memory and CPU the way auto arenas do.
2020-11-13 13:45:35 -08:00
David Goldblatt
4ca3d91e96 Rename geom_grow -> exp_grow.
This was promised in the review of the introduction of geom_grow, but would have
been painful to do there because of the series that introduced it.  Now that
those are committed, renaming is easier.
2020-11-13 13:42:33 -08:00
David Goldblatt
03a6047111 Edata cache small: rewrite.
In previous designs, this was intended to be a sort of cache that couldn't fail.
In the current design, we want to use it just as a contention reduction
mechanism.  Rewrite it with those goals in mind.
2020-11-05 12:34:43 -08:00
David Goldblatt
1b3ee75667 Add experimental.thread.activity_callback.
This (experimental, undocumented) functionality can be used by users to track
various statistics of interest at a finer level of granularity than the thread.
2020-11-05 12:33:25 -08:00
Qi Wang
bf72188f80 Allow opt.tcache_max to accept small size classes.
Previously all the small size classes were cached.  However this has downsides
-- particularly when page size is greater than 4K (e.g. iOS), which will result
in much higher SMALL_MAXCLASS.

This change allows tcache_max to be set to lower values, to better control
resources taken by tcache.
2020-10-24 20:43:44 -07:00
David Goldblatt
d16849c91d psset: Do first-fit based on slab age.
This functions more like the serial number strategy of the ecache and
hpa_central_t.  Longer-lived slabs are more likely to continue to live for
longer in the future.
2020-10-23 11:14:34 -07:00
David Goldblatt
6599651aee PA: Use an SEC in front of the HPA shard. 2020-10-23 11:14:34 -07:00
David Goldblatt
ea51e97bb8 Add SEC module: a small extent cache.
This can be used to take pressure off a more centralized, worse-sharded
allocator without requiring a full break of the arena abstraction.
2020-10-23 11:14:34 -07:00
David Goldblatt
534504d4a7 HPA: add size-exclusion functionality.
I.e. only allowing allocations under or over certain sizes.
2020-10-23 11:14:34 -07:00
David Goldblatt
1c7da33317 HPA: Tie components into a PAI implementation. 2020-10-23 11:14:34 -07:00
Qi Wang
c8209150f9 Switch from opt.lg_tcache_max to opt.tcache_max
Though for convenience, keep parsing lg_tcache_max.
2020-10-22 20:40:41 -07:00
Qi Wang
3de19ba401 Eagerly detect double free and sized dealloc bugs for large sizes. 2020-10-15 10:03:16 -07:00
David Goldblatt
21b70cb540 Add hpa_central module
This will be the centralized component of the coming hugepage allocator; the
source of larger chunks of memory from which smaller ones can be obtained.
2020-10-05 19:55:57 -07:00
David Goldblatt
2a6ba121b5 PRNG test: cleanups.
Since we no longer have both atomic and non-atomic variants, there's no reason
to try to test both.
2020-10-05 19:55:57 -07:00
David Goldblatt
9e6aa77ab9 PRNG: Remove atomic functionality.
These had no uses and complicated the API.  As a rule we now expect to only use
thread-local randomization for contention-reduction reasons, so we only pay the
API costs and never get the functionality benefits.
2020-10-05 19:55:57 -07:00
David Goldblatt
259c5e3e8f psset: Add stats 2020-09-18 12:39:25 -07:00
David Goldblatt
018b162d67 Add psset: a set of pageslabs.
This introduces a new sort of edata_t, a pageslab, and a set to manage them.
This is part of a series of commits to implement a hugepage allocator; the
pageset will be per-arena, and track small page allocation requests within a
larger extent allocated from a centralized hugepage allocator.
2020-09-18 12:39:25 -07:00
David Goldblatt
ed99d300b9 Flat bitmap: Add longest-range computation.
This will come in handy in the (upcoming) page-slab set assertions.
2020-09-18 12:39:25 -07:00
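A longest-range computation over a bitmap can be sketched as a single pass that tracks the current run of matching bits. The function name `longest_run`, the `val` parameter selecting set or unset runs, and the 64-bit word layout are assumptions for this example rather than the flat bitmap API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the length of the longest run of consecutive bits equal to val
 * among the first nbits bits of the bitmap.
 */
static size_t
longest_run(const uint64_t *words, size_t nbits, bool val) {
    assert(words != NULL);
    size_t best = 0;
    size_t cur = 0;
    for (size_t i = 0; i < nbits; i++) {
        bool bit = (words[i / 64] >> (i % 64)) & 1;
        if (bit == val) {
            cur++;
            if (cur > best) {
                best = cur;
            }
        } else {
            cur = 0;
        }
    }
    return best;
}
```

A check of this kind is useful in assertions, for example to confirm that a cached "largest free range" value agrees with the bitmap itself.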
David Goldblatt
e034500698 Edata: rename "ranged" bit to "pai".
This better represents its intended purpose; the hugepage allocator design
evolved away from needing contiguity of hugepage virtual address space.
2020-09-18 12:39:25 -07:00
Yinan Zhang
09eda2c9b6 Add unit tests for usize in prof recent records 2020-09-09 13:31:35 -07:00
David Goldblatt
131b1b5338 Rename ecache_grow -> geom_grow.
We're about to start using it outside of the ecaches, in the HPA central
allocator.
2020-08-19 16:53:21 -07:00
David Goldblatt
b399463fba flat_bitmap unit test: Silence a warning. 2020-08-17 12:50:27 -07:00
David Goldblatt
b0ffa39cac Mallctl stress test: fix a type.
The mallctlbymib_long helper was copy-pasted from mallctlbymib_short, and
incorrectly used its output variable (a char *) rather than the output variable
of the mallctl call it was using (a uint64_t), causing breakages when
sizeof(char *) differed from sizeof(uint64_t).
2020-08-17 12:50:14 -07:00
David Goldblatt
753bbf1849 Benchmarks: Also print ns / iter.
This is often what we really care about.  It's not easy to do the division
mentally in all cases.
2020-08-13 10:03:15 -07:00
David Goldblatt
7b187360e9 IO: Support 0-padding for unsigned numbers. 2020-08-13 10:03:15 -07:00
David Goldblatt
32d4673221 Add a mallctl speed stress test. 2020-08-13 10:03:15 -07:00
Yinan Zhang
8f9e958e1e Add alignment stress test for rallocx 2020-08-11 11:56:43 -07:00
David Goldblatt
eaed1e39be Add sized-delete size-checking functionality.
The existing checks are good at finding such issues (on tcache flush), but not
so good at pinpointing them.  Debug mode can find them, but sometimes debug mode
slows down a program so much that hard-to-hit bugs can take a long time to
crash.

This commit adds functionality to keep programs mostly on their fast paths,
while also checking every sized delete argument they get.
2020-08-05 19:34:05 -07:00
David Goldblatt
81c2f841e5 Add a simple utility to detect profiling bias. 2020-08-05 18:33:55 -07:00
Yinan Zhang
e032a1a1de Add a stress test for batch allocation 2020-08-03 09:36:40 -07:00
Yinan Zhang
f6cf5eb388 Add mallctl for batch allocation API 2020-07-31 09:16:50 -07:00
Yinan Zhang
978f830ee3 Add batch allocation API 2020-07-31 09:16:50 -07:00
David Goldblatt
ddb8dc4ad0 FB: Add range iteration support. 2020-07-30 15:25:23 -07:00
David Goldblatt
ceee823519 Add flat_bitmap.
The flat_bitmap module offers an extended API, at the cost of decreased
performance in the case of very large bitmaps.
2020-07-30 15:25:23 -07:00
David Goldblatt
7fde6ac490 Nbits: Add a couple more interesting sizes.
Previously, all tests with more than two levels came in powers of 2.  It's
useful to check cases where we have a partially filled group above the
second level.
2020-07-30 15:25:23 -07:00
David Goldblatt
efeab1f498 bitset test: Pull NBITS_TAB into its own file. 2020-07-30 15:25:23 -07:00
David Goldblatt
22da836094 bit_util: Add fls_ functions; "find last set".
These simplify a lot of the bit_util module, which had grown bits and pieces of
this functionality across a variety of places over the years.

While we're here, kill off BIT_UTIL_INLINE and don't do reentrancy testing for
bit_util.
2020-07-30 15:25:23 -07:00
David Goldblatt
1ed0288d9c bit_util: Change ffs functions indexing.
Making these 0-based instead of 1-based makes calling code simpler and will be
more consistent with functions introduced in subsequent diffs.
2020-07-30 15:25:23 -07:00
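The 0-based convention for "find first set", and the matching "find last set" from the entry above, can be illustrated with the sketch below. The names `ffs_u64_sketch` and `fls_u64_sketch` are hypothetical, and the loops favor clarity over speed; real code would normally use compiler builtins where available.

```c
#include <assert.h>
#include <stdint.h>

/* Index (0-based) of the lowest set bit. The input must be nonzero. */
static unsigned
ffs_u64_sketch(uint64_t x) {
    assert(x != 0);
    unsigned i = 0;
    while (((x >> i) & 1) == 0) {
        i++;
    }
    return i;
}

/* Index (0-based) of the highest set bit. The input must be nonzero. */
static unsigned
fls_u64_sketch(uint64_t x) {
    assert(x != 0);
    unsigned i = 63;
    while (((x >> i) & 1) == 0) {
        i--;
    }
    return i;
}
```

With 0-based results, `(uint64_t)1 << ffs_u64_sketch(x)` recovers the lowest set bit directly, with no off-by-one adjustment in the caller.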
David Goldblatt
471eb5913c PAC: Move in decay rate setting. 2020-07-09 13:41:04 -07:00
David Goldblatt
6a2774719f PA->PAC: Move in decay functions. 2020-07-09 13:41:04 -07:00
David Goldblatt
7391382349 PA->PAC: Move in stats. 2020-07-09 13:41:04 -07:00
David Goldblatt
db211eefbf PAC: Move in decay. 2020-07-09 13:41:04 -07:00
David Goldblatt
c81e389996 PAC: Move in ecache_grow. 2020-07-09 13:41:04 -07:00
David Goldblatt
777b0ba965 Add PAC: Page allocator classic.
For now, this is just a stub containing the ecaches, with no surrounding code
changed.  Eventually all the core allocator bits will be moved in, in the
subsequent stack of commits.
2020-07-09 13:41:04 -07:00
Yinan Zhang
c2e7a06392 No need to intercept prof_dump_header() in tests 2020-06-29 14:27:50 -07:00
Yinan Zhang
f58ebdff7a Generalize prof_cnt_all() for testing 2020-06-29 14:27:50 -07:00
Yinan Zhang
d4259ea53b Simplify signatures for prof dump functions 2020-06-29 14:27:50 -07:00
Yinan Zhang
1f5fe3a3e3 Pass write callback explicitly in prof_data 2020-06-29 14:27:50 -07:00
Yinan Zhang
4736fb4fc9 Move file handling logic in prof_data to prof_sys 2020-06-29 14:27:50 -07:00
Yinan Zhang
03ae509f32 Create prof_sys module for reading system thread name 2020-06-29 14:27:50 -07:00
Yinan Zhang
8118056c03 Expose prof_data testing internals only in prof tests 2020-06-29 14:27:50 -07:00
Yinan Zhang
5d292b5660 Push error handling logic out of core dumping logic 2020-06-29 14:27:50 -07:00
Yinan Zhang
f307b25804 Only replace the dump file opening function in test 2020-06-29 14:27:50 -07:00
Yinan Zhang
d8cea87562 Move size inspections to test/analyze 2020-06-26 09:45:28 -07:00
Yinan Zhang
537a4bedb4 Add a tool to examine random number distributions 2020-06-26 09:45:28 -07:00
Yinan Zhang
d460333efb Improve naming for prof system thread name option 2020-06-24 14:32:01 -07:00
Yinan Zhang
b7858abfc0 Expose prof testing internal functions 2020-06-19 09:16:51 -07:00
David Goldblatt
7e09a57b39 stress/sizes: Fix an off-by-one issue.
Algorithmically, a size greater than 1024 ZB could access one-past-the-end of
the sizes array.  This couldn't really happen since SIZE_MAX is less than 1024
ZB on all platforms we support (and we pick the arguments to this function to be
reasonable anyways), but it's not like there's any reason *not* to fix it,
either.
2020-06-16 10:34:19 -07:00
David Goldblatt
dcfa6fd507 stress/sizes: Add a couple more types. 2020-06-16 10:34:19 -07:00
Jon Haslam
4aea743279 High Resolution Timestamps for Profiling 2020-06-15 12:12:49 -07:00
David Goldblatt
d82a164d0d Add thread.peak.[read|reset] mallctls.
These can be used to track net allocator activity on a per-thread basis.
2020-06-11 13:54:22 -07:00
David Goldblatt
fe7108305a Add peak_t, for tracking allocator net max. 2020-06-11 13:54:22 -07:00
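Tracking the "net max" means remembering the largest value that (bytes allocated minus bytes deallocated) has reached since the last reset. The following is a minimal sketch under that reading; the struct layout and the names `peak_sketch_t`, `peak_sketch_update`, and `peak_sketch_reset` are assumptions, not the definition of `peak_t`.

```c
#include <stdint.h>

/* Hypothetical tracker for the maximum net allocation since the last reset. */
typedef struct {
    int64_t baseline; /* net allocation at the time of the last reset */
    int64_t max;      /* largest net allocation seen relative to baseline */
} peak_sketch_t;

/* Feed in the running totals of allocated and deallocated bytes. */
static void
peak_sketch_update(peak_sketch_t *p, uint64_t alloc, uint64_t dalloc) {
    int64_t net = (int64_t)(alloc - dalloc) - p->baseline;
    if (net > p->max) {
        p->max = net;
    }
}

/* Start a new measurement window at the current net allocation. */
static void
peak_sketch_reset(peak_sketch_t *p, uint64_t alloc, uint64_t dalloc) {
    p->baseline = (int64_t)(alloc - dalloc);
    p->max = 0;
}
```

A read operation would simply return `max`, and a reset operation would call the reset function with the thread's current counters.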
David Goldblatt
17a64fe91c Add a small program to print data structure sizes. 2020-06-11 08:13:38 -07:00
Yinan Zhang
857ebd3daf Make edata pointer on prof recent record an atomic fence 2020-06-09 17:03:05 -07:00
Yinan Zhang
b8bdea6b26 Fix: prof_recent_alloc_max_ctl_read() does not take tsd 2020-06-09 17:03:05 -07:00
David Goldblatt
d338dd45d7 Tcache: Make incremental gc bytes configurable. 2020-05-16 13:34:23 -07:00
David Goldblatt
181093173d Tcache: make slot sizing configurable. 2020-05-16 13:34:23 -07:00
David Goldblatt
97b7a9cf77 Add a fill/flush microbenchmark. 2020-05-16 13:34:23 -07:00
David Goldblatt
eda9c2858f Edata: zero stack edatas before initializing.
This avoids some UB. No compilers take advantage of it for now, but no sense in
tempting fate.
2020-05-14 10:30:20 -07:00
Yinan Zhang
fc052ff728 Migrate counter to use locked int 2020-05-12 08:23:15 -07:00
Yinan Zhang
039bfd4e30 Do not rollback prof idump counter in arena_prof_promote() 2020-05-11 12:24:56 -07:00
David Goldblatt
2c09d43494 Add a benchmark of large allocations. 2020-05-04 12:36:45 -07:00
David Goldblatt
79ae7f9211 Rtree: Remove the per-field accessors.
We instead split things into "edata" and "metadata".
2020-04-10 13:12:47 -07:00
David Goldblatt
26e9a3103d PA: Simple decay test. 2020-04-10 13:12:47 -07:00
David Goldblatt
dc26b30094 Rtree: Clean up compact/non-compact split. 2020-04-10 13:12:47 -07:00
David Goldblatt
294b276fc7 PA: Parameterize emap. Move emap_global to arena.
This lets us test the PA module without interfering with the global emap used by
the real allocator (the one not under test).
2020-04-10 13:12:47 -07:00
David Goldblatt
12eb888e54 Edata: Add a ranged bit.
We steal the dumpable bit, which we ended up not needing.
2020-04-10 13:12:47 -07:00
David Goldblatt
bd4fdf295e Rtree: Pull leaf contents into their own struct. 2020-04-10 13:12:47 -07:00
David Goldblatt
48a2cd6d79 Decay: Add a (mostly stub) test case. 2020-04-10 13:12:47 -07:00
David Goldblatt
bf55e58e63 Rename test/unit/decay -> test/unit/arena_decay.
This is really more of an end-to-end test at the arena level; it's not just of
the decay code in particular any more.
2020-04-10 13:12:47 -07:00
David Goldblatt
acd0bf6a26 PA: move in ecache_grow. 2020-04-10 13:12:47 -07:00
Yinan Zhang
c4e9ea8cc6 Get rid of locks in prof recent test 2020-04-07 17:22:24 -07:00
Yinan Zhang
2deabac079 Get rid of custom iterator for last-N records 2020-04-07 17:22:24 -07:00
David Goldblatt
8da6676a02 Don't do reentrant testing in junk tests. 2020-04-07 15:45:40 -07:00
Yinan Zhang
4b66297ea0 Add move constructor to ql module 2020-04-06 09:50:27 -07:00
Yinan Zhang
a62b7ed928 Add emptiness checking to ql module 2020-04-06 09:50:27 -07:00
Yinan Zhang
1dd24ca6d2 Add rotate functionality to ql module 2020-04-06 09:50:27 -07:00
Yinan Zhang
0dc95a882f Add concat and split functionality to ql module 2020-04-06 09:50:27 -07:00
Yinan Zhang
c9d56cddf2 Optimize meld in qr module
The goal of `qr_meld()` is to change the following four fields
`(a->prev, a->prev->next, b->prev, b->prev->next)` from the values
`(a->prev, a, b->prev, b)` to `(b->prev, b, a->prev, a)`.

This commit changes

```
a->prev->next = b;
b->prev->next = a;
temp = a->prev;
a->prev = b->prev;
b->prev = temp;
```

to

```
temp = a->prev;
a->prev = b->prev;
b->prev = temp;
a->prev->next = a;
b->prev->next = b;
```

The benefit is that we can use `b->prev->next` for `temp`, and so
there's no need to pass in `a_type`.

The restriction is that `b` cannot be a `qr_next()` macro, so users
of `qr_meld()` must pay attention.  (Before this change, neither `a`
nor `b` could be a `qr_next()` macro.)
2020-04-06 09:50:27 -07:00
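The reordered pointer updates described in this commit can be tried out on a plain doubly linked ring. The sketch below is an assumption-laden stand-in: it uses an ordinary struct and functions named `ring_new` and `ring_meld` instead of the `qr` macros, and it keeps an explicit `temp` variable rather than borrowing `b->prev->next` as the commit does.

```c
#include <stddef.h>

/* Hypothetical ring node; the qr module embeds links via macros instead. */
typedef struct node_s node_t;
struct node_s {
    int id;
    node_t *next;
    node_t *prev;
};

/* Make n a ring of one element. */
static void
ring_new(node_t *n) {
    n->next = n;
    n->prev = n;
}

/*
 * Meld the rings containing a and b: swap the two prev pointers first,
 * then repair the next pointers of the new predecessors.
 */
static void
ring_meld(node_t *a, node_t *b) {
    node_t *temp = a->prev;
    a->prev = b->prev;
    b->prev = temp;
    a->prev->next = a;
    b->prev->next = b;
}
```

Melding singleton rings `a` and `b` gives the ring a, b; melding that ring with a singleton `c` via `ring_meld(&a, &c)` gives a, b, c in `next` order.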
Yinan Zhang
f9aad7a49b Add piping API to buffered writer 2020-04-01 09:41:20 -07:00
Yinan Zhang
09cd79495f Encapsulate buffer allocation failure in buffered writer 2020-04-01 09:41:20 -07:00
David T. Goldblatt
d936b46d3a Add malloc_conf_2_conf_harder
This comes in handy when you're just a user of a canary system who wants to
change settings set by the configuration system itself.
2020-03-31 06:25:08 -07:00
Yinan Zhang
2256ef8961 Add option to fetch system thread name on each prof sample 2020-03-24 21:39:57 -07:00
Yinan Zhang
ccdc70a5ce Fix: assertion could abort on past failures 2020-03-18 20:48:26 -07:00
David Goldblatt
2e5899c129 Stats: Fix tcache_bytes reporting.
Previously, large allocations in tcaches would have their sizes reduced during
stats estimation.  Added a test, which fails before this change but passes now.

This fixes a bug introduced in 5934846612, which
was itself fixing a bug introduced in 9c0549007d.
2020-03-13 07:53:34 -07:00
Yinan Zhang
a5780598b3 Remove thread_event_rollback() 2020-03-12 13:55:00 -07:00
David Goldblatt
99b1291d17 Edata cache: add edata_cache_small_t.
This can be used to amortize the synchronization costs of edata_cache accesses.
2020-03-12 11:58:09 -07:00
David Goldblatt
734109d9c2 Edata cache: add a unit test. 2020-03-12 11:58:09 -07:00
David Goldblatt
e732344ef1 Inspect test: Reduce checks when profiling is on.
Profiled small allocations don't live in bins, which is contrary to the test
expectation.
2020-03-12 11:58:09 -07:00
David Goldblatt
d701a085c2 Fast path: allow low-water mark changes.
This lets us put more allocations on an "almost as fast" path after a flush.
This results in around a 4% reduction in malloc cycles in prod workloads
(corresponding to about a 0.1% reduction in overall cycles).
2020-03-12 11:54:19 -07:00
David Goldblatt
370c1ea007 Cache bin: Write the unit test in terms of the API
I.e. stop allowing the unit test to have secret access to implementation
internals.
2020-03-12 11:54:19 -07:00
David Goldblatt
ff6acc6ed5 Cache bin: simplify names and argument ordering.
We always start with the cache bin, then its info (if necessary).
2020-03-12 11:54:19 -07:00
David Goldblatt
e1dcc557d6 Cache bin: Only take the relevant cache_bin_info_t
Previously, we took an array of cache_bin_info_ts and an index, and dereferenced
ourselves.  But infos for other cache_bins aren't relevant to any particular
cache bin, so that should be the caller's job.
2020-03-12 11:54:19 -07:00
David Goldblatt
74d36d78ef Cache bin: Make ncached_max a query on the info_t. 2020-03-12 11:54:19 -07:00
David Goldblatt
909c501b07 Cache_bin: Shouldn't know about tcache.
Instead, have it take the cache_bin_info_ts to use by pointer.  While we're
here, add a src file for the cache bin.
2020-03-12 11:54:19 -07:00
David Goldblatt
79f1ee2fc0 Move junking out of arena/tcache code.
This is debug only and we keep it off the fast path.  Moving it here simplifies
the internal logic.

This never tries to junk on regions that were shrunk via xallocx.  I think this
is fine for two reasons:
- The shrunk-with-xallocx case is rare.
- We don't always do that anyway before this diff (it depends on the opt
  settings and extent hooks in effect).
2020-03-12 11:54:19 -07:00
Yinan Zhang
4a78c6d81b Correct thread event unit test 2020-03-10 09:31:55 -07:00
Yinan Zhang
51bd147422 Make use of assert_* in test/unit/thread_event.c 2020-02-19 16:03:16 -08:00
Yinan Zhang
9d2cc3b0fa Make use of assert_* in test/unit/prof_recent.c 2020-02-19 16:03:16 -08:00
Yinan Zhang
a88d22ea11 Make use of assert_* in test/unit/inspect.c 2020-02-19 16:03:16 -08:00
Yinan Zhang
0ceb31184d Make use of assert_* in test/unit/buf_writer.c 2020-02-19 16:03:16 -08:00
Yinan Zhang
fa61579382 Add assert_* functionality to tests 2020-02-19 16:03:16 -08:00
Yinan Zhang
21dfa4300d Change assert_* to expect_* in tests
```
grep -Irl assert_ test/ | xargs sed -i \
    's/witness_assert/witness_do_not_replace/g';
grep -Irl assert_ test/ | xargs sed -i \
    's/malloc_mutex_assert_owner/malloc_mutex_do_not_replace_owner/g';

grep -Ir assert_ test/ | grep -o "[_a-zA-Z]*assert_[_a-zA-Z]*" | \
    grep -v "^assert_"; # confirm no output
grep -Irl assert_ test/ | xargs sed -i 's/assert_/expect_/g';

grep -Irl witness_do_not_replace test/ | xargs sed -i \
    's/witness_do_not_replace/witness_assert/g';
grep -Irl malloc_mutex_do_not_replace_owner test/ | xargs sed -i \
    's/malloc_mutex_do_not_replace_owner/malloc_mutex_assert_owner/g';
```
2020-02-19 16:03:16 -08:00
David T. Goldblatt
a0c1f4ac57 Rtree: take the base allocator as a parameter.
This facilitates better testing by avoiding mixing of the "real" base with the
base used by the rtree under test.
2020-02-18 11:22:09 -08:00
David Goldblatt
7e6c8a7286 Emap: Standardize naming.
Namespace everything under emap_, always specify what it is we're looking up
(emap_lookup -> emap_edata_lookup), and use "ctx" over "info".
2020-02-17 10:50:51 -08:00
David Goldblatt
ac50c1e44b Emap: Remove direct access to emap internals.
In the process, we do a few local cleanups and optimizations.  In particular,
the size safety check on tcache flush no longer does a redundant load.
2020-02-17 10:50:51 -08:00
David Goldblatt
9b5d105fc3 Emap: Move in iealloc.
This is logically scoped to the emap.
2020-02-17 10:50:51 -08:00
David Goldblatt
01f255161c Add emap, for tracking extent locking. 2020-02-17 10:50:51 -08:00
Yinan Zhang
68e8ddcaff Add mallctl for dumping last-N profiling records 2020-02-14 12:46:38 -08:00
Yinan Zhang
bc05ecebf6 Add const qualifier in assert_cmp() 2020-02-14 12:46:38 -08:00
Yinan Zhang
9cac3fa8f5 Encapsulate buffer allocation in buffered writer 2020-02-04 13:21:58 -08:00
Yinan Zhang
bdc08b5158 Better naming buffered writer 2020-02-04 13:21:58 -08:00
Qi Wang
e896522616 Abbreviate thread-event to te. 2020-02-04 13:07:05 -08:00
Qi Wang
97dd79db6c Implement deallocation events.
Make the event module accept two event types, and pass around the event
context.  Use bytes-based events to trigger tcache GC on deallocation, and get
rid of the tcache ticker.
2020-02-04 00:18:15 -08:00
Qi Wang
88d9eca848 Enforce page alignment for sampled allocations.
This allows sampled allocations to be checked through alignment, thereby
enabling sized deallocation regardless of cache_oblivious.
2020-01-31 00:04:22 -08:00
Qi Wang
88b0e03a4e Implement opt.stats_interval and the _opts options.
Add options stats_interval and stats_interval_opts to allow interval based stats
printing.  This provides an easy way to collect stats without code changes,
because opt.stats_print may not work (some binaries never exit).
2020-01-29 09:57:55 -08:00
David Goldblatt
6a622867ca Add "thread.idle" mallctl.
This can encapsulate various internal cleaning logic, and can be used to free up
resources before a long sleep.
2020-01-22 18:29:13 -08:00
Yinan Zhang
cd6e908241 Add stress test for last-N profiling mode 2020-01-21 16:51:26 -08:00
Yinan Zhang
a72ea0db60 Restructure and correct sleep utility for testing 2020-01-21 16:51:26 -08:00
Yinan Zhang
2b604a3016 Record request size in prof recent entries 2020-01-10 12:01:01 -08:00
Yinan Zhang
6d8e616902 Make buffered writer an independent module 2020-01-10 11:59:02 -08:00
Yinan Zhang
6b6b4709b3 Unify buffered writer naming 2020-01-09 14:31:31 -08:00
Yinan Zhang
9a60cf54ec Last-N profiling mode 2019-12-30 15:58:57 -08:00
Yinan Zhang
3fa142cf39 Remove _externs from prof internal header names 2019-12-23 11:14:15 -08:00
Yinan Zhang
ea42174d07 Refactor profiling headers 2019-12-20 17:17:48 -08:00
David Goldblatt
bb70df8e5b Extent refactor: Introduce ecache module.
This will eventually completely wrap the eset, and handle concurrency,
allocation, and deallocation.  For now, we only pull out the mutex from the
eset.
2019-12-20 10:18:40 -08:00
David Goldblatt
a7862df616 Rename extent_t to edata_t.
This frees us up from the unfortunate extent/extent2 naming collision.
2019-12-20 10:18:40 -08:00
David Goldblatt
403f2d1664 Extents: Split out introspection functionality.
This isn't really part of the core extent allocation facilities.  Especially as
this module grows, having it in its own place may come in handy.
2019-12-20 10:18:40 -08:00
David Goldblatt
c8dae890c8 Extent -> Ehooks: Move over default hooks. 2019-12-20 10:18:40 -08:00
Yinan Zhang
1d01e4c770 Initialization utilities for nstime 2019-12-16 16:08:56 -08:00
Qi Wang
dd649c9485 Optimize away the tsd_fast() check on fastpath.
Fold the tsd_state check onto the event threshold check.  The fast threshold is
set to 0 when tsd switches to non-nominal.

The fast_threshold can be reset by remote threads, to reflect the non-nominal tsd
state change.
2019-12-11 23:44:20 -08:00
Qi Wang
1decf958d1 Fix incorrect usage of cassert. 2019-12-11 14:02:59 -08:00
Yinan Zhang
aa1d71fb7a Rename prof_tctx to alloc_tctx in prof_info_t 2019-12-06 09:47:51 -08:00
Yinan Zhang
6945371778 Change tsdn to tsd for profiling code path 2019-11-22 16:31:56 -08:00
Yinan Zhang
b55419f9b9 Restructure profiling
Develop new data structure and code logic for holding profiling
related information stored in the extent that may be needed after the
extent is released, which in particular is the case for the
reallocation code path (e.g. in `rallocx()` and `xallocx()`).  The
data structure is a generalization of `prof_tctx_t`: we previously
only copied out the `prof_tctx` before the extent is released, but we
may be in need of additional fields. Currently the only additional
field is the allocation time field, but there may be more fields in
the future.

The restructuring also resolved a bug: `prof_realloc()` mistakenly
passed the new `ptr` to `prof_free_sampled_object()`, but passing in
the `old_ptr` would crash because it's already been released.  Now
the essential profiling information is collectively copied out early
and safely passed to `prof_free_sampled_object()` after the extent is
released.
2019-11-22 16:31:56 -08:00
Yinan Zhang
a8b578d538 Remove mallctl test for zero_realloc 2019-11-05 10:09:18 -08:00
Yinan Zhang
97f93fa0f2 Pull tcache GC events into thread event handler 2019-11-04 16:07:56 -08:00
Yinan Zhang
152c0ef954 Build a general purpose thread event handler 2019-11-04 11:15:50 -08:00
David T. Goldblatt
de81a4eada Add stats counters for number of zero reallocs 2019-10-29 17:48:44 -07:00
David T. Goldblatt
9cfa805947 Realloc: Make behavior of realloc(ptr, 0) configurable. 2019-10-29 17:48:44 -07:00
Yinan Zhang
bd6e28d6a3 Guard slabcur fetching in extent_util 2019-10-28 17:27:51 -07:00
Qi Wang
4094b7c03f Limit # of iters of test_bitmap_xfu.
Otherwise the test is too slow for higher page sizes such as 64k.
2019-10-09 11:15:37 -07:00
Yinan Zhang
beb7c16e94 Guard prof_active reset by opt_prof
Set `prof_active` to read-only when `opt_prof` is turned off.
2019-10-02 11:42:53 -07:00
David T. Goldblatt
820f070c6b Move page quantization to sz module. 2019-09-23 23:06:27 -07:00
David T. Goldblatt
41187bdfb0 Extents: Break extent-struct/arena interactions
Specifically, the extent_arena_[g|s]et functions and the address randomization.

These are the only things that tie the extent struct itself to the arena code.
2019-09-23 23:06:27 -07:00
zhxchen17
4b76c684bb Add "prof.dump_prefix" to override filename prefixes for dumps. 2019-09-12 22:26:03 -07:00
Qi Wang
22bc75ee3e Workaround the stringop-overflow check false positives. 2019-09-09 11:35:04 -07:00
Qi Wang
0043e68d4c Track low_water == -1 case explicitly.
The -1 value of low_water indicates if the cache has been depleted and
refilled.  Track the status explicitly in the tcache struct.

This allows the fast path to check if (cur_ptr > low_water), instead of >=,
which avoids reaching slow path when the last item is allocated.
2019-08-21 16:00:38 -07:00
Qi Wang
937ca1db9f Store ncached_max * ptr_size in tcache_bin_info.
With the cache bin metadata switched to pointers, ncached_max is usually
accessed and multiplied by sizeof(ptr). Store the results in tcache_bin_info for
direct access, and add a helper function for the ncached_max value.
2019-08-19 12:23:24 -07:00
Qi Wang
7599c82d48 Redesign the cache bin metadata for fast path.
Implement the pointer-based metadata for tcache bins --
- 3 pointers are maintained to represent each bin;
- 2 of the pointers are compressed on 64-bit;
- is_full / is_empty done through pointer comparison;

Compared to the previous counter-based design --
- fast-path speed up ~15% in benchmarks
- direct pointer comparison and de-reference
- no need to access tcache_bin_info in common case
2019-08-19 12:21:44 -07:00
Yinan Zhang
ad3f7dbfa0 Buffer prof_log_stop
Make use of the new buffered writer for the output of `prof_log_stop`.
2019-08-12 09:06:01 -07:00
Yinan Zhang
8c8466fa6e Add compact json option for emitter
JSON format is largely meant for machine-to-machine communication, so
add a compact option to the emitter.  According to local testing, the
savings in terms of bytes output is around 50% for stats printing
and around 25% for prof log printing.
2019-08-09 09:53:41 -07:00
Yinan Zhang
7fc6b1b259 Add buffered writer
The buffered writer adopts a signature identical to `write_cb`,
so that it can be plugged into anywhere `write_cb` appears.
2019-08-09 09:44:29 -07:00
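The idea of a writer that shares the callback signature it wraps can be sketched as follows. The signature `void (*)(void *, const char *)` matches the write callback jemalloc documents for `malloc_stats_print()`; everything else here (the struct, the names `bw_sketch_t`, `bw_sketch_cb`, `bw_sketch_flush`, `collect_cb`, and the tiny buffer size) is an assumption for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (write_cb_t)(void *, const char *);

#define BW_SKETCH_CAP 16 /* deliberately tiny so flushing is easy to observe */

/* Hypothetical buffered writer wrapping an underlying write callback. */
typedef struct {
    write_cb_t *write_cb; /* downstream callback */
    void *cbopaque;       /* opaque argument for the downstream callback */
    char buf[BW_SKETCH_CAP];
    size_t len;
} bw_sketch_t;

/* Forward any buffered bytes downstream as one NUL-terminated string. */
static void
bw_sketch_flush(bw_sketch_t *w) {
    if (w->len == 0) {
        return;
    }
    w->buf[w->len] = '\0';
    w->write_cb(w->cbopaque, w->buf);
    w->len = 0;
}

/* Same signature as write_cb_t, so it can be passed wherever one is used. */
static void
bw_sketch_cb(void *arg, const char *s) {
    bw_sketch_t *w = (bw_sketch_t *)arg;
    assert(w != NULL && s != NULL);
    while (*s != '\0') {
        if (w->len == BW_SKETCH_CAP - 1) {
            bw_sketch_flush(w); /* keep one byte free for the terminator */
        }
        w->buf[w->len++] = *s++;
    }
}

/* Example downstream callback: append to a string and count the calls. */
typedef struct {
    char data[128];
    int ncalls;
} sink_t;

static void
collect_cb(void *arg, const char *s) {
    sink_t *sink = (sink_t *)arg;
    strcat(sink->data, s);
    sink->ncalls++;
}
```

Because `bw_sketch_cb` has the callback's own signature, a caller can hand it (with the writer as the opaque argument) to any function that expects a write callback, then call `bw_sketch_flush` once at the end.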
Qi Wang
10fcff6c38 Lower nthreads in test/unit/retained on 32-bit to avoid OOM. 2019-07-25 13:10:03 -07:00
Qi Wang
9a86c65abc Implement retain on Windows.
The VirtualAlloc and VirtualFree APIs are different because MEM_DECOMMIT cannot
be used across multiple VirtualAlloc regions.  To properly support decommit,
only allow merge / split within the same region -- this is done by tracking the
"is_head" state of extents and not merging cross-region.

Add a new state is_head (only relevant for retain && !maps_coalesce), which is
true for the first extent in each VirtualAlloc region.  Determine if two extents
can be merged based on the head state, and use serial numbers for sanity checks.
2019-07-23 22:18:55 -07:00
Qi Wang
f32f23d6cc Fix posix_memalign with input size 0.
Return a valid pointer instead of failing an assertion.
2019-07-18 00:43:23 -07:00
Yinan Zhang
c92ac30601 Add confirm_conf option
If the confirm_conf option is set, when the program starts, each of
the four malloc_conf strings will be printed, and each option will
be printed when being set.
2019-05-22 09:38:39 -07:00
Yinan Zhang
4c63b0e76a Improve memory utilization tests
Added tests for large size classes and expanded the tests to
cover wider range of allocation sizes.
2019-05-21 12:57:06 -07:00
Doron Roberts-Kedes
7fc4f2a32c Add nonfull_slabs to bin_stats_t.
When config_stats is enabled track the size of bin->slabs_nonfull in
the new nonfull_slabs counter in bin_stats_t. This metric should be
useful for establishing an upper ceiling on the savings possible by
meshing.
2019-04-29 13:35:02 -07:00
David Goldblatt
33e1dad680 Safety checks: Add a redzoning feature. 2019-04-15 16:48:12 -07:00
Yinan Zhang
7ee3897740 Separate tests for extent utilization API
As title.
2019-04-10 13:03:20 -07:00
Qi Wang
c2a3a7cd3f Fix test/unit/prof_log
Compiler optimizations may produce more traces than expected.  Instead, verify
only the lower bound.
2019-04-05 13:47:10 -07:00
Yinan Zhang
9aab3f2be0 Add memory utilization analytics to mallctl
The analytics tool is put under experimental.utilization namespace in
mallctl.  Input is one pointer or an array of pointers and the output
is a list of memory utilization statistics.
2019-04-04 13:48:39 -07:00
Qi Wang
6fe11633b0 Fix the binshard unit test.
The test attempts to trigger usage of multiple sharded bins, which percpu_arena
makes less reliable.
2019-04-02 16:53:00 -07:00
Qi Wang
fb56766ca9 Eagerly purge oversized merged extents.
This change improves memory usage slightly, at virtually no CPU cost.
2019-03-14 17:34:55 -07:00
Qi Wang
e3db480f6f Rename huge_threshold to oversize_threshold.
The keyword huge tends to remind people of huge pages, which is not relevant to
the feature.
2019-01-25 13:15:45 -08:00
Qi Wang
d3145014a0 Explicitly use arena 0 in alignment and OOM tests.
This helps us avoid issues with size based routing (i.e. the huge_threshold
feature).
2019-01-24 13:29:23 -08:00
Qi Wang
7a815c1b7c Un-experimental the huge_threshold feature. 2019-01-16 12:28:57 -08:00
Qi Wang
441335d924 Add unit test for producer-consumer pattern. 2018-12-18 15:09:53 -08:00
Qi Wang
711a61f3b4 Add unit test for sharded bins. 2018-12-03 17:17:03 -08:00
Qi Wang
45bb4483ba Add stats for arenas.bin.i.nshards. 2018-12-03 17:17:03 -08:00
Qi Wang
43f3b1ad0c Deprecate OSSpinLock. 2018-11-14 08:44:05 -08:00
Dave Watson
2b112ea593 add test for zero-sized alloc and aligned alloc 2018-10-17 08:50:58 -07:00
gnzlbg
01e2a38e5a Make smallocx symbol name depend on the JEMALLOC_VERSION_GID
This commit concatenates the `JEMALLOC_VERSION_GID` to the
`smallocx` symbol name, such that the symbol ends up exported
as `smallocx_{git_hash}`.
2018-10-17 07:12:28 -07:00
gnzlbg
741fca1bb7 Hide smallocx even when enabled from the library API
The experimental `smallocx` API is not exposed via header files,
requiring the users to peek at `jemalloc`'s source code to manually
add the external declarations to their own programs.

This should reinforce that `smallocx` is experimental, and that `jemalloc`
does not offer any kind of backwards compatibility or ABI guarantees for it.
2018-10-17 07:12:28 -07:00