37 Commits

Author SHA1 Message Date
Yinan Zhang
ba783b3a0f Remove prof -> thread_event dependency 2020-03-12 13:55:00 -07:00
Yinan Zhang
441d88d1c7 Rewrite profiling thread event 2020-03-12 13:55:00 -07:00
David Goldblatt
7e6c8a7286 Emap: Standardize naming.
Namespace everything under emap_, always specify what it is we're looking up
(emap_lookup -> emap_edata_lookup), and use "ctx" over "info".
2020-02-17 10:50:51 -08:00
Qi Wang
88d9eca848 Enforce page alignment for sampled allocations.
This allows sampled allocations to be checked through alignment, thereby
enabling sized deallocation regardless of cache_oblivious.
2020-01-31 00:04:22 -08:00
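The alignment check described in the commit above can be sketched as follows. This is a minimal illustration, not the jemalloc implementation: the 4096-byte page size and the names `SKETCH_PAGE` and `maybe_sampled()` are assumptions made for the example. Note that the check is one-directional: a pointer that is not page-aligned cannot be a sampled allocation, while a page-aligned pointer may or may not be one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for this sketch; the real value is platform dependent. */
#define SKETCH_PAGE ((uintptr_t)4096)

/*
 * Hypothetical helper: returns false when ptr cannot be a sampled
 * allocation (it is not page-aligned), true when it might be one.
 */
static bool
maybe_sampled(const void *ptr) {
    assert(ptr != NULL);
    return ((uintptr_t)ptr & (SKETCH_PAGE - 1)) == 0;
}
```

Under these assumptions, a sized deallocation path could skip the profiling lookup whenever `maybe_sampled()` returns false.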
Yinan Zhang
2b604a3016 Record request size in prof recent entries 2020-01-10 12:01:01 -08:00
Yinan Zhang
9a60cf54ec Last-N profiling mode 2019-12-30 15:58:57 -08:00
Yinan Zhang
7a27a05940 Delete tdata states used for cleanup 2019-12-30 15:58:57 -08:00
Yinan Zhang
4afd709d1f Restructure setters for profiling info
Explicitly define three setters:

- `prof_tctx_reset()`: set `prof_tctx` to `1U`, if we don't know in
advance whether the allocation is large or not;
- `prof_tctx_reset_sampled()`: set `prof_tctx` to `1U`, if we already
know in advance that the allocation is large;
- `prof_info_set()`: set a real `prof_tctx`, and also set other
profiling info e.g. the allocation time.

Code structure wise, the prof level is kept as a thin wrapper, the
large level only provides low level setter APIs, and the arena level
carries out the main logic.
2019-12-17 10:01:28 -08:00
Yinan Zhang
45836d7fd3 Pass nstime_t pointer for profiling 2019-12-11 11:38:16 -08:00
Yinan Zhang
055478cca8 Threshold is no longer updated before prof_realloc() 2019-12-10 16:31:05 -08:00
Yinan Zhang
dfdd46f6c1 Refactor prof_tctx_t creation 2019-12-06 09:47:51 -08:00
Yinan Zhang
aa1d71fb7a Rename prof_tctx to alloc_tctx in prof_info_t 2019-12-06 09:47:51 -08:00
Yinan Zhang
5e0b090992 No need to pass usize to prof_tctx_set() 2019-12-06 09:47:51 -08:00
Yinan Zhang
6945371778 Change tsdn to tsd for profiling code path 2019-11-22 16:31:56 -08:00
Yinan Zhang
b55419f9b9 Restructure profiling
Develop new data structure and code logic for holding profiling
related information stored in the extent that may be needed after the
extent is released, which in particular is the case for the
reallocation code path (e.g. in `rallocx()` and `xallocx()`).  The
data structure is a generalization of `prof_tctx_t`: we previously
only copied out the `prof_tctx` before the extent is released, but we
may be in need of additional fields. Currently the only additional
field is the allocation time field, but there may be more fields in
the future.

The restructuring also resolved a bug: `prof_realloc()` mistakenly
passed the new `ptr` to `prof_free_sampled_object()`, but passing in
the `old_ptr` would crash because it's already been released.  Now
the essential profiling information is collectively copied out early
and safely passed to `prof_free_sampled_object()` after the extent is
released.
2019-11-22 16:31:56 -08:00
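The "copy out early, use after release" pattern from the commit above might look roughly like this. It is a sketch under stated assumptions: `info_sketch_t`, `extent_sketch_t`, and `release_after_copy()` are hypothetical names, and `free()` stands in for releasing the extent; none of them are jemalloc APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the profiling info kept in an extent. */
typedef struct {
    void    *alloc_tctx;    /* profiling context of the allocation */
    uint64_t alloc_time_ns; /* allocation time */
} info_sketch_t;

/* Hypothetical stand-in for an extent that owns the profiling info. */
typedef struct {
    info_sketch_t info;
} extent_sketch_t;

/*
 * Copy the profiling info while the extent is still valid, release the
 * extent, and hand the copy back so later code never touches freed memory.
 */
static info_sketch_t
release_after_copy(extent_sketch_t *old) {
    assert(old != NULL);
    info_sketch_t saved = old->info; /* copy out before release */
    free(old);                       /* extent is gone from here on */
    return saved;                    /* safe to use after release */
}
```

The caller then passes the saved copy, rather than the old pointer, to whatever bookkeeping runs after the release.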
Yinan Zhang
73510dfd15 Revert "Fix bug in prof_realloc"
This reverts commit 3b5eecf102.
2019-11-15 15:13:39 -08:00
Yinan Zhang
3b5eecf102 Fix bug in prof_realloc
We should pass in `old_ptr` rather than the new `ptr` to
`prof_free_sampled_object()` when `old_ptr` points to a sampled
allocation.
2019-11-15 13:28:33 -08:00
Qi Wang
da50d8ce87 Refactor and optimize prof sampling initialization.
Makes the prof sample prng use the tsd prng_state.  This allows us to properly
initialize the sample interval event, without having to create tdata.  As a
result, tdata will be created on demand (when a thread reaches the sample
interval bytes allocated), instead of on the first allocation.
2019-11-11 10:35:37 -08:00
Yinan Zhang
152c0ef954 Build a general purpose thread event handler 2019-11-04 11:15:50 -08:00
Yinan Zhang
4fbbc817c1 Simplify time setting and getting for prof log 2019-10-16 09:24:52 -07:00
Yinan Zhang
66e07f986d Suppress tdata creation in reentrancy
This change suppresses tdata initialization and prof sample threshold
update in interrupting malloc calls.  Interrupting calls have no need
for tdata.  Delaying tdata creation aligns better with our lazy tdata
creation principle, and it also helps us gain control back from
interrupting calls more quickly and reduces any risk of delegating
tdata creation to an interrupting call.
2019-10-04 08:52:50 -07:00
Yinan Zhang
93d6151800 Pass tsd down to prof_backtrace() 2019-09-05 10:57:43 -07:00
David Goldblatt
33e1dad680 Safety checks: Add a redzoning feature. 2019-04-15 16:48:12 -07:00
Dave Watson
936bc2aa15 prof: Fix memory regression
The diff 'refactor prof accum...' moved the bytes_until_sample
subtraction before the load of tdata.  If tdata is null,
tdata_get(true) will overwrite bytes_until_sample, but we
still sample the current allocation.   Instead, do the subtraction
and check logic again, to keep the previous behavior.

blame-rev: 0ac524308d
2018-10-23 12:39:57 -07:00
Dave Watson
997d86acc6 restrict bytes_until_sample to int64_t. This allows optimal asm
generation of sub bytes_until_sample, usize; je; for x86 arch.
Subtraction is unconditional, and only flags are checked for the jump,
no extra compare is necessary.  This also reduces register pressure.
2018-10-15 08:24:12 -07:00
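A minimal sketch of the signed countdown described in the commit above: the subtraction is unconditional and the decision uses only the result of that subtraction. The struct `thread_state_sketch_t` and the function `sample_countdown()` are hypothetical, and testing for a negative result is an assumption of this sketch; the exact condition and the instructions a compiler emits may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread state holding the signed countdown. */
typedef struct {
    int64_t bytes_until_sample;
} thread_state_sketch_t;

/*
 * Subtract the allocation size unconditionally, then report whether the
 * counter dropped below zero. With a signed counter no separate compare
 * against the old value is needed.
 */
static bool
sample_countdown(thread_state_sketch_t *ts, size_t usize) {
    assert(ts != NULL);
    ts->bytes_until_sample -= (int64_t)usize;
    return ts->bytes_until_sample < 0;
}
```

A caller would reset `bytes_until_sample` to a fresh interval once the function returns true.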
Dave Watson
0ac524308d refactor prof accum, so that tdata is not loaded if we aren't going to sample. 2018-10-15 08:24:12 -07:00
Dave Watson
9ed3bdc848 move bytes until sample to tsd. Fastpath allocation does not need
to load tdata now, avoiding several branches.
2018-10-15 08:24:12 -07:00
Tyler Etzel
b664bd7935 Add logging for sampled allocations
- prof_opt_log flag starts logging automatically at runtime
- prof_log_{start,stop} mallctl for manual control
2018-08-01 13:27:11 -07:00
Qi Wang
2dccf45640 Control idump and gdump with prof_active. 2018-04-09 16:35:14 -07:00
David Goldblatt
8261e581be Header refactoring: Pull size helpers out of jemalloc module. 2017-05-31 13:08:45 -07:00
David Goldblatt
209f2926b8 Header refactoring: tsd - cleanup and dependency breaking.
This removes the tsd macros (which are used only for tsd_t in real builds).  We
break up the circular dependencies involving tsd.

We also move all tsd access through getters and setters.  This allows us to
assert that we only touch data when tsd is in a valid state.

We simplify the usages of the x macro trick, removing all the customizability
(get/set, init, cleanup), moving the lifetime logic to tsd_init and tsd_cleanup.
This lets us make initialization order independent of order within tsd_t.
2017-05-01 10:49:56 -07:00
Qi Wang
05775a3736 Avoid prof_dump during reentrancy. 2017-04-25 12:54:36 -07:00
David Goldblatt
4d2e4bf5eb Get rid of most of the various inline macros. 2017-04-24 10:33:21 -07:00
Qi Wang
ccfe68a916 Pass alloc_ctx down profiling path.
With this change, when profiling is enabled, we avoid doing redundant rtree
lookups. Also changed dalloc_ctx_t to alloc_ctx_t, as it's now used on
allocation path as well (to speed up profiling).
2017-04-12 13:55:39 -07:00
Jason Evans
5e67fbc367 Push down iealloc() calls.
Call iealloc() as deep into call chains as possible without causing
redundant calls.
2017-03-22 18:33:32 -07:00
Jason Evans
4f341412e5 Remove extent arg from isalloc() and arena_salloc(). 2017-03-22 18:33:32 -07:00
Jason Evans
fa2d64c94b Convert arena->prof_accumbytes synchronization to atomics. 2017-02-16 09:39:46 -08:00