Commit Graph

94 Commits

Author SHA1 Message Date
Jason Evans
3a81cbd2d4 Dump heap profile backtraces in a stable order.
Also iterate over per thread stats in a stable order, which prepares the
way for stable ordering of per thread heap profile dumps.
2014-08-19 21:05:54 -07:00
Jason Evans
ab532e9799 Directly embed prof_ctx_t's bt. 2014-08-19 21:05:54 -07:00
Jason Evans
b41ccdb125 Convert prof_tdata_t's bt2cnt to a comprehensive map.
Treat prof_tdata_t's bt2cnt as a comprehensive map of the thread's
extant allocation samples (do not limit the total number of entries).
This helps prepare the way for per thread heap profiling.
2014-08-19 21:05:54 -07:00
Chris Peterson
3e310b34eb Fix -Wsign-compare warnings 2014-06-02 07:51:33 -07:00
Jason Evans
6f001059aa Simplify backtracing.
Simplify backtracing to not ignore any frames, and compensate for this
in pprof in order to increase flexibility with respect to function-based
refactoring even in the presence of non-deterministic inlining.  Modify
pprof to blacklist all jemalloc allocation entry points including
non-standard ones like mallocx(), and ignore all allocator-internal
frames.  Prior to this change, pprof excluded the specifically
blacklisted functions from backtraces, but it left allocator-internal
frames intact.
2014-04-22 20:55:09 -07:00
Lucian Adrian Grijincu
9d4e13f45a prof_backtrace: use unw_backtrace
unw_backtrace:
- does internal per-thread caching
- doesn't acquire an internal lock
2014-04-22 18:39:47 -07:00
Ben Maurer
6c39f9e059 refactor profiling. only use a bytes till next sample variable. 2014-04-16 13:43:30 -07:00
Jason Evans
9b0cbf0850 Remove support for non-prof-promote heap profiling metadata.
Make promotion of sampled small objects to large objects mandatory, so
that profiling metadata can always be stored in the chunk map, rather
than requiring one pointer per small region in each small-region page
run.  In practice the non-prof-promote code was only useful when using
jemalloc to track all objects and report them as leaks at program exit.
However, Valgrind is at least as good a tool for this particular use
case.

Furthermore, the non-prof-promote code is getting in the way of
some optimizations that will make heap profiling much cheaper for the
predominant use case (sampling a small representative proportion of all
allocations).
2014-04-11 14:24:51 -07:00
Harald Weppner
c2da2591be Consistently use debug lib(s) if present
Fixes a situation where nm uses the debug lib but
addr2line does not, which completely messes up the symbol
lookup.
2014-03-28 13:47:59 -07:00
Harald Weppner
bf543df20c Enable profiling / leak detection in FreeBSD
* Assumes procfs is mounted at /proc, cf.
  <http://www.freebsd.org/doc/en/articles/linux-users/procfs.html>
2014-03-17 23:53:00 -07:00
Jason Evans
772163b4f3 Add heap profiling tests.
Fix a regression in prof_dump_ctx() due to an uninitized variable.  This
was caused by revision 4f37ef693e, so no
releases are affected.
2014-01-17 15:40:52 -08:00
Jason Evans
eefdd02e70 Fix a variable prototype/definition mismatch. 2014-01-16 18:04:30 -08:00
Jason Evans
4f37ef693e Refactor prof_dump() to reduce contention.
Refactor prof_dump() to use a two pass algorithm, and prof_leave() prior
to the second pass.  This avoids write(2) system calls while holding
critical prof resources.

Fix prof_dump() to close the dump file descriptor for all relevant error
paths.

Minimize the size of prof-related static buffers when prof is disabled.
This saves roughly 65 KiB of application memory for non-prof builds.

Refactor prof_ctx_init() out of prof_lookup_global().
2014-01-16 13:36:38 -08:00
Jason Evans
fb1775e47e Refactor prof_lookup() by extracting prof_lookup_global(). 2014-01-14 17:04:34 -08:00
Jason Evans
239692b18e Fix whitespace. 2013-10-28 12:41:37 -07:00
Jason Evans
93f39f8d23 Fix a file descriptor leak in a prof_dump_maps() error path.
Reported by Pat Lynch.
2013-10-21 15:07:40 -07:00
Jason Evans
f1c3da8b02 Consistently use malloc_mutex_prefork().
Consistently use malloc_mutex_prefork() instead of malloc_mutex_lock()
in all prefork functions.
2013-10-21 14:59:10 -07:00
Jason Evans
6556e28be1 Prefer not_reached() over assert(false) where appropriate. 2013-10-21 14:56:27 -07:00
Jason Evans
bbe29d374d Fix potential TLS-related memory corruption.
Avoid writing to uninitialized TLS as a side effect of deallocation.
Initializing TLS during deallocation is unsafe because it is possible
that a thread never did any allocation, and that TLS has already been
deallocated by the threads library, resulting in write-after-free
corruption.  These fixes affect prof_tdata and quarantine; all other
uses of TLS are already safe, whether intentionally (as for tcache) or
unintentionally (as for arenas).
2013-01-31 14:23:48 -08:00
Jason Evans
ae03bf6a57 Update hash from MurmurHash2 to MurmurHash3.
Update hash from MurmurHash2 to MurmurHash3, primarily because the
latter generates 128 bits in a single call for no extra cost, which
simplifies integration with cuckoo hashing.
2013-01-22 12:02:08 -08:00
Jason Evans
a3b3386ddd Avoid arena_prof_accum()-related locking when possible.
Refactor arena_prof_accum() and its callers to avoid arena locking when
prof_interval is 0 (as when profiling is disabled).

Reported by Ben Maurer.
2012-11-13 13:47:53 -08:00
Jason Evans
20f1fc95ad Fix fork(2)-related deadlocks.
Add a library constructor for jemalloc that initializes the allocator.
This fixes a race that could occur if threads were created by the main
thread prior to any memory allocation, followed by fork(2), and then
memory allocation in the child process.

Fix the prefork/postfork functions to acquire/release the ctl, prof, and
rtree mutexes.  This fixes various fork() child process deadlocks, but
one possible deadlock remains (intentionally) unaddressed: prof
backtracing can acquire runtime library mutexes, so deadlock is still
possible if heap profiling is enabled during fork().  This deadlock is
known to be a real issue in at least the case of libgcc-based
backtracing.

Reported by tfengjun.
2012-10-09 15:21:46 -07:00
Jason Evans
f278994029 Fix more prof_tdata resurrection corner cases. 2012-04-28 23:27:13 -07:00
Jason Evans
0050a0f7e6 Handle prof_tdata resurrection.
Handle prof_tdata resurrection during thread shutdown, similarly to how
tcache and quarantine handle resurrection.
2012-04-28 18:14:24 -07:00
Jason Evans
95ff6aadca Don't set prof_tdata during thread cleanup.
Don't set prof_tdata during thread cleanup, because doing so will cause
the cleanup function to be called again, the second time with a NULL
argument.
2012-04-28 14:15:28 -07:00
Jason Evans
52386b2dc6 Fix heap profiling bugs.
Fix a potential deadlock that could occur during interval- and
growth-triggered heap profile dumps.

Fix an off-by-one heap profile statistics bug that could be observed in
interval- and growth-triggered heap profiles.

Fix heap profile dump filename sequence numbers (regression during
conversion to malloc_snprintf()).
2012-04-22 16:00:11 -07:00
Jason Evans
0b25fe79aa Update prof defaults to match common usage.
Change the "opt.lg_prof_sample" default from 0 to 19 (1 B to 512 KiB).

Change the "opt.prof_accum" default from true to false.

Add the "opt.prof_final" mallctl, so that "opt.prof_prefix" need not be
abused to disable final profile dumping.
2012-04-17 16:39:33 -07:00
Jason Evans
a1ee7838e1 Rename labels.
Rename labels from FOO to label_foo in order to avoid system macro
definitions, in particular OUT and ERROR on mingw.

Reported by Mike Hommey.
2012-04-10 15:07:44 -07:00
Jason Evans
ae4c7b4b40 Clean up *PAGE* macros.
s/PAGE_SHIFT/LG_PAGE/g and s/PAGE_SIZE/PAGE/g.

Remove remnants of the dynamic-page-shift code.

Rename the "arenas.pagesize" mallctl to "arenas.page".

Remove the "arenas.chunksize" mallctl, which is redundant with
"opt.lg_chunk".
2012-04-02 07:04:34 -07:00
Jason Evans
6da5418ded Remove ephemeral mutexes.
Remove ephemeral mutexes from the prof machinery, and remove
malloc_mutex_destroy().  This simplifies mutex management on systems
that call malloc()/free() inside pthread_mutex_{create,destroy}().

Add atomic_*_u() for operation on unsigned values.

Fix prof_printf() to call malloc_vsnprintf() rather than
malloc_snprintf().
2012-03-23 18:05:51 -07:00
Jason Evans
cd9a1346e9 Implement tsd.
Implement tsd, which is a TLS/TSD abstraction that uses one or both
internally.  Modify bootstrapping such that no tsd's are utilized until
allocation is safe.

Remove malloc_[v]tprintf(), and use malloc_snprintf() instead.

Fix %p argument size handling in malloc_vsnprintf().

Fix a long-standing statistics-related bug in the "thread.arena"
mallctl that could cause crashes due to linked list corruption.
2012-03-23 15:14:55 -07:00
Jason Evans
e24c7af35d Invert NO_TLS to JEMALLOC_TLS. 2012-03-19 10:21:17 -07:00
Jason Evans
2bb6c7a632 s/PRIx64/PRIxPTR/ for uintptr_t printf() argument. 2012-03-12 13:38:55 -07:00
Jason Evans
d81e4bdd5c Implement malloc_vsnprintf().
Implement malloc_vsnprintf() (a subset of vsnprintf(3)) as well as
several other printing functions based on it, so that formatted printing
can be relied upon without concern for inducing a dependency on floating
point runtime support.  Replace malloc_write() calls with
malloc_*printf() where doing so simplifies the code.

Add name mangling for library-private symbols in the data and BSS
sections.  Adjust CONF_HANDLE_*() macros in malloc_conf_init() to expose
all opt_* variable use to cpp so that proper mangling occurs.
2012-03-07 16:19:19 -08:00
Jason Evans
b8c8be7f8a Use UINT64_C() rather than LLU for 64-bit constants. 2012-03-05 12:26:26 -08:00
Jason Evans
84f7cdb0c5 Rename prn to prng.
Rename prn to prng so that Windows doesn't choke when trying to create
a file named prn.h.
2012-03-02 15:59:45 -08:00
Jason Evans
5389146191 Remove the opt.lg_prof_bt_max option.
Remove opt.lg_prof_bt_max, and hard code it to 7.  The original
intention of this option was to enable faster backtracing by limiting
backtrace depth.  However, this makes graphical pprof output very
difficult to interpret.  In practice, decreasing sampling frequency is a
better mechanism for limiting profiling overhead.
2012-02-13 18:41:36 -08:00
Jason Evans
0b526ff94d Remove the opt.lg_prof_tcmax option.
Remove the opt.lg_prof_tcmax option and hard-code a cache size of 1024.
This setting is something that users just shouldn't have to worry about.
If lock contention actually ends up being a problem, the simple solution
available to the user is to reduce sampling frequency.
2012-02-13 18:04:26 -08:00
Jason Evans
7372b15a31 Reduce cpp conditional logic complexity.
Convert configuration-related cpp conditional logic to use static
constant variables, e.g.:

  #ifdef JEMALLOC_DEBUG
    [...]
  #endif

becomes:

  if (config_debug) {
    [...]
  }

The advantage is clearer, more concise code.  The main disadvantage is
that data structures no longer have conditionally defined fields, so
they pay the cost of all fields regardless of whether they are used.  In
practice, this is only a minor concern; config_stats will go away in an
upcoming change, and config_prof is the only other major feature that
depends on more than a few special-purpose fields.
2012-02-10 20:22:09 -08:00
Jason Evans
a9076c9483 Fix a prof-related race condition.
Fix prof_lookup() to artificially raise curobjs for all paths through
the code that creates a new entry in the per thread bt2cnt hash table.
This fixes a race condition that could corrupt memory if prof_accum were
false, and a non-default lg_prof_tcmax were used and/or threads were
destroyed.
2011-08-30 23:40:11 -07:00
Jason Evans
0cdd42eb32 Clean up prof-related comments.
Clean up some prof-related comments to more accurately reflect how the
code works.

Simplify OOM handling code in a couple of prof-related error paths.
2011-08-09 19:06:06 -07:00
Jason Evans
41b954ed36 Use prof_tdata_cleanup() argument.
Use the argument to prof_tdata_cleanup(), rather than calling
PROF_TCACHE_GET().  This fixes a bug in the NO_TLS case.
2011-08-08 17:10:07 -07:00
Jason Evans
f0b22cf932 Use LLU suffix for all 64-bit constants.
Add the LLU suffix for all 0x... 64-bit constants.

Reported by Jakob Blomer.
2011-05-22 10:49:44 -07:00
Jason Evans
7427525c28 Move repo contents in jemalloc/ to top level. 2011-03-31 20:36:17 -07:00