792 Commits

Author SHA1 Message Date
Jason Evans
6fd53da030 Fix prof_tdata_get()-related regressions.
Fix prof_tdata_get() to avoid dereferencing an invalid tdata pointer
(when it's PROF_TDATA_STATE_{REINCARNATED,PURGATORY}).

Fix prof_tdata_get() callers to check for invalid results besides NULL
(PROF_TDATA_STATE_{REINCARNATED,PURGATORY}).

These regressions were caused by 602c8e0971 (Implement per thread heap
profiling.), which did not make it into any releases prior to these
fixes.
2014-09-09 15:29:34 -07:00
Jason Evans
7c17e1670d Fix threaded heap profile bug in pprof.
Fix ReadThreadedHeapProfile to pass the correct parameters to
AdjustSamples.
2014-09-09 15:29:34 -07:00
Jason Evans
a2260c95cd Fix sdallocx() assertion.
Refactor sdallocx() and nallocx() to share inallocx(), and fix an
sdallocx() assertion to check usize rather than size.
2014-09-09 10:39:15 -07:00
Bert Maher
d95e704fea Support threaded heap profiles in pprof
- Add a --thread N option to select profile for thread N (otherwise, all
  threads will be printed)
- The $profile map now has a {threads} element that is a map from thread id to
  a profile that has the same format as the {profile} element
- Refactor ReadHeapProfile into smaller components and use them to implement
  ReadThreadedHeapProfile
2014-09-09 10:01:35 -07:00
Jason Evans
ffe93419d5 Merge pull request #115 from thestinger/isqalloct
fix isqalloct (should call isdalloct)
2014-09-08 20:19:08 -07:00
Daniel Micay
a62812eacc fix isqalloct (should call isdalloct) 2014-09-08 21:46:17 -04:00
Daniel Micay
4cfe55166e Add support for sized deallocation.
This adds a new `sdallocx` function to the external API, allowing the
size to be passed by the caller.  It avoids some extra reads in the
thread cache fast path.  In the case where stats are enabled, this
avoids the work of calculating the size from the pointer.

An assertion validates the size that's passed in, so enabling debugging
will allow users of the API to debug cases where an incorrect size is
passed in.

The performance win for a contrived microbenchmark doing an allocation
and immediately freeing it is ~10%.  It may have a different impact on a
real workload.

Closes #28
2014-09-08 17:34:24 -07:00
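
The idea behind sized deallocation can be sketched with a toy allocator. This is only an illustration under stated assumptions, not jemalloc's implementation: the names toy_alloc, toy_dalloc, toy_sdalloc and the hidden header are hypothetical. The unsized free has to read metadata to learn the size, while the sized free trusts the caller and only validates the size in an assertion, which compiles away when NDEBUG is defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header stored in front of every toy allocation.  The union
 * with max_align_t keeps the user pointer suitably aligned. */
typedef union {
    size_t usize;      /* size recorded at allocation time */
    max_align_t align; /* padding only, never read */
} toy_header_t;

/* Allocate size bytes and remember the size in the hidden header. */
static void *toy_alloc(size_t size) {
    toy_header_t *h = malloc(sizeof(toy_header_t) + size);
    if (h == NULL)
        return NULL;
    h->usize = size;
    return h + 1;
}

/* Unsized free: must load the header to find out how much is released. */
static size_t toy_dalloc(void *ptr) {
    toy_header_t *h = (toy_header_t *)ptr - 1;
    size_t size = h->usize;
    free(h);
    return size;
}

/* Sized free: the caller supplies the size, so the header is read only by
 * the debug assertion that validates the caller's claim. */
static size_t toy_sdalloc(void *ptr, size_t size) {
    toy_header_t *h = (toy_header_t *)ptr - 1;
    assert(h->usize == size);
    free(h);
    return size;
}
```

Both free functions return the number of bytes released, purely so the sketch is easy to check. With the real API, the call takes the form sdallocx(ptr, size, flags).
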
Jason Evans
c3f8650749 Add relevant function attributes to [msn]allocx(). 2014-09-08 16:47:51 -07:00
Jason Evans
a1f3929ffd Thwart optimization of free(malloc(1)) in microbench. 2014-09-08 16:23:48 -07:00
Jason Evans
c54f93f186 Merge pull request #114 from thestinger/timer
avoid conflict with the POSIX timer_t type
2014-09-07 22:41:47 -07:00
Daniel Micay
c3bfe9569a avoid conflict with the POSIX timer_t type
It hits a compilation error with glibc 2.19 without a rename.
2014-09-08 01:20:44 -04:00
Jason Evans
423d78a21b Add microbench tests. 2014-09-07 19:58:04 -07:00
Jason Evans
b67ec3c497 Add a simple timer implementation for use in benchmarking. 2014-09-07 19:57:24 -07:00
Jason Evans
82e88d1ecf Move typedefs from jemalloc_protos.h.in to jemalloc_typedefs.h.in.
Move typedefs from jemalloc_protos.h.in to jemalloc_typedefs.h.in, so
that typedefs aren't redefined when compiling stress tests.
2014-09-07 19:55:03 -07:00
Jason Evans
b718cf77e9 Optimize [nmd]alloc() fast paths.
Optimize [nmd]alloc() fast paths such that the (flags == 0) case is
streamlined, flags decoding only happens to the minimum degree
necessary, and no conditionals are repeated.
2014-09-07 14:40:19 -07:00
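
A minimal sketch of the fast-path idea described above, assuming a hypothetical flag layout (low six bits hold lg(alignment), bit 0x40 requests zeroing). The names toy_decode, toy_opts_t, and the TOY_* constants are illustrative and are not taken from the jemalloc source. The point is that the common flags == 0 case returns before any decoding work happens.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_LG_ALIGN_MASK 0x3f /* hypothetical: low bits hold lg(alignment) */
#define TOY_FLAG_ZERO     0x40 /* hypothetical: request zeroed memory */

typedef struct {
    size_t alignment; /* 0 means "no alignment constraint" */
    bool zero;
} toy_opts_t;

static toy_opts_t toy_decode(int flags) {
    toy_opts_t opts = {0, false};

    /* Fast path: nothing to decode, one branch and out. */
    if (flags == 0)
        return opts;

    /* Slow path: decode each field once. */
    int lg_align = flags & TOY_LG_ALIGN_MASK;
    assert((size_t)lg_align < sizeof(size_t) * CHAR_BIT);
    if (lg_align != 0)
        opts.alignment = (size_t)1 << lg_align;
    opts.zero = (flags & TOY_FLAG_ZERO) != 0;
    return opts;
}
```
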
Jason Evans
c21b05ea09 Whitespace cleanups. 2014-09-04 22:27:26 -07:00
Qinfan Wu
ff6a31d3b9 Refactor chunk map.
Break the chunk map into two separate arrays, in order to improve cache
locality. This is related to issue #23.
2014-09-04 22:22:52 -07:00
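
The cache-locality argument can be illustrated with a toy layout. This is a sketch, not the actual chunk map: the type names, the field contents, and TOY_NPAGES are assumptions made for the example. Splitting one array of combined entries into a small "hot" array and a larger "cold" array means a scan over the hot field touches fewer cache lines.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NPAGES    4   /* hypothetical number of pages per chunk */
#define TOY_ALLOCATED 0x1 /* hypothetical "page is allocated" bit */

/* Before: one array of combined entries; hot and cold data interleaved. */
typedef struct {
    size_t bits;    /* hot: consulted on most alloc/dealloc paths */
    void *link[3];  /* cold: list/tree linkage, rarely touched */
} toy_map_combined_t;

/* After: the same information, split into two parallel arrays. */
typedef struct { size_t bits; } toy_map_bits_t;
typedef struct { void *link[3]; } toy_map_misc_t;

typedef struct {
    toy_map_bits_t bits[TOY_NPAGES]; /* densely packed hot data */
    toy_map_misc_t misc[TOY_NPAGES]; /* cold data kept out of the way */
} toy_chunk_t;

/* Scans only the hot array; the stride is sizeof(toy_map_bits_t) rather
 * than sizeof(toy_map_combined_t). */
static size_t toy_count_allocated(const toy_chunk_t *chunk) {
    assert(chunk != NULL);
    size_t n = 0;
    for (size_t i = 0; i < TOY_NPAGES; i++)
        n += chunk->bits[i].bits & TOY_ALLOCATED;
    return n;
}
```
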
Jason Evans
f34f6037e8 Disable autom4te cache. 2014-09-02 17:49:29 -07:00
Jason Evans
a5a658ab48 Make VERSION generation more robust.
Relax the "are we in a git repo?" check to succeed even if the top level
jemalloc directory is not at the top level of the git repo.

Add git tag filtering so that only version triplets match when
generating VERSION.

Add fallback bogus VERSION creation, so that in the worst case, rather
than generating empty values for e.g. JEMALLOC_VERSION_MAJOR,
configuration ends up generating useless constants.
2014-09-02 15:07:07 -07:00
Jason Evans
3ebf6db2c7 Merge pull request #108 from wqfish/dev
Remove junk filling in tcache_bin_flush_small().
2014-08-27 12:04:01 -07:00
Qinfan Wu
58799f6d1c Remove junk filling in tcache_bin_flush_small().
Junk filling is done in arena_dalloc_bin_locked(), so arena_alloc_junk_small()
is redundant. Also, we should use arena_dalloc_junk_small() instead of
arena_alloc_junk_small().
2014-08-26 21:28:31 -07:00
Sara Golemon
3e24afa28e Test for availability of malloc hooks via autoconf
__*_hook() is glibc, but on at least one glibc platform (homebrew),
the __GLIBC__ define isn't set correctly and we miss being able to
use these hooks.

Do a feature test for it during configuration so that we enable it
anywhere the hooks are actually available.
2014-08-22 15:19:21 -07:00
Jason Evans
602c8e0971 Implement per thread heap profiling.
Rename data structures (prof_thr_cnt_t-->prof_tctx_t,
prof_ctx_t-->prof_gctx_t), and convert to storing a prof_tctx_t for
sampled objects.

Convert PROF_ALLOC_PREP() to prof_alloc_prep(), since precise backtrace
depth within jemalloc functions is no longer an issue (pprof prunes
irrelevant frames).

Implement mallctl's:
- prof.reset implements full sample data reset, and optional change of
  sample interval.
- prof.lg_sample reads the current sample interval (opt.lg_prof_sample
  was the permanent source of truth prior to prof.reset).
- thread.prof.name provides naming capability for threads within heap
  profile dumps.
- thread.prof.active makes it possible to activate/deactivate heap
  profiling for individual threads.

Modify the heap dump files to contain per thread heap profile data.
This change is incompatible with the existing pprof, which will require
enhancements to read and process the enriched data.
2014-08-19 21:31:16 -07:00
Jason Evans
1628e8615e Add rb_empty(). 2014-08-19 21:05:54 -07:00
Jason Evans
3a81cbd2d4 Dump heap profile backtraces in a stable order.
Also iterate over per thread stats in a stable order, which prepares the
way for stable ordering of per thread heap profile dumps.
2014-08-19 21:05:54 -07:00
Jason Evans
ab532e9799 Directly embed prof_ctx_t's bt. 2014-08-19 21:05:54 -07:00
Jason Evans
b41ccdb125 Convert prof_tdata_t's bt2cnt to a comprehensive map.
Treat prof_tdata_t's bt2cnt as a comprehensive map of the thread's
extant allocation samples (do not limit the total number of entries).
This helps prepare the way for per thread heap profiling.
2014-08-19 21:05:54 -07:00
Jason Evans
586c8ede42 Fix arena.<i>.dss mallctl to handle read-only calls. 2014-08-15 12:20:20 -07:00
Jason Evans
070b3c3fbd Fix and refactor runs_dirty-based purging.
Fix runs_dirty-based purging to also purge dirty pages in the spare
chunk.

Refactor runs_dirty manipulation into arena_dirty_{insert,remove}(), and
move the arena->ndirty accounting into those functions.

Remove the u.ql_link field from arena_chunk_map_t, and get rid of the
enclosing union for u.rb_link, since only rb_link remains.

Remove the ndirty field from arena_chunk_t.
2014-08-14 14:45:58 -07:00
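
The shape of this refactoring can be sketched as follows, assuming a simple doubly linked list of dirty runs. All names here (toy_arena_t, toy_run_t, toy_dirty_insert, toy_dirty_remove, toy_purge) are hypothetical stand-ins rather than the real jemalloc structures. The ndirty accounting lives inside the insert/remove helpers, so callers cannot forget to update it, and purging proceeds from the head of the list.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_run_s toy_run_t;
struct toy_run_s {
    size_t npages;          /* dirty pages in this run */
    toy_run_t *prev, *next; /* linkage in the arena's dirty list */
};

typedef struct {
    toy_run_t *head, *tail; /* oldest dirty run first */
    size_t ndirty;          /* total dirty pages on the list */
} toy_arena_t;

/* Append a run and account for its pages in one place. */
static void toy_dirty_insert(toy_arena_t *arena, toy_run_t *run) {
    run->prev = arena->tail;
    run->next = NULL;
    if (arena->tail != NULL)
        arena->tail->next = run;
    else
        arena->head = run;
    arena->tail = run;
    arena->ndirty += run->npages;
}

/* Unlink a run and undo its accounting. */
static void toy_dirty_remove(toy_arena_t *arena, toy_run_t *run) {
    assert(arena->ndirty >= run->npages);
    if (run->prev != NULL)
        run->prev->next = run->next;
    else
        arena->head = run->next;
    if (run->next != NULL)
        run->next->prev = run->prev;
    else
        arena->tail = run->prev;
    run->prev = run->next = NULL;
    arena->ndirty -= run->npages;
}

/* Purge whole runs from the beginning of the list until at least npurge
 * pages have been released; returns the number of pages purged. */
static size_t toy_purge(toy_arena_t *arena, size_t npurge) {
    size_t purged = 0;
    while (arena->head != NULL && purged < npurge) {
        toy_run_t *run = arena->head;
        toy_dirty_remove(arena, run);
        purged += run->npages;
    }
    return purged;
}
```
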
Qinfan Wu
e8a2fd83a2 arena->npurgatory is no longer needed since we drop arena's lock
after stashing all the purgeable runs.
2014-08-12 09:50:01 -07:00
Qinfan Wu
90737fcda1 Remove chunks_dirty tree, nruns_avail and nruns_adjac since we no
longer need to maintain the tree for dirty page purging.
2014-08-12 09:50:00 -07:00
Qinfan Wu
e970800c78 Purge dirty pages from the beginning of the dirty list. 2014-08-12 09:50:00 -07:00
Qinfan Wu
a244e5078e Add dirty page counting for debug 2014-08-12 09:50:00 -07:00
Qinfan Wu
04d60a132b Maintain all the dirty runs in a linked list for each arena 2014-08-12 09:50:00 -07:00
Jason Evans
dd03242da9 Merge pull request #105 from psi-mankoski/dev
Set VERSION also when the source directory is a git submodule using a "....
2014-08-11 17:53:40 -07:00
Psi Mankoski
011dde96c5 Set VERSION also when the source directory is a git submodule using a ".git" file pointing to the repo directory. 2014-08-11 17:08:25 -07:00
Jason Evans
1522937e9c Fix the cactive statistic.
Fix the cactive statistic to decrease (rather than increase) when active
memory decreases.  This regression was introduced by aa5113b1fd (Refactor
overly large/complex functions) and first released in 3.5.0.
2014-08-06 23:43:39 -07:00
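
The intended behavior of such a statistic can be sketched with a toy counter. This is not the code from the fix: the names toy_nactive, toy_cactive, toy_active_update, and the TOY_CHUNK_PAGES constant are hypothetical, and the sketch assumes the counter tracks active pages rounded up to whole chunks. What matters is that the counter moves in the same direction as active memory.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CHUNK_PAGES 4 /* hypothetical number of pages per chunk */

static size_t toy_nactive; /* active pages */
static size_t toy_cactive; /* active pages, rounded up to whole chunks */

/* Round a page count up to a multiple of the chunk size. */
static size_t toy_chunk_ceiling(size_t pages) {
    return (pages + TOY_CHUNK_PAGES - 1) / TOY_CHUNK_PAGES * TOY_CHUNK_PAGES;
}

/* Apply a change in active pages and move toy_cactive in the same
 * direction: up when the rounded total grows, down when it shrinks. */
static void toy_active_update(size_t add_pages, size_t sub_pages) {
    assert(sub_pages <= toy_nactive + add_pages);
    size_t before = toy_chunk_ceiling(toy_nactive);
    toy_nactive = toy_nactive + add_pages - sub_pages;
    size_t after = toy_chunk_ceiling(toy_nactive);
    if (after >= before)
        toy_cactive += after - before;
    else
        toy_cactive -= before - after;
}
```
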
Jason Evans
a2ea54c986 Add atomic operations tests and fix latent bugs. 2014-08-06 23:36:19 -07:00
Jason Evans
7f944aa621 Merge pull request #103 from wqfish/dev
Fix the bug that causes not allocating free run with lowest address.

This fixes a regression due to f9ff60346d,
which was never incorporated into a release.
2014-08-06 17:20:09 -07:00
Qinfan Wu
ea73eb8f3e Reintroduce the comment that was removed in f9ff603. 2014-08-06 16:43:01 -07:00
Qinfan Wu
55c9aa1038 Fix the bug that causes not allocating free run with lowest address. 2014-08-06 16:10:08 -07:00
Jason Evans
095819f011 Merge pull request #102 from mneumann/dfly
Support DragonFlyBSD
2014-08-06 09:14:51 -07:00
Mike Hommey
cf6032d0ef Remove ${srcroot} from cfghdrs_in, cfgoutputs_in and cfghdrs_tup in configure
On Windows, srcroot may start with "drive:", which confuses autoconf's
AC_CONFIG_* macros. The macros work equally well without ${srcroot},
provided some adjustment to Makefile.in.
2014-08-05 16:12:32 -07:00
Jason Evans
d79d59b866 Merge pull request #96 from manuelafm/dev
Please add support for OpenRISC/or1k architecture
2014-08-05 16:02:47 -07:00
Michael Neumann
1aa25a3ca2 Support DragonFlyBSD
Note that in contrast to FreeBSD, DragonFly does not work
with force_lazy_lock enabled.
2014-08-05 03:06:02 +02:00
Manuel A. Fernandez Montecelo
b433d7a87b Update config.{guess,sub} to more recent versions, to add better support for OpenRISC/or1k (among others) 2014-07-29 23:15:26 +01:00
Manuel A. Fernandez Montecelo
ffa259841c Add OpenRISC/or1k LG_QUANTUM size definition 2014-07-29 23:11:26 +01:00
Jason Evans
087ef3bc71 Merge pull request #88 from sstewartgallus/fix-bashisms
Fix unportable == operator in configure scripts
2014-07-07 17:45:34 -07:00
Steven Stewart-Gallus
79230fef31 Fix unportable == operator in configure scripts
This makes the code more portable, so people can use faster shells than
Bash, such as Dash.

To use a faster shell with autoconf, set the CONFIG_SHELL environment
variable to that shell and run the configure script with it.
2014-06-19 16:11:43 -07:00
Mike Hommey
c521df5dcf Allow building with clang-cl 2014-06-12 10:39:39 -07:00