Profiling
HHVM has several options for profiling your PHP (and Hack!) code, figuring out where it's spending its time, and optimizing it.
You may want to also look into tuning HHVM itself.
HHVM natively supports the xhprof extension. This is good for producing parent-child pairs of function calls. The data collected are inclusive wall and CPU times. There are some libraries available for manipulating this data.
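As a rough sketch of how this is typically wired up (this uses the standard xhprof API; `run_the_request()` and the output path are placeholders, not part of any real setup):

```php
<?php
// Sketch: profile one request with xhprof and dump the raw data.
// XHPROF_FLAGS_CPU adds CPU time to the default inclusive wall time;
// XHPROF_FLAGS_NO_BUILTINS skips builtin functions to reduce overhead.
xhprof_enable(XHPROF_FLAGS_CPU | XHPROF_FLAGS_NO_BUILTINS);

run_the_request(); // placeholder for the code you actually want profiled

// Returns an array keyed by "parent==>child" pairs with inclusive data,
// e.g. ['main()==>foo' => ['ct' => 3, 'wt' => 1234, 'cpu' => 1100], ...]
$data = xhprof_disable();

// Where this goes is up to you: a DB for offline rollups, a file, etc.
file_put_contents('/tmp/xhprof.' . getmypid() . '.json', json_encode($data));
```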
Although the earliest and least fancy of tools in Facebook's historical performance tuning story, this got us an extremely long way. We'd record 1:N requests, throw them into a DB, and have offline scripts (similar to those provided in the library above) aggregate those into 10m rollups per endpoint, then aggregate those into 1h rollups, 1d rollups, etc. The only reason it's not in wide use at FB to this day is that its notion of "function call" doesn't deal well with Hack's async functions.
You can use the Linux `perf` tool to look at HHVM performance. HHVM can spit out metadata for `perf` to use so that its stack traces can look into even JITted code. This is extremely good for profiling HHVM itself (as opposed to the code running on it), as well as looking into complex systems interactions. However, since it's kernel-based, getting any sort of PHP metadata, such as even which endpoint a trace is running, is very difficult.
This works with the standard Linux `perf` tool. For a quick intro, assuming `$PID` is the pid of HHVM:
- To collect data, run `perf record -g --call-graph dwarf -p $PID`. This works best on a running, warmed-up, release version of HHVM. It will write to a file `perf.data` in the current directory, and keep adding samples until you press Control-C. (The `--call-graph dwarf` option in particular can cause this file to balloon very quickly, but we need that option since HHVM is built with `-fomit-frame-pointer` for performance.)
- To view the data, run `perf report -g` from the same directory. This will show a list of the functions where the most CPU time was spent, and you can press enter on any function to look at its callers, and the proportion of the time that came from each caller.
- PHP functions should show up with symbolic names in the backtraces. If they do not, i.e., if you see a bunch of hex addresses instead, there can be a few reasons for this.
  - Make sure there is a file `/tmp/perf-$PID.map`. This is written by HHVM to tell `perf report` what JIT'd code corresponds to what source code. If the file is missing, either you're on a different machine than when you did the `perf record`, or HHVM cleaned up the file. The INI option `hhvm.keep_perf_pid_map=1` will tell HHVM not to clean up this file.
  - If the file is there but `perf report` doesn't seem to be respecting it, some versions of the `perf` tool have a bug where they do the wrong thing with the maps for code JIT'd into the heap (i.e., where HHVM puts it). The version of `perf` that ships with Ubuntu 14.04 (`3.13.11-ckt18`) is known to be broken in this way. You may have to upgrade your version of the tool; to build it yourself, it can be found in the Linux kernel sources under `tools/perf`. Version `4.0.2` is known to work on Ubuntu 14.04; newer versions will probably work too. You can try using http://dl.hhvm.com/resources/perf.gz which should work on Ubuntu 14.04, YMMV on other OSes.
- There's a ton more this tool can do; see the various bits of documentation and examples for the Linux perf tools available online.
Xenon is the newest of the tools. A configuration option controls how often a snapshot of the current execution backtrace is taken (usually every few minutes). PHP code, usually a shutdown function, can then check whether that request had a profile taken and, if so, write it to a DB. In aggregate, these "flashes" of profiling information tell you which functions execution time is most likely to be spent in, i.e., which functions are more expensive.
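A minimal sketch of the shutdown-function side, assuming the Xenon extension is enabled (its snapshot period is an INI setting) and that `xenon_get_data()` is available to read the snapshots; `log_xenon_trace_to_db()` is a hypothetical sink for whatever store your aggregation reads from:

```php
<?php
// Sketch: at the end of each request, see whether Xenon took any
// snapshots of this request and, if so, log them with some metadata.
register_shutdown_function(function () {
  // xenon_get_data() returns the backtrace snapshots Xenon captured for
  // this request; for the vast majority of requests it is empty.
  $samples = xenon_get_data();
  if (!$samples) {
    return;
  }
  $record = array(
    // Arbitrary metadata can be attached here (endpoint, user, etc.).
    'endpoint' => isset($_SERVER['SCRIPT_NAME']) ? $_SERVER['SCRIPT_NAME'] : 'cli',
    'samples'  => $samples,
  );
  // Hypothetical sink: write to whatever store your rollup jobs read.
  log_xenon_trace_to_db($record);
});
```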
There are a bunch of things you can do with this data, especially since it gives a full backtrace, and the PHP shutdown function logging it can provide arbitrary metadata with the trace. Facebook has some fancy visualization tools to walk up and down aggregated traces, which are sadly not open source. Wikimedia also has some tooling built around Xenon, to aggregate its traces into flame graphs.
Because it's a flash profiler, the data collected are all effectively about wall time. You might want to filter out IO-heavy functions.
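For instance, a rough way to drop the IO-wait snapshots before aggregating, assuming each sample carries an `ioWaitSample` flag as in HHVM's Xenon sample format:

```php
<?php
// Sketch: keep only samples taken while actually executing PHP code,
// dropping the ones Xenon flagged as taken during an IO/async wait.
$cpuSamples = array_filter(xenon_get_data(), function ($sample) {
  return empty($sample['ioWaitSample']); // assumed key name for IO-wait samples
});
```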