Doc: Update Performance Analyzer docs
The tool was formerly known as "CPU Usage Analyzer", but can now also be
used to analyze memory usage on devices.

Change-Id: I8e0c2b76be44340e5511c2cbb85efadb5a2f559d
Reviewed-by: Leena Miettinen <riitta-leena.miettinen@qt.io>
Images with only a "before" size (file names not shown): 64 KiB, 137 KiB, 26 KiB, 127 KiB

New binary files:
doc/images/qtcreator-performance-analyzer-flamegraph.png (154 KiB)
doc/images/qtcreator-performance-analyzer-settings.png (28 KiB)
doc/images/qtcreator-performance-analyzer-statistics.png (158 KiB)
doc/images/qtcreator-performance-analyzer-timeline.png (86 KiB)
@@ -1,6 +1,6 @@
 /****************************************************************************
 **
-** Copyright (C) 2017 The Qt Company Ltd.
+** Copyright (C) 2018 The Qt Company Ltd.
 ** Contact: https://www.qt.io/licensing/
 **
 ** This file is part of the Qt Creator documentation.
@@ -40,15 +40,15 @@
 \commercial

 \QC is integrated with the Linux Perf tool (commercial only) that can be
-used to analyze the CPU usage of an application on embedded devices and, to
-a limited extent, on Linux desktop platforms. The CPU Usage Analyzer uses
-the Perf tool bundled with the Linux kernel to take periodic snapshots of
-the call chain of an application and visualizes them in a timeline view or
-as a flame graph.
+used to analyze the CPU and memory usage of an application on embedded
+devices and, to a limited extent, on Linux desktop platforms. The
+Performance Analyzer uses the Perf tool bundled with the Linux kernel to
+take periodic snapshots of the call chain of an application and visualizes
+them in a timeline view or as a flame graph.

-\section1 Using the CPU Usage Analyzer
+\section1 Using the Performance Analyzer

-The CPU Usage Analyzer usually needs to be able to locate debug symbols for
+The Performance Analyzer usually needs to be able to locate debug symbols for
 the binaries involved.

 Profile builds produce optimized binaries with separate debug symbols and
@@ -69,16 +69,16 @@

 \endlist

-You can start the CPU Usage Analyzer in the following ways:
+You can start the Performance Analyzer in the following ways:

 \list
-\li Select \uicontrol Analyze > \uicontrol {CPU Usage Analyzer} to
+\li Select \uicontrol Analyze > \uicontrol {Performance Analyzer} to
 profile the current application.

 \li Select the
 \inlineimage qtcreator-analyze-start-button.png
 (\uicontrol Start) button to start the application from the
-CPU Usage Analyzer.
+Performance Analyzer.

 \endlist

@@ -87,7 +87,7 @@
 (\uicontrol {Collect profile data}) button.

 When you start analyzing an application, the application is launched, and
-the CPU Usage Analyzer immediately begins to collect data. This is indicated
+the Performance Analyzer immediately begins to collect data. This is indicated
 by the time running in the \uicontrol Recorded field. However, as the data
 is passed through the Perf tool and an extra helper program bundled with
 \QC, and both buffer and process it on the fly, data may arrive in \QC
@@ -103,16 +103,28 @@
 Profile data will still be generated, but \QC will discard it until you
 select the button again.

-\section1 Specifying CPU Usage Analyzer Settings
+\section1 Profiling Memory Usage on Devices

-To specify global settings for the CPU Usage Analyzer, select
+To create trace points for profiling memory usage on a target device, select
+\uicontrol Analyze > \uicontrol {Performance Analyzer Options} >
+\uicontrol {Create Memory Trace Points}.
+
+To add events for the trace points, see \l{Choosing Event Types}.
+
+You can record a memory trace to view usage graphs in the samples rows of
+the timeline and to view memory allocations, peaks, and releases in the
+flame graph.
+
+\section1 Specifying Performance Analyzer Settings
+
+To specify global settings for the Performance Analyzer, select
 \uicontrol Tools > \uicontrol Options > \uicontrol Analyzer >
-\uicontrol {CPU Usage Analyzer}. For each run configuration, you can also
+\uicontrol {CPU Usage}. For each run configuration, you can also
 use specialized settings. Select \uicontrol Projects > \uicontrol Run, and
 then select \uicontrol Details next to
-\uicontrol {CPU Usage Analyzer Settings}.
+\uicontrol {Performance Analyzer Settings}.

-\image qtcreator-cpu-usage-analyzer-settings.png
+\image qtcreator-performance-analyzer-settings.png

 To edit the settings for the current run configuration, you can also select
 the dropdown menu next to the \uicontrol {Collect profile data} button.
@@ -120,12 +132,12 @@
 \section2 Choosing Event Types

 In the \uicontrol Events table, you can specify which events should trigger
-the CPU Usage Analyzer to take a sample. The most common way of analyzing
+the Performance Analyzer to take a sample. The most common way of analyzing
 CPU usage involves periodic sampling, driven by hardware performance
 counters that react to the number of instructions or CPU cycles executed.
 Alternatively, a software counter that uses the CPU clock can be chosen.

-Select \uicontrol Add to add events to the table.
+Select \uicontrol {Add Event} to add events to the table.
 In the \uicontrol {Event Type} column, you can choose the general type of
 event to be sampled, most commonly \uicontrol {hardware} or
 \uicontrol {software}. In the \uicontrol {Counter} column, you can choose
@@ -141,7 +153,19 @@
 \uicontrol {L1-dcache} on the \uicontrol {load} operation with a result
 of \uicontrol {misses}. That would sample L1-dcache misses on reading.

-Select \uicontrol Remove to remove the selected event from the table.
+Select \uicontrol {Remove Event} to remove the selected event from the
+table.
+
+Select \uicontrol {Use Trace Points} to replace the current selection of
+events with trace points defined on the target device and set the
+\uicontrol {Sample mode} to \uicontrol {event count} and the
+\uicontrol {Sample period} to \c {1}. If the trace points on the target
+were defined using the \uicontrol {Create Trace Points} option, the
+Performance Analyzer will automatically use them to profile memory usage.
+
+Select \uicontrol {Reset} to revert the selection of events, as well as the
+\uicontrol {Sample mode} and \uicontrol {Sample period} to the default
+values.

 \section2 Choosing a Sampling Mode and Period

@@ -154,7 +178,7 @@
 a sample every \c n times one of the chosen events has occurred,
 where \c n is specified in the \uicontrol {Sample period} field.

-\li Sampling by \uicontrol {frequency} instructs the kernel to try and
+\li Sampling by \uicontrol {frequency (Hz)} instructs the kernel to try and
 take a sample \c n times per second, by automatically adjusting the
 sampling period. Specify \c n in the \uicontrol {Sample period}
 field.
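Reviewer note on the hunk above: the two sampling modes imply different arithmetic for how many samples to expect. The following is only a minimal sketch of the nominal numbers; the function names are hypothetical and are not part of \QC or Perf, and the kernel only tries to honor the requested frequency, so real counts can differ.

```python
def samples_by_event_count(total_events: int, sample_period: int) -> int:
    """Event count mode: one sample every `sample_period` occurrences of the event."""
    if sample_period <= 0:
        raise ValueError("sample_period must be positive")
    return total_events // sample_period


def samples_by_frequency(duration_s: float, frequency_hz: int) -> int:
    """Frequency mode: nominally `frequency_hz` samples per second of run time."""
    if frequency_hz <= 0 or duration_s < 0:
        raise ValueError("frequency must be positive and duration non-negative")
    return int(duration_s * frequency_hz)


if __name__ == "__main__":
    # One million events sampled every 1000 events.
    print(samples_by_event_count(1_000_000, 1000))
    # Two seconds of run time at a requested 250 Hz.
    print(samples_by_frequency(2.0, 250))
```

In event count mode the data volume follows the application's activity, whereas in frequency mode it follows wall-clock time, which is why the latter is easier to budget for a slow connection.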
@@ -168,7 +192,7 @@
 There may be a significant difference between the sampling period you
 request and the actual result.

-In general, if you configure the CPU Usage Analyzer to collect more data
+In general, if you configure the Performance Analyzer to collect more data
 than it can transmit over the connection between the target and the host
 device, the application may get blocked while Perf is trying to send the
 data, and the processing delay may grow excessively. You should then change
@@ -176,25 +200,33 @@

 \section2 Selecting Call Graph Mode

-In the \uicontrol {Call graph mode} field, you can specify how the CPU Usage
-Analyzer recovers call chains from your application.
+In the \uicontrol {Call graph mode} field, you can specify how the
+Performance Analyzer recovers call chains from your application:

-The \uicontrol {Frame Pointer}, or \c fp, mode relies on frame pointers
+\list
+
+\li The \uicontrol {Frame Pointer}, or \c fp, mode relies on frame pointers
 being available in the profiled application and will instruct the kernel on
 the target device to walk the chain of frame pointers in order to retrieve
 a call chain for each sample.

-The \uicontrol {Dwarf} mode works also without frame pointers, but
+\li The \uicontrol {Dwarf} mode works also without frame pointers, but
 generates significantly more data. It takes a snapshot of the current
 application stack each time a sample is triggered and transmits that
 snapshot to the host computer for analysis.

+\li The \uicontrol {Last Branch Record} mode does not use a memory buffer.
+It automatically decodes the last 16 taken branches every time execution
+stops. It is supported only on recent Intel CPUs.
+
+\endlist
+
 Qt and most system libraries are compiled without frame pointers by
 default, so the frame pointer mode is only useful with customized systems.

 \section2 Setting Stack Snapshot Size

-The CPU Usage Analyzer will analyze and \e unwind the stack snapshots
+The Performance Analyzer will analyze and \e unwind the stack snapshots
 generated by Perf in dwarf mode. Set the size of the stack snapshots in the
 \uicontrol {Stack snapshot size} field. Large stack snapshots result in a
 larger volume of data to be transferred and processed. Small stack
@@ -212,7 +244,7 @@
 \section2 Resolving Names for JIT-compiled JavaScript Functions

 Since version 5.6.0, Qt can generate perf.map files with information about
-JavaScript functions. The CPU Usage Analyzer will read them and show the
+JavaScript functions. The Performance Analyzer will read them and show the
 function names in the \uicontrol Timeline, \uicontrol Statistics, and
 \uicontrol {Flame Graph} views. This only works if the process being
 profiled is running on the host computer, not on the target device. To
@@ -225,30 +257,30 @@
 The \uicontrol Timeline view displays a graphical representation of CPU
 usage per thread and a condensed view of all recorded events.

-\image cpu-usage-analyzer.png "CPU Usage Analyzer"
+\image qtcreator-performance-analyzer-timeline.png "Performance Analyzer"

 Each category in the timeline describes a thread in the application. Move
-the cursor on an event (6) on a row to see how long it takes and which
+the cursor on an event (5) on a row to see how long it takes and which
 function in the source it represents. To display the information only when
 an event is selected, disable the
-\uicontrol {View Event Information on Mouseover} button (5).
+\uicontrol {View Event Information on Mouseover} button (4).

-The outline (10) summarizes the period for which data was collected. Drag
-the zoom range (8) or click the outline to move on the outline. You can
+The outline (9) summarizes the period for which data was collected. Drag
+the zoom range (7) or click the outline to move on the outline. You can
 also move between events by selecting the
-\uicontrol {Jump to Previous Event} (1) and \uicontrol {Jump to Next Event}
-(2) buttons.
+\uicontrol {Jump to Previous Event} and \uicontrol {Jump to Next Event}
+buttons (1).

-Select the \uicontrol {Show Zoom Slider} button (3) to open a slider that
-you can use to set the zoom level. You can also drag the zoom handles (9).
+Select the \uicontrol {Show Zoom Slider} button (2) to open a slider that
+you can use to set the zoom level. You can also drag the zoom handles (8).
 To reset the default zoom level, right-click the timeline to open the
 context menu, and select \uicontrol {Reset Zoom}.

 \section2 Selecting Event Ranges

-You can select an event range (7) to view the time it represents or to zoom
+You can select an event range (6) to view the time it represents or to zoom
 into a specific region of the trace. Select the \uicontrol {Select Range}
-button (4) to activate the selection tool. Then click in the timeline to
+button (3) to activate the selection tool. Then click in the timeline to
 specify the beginning of the event range. Drag the selection handle to
 define the end of the range.

@@ -276,10 +308,10 @@
 events to move the cursor in the code editor to the part of the code the
 event is associated with.

-As the Perf tool only provides periodic samples, the CPU Usage Analyzer
+As the Perf tool only provides periodic samples, the Performance Analyzer
 cannot determine the exact time when a function was called or when it
 returned. You can, however, see exactly when a sample was taken in the
-second row of each thread. The CPU Usage Analyzer assumes that if the same
+second row of each thread. The Performance Analyzer assumes that if the same
 function is present at the same place in the call chain in multiple
 consecutive samples, then this represents a single call to the respective
 function. This is, of course, a simplification. Also, there may be other
@@ -318,7 +350,7 @@

 \section1 Viewing Statistics

-\image qtcreator-cpu-usage-analyzer-statistics.png
+\image qtcreator-performance-analyzer-statistics.png

 The \uicontrol Statistics view displays the number of samples each function
 in the timeline was contained in, in total and when on the top of the
@@ -344,12 +376,39 @@

 \section2 Visualizing Statistics as Flame Graphs

-\image qtcreator-cpu-usage-analyzer-flamegraph.png
+\image qtcreator-performance-analyzer-flamegraph.png

 The \uicontrol {Flame Graph} view shows a more concise statistical overview
-of the execution. The horizontal bars show the total number of samples
-taken for a certain function, relative to the total number of samples. The
-nesting shows which functions were called by which other ones.
+of the execution. The horizontal bars show an aspect of the samples
+taken for a certain function, relative to the same aspect of all samples
+together. The nesting shows which functions were called by which other ones.
+
+The \uicontrol {Visualize} button lets you choose what aspect to show in the
+\uicontrol {Flame Graph}.
+
+\list
+
+\li \uicontrol {Samples} is the default visualization. The size of the
+horizontal bars represents the number of samples recorded for the given
+function.
+
+\li In \uicontrol {Peak Usage} mode, the size of the horizontal bars
+represents the amount of memory allocated by the respective functions, at
+the point in time when the allocation's memory usage was at its peak.
+
+\li In \uicontrol {Allocations} mode, the size of the horizontal bars
+represents the number of memory allocations triggered by the respective
+functions.
+
+\li In \uicontrol {Releases} mode, the size of the horizontal bars
+represents the number of memory releases triggered by the respective
+functions.
+
+\endlist
+
+The \uicontrol {Peak Usage}, \uicontrol {Allocations}, and
+\uicontrol {Releases} modes will only show any data if samples from memory
+trace points have been recorded.

 \section2 Interaction between the views

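Reviewer note on the flame graph hunk above: the new wording says each bar shows an aspect of a function's samples relative to the same aspect of all samples. For the default Samples aspect, that relation can be sketched as follows. This is an illustrative assumption about the arithmetic, not the Performance Analyzer's actual implementation; the function name and the input layout (one call chain per sample, outermost function first) are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def flame_graph_widths(samples: Sequence[List[str]]) -> Dict[Tuple[str, ...], float]:
    """Return the relative width of every call path, as a fraction of all samples.

    Each sample is a call chain listed from the outermost caller inwards.
    A bar is identified by its full path, so nesting is preserved.
    """
    total = len(samples)
    if total == 0:
        return {}
    counts: Counter = Counter()
    for chain in samples:
        # A sample counts once for every prefix of its call chain.
        for depth in range(1, len(chain) + 1):
            counts[tuple(chain[:depth])] += 1
    return {path: n / total for path, n in counts.items()}


if __name__ == "__main__":
    recorded = [
        ["main", "a"],
        ["main", "a", "b"],
        ["main", "c"],
        ["main", "a"],
    ]
    for path, width in sorted(flame_graph_widths(recorded).items()):
        print(" > ".join(path), width)
```

For the memory aspects, the same structure would apply with each sample weighted by allocated bytes or by allocation and release counts instead of by one.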
@@ -357,19 +416,20 @@
 \uicontrol {Flame Graph}, or \uicontrol {Statistics} views, information
 about it is displayed in the other two views. To view a time range in the
 \uicontrol {Statistics} and \uicontrol {Flame Graph} views, select
-\uicontrol {Limit Statistics to Selected Range} in the context menu in the
-\uicontrol {Timeline} view.
+\uicontrol Analyze > \uicontrol {Performance Analyzer Options} >
+\uicontrol {Limit to the Range Selected in Timeline}. To show the full
+stack frame, select \uicontrol {Show Full Range}.

 \section1 Loading Perf Data Files

 You can load any \c perf.data files generated by recent versions of the
 Linux Perf tool and view them in \QC. Select \uicontrol Analyze >
-\uicontrol {CPU Usage Analyzer Options} > \uicontrol {Load perf.data} to
+\uicontrol {Performance Analyzer Options} > \uicontrol {Load perf.data} to
 load a file.

 \image qtcreator-cpu-usage-analyzer-load-perf-trace.png

-The CPU Usage Analyzer needs to know the context in which the
+The Performance Analyzer needs to know the context in which the
 data was recorded to find the debug symbols. Therefore, you have to specify
 the kit that the application was built with and the folder where the
 application executable is located.
@@ -377,11 +437,11 @@
 The Perf data files are generated by calling \c {perf record}. Make sure to
 generate call graphs when recording data by starting Perf with the
 \c {--call-graph} option. Also check that the necessary debug symbols are
-available to the CPU Usage Analyzer, either at a standard location
+available to the Performance Analyzer, either at a standard location
 (\c /usr/lib/debug or next to the binaries), or as part of the Qt package
 you are using.

-The CPU Usage Analyzer can read Perf data files generated in either frame
+The Performance Analyzer can read Perf data files generated in either frame
 pointer or dwarf mode. However, to generate the files correctly, numerous
 preconditions have to be met. All system images for the
 \l{http://doc.qt.io/QtForDeviceCreation/qtee-supported-platforms.html}
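Reviewer note on the hunk above: the text asks readers to run \c {perf record} with \c {--call-graph} themselves. A minimal sketch of assembling such a command line is shown below. The helper name, the default output file, and the example program path are assumptions made for illustration; the sketch covers stand-alone use of Perf and is not a description of how \QC invokes it.

```python
import shlex
from typing import List, Optional, Sequence


def perf_record_command(program: Sequence[str],
                        call_graph: str = "dwarf",
                        stack_size: Optional[int] = None,
                        output: str = "perf.data") -> List[str]:
    """Build an argument list for `perf record` with call graphs enabled."""
    if call_graph not in ("fp", "dwarf", "lbr"):
        raise ValueError("call_graph must be 'fp', 'dwarf', or 'lbr'")
    mode = call_graph
    if stack_size is not None:
        # Only dwarf mode takes a stack snapshot size (in bytes).
        if call_graph != "dwarf":
            raise ValueError("stack_size only applies to dwarf mode")
        mode = f"dwarf,{stack_size}"
    return ["perf", "record", "--call-graph", mode, "-o", output, "--", *program]


if __name__ == "__main__":
    # "./myapp" is a placeholder for the application to profile.
    print(shlex.join(perf_record_command(["./myapp"], stack_size=4096)))
```

Returning a list rather than a single string keeps the arguments safe to pass to `subprocess.run` without shell quoting issues.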
@@ -394,15 +454,15 @@
 \section1 Loading and Saving Trace Files

 You can save and load trace data in a format specific to the
-CPU Usage Analyzer with the respective entries in \uicontrol Analyze >
-\uicontrol {CPU Usage Analyzer Options}. This format is self-contained, and
+Performance Analyzer with the respective entries in \uicontrol Analyze >
+\uicontrol {Performance Analyzer Options}. This format is self-contained, and
 therefore loading it does not require you to specify the recording
 environment. You can transfer such trace files to a different computer
 without any tool chain or debug symbols and analyze them there.

 \section1 Troubleshooting

-The CPU Usage Analyzer might fail to record data for the following reasons:
+The Performance Analyzer might fail to record data for the following reasons:

 \list 1
 \li Perf events may be globally disabled on your system. The
@@ -75,10 +75,10 @@
 You can use the Heob heap observer on Windows to detect buffer
 overruns and memory leaks.

-\li \l{Analyzing CPU Usage}{CPU Usage Analyzer}
+\li \l{Analyzing CPU Usage}{Performance Analyzer}

 You can analyze the CPU usage of embedded applications and Linux
-desktop applications with the CPU Usage Analyzer (commercial only)
+desktop applications with the Performance Analyzer (commercial only)
 that integrates the Linux Perf tool.

 \endlist
@@ -41,7 +41,7 @@
 \l{http://qt.io/licensing/}{Qt license}:

 \list
-\li \l{Analyzing CPU Usage}{CPU Usage Analyzer}
+\li \l{Analyzing CPU Usage}{Performance Analyzer}
 \li \l{Browsing ISO 7000 Icons} in \QMLD
 \li \l{http://doc.qt.io/QtForDeviceCreation/index.html}{Developing for
 embedded devices}
@@ -160,7 +160,7 @@

 To generate debug symbols also for applications compiled in release mode,
 select the \uicontrol {Generate separate debug info} check box. For more
-information, see \l{Using the CPU Usage Analyzer}.
+information, see \l{Using the Performance Analyzer}.

 \section3 Compiling QML
