Intel® Advisor is composed of a set of tools to help ensure Fortran, C, C++, OpenCL™, and Data Parallel C++ (DPC++) applications realize full performance potential on modern processors.

Intel Advisor is available as a standalone installation and as part of Intel® oneAPI Base Toolkit.

Intel Advisor enables you to analyze your code from the following perspectives:

  • Vectorization and Code Insights
  • CPU / Memory Roofline Insights
  • Offload Modeling
  • GPU Roofline Insights
  • Threading

This document summarizes typical workflows to get started improving the performance potential of your application with Intel Advisor.

Before You Begin

Default Installation Paths

By default, the Intel® Advisor <install-dir> is as follows:

On Windows* OS:

C:\Program Files (x86)\Intel\oneAPI\advisor\<version> (on certain systems, the directory name is Program Files instead of Program Files (x86))

On Linux* OS and macOS*:

/opt/intel/oneapi/advisor/<version>

Set Up Environment

Note: Set up the environment variables if you plan to use the Intel Advisor command line interface (advisor) or launch the GUI (advisor-gui) from the command line.

On Windows* OS:

To set up the environment for Intel Advisor, run the <install-dir>\env\vars.bat script.

On Linux* OS:

To set up the environment for Intel Advisor, run the following command:

source <install-dir>/env/vars.sh

On macOS*:

To set up the environment for Intel Advisor, run one of the following commands, depending on your shell:

  • source <install-dir>/env/vars.sh
  • source <install-dir>/env/vars.csh
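As a quick sanity check, a POSIX shell session on Linux* or macOS* might look like the sketch below; the latest version directory is an assumption, so substitute your actual <install-dir>:

```shell
# Assumed default install location; adjust to your actual <install-dir>.
INSTALL_DIR=/opt/intel/oneapi/advisor/latest

# Source the environment script if present, then confirm the CLI is on PATH.
if [ -f "${INSTALL_DIR}/env/vars.sh" ]; then
    . "${INSTALL_DIR}/env/vars.sh"
    advisor --version
else
    echo "vars.sh not found under ${INSTALL_DIR}; adjust INSTALL_DIR"
fi
```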

For step-by-step instructions on how to view your Intel Advisor results on a macOS* machine, see the Intel Advisor Cookbook: Analyze Performance Remotely and Visualize Results on a Local macOS* System.

Discover Where Vectorization Will Pay Off The Most

The Vectorization and Code Insights perspective is a vectorization analysis toolset that enables you to identify the loops that will benefit most from vector parallelism. Profile your application using the Survey tool to locate unvectorized and under-vectorized time-consuming functions/loops and to estimate the performance gain achievable through vectorization.

Intel Advisor Workflow: Discover Where Vectorization Will Pay Off the Most

There are two ways to run the Vectorization and Code Insights perspective: from the Intel® Advisor GUI and from CLI. Intel Advisor enables you to open results collected using both methods in the GUI.

Run Vectorization and Code Insights Perspective from Intel® Advisor GUI

In the Analysis Workflow pane, use the drop-down menu to select the Vectorization and Code Insights perspective, set the data collection accuracy level to Low, and click the button to run it. At this accuracy level, Intel® Advisor runs the Survey analysis and collects performance metrics of your application to locate under-vectorized and non-vectorized hotspots whose optimization can improve the total execution time of your application. For details about data collection accuracy presets, see the respective section in the Intel Advisor User Guide. Upon completion, Intel Advisor generates a Summary Report.

Vectorization and Code Insights Summary includes the most important information about your application and gives you hints for further optimization steps. The report includes the following sections:

  • Program Metrics section shows execution time details and provides estimated speed-up data for further optimization of functions/loops that are already vectorized.
  • Per Program Recommendations section provides hints that might help you improve the overall performance of your application.
  • Top Time-Consuming Loops section shows the list of top five loops that take the longest time to execute. These loops might be the best candidates for optimization.
  • Recommendations section shows top five hints recommended as first steps for optimization.

Run Vectorization and Code Insights Perspective from Command Line Interface

To run the Survey analysis and collect performance metrics of your application using the advisor command line interface, run the following command:

advisor --collect=survey --project-dir=./advi -- myApplication

Upon completion, Intel Advisor enables you to open the summary of collected results in the GUI or generate a Survey report in the CLI using the following command:

advisor --report=survey --project-dir=./advi

In the terminal, Intel Advisor displays a Survey report that shows the top ten most time-consuming functions/loops. By default, a copy of this report is saved to <project-dir>/e<NNN>/hs<NNN>/data.0/advisor-survey.txt. For details about generating CLI reports, see the respective section in the Intel Advisor User Guide or use the following command in your terminal:

advisor --help report
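As a convenience, the collect and report commands above can be combined into one small script. The project directory and application name are the placeholders used throughout this section, and advisor is invoked only when the CLI is actually on PATH:

```shell
PROJECT_DIR=./advi
APP=./myApplication   # placeholder; point this at your own binary

COLLECT="advisor --collect=survey --project-dir=${PROJECT_DIR} -- ${APP}"
REPORT="advisor --report=survey --project-dir=${PROJECT_DIR}"
echo "$COLLECT"
echo "$REPORT"

# Run only when the advisor CLI is available (after sourcing env/vars.sh).
if command -v advisor >/dev/null 2>&1; then
    $COLLECT && $REPORT
fi
```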

What's Next

If all loops are vectorizing properly and performance is satisfactory, you are done! Congratulations!

If one or more loops are not vectorizing properly or performance is unsatisfactory:

  1. Improve application performance using various Intel Advisor features to guide your efforts, such as Recommendations, the Dependencies analysis, and the Memory Access Patterns analysis.

  2. Rebuild your modified code.

  3. Run another Survey analysis to verify all loops are vectorizing properly and performance is satisfactory.

See Also

Explore a vectorization use case in the Intel Advisor Cookbook: Analyze Vectorization and Memory Aspects of an MPI Application.

Identify Performance Bottlenecks Using Roofline

The CPU / Memory Roofline Insights perspective enables you to visualize your application's actual performance against hardware-imposed performance ceilings and to determine the main limiting factor (memory bandwidth or compute capacity).

There are two ways to run the CPU / Memory Roofline Insights perspective: from the Intel® Advisor GUI and from CLI. Intel Advisor enables you to open results collected using both methods in the GUI.

Run CPU / Memory Roofline Insights Perspective from Intel® Advisor GUI

In the Analysis Workflow pane, use the drop-down menu to select the CPU / Memory Roofline Insights perspective, set the data collection accuracy level to Low, and click the button to run it. At this accuracy level, Intel Advisor:

  • Measures the hardware limitations of your machine and collects loop/function timings using the Survey analysis.

  • Collects floating-point and integer operations data, and memory data using the Characterization analysis.

For details about data collection accuracy presets, see Intel Advisor User Guide: CPU Roofline Accuracy Presets. Upon completion, Intel Advisor displays a Roofline chart.

The Roofline chart plots an application's achieved performance and arithmetic intensity against the machine's maximum achievable performance:

  • Arithmetic intensity (x axis) - measured in the number of floating-point operations (FLOPs) and/or integer operations (INTOPs), based on the loop/function algorithm, per byte transferred between the CPU/VPU and memory.
  • Performance (y axis) - measured in billions of floating-point operations per second (GFLOPS) and/or billions of integer operations per second (GINTOPS).
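To make the two axes concrete, consider a hypothetical loop that performs 2 floating-point operations per iteration while transferring 16 bytes, over 10^9 iterations in 0.5 seconds of wall-clock time; the numbers below are illustrative, not Intel Advisor output:

```shell
# Hypothetical per-iteration counts for one loop (illustrative values only).
FLOP_PER_ITER=2
BYTES_PER_ITER=16
ITERS=1000000000   # 1e9 iterations
TIME_SEC=0.5       # assumed measured wall-clock time

# x axis: arithmetic intensity in FLOP/byte
AI=$(awk "BEGIN { printf \"%.3f\", ${FLOP_PER_ITER}/${BYTES_PER_ITER} }")

# y axis: performance in GFLOPS
GFLOPS=$(awk "BEGIN { printf \"%.1f\", ${FLOP_PER_ITER}*${ITERS}/${TIME_SEC}/1e9 }")

echo "AI=${AI} FLOP/byte, performance=${GFLOPS} GFLOPS"
```

This hypothetical dot would sit at x = 0.125 FLOP/byte and y = 4.0 GFLOPS on the chart.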

In general:

  • Dots of different color and size represent functions/loops. The size and color of a dot represent execution time for this loop/function in relation to total execution time of the application. Large red dots are profitable to optimize as they take the longest execution time. Small green dots take less time and may be poor candidates for optimization.
  • Diagonal lines indicate memory bandwidth limitations preventing loops/functions from achieving better performance without optimization. For example, the L1 Bandwidth roofline represents the maximum amount of work that can get done at a given arithmetic intensity if the loop always hits L1 cache. A loop does not benefit from L1 cache speed if a dataset causes it to miss L1 cache too often. In this case, it is subject to the limitations of the lower-speed L2 cache it is hitting. So, a dot representing a loop that misses L1 cache too often but hits L2 cache is positioned below the L2 Bandwidth roofline.
  • Horizontal lines indicate compute capacity limitations preventing loops/functions from achieving better performance without optimization. For example, the Scalar Add Peak represents the peak number of add instructions that can be performed by a scalar loop under these circumstances. The Vector Add Peak represents the peak number of add instructions that can be performed under these circumstances by a vectorized loop with the highest instruction set available. So, a dot representing a loop that is not vectorized is positioned somewhere below the Scalar Add Peak roofline.
  • A dot cannot exceed the topmost rooflines, as these represent the maximum capabilities of the machine; however, not all loops can utilize maximum machine capabilities.
  • The greater the distance between a dot and the highest achievable roofline, the more room for optimization a function/loop has.

Run CPU / Memory Roofline Insights Perspective from Command Line Interface

To run CPU / Memory Roofline Insights perspective using advisor command line interface, use the following command:

advisor --collect=roofline --project-dir=./advi --search-dir src:p=./advi -- myApplication

This command runs two analyses in a batch, one after the other:

  1. Survey analysis that collects loops/functions execution time data.
  2. Characterization analysis that collects floating-point and integer operations, memory traffic and mask utilization metrics for AVX-512 platforms to measure arithmetic intensity and performance of your application, and compute capacity of your hardware.

To view the achieved performance of your application against hardware-imposed performance ceilings on an interactive Roofline chart, open the collected results in the Intel Advisor GUI or use the following command to generate an interactive HTML Roofline report:

advisor --report=roofline --report-output=./advi/advisor-roofline.html --project-dir=./advi

Where the report-output option specifies the path to the HTML file into which Intel Advisor saves the generated report.

For details about generating CLI reports, see the respective section in the Intel Advisor User Guide or use the following command in your terminal:

advisor --help report
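The collection step and the HTML report generation can likewise be scripted together; the paths and application name are placeholders, and advisor runs only if it is on PATH:

```shell
PROJECT_DIR=./advi
APP=./myApplication   # placeholder binary

COLLECT="advisor --collect=roofline --project-dir=${PROJECT_DIR} --search-dir src:p=${PROJECT_DIR} -- ${APP}"
REPORT="advisor --report=roofline --report-output=${PROJECT_DIR}/advisor-roofline.html --project-dir=${PROJECT_DIR}"
echo "$COLLECT"
echo "$REPORT"

# Run only when the advisor CLI is available (after sourcing env/vars.sh).
if command -v advisor >/dev/null 2>&1; then
    $COLLECT && $REPORT
fi
```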

What's Next

If one or more loops are not vectorizing properly or performance is unsatisfactory:

  1. Consider working with the most time-consuming function/loop indicated on a Roofline chart.
    • Use the Code Analytics tab to examine the main information for the selected function/loop. Refer to the Roofline pane to identify whether the function/loop is compute or memory bound.
    • Use Recommendations tab to view hints on possible optimization steps for the selected function/loop in the Roofline Guidance section.
  2. If your loop is compute bound:
    • Check the Vectorized Loops/Efficiency values in the Survey Report.
    • Consider running the Dependencies analysis to discover why the compiler assumed a dependency and did not vectorize the selected function/loop.
  3. If your loop is memory bound:
    • Consider running the Memory Access Patterns (MAP) analysis to identify expensive memory instructions.

See Also

Identify High-impact Opportunities to Offload to GPU

Offload Modeling perspective enables you to identify high-impact opportunities to offload to GPU as well as the areas that are not profitable to offload. It provides performance speed-up projection on accelerators along with offload overhead estimation and pinpoints accelerator performance bottlenecks.

There are two ways to run the Offload Modeling perspective: from the Intel® Advisor GUI and from CLI. Intel Advisor enables you to open results collected using both methods in the GUI or in your web browser.

Run Offload Modeling Perspective from Intel® Advisor GUI

In the Analysis Workflow pane, use the drop-down menu to select the Offload Modeling perspective and set the data collection accuracy level to Medium. At this accuracy level, Intel® Advisor:

  • Collects Survey data with basic execution metrics of your application
  • Runs Characterization analysis to get the information about Trip Counts and floating-point operations (FLOPs), simulate cache traffic, and estimate time required to transfer data from one device to another
  • Models application performance on a target device assuming that main hotspots can be executed in parallel

Click the button to run the perspective.

For details about data collection accuracy presets, see Intel Advisor User Guide: Offload Modeling Accuracy Presets.

Upon completion, Intel Advisor displays an Offload Modeling Summary that shows the total potential speed-up of your application, the top five offloaded code regions in your call tree, the top five regions that are not profitable to offload, the number of offloaded functions/loops, and the fraction of offloaded code relative to the total time of the original application.

Tip:

Intel Advisor generates an interactive HTML report stored in the <project-dir>/e<NNN>/pp<NNN>/data.0 directory. You can open the HTML report in your web browser.

Run Offload Modeling Perspective from Command Line Interface

To collect data and model your application performance on a target GPU using the advisor command line interface, do the following:

  1. Run the Survey analysis and collect performance metrics with online stackwalk and static instruction mix on a host device:
    advisor --collect=survey --stackwalk-mode=online --static-instruction-mix --project-dir=./advi -- myApplication
  2. Run the Trip Counts and FLOP analysis, simulate multi-level GPU memory subsystem behavior, and estimate time required for transferring data from host to target:
    advisor --collect=tripcounts --flop --stacks --enable-cache-simulation --data-transfer=light --target-device=gen9_gt2 --project-dir=./advi -- myApplication

    Where:

    • flop collects data about floating-point operations, integer operations, memory traffic, and mask utilization metrics.
    • stacks performs advanced collection of callstack data.
    • enable-cache-simulation models memory subsystem behavior on a target application.
    • data-transfer is set to light mode that models data transfer between host and device memory.
    • target-device specifies a device configuration to use for simulating cache behavior during Trip Counts collection. The following device configurations are available: dg1, gen11_gt2, gen9_gt2, gen9_gt3, gen9_gt4.

  3. Model application performance on a target device:
    advisor --collect=projection --no-assume-dependencies --config=gen9_gt2 --project-dir=./advi

    Where:

    • no-assume-dependencies assumes that a loop does not have dependencies if the dependency type is unknown. This option is recommended for the Medium accuracy mode because the Dependencies analysis is not executed at this level.
    • config sets a device configuration to model your application performance for. By default, this option is set to gen11_gt2. The following device configurations are available: dg1, gen11_gt2, gen9_gt2, gen9_gt3, gen9_gt4.

For details about CLI options, see Intel Advisor User Guide: Command Line Interface.

Upon completion, open the collected results in the Intel Advisor GUI, or open the interactive HTML report stored in the <project-dir>/e<NNN>/pp<NNN>/data.0 directory in your web browser.
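The three collection steps above can be chained in one script; the project directory, application name, and gen9_gt2 target are the placeholders used in the commands above, and each advisor call is skipped when the CLI is not on PATH:

```shell
PROJECT_DIR=./advi
APP=./myApplication   # placeholder binary
TARGET=gen9_gt2       # one of: dg1, gen11_gt2, gen9_gt2, gen9_gt3, gen9_gt4

# Echo each command, then run it only if the advisor CLI is available.
run() { echo "+ $*"; { command -v advisor >/dev/null 2>&1 && "$@"; } || true; }

run advisor --collect=survey --stackwalk-mode=online --static-instruction-mix \
    --project-dir="${PROJECT_DIR}" -- "${APP}"
run advisor --collect=tripcounts --flop --stacks --enable-cache-simulation \
    --data-transfer=light --target-device="${TARGET}" \
    --project-dir="${PROJECT_DIR}" -- "${APP}"
run advisor --collect=projection --no-assume-dependencies --config="${TARGET}" \
    --project-dir="${PROJECT_DIR}"
```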

What's Next

After running the Offload Modeling perspective, identify whether your top hotspots have loop-carried dependencies that might be showstoppers for offloading. To do that:

  1. Rerun Performance Modeling analysis assuming that your main hotspots with unknown dependency types cannot be executed in parallel:
    • In the Intel Advisor GUI, expand the Performance Modeling analysis, make sure to enable the Assume Dependencies checkbox, and click the button to run it.
    • In the CLI, use the following command line:
      advisor --collect=projection --assume-dependencies --config=gen9_gt2 --project-dir=./advi
  2. If the difference between the program metrics for your main hotspots collected with and without the Assume Dependencies option is small (for example, 2x speed-up with Assume Dependencies and 2.2x speed-up without it), you can rely on the collected results. If the difference is big (for example, 2x speed-up with Assume Dependencies and 50x speed-up without it), consider running the Dependencies analysis.

For details about checking for loop-carried dependencies, see the respective section in the Intel Advisor User Guide.
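The comparison rule in step 2 can be written down explicitly; the speed-up values below are the hypothetical numbers from the example, not real Advisor output:

```shell
# Hypothetical projected speed-ups from the two Performance Modeling runs.
SPEEDUP_ASSUME_DEPS=2.0
SPEEDUP_NO_ASSUME=50.0

# Ratio between the optimistic and pessimistic estimates.
RATIO=$(awk "BEGIN { printf \"%.1f\", ${SPEEDUP_NO_ASSUME}/${SPEEDUP_ASSUME_DEPS} }")
echo "estimate ratio: ${RATIO}x"

# A ratio near 1 means the projection is robust; a large gap means the result
# hinges on unknown dependencies, so run the Dependencies analysis first.
if awk "BEGIN { exit !(${RATIO} > 2) }"; then
    echo "large gap: run the Dependencies analysis"
fi
```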

See Also

View useful information about Offload Modeling in the Offload Modeling Resources page.

Explore more ways to run Offload Modeling perspective from command line interface in Intel Advisor User Guide: Run Offload Modeling from Command Line.

Explore typical scenarios of optimizing GPU usage described in the Intel Advisor Cookbook:


Measure GPU Performance Using GPU Roofline

The GPU Roofline Insights perspective enables you to estimate and visualize the actual performance of GPU kernels, using benchmarks and hardware metric profiling, against hardware-imposed performance ceilings, and to determine the main limiting factor.

There are two ways to run GPU Roofline Insights perspective: from the Intel® Advisor GUI and from CLI. Intel Advisor enables you to open results collected using both methods in the GUI.

Run GPU Roofline Insights Perspective from Intel® Advisor GUI

In the Analysis Workflow pane, use a drop-down menu to select the GPU Roofline Insights perspective, set data collection accuracy level to Low, and click the button to run it. At this accuracy level, Intel Advisor:

  • Measures the hardware limitations and collects OpenCL™, OpenMP*, oneAPI Level Zero (Level Zero), and Data Parallel C++ (DPC++) kernel timings and memory data using the Survey analysis with GPU profiling.

  • Collects floating-point and integer operations data using the Trip Counts and FLOP analysis with GPU profiling.

For details about data collection accuracy presets, see Intel Advisor User Guide: GPU Roofline Accuracy Presets. Upon completion, Intel Advisor displays a GPU Roofline Summary. Switch to the GPU Roofline Regions tab to view the Roofline Chart and identify the main factors limiting the performance of your application.

Important:

GPU profiling is applicable only to Intel® Processor Graphics.

A Roofline chart plots an application's achieved performance and arithmetic intensity against the machine's maximum achievable performance:

  • Arithmetic intensity (x axis) - measured in the number of floating-point operations (FLOPs) per byte for the FLOAT Roofline chart and in the number of integer operations (INTOPs) per byte for the INT Roofline chart, based on the kernel algorithm, transferred between the GPU and memory

  • Performance (y axis) - measured in billions of floating-point operations per second (GFLOPS) for the FLOAT Roofline chart and in billions of integer operations per second (GINTOPS) for the INT Roofline chart

In general:

  • The size and color of each dot represent relative execution time for each kernel. Large red dots take the most time, so are the best candidates for optimization. Small green dots take less time, so may not be worth optimizing.

  • Diagonal lines indicate memory bandwidth limitations preventing kernels from achieving better performance without some form of optimization.

    Depending on your system configuration, the following rooflines might be available on the Roofline chart:

    • L3 cache roof: Represents the maximal bandwidth of the L3 cache for your current graphics hardware. Measured using an optimized sequence of load operations, iterating over an array that fits entirely into L3 cache.

    • SLM cache roof: Represents the maximal bandwidth of the Shared Local Memory for your current graphics hardware. Measured using an optimized sequence of load and store operations that work only with SLM.

    • GTI roof: Represents the maximum bandwidth between the GPU and the rest of the SoC. This estimate is calculated via analytical formula based on the maximum frequency of your current graphics hardware.

    • DRAM roof: Represents the maximal bandwidth of the DRAM memory available to your current graphics hardware. Measured using an optimized sequence of load operations, iterating over an array that does not fit in GPU caches.

  • Horizontal lines indicate compute capacity limitations preventing kernels from achieving better performance without some form of optimization.

  • A dot cannot exceed the topmost rooflines, as these represent the maximum capabilities of the machine. However, not all kernels can utilize maximum machine capabilities.

  • The greater the distance between a dot and the highest achievable roofline, the more opportunity exists for performance improvement.

The GPU Roofline chart is based on a CPU Roofline chart layout, but there are some differences:

  • The dots on the chart correspond to OpenCL, OpenMP, Level Zero and DPC++ kernels, while in the CPU version, they correspond to individual loops.

  • Some displayed information and controls (for example, thread/core count) are not relevant to GPU Roofline. For more information, see the table below.

  • The GPU Roofline chart enables you to view the arithmetic intensity of one kernel at multiple memory levels. To do so, double-click a dot representing this kernel or select it and press ENTER. The dots that appear on the Roofline chart correspond to the different memory levels used to calculate arithmetic intensity. Hover over a dot to identify its arithmetic intensity. To show or hide certain dots from a chart, use the Memory Level drop-down filter.

Run GPU Roofline Insights Perspective from Command Line Interface

To run the GPU Roofline Insights perspective using the advisor command line interface, use the following command:

advisor --collect=roofline --profile-gpu --project-dir=./advi --search-dir src:p=./advi -- myApplication

This command runs two analyses in a batch, one after the other:

  1. Survey analysis, which collects loop/function execution time data and measures L3, SLM, and GTI traffic.
  2. Characterization analysis, which collects floating-point and integer operations considering mask utilization, and CARM memory traffic, to measure the arithmetic intensity and performance of your application.

Alternatively, you can run the two analyses separately:

  1. Collect performance metrics for loops/functions of your application using the Survey analysis:
    advisor --collect=survey --profile-gpu --project-dir=./advi --search-dir src:p=./advi -- myApplication
  2. Collect floating-point operations data using the Characterization analysis:
    advisor --collect=tripcounts --no-trip-counts --flop --profile-gpu --project-dir=./advi --search-dir src:p=./advi -- myApplication

    Where:

    • no-trip-counts disables collection of trip counts during the Characterization analysis.
    • flop enables collection of data about floating-point and integer operations, memory traffic, and mask utilization metrics for AVX-512 platforms during the Characterization analysis.

To view the achieved performance of your application against hardware-imposed performance ceilings on an interactive Roofline chart, open the collected results in the Intel Advisor GUI or use the following command to generate an interactive HTML Roofline report:

advisor --report=roofline --profile-gpu --report-output=./advi/advisor-roofline.html --project-dir=./advi

Where the report-output option specifies the path to the HTML file into which Intel Advisor saves the generated report.

By default, Intel Advisor generates a FLOAT Roofline chart. To switch to the INT Roofline chart, add the --data-type=int option to your command.

For details about generating CLI reports, see the respective section in the Intel Advisor User Guide or use the following command in your terminal:
advisor --help report
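As with the CPU Roofline, the batch collection and the HTML report generation can be scripted together; the paths and application name are placeholders, and advisor runs only if it is on PATH:

```shell
PROJECT_DIR=./advi
APP=./myApplication   # placeholder binary

COLLECT="advisor --collect=roofline --profile-gpu --project-dir=${PROJECT_DIR} --search-dir src:p=${PROJECT_DIR} -- ${APP}"
# Append --data-type=int to the report command for the INT Roofline chart.
REPORT="advisor --report=roofline --profile-gpu --report-output=${PROJECT_DIR}/advisor-roofline.html --project-dir=${PROJECT_DIR}"
echo "$COLLECT"
echo "$REPORT"

# Run only when the advisor CLI is available (after sourcing env/vars.sh).
if command -v advisor >/dev/null 2>&1; then
    $COLLECT && $REPORT
fi
```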

What's Next

Use the GPU Roofline Summary to compare performance of your application on a CPU and on a GPU device.

Investigate performance metrics for your kernels and recommendations with possible optimization steps in the GPU Code Analytics pane.

See Also

Explore a use case for optimizing GPU usage described in Intel Advisor Cookbook: Identify Code Regions to Offload to GPU and Visualize GPU Usage.

Prototype Threading Designs

The Threading perspective enables you to identify the best candidates for parallelizing, prototype threading designs, and check whether data dependencies prevent the parallelization of certain functions/loops.

Intel Advisor Typical Workflow: Prototype Threading Designs

There are two ways to run the Threading perspective: from Intel® Advisor GUI and from CLI. Intel Advisor enables you to open results collected using both methods in the GUI.

Run Threading Perspective from Intel® Advisor GUI

To run Threading perspective and improve performance of your application, follow the scenario described below:

  1. In the Analysis Workflow pane, use a drop-down menu to select the Threading perspective, set data collection accuracy to Low, and click the button to run the perspective. At this accuracy level, Intel Advisor collects data about execution time of your functions/loops using Survey analysis. Upon completion, Intel Advisor generates a Survey report.
  2. Filter functions/loops in the Survey report by Total Time. Loops/functions with the longest execution time are the best candidates for parallelizing.
  3. In your source code, insert annotations to mark regions that are profitable to parallelize and re-build your application.

    The main types of Intel Advisor annotations mark the location of:

    • A parallel site. A parallel site is a region of code that contains one or more tasks that may execute in one or more parallel threads to distribute work. An effective parallel site typically contains a hotspot that consumes application execution time. To distribute these frequently executed instructions to different tasks that can run at the same time, the best parallel site is not usually located at the hotspot, but higher in the call tree.

    • One or more parallel tasks within a parallel site. A task is a portion of time-consuming code with data that can be executed in one or more parallel threads to distribute work.

    • Locking synchronization, where mutual exclusion of data access must occur in the parallel application.

  4. In the Intel Advisor GUI, set the data collection accuracy level to Medium and rerun Threading perspective. At this accuracy level, Intel Advisor:
    • Collects data about execution time of your functions/loops using Survey analysis
    • Collects trip counts data using the Characterization analysis
    • Runs Suitability analysis to prototype threading designs for annotated regions and estimate the potential speed-up achieved by parallelizing
    • Collects dependencies data to identify data sharing problems that might be showstoppers for parallelizing

Upon completion, open the Suitability report in the results window.

The Suitability Report predicts maximum speedup based on the inserted annotations and what-if modeling parameters with which you can experiment, such as:

  • Different hardware configurations and parallel frameworks

  • Different trip counts and instance durations

  • Any plans to address parallel overhead, lock contention, or task chunking when you implement your parallel framework code

Use the Refinement report to examine the collected dependencies data and identify functions/loops with data sharing problems.

Run Threading Perspective from Command Line Interface

To run Threading perspective using advisor command line interface, do the following:

  1. Run Survey analysis to collect performance metrics and identify loops/functions with the longest total time:
    advisor --collect=survey --project-dir=./advi --search-dir src:p=./advi -- myApplication
    Note: In the Threading perspective, you should specify the source search directory using the --search-dir option.
  2. Generate a Survey report and filter functions/loops by Total Time:
    advisor --report=survey --filter=total-time --project-dir=./advi

    Loops/functions with the longest execution time are the best candidates for parallelizing.

  3. In your source code, insert annotations for loops/functions that are the best candidates for parallelizing and rebuild your application.
  4. Collect trip counts data:
    advisor --collect=tripcounts --project-dir=./advi --search-dir src:p=./advi -- myApplication
  5. Run Suitability analysis to prototype threading designs for the annotated functions/loops:
    advisor --collect=suitability --project-dir=./advi --search-dir src:p=./advi -- myApplication
  6. Run Dependencies analysis to identify data sharing problems that might prevent functions/loops from parallelizing:
    advisor --collect=dependencies --project-dir=./advi --search-dir src:p=./advi -- myApplication

Open and examine the collected results in the Intel Advisor GUI or generate reports using CLI:

  • To generate Suitability report, use the following command:
    advisor --report=suitability --project-dir=./advi

    By default, Intel Advisor displays the report in the terminal and saves it to <project-dir>/e<NNN>/st<NNN>/advisor-suitability.txt.

  • To generate Dependencies report, use the following command:
    advisor --report=dependencies --project-dir=./advi

    By default, Intel Advisor displays the report in the terminal and saves it to <project-dir>/e<NNN>/dp<NNN>/advisor-dependencies.txt.

For details about generating CLI reports, see the respective section in the Intel Advisor User Guide or use the following command in your terminal:

advisor --help report
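Steps 1 through 6 above can be collected into one script (annotate and rebuild your application between the report and the trip counts steps); the project directory and application name are placeholders, and advisor calls are skipped when the CLI is absent:

```shell
PROJECT_DIR=./advi
APP=./myApplication   # placeholder binary

# Echo each command, then run it only if the advisor CLI is available.
run() { echo "+ $*"; { command -v advisor >/dev/null 2>&1 && "$@"; } || true; }

# 1-2. Survey, then a Survey report filtered by total time.
run advisor --collect=survey --project-dir="${PROJECT_DIR}" --search-dir src:p="${PROJECT_DIR}" -- "${APP}"
run advisor --report=survey --filter=total-time --project-dir="${PROJECT_DIR}"

# 3. Insert annotations and rebuild your application before continuing.

# 4-6. Trip counts, Suitability, and Dependencies analyses.
run advisor --collect=tripcounts --project-dir="${PROJECT_DIR}" --search-dir src:p="${PROJECT_DIR}" -- "${APP}"
run advisor --collect=suitability --project-dir="${PROJECT_DIR}" --search-dir src:p="${PROJECT_DIR}" -- "${APP}"
run advisor --collect=dependencies --project-dir="${PROJECT_DIR}" --search-dir src:p="${PROJECT_DIR}" -- "${APP}"
```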

What's Next

If you decide the predicted maximum speedup benefit is worth the effort to add threading parallelism to your application:

  1. Complete developer/architect design and code reviews about the proposed parallel changes.

  2. Choose one parallel programming framework (threading model) for your application, such as oneTBB, OpenMP*, Microsoft Task Parallel Library* (TPL), or some other parallel framework.

  3. Add the parallel framework to your build environment.

  4. Add parallel framework code to synchronize access to the shared data resources, such as oneTBB or OpenMP locks.

  5. Add parallel framework code to create parallel tasks.

As you add the appropriate parallel code from the chosen parallel framework, you can keep, comment out, or replace the Intel Advisor annotations.

Learn More

Resource

Description

Intel Advisor User Guide

Refer to this guide for instructions to get started with the command line, detailed information on analysis types, information on how to use the GUI, and more.

Vectorization Resources for Intel Advisor Users

View the most useful resources that can help you achieve better performance of your application using vectorization.
Roofline Resources for Intel® Advisor Users

View the most useful resources that can help you identify hardware-imposed ceilings using the Intel Advisor CPU/GPU Roofline perspectives.

Intel Advisor Cookbook

Explore typical use-cases of Intel Advisor. Follow the step-by-step instructions to help effectively use more cores, vectorization, or heterogeneous processing.

Flow Graph Analyzer User Guide

Explore a built-in graphical tool that helps you visualize and analyze graphs of oneAPI Threading Building Blocks (oneTBB), OpenMP*, and Data Parallel C++ (DPC++) applications.

Analyze Performance Remotely and Visualize Results on a Local macOS* System

View step-by-step instructions on how to visualize Intel Advisor perspective results on a macOS* machine.

Intel Advisor Release Notes and New Features

Explore new features of Intel Advisor.

Vectorization Tutorial

Threading Tutorial

Roofline Tutorial for Windows* OS

View tutorials that can help you experiment with Intel Advisor sample applications and run different perspectives.

Offline Resources

One of the key Vectorization perspective features is GUI-embedded advice on how to fix vectorization issues specific to your code. To help you quickly locate information that augments that GUI-embedded advice, Intel Advisor provides offline compiler mini-guides. You can also find offline Recommendations and Compiler Diagnostic Details advice libraries in the same location as the mini-guides. Each issue and recommendation in these HTML files is collapsible/expandable.

Linux* OS: Available offline documentation is installed inside <advisor-install-dir>/documentation/<locale>/.

Windows* OS: Available offline documentation is installed inside <advisor-install-dir>\documentation\<locale>\.

Note:

You may encounter the following known issues when viewing the documentation:

  • Microsoft Windows Server* 2012 system: Trusted site prompt appears. Solution: Add about:internet to the list of trusted sites in the Tools > Internet Options > Security tab. You can remove it after you finish viewing the documentation.

  • Microsoft Internet Explorer* 11 browser: Topics do not appear when you select them in the TOC pane. Solution: Add http://localhost to the list of trusted sites in the Tools > Internet Options > Security tab. You can remove it after you finish viewing the documentation.

  • Microsoft Edge browser:

    • Context-sensitive (also known as F1) calls to a specific topic open the title page of the corresponding document instead. Solution: Use a different default browser.

    • Panes are truncated and a proper style sheet is not applied. Solution: Use a different default browser.

Notices and Disclaimers

Intel technologies may require enabled hardware, software or service activation.

No product or component can be absolutely secure.

Your costs and results may vary.

© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.

Microsoft, Windows, and the Windows logo are trademarks, or registered trademarks of Microsoft Corporation in the United States and/or other countries.

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.