dotTrace, dotMemory, PerfView, PerfMon, Windows Performance Recorder
A decent Benchmark workflow
flowchart LR
A[prepare] ==> B("<em><b>GlobalSetup</b></em>")
B ==> C("<em><b>Warmup</b></em>")
C ==> D[["🔁<em><b>Iteration</b></em>"]]
E1 --erroneous<br/>execution--x X[/"<em>Outlier</em><br/>❌"/]
D ===> F("<em><b>GlobalCleanup</b></em>")
F ==> G((("collect result<br/>📈")))
D -.-> E1(["<em><b>Invocation</b></em><br/>(measure time🕥)"])
E1 -.-> D
Warmup gives the JIT compiler (e.g. RyuJIT) time to compile the code, lets Ngen.exe create native images, or lets Tiered Compilation settle into a steady state.
The number of invocations (the operation count) per iteration is determined by the Pilot stage.
By default, BDN requires the project to be built and run in Release mode.
The BenchmarkDotNet.Diagnostics.Windows package is for Windows OS.
Generated Benchmarks.cs file:
using System;
using BenchmarkDotNet;
using BenchmarkDotNet.Attributes;

namespace HelloBDN
{
    public class Benchmarks
    {
        [Benchmark]
        public void Scenario1()
        {
            // Implement your benchmark here
        }

        [Benchmark]
        public void Scenario2()
        {
            // Implement your benchmark here
        }
    }
}
The [Benchmark] attribute marks a method as a benchmark case. BDN invokes the method many times, collects the execution times, and calculates statistics from them.
Generated Program.cs file:
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Running;

namespace HelloBDN
{
    public class Program
    {
        public static void Main(string[] args)
        {
            var config = DefaultConfig.Instance;
            var summary = BenchmarkRunner.Run<Benchmarks>(config, args);

            // Use this to select benchmarks from the console:
            // var summaries = BenchmarkSwitcher.FromAssembly(typeof(Program).Assembly).Run(args, config);
        }
    }
}
BenchmarkRunner.Run<Benchmarks>(config, args) will run all benchmarks in the Benchmarks class.
If you need to tweak how BDN runs the benchmarks, you can create a custom IConfig class, or build a config object with the fluent API, and pass it to the BenchmarkRunner.Run<T>() method.
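As a minimal sketch of the fluent style, the program below starts from DefaultConfig.Instance and adds a short job and the memory diagnoser. The class name ConfigDemoBenchmarks and its workload are hypothetical, chosen only so the example stands alone; the job and diagnoser are one possible combination, not a recommendation.

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Diagnosers;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;

namespace HelloBDN
{
    // Hypothetical benchmark class, present only to keep the sketch self-contained.
    public class ConfigDemoBenchmarks
    {
        [Benchmark]
        public int SumToThousand()
        {
            var sum = 0;
            for (var i = 1; i <= 1000; i++)
            {
                sum += i;
            }
            return sum;
        }
    }

    public class Program
    {
        public static void Main(string[] args)
        {
            // Start from the defaults, then layer changes on top with the fluent extensions.
            var config = DefaultConfig.Instance
                .AddJob(Job.ShortRun)                   // fewer launches and iterations: faster, less precise
                .AddDiagnoser(MemoryDiagnoser.Default); // adds allocation columns to the report

            BenchmarkRunner.Run<ConfigDemoBenchmarks>(config, args);
        }
    }
}
```

A custom IConfig class achieves the same result; the fluent form is convenient when the tweaks are small and local to one Program.cs.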
Run the benchmark
Inside project folder, run:
dotnet run -c Release
BenchmarkDotNet report
After running the benchmark, BDN will generate a report in the console, and also generate report file(s) in BenchmarkDotNet.Artifacts/results folder.
That completes a quick benchmark demo with BenchmarkDotNet.
BenchmarkRunner generates an isolated project for each set of runtime settings and builds it in Release mode.
Next, we take each method/job/params combination and try to measure its performance by launching benchmark process several times (LaunchCount).
An invocation of the workload method is an operation. A bunch of operations is an iteration. If you have an IterationSetup method, it will be invoked before each iteration, but not between operations. We have the following types of iterations:
Pilot: The best operation count will be chosen.
OverheadWarmup, OverheadWorkload: BenchmarkDotNet overhead will be evaluated.
ActualWarmup: Warmup of the workload method.
ActualWorkload: Actual measurements.
Result = ActualWorkload - <MedianOverhead>
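The formula above can be illustrated with a small arithmetic sketch. This is not BDN's engine code; the class and method names (OverheadMath, Median, SubtractOverhead) are hypothetical, and the sketch only shows the idea of subtracting the median overhead from each workload measurement.

```csharp
using System;
using System.Linq;

// Hypothetical helper that mirrors "Result = ActualWorkload - <MedianOverhead>".
public static class OverheadMath
{
    // Median of a set of per-operation timings (for example, in nanoseconds).
    public static double Median(double[] values)
    {
        var sorted = values.OrderBy(v => v).ToArray();
        var mid = sorted.Length / 2;
        return sorted.Length % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // Subtracts the median overhead from every workload measurement.
    public static double[] SubtractOverhead(double[] actualWorkload, double[] overhead)
    {
        var medianOverhead = Median(overhead);
        return actualWorkload.Select(w => w - medianOverhead).ToArray();
    }
}
```

For example, with made-up overhead timings of 2, 3 and 10 ns the median overhead is 3 ns, so workload measurements of 103 and 105 ns become results of 100 and 102 ns. Using the median rather than the mean keeps a single slow overhead iteration (the 10 ns one here) from distorting the result.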
After all of the measurements, BenchmarkDotNet creates:
An instance of the Summary class that contains all information about benchmark runs.
A set of files that contains summary in human-readable and machine-readable formats.
A set of plots.
Microbenchmark example project
A microbenchmark example that uses BDN to compare the characteristics of five Fibonacci sequence algorithms:
The FibonacciCore project is a class library that contains the five Fibonacci sequence algorithms.
The FibSeqMicroBench project is a console project that contains the BDN benchmark code.
C# code
SequenceLib.cs file in the FibonacciCore project:
using System;
using System.Numerics;

public static class SequenceLib
{
    /// <summary>
    /// Calculate Fibonacci number using loop implementation
    /// </summary>
    /// <param name="n"></param>
    /// <returns></returns>
    public static BigInteger FibonacciUsingLoop(int n)
    {
        if (n <= 1)
        {
            return n;
        }

        var a = new BigInteger(0);
        var b = new BigInteger(1);
        var result = new BigInteger(0);
        for (var i = 2; i <= n; i++)
        {
            result = a + b;
            a = b;
            b = result;
        }

        return result;
    }

    /// <summary>
    /// Calculate Fibonacci number using recursion implementation
    /// </summary>
    /// <param name="n"></param>
    /// <returns></returns>
    public static BigInteger FibonacciUsingRecursion(int n)
    {
        if (n <= 1)
        {
            return n;
        }
        else
        {
            return FibonacciUsingRecursion(n - 1) + FibonacciUsingRecursion(n - 2);
        }
    }

    /// <summary>
    /// Calculate Fibonacci number using Golden Ratio approximation math formula
    /// https://www.wikihow.com/Calculate-the-Fibonacci-Sequence#Using-Binet.27s-Formula-and-the-Golden-Ratio
    /// </summary>
    /// <param name="n"></param>
    /// <returns></returns>
    public static BigInteger FibonacciUsingGoldenRatio(int n)
    {
        if (n <= 1)
        {
            return n;
        }

        // will be inaccurate after 70th Fibonacci number
        // https://stackoverflow.com/questions/41938313/n-th-fibonacci-with-binets-formula-not-accurate-after-70#41938441
        var phi = (1 + Math.Sqrt(5)) / 2;
        var result = (Math.Pow(phi, n) - Math.Pow(1 - phi, n)) / Math.Sqrt(5);
        return new BigInteger(Math.Round(result));
    }

    /// <summary>
    /// Calculate Fibonacci number using Matrix Exponentiation
    /// https://www.nayuki.io/page/fast-fibonacci-algorithms
    /// </summary>
    /// <param name="n"></param>
    /// <returns></returns>
    public static BigInteger FibonacciUsingMatrixExponentiation(int n)
    {
        if (n <= 1)
        {
            return n;
        }

        BigInteger[,] F = { { 1, 1 }, { 1, 0 } };
        Power(F, n - 1);
        return F[0, 0];
    }

    // Raises F to the n-th power in place by repeated squaring.
    private static void Power(BigInteger[,] F, int n)
    {
        if (n <= 1)
        {
            return;
        }

        BigInteger[,] M = { { 1, 1 }, { 1, 0 } };
        Power(F, n / 2);
        Multiply(F, F);
        if (n % 2 != 0)
        {
            Multiply(F, M);
        }
    }

    // Computes F = F * M for 2x2 matrices, storing the product back into F.
    private static void Multiply(BigInteger[,] F, BigInteger[,] M)
    {
        var x = F[0, 0] * M[0, 0] + F[0, 1] * M[1, 0];
        var y = F[0, 0] * M[0, 1] + F[0, 1] * M[1, 1];
        var z = F[1, 0] * M[0, 0] + F[1, 1] * M[1, 0];
        var w = F[1, 0] * M[0, 1] + F[1, 1] * M[1, 1];
        F[0, 0] = x;
        F[0, 1] = y;
        F[1, 0] = z;
        F[1, 1] = w;
    }

    /// <summary>
    /// Calculate Fibonacci number using Fast Doubling ( https://www.nayuki.io/page/fast-fibonacci-algorithms )
    /// </summary>
    /// <param name="n"></param>
    /// <returns></returns>
    public static BigInteger FibonacciUsingFastDoubling(int n)
    {
        var a = BigInteger.Zero;
        var b = BigInteger.One;
        // Walk the bits of n from the most significant to the least significant.
        for (var i = 31; i >= 0; i--)
        {
            var d = a * (b * 2 - a);
            var e = a * a + b * b;
            a = d;
            b = e;
            if ((((uint)n >> i) & 1) != 0)
            {
                var c = a + b;
                a = b;
                b = c;
            }
        }

        return a;
    }
}
FibonacciSeqBenchmarks.cs file in the FibSeqMicroBench project:
using System;
using System.Collections.Generic;
using System.Numerics;
using System.Runtime.CompilerServices;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Mathematics;

[RankColumn(NumeralSystem.Roman)]
public class FibonacciSeqBenchmarks
{
    [ParamsSource(nameof(NthValues))]
    public int Nth { get; set; }

    public static IEnumerable<int> NthValues => ActualFibonacci.Keys;

    // The recursion limit can be overridden through an environment variable.
    private static int RecursionUpperLimit =>
        int.TryParse(Environment.GetEnvironmentVariable(Const.RecursionUpperLimit), out var limit)
            ? limit
            : Const.RecursionUpperLimitValue;

    [Benchmark(Baseline = true), BenchmarkCategory("simple", "canonical")]
    public BigInteger FibSeqUsingLoop()
    {
        var result = FibonacciCore.SequenceLib.FibonacciUsingLoop(Nth);
        ValidateCorrectness(Nth, result);
        return result;
    }

    [Benchmark, BenchmarkCategory("simple", "slow")]
    public BigInteger FibSeqUsingRecursion()
    {
        // Throwing abandons this benchmark case instead of letting it run for too long.
        if (Nth > RecursionUpperLimit)
        {
            throw new NotSupportedException($"Recursion will run too long for {Nth}th over {RecursionUpperLimit}th");
        }

        var result = FibonacciCore.SequenceLib.FibonacciUsingRecursion(Nth);
        ValidateCorrectness(Nth, result);
        return result;
    }

    [Benchmark, BenchmarkCategory("math", "approximate")]
    public BigInteger FibSeqUsingGoldenRatio()
    {
        var result = FibonacciCore.SequenceLib.FibonacciUsingGoldenRatio(Nth);
        ValidateCorrectness(Nth, result);
        return result;
    }

    [Benchmark, BenchmarkCategory("math", "fast")]
    public BigInteger FibSeqUsingMatrixExponentiation()
    {
        var result = FibonacciCore.SequenceLib.FibonacciUsingMatrixExponentiation(Nth);
        ValidateCorrectness(Nth, result);
        return result;
    }

    [Benchmark, BenchmarkCategory("math", "faster")]
    public BigInteger FibSeqUsingFastDoubling()
    {
        var result = FibonacciCore.SequenceLib.FibonacciUsingFastDoubling(Nth);
        ValidateCorrectness(Nth, result);
        return result;
    }

    #region Check Fibonacci correctness

    // see https://r-knott.surrey.ac.uk/Fibonacci/fibtable.html for precomputed Fibonacci series
    private static IReadOnlyDictionary<int, BigInteger> ActualFibonacci => new Dictionary<int, BigInteger>()
    {
        { 1, 1 },
        /* too long so omit it ... */
        { 300, BigInteger.Parse("222232244629420445529739893461909967206666939096499764990979600") }
    };

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void ValidateCorrectness(int Nth, BigInteger result)
    {
        if (ActualFibonacci[Nth] != result)
        {
            throw new ArithmeticException(
                $"Fibonacci calculation failed, actual {Nth}th is '{ActualFibonacci[Nth]}', but calculated is '{result}'");
        }
    }

    #endregion
}
At first you may try to run the benchmark with the dotnet run -c Release command directly inside the FibSeqMicroBench folder, but you will get an error message like this:
This happens because BDN by default requires an optimized build and, by design of the .NET Core SDK, if you run the command in the project folder without specifying the solution file, the dependency project is not built in optimized mode.
So you need to execute the dotnet run command in the solution folder.
(See readme file in the github project root folder)
--list tree or --list flat : list all benchmark cases.
--help : show help message.
Run the microbenchmark
Finish microbenchmark
Microbenchmark result
Conclusion
BenchmarkDotNet is a powerful tool for measuring the performance of your code.
It is easy to write benchmarks with BDN.
You can abandon a benchmark case by throwing an exception in the benchmark method.
Use environment variables to control how the benchmarks run.
Math theory vs. real-world computer architecture.
Use BDN on various C# applications
Web (Web API)
For a basic ASP.NET Core Web API / minimal API project, we can create the WebApplication instance and associated HttpClient in the [GlobalSetup] method, and dispose of them in the [GlobalCleanup] method.
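The setup described above could look roughly like the following sketch. The endpoint /hello, the port 5099 and the class name MinimalApiBenchmarks are hypothetical, and the sketch assumes the benchmark project references the ASP.NET Core shared framework (for example through the Microsoft.NET.Sdk.Web SDK).

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

public class MinimalApiBenchmarks
{
    // Hypothetical address, chosen only for this sketch.
    private const string BaseUrl = "http://localhost:5099";

    private WebApplication _app = null!;
    private HttpClient _client = null!;

    [GlobalSetup]
    public void StartServer()
    {
        var builder = WebApplication.CreateBuilder();
        builder.Logging.ClearProviders(); // keep per-request console logging out of the measurement

        _app = builder.Build();
        _app.MapGet("/hello", () => "Hello");
        _app.Urls.Add(BaseUrl);
        _app.Start(); // starts the server without blocking the benchmark process

        _client = new HttpClient { BaseAddress = new Uri(BaseUrl) };
    }

    // Each operation is one full HTTP round trip to the in-process server.
    [Benchmark]
    public Task<string> GetHello() => _client.GetStringAsync("/hello");

    [GlobalCleanup]
    public void StopServer()
    {
        _client.Dispose();
        _app.StopAsync().GetAwaiter().GetResult();
        _app.DisposeAsync().AsTask().GetAwaiter().GetResult();
    }
}
```

Because the server and the client live in the same process, the measured time includes both the HTTP client and the endpoint pipeline; that is usually acceptable for comparing variants of the same endpoint, less so for absolute latency numbers.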
flowchart LR
B(["<em><b>Benchmark program</b></em><br/>(measure time🕥)"]) -...-> W[["Web API endpoint"]]
W -...-> B
B(["<em><b>Benchmark program</b></em><br/>(measure time🕥)"]) -...-> g[["gRPC Service"]]
g -...-> B
B(["<em><b>Benchmark program</b></em><br/>(measure time🕥)"]) -...-> O[["MS Orleans Silo"]]
O -...-> B
When designing test data and benchmark method signatures, be aware of this C# compiler limitation:
CSC : error CS8103: Combined length of user strings used by the program exceeds allowed limit.
Caution
The workaround is to reduce the use of primitive string parameters in benchmark method signatures.
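One way to apply this workaround is to wrap the payload in a small class and pass that class as the benchmark argument. The sketch below assumes (this is my understanding of BDN's code generation, not something confirmed by the source) that string arguments are embedded as literals in the generated benchmark project while other object arguments are fetched from the arguments source at run time. The RequestMsg shape, the labels, and the PayloadBenchmarks class are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using BenchmarkDotNet.Attributes;

// Hypothetical wrapper: carries a large payload without exposing it as a string parameter.
public sealed class RequestMsg
{
    public RequestMsg(string label, int sizeInChars)
    {
        Label = label;
        Message = new string('x', sizeInChars); // built at run time, not a string literal
    }

    public string Label { get; }
    public string Message { get; }

    // BDN uses ToString() for the parameter column, so keep it short.
    public override string ToString() => Label;
}

public class PayloadBenchmarks
{
    public IEnumerable<RequestMsg> GetTestData()
    {
        yield return new RequestMsg("1K", 1024);
        yield return new RequestMsg("1M", 1024 * 1024);
    }

    // The signature takes the wrapper, not the raw string.
    [Benchmark]
    [ArgumentsSource(nameof(GetTestData))]
    public int MessageLength(RequestMsg request) => request.Message.Length;
}
```

Overriding ToString() also keeps the summary table readable, since the report shows a short label rather than the whole payload.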
Web (deal with ASP.NET Core Dependency Injection)
Use the [GlobalSetup] / [GlobalCleanup] / [IterationSetup] / [IterationCleanup] attributes to manually set up and clean up the DI container.
private ServiceProvider _serviceProvider = null!;
private Echo.EchoClient _client = null!;

[GlobalSetup]
public void PrepareClient()
{
    var serviceCollection = new ServiceCollection();
    serviceCollection.AddGrpcClient<Echo.EchoClient>(options =>
    {
        options.Address = new Uri("https://localhost:7228");
        options.ChannelOptionsActions.Add(channelOptions =>
        {
            // you need to raise the message size limit to send large messages
            // see https://github.com/grpc/grpc-dotnet/issues/2277#issuecomment-1728559455
            channelOptions.MaxSendMessageSize = int.MaxValue;
            channelOptions.MaxReceiveMessageSize = int.MaxValue;
        });
    });
    _serviceProvider = serviceCollection.BuildServiceProvider();
}

[IterationSetup]
public void InitGrpcClient()
{
    _client = _serviceProvider.GetRequiredService<Echo.EchoClient>();
}

[Benchmark]
[ArgumentsSource(nameof(GetTestData))]
public async Task<string> gRPC_Invoke(RequestMsg request)
{
    var reply = await _client.EchoAsync(new EchoRequest { Message = request.Message });
    return reply.Message;
}

[IterationCleanup]
public void CleanupClient()
{
    _client = null!;
}

[GlobalCleanup]
public void CleanupServiceProvider()
{
    _serviceProvider.Dispose();
}
On Windows you need to install Visual Studio with the proper C++ workload via ./eng/scripts/InstallVisualStudio.ps1
Run restore.cmd on Windows or restore.sh on Linux/macOS at the root of the git repository.
Go to the ./src/Components folder and run build.cmd on Windows or build.sh on Linux/macOS (Note: if the build reports NPM/Webpack failures, just ignore them)
Go to ./src/Components/Components/perf and run dotnet run -c Release -- -f '*' to run the benchmarks.
A “Client-Server” architecture that can accept a BDN project, then build and run it as a microbenchmark:
Write a .yml file that specifies the .NET SDK, the runtime version, and how to run the benchmark project, then submit the project folder to a remote machine so that the “Crank-Agent” server builds and runs the benchmark:
The Crank-Controller side can use the --json argument to export the raw benchmark data from the Crank-Agent as a JSON file on the controller side, then use the compare command option to merge the results:
Its purpose is to compare the object-to-JSON-string serialization performance of different libraries:
It helps when you need to compare the performance of different versions of the same library, by using the WithNuGet() API of BDN:
or when different libraries consume the same NuGet package (e.g. JSON.NET) in different versions and you want to write benchmarks to compare them:
After cloning the git repository, open the solution file (Sample Applications/WPFGallery/WPFGallery.sln) in Visual Studio 2022 with the .NET 9 Desktop Development workload installed.
Then run the WPF Gallery Preview app in Visual Studio 2022 once to generate the executable file (WPFGallery.exe) inside the Sample Applications-windows folder:
Run PerfView with administrative rights on Windows and select the [Collect]/[Collect (Alt+C)] menu entry. In the dialog that opens, set “Current Dir:” to the folder that will store the collected *.etl.zip file, add E13B77A8-14B6-11DE-8069-001B212B5009:2:* in “Additional Providers”, make sure “.NET Symbol Collection” is checked, then press the Start Collection button:
Run WPFGallery.exe and go to the Icons page to record its performance, then go back to PerfView and press the Stop Collection button.
Wait for PerfView to generate the collection file (PerfViewData.etl.zip). It appears in the tree view on the left of the PerfView main window and can be expanded to see the performance data:
Double-click “CPU Stacks”, then select the WPFGallery process:
In the new window, select all entries in the “By Name?” tab, then right-click to open the context menu and select Lookup Symbols
After the symbol lookup is done, type IconsPageViewModel in the Find input box at the top to search for the ViewModel class entries in the list. PerfView switches automatically to the Call Tree view, where you can select the appropriate view style in the GroupPats: list:
PerfView Features - Export flame graph
Use “Flame Graph” to get a visual overview of the performance data.
Hover the mouse pointer over a block to see the method name.
Zoom in/out with the mouse wheel.
Left-click and drag to move the viewport.
Right-click and select “Save Flame Graph” to save the current view as a .png file