Contest hosted at the FPL'26 conference.
This page describes the suite of benchmark designs used to assess contestant performance. The full list of benchmark designs, along with links to the original sources and some utilization numbers, is provided in the following tables.
The benchmark DCP files are available as a single archive from the v1.1.0 release:
| File | Size |
|---|---|
| fpl26_contest_benchmarks_v1.1.0.tar.gz | ~525 MB |
| fpl26_contest_benchmarks_v1.1.0.md5 | checksum |
After downloading, extract the archive in the repository root:
tar xzf fpl26_contest_benchmarks_v1.1.0.tar.gz
This creates a fpl26_contest_benchmarks/ directory containing all 13 benchmark
DCP files. Alternatively, make setup will download and extract the benchmarks
automatically.
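Before extracting, it may be worth checking the archive against the published checksum. The following is a minimal sketch, assuming the .md5 file uses the common md5sum layout (hex digest first, then the file name); the page does not document that layout, and the helper names `md5_of_file`, `expected_md5`, and `archive_is_intact` are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_md5(checksum_path):
    """Read the expected digest.

    Assumption: the file follows the md5sum layout '<hex digest>  <filename>',
    so the digest is the first whitespace-separated token.
    """
    return Path(checksum_path).read_text().split()[0].lower()


def archive_is_intact(archive_path, checksum_path):
    """True when the archive's MD5 matches the digest in the checksum file."""
    return md5_of_file(archive_path) == expected_md5(checksum_path)
```

Under that assumption, `archive_is_intact("fpl26_contest_benchmarks_v1.1.0.tar.gz", "fpl26_contest_benchmarks_v1.1.0.md5")` should return True for an undamaged download. Reading in chunks keeps memory use flat for the ~525 MB archive.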
| Source | Benchmark Name | LUTs | FFs | DSPs | BRAMs | Fmax (MHz) |
|---|---|---|---|---|---|---|
| AMD | amd_mini-isp | 3k | 4k | 40 | 12 | 307 |
| BOOM | boom_soc | 227k | 98k | 61 | 161 | 48.2 |
| CoreScore | corescore_500_mod | 100k | 120k | 0 | 250 | 344.2 |
| FINN | finn_radioml | 74k | 46k | 0 | 25 | 284.9 |
| ISPD16 | ispd16_example2 | 289k | 234k | 200 | 384 | 107.6 |
| LogicNets | logicnets_jscl (Jet Substructure Classification L) | 31k | 2k | 0 | 0 | 403.6 |
| Rosetta | rosetta_3d-rendering | 14k | 5k | 3 | 0 | 270.9 |
| Rosetta | rosetta_digit-recognition | 23k | 23k | 0 | 161 | 367.0 |
| Rosetta | rosetta_optical-flow | 34k | 37k | 42 | 61 | 324.9 |
| Rosetta | rosetta_spam-filter | 5k | 13k | 224 | 3 | 437.4 |
| VexRiscv | vexriscv_re-place | 2k | 1k | 4 | 6 | 310.2 |
| VexRiscv | vexriscv_re-place_v2 † | 2k | 2k | 4 | 4 | 397.5 |
| VTR | vtr_mcml | 43k | 15k | 105 | 142 | 62.2 |
† vexriscv_re-place_v2 was held back as a hidden benchmark during the alpha
submission round and was not made available to contestants until after that
round closed. It was released publicly as part of the
v1.1.0 benchmark archive
and is included in the suite for the remainder of the contest.
The following table shows the Fmax improvement achieved by the optimization agent
(dcp_optimizer.py) on the 12 benchmarks that were public during the alpha
submission round. The hidden alpha benchmark vexriscv_re-place_v2 is not
covered in this table. All timing is measured on the clk_fpl26contest clock
domain. Each design was given a maximum runtime of 1 hour.
| Benchmark | Initial Fmax (MHz) | Best Fmax (MHz) | Improvement (MHz) | Improvement (%) | WNS Change (ns) | Runtime | Status |
|---|---|---|---|---|---|---|---|
| amd_mini-isp | 307.13 | 375.38 | +68.25 | +22.2% | -1.686 → -1.094 | 619s | Completed |
| boom_soc | 48.24 | 50.77 | +2.53 | +5.2% | -19.162 → -18.126 | 3601s | Timed out |
| corescore_500_mod | 344.23 | 423.37 | +79.14 | +23.0% | -1.238 → -0.695 | 2131s | Completed |
| finn_radioml | 284.90 | 324.46 | +39.56 | +13.9% | -1.910 → -1.482 | 1998s | Completed |
| ispd16_example2 | 107.64 | 107.64 | +0.00 | +0.0% | -7.752 → -7.752 | 3600s | Timed out |
| logicnets_jscl | 403.55 | 434.97 | +31.42 | +7.8% | -0.978 → -0.799 | 825s | Completed |
| rosetta_3d-rendering | 270.93 | 279.25 | +8.32 | +3.1% | -2.153 → -2.043 | 1823s | Completed |
| rosetta_digit-recognition | 366.97 | 390.78 | +23.81 | +6.5% | -1.025 → -0.859 | 1512s | Completed |
| rosetta_optical-flow | 324.89 | 330.14 | +5.26 | +1.6% | -1.078 → -1.029 | 558s | Completed |
| rosetta_spam-filter | 437.45 | 494.07 | +56.63 | +12.9% | -0.686 → -0.424 | 1000s | Completed |
| vexriscv_re-place | 310.17 | 415.28 | +105.11 | +33.9% | -1.654 → -0.838 | 385s | Completed |
| vtr_mcml | 62.25 | 73.05 | +10.80 | +17.4% | -14.527 → -12.152 | 1689s | Completed |
Summary: 10 of 12 designs completed within the 1-hour limit. Averaging the
table rows for the completed designs gives an Fmax improvement of +42.8 MHz
(+14.2%). The best single improvement was vexriscv_re-place at +105.11 MHz
(+33.9%). The two timed-out designs (boom_soc and ispd16_example2) are the
largest benchmarks by LUT count and spent most of their allotted time in
place-and-route operations.
To be released after the contest concludes.
Each benchmark targets the xcvu3p-ffvc1517-2-e device, which has the following resources:
| LUTs | FFs | DSPs | BRAMs |
|---|---|---|---|
| 394k | 788k | 2280 | 720 |
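To put the benchmark sizes in context, the device totals can be turned into utilization percentages. The sketch below is illustrative only: it uses the rounded figures from the tables on this page, so the results are approximate, and the `utilization_percent` helper is a hypothetical name.

```python
# Device capacity for xcvu3p-ffvc1517-2-e, taken from the table above.
# LUT and FF counts are the rounded "k" figures, so results are approximate.
DEVICE_CAPACITY = {"LUTs": 394_000, "FFs": 788_000, "DSPs": 2280, "BRAMs": 720}


def utilization_percent(used):
    """Return each resource's usage as a percentage of device capacity."""
    return {name: 100.0 * used[name] / DEVICE_CAPACITY[name] for name in used}


# ispd16_example2 figures from the benchmark table (rounded values).
ispd16_example2 = {"LUTs": 289_000, "FFs": 234_000, "DSPs": 200, "BRAMs": 384}

for name, pct in utilization_percent(ispd16_example2).items():
    print(f"{name}: {pct:.1f}%")
```

With these rounded inputs, ispd16_example2 occupies roughly 73% of the device's LUTs and 53% of its BRAMs, which fits its position as the largest design in the suite.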
Some designs have multiple clock domains. To make the optimization target unambiguous, each benchmark defines an XDC clock constraint named clk_fpl26contest. All benchmarks in the contest have the following attributes:
- A clock named clk_fpl26contest, which is the target clock domain to optimize.

To retrieve the clock period and WNS for clk_fpl26contest in a Vivado Tcl session:
# Get the clock period
get_property PERIOD [get_clocks clk_fpl26contest]
# Get setup WNS filtered to the contest clock
set tp [get_timing_paths -max_paths 1 -setup -to [get_clocks clk_fpl26contest]]
get_property SLACK $tp
Note that report_timing_summary reports the overall WNS across all clock
domains, which may differ from the WNS on clk_fpl26contest in multi-clock
designs. Tools and scripts should query WNS using
-to [get_clocks clk_fpl26contest] to ensure the Fmax calculation reflects the
correct clock domain. Fmax in MHz is calculated as 1000 / (period - WNS), with
period and WNS in nanoseconds. Because WNS is negative in these benchmarks,
subtracting it lengthens the effective period.
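The Fmax formula can be checked against the results table with a few lines of Python. This is a sketch under one stated assumption: the 1.570 ns clock period used here is not given on this page and is inferred from the amd_mini-isp row (307.13 MHz at a WNS of -1.686 ns). The `fmax_mhz` helper is a hypothetical name.

```python
def fmax_mhz(period_ns, wns_ns):
    """Fmax in MHz from clock period and WNS, both in ns: 1000 / (period - WNS)."""
    achievable_period_ns = period_ns - wns_ns
    if achievable_period_ns <= 0:
        raise ValueError("period - WNS must be positive")
    return 1000.0 / achievable_period_ns


# Assumed period: inferred from the amd_mini-isp results row, not stated
# anywhere on this page.
PERIOD_NS = 1.570

initial = fmax_mhz(PERIOD_NS, -1.686)  # WNS before optimization
best = fmax_mhz(PERIOD_NS, -1.094)     # WNS after optimization
gain = best - initial

print(f"initial {initial:.2f} MHz, best {best:.2f} MHz")
print(f"improvement {gain:+.2f} MHz ({100.0 * gain / initial:+.1f}%)")
```

With that inferred period, the two WNS values reproduce the 307.13 MHz and 375.38 MHz figures reported for amd_mini-isp. In practice the period and WNS would come from the Tcl queries shown earlier.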