Contest hosted at the FPL'26 conference.
This page describes the suite of benchmark designs used to assess contestant performance. The tables below list every benchmark design, with links to the original sources and approximate resource utilization numbers.
The benchmark DCP files are available as a single archive from the v1.0.0 release:
| File | Size |
|---|---|
| fpl26_contest_benchmarks_v1.0.0.tar.gz | ~511 MB |
| fpl26_contest_benchmarks_v1.0.0.md5 | checksum |
After downloading, extract the archive in the repository root:
tar xzf fpl26_contest_benchmarks_v1.0.0.tar.gz
This creates a fpl26_contest_benchmarks/ directory containing all 12 benchmark
DCP files. Alternatively, make setup will download and extract the benchmarks
automatically.
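Before extracting, it can be worth checking the archive against the published checksum. The sketch below shows one way to do that in Python. It assumes the `.md5` file follows the common `md5sum` layout (hex digest, whitespace, file name), which this page does not confirm, and the helper names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_md5(checksum_path):
    """Read the expected digest.

    Assumption: the file holds md5sum-style content, '<hex digest>  <name>'.
    """
    text = Path(checksum_path).read_text().strip()
    return text.split()[0].lower()


def archive_is_intact(archive_path, checksum_path):
    """True when the archive's MD5 matches the digest in the checksum file."""
    return md5_of_file(archive_path) == expected_md5(checksum_path)
```

With both files in the repository root, `archive_is_intact("fpl26_contest_benchmarks_v1.0.0.tar.gz", "fpl26_contest_benchmarks_v1.0.0.md5")` should return `True` for an uncorrupted download, provided the layout assumption holds.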
| Source | Benchmark Name | LUTs | FFs | DSPs | BRAMs | Fmax (MHz) |
|---|---|---|---|---|---|---|
| AMD | amd_mini-isp | 3k | 4k | 40 | 12 | 307 |
| BOOM | boom_soc | 227k | 98k | 61 | 161 | 48.2 |
| CoreScore | corescore_500_mod | 100k | 120k | 0 | 250 | 344.2 |
| FINN | finn_radioml | 74k | 46k | 0 | 25 | 284.9 |
| ISPD16 | ispd16_example2 | 289k | 234k | 200 | 384 | 107.6 |
| LogicNets | logicnets_jscl (Jet Substructure Classification L) | 31k | 2k | 0 | 0 | 403.6 |
| Rosetta | rosetta_3d-rendering | 14k | 5k | 3 | 0 | 270.9 |
| Rosetta | rosetta_digit-recognition | 23k | 23k | 0 | 161 | 367.0 |
| Rosetta | rosetta_optical-flow | 34k | 37k | 42 | 61 | 324.9 |
| Rosetta | rosetta_spam-filter | 5k | 13k | 224 | 3 | 437.4 |
| VexRiscv | vexriscv_re-place | 2k | 1k | 4 | 6 | 310.2 |
| VTR | vtr_mcml | 43k | 15k | 105 | 142 | 62.2 |
The following table shows the Fmax improvement achieved by the optimization agent
(dcp_optimizer.py) on each benchmark design. All timing is measured on the
clk_fpl26contest clock domain. Each design was given a maximum runtime of 1 hour.
| Benchmark | Initial Fmax (MHz) | Best Fmax (MHz) | Improvement (MHz) | Improvement (%) | WNS Change (ns) | Runtime | Status |
|---|---|---|---|---|---|---|---|
| amd_mini-isp | 307.13 | 375.38 | +68.25 | +22.2% | -1.686 → -1.094 | 619s | Completed |
| boom_soc | 48.24 | 50.77 | +2.53 | +5.2% | -19.162 → -18.126 | 3601s | Timed out |
| corescore_500_mod | 344.23 | 423.37 | +79.14 | +23.0% | -1.238 → -0.695 | 2131s | Completed |
| finn_radioml | 284.90 | 324.46 | +39.56 | +13.9% | -1.910 → -1.482 | 1998s | Completed |
| ispd16_example2 | 107.64 | 107.64 | +0.00 | +0.0% | -7.752 → -7.752 | 3600s | Timed out |
| logicnets_jscl | 403.55 | 434.97 | +31.42 | +7.8% | -0.978 → -0.799 | 825s | Completed |
| rosetta_3d-rendering | 270.93 | 279.25 | +8.32 | +3.1% | -2.153 → -2.043 | 1823s | Completed |
| rosetta_digit-recognition | 366.97 | 390.78 | +23.81 | +6.5% | -1.025 → -0.859 | 1512s | Completed |
| rosetta_optical-flow | 324.89 | 330.14 | +5.26 | +1.6% | -1.078 → -1.029 | 558s | Completed |
| rosetta_spam-filter | 437.45 | 494.07 | +56.63 | +12.9% | -0.686 → -0.424 | 1000s | Completed |
| vexriscv_re-place | 310.17 | 415.28 | +105.11 | +33.9% | -1.654 → -0.838 | 385s | Completed |
| vtr_mcml | 62.25 | 73.05 | +10.80 | +17.4% | -14.527 → -12.152 | 1689s | Completed |
Summary: 10 of 12 designs completed within the 1-hour limit. Averaged over the
10 completed designs in the table, the Fmax improvement was +42.8 MHz (+14.2%).
The best single improvement was vexriscv_re-place at +105.11 MHz (+33.9%). The
two timed-out designs (boom_soc and ispd16_example2) are the largest benchmarks
by LUT count and spent most of their allotted time in place-and-route operations.
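The two Improvement columns in the table follow directly from the initial and best Fmax values. As a minimal sketch (the function name is hypothetical and not part of dcp_optimizer.py), the arithmetic for one row looks like this:

```python
def fmax_improvement(initial_mhz, best_mhz):
    """Return (absolute gain in MHz, relative gain in percent)."""
    gain_mhz = best_mhz - initial_mhz
    gain_pct = 100.0 * gain_mhz / initial_mhz
    return gain_mhz, gain_pct


# Values copied from the vexriscv_re-place row of the table.
gain_mhz, gain_pct = fmax_improvement(310.17, 415.28)
print(f"{gain_mhz:+.2f} MHz ({gain_pct:+.1f}%)")
```

Because the table entries are already rounded to two decimals, recomputed gains can differ from the listed ones in the last digit.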
To be released after the contest concludes.
Each benchmark targets the xcvu3p-ffvc1517-2-e device, which has the following resources:
| LUTs | FFs | DSPs | BRAMs |
|---|---|---|---|
| 394k | 788k | 2280 | 720 |
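To put the benchmark sizes in context, utilization can be expressed as a fraction of these device totals. The sketch below does this for ispd16_example2 using the rounded figures from the benchmark table, so the results are approximate. The dictionary layout and function name are illustrative assumptions.

```python
# Device capacity for xcvu3p-ffvc1517-2-e, from the resource table (rounded).
DEVICE = {"LUTs": 394_000, "FFs": 788_000, "DSPs": 2280, "BRAMs": 720}


def utilization(used):
    """Map each resource to the fraction of the device it occupies."""
    return {name: used[name] / DEVICE[name] for name in DEVICE}


# Rounded figures for ispd16_example2 from the benchmark table.
ispd16 = {"LUTs": 289_000, "FFs": 234_000, "DSPs": 200, "BRAMs": 384}
for name, fraction in utilization(ispd16).items():
    print(f"{name}: {fraction:.0%}")
```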
Some designs have multiple clock domains. To make the optimization target unambiguous, each benchmark defines an XDC clock constraint named clk_fpl26contest. All benchmarks in the contest have the following attributes:
- A clock named clk_fpl26contest, which is the target clock domain to optimize.

To retrieve the clock period and WNS for clk_fpl26contest in a Vivado Tcl session:
# Get the clock period
get_property PERIOD [get_clocks clk_fpl26contest]
# Get setup WNS filtered to the contest clock
set tp [get_timing_paths -max_paths 1 -setup -to [get_clocks clk_fpl26contest]]
get_property SLACK $tp
Note that report_timing_summary reports the overall WNS across all clock
domains, which may differ from the WNS on clk_fpl26contest in multi-clock
designs. Tools and scripts should query WNS using -to [get_clocks clk_fpl26contest] so that the Fmax calculation reflects the correct clock
domain. Fmax in MHz is calculated as 1000 / (period - WNS), with period and WNS in ns; a negative WNS lengthens the effective period and lowers Fmax.
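The Fmax formula can be sketched as a small Python helper. The function name is hypothetical, and the 1.570 ns period is an assumption: this page does not state it, but it is consistent with the amd_mini-isp results (WNS of -1.686 ns at about 307 MHz).

```python
def fmax_mhz(period_ns, wns_ns):
    """Fmax in MHz from the clock period and WNS, both in nanoseconds."""
    # Subtracting a negative WNS lengthens the achievable period.
    achievable_period_ns = period_ns - wns_ns
    if achievable_period_ns <= 0:
        raise ValueError("period - WNS must be positive")
    return 1000.0 / achievable_period_ns


# Assumed period of 1.570 ns; WNS values taken from the amd_mini-isp row.
initial = fmax_mhz(1.570, -1.686)
best = fmax_mhz(1.570, -1.094)
print(f"{initial:.2f} MHz -> {best:.2f} MHz")
```

The same function applies when WNS is positive, in which case the computed Fmax exceeds the constrained clock frequency.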