Benchmark/QoR

This section provides L2 performance benchmarks and Quality of Results (QoR) figures for AIE DSP library elements in various configurations. The results are extracted from hardware-emulation-based simulations run with the Makefile flow described in: Compiling and Simulating Using the Makefile.

QoR is reported using the following metrics:

  • cycleCountAvg - average number of cycles taken to execute the kernel function (not including kernel/window buffer overheads).
  • throughputAvg - input throughput calculated from cycleCountAvg and the input window size (not including kernel/window buffer overheads).
  • initiationInterval - time that must elapse between the starts of two consecutive iterations of a given function, including overheads, i.e., the time between a function start and its previous start.
  • throughputInitIntAvg - input throughput calculated from initiationInterval and the input window size.
  • NUM_BANKS - number of memory banks used by the design
  • NUM_AIE - number of AIE tiles used by the design (listed as NUM_ME in the table header)
  • DATA_MEMORY - total data memory in Bytes used by the design
  • PROGRAM_MEMORY - program memory in Bytes used by each kernel
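
The relationship between the cycle-based metrics and the throughput metrics can be sketched as follows. This is a minimal illustration, not the library's reporting script: the function name is hypothetical, and the 1 GHz clock and the truncation to whole MSa/s are assumptions inferred from the tabulated figures rather than stated in this section.

```python
# Assumption: 1 GHz AIE clock, inferred from the tabulated figures.
AIE_CLOCK_MHZ = 1000


def throughput_msa_per_s(input_window_vsize, cycles, clock_mhz=AIE_CLOCK_MHZ):
    """Input throughput in MSa/s: samples per window divided by cycles per window.

    Integer division truncates to whole MSa/s (an assumption that matches
    the rows checked here).
    """
    return input_window_vsize * clock_mhz // cycles


# First fir_decimate_asym row of the FIR table:
# INPUT_WINDOW_VSIZE=384, cycleCountAvg=9364, initiationInterval=9430.
throughput_avg = throughput_msa_per_s(384, 9364)
throughput_init_int_avg = throughput_msa_per_s(384, 9430)
print(throughput_avg, throughput_init_int_avg)  # prints "41 40"
```

Under these assumptions the sketch reproduces the 41 MSa/s and 40 MSa/s entries reported for that row.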

For multi-kernel designs, each kernel may take a different amount of time to execute; as a result, the cycleCountAvg and throughputAvg figures reported for each kernel may vary slightly.

To give a representative comparison figure, the highest cycleCountAvg reported by any kernel in a multi-kernel configuration is presented as cycleCountAvg in the benchmark tables. Similarly, the lowest throughputAvg reported by any kernel is presented as throughputAvg.
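
The worst-case selection described above can be sketched in a few lines. The function name and the per-kernel figures are hypothetical and are not taken from the benchmark tables.

```python
def summarize_kernels(per_kernel):
    """Reduce per-kernel (cycleCountAvg, throughputAvg) pairs to table figures.

    Returns the highest cycle count and the lowest throughput across kernels.
    """
    worst_cycles = max(cycles for cycles, _ in per_kernel)
    worst_throughput = min(throughput for _, throughput in per_kernel)
    return worst_cycles, worst_throughput


# Hypothetical figures for a three-kernel cascade (illustrative only).
kernels = [(432, 888), (428, 897), (430, 893)]
print(summarize_kernels(kernels))  # prints "(432, 888)"
```

Taking the maximum and minimum independently means the two reported figures describe the slowest kernel for each metric, which is the pessimistic bound a reader would want when comparing configurations.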

Furthermore, PROGRAM_MEMORY metrics are harvested for each kernel in the design. For example, a FIR configured to be implemented on two tiles (CASC_LEN=2) will have two PROGRAM_MEMORY figures, space-delimited, displayed in the table below.
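
Because PROGRAM_MEMORY holds one value per kernel, a row of the FIR table has a variable number of trailing fields. The following is a minimal parsing sketch, assuming the whitespace-separated row layout shown in the FIR table below; the function name is hypothetical, and the column list mirrors the table header.

```python
# Fixed columns of a FIR benchmark row, in table order (PROGRAM_MEMORY follows).
FIXED_COLUMNS = [
    "Library Element", "DATA_TYPE", "COEFF_TYPE", "FIR_LEN",
    "INTERPOLATE_FACTOR", "DECIMATE_FACTOR", "INPUT_WINDOW_VSIZE", "CASC_LEN",
    "DUAL_IP", "USE_COEFF_RELOAD", "cycleCountAvg", "throughputAvg",
    "initiationInterval", "throughputInitIntAvg", "NUM_BANKS", "NUM_ME",
    "DATA_MEMORY",
]


def parse_fir_row(line):
    """Split one row into named fields; PROGRAM_MEMORY becomes a per-kernel list."""
    # Drop the "MSa/s" unit tokens that follow the two throughput values.
    tokens = [tok for tok in line.split() if tok != "MSa/s"]
    row = {}
    for name, tok in zip(FIXED_COLUMNS, tokens):
        # The first three columns are names; the rest are integers.
        row[name] = tok if name in FIXED_COLUMNS[:3] else int(tok)
    # Everything after the fixed columns is one PROGRAM_MEMORY value per kernel.
    row["PROGRAM_MEMORY"] = [int(tok) for tok in tokens[len(FIXED_COLUMNS):]]
    return row


row = parse_fir_row(
    "fir_decimate_asym cint16 int16 99 1 3 384 5 0 0 369 1040 MSa/s "
    "551 696 MSa/s 13 5 29536 1890 1574 1574 1630 1764"
)
print(row["CASC_LEN"], row["PROGRAM_MEMORY"])
```

For the CASC_LEN=5 row used here, the parsed PROGRAM_MEMORY list has five entries, one per kernel in the cascade.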

Filters

The following table gives results for the FIR filter across a wide variety of supported parameters, which are defined in: L2 FIR configuration parameters.

fir_benchmark.csv

FIR benchmark
Library Element DATA_TYPE COEFF_TYPE FIR_LEN INTERPOLATE_FACTOR DECIMATE_FACTOR INPUT_WINDOW_VSIZE CASC_LEN DUAL_IP USE_COEFF_RELOAD cycleCountAvg throughputAvg initiationInterval throughputInitIntAvg NUM_BANKS NUM_ME DATA_MEMORY PROGRAM_MEMORY
fir_decimate_asym cfloat cfloat 21 1 3 384 1 0 0 9364 41 MSa/s 9430 40 MSa/s 5 1 12209 4446
fir_decimate_asym cint32 cint32 30 1 3 384 1 0 0 2162 177 MSa/s 2229 172 MSa/s 5 1 12476 2902
fir_decimate_asym cint32 cint16 99 1 3 384 1 0 0 3707 103 MSa/s 3791 101 MSa/s 5 1 14510 3912
fir_decimate_asym cint32 cint16 9 1 3 384 1 0 0 451 851 MSa/s 777 494 MSa/s 5 1 10966 1748
fir_decimate_asym cint32 cint16 30 1 3 384 1 0 0 1127 340 MSa/s 1194 321 MSa/s 5 1 11730 2870
fir_decimate_asym cint32 cint16 21 1 3 384 1 0 0 837 458 MSa/s 909 422 MSa/s 5 1 11398 2208
fir_decimate_asym cint16 int16 99 1 3 384 5 0 0 369 1040 MSa/s 551 696 MSa/s 13 5 29536 1890 1574 1574 1630 1764
fir_decimate_asym cint16 int16 99 1 3 384 4 0 0 367 1046 MSa/s 551 696 MSa/s 11 4 24231 2182 1574 1574 1692
fir_decimate_asym cint16 int16 99 1 3 384 3 0 0 432 888 MSa/s 611 628 MSa/s 9 3 19022 2382 2110 2216
fir_decimate_asym cint32 cint32 9 1 3 384 1 0 0 741 518 MSa/s 807 475 MSa/s 5 1 11172 1786
fir_decimate_asym cint16 int16 99 1 3 384 2 0 0 558 688 MSa/s 684 561 MSa/s 7 2 13717 2526 2288
fir_decimate_asym cint16 int16 99 1 3 384 1 0 0 936 410 MSa/s 1008 380 MSa/s 5 1 8476 2784
fir_decimate_asym cint16 int16 9 1 3 384 1 0 1 150 2560 MSa/s 390 984 MSa/s 8 1 6668 2862
fir_decimate_asym cint16 int16 9 1 3 384 1 0 0 150 2560 MSa/s 390 984 MSa/s 5 1 6656 2162
fir_decimate_asym cint16 int16 9 1 3 192 1 0 0 94 2042 MSa/s 197 974 MSa/s 5 1 4608 2178
fir_decimate_asym cint16 int16 9 1 3 384 1 0 0 150 2560 MSa/s 401 957 MSa/s 5 1 6656 2162
fir_decimate_asym cint16 int16 9 1 3 384 1 0 0 150 2560 MSa/s 390 984 MSa/s 5 1 6656 2194
fir_decimate_asym cint16 int16 9 1 3 384 1 0 0 150 2560 MSa/s 390 984 MSa/s 5 1 6656 2194
fir_decimate_asym cint16 int16 9 1 3 384 1 0 0 150 2560 MSa/s 390 984 MSa/s 5 1 6656 2194
fir_decimate_asym cint16 int16 99 1 3 384 1 0 1 1068 359 MSa/s 1173 327 MSa/s 8 1 8656 3264
fir_decimate_asym cint16 int16 9 1 3 384 1 0 0 150 2560 MSa/s 390 984 MSa/s 5 1 6656 2194
fir_decimate_asym cint32 cint32 99 1 3 384 1 0 0 7271 52 MSa/s 7355 52 MSa/s 5 1 16852 4206
fir_decimate_asym cint32 int16 30 1 3 384 1 0 0 684 561 MSa/s 776 494 MSa/s 5 1 11358 2312
fir_decimate_asym int32 int32 99 1 3 384 1 0 0 1768 217 MSa/s 1841 208 MSa/s 5 1 9646 3758
fir_decimate_asym int32 int32 9 1 3 384 1 0 0 214 1794 MSa/s 390 984 MSa/s 5 1 6806 2666
fir_decimate_asym int32 int32 30 1 3 384 1 0 0 642 598 MSa/s 701 547 MSa/s 5 1 7378 2398
fir_decimate_asym int32 int32 21 1 3 384 1 0 0 485 791 MSa/s 547 702 MSa/s 5 1 7110 1822
fir_decimate_asym int32 int16 99 1 3 384 1 0 0 2559 150 MSa/s 2632 145 MSa/s 5 1 8476 3624
fir_decimate_asym int32 int16 9 1 3 384 1 0 0 238 1613 MSa/s 391 982 MSa/s 5 1 6656 2118
fir_decimate_asym int32 int16 30 1 3 384 1 0 0 606 633 MSa/s 668 574 MSa/s 5 1 7006 2676
fir_decimate_asym int32 int16 21 1 3 384 1 0 0 431 890 MSa/s 493 778 MSa/s 5 1 6888 2312
fir_decimate_asym cint32 int16 21 1 3 384 1 0 0 556 690 MSa/s 778 493 MSa/s 5 1 11176 2016
fir_decimate_asym float float 99 1 3 384 1 0 0 10827 35 MSa/s 10900 35 MSa/s 5 1 9903 7178
fir_decimate_asym float float 30 1 3 384 1 0 0 3714 103 MSa/s 3774 101 MSa/s 5 1 7635 3704
fir_decimate_asym float float 21 1 3 384 1 0 0 2623 146 MSa/s 2684 143 MSa/s 5 1 7367 2836
fir_decimate_asym cint32 int32 99 1 3 384 1 0 0 3707 103 MSa/s 3791 101 MSa/s 5 1 14510 3912
fir_decimate_asym cint32 int32 9 1 3 384 1 0 0 451 851 MSa/s 777 494 MSa/s 5 1 10966 1748
fir_decimate_asym cint32 int32 30 1 3 384 1 0 0 1127 340 MSa/s 1194 321 MSa/s 5 1 11730 2870
fir_decimate_asym cint32 int32 21 1 3 384 1 0 0 837 458 MSa/s 909 422 MSa/s 5 1 11398 2208
fir_decimate_asym cint32 int16 99 1 3 384 1 0 0 1845 208 MSa/s 1929 199 MSa/s 5 1 13340 3916
fir_decimate_asym cint32 int16 9 1 3 384 1 0 0 362 1060 MSa/s 776 494 MSa/s 5 1 10816 1740
fir_decimate_asym float float 9 1 3 384 1 0 0 1008 380 MSa/s 1067 359 MSa/s 5 1 7063 2558
fir_decimate_asym cint16 int16 9 1 3 384 1 0 0 150 2560 MSa/s 390 984 MSa/s 5 1 6656 2194
fir_decimate_asym cint32 cint32 21 1 3 384 1 0 0 1511 254 MSa/s 1577 243 MSa/s 5 1 11940 2668
fir_decimate_asym cint16 int16 9 1 3 384 1 0 0 150 2560 MSa/s 390 984 MSa/s 5 1 6656 2162
fir_decimate_asym cint16 int16 144 1 2 432 1 0 0 2152 200 MSa/s 2229 193 MSa/s 5 1 10058 3232
fir_decimate_asym cint16 int16 140 1 7 420 1 0 0 1681 249 MSa/s 1759 238 MSa/s 5 1 10098 3218
fir_decimate_asym cint16 int16 140 1 5 420 1 0 0 893 470 MSa/s 971 432 MSa/s 5 1 9730 3276
fir_decimate_asym cint16 int16 12 1 4 384 1 0 0 135 2844 MSa/s 390 984 MSa/s 5 1 6442 1876
fir_decimate_asym cint16 int16 12 1 3 384 1 0 1 207 1855 MSa/s 391 982 MSa/s 8 1 6694 2966
fir_decimate_asym cint16 int16 12 1 3 384 1 0 0 151 2543 MSa/s 390 984 MSa/s 5 1 6674 2198
fir_decimate_asym cint16 int16 12 1 2 384 1 0 0 183 2098 MSa/s 390 984 MSa/s 5 1 7162 1894
fir_decimate_asym cint16 int16 12 1 3 384 1 0 0 151 2543 MSa/s 390 984 MSa/s 5 1 6674 2198
fir_decimate_asym cint16 cint16 99 1 3 384 1 0 0 1689 227 MSa/s 1762 217 MSa/s 5 1 9646 3698
fir_decimate_asym cint16 cint16 9 1 3 384 1 0 0 214 1794 MSa/s 390 984 MSa/s 5 1 6806 2666
fir_decimate_asym cint16 cint16 30 1 3 384 1 0 0 644 596 MSa/s 703 546 MSa/s 5 1 7378 2398
fir_decimate_asym cint16 cint16 21 1 3 384 1 0 0 423 907 MSa/s 485 791 MSa/s 5 1 7110 1966
fir_decimate_asym cfloat float 99 1 3 384 1 0 0 38338 10 MSa/s 38422 9 MSa/s 5 1 14781 10934
fir_decimate_asym cfloat float 9 1 3 384 1 0 0 3786 101 MSa/s 3850 99 MSa/s 5 1 11237 2512
fir_decimate_asym cfloat float 30 1 3 384 1 0 0 12141 31 MSa/s 12209 31 MSa/s 5 1 12001 4796
fir_decimate_asym cfloat float 21 1 3 384 1 0 0 8714 44 MSa/s 8787 43 MSa/s 5 1 11669 3640
fir_decimate_asym cfloat cfloat 99 1 3 384 1 0 0 44340 8 MSa/s 44428 8 MSa/s 5 1 17121 12474
fir_decimate_asym cint16 int16 9 1 3 384 1 0 0 150 2560 MSa/s 390 984 MSa/s 5 1 6656 2194
fir_decimate_asym cfloat cfloat 30 1 3 384 1 0 0 12341 31 MSa/s 12410 30 MSa/s 5 1 12745 5274
fir_decimate_asym cint16 int16 144 1 4 432 1 0 0 1132 381 MSa/s 1210 357 MSa/s 5 1 9770 3300
fir_decimate_asym cint16 int16 144 1 6 432 1 0 0 1972 219 MSa/s 2050 210 MSa/s 5 1 10058 3106
fir_decimate_asym cfloat cfloat 9 1 3 384 1 0 0 4591 83 MSa/s 4658 82 MSa/s 5 1 11441 2776
fir_decimate_asym cint16 int16 15 1 3 384 1 0 1 249 1542 MSa/s 392 979 MSa/s 8 1 6728 3298
fir_decimate_asym cint16 int16 15 1 3 384 1 0 0 183 2098 MSa/s 390 984 MSa/s 5 1 6692 2362
fir_decimate_asym cint16 int16 63 1 3 384 1 0 1 756 507 MSa/s 855 449 MSa/s 8 1 7784 2776
fir_decimate_asym cint16 int16 60 1 6 384 1 0 0 525 731 MSa/s 593 647 MSa/s 5 1 7482 2732
fir_decimate_asym cint16 int16 60 1 5 320 1 0 0 357 896 MSa/s 425 752 MSa/s 5 1 6850 2468
fir_decimate_asym cint16 int16 60 1 4 512 1 0 0 677 756 MSa/s 745 687 MSa/s 5 1 8778 2468
fir_decimate_asym cint16 int16 60 1 3 384 1 0 0 611 628 MSa/s 678 566 MSa/s 5 1 7634 2424
fir_decimate_asym cint16 int16 33 1 3 384 1 0 0 421 912 MSa/s 482 796 MSa/s 5 1 7184 2200
fir_decimate_asym cint16 int16 28 1 7 448 1 0 0 312 1435 MSa/s 453 988 MSa/s 5 1 7218 2176
fir_decimate_asym cint16 int16 27 1 3 384 1 0 1 355 1081 MSa/s 480 800 MSa/s 8 1 7040 2284
fir_decimate_asym cint16 int16 63 1 7 448 1 0 0 570 785 MSa/s 638 702 MSa/s 5 1 8156 2808
fir_decimate_asym cint16 int16 24 1 6 384 1 0 0 263 1460 MSa/s 391 982 MSa/s 5 1 6538 1814
fir_decimate_asym cint16 int16 237 1 3 768 1 0 0 4169 184 MSa/s 4258 180 MSa/s 5 1 15256 4152
fir_decimate_asym cint16 int16 237 1 3 384 1 0 1 2361 162 MSa/s 2483 154 MSa/s 8 1 11636 4648
fir_decimate_asym cint16 int16 237 1 3 384 1 0 0 2121 181 MSa/s 2210 173 MSa/s 5 1 11160 4152
fir_decimate_asym cint16 int16 21 1 3 384 1 0 1 415 925 MSa/s 506 758 MSa/s 8 1 6916 2776
fir_decimate_asym cint16 int16 20 1 5 400 1 0 0 234 1709 MSa/s 407 982 MSa/s 5 1 6706 1676
fir_decimate_asym cint16 int16 18 1 3 384 1 0 1 352 1090 MSa/s 442 868 MSa/s 8 1 6882 2736
fir_decimate_asym cint16 int16 18 1 3 384 1 0 0 291 1319 MSa/s 391 982 MSa/s 5 1 6870 1732
fir_decimate_asym cint16 int16 27 1 3 384 1 0 0 355 1081 MSa/s 416 923 MSa/s 5 1 6988 2000
fir_decimate_hb cint32 cint32 239 1 2 256 1 0 0 6427 39 MSa/s 6546 39 MSa/s 5 1 15708 4376
fir_decimate_hb cint32 cint32 7 1 2 256 1 0 0 262 977 MSa/s 519 493 MSa/s 5 1 8700 2276
fir_decimate_hb cint32 int16 239 1 2 256 1 0 0 21482 11 MSa/s 21606 11 MSa/s 5 1 13122 7258
fir_decimate_hb cint32 int16 11 1 2 256 1 0 0 1586 161 MSa/s 1650 155 MSa/s 5 1 8698 2494
fir_decimate_hb cint32 int16 15 1 2 256 1 0 0 2347 109 MSa/s 2411 106 MSa/s 5 1 8770 2756
fir_decimate_hb cint16 int16 99 1 2 128 4 0 0 194 659 MSa/s 257 498 MSa/s 11 4 15688 2162 1772 1772 2048
fir_decimate_hb cint32 cint32 99 1 2 256 1 0 0 2579 99 MSa/s 2671 95 MSa/s 5 1 11484 4194
fir_decimate_hb cint32 cint32 15 1 2 256 1 0 0 549 466 MSa/s 613 417 MSa/s 5 1 8988 1940
fir_decimate_hb cint32 cint16 15 1 2 256 1 0 0 517 495 MSa/s 580 441 MSa/s 5 1 8832 1784
fir_decimate_hb cint32 cint16 99 1 2 256 1 0 0 2183 117 MSa/s 2275 112 MSa/s 5 1 10800 3330
fir_decimate_hb cint32 cint16 7 1 2 256 1 0 0 166 1542 MSa/s 518 494 MSa/s 5 1 8640 2078
fir_decimate_hb cint32 cint16 239 1 2 256 1 0 0 5046 50 MSa/s 5165 49 MSa/s 5 1 13984 4400
fir_decimate_hb cint32 cint16 11 1 2 256 1 0 0 453 565 MSa/s 520 492 MSa/s 5 1 8720 1896
fir_decimate_hb cint16 int16 99 1 2 256 1 0 1 614 416 MSa/s 722 354 MSa/s 8 1 6614 3642
fir_decimate_hb cint16 int16 99 1 2 256 1 0 0 552 463 MSa/s 631 405 MSa/s 5 1 6570 2374
fir_decimate_hb cint16 int16 99 1 2 128 5 0 0 194 659 MSa/s 259 494 MSa/s 13 5 19250 2162 1772 1846 1846 2104
fir_decimate_hb cint32 int16 7 1 2 256 1 0 0 1586 161 MSa/s 1650 155 MSa/s 5 1 8626 2474
fir_decimate_hb cint32 cint32 11 1 2 256 1 0 0 453 565 MSa/s 521 491 MSa/s 5 1 8796 1928
fir_decimate_hb cint32 int16 99 1 2 256 1 0 0 8745 29 MSa/s 8846 28 MSa/s 5 1 10410 4020
fir_decimate_hb float float 99 1 2 256 1 0 0 5336 47 MSa/s 5413 47 MSa/s 5 1 7216 4926
fir_decimate_hb cint32 int32 15 1 2 256 1 0 0 517 495 MSa/s 580 441 MSa/s 5 1 8832 1784
fir_decimate_hb cint16 int16 99 1 2 128 2 0 0 228 561 MSa/s 289 442 MSa/s 7 2 8564 2444 2166
fir_decimate_hb int32 int32 99 1 2 256 1 0 0 1161 220 MSa/s 1239 206 MSa/s 5 1 6960 2352
fir_decimate_hb int32 int32 7 1 2 256 1 0 0 117 2188 MSa/s 261 980 MSa/s 5 1 5504 2182
fir_decimate_hb int32 int32 27 1 2 256 1 0 0 484 528 MSa/s 544 470 MSa/s 5 1 5808 1792
fir_decimate_hb int32 int32 239 1 2 256 1 0 0 1422 180 MSa/s 1511 169 MSa/s 5 1 8992 3490
fir_decimate_hb int32 int32 23 1 2 256 1 0 0 388 659 MSa/s 447 572 MSa/s 5 1 5728 1776
fir_decimate_hb int32 int16 99 1 2 256 1 0 0 3771 67 MSa/s 3856 66 MSa/s 5 1 6570 3672
fir_decimate_hb int32 int16 7 1 2 256 1 0 0 85 3011 MSa/s 261 980 MSa/s 5 1 5490 2082
fir_decimate_hb cint32 int32 11 1 2 256 1 0 0 453 565 MSa/s 521 491 MSa/s 5 1 8720 1808
fir_decimate_hb int32 int16 239 1 2 256 1 0 0 8316 30 MSa/s 8410 30 MSa/s 5 1 8130 6306
fir_decimate_hb int32 int16 15 1 2 256 1 0 0 102 2509 MSa/s 261 980 MSa/s 5 1 5570 2058
fir_decimate_hb float float 7 1 2 256 1 0 0 386 663 MSa/s 445 575 MSa/s 5 1 5760 2540
fir_decimate_hb float float 239 1 2 256 1 0 0 12290 20 MSa/s 12378 20 MSa/s 5 1 9248 8652
fir_decimate_hb float float 19 1 2 256 1 0 0 1282 199 MSa/s 1344 190 MSa/s 5 1 5968 2864
fir_decimate_hb float float 15 1 2 256 1 0 0 846 302 MSa/s 906 282 MSa/s 5 1 5888 2428
fir_decimate_hb cint32 int32 99 1 2 256 1 0 0 2183 117 MSa/s 2275 112 MSa/s 5 1 10800 3330
fir_decimate_hb cint32 int32 7 1 2 256 1 0 0 166 1542 MSa/s 518 494 MSa/s 5 1 8640 2006
fir_decimate_hb cint32 int32 239 1 2 256 1 0 0 5046 50 MSa/s 5165 49 MSa/s 5 1 13984 4400
fir_decimate_hb int32 int16 19 1 2 256 1 0 0 986 259 MSa/s 1045 244 MSa/s 5 1 5642 2564
fir_decimate_hb cint16 int16 7 1 2 256 1 0 0 70 3657 MSa/s 261 980 MSa/s 5 1 5490 1910
fir_decimate_hb cint16 int16 99 1 2 128 3 0 0 222 576 MSa/s 285 449 MSa/s 9 3 12126 2346 1952 2076
fir_decimate_hb cint16 int16 7 1 2 256 1 0 0 70 3657 MSa/s 261 980 MSa/s 5 1 5490 1910
fir_decimate_hb cint16 int16 15 1 2 256 1 0 0 102 2509 MSa/s 271 944 MSa/s 5 1 5570 1940
fir_decimate_hb cint16 int16 11 1 2 256 1 0 1 71 3605 MSa/s 261 980 MSa/s 8 1 5582 2406
fir_decimate_hb cint16 int16 11 1 2 256 1 0 0 71 3605 MSa/s 261 980 MSa/s 5 1 5562 1904
fir_decimate_hb cint16 int16 11 1 2 256 1 0 0 71 3605 MSa/s 266 962 MSa/s 5 1 5562 1904
fir_decimate_hb cint16 cint16 99 1 2 256 1 0 0 1161 220 MSa/s 1239 206 MSa/s 5 1 6960 2336
fir_decimate_hb cint16 cint16 7 1 2 256 1 0 0 117 2188 MSa/s 261 980 MSa/s 5 1 5504 2264
fir_decimate_hb cint16 cint16 27 1 2 256 1 0 0 484 528 MSa/s 544 470 MSa/s 5 1 5808 1792
fir_decimate_hb cint16 cint16 239 1 2 256 1 0 0 1422 180 MSa/s 1511 169 MSa/s 5 1 8992 3486
fir_decimate_hb cint16 int16 15 1 2 256 1 0 0 102 2509 MSa/s 261 980 MSa/s 5 1 5570 1924
fir_decimate_hb cint16 cint16 23 1 2 256 1 0 0 389 658 MSa/s 449 570 MSa/s 5 1 5728 1776
fir_decimate_hb cfloat float 7 1 2 256 1 0 0 1696 150 MSa/s 1759 145 MSa/s 5 1 8896 2316
fir_decimate_hb cfloat float 15 1 2 256 1 0 0 2749 93 MSa/s 2815 90 MSa/s 5 1 9088 2880
fir_decimate_hb cfloat float 11 1 2 256 1 0 0 1570 163 MSa/s 1634 156 MSa/s 5 1 8976 2384
fir_decimate_hb cfloat cfloat 99 1 2 256 1 0 0 16555 15 MSa/s 16646 15 MSa/s 5 1 11740 6394
fir_decimate_hb cfloat cfloat 7 1 2 256 1 0 0 1570 163 MSa/s 1633 156 MSa/s 5 1 8956 2328
fir_decimate_hb cfloat cfloat 15 1 2 256 1 0 0 3298 77 MSa/s 3362 76 MSa/s 5 1 9244 2376
fir_decimate_hb cfloat cfloat 11 1 2 256 1 0 0 1954 131 MSa/s 2018 126 MSa/s 5 1 9052 2536
fir_decimate_hb cint16 int16 7 1 2 256 1 0 0 70 3657 MSa/s 261 980 MSa/s 5 1 5490 1910
fir_decimate_hb cfloat float 99 1 2 256 1 0 0 13793 18 MSa/s 13884 18 MSa/s 5 1 11056 5614
fir_decimate_hb cint16 int16 15 1 2 256 1 0 1 102 2509 MSa/s 262 977 MSa/s 8 1 5598 2444
fir_decimate_hb cfloat float 239 1 2 256 1 0 0 32233 7 MSa/s 32351 7 MSa/s 5 1 14240 10108
fir_decimate_hb cint16 int16 19 1 2 256 1 0 1 103 2485 MSa/s 262 977 MSa/s 8 1 5670 2540
fir_decimate_hb cint16 int16 19 1 2 256 1 0 0 103 2485 MSa/s 261 980 MSa/s 5 1 5642 1956
fir_decimate_hb cint16 int16 7 1 2 256 1 0 0 70 3657 MSa/s 261 980 MSa/s 5 1 5490 1910
fir_decimate_hb cint16 int16 7 1 2 256 1 0 0 70 3657 MSa/s 261 980 MSa/s 5 1 5490 1894
fir_decimate_hb cint16 int16 7 1 2 256 1 0 0 70 3657 MSa/s 261 980 MSa/s 5 1 5490 1894
fir_decimate_hb cint16 int16 7 1 2 64 1 0 0 51 1254 MSa/s 107 598 MSa/s 5 1 3186 1738
fir_decimate_hb cint16 int16 7 1 2 512 1 0 0 102 5019 MSa/s 517 990 MSa/s 5 1 8562 1878
fir_decimate_hb cint16 int16 7 1 2 256 1 0 1 70 3657 MSa/s 261 980 MSa/s 8 1 5510 2294
fir_decimate_hb cint16 int16 7 1 2 256 1 0 0 70 3657 MSa/s 261 980 MSa/s 5 1 5490 1894
fir_decimate_hb cint16 int16 7 1 2 128 1 0 0 57 2245 MSa/s 132 969 MSa/s 5 1 3954 1848
fir_decimate_hb cint16 int16 7 1 2 256 1 0 0 70 3657 MSa/s 261 980 MSa/s 5 1 5490 1894
fir_decimate_hb cint16 int16 7 1 2 256 1 0 0 70 3657 MSa/s 261 980 MSa/s 5 1 5490 1926
fir_decimate_hb cint16 int16 55 1 2 256 1 1 0 172 1488 MSa/s 303 844 MSa/s 7 1 8498 2462
fir_decimate_hb cint16 int16 27 1 2 256 1 0 1 242 1057 MSa/s 331 773 MSa/s 8 1 5758 2570
fir_decimate_hb cint16 int16 27 1 2 256 1 0 0 201 1273 MSa/s 263 973 MSa/s 5 1 5722 1902
fir_decimate_hb cint16 int16 27 1 2 256 1 1 0 138 1855 MSa/s 276 927 MSa/s 7 1 8026 2206
fir_decimate_hb cint16 int16 239 1 2 256 1 0 1 1308 195 MSa/s 1430 179 MSa/s 8 1 8254 4116
fir_decimate_hb cint16 int16 239 1 2 256 1 0 0 1212 211 MSa/s 1300 196 MSa/s 5 1 8130 3622
fir_decimate_hb cint16 int16 7 1 2 1024 1 0 0 166 6168 MSa/s 1029 995 MSa/s 5 1 14706 1878
fir_decimate_hb cint16 int16 23 1 2 256 1 0 0 103 2485 MSa/s 270 948 MSa/s 5 1 5650 1900
fir_decimate_sym cint16 int16 99 1 3 384 4 0 0 495 775 MSa/s 514 747 MSa/s 11 4 22013 2496 2232 2226 2310
fir_decimate_sym cint16 int16 99 1 3 384 3 0 0 561 684 MSa/s 579 663 MSa/s 9 3 17262 2542 2272 2424
fir_decimate_sym cint16 int16 99 1 3 384 2 0 0 527 728 MSa/s 553 694 MSa/s 7 2 12511 2754 2666
fir_decimate_sym cint16 int16 99 1 3 384 1 0 1 769 499 MSa/s 874 439 MSa/s 8 1 7836 3224
fir_decimate_sym cint16 int16 99 1 3 384 1 0 0 713 538 MSa/s 784 489 MSa/s 5 1 7760 3570
fir_decimate_sym cint16 int16 9 1 3 384 1 0 0 117 3282 MSa/s 393 977 MSa/s 5 1 6555 2160
fir_decimate_sym cint16 int16 9 1 3 384 1 0 0 117 3282 MSa/s 388 989 MSa/s 5 1 6555 2180
fir_decimate_sym cint16 int16 9 1 3 192 1 0 0 77 2493 MSa/s 196 979 MSa/s 5 1 4507 2164
fir_decimate_sym cint16 int16 9 1 3 384 1 0 0 117 3282 MSa/s 388 989 MSa/s 5 1 6555 2180
fir_decimate_sym cint16 int16 99 1 3 384 5 0 0 493 778 MSa/s 514 747 MSa/s 13 5 26764 2496 2232 2100 2106 2262
fir_decimate_sym cint16 int16 9 1 3 384 1 0 0 117 3282 MSa/s 388 989 MSa/s 5 1 6555 2180
fir_decimate_sym cint16 int16 9 1 3 384 1 0 1 146 2630 MSa/s 390 984 MSa/s 8 1 6583 2452
fir_decimate_sym cint32 cint16 24 1 2 384 1 0 0 900 426 MSa/s 970 395 MSa/s 5 1 12139 2308
fir_decimate_sym cint32 int32 8 1 2 384 1 0 0 372 1032 MSa/s 775 495 MSa/s 5 1 11692 1814
fir_decimate_sym cint32 cint16 8 1 2 384 1 0 0 372 1032 MSa/s 775 495 MSa/s 5 1 11692 1814
fir_decimate_sym cint32 cint16 96 1 2 384 1 0 0 2630 146 MSa/s 2714 141 MSa/s 5 1 13963 3562
fir_decimate_sym cint32 cint32 21 1 3 384 1 0 0 1135 338 MSa/s 1201 319 MSa/s 5 1 11284 2670
fir_decimate_sym cint32 cint32 30 1 3 384 1 0 0 1248 307 MSa/s 1315 292 MSa/s 5 1 11580 2870
fir_decimate_sym cint32 cint32 9 1 3 384 1 0 0 496 774 MSa/s 776 494 MSa/s 5 1 10900 2090
fir_decimate_sym cint32 cint32 99 1 3 384 1 0 0 4351 88 MSa/s 4435 86 MSa/s 5 1 14084 3906
fir_decimate_sym cint32 int32 24 1 2 384 1 0 0 900 426 MSa/s 970 395 MSa/s 5 1 12139 2296
fir_decimate_sym cint32 int32 30 1 2 384 1 0 0 1619 237 MSa/s 1691 227 MSa/s 5 1 12291 2368
fir_decimate_sym cint32 int32 96 1 2 384 1 0 0 2630 146 MSa/s 2714 141 MSa/s 5 1 13963 3562
fir_decimate_sym int32 int32 21 1 3 384 1 0 0 548 700 MSa/s 609 630 MSa/s 5 1 6816 2106
fir_decimate_sym int32 int32 30 1 3 384 1 0 0 740 518 MSa/s 803 478 MSa/s 5 1 6916 2246
fir_decimate_sym int32 int32 9 1 3 384 1 0 0 147 2612 MSa/s 389 987 MSa/s 5 1 6607 2376
fir_decimate_sym int32 int32 99 1 3 384 1 0 0 1132 339 MSa/s 1203 319 MSa/s 5 1 8248 3040
fir_decimate_sym cint16 int16 9 1 3 384 1 0 0 117 3282 MSa/s 388 989 MSa/s 5 1 6555 2164
fir_decimate_sym cint32 cint16 30 1 2 384 1 0 0 1619 237 MSa/s 1691 227 MSa/s 5 1 12291 2380
fir_decimate_sym cint16 int16 9 1 3 384 1 0 0 117 3282 MSa/s 388 989 MSa/s 5 1 6555 2160
fir_decimate_sym cint16 int16 60 1 3 384 1 0 0 484 793 MSa/s 550 698 MSa/s 5 1 7170 2680
fir_decimate_sym cint16 int16 9 1 3 384 1 0 0 117 3282 MSa/s 388 989 MSa/s 5 1 6555 2160
fir_decimate_sym cint16 int16 9 1 3 384 1 0 0 117 3282 MSa/s 388 989 MSa/s 5 1 6555 2160
fir_decimate_sym cint16 cint16 21 1 3 384 1 0 0 548 700 MSa/s 609 630 MSa/s 5 1 6814 2118
fir_decimate_sym cint16 cint16 30 1 3 384 1 0 0 740 518 MSa/s 803 478 MSa/s 5 1 6914 2258
fir_decimate_sym cint16 cint16 9 1 3 384 1 0 0 147 2612 MSa/s 389 987 MSa/s 5 1 6606 2380
fir_decimate_sym cint16 cint16 99 1 3 384 1 0 0 1132 339 MSa/s 1203 319 MSa/s 5 1 8246 3056
fir_decimate_sym cint16 int16 100 1 2 256 1 0 0 1384 184 MSa/s 1455 175 MSa/s 5 1 6738 2802
fir_decimate_sym cint16 int16 12 1 3 384 1 0 0 118 3254 MSa/s 388 989 MSa/s 5 1 6561 2224
fir_decimate_sym cint16 int16 12 1 3 384 1 0 0 118 3254 MSa/s 388 989 MSa/s 5 1 6561 2224
fir_decimate_sym cint16 int16 12 1 3 384 1 0 1 144 2666 MSa/s 390 984 MSa/s 8 1 6589 2520
fir_decimate_sym cint16 int16 15 1 3 384 1 0 0 258 1488 MSa/s 390 984 MSa/s 5 1 6567 1914
fir_decimate_sym cint16 int16 15 1 3 384 1 0 1 285 1347 MSa/s 392 979 MSa/s 8 1 6603 2204
fir_decimate_sym cint16 int16 16 1 2 384 1 0 0 134 2865 MSa/s 394 974 MSa/s 5 1 7081 2086
fir_decimate_sym cint16 int16 18 1 3 384 1 0 0 259 1482 MSa/s 390 984 MSa/s 5 1 6670 1986
fir_decimate_sym cint16 int16 12 1 2 256 1 0 0 102 2509 MSa/s 260 984 MSa/s 5 1 5537 2076
fir_decimate_sym cint16 int16 237 1 3 384 1 0 0 1324 290 MSa/s 1413 271 MSa/s 5 1 9508 3612
fir_decimate_sym cint16 int16 9 1 3 384 1 0 0 117 3282 MSa/s 388 989 MSa/s 5 1 6555 2180
fir_decimate_sym cint16 int16 18 1 3 384 1 0 1 285 1347 MSa/s 393 977 MSa/s 8 1 6682 2276
fir_decimate_sym cint16 int16 8 1 2 256 1 0 0 69 3710 MSa/s 260 984 MSa/s 5 1 5464 1924
fir_decimate_sym cint16 int16 60 1 2 384 1 1 0 661 580 MSa/s 773 496 MSa/s 7 1 11266 3036
fir_decimate_sym cint16 int16 60 1 2 384 1 0 0 1377 278 MSa/s 1443 266 MSa/s 5 1 7682 2636
fir_decimate_sym cint16 int16 27 1 3 384 1 0 1 387 992 MSa/s 478 803 MSa/s 8 1 6780 2456
fir_decimate_sym cint16 int16 28 1 2 256 1 0 0 606 422 MSa/s 666 384 MSa/s 5 1 5730 2250
fir_decimate_sym cint16 int16 26 1 2 256 1 0 0 355 721 MSa/s 415 616 MSa/s 5 1 5726 2164
fir_decimate_sym cint16 int16 24 1 2 256 1 0 0 259 988 MSa/s 319 802 MSa/s 5 1 5658 2010
fir_decimate_sym cint16 int16 240 1 2 256 1 0 0 1298 197 MSa/s 1386 184 MSa/s 5 1 8490 3298
fir_decimate_sym cint16 int16 237 1 3 768 1 0 0 2604 294 MSa/s 2693 285 MSa/s 5 1 13604 3612
fir_decimate_sym cint16 int16 237 1 3 384 1 0 1 1404 273 MSa/s 1525 251 MSa/s 8 1 9736 4084
fir_decimate_sym cint16 int16 27 1 3 384 1 0 0 361 1063 MSa/s 422 909 MSa/s 5 1 6752 2186
fir_interpolate_asym cint32 cint32 58 2 1 256 1 0 0 8666 29 MSa/s 8731 29 MSa/s 5 1 16974 4088
fir_interpolate_asym cint32 cint32 32 2 1 256 1 0 0 5312 48 MSa/s 5374 47 MSa/s 5 1 15838 3074
fir_interpolate_asym cint32 cint32 48 2 1 256 1 0 0 7377 34 MSa/s 7441 34 MSa/s 5 1 16478 3736
fir_interpolate_asym cint32 cint32 44 2 1 256 1 0 0 6347 40 MSa/s 6411 39 MSa/s 5 1 16350 3500
fir_interpolate_asym cint32 cint32 40 2 1 256 1 0 0 6344 40 MSa/s 6407 39 MSa/s 5 1 16158 3432
fir_interpolate_asym cint32 cint32 20 2 1 256 1 0 0 3251 78 MSa/s 3312 77 MSa/s 5 1 15390 2330
fir_interpolate_asym cint16 int16 8 2 1 256 1 0 0 197 1299 MSa/s 517 495 MSa/s 5 1 8590 2100
fir_interpolate_asym cint32 cint32 12 2 1 256 1 0 0 1834 139 MSa/s 1898 134 MSa/s 5 1 15070 1986
fir_interpolate_asym cint32 cint16 8 2 1 256 1 0 0 596 429 MSa/s 1029 248 MSa/s 5 1 14782 2460
fir_interpolate_asym cint32 cint16 64 2 1 256 1 0 0 4577 55 MSa/s 4642 55 MSa/s 5 1 16350 4292
fir_interpolate_asym cint32 cint16 32 2 1 256 1 0 0 2495 102 MSa/s 2558 100 MSa/s 5 1 15454 2980
fir_interpolate_asym cint16 int16 8 2 1 256 1 0 0 199 1286 MSa/s 517 495 MSa/s 5 1 8590 2116
fir_interpolate_asym cint32 cint32 64 2 1 256 1 0 0 9442 27 MSa/s 9507 26 MSa/s 5 1 17118 4330
fir_interpolate_asym cint32 cint32 16 2 1 256 1 0 0 3243 78 MSa/s 3307 77 MSa/s 5 1 15198 2142
fir_interpolate_asym cint32 cint32 72 2 1 256 1 0 0 10470 24 MSa/s 10536 24 MSa/s 5 1 17438 4682
fir_interpolate_asym int32 int32 64 2 1 256 1 0 0 2209 115 MSa/s 2268 112 MSa/s 5 1 10462 4242
fir_interpolate_asym cint32 cint32 8 2 1 256 1 0 0 1065 240 MSa/s 1127 227 MSa/s 5 1 14878 2878
fir_interpolate_asym cint16 int16 8 2 1 256 1 0 0 199 1286 MSa/s 517 495 MSa/s 5 1 8590 2116
fir_interpolate_asym int32 int32 8 2 1 256 1 0 0 314 815 MSa/s 519 493 MSa/s 5 1 8702 2544
fir_interpolate_asym int32 int32 32 2 1 256 1 0 0 1150 222 MSa/s 1208 211 MSa/s 5 1 9438 2928
fir_interpolate_asym int32 int16 8 2 1 256 1 0 0 165 1551 MSa/s 516 496 MSa/s 5 1 8590 1990
fir_interpolate_asym int32 int16 64 2 1 256 1 0 0 1278 200 MSa/s 1337 191 MSa/s 5 1 9566 3142
fir_interpolate_asym int32 int16 32 2 1 256 1 0 0 749 341 MSa/s 809 316 MSa/s 5 1 8990 2158
fir_interpolate_asym cint32 cint32 80 2 1 256 1 0 0 11507 22 MSa/s 11575 22 MSa/s 5 1 17758 5000
fir_interpolate_asym cint32 int32 8 2 1 256 1 0 0 596 429 MSa/s 1029 248 MSa/s 5 1 14782 2460
fir_interpolate_asym cint32 int32 32 2 1 256 1 0 0 2495 102 MSa/s 2558 100 MSa/s 5 1 15454 2980
fir_interpolate_asym cint32 int16 8 2 1 256 1 0 0 295 867 MSa/s 1023 250 MSa/s 5 1 14702 1878
fir_interpolate_asym cint32 int16 64 2 1 256 1 0 0 2688 95 MSa/s 2751 93 MSa/s 5 1 15710 3092
fir_interpolate_asym cint32 int16 32 2 1 256 1 0 0 1645 155 MSa/s 1708 149 MSa/s 5 1 15134 2174
fir_interpolate_asym cint32 cint32 96 2 1 256 1 0 0 14711 17 MSa/s 14782 17 MSa/s 5 1 18398 4024
fir_interpolate_asym cint32 cint32 88 2 1 256 1 0 0 13681 18 MSa/s 13751 18 MSa/s 5 1 18078 3848
fir_interpolate_asym cint32 int32 64 2 1 256 1 0 0 4577 55 MSa/s 4642 55 MSa/s 5 1 16350 4292
fir_interpolate_asym cint16 int16 8 2 1 256 1 0 0 201 1273 MSa/s 517 495 MSa/s 5 1 8590 2084
fir_interpolate_asym cint16 int16 8 2 1 64 1 0 0 87 735 MSa/s 143 447 MSa/s 5 1 3982 2100
fir_interpolate_asym cint16 int16 8 2 1 256 1 0 0 199 1286 MSa/s 517 495 MSa/s 5 1 8590 2084
fir_interpolate_asym cint16 cint16 32 2 1 256 1 0 0 1150 222 MSa/s 1208 211 MSa/s 5 1 9438 2928
fir_interpolate_asym cint16 cint16 64 2 1 256 1 0 0 2209 115 MSa/s 2268 112 MSa/s 5 1 10462 4242
fir_interpolate_asym cint16 cint16 8 2 1 256 1 0 0 334 766 MSa/s 519 493 MSa/s 5 1 8702 2604
fir_interpolate_asym cint16 int16 128 2 1 256 1 0 0 2338 109 MSa/s 2401 106 MSa/s 5 1 10718 4500
fir_interpolate_asym cint16 int16 128 2 1 256 2 0 0 1324 193 MSa/s 1406 182 MSa/s 7 2 15003 3232 3086
fir_interpolate_asym cint16 int16 128 2 1 256 3 0 0 1095 233 MSa/s 1228 208 MSa/s 9 3 19320 2876 3360 3544
fir_interpolate_asym cint16 int16 128 2 1 256 4 0 0 812 315 MSa/s 934 274 MSa/s 11 4 23573 2224 1968 1968 2112
fir_interpolate_asym cint16 int16 128 2 1 256 5 0 0 823 311 MSa/s 966 265 MSa/s 13 5 27922 3320 3086 3086 3098 2000
fir_interpolate_asym cint16 int16 16 2 1 256 1 0 0 334 766 MSa/s 519 493 MSa/s 5 1 8702 2452
fir_interpolate_asym cint16 int16 16 2 1 256 1 0 1 334 766 MSa/s 531 482 MSa/s 8 1 8738 4432
fir_interpolate_asym cint16 int16 240 2 1 256 1 0 0 4528 56 MSa/s 4600 55 MSa/s 5 1 12734 4722
fir_interpolate_asym cint16 int16 240 2 1 256 1 0 1 4900 52 MSa/s 5002 51 MSa/s 8 1 13218 5620
fir_interpolate_asym cint16 int16 24 2 1 256 1 0 0 618 414 MSa/s 678 377 MSa/s 5 1 8910 2000
fir_interpolate_asym cint16 int16 24 3 1 256 1 0 0 591 433 MSa/s 774 330 MSa/s 5 1 12122 3588
fir_interpolate_asym cint16 int16 24 4 1 256 1 0 0 742 345 MSa/s 1029 248 MSa/s 5 1 14918 2004
fir_interpolate_asym cint16 int16 128 2 1 256 1 0 1 2539 100 MSa/s 2634 97 MSa/s 8 1 10978 5164
fir_interpolate_asym cint16 int16 24 8 1 256 1 0 0 1376 186 MSa/s 2044 125 MSa/s 6 1 27222 3226
fir_interpolate_asym cint16 int16 24 6 1 256 1 0 0 1066 240 MSa/s 1536 166 MSa/s 5 1 21182 3838
fir_interpolate_asym cint16 int16 8 2 1 256 1 0 0 199 1286 MSa/s 517 495 MSa/s 5 1 8590 2084
fir_interpolate_asym cint16 int16 8 2 1 256 1 0 0 201 1273 MSa/s 517 495 MSa/s 5 1 8590 2100
fir_interpolate_asym cint16 int16 8 2 1 256 1 0 1 261 980 MSa/s 522 490 MSa/s 8 1 8626 3288
fir_interpolate_asym cint16 int16 8 2 1 256 1 0 0 201 1273 MSa/s 517 495 MSa/s 5 1 8590 2084
fir_interpolate_asym cint16 int16 8 2 1 128 1 0 0 125 1024 MSa/s 263 486 MSa/s 5 1 5518 2084
fir_interpolate_asym cint16 int16 8 2 1 1024 1 0 0 657 1558 MSa/s 2039 502 MSa/s 5 1 27022 2084
fir_interpolate_asym cint16 int16 8 2 1 512 1 0 0 353 1450 MSa/s 1024 500 MSa/s 5 1 14734 2084
fir_interpolate_asym cint16 int16 64 2 1 256 1 0 0 1278 200 MSa/s 1337 191 MSa/s 5 1 9566 3142
fir_interpolate_asym cint16 int16 32 2 1 256 1 0 1 815 314 MSa/s 903 283 MSa/s 8 1 9058 3022
fir_interpolate_asym cint16 int16 32 2 1 256 1 0 0 749 341 MSa/s 809 316 MSa/s 5 1 8990 2158
fir_interpolate_asym cint16 int16 30 5 1 256 1 0 0 915 279 MSa/s 1282 199 MSa/s 5 1 18398 3850
fir_interpolate_asym cint16 int16 30 3 1 256 1 0 0 857 298 MSa/s 917 279 MSa/s 5 1 12294 3838
fir_interpolate_asym cint16 int16 64 2 1 256 1 0 1 1406 182 MSa/s 1497 171 MSa/s 8 1 9698 3750
fir_interpolate_asym cint16 int16 30 2 1 256 1 0 0 811 315 MSa/s 870 294 MSa/s 5 1 8986 3400
fir_interpolate_fract_asym cint16 int16 60 5 2 256 1 0 0 637 401 MSa/s 705 363 MSa/s 5 1 13302 3730
fir_interpolate_fract_asym cint16 int16 60 5 2 240 1 0 0 601 399 MSa/s 670 358 MSa/s 5 1 12854 3730
fir_interpolate_fract_asym cint16 int16 48 4 3 144 1 0 0 269 535 MSa/s 332 433 MSa/s 5 1 7638 4532
fir_interpolate_fract_asym cint16 int16 41 5 4 256 1 0 0 354 723 MSa/s 422 606 MSa/s 5 1 10608 4904
fir_interpolate_fract_asym cint16 int16 40 10 3 120 1 0 0 272 441 MSa/s 402 298 MSa/s 6 1 15374 4740
fir_interpolate_fract_asym cint16 int16 40 10 7 280 1 0 0 315 888 MSa/s 408 686 MSa/s 6 1 16654 4964
fir_interpolate_fract_asym cint16 int16 36 9 7 504 1 0 0 490 1028 MSa/s 647 778 MSa/s 6 1 19750 5514
fir_interpolate_fract_asym cint16 int16 60 5 3 240 1 0 0 482 497 MSa/s 551 435 MSa/s 5 1 11254 3754
fir_interpolate_fract_asym cint16 int16 36 9 5 360 1 0 0 442 814 MSa/s 640 562 MSa/s 6 1 18598 5342
fir_interpolate_fract_asym cint16 int16 40 10 9 360 1 0 0 308 1168 MSa/s 407 884 MSa/s 6 1 17294 4978
fir_interpolate_fract_asym cint16 int16 60 5 4 480 1 0 0 720 666 MSa/s 788 609 MSa/s 5 1 14774 3778
fir_interpolate_fract_asym cint16 int16 95 8 5 320 1 0 0 581 550 MSa/s 645 496 MSa/s 6 1 16036 3958
fir_interpolate_fract_asym cint16 int16 75 5 4 160 3 0 1 476 336 MSa/s 525 304 MSa/s 11 3 22254 4290 3850 3762
fir_interpolate_fract_asym cint16 int16 84 7 2 336 1 0 0 1223 274 MSa/s 1294 259 MSa/s 6 1 20502 4310
fir_interpolate_fract_asym cint16 int16 84 7 3 168 1 0 0 477 352 MSa/s 547 307 MSa/s 6 1 12886 4210
fir_interpolate_fract_asym cint16 int16 84 7 4 672 1 0 0 1368 491 MSa/s 1437 467 MSa/s 6 1 23190 4386
fir_interpolate_fract_asym cint16 int16 84 7 5 840 1 0 0 1391 603 MSa/s 1462 574 MSa/s 6 1 24534 4346
fir_interpolate_fract_asym cint16 int16 84 7 6 336 1 0 0 493 681 MSa/s 564 595 MSa/s 6 1 14230 4344
fir_interpolate_fract_asym cint16 int16 96 8 3 192 1 0 0 500 384 MSa/s 564 340 MSa/s 6 1 15014 3406
fir_interpolate_fract_asym cint16 int16 96 8 5 480 1 0 0 845 568 MSa/s 909 528 MSa/s 6 1 19366 3450
fir_interpolate_fract_asym cint16 int16 96 8 7 672 1 0 0 892 753 MSa/s 956 702 MSa/s 6 1 20902 3454
fir_interpolate_fract_asym cint32 cint32 60 5 2 256 1 0 0 4706 54 MSa/s 4770 53 MSa/s 5 1 22366 5000
fir_interpolate_fract_asym cint16 int16 36 9 4 288 1 0 0 455 632 MSa/s 643 447 MSa/s 6 1 18022 4742
fir_interpolate_fract_asym cint16 int16 72 6 5 360 1 0 0 527 683 MSa/s 589 611 MSa/s 5 1 13534 3282
fir_interpolate_fract_asym cint16 int16 36 9 2 144 1 0 0 397 362 MSa/s 635 226 MSa/s 6 1 16870 4574
fir_interpolate_fract_asym cint16 int16 36 9 8 576 1 0 0 490 1175 MSa/s 647 890 MSa/s 6 1 20326 4862
fir_interpolate_fract_asym cint16 int16 32 8 7 224 1 0 0 203 1103 MSa/s 276 811 MSa/s 6 1 12646 1942
fir_interpolate_fract_asym cint16 int16 36 3 2 144 1 0 0 301 478 MSa/s 363 396 MSa/s 5 1 6806 4750
fir_interpolate_fract_asym cint16 int16 108 9 2 432 1 0 0 1904 226 MSa/s 1967 219 MSa/s 6 1 31158 6122
fir_interpolate_fract_asym cint16 int16 108 9 4 864 1 0 0 2042 423 MSa/s 2111 409 MSa/s 6 1 34614 6182
fir_interpolate_fract_asym cint16 int16 108 9 5 1080 1 0 0 2112 511 MSa/s 2175 496 MSa/s 6 1 36342 6046
fir_interpolate_fract_asym cint16 int16 108 9 8 1728 1 0 0 2222 777 MSa/s 2285 756 MSa/s 6 1 41526 6190
fir_interpolate_fract_asym cint16 int16 120 10 3 240 1 0 0 808 297 MSa/s 870 275 MSa/s 6 1 20718 4906
fir_interpolate_fract_asym cint16 int16 120 10 7 840 1 0 0 1435 585 MSa/s 1497 561 MSa/s 6 1 28718 4924
fir_interpolate_fract_asym cint16 int16 120 10 9 720 1 0 0 965 746 MSa/s 1027 701 MSa/s 6 1 24558 4890
fir_interpolate_fract_asym cint16 int16 12 3 2 48 1 0 0 86 558 MSa/s 147 326 MSa/s 5 1 4518 1874
fir_interpolate_fract_asym cint16 int16 16 4 3 48 1 0 0 82 585 MSa/s 142 338 MSa/s 5 1 5462 1824
fir_interpolate_fract_asym cint16 int16 20 5 2 80 1 0 0 170 470 MSa/s 230 347 MSa/s 5 1 7974 2172
fir_interpolate_fract_asym cint16 int16 20 5 3 120 1 0 0 182 659 MSa/s 244 491 MSa/s 5 1 8294 2184
fir_interpolate_fract_asym cint16 int16 108 9 7 1512 1 0 0 2277 664 MSa/s 2346 644 MSa/s 6 1 39798 6118
fir_interpolate_fract_asym cint16 int16 20 5 4 256 1 0 0 257 996 MSa/s 341 750 MSa/s 5 1 10342 2216
fir_interpolate_fract_asym cint16 int16 20 5 4 160 1 0 0 182 879 MSa/s 249 642 MSa/s 5 1 8614 2216
fir_interpolate_fract_asym cint16 int16 32 8 3 96 1 0 0 181 530 MSa/s 266 360 MSa/s 6 1 11622 1894
fir_interpolate_fract_asym cint16 int16 28 7 6 336 1 0 0 314 1070 MSa/s 402 835 MSa/s 5 1 13670 3642
fir_interpolate_fract_asym cint16 int16 28 7 5 280 1 0 0 310 903 MSa/s 401 698 MSa/s 5 1 13222 3642
fir_interpolate_fract_asym cint16 int16 32 8 5 160 1 0 0 197 812 MSa/s 272 588 MSa/s 6 1 12134 1934
fir_interpolate_fract_asym cint16 int16 28 7 3 168 1 0 0 296 567 MSa/s 399 421 MSa/s 5 1 12326 3706
fir_interpolate_fract_asym cint16 int16 28 7 2 112 1 0 0 278 402 MSa/s 397 282 MSa/s 5 1 11878 3616
fir_interpolate_fract_asym cint16 int16 24 6 5 120 1 0 0 131 916 MSa/s 192 625 MSa/s 5 1 8830 2988
fir_interpolate_fract_asym cint16 int16 24 3 2 256 1 0 0 472 542 MSa/s 556 460 MSa/s 7 1 11902 2382
fir_interpolate_fract_asym cint16 int16 28 7 4 224 1 0 0 312 717 MSa/s 401 558 MSa/s 5 1 12774 3670
fir_interpolate_hb cint32 cint32 99 2 1 256 1 0 0 6315 40 MSa/s 6388 40 MSa/s 5 1 16893 3078
fir_interpolate_hb cint32 cint16 239 2 1 256 1 0 0 6711 38 MSa/s 6798 37 MSa/s 5 1 18209 4086
fir_interpolate_hb cint32 cint32 239 2 1 256 1 0 0 8255 31 MSa/s 8342 30 MSa/s 5 1 19965 4122
fir_interpolate_hb cint32 cint32 15 2 1 256 1 0 0 679 377 MSa/s 1031 248 MSa/s 5 1 15037 2622
fir_interpolate_hb cint32 cint32 11 2 1 256 1 0 0 552 463 MSa/s 1028 249 MSa/s 5 1 14909 2362
fir_interpolate_hb cint32 cint16 99 2 1 256 1 0 0 2027 126 MSa/s 2100 121 MSa/s 5 1 16177 3088
fir_interpolate_hb cint32 cint16 7 2 1 256 1 0 0 391 654 MSa/s 1025 249 MSa/s 5 1 14721 2120
fir_interpolate_hb cint32 cint32 7 2 1 256 1 0 0 422 606 MSa/s 1026 249 MSa/s 5 1 14813 2262
fir_interpolate_hb cint16 int16 99 2 1 128 5 0 0 363 352 MSa/s 464 275 MSa/s 13 5 19667 1944 1810 1770 1770 1850
fir_interpolate_hb cint32 cint16 11 2 1 256 1 0 0 392 653 MSa/s 1024 250 MSa/s 5 1 14801 2250
fir_interpolate_hb cint16 int16 99 2 1 256 1 0 1 1520 168 MSa/s 1651 155 MSa/s 8 1 9303 2786
fir_interpolate_hb cint16 int16 99 2 1 256 1 0 0 1520 168 MSa/s 1588 161 MSa/s 5 1 9259 2518
fir_interpolate_hb cint16 int16 99 2 1 128 4 0 0 368 347 MSa/s 461 277 MSa/s 11 4 16297 1944 1808 1808 1850
fir_interpolate_hb cint16 int16 99 2 1 128 3 0 0 597 214 MSa/s 673 190 MSa/s 9 3 12927 1944 1810 1848
fir_interpolate_hb cint16 int16 99 2 1 128 2 0 0 662 193 MSa/s 720 177 MSa/s 7 2 9557 2014 1880
fir_interpolate_hb int32 int32 99 2 1 256 1 0 0 1521 168 MSa/s 1589 161 MSa/s 5 1 9649 2680
fir_interpolate_hb cint32 int16 11 2 1 256 1 0 0 502 509 MSa/s 1027 249 MSa/s 5 1 14779 2198
fir_interpolate_hb cint32 cint16 15 2 1 256 1 0 0 423 605 MSa/s 1026 249 MSa/s 5 1 14849 2232
fir_interpolate_hb cint32 int16 15 2 1 256 1 0 0 503 508 MSa/s 1027 249 MSa/s 5 1 14787 2198
fir_interpolate_hb int32 int16 19 2 1 256 1 0 0 214 1196 MSa/s 517 495 MSa/s 5 1 8651 2146
fir_interpolate_hb cint32 int16 7 2 1 256 1 0 0 422 606 MSa/s 1026 249 MSa/s 5 1 14707 1942
fir_interpolate_hb cint16 int16 7 2 1 256 1 0 0 175 1462 MSa/s 516 496 MSa/s 5 1 8563 2152
fir_interpolate_hb int32 int32 7 2 1 256 1 0 0 175 1462 MSa/s 516 496 MSa/s 5 1 8577 2112
fir_interpolate_hb int32 int32 27 2 1 256 1 0 0 359 713 MSa/s 520 492 MSa/s 5 1 8753 2726
fir_interpolate_hb int32 int32 239 2 1 256 1 0 0 2412 106 MSa/s 2486 102 MSa/s 5 1 11105 3162
fir_interpolate_hb int32 int32 23 2 1 256 1 0 0 296 864 MSa/s 519 493 MSa/s 5 1 8737 2534
fir_interpolate_hb int32 int16 99 2 1 256 1 0 0 873 293 MSa/s 941 272 MSa/s 5 1 9259 2564
fir_interpolate_hb int32 int16 7 2 1 256 1 0 0 229 1117 MSa/s 517 495 MSa/s 5 1 8563 1910
fir_interpolate_hb int32 int16 239 2 1 256 1 0 0 1168 219 MSa/s 1242 206 MSa/s 5 1 10243 3118
fir_interpolate_hb cint32 int16 239 2 1 256 1 0 0 2429 105 MSa/s 2516 101 MSa/s 5 1 17347 3356
fir_interpolate_hb int32 int16 15 2 1 256 1 0 0 231 1108 MSa/s 517 495 MSa/s 5 1 8579 2054
fir_interpolate_hb float float 239 2 1 256 1 0 0 14694 17 MSa/s 14768 17 MSa/s 5 1 11361 5668
fir_interpolate_hb float float 15 2 1 256 1 0 0 1459 175 MSa/s 1517 168 MSa/s 5 1 8897 2356
fir_interpolate_hb cint32 int32 99 2 1 256 1 0 0 2027 126 MSa/s 2100 121 MSa/s 5 1 16177 3088
fir_interpolate_hb cint32 int32 7 2 1 256 1 0 0 391 654 MSa/s 1025 249 MSa/s 5 1 14721 2026
fir_interpolate_hb cint32 int32 239 2 1 256 1 0 0 6711 38 MSa/s 6798 37 MSa/s 5 1 18209 4086
fir_interpolate_hb cint32 int32 15 2 1 256 1 0 0 423 605 MSa/s 1026 249 MSa/s 5 1 14849 2232
fir_interpolate_hb cint32 int32 11 2 1 256 1 0 0 392 653 MSa/s 1025 249 MSa/s 5 1 14801 2120
fir_interpolate_hb cint32 int16 99 2 1 256 1 0 0 2862 89 MSa/s 2937 87 MSa/s 5 1 15787 2664
fir_interpolate_hb float float 7 2 1 256 1 0 0 547 468 MSa/s 605 423 MSa/s 5 1 8833 2440
fir_interpolate_hb cint16 int16 7 2 1 256 1 0 0 175 1462 MSa/s 516 496 MSa/s 5 1 8563 2136
fir_interpolate_hb cint16 int16 7 2 1 256 1 0 0 175 1462 MSa/s 516 496 MSa/s 5 1 8563 2152
fir_interpolate_hb cint16 int16 7 2 1 256 1 0 0 175 1462 MSa/s 516 496 MSa/s 5 1 8563 2152
fir_interpolate_hb cint16 int16 15 2 1 256 1 0 1 175 1462 MSa/s 518 494 MSa/s 8 1 8607 2496
fir_interpolate_hb cint16 int16 15 2 1 256 1 0 0 175 1462 MSa/s 516 496 MSa/s 5 1 8579 2048
fir_interpolate_hb cint16 int16 15 2 1 256 1 0 0 175 1462 MSa/s 499 513 MSa/s 5 1 8579 2048
fir_interpolate_hb cint16 int16 11 2 1 256 1 0 1 175 1462 MSa/s 518 494 MSa/s 8 1 8591 2584
fir_interpolate_hb cint16 int16 11 2 1 256 1 0 0 175 1462 MSa/s 516 496 MSa/s 5 1 8571 2152
fir_interpolate_hb cint16 int16 11 2 1 256 1 0 0 175 1462 MSa/s 499 513 MSa/s 5 1 8571 2152
fir_interpolate_hb cint16 cint16 99 2 1 256 1 0 0 1523 168 MSa/s 1591 160 MSa/s 5 1 9649 2704
fir_interpolate_hb cint16 cint16 7 2 1 256 1 0 0 175 1462 MSa/s 516 496 MSa/s 5 1 8577 2230
fir_interpolate_hb cint16 int16 19 2 1 256 1 0 0 231 1108 MSa/s 517 495 MSa/s 5 1 8651 2202
fir_interpolate_hb cint16 cint16 239 2 1 256 1 0 0 2412 106 MSa/s 2486 102 MSa/s 5 1 11105 3158
fir_interpolate_hb cfloat float 99 2 1 256 1 0 0 21538 11 MSa/s 21611 11 MSa/s 5 1 16433 3914
fir_interpolate_hb cfloat float 7 2 1 256 1 0 0 1060 241 MSa/s 1121 228 MSa/s 5 1 14977 2456
fir_interpolate_hb cfloat float 15 2 1 256 1 0 0 2140 119 MSa/s 2201 116 MSa/s 5 1 15105 2244
fir_interpolate_hb cfloat cfloat 7 2 1 256 1 0 0 1807 141 MSa/s 1868 137 MSa/s 5 1 15069 2288
fir_interpolate_hb cfloat cfloat 239 2 1 256 1 0 0 45362 5 MSa/s 45449 5 MSa/s 5 1 20221 8224
fir_interpolate_hb cfloat cfloat 15 2 1 256 1 0 0 2924 87 MSa/s 2985 85 MSa/s 5 1 15293 2632
fir_interpolate_hb cfloat cfloat 11 2 1 256 1 0 0 2383 107 MSa/s 2444 104 MSa/s 5 1 15165 2456
fir_interpolate_hb cint16 int16 7 2 1 256 1 0 0 175 1462 MSa/s 516 496 MSa/s 5 1 8563 2136
fir_interpolate_hb cint16 cint16 23 2 1 256 1 0 0 296 864 MSa/s 519 493 MSa/s 5 1 8737 2566
fir_interpolate_hb cint16 int16 19 2 1 256 1 0 1 231 1108 MSa/s 519 493 MSa/s 8 1 8679 2722
fir_interpolate_hb cint16 cint16 27 2 1 256 1 0 0 359 713 MSa/s 520 492 MSa/s 5 1 8753 2742
fir_interpolate_hb cint16 int16 23 2 1 256 1 0 0 169 1514 MSa/s 499 513 MSa/s 5 1 8663 2048
fir_interpolate_hb cint16 int16 23 2 1 256 1 0 0 231 1108 MSa/s 507 504 MSa/s 5 1 8659 2286
fir_interpolate_hb cint16 int16 7 2 1 256 1 0 0 175 1462 MSa/s 516 496 MSa/s 5 1 8563 2136
fir_interpolate_hb cint16 int16 7 2 1 64 1 0 0 73 876 MSa/s 136 470 MSa/s 5 1 3955 2120
fir_interpolate_hb cint16 int16 7 2 1 512 1 0 0 311 1646 MSa/s 1023 500 MSa/s 5 1 14707 2136
fir_interpolate_hb cint16 int16 7 2 1 32 1 0 0 85 376 MSa/s 140 228 MSa/s 5 1 3187 2140
fir_interpolate_hb cint16 int16 7 2 1 256 1 0 0 175 1462 MSa/s 516 496 MSa/s 5 1 8563 2136
fir_interpolate_hb cint16 int16 7 2 1 128 1 0 0 107 1196 MSa/s 263 486 MSa/s 5 1 5491 2136
fir_interpolate_hb cint16 int16 7 2 1 256 1 0 0 175 1462 MSa/s 516 496 MSa/s 5 1 8563 2136
fir_interpolate_hb cint16 int16 47 2 1 256 1 1 0 244 1049 MSa/s 513 499 MSa/s 7 1 11050 2382
fir_interpolate_hb cint16 int16 7 2 1 256 1 0 1 175 1462 MSa/s 518 494 MSa/s 8 1 8583 2520
fir_interpolate_hb cint16 int16 47 2 1 256 1 0 0 940 272 MSa/s 1015 252 MSa/s 5 1 8803 1888
fir_interpolate_hb cint16 int16 43 2 1 256 1 1 0 248 1032 MSa/s 514 498 MSa/s 7 1 11042 2430
fir_interpolate_hb cint16 int16 43 2 1 256 1 0 0 557 459 MSa/s 631 405 MSa/s 5 1 8795 1914
fir_interpolate_hb cint16 int16 27 2 1 256 1 0 1 231 1108 MSa/s 519 493 MSa/s 8 1 8703 2934
fir_interpolate_hb cint16 int16 27 2 1 256 1 0 0 231 1108 MSa/s 517 495 MSa/s 5 1 8667 2334
fir_interpolate_hb cint16 int16 27 2 1 256 1 0 0 231 1108 MSa/s 507 504 MSa/s 5 1 8667 2334
fir_interpolate_hb cint16 int16 239 2 1 256 1 0 1 2867 89 MSa/s 3028 84 MSa/s 8 1 10367 3348
fir_interpolate_hb cint16 int16 7 2 1 256 1 0 0 175 1462 MSa/s 516 496 MSa/s 5 1 8563 2136
fir_interpolate_hb cint16 int16 239 2 1 256 1 0 0 2867 89 MSa/s 2941 87 MSa/s 5 1 10243 3136
fir_sr_asym cint32 cint32 32 1 1 256 1 0 0 4395 58 MSa/s 4463 57 MSa/s 5 1 11980 2944
fir_sr_asym cint32 cint32 16 1 1 256 1 0 0 2256 113 MSa/s 2322 110 MSa/s 5 1 11212 3162
fir_sr_asym cint32 cint16 32 1 1 256 1 0 0 2226 115 MSa/s 2294 111 MSa/s 5 1 11466 3420
fir_sr_asym cint16 int16 8 1 1 64 1 0 0 69 927 MSa/s 124 516 MSa/s 5 1 3418 1902
fir_sr_asym cint16 int16 8 1 1 256 1 0 0 165 1551 MSa/s 265 966 MSa/s 5 1 6490 1902
fir_sr_asym cint16 int16 8 1 1 256 1 0 0 165 1551 MSa/s 265 966 MSa/s 5 1 6490 1902
fir_sr_asym cint32 int16 16 1 1 256 1 0 0 548 467 MSa/s 612 418 MSa/s 5 1 10826 2180
fir_sr_asym cint32 cint16 16 1 1 256 1 0 0 1060 241 MSa/s 1125 227 MSa/s 5 1 10954 2722
fir_sr_asym cint32 int16 32 1 1 256 1 0 0 1060 241 MSa/s 1127 227 MSa/s 5 1 11210 3024
fir_sr_asym int32 int32 16 1 1 256 1 0 0 548 467 MSa/s 608 421 MSa/s 5 1 6730 3088
fir_sr_asym cint32 int32 32 1 1 256 1 0 0 2226 115 MSa/s 2294 111 MSa/s 5 1 11466 3420
fir_sr_asym float float 16 1 1 256 1 0 0 1643 155 MSa/s 1703 150 MSa/s 5 1 6985 2556
fir_sr_asym float float 32 1 1 256 1 0 0 2603 98 MSa/s 2663 96 MSa/s 5 1 7369 3800
fir_sr_asym int16 int16 16 1 1 256 1 0 0 163 1570 MSa/s 220 1163 MSa/s 5 1 4498 2388
fir_sr_asym int16 int16 32 1 1 256 1 0 0 292 876 MSa/s 349 733 MSa/s 5 1 4690 3264
fir_sr_asym int32 int16 16 1 1 256 1 0 0 292 876 MSa/s 352 727 MSa/s 5 1 6602 2224
fir_sr_asym int32 int16 32 1 1 256 1 0 0 548 467 MSa/s 608 421 MSa/s 5 1 6858 2982
fir_sr_asym int32 int32 32 1 1 256 1 0 0 1128 226 MSa/s 1189 215 MSa/s 5 1 7114 3776
fir_sr_asym cint16 int16 8 1 1 256 1 0 0 165 1551 MSa/s 265 966 MSa/s 5 1 6490 1902
fir_sr_asym cint32 int32 16 1 1 256 1 0 0 1060 241 MSa/s 1125 227 MSa/s 5 1 10954 2722
fir_sr_asym cint16 int16 8 1 1 128 1 0 0 101 1267 MSa/s 157 815 MSa/s 5 1 4442 1902
fir_sr_asym cint16 int16 8 1 1 256 1 0 1 165 1551 MSa/s 275 930 MSa/s 8 1 6526 2514
fir_sr_asym cint16 int16 64 1 1 256 1 0 0 1138 224 MSa/s 1206 212 MSa/s 5 1 7370 3470
fir_sr_asym cint16 int16 64 1 1 256 1 0 1 1138 224 MSa/s 1297 197 MSa/s 8 1 7502 3560
fir_sr_asym cfloat cfloat 16 1 1 256 1 0 0 4166 61 MSa/s 4231 60 MSa/s 5 1 11481 3092
fir_sr_asym cfloat cfloat 32 1 1 256 1 0 0 9318 27 MSa/s 9386 27 MSa/s 5 1 12249 2924
fir_sr_asym cfloat float 16 1 1 256 1 0 0 2337 109 MSa/s 2402 106 MSa/s 5 1 11225 2342
fir_sr_asym cfloat float 32 1 1 256 1 0 0 4594 55 MSa/s 4661 54 MSa/s 5 1 11737 3664
fir_sr_asym cint16 cint16 16 1 1 256 1 0 0 548 467 MSa/s 608 421 MSa/s 5 1 6730 3088
fir_sr_asym cint16 cint16 32 1 1 256 1 0 0 1128 226 MSa/s 1189 215 MSa/s 5 1 7114 3776
fir_sr_asym cint16 int16 128 1 1 256 1 0 0 2123 120 MSa/s 2200 116 MSa/s 5 1 8394 4000
fir_sr_asym cint16 int16 128 1 1 256 1 0 1 2123 120 MSa/s 2319 110 MSa/s 8 1 8654 4088
fir_sr_asym cint16 cint16 24 1 1 256 1 0 0 900 284 MSa/s 975 262 MSa/s 5 1 6922 2788
fir_sr_asym cint16 int16 128 1 1 256 3 0 0 1211 211 MSa/s 1347 190 MSa/s 9 3 17464 3876 2806 3704
fir_sr_asym cint16 int16 128 1 1 256 4 0 0 748 342 MSa/s 885 289 MSa/s 11 4 21951 3212 2882 2882 3012
fir_sr_asym cint16 int16 128 1 1 256 5 0 0 672 380 MSa/s 811 315 MSa/s 13 5 26534 2988 2690 2674 2690 2816
fir_sr_asym cint16 int16 32 1 1 256 1 0 1 548 467 MSa/s 665 384 MSa/s 8 1 6926 3262
fir_sr_asym cint16 int16 16 1 1 256 1 0 0 292 876 MSa/s 352 727 MSa/s 5 1 6602 2224
fir_sr_asym cint16 int16 16 1 1 256 1 0 1 341 750 MSa/s 429 596 MSa/s 8 1 6638 3048
fir_sr_asym cint16 int16 240 1 1 256 1 0 0 4471 57 MSa/s 4559 56 MSa/s 5 1 10186 4098
fir_sr_asym cint16 int16 240 1 1 256 1 0 1 4710 54 MSa/s 4827 53 MSa/s 8 1 10670 4534
fir_sr_asym cint16 int16 32 1 1 256 1 0 0 548 467 MSa/s 608 421 MSa/s 5 1 6858 2982
fir_sr_asym cint16 int16 128 1 1 256 2 0 0 1684 152 MSa/s 1774 144 MSa/s 7 2 12913 3714 3394
fir_sr_sym cint32 cint16 32 1 1 256 1 0 0 1331 192 MSa/s 1403 182 MSa/s 5 1 11270 2092
fir_sr_sym cint32 cint16 16 1 1 256 1 0 0 816 313 MSa/s 880 290 MSa/s 5 1 10854 1888
fir_sr_sym cint16 int16 89 1 1 256 1 0 0 1054 242 MSa/s 1132 226 MSa/s 5 1 7576 2490
fir_sr_sym cint16 int16 8 1 1 256 1 0 1 117 2188 MSa/s 266 962 MSa/s 8 1 6506 2364
fir_sr_sym cint16 int16 8 1 1 256 1 0 0 117 2188 MSa/s 264 969 MSa/s 5 1 6486 1880
fir_sr_sym cint16 int16 8 1 1 64 1 0 0 59 1084 MSa/s 114 561 MSa/s 5 1 3414 1784
fir_sr_sym cint32 cint32 16 1 1 256 1 0 0 1573 162 MSa/s 1637 156 MSa/s 5 1 11014 1764
fir_sr_sym cint16 int16 8 1 1 256 1 0 0 117 2188 MSa/s 264 969 MSa/s 5 1 6486 1880
fir_sr_sym cint32 cint32 32 1 1 256 1 0 0 2599 98 MSa/s 2671 95 MSa/s 5 1 11590 2118
fir_sr_sym int32 int16 16 1 1 256 1 0 0 166 1542 MSa/s 265 966 MSa/s 5 1 6566 2090
fir_sr_sym cint32 int16 32 1 1 256 1 0 0 1071 239 MSa/s 1143 223 MSa/s 5 1 11110 1832
fir_sr_sym cint32 int32 16 1 1 256 1 0 0 816 313 MSa/s 880 290 MSa/s 5 1 10854 1888
fir_sr_sym cint32 int32 32 1 1 256 1 0 0 1331 192 MSa/s 1403 182 MSa/s 5 1 11270 2092
fir_sr_sym float float 16 1 1 256 1 0 0 1794 142 MSa/s 1854 138 MSa/s 5 1 6885 2588
fir_sr_sym float float 32 1 1 256 1 0 0 4549 56 MSa/s 4612 55 MSa/s 5 1 7173 2380
fir_sr_sym int16 int16 16 1 1 256 1 0 0 101 2534 MSa/s 158 1620 MSa/s 5 1 4454 2102
fir_sr_sym int16 int16 32 1 1 256 1 0 0 164 1560 MSa/s 221 1158 MSa/s 5 1 4582 2488
fir_sr_sym int16 int16 96 1 1 512 1 0 0 942 543 MSa/s 1014 504 MSa/s 5 1 7206 2880
fir_sr_sym cint16 int16 8 1 1 256 1 0 0 117 2188 MSa/s 264 969 MSa/s 5 1 6486 1880
fir_sr_sym int32 int16 32 1 1 256 1 0 0 457 560 MSa/s 519 493 MSa/s 5 1 6758 1924
fir_sr_sym cint32 int16 16 1 1 256 1 0 0 687 372 MSa/s 751 340 MSa/s 5 1 10790 1740
fir_sr_sym cint16 int16 8 1 1 128 1 0 0 77 1662 MSa/s 137 934 MSa/s 5 1 4438 1880
fir_sr_sym cint16 int16 16 1 1 256 1 0 0 164 1560 MSa/s 274 934 MSa/s 5 1 6566 1942
fir_sr_sym cint16 int16 64 1 1 256 1 0 0 921 277 MSa/s 989 258 MSa/s 5 1 7174 2044
fir_sr_sym int32 int32 16 1 1 256 1 0 0 292 876 MSa/s 352 727 MSa/s 5 1 6630 2468
fir_sr_sym cint16 cint16 16 1 1 256 1 0 0 292 876 MSa/s 352 727 MSa/s 5 1 6630 2468
fir_sr_sym cint16 cint16 24 1 1 256 1 0 0 421 608 MSa/s 496 516 MSa/s 5 1 6822 2972
fir_sr_sym cint16 cint16 24 1 1 256 2 0 0 303 844 MSa/s 372 688 MSa/s 6 2 9523 2330 2064
fir_sr_sym cint16 cint16 30 1 1 512 3 0 0 888 576 MSa/s 986 519 MSa/s 9 3 20736 1598 1662 1754
fir_sr_sym cint16 cint16 32 1 1 256 1 0 0 652 392 MSa/s 714 358 MSa/s 5 1 6918 2056
fir_sr_sym cint16 int16 128 1 1 256 1 0 0 1438 178 MSa/s 1515 168 MSa/s 5 1 8006 2676
fir_sr_sym cint16 int16 128 1 1 256 1 0 1 1438 178 MSa/s 1606 159 MSa/s 8 1 8138 2758
fir_sr_sym cint16 int16 128 1 1 256 2 0 0 928 275 MSa/s 1045 244 MSa/s 7 2 12427 2432 2218
fir_sr_sym cint16 int16 128 1 1 256 3 0 0 798 320 MSa/s 964 265 MSa/s 9 3 16848 1986 1770 1796
fir_sr_sym cint16 int16 128 1 1 256 4 0 0 670 382 MSa/s 835 306 MSa/s 11 4 21269 1986 1654 1654 1772
fir_sr_sym cint16 int16 128 1 1 256 5 0 0 669 382 MSa/s 836 306 MSa/s 13 5 25690 1986 1600 1600 1600 1734
fir_sr_sym cint16 int16 129 1 1 256 1 0 0 1442 177 MSa/s 1525 167 MSa/s 5 1 8168 2742
fir_sr_sym cint16 int16 16 1 1 256 1 0 0 166 1542 MSa/s 265 966 MSa/s 5 1 6566 1942
fir_sr_sym cint16 int16 16 1 1 256 1 0 1 166 1542 MSa/s 280 914 MSa/s 8 1 6602 2554
fir_sr_sym cint16 int16 199 1 1 256 1 0 0 2017 126 MSa/s 2107 121 MSa/s 5 1 9012 3074
fir_sr_sym cint16 int16 240 1 1 256 1 0 0 2339 109 MSa/s 2427 105 MSa/s 5 1 9510 3108
fir_sr_sym cint16 int16 240 1 1 256 1 0 1 2339 109 MSa/s 2569 99 MSa/s 8 1 9738 3548
fir_sr_sym cint16 int16 32 1 1 256 1 0 0 663 386 MSa/s 724 353 MSa/s 5 1 6758 1788
fir_sr_sym cint16 int16 32 1 1 256 1 0 1 707 362 MSa/s 797 321 MSa/s 8 1 6794 2620
fir_sr_sym cint16 int16 63 1 1 256 1 0 0 858 298 MSa/s 931 274 MSa/s 5 1 7172 2060
fir_sr_sym cint16 int16 64 1 1 256 1 0 1 921 277 MSa/s 1046 244 MSa/s 8 1 7242 2324
fir_sr_sym int32 int32 32 1 1 256 1 0 0 652 392 MSa/s 714 358 MSa/s 5 1 6918 2056
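
As an illustration of how cascading affects cycleCountAvg, the Python sketch below takes the five fir_sr_asym rows above for cint16 data, int16 coefficients, 128 taps and a 256-sample window, and computes the speed-up relative to the single-kernel case. It assumes that the column taking the values 1 to 5 in those rows is CASC_LEN, which is consistent with the number of PROGRAM_MEMORY figures in each row. The names `cycles_by_casc_len` and `cascade_speedup` are hypothetical and not part of the library.

```python
# cycleCountAvg copied from the fir_sr_asym rows above (cint16/int16, 128 taps,
# 256-sample window). Keys are assumed to be the CASC_LEN value of each row.
cycles_by_casc_len = {1: 2123, 2: 1684, 3: 1211, 4: 748, 5: 672}


def cascade_speedup(cycles):
    """Return {casc_len: (speedup, per_kernel_efficiency)} relative to CASC_LEN=1."""
    base = cycles[1]
    return {n: (base / c, base / c / n) for n, c in sorted(cycles.items())}


for n, (speedup, efficiency) in cascade_speedup(cycles_by_casc_len).items():
    print(f"CASC_LEN={n}: speedup {speedup:.2f}x, efficiency {efficiency:.2f}")
```

Because the table reports the highest cycleCountAvg among the kernels of a cascade, the speed-up computed this way reflects the slowest kernel in each configuration.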

FFT

The following table gives results for the FFT/IFFT function across a wide range of supported parameters, which are defined in: L2 FFT configuration parameters.

fft_benchmark.csv

FFT benchmark
Library Element DATA_TYPE TWIDDLE_TYPE POINT_SIZE FFT_NIFFT CASC_LEN DYN_PT_SIZE WINDOW_VSIZE API_IO PARALLEL_POWER cycleCountAvg throughputAvg initiationInterval throughputInitIntAvg NUM_BANKS NUM_ME DATA_MEMORY PROGRAM_MEMORY
fft_ifft_dit_1ch cfloat cfloat 1024 1 1 1 1024 0 0 2390 428 MSa/s 2983 343 MSa/s 13 1 55864 9190
fft_ifft_dit_1ch cint32 cint16 1024 1 3 1 1024 0 0 700 1462 MSa/s 2076 493 MSa/s 31 3 101928 9500 2894 2990
fft_ifft_dit_1ch cint32 cint16 1024 1 2 1 1024 0 0 700 1462 MSa/s 2071 494 MSa/s 22 2 76848 3974 9500
fft_ifft_dit_1ch cint32 cint16 1024 1 1 1 1024 0 0 1054 971 MSa/s 2070 494 MSa/s 13 1 51768 12928
fft_ifft_dit_1ch cint32 cint16 1024 1 1 0 1024 0 0 1782 574 MSa/s 2071 494 MSa/s 9 1 51640 5010
fft_ifft_dit_1ch cint16 cint16 64 1 1 1 64 0 0 488 131 MSa/s 532 120 MSa/s 8 1 11832 7660
fft_ifft_dit_1ch cint16 cint16 64 1 1 0 64 0 0 203 315 MSa/s 245 261 MSa/s 8 1 11704 3100
fft_ifft_dit_1ch cint16 cint16 512 1 1 0 512 0 0 933 548 MSa/s 978 523 MSa/s 9 1 25528 4722
fft_ifft_dit_1ch cint16 cint16 512 0 1 0 512 0 0 933 548 MSa/s 978 523 MSa/s 9 1 25528 4722
fft_ifft_dit_1ch cint16 cint16 32 0 1 1 32 0 0 349 91 MSa/s 393 81 MSa/s 8 1 11320 7460
fft_ifft_dit_1ch cint16 cint16 32 0 1 0 32 0 0 130 246 MSa/s 167 191 MSa/s 8 1 11192 2310
fft_ifft_dit_1ch cint16 cint16 256 1 1 0 256 0 0 469 545 MSa/s 512 500 MSa/s 8 1 16824 4322
fft_ifft_dit_1ch cint16 cint16 256 1 3 1 256 0 0 344 744 MSa/s 641 399 MSa/s 15 3 42536 6732 3006 2910
fft_ifft_dit_1ch cint16 cint16 256 1 3 0 256 0 0 299 856 MSa/s 526 486 MSa/s 15 3 42280 2898 2082 1954
fft_ifft_dit_1ch cint16 cint16 256 1 2 1 256 0 0 347 737 MSa/s 596 429 MSa/s 11 2 29744 4006 6732
fft_ifft_dit_1ch cint16 cint16 256 1 2 0 256 0 0 250 1024 MSa/s 440 581 MSa/s 14 2 29552 2754 2898
fft_ifft_dit_1ch cint16 cint16 256 1 1 1 256 0 0 577 443 MSa/s 679 377 MSa/s 8 1 16952 10862
fft_ifft_dit_1ch cint16 cint16 256 1 1 0 256 0 0 469 545 MSa/s 567 451 MSa/s 8 1 16824 4322
fft_ifft_dit_1ch cint32 cint16 1024 1 4 1 1024 0 0 492 2081 MSa/s 2080 492 MSa/s 38 4 127008 2894 2990 6556 3550
fft_ifft_dit_1ch cint32 cint16 1024 1 5 1 1024 0 0 427 2398 MSa/s 2085 491 MSa/s 49 5 152088 2862 2990 3622 3598 3550
fft_ifft_dit_1ch cint32 cint16 128 1 1 0 1024 0 0 2207 463 MSa/s 2268 451 MSa/s 7 1 42424 4098
fft_ifft_dit_1ch cint32 cint16 128 1 2 0 1024 0 0 1219 840 MSa/s 2076 493 MSa/s 12 2 67440 2624 2958
fft_ifft_dit_1ch cint32 cint16 64 1 2 0 1024 0 0 2114 484 MSa/s 2185 468 MSa/s 12 2 67440 2602 1890
fft_ifft_dit_1ch cint32 cint16 64 1 1 0 1024 0 0 3000 341 MSa/s 3060 334 MSa/s 7 1 42424 3118
fft_ifft_dit_1ch cint32 cint16 512 1 2 0 1024 0 0 1115 918 MSa/s 2072 494 MSa/s 14 2 71024 2990 3350
fft_ifft_dit_1ch cint32 cint16 512 1 1 0 1024 0 0 1895 540 MSa/s 2073 493 MSa/s 8 1 46008 4802
fft_ifft_dit_1ch cint32 cint16 512 1 1 1 512 0 0 733 698 MSa/s 1069 478 MSa/s 8 1 29752 12976
fft_ifft_dit_1ch cint32 cint16 512 1 1 0 512 0 0 936 547 MSa/s 1040 492 MSa/s 8 1 29624 4626
fft_ifft_dit_1ch cint32 cint16 32 1 1 1 32 0 0 320 100 MSa/s 364 87 MSa/s 7 1 10808 7280
fft_ifft_dit_1ch cint32 cint16 32 1 1 0 32 0 0 139 230 MSa/s 176 181 MSa/s 7 1 10680 2358
fft_ifft_dit_1ch cint16 cint16 256 0 1 0 256 0 0 469 545 MSa/s 512 500 MSa/s 8 1 16824 4322
fft_ifft_dit_1ch cint32 cint16 32 1 2 0 1024 0 0 3008 340 MSa/s 3078 332 MSa/s 12 2 67440 2172 1680
fft_ifft_dit_1ch cint32 cint16 256 1 1 1 256 0 0 548 467 MSa/s 600 426 MSa/s 7 1 19000 10160
fft_ifft_dit_1ch cint32 cint16 256 1 1 0 256 0 0 465 550 MSa/s 524 488 MSa/s 7 1 18872 4306
fft_ifft_dit_1ch cint32 cint16 256 1 2 0 1024 0 0 963 1063 MSa/s 2072 494 MSa/s 12 2 68464 3058 2974
fft_ifft_dit_1ch cint32 cint16 256 1 1 0 1024 0 0 1871 547 MSa/s 2073 493 MSa/s 7 1 43448 4512
fft_ifft_dit_1ch cint32 cint16 16 1 1 0 16 0 0 79 202 MSa/s 114 140 MSa/s 7 1 10168 1778
fft_ifft_dit_1ch cint32 cint16 16 1 1 0 1024 0 0 3993 256 MSa/s 4054 252 MSa/s 7 1 42424 2230
fft_ifft_dit_1ch cint32 cint16 128 1 1 1 128 0 0 487 262 MSa/s 532 240 MSa/s 7 1 13880 10160
fft_ifft_dit_1ch cint32 cint16 128 1 1 0 128 0 0 282 453 MSa/s 320 400 MSa/s 7 1 13752 3858
fft_ifft_dit_1ch cint32 cint16 32 1 1 0 1024 0 0 4069 251 MSa/s 4129 248 MSa/s 7 1 42424 2524
fft_ifft_dit_1ch cint16 cint16 2048 1 1 0 2048 0 0 3913 523 MSa/s 3974 515 MSa/s 13 1 79288 5490
fft_ifft_dit_1ch cint16 cint16 2048 0 1 0 2048 0 0 3913 523 MSa/s 3974 515 MSa/s 13 1 79288 5490
fft_ifft_dit_1ch cint16 cint16 16 1 1 0 16 0 0 81 197 MSa/s 117 136 MSa/s 8 1 10936 1796
fft_ifft_dit_1ch cfloat cfloat 256 1 3 1 256 0 0 564 453 MSa/s 1076 237 MSa/s 15 3 44584 5738 3796 3562
fft_ifft_dit_1ch cfloat cfloat 256 1 3 0 256 0 0 633 404 MSa/s 1070 239 MSa/s 14 3 44328 3160 2296 2028
fft_ifft_dit_1ch cfloat cfloat 256 1 2 1 256 0 0 652 392 MSa/s 1065 240 MSa/s 10 2 31792 4916 5738
fft_ifft_dit_1ch cfloat cfloat 256 1 2 0 256 0 0 802 319 MSa/s 1155 221 MSa/s 10 2 31600 3160 3020
fft_ifft_dit_1ch cfloat cfloat 256 1 2 0 1024 0 0 2935 348 MSa/s 3021 338 MSa/s 12 2 68464 3278 3248
fft_ifft_dit_1ch cfloat cfloat 256 1 1 1 256 0 0 976 262 MSa/s 1142 224 MSa/s 7 1 19000 7862
fft_ifft_dit_1ch cfloat cfloat 256 1 1 1 256 0 0 976 262 MSa/s 1024 250 MSa/s 7 1 19000 7862
fft_ifft_dit_1ch cfloat cfloat 256 1 1 0 256 0 0 1306 196 MSa/s 1469 174 MSa/s 7 1 18872 4684
fft_ifft_dit_1ch cfloat cfloat 32 1 1 0 1024 0 0 6297 162 MSa/s 6358 161 MSa/s 7 1 42424 3538
fft_ifft_dit_1ch cfloat cfloat 256 1 1 0 256 0 0 1306 196 MSa/s 1351 189 MSa/s 7 1 18872 4684
fft_ifft_dit_1ch cfloat cfloat 16 1 1 0 16 0 0 126 126 MSa/s 168 95 MSa/s 7 1 10168 2344
fft_ifft_dit_1ch cfloat cfloat 128 1 3 1 128 0 0 392 326 MSa/s 754 169 MSa/s 14 3 35368 5146 3796 3562
fft_ifft_dit_1ch cfloat cfloat 128 1 3 0 128 0 0 286 447 MSa/s 571 224 MSa/s 14 3 35112 2720 2296 2028
fft_ifft_dit_1ch cfloat cfloat 128 1 2 1 128 0 0 489 261 MSa/s 771 166 MSa/s 12 2 24624 5850 4250
fft_ifft_dit_1ch cfloat cfloat 128 1 2 0 128 0 0 371 345 MSa/s 590 216 MSa/s 10 2 24432 3144 2580
fft_ifft_dit_1ch cfloat cfloat 128 1 1 1 128 0 0 770 166 MSa/s 872 146 MSa/s 7 1 13880 7158
fft_ifft_dit_1ch cfloat cfloat 128 1 1 1 128 0 0 721 177 MSa/s 766 167 MSa/s 7 1 13880 7158
fft_ifft_dit_1ch cfloat cfloat 128 1 1 0 128 0 0 647 197 MSa/s 746 171 MSa/s 7 1 13752 4204
fft_ifft_dit_1ch cfloat cfloat 256 1 1 0 1024 0 0 5225 195 MSa/s 5286 193 MSa/s 7 1 43448 4956
fft_ifft_dit_1ch cint32 cint16 64 1 1 0 64 0 0 193 331 MSa/s 231 277 MSa/s 7 1 11704 2972
fft_ifft_dit_1ch cfloat cfloat 32 1 1 0 32 0 0 213 150 MSa/s 255 125 MSa/s 7 1 10680 3268
fft_ifft_dit_1ch cfloat cfloat 32 1 2 0 1024 0 0 3774 271 MSa/s 3863 265 MSa/s 12 2 67440 2650 2250
fft_ifft_dit_1ch cint16 cint16 128 1 3 1 128 0 0 274 467 MSa/s 496 258 MSa/s 15 3 34344 6716 3006 2910
fft_ifft_dit_1ch cint16 cint16 128 1 3 0 128 0 0 159 805 MSa/s 335 382 MSa/s 15 3 34088 2498 2082 1954
fft_ifft_dit_1ch cint16 cint16 128 1 2 1 128 0 0 280 457 MSa/s 475 269 MSa/s 13 2 23600 6716 4006
fft_ifft_dit_1ch cint16 cint16 128 1 2 0 128 0 0 161 795 MSa/s 300 426 MSa/s 12 2 23408 2754 2498
fft_ifft_dit_1ch cint16 cint16 128 1 1 1 128 0 0 498 257 MSa/s 568 225 MSa/s 8 1 12856 10532
fft_ifft_dit_1ch cint16 cint16 128 1 1 0 128 0 0 291 439 MSa/s 354 361 MSa/s 8 1 12728 3938
fft_ifft_dit_1ch cint16 cint16 128 1 1 0 128 0 0 291 439 MSa/s 329 389 MSa/s 8 1 12728 3938
fft_ifft_dit_1ch cint16 cint16 128 1 1 0 128 1 1 205 624 MSa/s 592 241 MSa/s 14 4 38624 3618 2210 3618 2210
fft_ifft_dit_1ch cfloat cfloat 32 1 1 1 32 0 0 529 60 MSa/s 573 55 MSa/s 7 1 10808 5600
fft_ifft_dit_1ch cint16 cint16 128 0 1 0 128 0 0 291 439 MSa/s 329 389 MSa/s 8 1 12728 3938
fft_ifft_dit_1ch cint16 cint16 1024 0 1 0 1024 0 0 1790 572 MSa/s 1842 555 MSa/s 10 1 43448 5122
fft_ifft_dit_1ch cfloat cfloat 64 1 2 0 1024 0 0 3011 340 MSa/s 3099 330 MSa/s 12 2 67440 2814 2762
fft_ifft_dit_1ch cfloat cfloat 64 1 1 1 64 0 0 581 110 MSa/s 625 102 MSa/s 7 1 11832 6416
fft_ifft_dit_1ch cfloat cfloat 64 1 1 0 64 0 0 347 184 MSa/s 389 164 MSa/s 7 1 11704 3836
fft_ifft_dit_1ch cfloat cfloat 64 1 1 0 1024 0 0 5453 187 MSa/s 5514 185 MSa/s 7 1 42424 4132
fft_ifft_dit_1ch cfloat cfloat 512 1 2 0 1024 0 0 2804 365 MSa/s 2894 353 MSa/s 14 2 72560 3264 3724
fft_ifft_dit_1ch cfloat cfloat 512 1 1 0 512 0 0 2735 187 MSa/s 2786 183 MSa/s 8 1 31160 5068
fft_ifft_dit_1ch cfloat cfloat 512 1 1 0 1024 0 0 5495 186 MSa/s 5556 184 MSa/s 8 1 47544 5464
fft_ifft_dit_1ch cint16 cint16 1024 1 1 0 1024 0 0 1790 572 MSa/s 1842 555 MSa/s 10 1 43448 5122
fft_ifft_dit_1ch cint32 cint16 64 1 1 1 64 0 0 442 144 MSa/s 486 131 MSa/s 7 1 11832 7216
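
The throughput columns can be cross-checked against the cycle counts. The Python sketch below recomputes throughputAvg and throughputInitIntAvg for three rows of the FFT table from WINDOW_VSIZE, cycleCountAvg and initiationInterval. It assumes a 1 GHz AIE clock and truncation to whole MSa/s; both are inferred from the figures in the table rather than stated in this section, and `throughput_msps` is a hypothetical helper name.

```python
# Assumption: the simulated AIE clock is 1 GHz (1000 MHz). This is inferred
# from the table values, not stated in the text.
ASSUMED_CLOCK_MHZ = 1000


def throughput_msps(window_vsize, cycles, clock_mhz=ASSUMED_CLOCK_MHZ):
    """Samples per window divided by cycles per window, scaled to MSa/s.

    Integer division truncates, which matches the values in the table.
    """
    return window_vsize * clock_mhz // cycles


# (WINDOW_VSIZE, cycleCountAvg, initiationInterval) copied from rows above.
rows = {
    "cint16 1024-point": (1024, 1790, 1842),
    "cfloat 1024-point": (1024, 2390, 2983),
    "cint16 256-point": (256, 469, 512),
}

for name, (window, cycles, interval) in rows.items():
    print(name, throughput_msps(window, cycles), throughput_msps(window, interval))
```

The gap between the two results for a given row is the kernel and window buffer overhead that initiationInterval includes and cycleCountAvg does not.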

Matrix Multiply

The following table gives results for the Matrix Multiply function across a wide range of supported parameters, which are defined in: L2 Matrix Multiply Configuration Parameters.

Note

cycleCountAvg excludes the cycle counts of the additional shuffling/tiling widget kernels, whereas initiationInterval and PROGRAM_MEMORY include them.

matrix_mult_benchmark.csv

Matrix Multiply benchmark
Library Element T_DATA_A T_DATA_B P_DIM_A P_DIM_AB P_DIM_B P_ADD_TILING_A P_ADD_TILING_B P_ADD_DETILING_OUT P_INPUT_WINDOW_VSIZE_A P_INPUT_WINDOW_VSIZE_B P_CASC_LEN NITER cycleCountAvg throughputAvg initiationInterval throughputInitIntAvg NUM_BANKS NUM_ME DATA_MEMORY PROGRAM_MEMORY
matrix_mult cfloat cfloat 8 64 4 1 1 1 512 256 4 16 687.0 745 MSa/s 13250.0 40 MSa/s 31 9 49835 1766 3032 1766 3280 1766 3280 1766 1342 3520
matrix_mult cint16 int32 8 64 4 1 1 1 512 256 4 16 274.0 1868 MSa/s 6439.0 86 MSa/s 33 9 35759 1782 2750 1782 2862 1782 2862 1782 1326 3100
matrix_mult cint32 cint16 8 64 4 1 1 1 512 256 4 16 235.0 2178 MSa/s 5775.0 97 MSa/s 28 9 43951 1886 1684 1886 1750 1886 1750 1886 1198 1952
matrix_mult cint32 cint32 8 64 4 1 1 1 512 256 4 16 369.0 1387 MSa/s 8035.0 67 MSa/s 27 9 48553 1870 1924 1870 2084 1870 2084 1870 1198 2244
matrix_mult cint32 int32 8 64 4 1 1 1 512 256 4 16 235.0 2178 MSa/s 5775.0 97 MSa/s 28 9 43951 1886 1684 1886 1750 1886 1750 1886 1198 1952
matrix_mult float cfloat 8 64 4 1 1 1 512 256 4 16 425.0 1204 MSa/s 8976.0 60 MSa/s 25 9 41134 2710 2354 2710 2496 2710 2496 2710 1198 2720
matrix_mult cint16 cint16 8 8 8 1 1 1 64 64 1 100 106.0 4830 MSa/s 18575.0 27 MSa/s 9 3 10257 1854 2334 1358
matrix_mult int16 cint32 8 64 4 1 1 1 512 256 4 16 172.0 2976 MSa/s 4701.0 121 MSa/s 25 9 36015 2318 1798 2302 1806 2318 1806 2350 1198 2024
matrix_mult int16 int16 16 16 16 1 1 1 256 256 1 16 240.0 17066 MSa/s 5401.0 795 MSa/s 11 3 13326 1854 2012 1390
matrix_mult int32 cint16 8 64 4 1 1 1 512 256 4 16 274.0 1868 MSa/s 6439.0 86 MSa/s 33 9 35759 1782 2750 1782 2862 1782 2862 1782 1326 3100
matrix_mult int16 cint16 16 16 16 1 1 1 256 256 1 16 336.0 12190 MSa/s 6837.0 621 MSa/s 11 3 17424 2198 2408 1662
matrix_mult cint16 cint16 8 8 8 1 1 1 64 64 1 100 106.0 4830 MSa/s 18575.0 27 MSa/s 9 3 10257 1854 2334 1358
matrix_mult cint16 cint16 8 64 4 1 1 1 512 256 4 16 126.0 4063 MSa/s 4008.0 145 MSa/s 30 9 35375 1782 1848 1782 2012 1782 2012 1782 1198 2146
matrix_mult cint16 cint16 8 64 4 1 1 1 512 256 1 16 287.0 7135 MSa/s 9127.0 230 MSa/s 8 3 19985 1766 2030 1214
matrix_mult cfloat float 8 64 4 1 1 1 512 256 4 16 336.0 1523 MSa/s 7469.0 73 MSa/s 25 9 45230 1982 2252 1982 2268 1982 2268 1982 1198 2488
matrix_mult cint16 cint16 1024 4 4 0 0 0 4096 16 1 100 2462.0 6654 MSa/s 412473.0 39 MSa/s 11 1 67849 1892
matrix_mult cint16 cint16 1024 4 4 1 1 1 4096 16 1 16 2462.0 6654 MSa/s 71653.0 229 MSa/s 16 3 105553 1102 1876 1230
matrix_mult cint16 cint16 16 16 16 0 0 0 256 256 1 16 593.0 6907 MSa/s 10486.0 390 MSa/s 7 1 8329 2686
matrix_mult cint16 cint16 16 16 16 1 1 1 256 256 1 100 609.0 6725 MSa/s 66519.0 61 MSa/s 8 3 19473 2246 2646 1662
matrix_mult cint16 cint16 16 16 16 1 1 1 256 256 1 16 608.0 6736 MSa/s 11289.0 371 MSa/s 8 3 19473 2246 2646 1662
matrix_mult cint16 cint16 16 256 16 0 0 1 4096 4096 1 100 8279.0 7915 MSa/s 836981.0 78 MSa/s 14 2 73997 1662 2702
matrix_mult cint16 cint16 24 4 4 1 1 1 96 16 1 16 87.0 4413 MSa/s 2607.0 162 MSa/s 11 3 9553 1876 1246 1102
matrix_mult cint16 cint16 32 32 32 0 0 0 1024 1024 1 16 4227.0 7752 MSa/s 69398.0 472 MSa/s 7 1 26761 3798
matrix_mult cint16 cint16 32 32 32 1 0 0 1024 1024 1 100 4227.0 7752 MSa/s 428974.0 76 MSa/s 10 2 37581 1406 3798
matrix_mult cint16 cint16 32 32 64 0 0 0 1024 2048 1 16 8357.0 7842 MSa/s 136502.0 480 MSa/s 7 1 43145 3798
matrix_mult cint16 cint16 32 64 32 0 0 0 2048 2048 1 16 8325.0 7872 MSa/s 135990.0 481 MSa/s 7 1 43145 3798
matrix_mult cint16 cint16 64 64 64 0 0 0 4096 4096 1 100 33069.0 7927 MSa/s 3315672.0 79 MSa/s 13 1 100489 3774
matrix_mult cint16 cint16 64 64 64 0 0 0 4096 4096 1 16 33069.0 7927 MSa/s 533928.0 490 MSa/s 13 1 100489 3774
matrix_mult cint16 cint16 8 4 4 1 1 1 32 16 1 16 46.0 2782 MSa/s 1881.0 78 MSa/s 11 3 8017 1678 1198 1102
matrix_mult cint16 cint16 8 4 512 1 0 1 32 2048 1 100 2079.0 7880 MSa/s 408624.0 40 MSa/s 13 2 86541 1846 1478
matrix_mult cint16 cint16 8 4 512 1 1 1 32 2048 1 16 2079.0 7880 MSa/s 64546.0 254 MSa/s 15 3 105553 1278 1846 1478
matrix_mult cint16 cint16 8 4 64 1 1 1 32 256 1 16 289.0 7086 MSa/s 8563.0 246 MSa/s 10 3 19537 1278 1846 1462
matrix_mult int32 int32 8 64 4 1 1 1 512 256 4 16 126.0 4063 MSa/s 4008.0 145 MSa/s 30 9 35375 1782 1848 1782 2012 1782 2012 1782 1198 2146
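
Rows in this table have a variable number of trailing PROGRAM_MEMORY figures, so they cannot be split on a fixed column count. The Python sketch below shows one way to parse a row using the column names from the header above. It assumes that the rows carry one PROGRAM_MEMORY figure per kernel counted in NUM_ME, which is what the rows in this table show; `parse_row` and `FIXED_COLUMNS` are hypothetical names, not part of the library.

```python
# Column names taken from the header row of the Matrix Multiply table.
FIXED_COLUMNS = [
    "Library Element", "T_DATA_A", "T_DATA_B", "P_DIM_A", "P_DIM_AB", "P_DIM_B",
    "P_ADD_TILING_A", "P_ADD_TILING_B", "P_ADD_DETILING_OUT",
    "P_INPUT_WINDOW_VSIZE_A", "P_INPUT_WINDOW_VSIZE_B", "P_CASC_LEN", "NITER",
    "cycleCountAvg", "throughputAvg", "initiationInterval",
    "throughputInitIntAvg", "NUM_BANKS", "NUM_ME", "DATA_MEMORY",
]


def parse_row(line):
    """Split one flattened table row into named fields (hypothetical helper)."""
    # Drop the unit tokens so that every remaining token is one cell.
    tokens = [t for t in line.split() if t != "MSa/s"]
    record = dict(zip(FIXED_COLUMNS, tokens))
    # Everything after DATA_MEMORY is PROGRAM_MEMORY, one figure per kernel.
    record["PROGRAM_MEMORY"] = [int(t) for t in tokens[len(FIXED_COLUMNS):]]
    # Assumption: the number of figures equals NUM_ME.
    if len(record["PROGRAM_MEMORY"]) != int(record["NUM_ME"]):
        raise ValueError("expected one PROGRAM_MEMORY figure per kernel")
    return record


row = ("matrix_mult cint16 cint16 8 8 8 1 1 1 64 64 1 100 106.0 4830 MSa/s "
       "18575.0 27 MSa/s 9 3 10257 1854 2334 1358")
record = parse_row(row)
print(record["NUM_ME"], record["PROGRAM_MEMORY"], sum(record["PROGRAM_MEMORY"]))
```

For this row the three figures cover the matrix multiply kernel and the shuffling/tiling widget kernels together, in line with the note above.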

Widgets

The following table gives results for the widgets across a wide range of supported parameters, which are defined in: L2 Widgets Configuration Parameters.

widget_benchmark.csv

Widgets benchmark
Library Element DATA_TYPE IN_API OUT_API NUM_INPUTS WINDOW_VSIZE NUM_OUTPUT_CLONES PATTERN cycleCountAvg throughputAvg initiationInterval throughputInitIntAvg NUM_BANKS NUM_ME DATA_MEMORY PROGRAM_MEMORY DATA_OUT_TYPE
widget_api_cast cfloat 0 0 1 256 1 0 91 2813 MSa/s 517 495 MSa/s 5 1 8980 1372 -
widget_api_cast int32 1 0 2 256 2 0 274 934 MSa/s 309 828 MSa/s 5 1 4884 1358 -
widget_api_cast int32 0 1 1 256 2 0 275 930 MSa/s 300 853 MSa/s 3 1 2836 1176 -
widget_api_cast cint32 1 0 1 256 4 0 535 478 MSa/s 592 432 MSa/s 9 1 17172 1792 -
widget_api_cast cint32 0 1 1 256 2 0 531 482 MSa/s 559 457 MSa/s 3 1 4884 1176 -
widget_api_cast cint16 1 0 2 256 1 0 273 937 MSa/s 296 864 MSa/s 3 1 2836 1170 -
widget_api_cast cint16 1 0 1 256 3 0 280 914 MSa/s 327 782 MSa/s 7 1 6932 1566 -
widget_api_cast cint16 1 0 1 256 2 0 275 930 MSa/s 311 823 MSa/s 5 1 4884 1374 -
widget_api_cast cint32 1 0 2 256 4 0 535 478 MSa/s 593 431 MSa/s 9 1 17172 1824 -
widget_api_cast cint16 0 1 1 256 1 0 274 934 MSa/s 299 856 MSa/s 3 1 2836 1160 -
widget_api_cast cint16 0 0 1 256 3 0 127 2015 MSa/s 264 969 MSa/s 9 1 8980 1792 -
widget_api_cast cint16 0 0 1 256 2 0 93 2752 MSa/s 264 969 MSa/s 7 1 6932 1612 -
widget_api_cast cint16 0 0 1 256 1 0 59 4338 MSa/s 263 973 MSa/s 5 1 4884 1372 -
widget_api_cast cfloat 1 0 2 256 3 0 531 482 MSa/s 577 443 MSa/s 7 1 13076 1606 -
widget_api_cast cfloat 0 1 1 256 2 0 531 482 MSa/s 559 457 MSa/s 3 1 4884 1176 -
widget_api_cast cint16 1 0 1 256 1 0 272 941 MSa/s 295 867 MSa/s 3 1 2836 1154 -
widget_real2complex int16 - - - 256 - - 87 2942 MSa/s 262 977 MSa/s 5 1 3860 1444 cint16
widget_real2complex int16 - - - 1024 - - 279 3670 MSa/s 1022 1001 MSa/s 5 1 13076 1444 cint16
widget_real2complex float - - - 256 - - 404 633 MSa/s 520 492 MSa/s 5 1 6932 1422 cfloat
widget_real2complex float - - - 1024 - - 1556 658 MSa/s 2056 498 MSa/s 5 1 25364 1422 cfloat
widget_real2complex cint32 - - - 256 - - 86 2976 MSa/s 517 495 MSa/s 5 1 6932 1412 int32
widget_real2complex cint16 - - - 1024 - - 150 6826 MSa/s 1029 995 MSa/s 5 1 13076 1412 int16
widget_real2complex cint16 - - - 256 - - 54 4740 MSa/s 260 984 MSa/s 5 1 3860 1428 int16
widget_real2complex cfloat - - - 256 - - 86 2976 MSa/s 517 495 MSa/s 5 1 6932 1412 float
widget_real2complex cfloat - - - 1024 - - 278 3683 MSa/s 2055 498 MSa/s 5 1 25364 1412 float
widget_real2complex int32 - - - 1024 - - 1556 658 MSa/s 2056 498 MSa/s 5 1 25364 1422 cint32
widget_real2complex cint32 - - - 1024 - - 278 3683 MSa/s 2055 498 MSa/s 5 1 25364 1412 int32
widget_real2complex int32 - - - 256 - - 404 633 MSa/s 520 492 MSa/s 5 1 6932 1422 cint32

DDS/Mixer

The following table gives results for the DDS/Mixer across a wide variety of supported parameters, which are defined in: L2 DDS/Mixer Configuration Parameters.

dds_mixer_benchmark.csv

DDS/Mixer benchmark
Library Element DATA_TYPE INPUT_WINDOW_VSIZE MIXER_MODE P_API cycleCountAvg throughputAvg initiationInterval throughputInitIntAvg NUM_BANKS NUM_ME DATA_MEMORY PROGRAM_MEMORY
dds_mixer cint16 256 1 1 295 867 MSa/s 313 817 MSa/s 1 1 2265 1478
dds_mixer cint16 320 2 0 206 1553 MSa/s 337 949 MSa/s 7 1 9933 2120
dds_mixer cint16 4096 0 0 1056 3878 MSa/s 3730 1098 MSa/s 5 1 35015 1490
dds_mixer cint16 4096 0 1 4120 994 MSa/s 4138 989 MSa/s 1 1 2255 1334
dds_mixer cint16 4096 2 0 2094 1956 MSa/s 4231 968 MSa/s 13 1 100557 2136
dds_mixer cint16 4096 2 1 8227 497 MSa/s 8245 496 MSa/s 1 1 2269 1460
dds_mixer cint16 256 0 0 96 2666 MSa/s 250 1024 MSa/s 3 1 4295 1490
dds_mixer cint16 256 1 0 107 2392 MSa/s 266 962 MSa/s 5 1 6345 1940
dds_mixer cint16 8 1 1 42 190 MSa/s 60 133 MSa/s 1 1 2265 1262
dds_mixer cint16 8 2 0 41 195 MSa/s 93 86 MSa/s 7 1 2445 1810
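
As described at the top of this section, throughputAvg is derived from cycleCountAvg and the input window size, and throughputInitIntAvg is derived from initiationInterval in the same way. The sketch below shows one way to reproduce those two columns for the DDS/Mixer rows. It rests on two assumptions that are not stated in this section: a 1 GHz AIE clock and truncation to whole MSa/s. Both are inferred only from the fact that they match the tabulated values. The function name `throughput_msa_per_s` and its `clock_mhz` parameter are hypothetical and are not part of the library.

```python
def throughput_msa_per_s(window_vsize, cycles, clock_mhz=1000.0):
    """Return input throughput in MSa/s, truncated to a whole number.

    window_vsize : samples consumed per kernel iteration (INPUT_WINDOW_VSIZE)
    cycles       : cycleCountAvg or initiationInterval, in clock cycles
    clock_mhz    : assumed AIE clock in MHz (1000.0 is an inferred value)
    """
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    # samples per cycle, scaled by cycles per microsecond, gives MSa/s
    return int(window_vsize * clock_mhz / cycles)


# (INPUT_WINDOW_VSIZE, cycleCountAvg, initiationInterval) taken from the table above
rows = [
    (256, 295, 313),
    (4096, 1056, 3730),
    (256, 96, 250),
    (8, 42, 60),
]

for vsize, cycle_count, init_interval in rows:
    print(
        vsize,
        throughput_msa_per_s(vsize, cycle_count),    # compare with throughputAvg
        throughput_msa_per_s(vsize, init_interval),  # compare with throughputInitIntAvg
    )
```

The same relation matches the Widgets rows when WINDOW_VSIZE is used as the window size. It does not carry over directly to the matrix_mult rows, where the tabulated throughput is evidently based on a different sample count than the input window sizes.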