template class xf::dsp::aie::fft::dit_1ch::fft_ifft_dit_1ch_graph
#include "fft_ifft_dit_1ch_graph.hpp"
Overview
fft_dit_1ch is a single-channel, decimation-in-time, fixed point size FFT.
This class definition is used only with stream interfaces (TP_API == 1). The stream interface FFT graph is offered with a dual input stream configuration, which interleaves data samples between the streams. The stream interface FFT implementation supports parallel computation (TP_PARALLEL_POWER > 0). Dynamic point size, with a header embedded in the data stream, is not supported when TP_API == 1.
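The interleaving rule can be illustrated in plain C++, outside of any AI Engine tooling. The parameter description for TP_PARALLEL_POWER states that sample[p] must be written to stream[p modulus q], where q is the number of streams. The sketch below applies that rule to one frame held in memory. The helper name `distribute_to_streams` is hypothetical and not part of the library, and the count of q = 2 * 2^TP_PARALLEL_POWER streams is taken from the example of 4 subframe processors using 8 streams.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a library function): split one frame of samples
// across q streams so that sample[p] lands on stream[p % q].
// Assumption: q = 2 * 2^parallel_power, i.e. two streams per subframe processor.
template <typename T>
std::vector<std::vector<T>> distribute_to_streams(const std::vector<T>& frame,
                                                  unsigned parallel_power) {
    const std::size_t q = std::size_t{2} << parallel_power;  // 2 * 2^N streams
    // Assumed here: the frame length is a multiple of the stream count.
    assert(frame.size() % q == 0);

    std::vector<std::vector<T>> streams(q);
    for (std::size_t p = 0; p < frame.size(); ++p) {
        streams[p % q].push_back(frame[p]);
    }
    return streams;
}
```

With `parallel_power` set to 0 this reduces to the dual-stream case: even-indexed samples go to stream 0 and odd-indexed samples go to stream 1.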
These are the template parameters used to configure the single-channel decimation-in-time class.
Parameters:
TT_DATA | describes the type of individual data samples input to and output from the transform function. This is a typename and must be one of the following: int16, cint16, int32, cint32, float, cfloat. |
TT_TWIDDLE | describes the type of twiddle factors of the transform. It must be one of the following: cint16, cint32, cfloat and must also satisfy the following rules:
|
TP_POINT_SIZE | is an unsigned integer which describes the number of samples in the transform. This must be 2^N where N is an integer in the range 4 to 16 inclusive. When TP_DYN_PT_SIZE is set, TP_POINT_SIZE describes the maximum point size possible. |
TP_FFT_NIFFT | selects whether the transform to perform is an FFT (1) or IFFT (0). |
TP_SHIFT | selects the power of 2 to scale the result by prior to output. |
TP_CASC_LEN | selects the number of kernels the FFT will be divided over in series to improve throughput. |
TP_DYN_PT_SIZE | selects whether (1) or not (0) to use run-time point size determination. When set, each window of data must be preceded, in the window, by a 256-bit header. The output frame will also be preceded by a 256-bit vector which is a copy of the input vector, except for the top byte, which is 0 to indicate a legal frame or 1 to indicate an illegal frame. The least significant byte of the input header field describes forward (non-zero) or inverse (0) direction. The second least significant byte of this field describes the radix-2 power of the following frame, e.g. for a 512 point size, this field would hold 9, as 2^9 = 512. Any value below 4 or greater than log2(TP_POINT_SIZE) is considered illegal. When this occurs, the top byte of the output header will be set to 1 and the output samples will be set to 0 for a frame of TP_POINT_SIZE. |
TP_WINDOW_VSIZE | is an unsigned integer which describes the number of samples to be processed in each call to the function. By default, TP_WINDOW_VSIZE is set to match TP_POINT_SIZE. TP_WINDOW_VSIZE may be set to an integer multiple of TP_POINT_SIZE, in which case multiple FFT iterations will be performed on a given input window, producing multiple frames of output samples and reducing the number of times the kernel needs to be triggered to process a given number of input data samples. As a result, the overheads incurred during kernel triggering are reduced and overall performance is increased. |
TP_API | is an unsigned integer to select window (0) or stream (1) interfaces. When stream I/O is selected, one sample is taken from, or output to, a stream and the next sample from, or to, the next stream. A minimum of two streams is used. In this example, even samples are read from input stream[0] and odd samples from input stream[1]. |
TP_PARALLEL_POWER | is an unsigned integer to describe N where 2^N is the number of subframe processors to use, so as to achieve higher throughput. The default is 0. With TP_PARALLEL_POWER set to 2, 4 subframe processors will be used, each of which takes 2 streams in, for a total of 8 streams input and output. Sample[p] must be written to stream[p modulo q] where q is the number of streams. |
TP_INDEX |
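The header rules given for TP_DYN_PT_SIZE can be sketched as a small host-side model. This is an illustration of the legality check and the status byte only, not the library's kernel code. The names `Header`, `HeaderInfo`, `log2_pow2`, `decode_header` and `make_output_header` are hypothetical, and the byte order (index 0 as the least significant byte, index 31 as the top byte) is an assumption, since this section does not specify how the 256 bits map onto TT_DATA samples.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// 256-bit header modelled as 32 bytes.
// ASSUMPTION: index 0 is the least significant byte, index 31 is the top byte.
using Header = std::array<std::uint8_t, 32>;

// log2 of a power-of-two value, e.g. 512 -> 9.
constexpr unsigned log2_pow2(unsigned n) {
    unsigned k = 0;
    while (n > 1) {
        n >>= 1;
        ++k;
    }
    return k;
}

struct HeaderInfo {
    bool forward;        // byte 0: non-zero = forward (FFT), 0 = inverse (IFFT)
    unsigned log2_size;  // byte 1: radix-2 power of the following frame
    bool legal;          // true when 4 <= log2_size <= log2(max_point_size)
};

// Decode the two documented fields and apply the documented legality rule.
// max_point_size plays the role of TP_POINT_SIZE (the maximum point size).
inline HeaderInfo decode_header(const Header& in, unsigned max_point_size) {
    // Precondition: max_point_size is a power of two and at least 16.
    assert(max_point_size >= 16 && (max_point_size & (max_point_size - 1)) == 0);
    HeaderInfo info{};
    info.forward = in[0] != 0;
    info.log2_size = in[1];
    info.legal = info.log2_size >= 4 && info.log2_size <= log2_pow2(max_point_size);
    return info;
}

// Output header: a copy of the input header except for the top byte,
// which is 0 for a legal frame and 1 for an illegal frame.
inline Header make_output_header(const Header& in, unsigned max_point_size) {
    Header out = in;
    out[31] = decode_header(in, max_point_size).legal ? 0 : 1;
    return out;
}
```

For example, with a maximum point size of 1024, a header whose second byte holds 9 describes a legal 512-point frame, while a value of 11 or 3 would be flagged as illegal.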
template <
    typename TT_DATA,
    typename TT_TWIDDLE,
    unsigned int TP_POINT_SIZE,
    unsigned int TP_FFT_NIFFT = 1,
    unsigned int TP_SHIFT = 0,
    unsigned int TP_CASC_LEN = 1,
    unsigned int TP_DYN_PT_SIZE = 0,
    unsigned int TP_WINDOW_VSIZE = TP_POINT_SIZE,
    unsigned int TP_API = 0,
    unsigned int TP_PARALLEL_POWER = 0,
    unsigned int TP_INDEX = 0
    >
class fft_ifft_dit_1ch_graph: public graph

// fields

static constexpr int kParallel_factor
static constexpr int kWindowSize
static constexpr int kNextParallelPower
static constexpr int kR2Shift
static constexpr int kFFTsubShift
port_array <input, 2*kParallel_factor> in
port_array <output, 2*kParallel_factor> out
parameter r2comb_tw_lut
kernel m_combInKernel[kParallel_factor]
kernel m_r2Comb[kParallel_factor]
kernel m_combOutKernel[kParallel_factor]
fft_ifft_dit_1ch_graph <TT_DATA, TT_TWIDDLE, (TP_POINT_SIZE>> 1), TP_FFT_NIFFT, kFFTsubShift, TP_CASC_LEN, TP_DYN_PT_SIZE, (TP_WINDOW_VSIZE>> 1), kStreamAPI, kNextParallelPower, TP_INDEX> FFTsubframe0
fft_ifft_dit_1ch_graph <TT_DATA, TT_TWIDDLE, (TP_POINT_SIZE>> 1), TP_FFT_NIFFT, kFFTsubShift, TP_CASC_LEN, TP_DYN_PT_SIZE, (TP_WINDOW_VSIZE>> 1), kStreamAPI, kNextParallelPower, TP_INDEX+kParallel_factor/2> FFTsubframe1
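The template arguments of the two FFTsubframe members show how the graph recurses: each level halves the point size, passes kNextParallelPower down, and offsets TP_INDEX for the second subframe. The stand-in below mirrors only that template recursion in plain C++ so the resulting leaf configuration can be inspected. The names `decomposition`, `Leaf`, `collect_leaves` and `kNumStreams` are hypothetical. The definitions kParallel_factor = 2^TP_PARALLEL_POWER and kNextParallelPower = TP_PARALLEL_POWER - 1 are assumptions inferred from the port array sizes and from the description of a decrementing TP_PARALLEL_POWER, not copied from the library source.

```cpp
#include <vector>

// One leaf subframe processor: its point size and its TP_INDEX value.
struct Leaf {
    unsigned point_size;
    unsigned index;
};

// Hypothetical stand-in that mirrors only the template recursion of the graph.
template <unsigned POINT_SIZE, unsigned PARALLEL_POWER, unsigned INDEX = 0>
struct decomposition {
    // ASSUMPTION: inferred definitions, see the lead-in paragraph.
    static constexpr unsigned kParallel_factor = 1u << PARALLEL_POWER;
    static constexpr unsigned kNextParallelPower = PARALLEL_POWER - 1;
    static constexpr unsigned kNumStreams = 2 * kParallel_factor;

    // Same argument pattern as FFTsubframe0 and FFTsubframe1.
    using subframe0 = decomposition<(POINT_SIZE >> 1), kNextParallelPower, INDEX>;
    using subframe1 = decomposition<(POINT_SIZE >> 1), kNextParallelPower,
                                    INDEX + kParallel_factor / 2>;

    static void collect_leaves(std::vector<Leaf>& out) {
        subframe0::collect_leaves(out);
        subframe1::collect_leaves(out);
    }
};

// Recursion ends when no further parallel split is requested.
template <unsigned POINT_SIZE, unsigned INDEX>
struct decomposition<POINT_SIZE, 0, INDEX> {
    static constexpr unsigned kParallel_factor = 1;
    static constexpr unsigned kNumStreams = 2;

    static void collect_leaves(std::vector<Leaf>& out) {
        out.push_back(Leaf{POINT_SIZE, INDEX});
    }
};
```

Under these assumptions, a 4096-point transform with a parallel power of 2 resolves to four 1024-point leaves with indices 0 to 3, and the top level exposes 8 streams, which matches the figures quoted for TP_PARALLEL_POWER.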
Fields
port_array <input, 2*kParallel_factor> in
The input data to the function. I/O is two parallel streams each TT_DATA type.
port_array <output, 2*kParallel_factor> out
The output data from the function. I/O is two parallel streams each TT_DATA type.
kernel m_combInKernel [kParallel_factor]
FFT recursive decomposition: widget kernel. Widgets are used to reorder data as per FFT algorithm requirements.
kernel m_r2Comb [kParallel_factor]
FFT recursive decomposition: r2combiner kernel. R2combiner kernels connect 2 subframe processors with the next stage's r2combiner kernels.
kernel m_combOutKernel [kParallel_factor]
FFT recursive decomposition: widget kernel. Widgets are used to reorder data as per FFT algorithm requirements.
fft_ifft_dit_1ch_graph <TT_DATA, TT_TWIDDLE, (TP_POINT_SIZE>> 1), TP_FFT_NIFFT, kFFTsubShift, TP_CASC_LEN, TP_DYN_PT_SIZE, (TP_WINDOW_VSIZE>> 1), kStreamAPI, kNextParallelPower, TP_INDEX> FFTsubframe0
FFT recursive decomposition: subframe 0. The FFT is split into 2 subframe processors, with a stage of r2combiners connecting the subframe processors. This is a recursive call with a decremented TP_PARALLEL_POWER template parameter.
fft_ifft_dit_1ch_graph <TT_DATA, TT_TWIDDLE, (TP_POINT_SIZE>> 1), TP_FFT_NIFFT, kFFTsubShift, TP_CASC_LEN, TP_DYN_PT_SIZE, (TP_WINDOW_VSIZE>> 1), kStreamAPI, kNextParallelPower, TP_INDEX+kParallel_factor/2> FFTsubframe1
FFT recursive decomposition: subframe 1. The FFT is split into 2 subframe processors, with a stage of r2combiners connecting the subframe processors. This is a recursive call with a decremented TP_PARALLEL_POWER template parameter.
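The idea of two half-size subframe processors joined by a radix-2 combiner stage can be checked numerically with the textbook decimation-in-time identity. The sketch below is a floating-point model under stated assumptions, not the library's kernel code: it ignores the widget reordering, fixed-point scaling (kR2Shift, kFFTsubShift) and stream handling, and it assumes subframe 0 receives the even-indexed samples and subframe 1 the odd-indexed samples. All function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Reference O(N^2) forward DFT. It stands in for a subframe processor
// and also serves as the ground truth for comparison.
std::vector<cplx> naive_dft(const std::vector<cplx>& x) {
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    std::vector<cplx> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = -2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            out[k] += x[t] * std::polar(1.0, angle);
        }
    }
    return out;
}

// Radix-2 combine: X[k] = E[k] + W^k * O[k], X[k + N/2] = E[k] - W^k * O[k],
// with W = exp(-2*pi*i / N) and N twice the length of each input.
std::vector<cplx> r2_combine(const std::vector<cplx>& even_fft,
                             const std::vector<cplx>& odd_fft) {
    assert(even_fft.size() == odd_fft.size());
    const std::size_t half = even_fft.size();
    const double pi = std::acos(-1.0);
    std::vector<cplx> out(2 * half);
    for (std::size_t k = 0; k < half; ++k) {
        const double angle = -pi * static_cast<double>(k) / static_cast<double>(half);
        const cplx t = std::polar(1.0, angle) * odd_fft[k];
        out[k] = even_fft[k] + t;
        out[k + half] = even_fft[k] - t;
    }
    return out;
}

// Split into even and odd samples, transform each half, then combine.
std::vector<cplx> fft_via_two_subframes(const std::vector<cplx>& x) {
    std::vector<cplx> even, odd;
    for (std::size_t p = 0; p < x.size(); ++p) {
        (p % 2 == 0 ? even : odd).push_back(x[p]);
    }
    return r2_combine(naive_dft(even), naive_dft(odd));
}

// Largest element-wise magnitude difference between two spectra.
double max_abs_diff(const std::vector<cplx>& a, const std::vector<cplx>& b) {
    assert(a.size() == b.size());
    double worst = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = std::abs(a[i] - b[i]);
        if (d > worst) worst = d;
    }
    return worst;
}
```

Comparing `fft_via_two_subframes(x)` against `naive_dft(x)` with `max_abs_diff` for an even-length input should give a difference at the level of floating-point rounding, which is the property the recursive graph structure relies on.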