Kria™ KV260 Vision AI Starter Kit Smart Camera Tutorial

Design Overview

Introduction

The Smart Camera application design built on the KV260 Vision AI Starter Kit provides a framework for building and customizing video platforms that consist of four pipeline stages:

  • Capture pipeline

  • Video processing pipeline

  • Acceleration pipeline

  • Output pipeline

The design consists of a platform and integrated accelerator functions. The platform comprises the capture pipeline, the output pipeline, and some video processing functions. This approach keeps the platform lean and leaves the maximum amount of programmable logic (PL) available for accelerator development. The platform supports capture from a single MIPI sensor device, a USB webcam, or a file source. The output can be stored as files, streamed over Ethernet using the Real-time Transport Protocol (RTP), or displayed on a DisplayPort/HDMI monitor. Along with video, the platform also supports audio capture and playback.

Some video processing functions run on hard blocks such as the video codec unit (VCU) because doing so gives the best performance. Video decoding/decompression and encoding/compression are done using the VCU.

The following example acceleration functions can be run on this platform using the programmable deep learning processor unit (DPU):

  • Face Detection - Network model: Densebox_640_360

  • Cars, Bicycles, and Person Detection for ADAS - Network model: ssd_adas_pruned_0_95

  • Pedestrian Detection - Network model: refinedet_pruned_0_96
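
The mapping between the example functions and their network models can be captured in a small lookup table. The following is a minimal sketch only: the task keys (`face_detection`, `adas_detection`, `pedestrian_detection`) and the helper `model_for` are hypothetical names chosen for illustration, not identifiers from the smartcam application; only the model names come from the list above.

```python
# Hypothetical task keys mapped to the network model names listed above.
ACCELERATION_MODELS = {
    "face_detection": "Densebox_640_360",
    "adas_detection": "ssd_adas_pruned_0_95",
    "pedestrian_detection": "refinedet_pruned_0_96",
}


def model_for(task):
    """Return the network model name for a task key, or raise ValueError."""
    try:
        return ACCELERATION_MODELS[task]
    except KeyError:
        known = ", ".join(sorted(ACCELERATION_MODELS))
        raise ValueError(f"unknown task {task!r}; expected one of: {known}") from None


if __name__ == "__main__":
    for task in ACCELERATION_MODELS:
        print(f"{task}: {model_for(task)}")
```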

An example use case for this design is as an endpoint security camera.

The following figure shows the various pipelines supported by the design.

Pipelines Supported

The application processing unit (APU) in the processing system (PS) consists of four Arm® Cortex®-A53 cores and is configured to run in a symmetric multi-processing (SMP) Linux mode in the design. The application running on Linux is responsible for configuring and controlling the audio/video pipelines and accelerators using Jupyter notebooks or the smartcam application.

The APU application controls the following video data paths implemented in a combination of the PS and PL:

  • A capture pipeline that captures video frames into double data rate (DDR) memory from:

    • A file on a storage device such as an SD card

    • A USB webcam using the USB interface inside the PS

    • An image sensor connected via MIPI CSI-2 RX through the PL

  • An I2S RX subsystem that captures audio along with video via the Digilent PMOD I2S2.

  • A memory-to-memory (M2M) pipeline implementing a neural net inference application. In this design, the neural net is implemented in the DPU: preprocessed video frames are read from DDR memory, processed by the DPU, and then written back to memory.

  • An output pipeline that reads video frames from memory and sends the frames to a sink.

    • In this case, the sink is either a display or a VCU-encoded stream sent over Ethernet.

    • When the sink is a monitor, the DP controller subsystem in the PS is coupled to the STDP4320 de-multiplexer on the carrier card. The STDP4320 has dual-mode output ports configured as DP and HDMI.

  • An I2S TX subsystem that forwards audio data to a speaker via the Digilent PMOD I2S2, along with the video.

The following figure shows an example end-to-end pipeline with a single image sensor as the video source and the pre-process and DPU IPs performing the application's NN inference. The inferred frames are either VCU-encoded and streamed via RTP, a network protocol for delivering audio and video over IP networks, or displayed via a DP splitter on a DP and an HDMI port, which act as the video sink. The figure also shows the image processing blocks used in the capture path. The video format shown at each block is that block's output format. Details are described in the Hardware Architecture document.

End to end example pipelines
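
The source, inference, and sink choices described above can be sketched as a GStreamer pipeline description assembled from parts. This is an illustration under stated assumptions, not the pipeline the smartcam application actually builds: the element names used (`filesrc`, `v4l2src`, `videoconvert`, `h264parse`, `omxh264dec`, `omxh264enc`, `rtph264pay`, `udpsink`, `filesink`, `kmssink`, `identity`) are standard GStreamer or OpenMAX elements, the `identity` element merely stands in for the pre-process and DPU inference stage, and the function name, default device path, host, port, and output file name are hypothetical. It also assumes the MIPI capture path is exposed as a V4L2 device node.

```python
# Source fragments, keyed by the three capture options described in the text.
SOURCES = {
    "file": "filesrc location={location} ! h264parse ! omxh264dec",
    "usb": "v4l2src device={location} ! videoconvert",
    "mipi": "v4l2src device={location}",  # assumption: sensor exposed via V4L2
}

# Sink fragments, keyed by the three output options described in the text.
SINKS = {
    "display": "kmssink",
    "rtp": "omxh264enc ! h264parse ! rtph264pay ! udpsink host={host} port={port}",
    "file": "omxh264enc ! h264parse ! filesink location={out_file}",
}


def build_pipeline(source, sink, location="/dev/video0",
                   host="192.168.1.10", port=5000, out_file="out.h264"):
    """Return a gst-launch style description for one source/sink combination."""
    if source not in SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    if sink not in SINKS:
        raise ValueError(f"unknown sink: {sink!r}")
    stages = [
        SOURCES[source].format(location=location),
        "video/x-raw,format=NV12",              # pixel format used by the design
        "identity name=inference_placeholder",  # stand-in for pre-process + DPU
        SINKS[sink].format(host=host, port=port, out_file=out_file),
    ]
    return " ! ".join(stages)


if __name__ == "__main__":
    print(build_pipeline("mipi", "rtp"))
    print(build_pipeline("usb", "display"))
```

The function only builds a string; running it on a target would additionally require the listed plugins to be installed and the device paths to match the actual hardware.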

Design Components

Hardware components
Interfaces and IP
  • Video inputs

    • File

    • USB webcam

    • MIPI CSI-2 Rx

  • Video outputs

    • DisplayPort/HDMI

    • File

    • Ethernet - Jupyter notebook/RTSP

  • Audio inputs

    • I2S receiver

  • Audio outputs

    • I2S transmitter

  • Video processing

    • VCU decoding and encoding

    • Accelerator functions on the DPU

    • PL- and PS-based pre- and post-processing specific to an accelerator function

  • Auxiliary Peripherals

    • QSPI

    • SD

    • I2C

    • Universal asynchronous receiver-transmitter (UART)

    • Ethernet

    • General purpose I/O (GPIO)

Software components
  • Operating system

    • APU: SMP Linux

  • Linux kernel subsystems

    • Video source: Video4Linux (V4L2)

    • Display: Direct Rendering Manager (DRM)/Kernel Mode Setting (KMS)

  • Linux user space frameworks

    • Jupyter

    • GStreamer/VVAS

    • AMD Vitis™ AI

    • Xilinx run-time (XRT)

Resolution and Format Supported
  • Resolutions

    • 1080p30

    • 2160p30

    • Lower resolution and lower frame rates for USB and file I/O

  • Pixel format

    • YUV 4:2:0 (NV12)
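
To give a sense of the memory footprint these formats imply, the size of one NV12 frame can be worked out from the resolution: NV12 stores a full-resolution 8-bit luma plane followed by an interleaved chroma plane at half the size, for 1.5 bytes per pixel. The sketch below assumes tightly packed buffers with no stride or alignment padding, which real capture and codec buffers often add; the function names are illustrative.

```python
def nv12_frame_bytes(width, height):
    """Bytes in one tightly packed NV12 frame (Y plane + interleaved UV plane)."""
    luma = width * height    # 1 byte per pixel
    chroma = luma // 2       # U and V subsampled 2x2, interleaved
    return luma + chroma


def stream_rate_mb_per_s(width, height, fps):
    """Uncompressed data rate in megabytes (10**6 bytes) per second."""
    return nv12_frame_bytes(width, height) * fps / 1e6


if __name__ == "__main__":
    for name, (w, h) in {"1080p30": (1920, 1080), "2160p30": (3840, 2160)}.items():
        size = nv12_frame_bytes(w, h)
        rate = stream_rate_mb_per_s(w, h, 30)
        print(f"{name}: {size} bytes/frame, {rate:.3f} MB/s")
```

Under these assumptions, a 1080p frame occupies 3,110,400 bytes and a 2160p frame 12,441,600 bytes, so each pass of a 2160p30 stream through DDR memory moves roughly 373 MB/s before any padding is counted.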

 

References

  • Kria KV260 Vision AI Starter Kit User Guide (UG1089)

  • Kria SOM Carrier Card Design Guide (UG1091)

  • Kria KV260 Vision AI Starter Kit Data Sheet (DS986)

  • Kria K26 SOM Data Sheet (DS987)

Copyright © 2021-2024 Advanced Micro Devices, Inc
