|
VLSI IDEA INNOVATORS is an R&D organization engaged in research and development in the electronics field for image processing. We are currently performing research and development on:
(i) DCTQ Processor and IQIDCT Processor and Controller
(ii) FFT & IFFT Processor and Controller
DCTQ Processor And IQIDCT Processor and Controller
Introduction:
Computationally intensive applications such as the discrete cosine transform (DCT),
modulation/demodulation, etc., often need to run their algorithms at real-time rates. For
instance, in video codecs conforming to the MPEG-2 standard, the computationally intensive DCT
algorithm needs to be computed at a rate of one coefficient per clock cycle, with the clock running
at 100 MHz or more, to meet the real-time processing rate of 30 frames per second for a color
picture of 1024 × 768 pixels or larger.
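As a rough check of these figures, the required coefficient rate can be worked out directly. The sketch below assumes three full-resolution color components per pixel (no chroma subsampling), which the text does not state; with subsampled chroma the requirement would be lower. The function name is illustrative only.

```python
# Rough throughput check for real-time DCT processing.
# Assumption (not stated in the text): 3 full-resolution color
# components per pixel, i.e. no chroma subsampling.

def required_coeff_rate(width, height, fps, components=3):
    """Return the number of DCT coefficients that must be produced per second."""
    return width * height * components * fps

rate = required_coeff_rate(1024, 768, 30)
clock_hz = 100_000_000  # one coefficient per clock cycle at 100 MHz

print(f"required:  {rate / 1e6:.1f} M coefficients/s")
print(f"available: {clock_hz / 1e6:.1f} M coefficients/s")
print("meets real-time rate:", clock_hz >= rate)
```

Under this assumption, a 100 MHz clock at one coefficient per cycle leaves some headroom over the required rate.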
2D-Discrete Cosine Transform and Quantization
The discrete cosine transform closely approximates the Karhunen-Loève Transform (KLT),
which is known to be optimal in the sense of de-correlating the data and maximizing the energy
packed into the lowest-order coefficients. However, unlike the KLT, the DCT involves much less
computational complexity in implementation and is therefore preferred in image and video
compression work. A comprehensive treatment of DCT algorithms and applications can be found in
the literature. The DCT, which exploits spatial redundancy to prepare the ground for effective
compression, has played a key role in image and video compression standards such as JPEG, MPEG-1,
MPEG-2, and H.26X.
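The energy-packing property can be illustrated with a small software model. The following is a minimal sketch in Python, not the hardware algorithm described here: it computes an orthonormal 8 × 8 2D DCT by row and column passes and applies a uniform quantizer. The single step size of 16 and the sample block are assumptions chosen for illustration; JPEG and MPEG use per-coefficient quantization tables.

```python
import math

N = 8  # block size used by JPEG and MPEG

def dct_1d(vec):
    """Orthonormal 1D DCT-II of a length-N sequence."""
    out = []
    for u in range(N):
        scale = math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)
        acc = sum(vec[x] * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                  for x in range(N))
        out.append(scale * acc)
    return out

def dct_2d(block):
    """2D DCT computed as row transforms followed by column transforms."""
    rows = [dct_1d(r) for r in block]                  # rows[y][u]
    cols = [dct_1d([rows[y][u] for y in range(N)])     # cols[u][v]
            for u in range(N)]
    return [[cols[u][v] for u in range(N)] for v in range(N)]

def quantize(coeffs, step=16):
    """Uniform quantization with one step size (hypothetical value)."""
    return [[int(round(c / step)) for c in row] for row in coeffs]

# Illustrative smooth block: a gentle brightness ramp.
block = [[100 + 2 * x + 3 * y for x in range(N)] for y in range(N)]
coeffs = dct_2d(block)
total = sum(c * c for row in coeffs for c in row)
dc_share = coeffs[0][0] ** 2 / total

print(f"share of energy in the DC coefficient: {dc_share:.3f}")
print("quantized block:")
for row in quantize(coeffs):
    print(row)
```

For a smooth block like this one, nearly all of the energy lands in the DC coefficient and most quantized values become zero, which is what makes the subsequent compression effective.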
A linear, highly pipelined, parallel algorithm and architecture have been proposed and
implemented for 2D-DCT and quantization on FPGAs. This architecture eliminates or minimizes
the limitations cited in the earlier references. The scheme is further improved by incorporating
dual-redundant input image memory, 45 stages of pipelining, and an optimized controller design,
yielding a throughput of one coefficient per clock cycle at 100 MHz. The use of dual input memory
eliminates the input loading time of the host processor. The following section describes this DCTQ
algorithm for fast implementation on an FPGA or an ASIC.
The DCTQ output is 9 bits wide and in two's complement form, as required by the
JPEG and MPEG standards.
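To make the output format concrete, the sketch below encodes and decodes signed values as 9-bit two's complement words, which cover the range -256 to 255. It is a software illustration only; the helper names are hypothetical and do not come from the design.

```python
WIDTH = 9  # DCTQ output width stated in the text

def to_twos_complement(value, width=WIDTH):
    """Encode a signed integer as an unsigned width-bit two's complement word."""
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit in {width} bits")
    return value & ((1 << width) - 1)

def from_twos_complement(word, width=WIDTH):
    """Decode a width-bit two's complement word back to a signed integer."""
    sign_bit = 1 << (width - 1)
    return (word ^ sign_bit) - sign_bit

for v in (255, 1, 0, -1, -256):
    word = to_twos_complement(v)
    print(f"{v:5d} -> {word:09b} -> {from_twos_complement(word)}")
```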
FFT & IFFT Processor and Controller
Introduction:
The FFT and IFFT are among the most useful blocks in DSP systems. Since the speed
at which the result is calculated is an important factor for this basic block, in this
project we develop a new method for FFT/IFFT implementation that leads to better
computation speed than other common implementation methods.
The high-performance Fast Fourier Transform (FFT) is widely used in application
areas such as communications, radar, and imaging. One of the major concerns
for researchers is the enhancement of processing speed. However, because portable
systems work with limited power supplies, low-power techniques are also of great
interest in the implementation of this block. The FFT plays an important role in
Orthogonal Frequency Division Multiplexing (OFDM) communication systems.
OFDM is a multi-carrier modulation technique that provides high bandwidth
efficiency and is employed in Software Defined Radio (SDR) for that reason. With
the increasing computing power of modern microprocessors, it becomes feasible to
process radio signals completely in software, reducing the complexity of the
hardware. FFT and IFFT blocks are used in OFDM links such as Terrestrial Digital
Video Broadcasting (DVB-T) systems, Digital Audio Broadcasting (DAB) systems, and
microwave portable links. There are many ways to measure the complexity and
efficiency of an algorithm, and the final assessment depends on both the available
technology and the intended application. Normally the number of arithmetic
multiplications and additions is used as a measure of computational complexity.
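As an illustration of this measure, the sketch below compares the standard operation counts of a direct N-point DFT with those of a radix-2 FFT. The counts are the usual textbook figures for complex operations (including trivial multiplications by 1), not measurements of the design described here.

```python
def dft_ops(n):
    """Complex multiplications and additions for a direct N-point DFT."""
    return n * n, n * (n - 1)

def radix2_fft_ops(n):
    """Complex multiplications and additions for a radix-2 FFT (N a power of two)."""
    stages = n.bit_length() - 1
    if n < 2 or (1 << stages) != n:
        raise ValueError("n must be a power of two, at least 2")
    return (n // 2) * stages, n * stages

for n in (64, 256, 1024, 2048):
    d_mul, d_add = dft_ops(n)
    f_mul, f_add = radix2_fft_ops(n)
    print(f"N={n:5d}  DFT: {d_mul:8d} mul {d_add:8d} add   "
          f"FFT: {f_mul:6d} mul {f_add:6d} add")
```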
OFDM is a special case of Frequency Division Multiplexing, which we employ
in the SDR. OFDM is a combination of modulation and multiplexing. It provides
high bandwidth efficiency because the carriers are orthogonal to each other and
multiple carriers share the data among themselves. The main advantage of this
transmission technique is its robustness to channel fading in wireless
communication environments. It has been adopted by various standards in recent
years, including DSL and the 802.11a wireless LAN standard. This project focuses on
the core processing blocks of an OFDM system, which are the Fast Fourier
Transform block and the Inverse Fast Fourier Transform block.
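The role of the two transform blocks can be sketched in a few lines of Python. This is a simplified model under stated assumptions: QPSK mapping, 8 subcarriers, an ideal channel, and no cyclic prefix, none of which are specified in the text. A direct DFT stands in for the FFT/IFFT hardware, since both compute the same result; the function names are illustrative.

```python
import cmath

def dft(x, inverse=False):
    """Direct DFT (or inverse DFT); an FFT/IFFT block computes the same result faster."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def qpsk_map(b0, b1):
    """Hypothetical QPSK mapping: bit 0 -> +1, bit 1 -> -1 on each axis."""
    return complex(1 - 2 * b1, 1 - 2 * b0)

def ofdm_modulate(bits):
    """Place one QPSK symbol on each subcarrier, then apply the inverse transform."""
    symbols = [qpsk_map(bits[i], bits[i + 1]) for i in range(0, len(bits), 2)]
    return dft(symbols, inverse=True)

def ofdm_demodulate(samples):
    """Apply the forward transform and decide each bit from the sign of one axis."""
    bits = []
    for s in dft(samples):
        bits += [int(s.imag < 0), int(s.real < 0)]
    return bits

tx_bits = [0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1]  # 16 bits -> 8 subcarriers
rx_bits = ofdm_demodulate(ofdm_modulate(tx_bits))
print("recovered without error:", rx_bits == tx_bits)
```

Because the subcarriers are orthogonal, the forward transform at the receiver separates them exactly when the channel is ideal.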
Architecture:
The Fast Fourier Transform (FFT) is one of the most useful blocks in DSP
systems. Since the speed at which the result is calculated is an important factor
for this basic block, in this project we develop a new method for FFT
implementation that leads to better computation speed than other common
implementation methods. Conventional FFT algorithms are typically developed to
minimize the number of multiplications and additions, while the memory
operations are usually ignored. These hidden memory operations may account for
about half of the power consumed by the whole FFT calculation. To reduce the
number of memory accesses, we chose a cached-memory architecture to realize a
programmable 64- to 2048-point FFT processor.
This project provides a high-level compiler that generates hardware
implementations of the discrete Fourier transform (DFT) from mathematical
specifications at the algorithmic level. Our approach is to reduce the high
computational complexity while exploiting the regularity of the algorithm, where
the input specification is radix-4, radix-8, or even radix-16. The work combines
DSP with VLSI design.
The given matrix sequence is enumerated and coded in Verilog HDL.
The Verilog code is then synthesized into an RTL netlist, which is in turn
implemented on an FPGA board. By selecting the appropriate formula, the
resulting hardware implementations can achieve a wide range of tradeoffs
between implementation cost and performance. The Cooley-Tukey and Pease
algorithms parameterize the compiler for a set of technology-specific
codes targeting specific implementation platforms. This paper gives a brief
overview of the system and presents synthesis results.
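The Cooley-Tukey decomposition with a selectable radix can be sketched in software as follows. This is a minimal recursive Python model, assuming a decimation-in-time form and a fallback to the direct DFT when the remaining length is not divisible by the radix. It is not the compiler's generated hardware, and the function names are hypothetical.

```python
import cmath

def dft(x):
    """Direct DFT, used as the reference and as the fallback base case."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def fft_radix(x, radix=4):
    """Cooley-Tukey decimation-in-time FFT with a configurable radix."""
    n = len(x)
    if n <= 1:
        return list(x)
    if n % radix != 0:
        return dft(x)  # leftover size: fall back to the direct DFT
    m = n // radix
    # Transform the `radix` interleaved subsequences, each of length m.
    subs = [fft_radix(x[r::radix], radix) for r in range(radix)]
    out = [0j] * n
    for k in range(m):
        # Multiply each sub-transform output by its twiddle factor.
        tw = [subs[r][k] * cmath.exp(-2j * cmath.pi * r * k / n)
              for r in range(radix)]
        # Radix-point butterfly across the subsequences.
        for q in range(radix):
            out[k + q * m] = sum(tw[r] * cmath.exp(-2j * cmath.pi * r * q / radix)
                                 for r in range(radix))
    return out

signal = [complex(t % 7, -(t % 3)) for t in range(64)]
reference = dft(signal)
for radix in (4, 8, 16):
    result = fft_radix(signal, radix)
    err = max(abs(a - b) for a, b in zip(result, reference))
    print(f"radix-{radix}: max deviation from direct DFT = {err:.2e}")
```

A higher radix reduces the number of recursion stages at the cost of a larger butterfly, which is one of the cost and performance tradeoffs the formula selection exposes.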
|