High Throughput Direct Two Dimensional Discrete Cosine Transform (DCT) and Inverse DCT with No Transpositional Buffer and No Multiplier
I didn’t take many courses in the last semester of my bachelor’s degree, which gave me a lot of spare time. During that period, I worked on three research projects simultaneously. One of them was related to image processing.
I worked in a team with Aidilla Pradini and Teuku Muhammad Roffi. On paper, we had divided up the research: I mostly worked on improving the algorithm, developing the objective, and proving the mathematical equations. Aidilla mostly worked on the functional programming and RTL, while Roffi mostly worked on the FPGA implementation. In practice, though, we helped each other in various areas.
This research was mostly conducted at Lab IC Design, PAU, ITB, under the supervision of Dr. Trio Adiono. The purpose of this research was to enter the LSI Design Contest in Okinawa. However, before going to Okinawa, we needed to pass the country round. Fortunately, we were selected as the main team to represent Indonesia in the final round in Okinawa, Japan. Even though I had already joined an O&G company, which of course has no interest in image processing, I got a really good opportunity at that time: the company allowed me to attend the final round as a business trip (meaning the company would pay for everything).
The story went well until D-12 hours. Exactly 12 hours before our flight to Taipei (the flight to Okinawa transited via Taipei according to the airline schedule), our campus cancelled the trip due to the Fukushima nuclear power plant disaster in Japan. However, that wasn’t the end of our research. The paper itself actually consists of three different pieces of research (titles). One of those three titles has been accepted and will be presented in July 2011 (I’ll talk about it later).
Basically, the main point of the paper is to make a core image-processing operation, the 2-D DCT and its inverse, as efficient as possible in hardware. The abstract is as follows:
An efficient algorithm and hardware implementation for a direct 2-D Discrete Cosine Transform (DCT) and inverse DCT is presented. A unique combination and sophisticated adaptation of algebraic integer encoding and a butterfly-structured algorithm is employed to achieve a high-throughput, bufferless, and multiplierless design. Eight 1-D 8-point DCT modules are employed, each consisting of a so-called modified 2-D algebraic integer encoding of a 1-D radix-8 DCT. The scaling and quantizer-dequantizer modules are also improved by an approximation method. These algorithmic improvements result in bufferless, multiplierless, zero-memory-usage, direct-processing 2-D DCT and inverse DCT designs. Simulations with MATLAB and ModelSim software show that the proposed design maintains PSNR and MSE values compared to those of the conventional design. The design is further improved by employing a 5-stage pipelined implementation. The pipelined implementation results in a higher clock frequency with high throughput. The system has a maximum frequency of 210.084 MHz. Synthesis using Synopsys software shows that the design is 6.8 times faster in processing a token of 64 pixels compared to the conventional design. This improvement is traded off against only a 2.1 times increase in size compared to the conventional (refers to Level 1) design. Verification has been conducted using an Altera DE2 FPGA board and satisfactory results have been obtained.
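For readers who want a feel for the baseline, the following is a minimal Python sketch of the textbook row-column 2-D DCT on an 8x8 block, together with the MSE and PSNR metrics mentioned in the abstract. It is not the algebraic integer design from the paper: it uses ordinary floating-point multiplications and an explicit transpose, which is the step a hardware implementation would typically realize with a transposition buffer. All function names (`dct_1d`, `dct_2d`, `mse`, `psnr`) and the test block are illustrative assumptions, not taken from the paper.

```python
import math

N = 8  # block size, matching the 8-point DCT modules in the abstract

def dct_1d(x):
    """Orthonormal 1-D DCT-II of a length-N sequence (floating point)."""
    out = []
    for k in range(N):
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        s = sum(x[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                for n in range(N))
        out.append(scale * s)
    return out

def idct_1d(X):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    out = []
    for n in range(N):
        s = 0.0
        for k in range(N):
            scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
            s += scale * X[k] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
        out.append(s)
    return out

def transpose(block):
    """Swap rows and columns; in hardware this is the transposition buffer."""
    return [list(col) for col in zip(*block)]

def dct_2d(block):
    """Row-column 2-D DCT: transform rows, transpose, transform again."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(c) for c in transpose(rows)]
    return transpose(cols)

def idct_2d(coeffs):
    """Row-column inverse 2-D DCT."""
    rows = [idct_1d(r) for r in coeffs]
    cols = [idct_1d(c) for c in transpose(rows)]
    return transpose(cols)

def mse(a, b):
    """Mean squared error between two NxN blocks."""
    return sum((a[i][j] - b[i][j]) ** 2
               for i in range(N) for j in range(N)) / (N * N)

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB, assuming 8-bit pixels."""
    err = mse(a, b)
    return math.inf if err == 0 else 10 * math.log10(peak ** 2 / err)

# Hypothetical test block: a simple ramp of pixel values 0..63.
block = [[i * N + j for j in range(N)] for i in range(N)]
coeffs = dct_2d(block)
restored = idct_2d(coeffs)
```

Comparing `restored` against `block` with `mse` and `psnr` is the same kind of check one would run between a reference transform and an approximated, multiplierless one to see how much quality the approximation costs.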
To see/download the paper, click here.
Happy to discuss,