# Parallel Hardware Implementation of Walsh Hadamard Transform 

Pulak Mazumder ${ }^{1 *}$, Soumyadeep Chandra ${ }^{2}$, Sekhar Rana ${ }^{3}$, Mainak Mukhopadhyay ${ }^{4}$ \& Mrinal Kanti Naskar ${ }^{2}$<br>${ }^{1}$ Department of ECE, Regent Education and Research Foundation, Kolkata 700 121, West Bengal, India<br>${ }^{2}$ Department of ETCE, Jadavpur University, Kolkata 700 032, West Bengal, India<br>${ }^{3}$ Department of Electronics and Communication Engineering, MCKV Institute of Engineering, Howrah 711 104, West Bengal, India<br>${ }^{4}$ Department of ECE, Birla Institute of Technology, Mesra, Ranchi 835 215, Jharkhand, India

Received 18 February 2021; revised 15 June 2022; accepted 20 June 2022


#### Abstract

The Walsh Hadamard Transform is a powerful notion in digital signal processing. This paper explains the construction of parallel hardware architecture using the mathematical concept of Kronecker product based approach to Walsh Hadamard Transform and its simulation using Verilog. This architecture is simulated here using Field Programmable Gate Array (FPGA) technology in Verilog Spartan 3e platform. Furthermore, this paper illustrates the fast algorithm and parallel computational result of both one-dimensional and two-dimensional transforms using the Kronecker product.This algorithm can be used to implement a systolic array based dedicated hardware for computation of the transform. Our proposed hardware design for the Walsh Hadamard Transform will be used in various digital signal processing applications. The systematic derivation of parallel architecture design using the concept of Kronecker product and stride permutation would depict the real time processing rather than conventional way and reducing time complexity using minimal resources is a challenging task.


Keywords: FPGA, Kronecker product, Systolic architecture, Verilog

## Introduction

The Walsh Hadamard Transform (WHT) is a mathematical construct that finds wide applications in the fields of digital signal processing, data compression, and encryption. ${ }^{1}$ It also finds application in quantum computer information processing and it is more often called Hadamard gate in this context. The transform is particularly useful in feature extraction for pattern recognition and digital image processing because of its easy implementation using simple arithmetic stages. ${ }^{2}$ Also, the binary nature of the Walsh functions and the Hadamard matrix allow easy semiconductor-based computer implementation. ${ }^{3}$

The usage of the Kronecker product allows the deimplementation of WHT using an algorithm that is both recursive and parallel. For a one-dimensional input of size $2 \alpha$, the algorithm completes in $\alpha$ clock cycles. Moreover, for a two-dimensional input of size ( $2 \alpha \times 2 \alpha$ ), the same is complete in $2 \alpha$ clock cycles. Thus, using this approach can significantly reduce the number of clock cycles. ${ }^{4}$

Kronecker product allows the decomposition of the

[^0]WHT matrix into simpler arithmetic stages comprising of homogeneous recursive array block. It thus provides a parallel computation of systolic based array implementation of WHT for synchronous evaluation. ${ }^{5}$ Decomposing the one-dimensional input into pairs and applying them to the arithmatic stages allows for parallel execution. ${ }^{6}$ The output obtained is stride-permuted and applied back to the same arithmetic stages, continuing the execution recursively. For a two-dimensional input, we design an algorithm that works on the column-major representation of the input. ${ }^{7}$ This representation is one-dimensional, allowing the computation similar to the above. As discussed in our previous work ${ }^{8}$, the step-by-step development beginning from Granata's paper. ${ }^{4}$ from the theoretical approach of expressing the DSP algorithm using the Kronecker product gives insight into developing the parallel hardware architecture.

Although the previous work had focused on modeling the FFT algorithm ${ }^{3,9}$, which threw light onto the plethora of algorithms involving recursion that can have a similar implementation, the major constraint remains the implementation of this derived architecture for practical. This had been greatly aided
by Field-programmable Gate arrays which have experienced pleasant favoritism from researchers with laboratory constraints. Our case resolves around a simple purpose. We implemented a very well-known transform technique from the area of DSP using the Kronecker product, and as a result, we were able to achieve a parallelism technique for the algorithms. This has helped us to propose a new architecture for an Application Specific Integrated Circuit (ASIC) dedicated to this purpose. Our proposed WHT algorithm's computational result is faster than basic WHT and implemented parallel too. As a result, our proposed architecture is faster than our previous work ${ }^{8}$ and reduces the time and space complexity.
This paper is constructed as follows. The Materials and Method section gives a brief description of the basic concept of the Kronecker product and its properties. We use these properties to develop formulae for evaluating WHT of the 1D and 2D inputs recursive and parallel in sub-sections respectively. Then in next sub-section, we propose our systolic array-based WHT architecture based on these recursive formulae. In the following subsection, the paper also contains algorithms for the architecture proposed. And last, the paper contains a section dedicated to the mathematical and time complexity analysis of the algorithms in Results and Discussion section.

## Materials and Methods

## Basic Kronecker Properties

Let $A_{n 1, n 2}$, and $B_{m 1, m 2}$ be two arbitrary matrices of dimension n 1 by n 2 and m 1 by m 2 , respectively. Let F be a field such as R or C . For any matrices, $A=\left[a_{i, j}\right] \in F_{m \times n}$ and $B \in F_{p \times q}$, their Kronecker product ${ }^{4}$ (i.e., the direct product or tensor product) denoted by $C=A \otimes B$, is defined as
$A \otimes B=\left[a_{i, j} B\right]$ of dimension $(n 1 \times m 1) \times(n 2 \times m 2)$
$C=\left(A_{n 1, n 2} \otimes B_{m 1, m 2}\right)$
Thus, $A \otimes B \in F_{(m p) \times(n q)}$

## Mathematical Model: 1-D Transform

The WHT performs an orthogonal, symmetric, involution, linear operation on a set $X_{N}$ of $N=(2 \alpha)$ real numbers. For 1-D WHTs, the set of numbers is represented as a vector $X_{N \times 1}$, and the transform ${ }^{10}$ is given by:
$Y_{N \times 1}=W_{N} \cdot X_{N \times 1}$
$W_{N}$, the Hadamard matrix for $N=(2 \alpha)$, can be recursively defined as follows:
$W_{N}=\left[\begin{array}{cc}W_{\frac{N}{2}} & W_{\frac{N}{2}} \\ W_{\frac{N}{2}} & -W_{\frac{N}{2}}\end{array}\right]$
Alternatively, $W_{2 \alpha}$ can be defined using the Kronecker product as follows,
$W_{2^{\alpha}}=W_{2} \otimes W_{2^{\alpha-1}}$
By solving the recurrence for $W_{2 \alpha}$ using the method of substitution, we can get,
$W_{2^{\alpha}}=\left(W_{2} \otimes W_{2} \otimes W_{2} \ldots \otimes W_{2} \otimes W_{2}\right)$
Now the Product rule implies:

$$
A_{N_{1}} \otimes \ldots \otimes A_{N_{t}}=\prod_{k=1}^{t}\left(I_{N_{k-1}} \otimes A_{N_{k}} \otimes I_{\frac{N}{N_{k}}}\right)
$$

The above generalized product rule can be modified specifically for the WHT as follows:
$W_{2^{\alpha}}=\prod_{i=0}^{\alpha-1}\left(I_{2^{i}} \otimes W_{2} \otimes I_{2^{\alpha-i-1}}\right)$
Further, using the commutation theorem, product rule, and the properties of stride permutationmatrices, we can obtain:
$W_{2^{\alpha}}=\prod_{i=0}^{\alpha-1} P_{2^{\alpha}, 2}\left(I_{2^{\alpha-1}} \otimes W_{2}\right)$
Therefore, putting Eq. (8) in the definition of WHT stated in Eq. (2), we get
$Y_{N \times 1}=\prod_{i=0}^{\alpha-1} P_{2^{\alpha}, 2}\left(I_{2^{\alpha-1}} \otimes W_{2}\right) X_{N \times 1}$
The above formulation ${ }^{8}$ gives an efficient algorithm for computing the Hadamard matrix where $\left(I_{2^{\alpha-1}} \otimes W_{2}\right) X_{N \times 1}$ is a parallel operation and $\mathrm{P}_{2 \alpha, 2}$ is a $2 \alpha$ - point stride 2 permutation matrix.

However, using the recursive Kronecker Product property of $W_{N}$, as shown in Eq. (5), allows parallel execution of lower-order 2-point WHT transform block for generating higher-order WHT transformation on input $X_{N}$.

## 4-point Transform:

The block diagram of section-wise implementation of Fast 4-point WHT is given in Fig. 1.

A 4 point WHT transform matrix, using the Kronecker product rule, is represented by:

$$
\begin{gather*}
W_{4}=\left[\begin{array}{cccc}
1 & 1 & 1 & 1 \\
1 & -1 & 1 & -1 \\
1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1
\end{array}\right] ; W_{2}=\left[\begin{array}{cc}
1 & 1 \\
1 & -1
\end{array}\right] \\
\text { or, } W_{4}=W_{2} \otimes W_{2}=\left[\begin{array}{cc}
W_{2} & W_{2} \\
W_{2} & -W_{2}
\end{array}\right] ; \tag{10}
\end{gather*}
$$

Let us consider a matrix $A_{n}$ of dimension $n \times n$ such that:


Fig. 1 - Block Diagram of Section-wise Implementation of Fast 4-point WHT

$$
\begin{gather*}
W_{4}=\left[\begin{array}{ll}
W_{2} & A_{2} W_{2} \\
W_{2} & -A_{2} W_{2}
\end{array}\right]=\left[\begin{array}{cc}
I_{2} & A_{2} \\
I_{2} & -A_{2}
\end{array}\right]\left[\begin{array}{cc}
W_{2} & 0 \\
0 & W_{2}
\end{array}\right] ; \\
\text { or, } W_{4}=\left[\begin{array}{cc}
I_{2} & I_{2} \\
I_{2} & -I_{2}
\end{array}\right]\left[\begin{array}{cc}
I_{2} & 0 \\
0 & A_{2}
\end{array}\right]\left[\begin{array}{cc}
W_{2} & 0 \\
0 & W_{2}
\end{array}\right] ; \tag{11}
\end{gather*}
$$

Thus, on comparing Eq. (10) with equation Eq. (11), we get that $W_{4}$ can be represented in the Kronecker product as:
$W_{4}=\left(W_{2} \otimes I_{2}\right)\left[\begin{array}{cc}I_{2} & 0 \\ 0 & A_{2}\end{array}\right]\left(I_{2} \otimes W_{2}\right)$
where, $\left[\begin{array}{cc}I_{2} & 0 \\ 0 & A_{2}\end{array}\right]=A_{4}=I_{4}=\left[\begin{array}{llll}1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1\end{array}\right]$
Thus, a 4-point WHT can be represented as:
$W_{4}=\left(W_{2} \otimes I_{2}\right)\left(I_{4}\right)\left(I_{2} \otimes W_{2}\right)$

## 8-point Transform

Similarly, from Fig. 2, using the recursive nature of the Kronecker product, the equation representing an 8 -point WHT transform is:
$W_{8}=\left[\begin{array}{cc}I_{4} & I_{4} \\ I_{4} & -I_{4}\end{array}\right]\left[\begin{array}{cc}I_{4} & 0 \\ 0 & A_{4}\end{array}\right]\left[\begin{array}{cc}W_{4} & 0 \\ 0 & W_{4}\end{array}\right] ;$
$W_{8}=\left(W_{2} \otimes I_{4}\right)\left[\begin{array}{cc}I_{4} & 0 \\ 0 & A_{4}\end{array}\right]\left(I_{2} \otimes W_{4}\right)$
where, $\left[\begin{array}{cc}I_{4} & 0 \\ 0 & A_{4}\end{array}\right]=A_{8}=I_{8}$
Thus, the reduced equation is:
$W_{8}=\left(W_{2} \otimes I_{4}\right)\left(I_{8}\right)\left(I_{2} \otimes W_{4}\right)$

## The Generalised Equation for 1-D Fast WHT

From the above examples, we can generalise the equation for representing a 1 D -point WHT as:

$$
\begin{gather*}
W_{N}=\left(W_{2} \otimes I_{\frac{N}{2}}\right)\left[\begin{array}{cc}
I_{N} & 0 \\
0 & A_{\frac{N}{2}}
\end{array}\right]\left(I_{2} \otimes W_{\frac{N}{2}}\right) \\
W_{N}=\left(W_{2} \otimes I_{\frac{N}{2}}\right)\left(A_{N}\right)\left(I_{2} \otimes W_{\frac{N}{2}}\right) \tag{15}
\end{gather*}
$$

where $A_{N}=\operatorname{diag}(1)=I_{N}$


Fig. 2 - Block Diagram for Section-wise Implementation of Fast 8-point WHT

## Mathematical Model: 2-D Transformation

The WHT of a square matrix $X_{N}$ of order N is given by
$Y_{N \times 1}=W_{N} \times X_{N} \times W_{N}^{T}$
Now, we can observe from Eq. (3) and Eq. (8), that the transpose of $W_{N}$ is the same matrix as $W_{N}^{T}$
$W_{N}=W_{N}^{T}=\left[\begin{array}{cc}W_{N} & W_{\frac{N}{2}} \\ W_{\frac{N}{2}} & -W_{\frac{N}{2}}\end{array}\right]$
or,$W_{N}=W_{N}^{T}=\prod_{i=0}^{\alpha-1} P_{2^{\alpha}, 2}\left(I_{2^{\alpha-1}} \otimes W_{2}\right)$
If we represent the matrix $X_{N}$ as a column matrix $X_{C}$ of order $2^{2 \alpha} \times 1$, we can rewrite Eq. (16) for obtaining 2-D WHT as follows.
$Y_{C}=\left(W_{N} \otimes W_{N}^{T}\right) \times X_{C}$
where, $Y_{C}$ is a column matrix representation for $Y_{N}$.
Now for $N=2^{\alpha}$, in Eq. (19), we can write:
$W_{N} \otimes W_{N}^{T}=W_{2^{\alpha}} \otimes W_{2^{\alpha}}=W_{2^{2^{\alpha}}}=W_{N^{2}}$
Therefore, we can present the following formulation for the computation of 2-D WHT using Eq. (18) and (19).
$Y_{C}=\prod_{i=0}^{2^{\alpha-1}} P_{2^{2 \alpha}, 2}\left(I_{2^{2 \alpha-1}} \otimes W_{2}\right) X_{C}$
The above formulation gives an efficient algorithm for computing the Hadamard matrix where $\left(I_{2^{2 \alpha-1}} \otimes\right.$ $W 2 X C$ is a parallel operationand $P 2 \alpha, 2$ is a $2 \alpha-$ point stride 2 permutation matrix.

However, using properties of Kronecker product, we know from Eq. (15) that in 1D transform $W_{N}$ or $W_{2^{\alpha}}$ can be represented as:
$W_{N}=W_{2^{\alpha}}=\left(W_{2} \otimes I_{\frac{N}{2}}\right)\left(A_{N}\right)\left(I_{2} \otimes W_{\frac{N}{2}}\right)$
Substituting this in Eq. (20), where $\mathrm{N}=2^{\alpha}$, we get:

$$
\begin{equation*}
W_{N^{2}}=\left(W_{2} \otimes I_{\frac{N^{2}}{2}}\right)\left(A_{N^{2}}\right)\left(I_{2} \otimes W_{\frac{N^{2}}{2}}\right) \tag{22}
\end{equation*}
$$

## Proposed Architecture

The Generalised WHT equation has two parts:

1. The parallel computation of inputs to recursive $W_{2}$ blocks.
2. The multiplication of outputs generated from the $W_{2}$ transform block in part 1 generates output.
We present a hardware architecture where each iteration in Eq. (15) is completed in one circuit clock cycle. In part (1), addition and subtraction for computation of recursive 2 -point WHT transform blocks are done in a positive half cycle. ${ }^{11}$ In part(2), the transformed outputs are multiplied to generate output $Y_{N}$ for N -point $W_{N}$ transform in the corresponding negative half cycle.

This hardware implementation requires the following devices: N Arithmetic Units (AU) that consist of an adder and a subtractor for part (1) operation and N/2 Multipliers for part (2) operation. The outputs of multipliers, i.e., Q1 to Q7, are again loaded into X1 to X7 and fed back to the AUs in the next clock cycle for computation of the next iteration in Eq. (15). In Fig. 3 flowchart representation of the algorithm used for computation of (a) 1-D and (b) 2-D Walsh Hadamard Transform is depicted.

## Algorithms

Algorithm for 1D transform
Algorithm 1: Algorithm describing the computation of 1D WHT Transform

Input: $X_{n}$ : N point $\left(2^{\alpha}\right)$ input to 1D Transform
Output: $Y_{n}$ : N point output of 1D Transform
$1 X_{n} \leftarrow$ input of size $n$
2 for $\alpha \leftarrow 1$ times do
3 For every rising edge of the circuit clock
$4 Y_{n}^{T} \leftarrow A U\left(X_{n}\right)$;
5 For the corresponding falling edge of the clock
$6 X_{n} \leftarrow X_{n} \leftarrow \operatorname{Multipliers}\left(Y_{n}^{T}\right)$;
7 end


Fig. 3 - Flowchart representation of the algorithm used for computation of (a) 1-D and (b) 2-D Walsh Hadamard Transform

## Algorithm for 2D transform

Algorithm 2: Algorithm describing the computation of 2D WHT Transform
Input: $X_{n \times n}: \mathrm{N} \times \mathrm{N}$ point input to 2D Transform Output:
$Y_{n \times n}: \mathrm{N} \times \mathrm{N}$ point output to 2D Transform
$1 X_{n \times n} \leftarrow$ input of size $n \times n$
2 Convert $X_{n \times n}$ into a column major representation $X_{C}$
3 for $2 \alpha \leftarrow 1$ times do
4 For every rising edge of the circuit clock
$5 Y_{C}^{T} \leftarrow A U\left(X_{C}\right)$;
6 For the corresponding falling edge of the clock
$7 X_{C} \leftarrow X_{C} \leftarrow \operatorname{Multipliers}\left(Y_{C}^{T}\right)$;
8 end
9 Convert column matrix $Y_{C}$ to $2 D$ matrix $Y_{n \times n}$

## Results and Discussion

The Generalized Recursive Equation for Calculation of Operations in Fast WHT
$S_{N}$ is the number of additions required for fast implementation of N-point WHT.
$S_{N}=\left(\frac{N}{2} \times 2\right)+\left(2 \times S_{\frac{N}{2}}\right)$
$P_{N}$ is the number of multiplications required for fast implementation of N-point WHT.
$P_{N}=\left(\frac{N}{2}\right)+\left(2 \times P_{\frac{N}{2}}\right)$
The comparison based on number of calculation is represented in Table 1.

| Table 1- Number of Calculations |  |  |  |  |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Comparison of operations of WHT |  |  |  |  |  |  |  |
| N | $P_{N}$ | $S_{N}$ | Net | $P_{N}$ | $S_{N}$ | Net | Net |
|  | $P_{N}$ | $S_{N}$ | orored |  |  |  |  |
| 4-point WHT | 6 | 10 | 16 | 4 | 8 | 12 | 25 |
| 8-point WHT | 18 | 26 | 44 | 12 | 24 | 36 | 18.18 |
| 16-point WHT | 42 | 68 | 110 | 38 | 64 | 102 | 7.27 |

## Time Complexity Analysis

## The Computational Complexity of Previous Works on WHT

The Walsh Hadamard Transform can be regarded as built out of multidimensional Discrete Fourier Transforms (DFTs) of size $2 \times 2 \times \ldots \times 2$. The Walsh Hadamard Matrix $W_{N}$ is a $N \times N$ matrix where $N=2^{m}$ real numbers.

The naive implementation of WHT of order $N=2^{m}$ would have a computational complexity of $\mathcal{O}\left(N^{2}\right) .{ }^{10}$ However, even using the fast Hadamard transform algorithm ${ }^{12,13}$, the Hadamard transform can be computed in $\mathcal{O}\left(N \log _{2} N\right)$.

## 1-D Transform

The expression for the time complexity can be obtained from the algorithm itself. Lines 4 and 6 of algorithm 1 get executed in 1 clock cycle, which we can take as a unit of time, say $t$. Therefore, the total time for computation:
$T=(1+(\alpha-1)) t=\alpha t$
As $\alpha=\log _{2} N$, the complexity of algorithm 1 is given by $\mathcal{O}\left(\log _{2} N\right)$.

## 2-D Transform

Similarly, using the algorithm for 2D transform the total time for computation.
$T=(1+(2 \alpha-1)) t=2 \alpha t=\left(2 \log _{2} N\right) t$
As the input is a $N \times N$, total input size, $M=N \times$ $N=N^{2}$. Therefore,
$T=\left(2 \log _{2} N\right) t=\left(2 \log _{2} M^{\frac{1}{2}}\right) t=\left(\log _{2} M\right) t$

Thus, the asymptotic representation for the time complexity of algorithm 2 , can be given by $\mathcal{O}\left(\log _{2} M\right)$.

## Hardware Implementation

The proposed architecture has been verified using Verilog and tested using the 'Spartan 3e development kit'. The following diagrammatic schematic architecture shown in Figs 4 and 5 is implemented on Verilog to generate the following statistics. The


Fig. 4 - Schematic architecture of 4-point WHT


Fig. 5 - Technology schematic view of WHT implementation

| Table 2 - HDL Synthesis Report |  |  |
| :---: | :---: | :---: |
| Macro Statistics | Number |  |
| Components | 2 |  |
| 4-bit Adder carry out | 1 |  |
| 5-bit Adder carry out | 14 |  |
| 6-bit Adder carry out | 17 |  |
| Total Adder/Subtractors |  |  |

Table 4 - Time taken for computation of WHT with a different sequence length

```
Sequence length Proposed
4-point WHT 14.508ns (12.261ns logic, 2.247ns route, 84.5%
    logic, 15.5% route)
8-point WHT 14.666ns (11.184ns logic, 3.482ns route, 76.3%
    logic, 23.7% route)
```



Fig. 6 - Timing diagram of 4 point WHT hardware implementation


Fig. 7 - Timing diagram of 8 point WHT hardware implementation
complete HDL Synthesis Report Statistics is provided in Table 2. The Devise Utilisation Report is shown in Table 3.

The computational results are formulated in Figs 6 and 7. The corresponding runtime of the proposed WHTs are shown in Table 4.

## Conclusions

The mathematical model developed to design the proposed architecture depicts a new dimension for Walsh Hadamard Transform. The dimensional transformation using the Kronecker product for fast implementation can give an alternate substitution in image processing. Walsh-Hadamard is a combination of two algorithms and it is an efficient method of image compression and signal filtering. It is because these algorithms work on the matrix basis. Since signals and images are a combination of matrices, this method finds it easy to handle them efficiently and fast. The results obtained from the simulation study have been quite promising. The simulation results using Verilog clearly confirms on the efficiency of the proposed method as compared with conventional technique. The next step would be to implement a higher-order Walsh Hadamard transform tested through the FPGA development kit.

## References

1 Crochiere R E \& RabinerL R, Multirate Digital Signal Processing, Englewood Cliffs (NJ Prentice-Hall) 1983.
2 Wang Z, Harmonic analysis with a real frequency function, I Aperiodic case, II Periodic and bounded cases and III Data sequence, Appl Math Comput, 9 (1981) 53-73.
3 Hartley R V L, A more symmetrical Fourier analysis applied to transmission problems, Proc IRE, 30 (1942) 144-150.
4 Granata J, Conner M \& Tolimieri R, Recursive fast algorithm and the role of the tensor product, IEEE Trans Signal Process, 40(12) (1992) 2921-2930.
5 Bi G \& Chen Y Q, Fast DHT Algorithms for Length $\mathrm{N}=\mathrm{q} \times$ 2m, IEEE Trans. on Signal Processing, 47(1999) 900-903.
6 Tolimieri R, An M \& Lu C, Algorithms for Discrete Fourier Transform and Convolution (Springer-Verlag) 1989.
7 Johnson J R, Johnson R W, Rodriguez D \& Tolimieri R, A methodology for designing, modifying, and implementing Fourier transform algorithms on various architectures, Circuits Syst Signal Pro, 9(4) (1990) 449-500.
8 Mazumder P, Middya R \& Naskar M K, Hardware implementation of fast recursive walsh-hadamard transform, Int J Comput Sci Eng, 7(1) (2019) 28-32.
9 Milder P A, Franchetti F, Hoe, J C \& Püschel M, Discrete Fourier Transform Compiler: From Mathematical Representation to Efficient Hardware, CSSI Technical Report \#CSSI-07-01, (Carnegie Mellon University) 2007
10 Chiper D F, Radix-2 fast algorithm for computing discrete hartley transform of type III, IEEE Trans Circuits Syst II: Express Br, 59(5) (2012) 297-301.
11 Chiper D F, A Novel VLSI DHT Algorithm for a highly modular and parallel architecture, IEEE Trans Circuits Syst II: Express Br, 60(5) (2013) 282-286.
12 BracewellR N, The Fast Hartley Transform, Proc IEEE, 72(8) (1984) 1010-1018.
13 Fino, Bernard J \& Ralph AlgaziV, Unified matrix treatment of the fast Walsh-Hadamard transform, IEEE Trans Comput, 11 (1976) 1142-1146.


[^0]:    *Authors for Correspondence
    E-mail: pulak.mazumder@gmail.com

