# A LOW-COMPLEXITY HARDWARE IMPLEMENTATION OF DISCRETE-TIME FREQUENCY-SELECTIVE RAYLEIGH FADING CHANNELS

Fei Ren and Yahong R. Zheng

Missouri University of Science and Technology, Department of Electrical Engineering, 301 W 16th Street, Rolla, MO 65409, USA. Email: {frrf4, zhengyr}@mst.edu

#### ABSTRACT

A low-complexity hardware implementation method is proposed for discrete-time frequency-selective Rayleigh fading channels. The proposed method first employs the Sum-of-Sinusoids (SoS) method to generate multiple independent flat fading channel responses, then uses a simple weight-delay-sum filtering method to incorporate the fractionally delayed multipath rays into inter-tap correlated tap gains. It thus achieves accurate correlation properties in both inter-tap correlation and temporal correlation (or Doppler spectrum). The proposed method is implemented on an Altera Stratix II FPGA development kit, and the results closely match those of MATLAB software simulations.

*Index Terms*— Frequency-selective fading channel, discrete-time channel modeling, Sum-of-Sinusoids method, FPGA implementation.

## 1. INTRODUCTION

Wireless fading channel modeling and simulation provide a low-cost means for testing and verification of transceiver products, new algorithm design, and channel capacity analysis. One of the most commonly used models is the Rayleigh fading Wide-Sense Stationary Uncorrelated Scattering (WSSUS) channel, which is often simulated by one of two methods: the Sum-of-Sinusoids (SoS) method and the Doppler spectrum filtering method [1]. Hardware and software implementations of frequency-flat fading channels have been well studied and reported in, for example, [1, 2, 3, 4] and references therein. Software implementation of frequency-selective fading channels has also been well investigated [5, 6, 7]. However, hardware simulation of frequency-selective fading channels still presents challenges in computational complexity and simulation accuracy [8, 9]. The most difficult aspect of frequency-selective fading channel simulation is to accurately compute and incorporate the cross-correlation between multiple channel taps in the discrete-time model. Although the WSSUS model assumes multiple uncorrelated rays, the sampled discrete-time channel taps are often correlated due to the bandpass nature of wireless communications systems. Many current hardware implementations fail to consider this correlation and therefore produce inaccurate channel characteristics.

In this paper, we propose a simple method to incorporate inter-tap correlation for hardware implementation of discrete-time frequency-selective fading channels. The proposed method employs the weight-delay-sum filtering method [10] to implement the fractional delays of the multiple WSSUS rays. It combines the weight-delay-sum method with SoS flat fading simulators and keeps the complexity low enough for real-time hardware implementation. The proposed simulation method is implemented on Altera's Stratix II Field Programmable Gate Array (FPGA) development kit. The results closely match those of the MATLAB software implementation. Compared with existing hardware implementation methods, the proposed method offers lower computational complexity, a faster data rate, and more accurate waveforms and correlation properties.

## 2. DISCRETE-TIME FREQUENCY-SELECTIVE FADING CHANNEL MODELS

The frequency-selective Rayleigh fading channel model is often expressed as the baseband equivalent channel impulse response consisting of multiple paths [1]

$$h(\tau, t) = \sum_{i=1}^{I} P_i g(\tau - \tau_i) \exp[-j(\omega_i(t - \tau_i) - \phi_i)]$$
(1)

where $P_i$, $\omega_i$, and $\tau_i$ are the *i*-th multipath gain, angular Doppler frequency, and relative delay, respectively. The pulse shaping filter $g(\tau)$ is a bandpass filter often implemented by a raised cosine filter [1]. The multipath gains $P_i$ are normalized to yield unit total power of the response. It is commonly assumed that the multiple rays in (1) are wide-sense stationary with uncorrelated scattering (WSSUS).

When the delay spread $\tau_{max} - \tau_{min}$ is much smaller than the symbol interval $T_{sym}$, the channel impulse response can be assumed to be frequency-flat fading

$$h(t) = \sum_{i=1}^{I} P_{i}g(t) \exp[-j(\omega_{i}t - \phi_{i})]$$
(2)

If sampled at  $T_{sym}$  interval, the discrete-time flat fading channel can be efficiently simulated by several SoS models [1, 3] and a typical one is

$$Z(k) = Z_c(k) + jZ_s(k), \qquad (3)$$

$$Z_c(k) = \sqrt{\frac{2}{M}} \sum_{n=1}^M \cos(\omega_d k \cos \alpha_n + \phi_n),$$

$$Z_s(k) = \sqrt{\frac{2}{M}} \sum_{n=1}^M \cos(\omega_d k \sin \alpha_n + \varphi_n),$$

$$\alpha_n = \frac{2n\pi - \pi + \theta}{4M}, \quad n = 1, 2, \cdots, M,$$

where $\omega_d$ is the maximum angular Doppler frequency, $M$ is the total number of sinusoids, and $j = \sqrt{-1}$. The angle of arrival $\alpha_n$ is randomized by a uniformly distributed $\theta$, and $\phi_n$ and $\varphi_n$ are the random phases of the in-phase and quadrature components, respectively. The random variables $\phi_n$, $\varphi_n$, and $\theta$ are statistically independent and uniformly distributed on $[-\pi, \pi)$ for all $n$.
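The SoS model in (3) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names `make_sos_params` and `sos_sample` are hypothetical, `omega_d` is taken to be the Doppler frequency already normalized by the sampling interval (radians per sample), and the angle of arrival uses the $4M$ denominator of the model in [3].

```python
import math
import random


def make_sos_params(M, rng):
    """Draw the random parameters of one flat fading ray (hypothetical helper)."""
    theta = rng.uniform(-math.pi, math.pi)
    # Angles of arrival; the 4*M denominator is assumed from the model in [3].
    alphas = [(2 * math.pi * n - math.pi + theta) / (4 * M) for n in range(1, M + 1)]
    phis = [rng.uniform(-math.pi, math.pi) for _ in range(M)]     # in-phase phases
    varphis = [rng.uniform(-math.pi, math.pi) for _ in range(M)]  # quadrature phases
    return alphas, phis, varphis


def sos_sample(k, omega_d, params):
    """Return Z(k) = Z_c(k) + j Z_s(k) for one ray, following the form of Eq. (3)."""
    alphas, phis, varphis = params
    M = len(alphas)
    scale = math.sqrt(2.0 / M)
    zc = scale * sum(math.cos(omega_d * k * math.cos(a) + p)
                     for a, p in zip(alphas, phis))
    zs = scale * sum(math.cos(omega_d * k * math.sin(a) + q)
                     for a, q in zip(alphas, varphis))
    return complex(zc, zs)


if __name__ == "__main__":
    rng = random.Random(0)
    params = make_sos_params(16, rng)
    # Normalized Doppler 2*pi*0.0008 rad/sample is an example value only.
    print([sos_sample(k, 2 * math.pi * 0.0008, params) for k in range(3)])
```

Because every phase is uniform over a full period, each of $Z_c(k)$ and $Z_s(k)$ has unit average power across independent parameter draws, which is a convenient sanity check for any implementation.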

When the delay spread is comparable to or larger than the symbol interval, the fading channel is frequency-selective and inter-symbol interference often spans multiple symbol intervals. The sampled channel response (1) becomes a time-varying FIR system

$$H(l,k) = \sum_{i=1}^{I} P_i g(lT_s - \tau_i) Z_i(k),$$
(4)

where $Z_i(k)$, $i = 1, \dots, I$, are independent flat fading CIRs generated by (3). However, the multipath delays $\tau_i$ are often fractions of the symbol interval, as shown in Fig. **??**. Sampling the fractional delays at $T_{sym}$ (or at $T_s = T_{sym}/U$, where typically the upsampling rate $U \in [1, 10]$) results in *correlated* inter-symbol delay taps [5, 6]

$$E[H(l_1,k)H^{\dagger}(l_2,k)] = \sum_{i=1}^{I} \sum_{m=1}^{I} P_i P_m g(l_1 T_s - \tau_i) g^{\dagger}(l_2 T_s - \tau_m), \quad (5)$$

Note that $R_{gg}(\xi) = E[g(\tau)g^{\dagger}(\tau + \xi)]$ is the autocorrelation of the bandpass filter $g(\tau)$. The resulting discrete-time power delay profile is shown in Fig. 1.
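To see where this correlation comes from, it may help to consider the special case of a single ray ($I = 1$) in (4). The following worked case is an illustration added under that assumption, not a result taken from the references:

$$H(l,k) = P_1\, g(lT_s - \tau_1)\, Z_1(k) \;\Rightarrow\; E[H(l_1,k)H^{\dagger}(l_2,k)] = P_1^2\, g(l_1 T_s - \tau_1)\, g^{\dagger}(l_2 T_s - \tau_1)\, E[|Z_1(k)|^2].$$

Every tap is a scaled copy of the same fading process $Z_1(k)$, so any two taps with nonzero pulse samples are correlated. Only when $\tau_1$ is an exact multiple of the sampling interval, and the pulse is a Nyquist pulse sampled at its symbol interval, does a single tap survive and the correlation vanish.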

Several methods have been proposed to incorporate the inter-tap correlation in frequency-selective channel modeling including the spectrum factorization method [7] and the correlation matrix factorization method [5, 6]. It has been shown that these methods yield accurate channel models with low computational complexity in software-based simulation. However, the evaluation of correlation coefficients, and the spectrum and/or correlation matrix factorization are costly in hardware implementation. Therefore, we propose a simple weight-delay-sum filtering method [10] to implement the fractional delays,

$$H(l,k) = H_c(l,k) + jH_s(l,k)$$
 (6)

$$H_{c}(l,k) = \sum_{i=1}^{I} P_{i}E_{l,i}Z_{c_{i}}(k)\delta(l-l_{i})$$
  
$$H_{s}(l,k) = \sum_{i=1}^{I} P_{i}E_{l,i}Z_{s_{i}}(k)\delta(l-l_{i})$$

where  $l_i = \lfloor \tau_i/T_s \rfloor$ , and  $E_{l,i} = g(lT_s - \tau_i)$  are  $T_s$ -spaced samples of the delayed bandpass filter, as shown in Fig. 2, where the raised cosine pulse is truncated to  $\pm L_g T_s$  with  $L_g = 3$ .
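A software sketch of the weight-delay-sum idea is given below. It is a minimal illustration under several assumptions rather than the hardware design itself: the roll-off factor `beta = 0.5` is an arbitrary example value, the pulse period is set equal to $T_s$ (that is, $U = 1$), tap index 0 corresponds to time $-L_g T_s$ so that the non-causal pulse becomes causal, and the function names are hypothetical. Each ray contributes its weight $E_{l,i} = g(lT_s - \tau_i)$ to every tap inside the truncation window.

```python
import math


def raised_cosine(t, T=1.0, beta=0.5):
    """Raised-cosine pulse with period T; beta is an assumed example roll-off."""
    x = t / T
    denom = 1.0 - (2.0 * beta * x) ** 2
    if abs(denom) < 1e-12:
        # Limit of the pulse at |t| = T / (2 * beta).
        u = math.pi / (2.0 * beta)
        return (math.pi / 4.0) * math.sin(u) / u
    sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
    return sinc * math.cos(math.pi * beta * x) / denom


def tap_weights(tau, Ts, Lg, L):
    """Weights g(l*Ts - tau) for l = -Lg, ..., L - Lg - 1, truncated to +/- Lg*Ts."""
    weights = []
    for l in range(-Lg, L - Lg):
        t = l * Ts - tau
        weights.append(raised_cosine(t, T=Ts) if abs(t) <= Lg * Ts else 0.0)
    return weights


def weight_delay_sum(P, taus, Z, Ts, Lg):
    """Combine I complex flat fading samples Z[i] into L correlated tap gains."""
    L = 2 * Lg + 1 + math.ceil(max(taus) / Ts)
    H = [0j] * L
    for P_i, tau_i, Z_i in zip(P, taus, Z):
        for idx, E in enumerate(tap_weights(tau_i, Ts, Lg, L)):
            H[idx] += P_i * E * Z_i
    return H


if __name__ == "__main__":
    # One ray delayed by half a sample spreads its energy over several taps.
    print(weight_delay_sum([1.0], [0.5], [1 + 0j], Ts=1.0, Lg=3))
```

With a delay of exactly zero the raised-cosine samples vanish at all nonzero taps, so a single tap carries the ray; a fractional delay spreads the same fading waveform over neighboring taps, which is what makes them correlated.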

The simple weight-delay-sum method captures the inter-tap correlation of frequency-selective channels with very low computational complexity. The tradeoff is that it requires $I$ independent flat fading waveforms rather than the $L = 2L_g + 1 + \lceil \tau_{max}/T_s \rceil$ waveforms required in the correlation matrix factorization method [5]. In practice, the number of multipath rays $I$ is often slightly larger than the total number of taps $L$.

## 3. FPGA IMPLEMENTATION

For real-time hardware implementation, frequency-selective channel waveforms must be sampled at the same rate as the receiver and the received signal (after proper delay) is then

$$y(k) = \sum_{l=0}^{L-1} H(l,k) \cdot x(k-l) + v(k),$$
(7)

where $x(k)$ is the transmitted signal and $v(k)$ is the background white Gaussian noise. If the symbol interval is $T_{sym} = 1\,\mu s$ and the upsampling rate is $U = 10$, then $L$ samples of $H(l,k)$ are needed every $T_s = 0.1\,\mu s$, where $L$ is on the order of tens. This requirement is stringent for sample-by-sample processing. However, in modern communications systems, block transmission is often employed and the channel response is often slowly time-varying. We exploit this feature and propose an efficient implementation with block processing.
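The time-varying FIR relation in (7) can be sketched as follows. This is a hedged illustration only: the function name `apply_channel` is hypothetical, the noise term $v(k)$ is omitted, and the tap gains are assumed to be held constant over each block of `block_len` samples, which is one simple way to read the block processing idea.

```python
def apply_channel(x, H_blocks, block_len):
    """Compute y(k) = sum_l H(l, k) x(k - l) with taps held fixed per block.

    x         : list of input samples
    H_blocks  : list of tap-gain lists, one list of L gains per block
    block_len : number of samples over which the taps are assumed constant
    """
    y = []
    for k in range(len(x)):
        # Select the tap gains of the block containing sample k;
        # the last block is reused if the input runs past it.
        taps = H_blocks[min(k // block_len, len(H_blocks) - 1)]
        acc = 0j
        for l, h in enumerate(taps):
            if k - l >= 0:
                acc += h * x[k - l]
        y.append(acc)
    return y


if __name__ == "__main__":
    print(apply_channel([1, 0, 0, 2], [[1, 0.5]], block_len=4))
```

Holding the taps fixed within a block means the tap gains only need to be refreshed once per block instead of once per sample, which is where the relaxation of the real-time requirement would come from under this assumption.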

The proposed hardware implementation scheme consists of three major blocks: a parameter generator bank, a flat fading generator, and a selective fading generator module, as shown in Fig. 3, where MUX is a multiplexer. The parameter generator bank generates and stores all random variables needed for each of the $I$ WSSUS rays. These include the random phase vectors $\Phi_i = [\phi_{1,i}, \phi_{2,i}, \cdots, \phi_{M,i}]$ and $\Psi_i = [\varphi_{1,i}, \varphi_{2,i}, \cdots, \varphi_{M,i}]$, the maximum Doppler frequencies $\omega_{d,i}$, the random phases $\theta_i$, and the power delay profile vectors $\mathbf{P} = \{P_i\}$ and $\mathbf{D} = \{\tau_i\}$. The parameter generator bank also computes and stores the quantities $\cos \alpha_{n,i}$ and $\sin \alpha_{n,i}$ for all $n$ and $i$. The multiplexer selects the parameters of the *i*-th ray and sends them to the flat fading generator in series. The flat fading generator generates the real and imaginary components of the *i*-th flat fading channel response according to (3) and outputs $Z_{c_i}(k)$ and $Z_{s_i}(k)$ to two buffers of the selective fading generator. When the $k$-th flat fading samples of all $I$ rays are ready at the buffers, the selective fading generator processes them with the weight-delay-sum filtering method according to (6).

The implementation of the parameter generator bank is straightforward: it uses several uniform random number generators, and the sine and cosine functions are generated by Look-Up Tables (LUTs). The flat fading generator is implemented as in Fig. 4, where $M$ cosine functions are summed in series to generate the real/imaginary component of the fading response. Flexible data formats are used for different parameters according to their fixed-point precision. For example, the random phase/Doppler parameters use the format (3:20), the number $M$ uses (2:10), the time index $k$ uses (21:30), and the channel responses use (3:20). Thus, the accuracy of the output can reach $2^{-20} \approx 10^{-6}$.

**Fig. 1**. (a) A typical urban channel PDP with multiple WSSUS rays. (b) Average power/tap of $T_s$-spaced discrete-time channel response.

**Fig. 3**. Block diagram of FPGA implementation of the frequency-selective Rayleigh fading simulator.

**Fig. 4**. FPGA implementation of the flat fading generator module.
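The effect of a fixed-point format such as (3:20) on accuracy can be sketched as below. The interpretation is an assumption: the format is read here as 3 integer bits (including the sign bit) and 20 fractional bits in two's complement with saturation, and the helper name `quantize` is hypothetical. The actual bit allocation of the FPGA design may differ.

```python
def quantize(value, int_bits, frac_bits):
    """Round value to a signed fixed-point grid and saturate at the range limits.

    int_bits is assumed to include the sign bit; frac_bits sets the step 2**-frac_bits.
    """
    scale = 1 << frac_bits
    total_bits = int_bits + frac_bits
    max_code = (1 << (total_bits - 1)) - 1
    min_code = -(1 << (total_bits - 1))
    code = max(min_code, min(max_code, round(value * scale)))
    return code / scale


if __name__ == "__main__":
    step = 2.0 ** -20
    print("step size:", step)  # roughly 1e-6, the accuracy quoted in the text
    print("0.3 ->", quantize(0.3, 3, 20))
```

Under this reading the quantization step is $2^{-20}$, so the rounding error of any in-range value is at most half a step, in line with the stated output accuracy of about $10^{-6}$.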

The selective fading generator is the core module of the simulator and its structure is shown in Fig. 5. The *i*-th flat fading channel responses are multiplied by the gain $P_i$ according to the PDP specifications before being stored in the buffers. The weights $E_{l,i} = g(lT_s - \tau_i)$ are computed through multiple LUTs, which store the raised cosine pulse for $\tau \in [-L_g T_{sym}, +L_g T_{sym}]$ at a high resolution. The LUTs take the delay parameters $D_i = \tau_i$ as inputs and output the corresponding weights $E_{l,i}$ to the multipliers (MULs). Multiple MULs are used to weight the corresponding flat fading rays in parallel. The accumulators implement the summation of (6) and output a block of $H_c(l,k)$ and $H_s(l,k)$ in parallel.

**Fig. 2**. Bandpass filter of the *i*-th ray sampled at $T_{sym}$, where the delay $\tau_i$ is a fraction of $T_{sym}$.

**Fig. 5**. FPGA implementation of the frequency-selective fading generator module.

## 4. IMPLEMENTATION EXAMPLE AND PERFORMANCE EVALUATION

The proposed frequency-selective fading channel simulator was implemented on an Altera Stratix II FPGA/DSP development kit. We used Quartus II version 8.0 and DSP Builder version 5.0 for this development. DSP Builder provides a convenient interface between the FPGA hardware and MATLAB Simulink, so the parameters of the channel specifications were easily input to the channel simulator, and the outputs of the channel simulator were logged in data files in Simulink.

As an example, results for a typical urban channel model with 20 WSSUS rays are presented here. The implementation parameters were: the number of sinusoids $M = 16$, the upsampling rate $U = 10$, an output block size of $10 \times 1$ per accumulator, and the channel length $L = 60$ (in terms of $T_s$). When the clock period of the FPGA chip is set to 20 ns, the simulator meets the real-time requirements for a symbol interval of $T_{sym} = 6.4\,\mu s$. The logic utilization of the single FPGA chip was 33%, including 15704 (31%) combinational ALUTs and 1383 (2%) dedicated logic registers. The total number of block memory bits occupied was 822484 (32%). The proposed low-complexity hardware implementation thus occupies less than one third of the resources on the single FPGA chip.

**Fig. 6**. Autocorrelation and cross-correlation of the *i*-th flat fading ray sampled at $T_{sym}$ interval. The normalized Doppler frequency was $f_d T_{sym} = 0.0008$ and $f_d = 125$ Hz.

**Fig. 7**. Cross-correlation between $H_c(l,k)$ and $H_c(l+1,k)$ of the frequency-selective channel simulator.
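The real-time figure quoted for a 20 ns clock and $T_{sym} = 6.4\,\mu s$ can be cross-checked with a short calculation. The interpretation is an assumption not stated explicitly in the text: the flat fading generator is taken to process one sinusoid index $n$ per clock cycle, with the $I = 20$ rays handled in series.

$$\frac{T_{sym}}{T_{clk}} = \frac{6.4\,\mu\text{s}}{20\,\text{ns}} = 320 \text{ cycles}, \qquad I \times M = 20 \times 16 = 320 \text{ sinusoid evaluations}.$$

Under that assumption, one new flat fading sample for every ray fits exactly into one symbol interval. The same numbers are consistent with the Doppler setting of Fig. 6, since $f_d T_{sym} = 125\,\text{Hz} \times 6.4\,\mu\text{s} = 0.0008$.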

The performance of the hardware simulator was evaluated by its output waveforms. First, the auto- and cross-correlations of the flat fading generator outputs $Z_{c_i}(k)$ and $Z_{s_i}(k)$ were computed by averaging over five trials, where each trial generated $2 \times 10^6$ samples. The results are shown in Fig. 6.
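The averaging procedure described above can be sketched with a simple time-average estimator. This is an illustrative sketch, not the evaluation script used for the reported results: the function names are hypothetical, the sequences are assumed to be real-valued lists such as logged $Z_{c_i}(k)$ or $Z_{s_i}(k)$ waveforms, and only non-negative lags are handled.

```python
def correlation(a, b, lag):
    """Time-average estimate of E[a(k) b(k + lag)] for real sequences, lag >= 0."""
    n = len(a) - lag
    return sum(a[k] * b[k + lag] for k in range(n)) / n


def averaged_correlation(trials, lag):
    """Average the estimate over independent trials, each a pair (a, b)."""
    return sum(correlation(a, b, lag) for a, b in trials) / len(trials)


if __name__ == "__main__":
    # Toy alternating sequence: correlation is +1 at lag 0 and -1 at lag 1.
    a = [1.0, -1.0] * 50
    print(averaged_correlation([(a, a), (a, a)], lag=0))
    print(averaged_correlation([(a, a), (a, a)], lag=1))
```

Passing the same sequence twice gives an autocorrelation estimate, while passing the in-phase and quadrature sequences of one ray gives their cross-correlation.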

The cross-correlations between $H_c(l, k)$ and $H_c(l + (1, 2, 4), k)$ are shown in Fig. 7. The accuracy of the MATLAB simulations was set to $10^{-6}$, the same as the accuracy of the FPGA outputs. All FPGA outputs match the MATLAB simulations very well.

## 5. CONCLUSIONS

A low-complexity FPGA implementation of frequency-selective Rayleigh fading channels has been proposed, which employs simple weight-delay-sum processing to incorporate the inter-tap correlation of discrete-time channel models. The proposed simulator has been implemented on Altera's Stratix II development kit. The results of the hardware simulator match those of the software simulation. The advantages of the proposed simulator include its flexibility for parameter changes and its simple, compact implementation.

## 6. REFERENCES

- [1] W. C. Jakes, *Microwave Mobile Communications*, Piscataway, NJ: Wiley-IEEE Press, 1994.
- [2] C.S. Patel, G.L. Stuber, and T.G. Pratt, "Comparative analysis of statistical models for the simulation of Rayleigh faded cellular channels," *IEEE Trans. Commun.*, vol. 53, no. 6, pp. 1017–1026, June 2005.
- [3] Y.R. Zheng and Chengshan Xiao, "Improved models for the generation of multiple uncorrelated Rayleigh fading waveforms," *IEEE Communications Letters*, vol. 6, no. 6, pp. 256–258, Jun. 2002.
- [4] A. Alimohammad, S.F. Fard, B.F. Cockburn, and C. Schlegel, "A compact single-FPGA fading-channel simulator," *IEEE Trans. Circuits and Systems II: Express Briefs*, vol. 55, no. 1, pp. 84–88, Jan. 2008.
- [5] Chengshan Xiao, J.X. Wu, S-Y. Leong, Y.R. Zheng, and K. B. Letaief, "A discrete-time model for triply selective MIMO Rayleigh fading channels," *IEEE Trans. Wireless Commun.*, vol. 3, no. 5, pp. 1678–1688, Sep. 2004.
- [6] K.-W. Yip and T.-S. Ng, "Efficient simulation of digital transmission over WSSUS channels," *IEEE Trans. Commun.*, vol. 43, no. 12, pp. 2907–2913, Dec. 1995.
- [7] A. Abdi, "Stochastic modeling and simulation of multiple-input multiple-output channels: A unified approach," in *Proc. IEEE Intl. Symp. Antennas Propagat.*, Monterey, CA, 2004, pp. 3673–3676.
- [8] M. Kahrs and C. Zimmer, "Digital signal processing in a real-time propagation simulator," *IEEE Trans. Instrumentation and Measurement*, vol. 55, no. 1, pp. 197–205, Feb. 2006.
- [9] M.A. Wickert and J. Papenfuss, "Implementation of a real-time frequency-selective RF channel simulator using a hybrid DSP-FPGA architecture," *IEEE Trans. Microwave Theory and Techniques*, vol. 49, no. 8, pp. 1390–1397, Aug. 2001.
- [10] T.I. Laakso, V. Valimaki, M. Karjalainen, and U.K. Laine, "Splitting the unit delay [FIR/all pass filters design]," *IEEE Signal Processing Magazine*, vol. 13, no. 1, pp. 30–60, Jan. 1996.