Project overview
During the past two decades, reliable wireless communication at throughputs approaching the theoretical limit has been facilitated by receivers that operate on the basis of the Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm. Most famously, this algorithm is employed for turbo error correction in the Long Term Evolution (LTE) standard for cellular telephony, as well as in its predecessors. Looking forward, turbo error correction promises transmission throughputs in excess of 1 Gbit/s, which is the goal specified in the IMT-Advanced requirements for next-generation cellular telephony standards. Throughputs of this order have only very recently been achieved by State-Of-the-Art (SOA) LTE turbo decoder implementations. However, this has been achieved by exploiting every possible opportunity to increase the parallelism of the BCJR algorithm at an architectural level, implying that the SOA approach has reached its fundamental limit. This limit may be attributed to the data dependencies of the BCJR algorithm: each step of its recursions requires the result of the previous step, resulting in an inherently serial nature that cannot be readily mapped to processing architectures having a high degree of parallelism.
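The data dependency described above can be illustrated with a minimal sketch of the log-domain forward recursion used in BCJR-style decoding. This is a generic textbook-style illustration, not the implementation used in this project; the function names, the `gamma` layout and the toy trellis are assumptions made for the example.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_recursion(gamma, alpha0):
    """Compute log-domain forward state metrics for a trellis.

    gamma[k][p][s] is a hypothetical log branch metric for moving from
    state p at step k to state s at step k + 1 (use -inf for transitions
    that the trellis does not allow). alpha0 holds the initial metrics.
    """
    alphas = [list(alpha0)]
    for step in gamma:
        # Loop-carried dependency: the metrics for step k + 1 cannot be
        # computed until the metrics for step k are available.
        prev = alphas[-1]
        n_states = len(prev)
        alphas.append([
            logsumexp([prev[p] + step[p][s] for p in range(n_states)])
            for s in range(n_states)
        ])
    return alphas

# Toy two-state trellis with three steps and equiprobable transitions.
toy_gamma = [[[math.log(0.5)] * 2 for _ in range(2)] for _ in range(3)]
toy_alphas = forward_recursion(toy_gamma, [0.0, -math.inf])
```

In this sketch the work within one step (over the states) can be done in parallel, but the outer loop over the steps must run in order, which is the serial bottleneck referred to above.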
Against this background, we propose to redesign turbo decoder implementations at an algorithmic level, rather than at the architectural level of the SOA approach. More specifically, we have recently been successful in devising an alternative to the BCJR algorithm, which has the same error correction capability, but does not have any data dependencies. Owing to this, our algorithm can be mapped to highly parallel many-core processing architectures, facilitating an LTE turbo decoder processing throughput that is more than an order of magnitude higher than the SOA, satisfying future demands for gigabit throughputs. We will achieve this for the first time by developing a custom Field Programmable Gate Array (FPGA) architecture, comprising hundreds of processing cores that are interconnected using a reconfigurable Benes network. Furthermore, we will develop custom Network-on-Chip (NoC) architectures that facilitate different trade-offs between chip area, energy efficiency, reconfigurability, processing throughput and latency. In parallel with developing these high-performance custom implementation architectures, we will apply our novel algorithm to existing Graphics Processing Unit (GPU) and NoC architectures. This will allow us to progress rapidly, applying our novel algorithm not only to error correction, but also to all aspects of receiver operation, including demodulation, equalisation, source decoding, channel estimation and synchronisation. Drawing upon our high-throughput algorithms and highly parallel processing architectures, we will develop techniques for holistically optimising the algorithmic and implementational parameters of both the transmitter and receiver. This will facilitate practical high-performance schemes, which can pave the way for future generations of wireless communication.
This research addresses key EPSRC priorities in the Information and Communication Technologies theme (http://www.epsrc.ac.uk/ourportfolio/themes/ict), including 'Many-core architectures and concurrency in distributed and embedded systems' and 'Towards an intelligent information infrastructure'. The 'Working together' priority is also addressed, since this cross-disciplinary research will develop new knowledge that spans the gap between high-performance communication theory and high-performance hardware design. This research will offer new insights into the design of many-core architectures, which the hardware design community will be able to apply in the design of general-purpose architectures. Furthermore, the communication theory community will be able to apply our algorithms to an even wider range of receiver operations.
Research outputs
Mohammed El-Hajjar, Quoc Nguyen, Robert G. Maunder & Soon Xin Ng,
2014, IEEE Communications Magazine, 52(5), 194-201
Type: article
Yongkai Huo, Mohammed El-Hajjar, Robert G. Maunder & L. Hanzo,
2014, IEEE Transactions on Multimedia, 16(3), 697-710
Type: article
T. Wang, W. Zhang, R.G. Maunder & L. Hanzo,
2014, IEEE Transactions on Communications, 62(1), 280-292
Type: article
Yongkai Huo, Tao Wang, Robert G. Maunder & L. Hanzo,
2014, IEEE Transactions on Image Processing, 23(1), 319-331
Type: article
Yongkai Huo, Tao Wang, Robert G. Maunder & Lajos Hanzo,
2014, IEEE Communications Letters, 18(1), 90-93
Type: letterEditorial