Research project

Continuous on-line adaptation in many-core systems: From graceful degradation to graceful amelioration

Project overview

Until recently, the ever-increasing demand for computing power has been met on the one hand by increasing the operating frequency of processors and on the other by designing ever more complex processors capable of executing more than one instruction at the same time. However, both approaches seem to be reaching (or may already have reached) their practical limits, mainly because of issues related to design complexity and cost-effectiveness.

The current trend in computer design seems to favour a shift to systems where computational power is achieved not by a single very fast and very complex processor, but through the parallel operation of several on-chip processors, each executing a single thread. This approach is implemented commercially today in multi-core processors and explored in research through the Network-on-Chip (NoC) and Chip Multi-Processor (CMP) paradigms. The natural evolution of these approaches sees the number of cores increasing constantly, and it is generally accepted that the next few decades will witness the introduction of many-core systems, that is, systems that integrate hundreds or thousands of cores.

This shift introduces problems common to all massively parallel systems. These range from the design of applications that can exploit large numbers of processors, to the technological challenges of implementing such cores in silicon substrates that are increasingly error-prone (because of their size and the growing sensitivity to faults of next-generation technologies), to the dissipation of the heat generated by computational activity in the cores. Current architectures are not suitable for this kind of system, and there is a strong need to devise novel mechanisms and technologies that will allow the development of many-core systems and eventually their commercialization as consumer products.

Imagine, then, a many-core system with thousands or millions of processors that gets better and better over time at executing an application, "gracefully" providing optimal power usage while maximizing performance and tolerating component failures. The proposed project aims to investigate how the mechanisms behind such behaviour can serve as crucial enabling technologies for many-core systems.

Specifically, this project focuses on how to overcome three critical issues related to the implementation of many-core systems: reliability, energy efficiency, and on-line optimisation. The need for reliability is an accepted challenge for many-core systems, given the large number of components and the increasing likelihood of faults in next-generation technologies, as is the requirement to reduce the heat dissipation related to energy consumption. On-line optimisation, by contrast, is the ability of the system to improve over time without external intervention (including becoming better at reliability and energy efficiency). It is a mechanism that could be vital to enable the implementation of these properties in systems that cannot be managed centrally because of the vast number of cores involved.

The proposed approach is centred around two basic processes. Graceful degradation implies that the system will be able to cope with faults (permanent or temporary) or potentially damaging power consumption peaks by lowering its performance. Graceful amelioration implies that the system will constantly seek alternative, better ways to execute an application.

Staff

Lead researchers

Professor Bashir Al-Hashimi CBE, FREng, FIEEE, FIET, FBCS

Research interests

  • Energy-efficient mobile computing systems
  • Low-power test and test-data compression of digital integrated circuits and energy-harvesting computing
  • Wearable and Autonomous Computing for Future Smart Cities

Other researchers

Professor Geoff Merrett PhD, BEng, PGCert, FHEA, SMIEEE, MIET

Professor

Research interests

  • Energy management of mobile/embedded systems
  • Self-powered computing
  • Internet of Things

Research outputs

Domenico Balsamo, Anup Das, Alex Weddell, Davide Brunelli, Bashir M. Al-Hashimi, Geoff V. Merrett & Luca Benini, 2016, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 35(5), 738-749
Type: article