Project overview
Until recently, the ever-increasing demand for computing power has been met on one hand by raising processor operating frequencies and on the other by designing increasingly complex processors capable of executing several instructions at the same time. Both approaches, however, appear to be reaching (or may already have reached) their practical limits, mainly because of design complexity and cost-effectiveness.
The current trend in computer design favours a shift to systems where computational power is achieved not by a single very fast, very complex processor, but through the parallel operation of several on-chip processors, each executing a single thread. This approach is implemented commercially today in multi-core processors and explored in research through the Network-on-Chip (NoC) and Chip Multiprocessor (CMP) paradigms. Its natural evolution sees the number of cores increasing steadily, and it is generally accepted that the coming decades will witness the introduction of many-core systems, that is, systems that integrate hundreds or thousands of cores.
This shift introduces problems common to all massively parallel systems. They range from designing applications that can exploit large numbers of processors to technological challenges: implementing such cores in silicon substrates that are increasingly error-prone, owing to their size and to the growing fault sensitivity of next-generation technologies, and dissipating the heat generated by the cores' computational activity. Current architectures are not suitable for this kind of system, and novel mechanisms and technologies must be devised to enable the development of many-core systems and, eventually, their commercialisation as consumer products.
Imagine, then, a many-core system with thousands or millions of processors that improves over time at executing an application, "gracefully" providing optimal power usage while maximising performance and tolerating component failures. The proposed project aims to investigate how such mechanisms can become crucial enabling technologies for many-core systems.
Specifically, this project focuses on overcoming three critical issues in the implementation of many-core systems: reliability, energy efficiency, and on-line optimisation. The need for reliability is an accepted challenge for many-core systems, given the large number of components and the increasing likelihood of faults in next-generation technologies, as is the requirement to reduce the heat dissipation associated with energy consumption. On-line optimisation, that is, the ability of the system to improve over time without external intervention (including becoming more reliable and more energy-efficient), could be vital to realising these properties in systems that cannot be managed centrally because of the vast number of cores involved.
The proposed approach is centred on two basic processes. Graceful degradation means that the system copes with faults (permanent or temporary) or with potentially damaging power-consumption peaks by lowering its performance. Graceful amelioration means that the system constantly seeks alternative, better ways to execute an application.
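To make the two processes concrete, the sketch below shows one plausible shape for such a runtime loop. It is a minimal illustration, not the project's design: the toy cost model, the DVFS steps in FREQ_LEVELS, and the helper names degrade and ameliorate are all hypothetical, and a real system would measure energy and latency rather than compute a synthetic score.

```python
import random

random.seed(1)

FREQ_LEVELS = [0.5, 1.0, 1.5, 2.0]  # hypothetical DVFS steps (GHz)

class Core:
    def __init__(self, cid):
        self.cid = cid
        self.freq = FREQ_LEVELS[-1]  # start at full speed
        self.alive = True

# Twelve abstract tasks with random weights, mapped onto four cores.
task_weights = {f"t{i}": random.uniform(0.5, 2.0) for i in range(12)}
cores = [Core(i) for i in range(4)]
mapping = {t: random.randrange(len(cores)) for t in task_weights}

def cost(mp):
    """Toy cost: load imbalance plus a penalty for high frequencies.
    A real runtime would measure energy and execution time instead."""
    load = {c.cid: 0.0 for c in cores if c.alive}
    for t, cid in mp.items():
        load[cid] += task_weights[t]
    power = sum(c.freq ** 2 for c in cores if c.alive)
    return max(load.values()) - min(load.values()) + 0.1 * power

def degrade(event):
    """Graceful degradation: trade performance for survival."""
    if event["kind"] == "fault":               # permanent core fault
        dead = cores[event["core"]]
        dead.alive = False
        spare = [c.cid for c in cores if c.alive]
        for t, cid in mapping.items():         # remap orphaned tasks
            if cid == dead.cid:
                mapping[t] = random.choice(spare)
    elif event["kind"] == "power_peak":        # power/thermal emergency
        for c in cores:
            if c.alive and c.freq > FREQ_LEVELS[0]:
                c.freq = FREQ_LEVELS[FREQ_LEVELS.index(c.freq) - 1]

def ameliorate():
    """Graceful amelioration: perturb the task mapping and keep the
    change only if it lowers the cost."""
    candidate = dict(mapping)
    t = random.choice(list(candidate))
    candidate[t] = random.choice([c.cid for c in cores if c.alive])
    if cost(candidate) < cost(mapping):
        mapping.update(candidate)

# Control loop: react to injected events, otherwise keep improving.
for step in range(100):
    if step == 30:
        degrade({"kind": "fault", "core": 2})
    if step == 60:
        degrade({"kind": "power_peak"})
    ameliorate()

print(f"final cost: {cost(mapping):.2f}")
```

In this sketch, graceful degradation is event-driven (retiring a faulty core and remapping its tasks, or stepping frequencies down during a power peak), while graceful amelioration runs continuously, keeping a perturbed task mapping only when it lowers the cost. This keep-if-better rule is what would let such a system improve over time without central coordination.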
Staff
Lead researchers
Other researchers
Collaborating research institutes, centres and groups
Research outputs
Indar Sugiarto, Stephen Furber, Delong Shang, Amit Singh, Bassem Ouni, Geoff Merrett & Bashir Al-Hashimi (2018). Conference paper.
Basireddy Karunakar Reddy, Amit Singh, Dwaipayan Biswas, Geoff Merrett & Bashir Al-Hashimi (2017). IEEE Transactions on Multiscale Computing Systems, pp. 1-14. Journal article.
Charles Leech, Charan Kumar Vala, Amit Acharyya, Sheng Yang, Geoffrey Merrett & Bashir Al-Hashimi (2017). ACM Transactions on Embedded Computing Systems. Journal article.
Alberto Rodriguez Arreola, Domenico Balsamo, Luo Zhenhua, Stephen Beeby, Geoff V. Merrett & Alex S. Weddell (2017). Conference paper.