AI-based multi-modal 3D environment understanding and visualisation

About the project

This project aims to develop a practical AI-based solution for understanding 3D environments from multi-modal (audio/visual) input data and reproducing them in a virtual or augmented reality space, allowing real-time 3D interaction with spatial audio adapted to the environment and user locations.

Computer vision is one of the most active application areas of artificial intelligence (AI). The field is expanding rapidly and attracting considerable interest and investment. Active perception of a surrounding environment through AI relies heavily on the design of architectures and their extensive training to generate compact representations. Thanks to recent advances in AI technology, these representations have significantly improved how AI agents build new knowledge and acquire new skills, with practical applications in our daily life.

Scene understanding studies the task of representing a captured scene in a manner that emulates human-like understanding of that space. Attaining this understanding is crucial for applications such as robotics, telecommunication, smart homes, healthcare and assisted living.

In this project, you will join a team working on a pipeline for modelling and rendering the full environment, including 3D geometry, semantic objects and material attributes, from multi-modal inputs such as video, audio and text. You will investigate topics in AI-based multi-modal 3D environmental scene understanding and visualisation.

You will have various opportunities to attend the British vision summer school or major international conferences such as the Conference on Computer Vision and Pattern Recognition (CVPR) and the International Conference on Computer Vision (ICCV).

Potential supervisors

Lead supervisor

Dr Hansung Kim

Associate Professor

Research interests

3D Computer Vision
Artificial intelligence (AI) for scene understanding
Audio-visual data processing

Supervisors

Dr Rahman Attar SMIEEE, MIET, FHEA, PhD, MPhil, BEng

Lecturer

Entry requirements

You must have a UK 2:1 honours degree, or its international equivalent, in computer vision and machine learning, and be proficient in Python.

Experience with camera or VR systems and experience of publishing academic papers are desirable but not essential.

Applicants without an MSc or MEng in computer vision or machine learning will need to provide strong justification that they would be able to complete a PhD in this field.

Fees and funding

We offer a range of funding opportunities for both UK and international students. Horizon Europe fee waivers automatically cover the difference between overseas and UK fees for qualifying students.

Competition-based Presidential Bursaries from the University cover the difference between overseas and UK fees for top-ranked applicants.

Competition-based studentships offered by our schools typically cover UK-level tuition fees and a stipend for living costs (minimum of £19,237 in 2024-25) for top-ranked applicants.

Funding will be awarded on a rolling basis, so apply early for the best opportunity to be considered.

How to apply

Apply now

You need to:

choose programme type (Research), 2025/26, Faculty of Engineering and Physical Sciences
select Full time or Part time
choose the relevant PhD in Computer Science
add the name of the supervisor in section 2

Applications should include:

personal statement
your CV (resumé)
2 academic references
degree transcripts to date

Contact us

Faculty of Engineering and Physical Sciences

If you have a general question, email our doctoral college (feps-pgr-apply@soton.ac.uk).

Project leader

For an initial conversation, email Dr Hansung Kim (h.kim@soton.ac.uk).
