Module overview
Working with data in its many forms is a crucial skill for all engineers and scientists. This module introduces students to analysing and processing different forms of data. It focusses on ensuring students have a thorough grasp of the appropriate use of statistical and graphical measures to make decisions about data, and of the basic practical tools and techniques required to filter, refine and query data. At its heart, the module provides the grounding students need to perform Exploratory Data Analysis (EDA).
Aims and Objectives
Learning Outcomes
Subject Specific Practical Skills
Having successfully completed this module you will be able to:
- Use tools to compute appropriate statistics from data
- Use SQL to create, update and query a database
- Use tools to produce graphs and visualisations of data
- Use computational tools to inspect, process, filter and refine data of different types
Subject Specific Intellectual and Research Skills
Having successfully completed this module you will be able to:
- Understand the broader context of Exploratory Data Analysis (EDA) as one part of answering questions involving data
- Choose the most appropriate approach to analyse a dataset for a given problem
Knowledge and Understanding
Having successfully completed this module, you will be able to demonstrate knowledge and understanding of:
- How to choose appropriate statistical measures or graphical diagrams to answer simple questions about data
- Appropriate techniques for pre-processing, filtering and refining data for analysis
- Different types of data and different ways of representing that data
Syllabus
Types of data and their representations, including:
- Structured vs unstructured
- Text vs binary
- Graphs
- Streams
- Tables
- Images, sequences & volumes
- Encodings (e.g. JSON, XML)
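As an illustration of how one structured record can be carried by different encodings, the following minimal sketch parses the same information from JSON and from XML using Python's standard library. The station record and its field names are hypothetical and are not taken from the module materials.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record: the same station data in two text encodings.
json_text = '{"station": "S1", "readings": [3.2, 4.1]}'
xml_text = "<station id='S1'><reading>3.2</reading><reading>4.1</reading></station>"

# JSON maps directly onto Python dictionaries and lists.
from_json = json.loads(json_text)

# XML is parsed into a tree of elements; values must be pulled out explicitly.
root = ET.fromstring(xml_text)
from_xml = {
    "station": root.get("id"),
    "readings": [float(r.text) for r in root.findall("reading")],
}

# Both encodings carry the same underlying structure.
same = from_json == from_xml
```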
Slicing data:
- Regular expressions
- Filtering
- Data pipelines
- Tools (e.g. grep, sed & awk, Python scripts, pandas)
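A small sketch of how regular expressions, filtering and a simple pipeline might fit together in a Python script is given here. The sample lines, the pattern and the 20.0 threshold are assumptions made for illustration only.

```python
import re

# Hypothetical sample records; not taken from the module materials.
lines = [
    "2024-01-03 sensor=A temp=21.5",
    "2024-01-03 sensor=B temp=error",
    "2024-01-04 sensor=A temp=19.0",
]

# Match only lines whose temperature field is a valid decimal number.
pattern = re.compile(r"sensor=(\w+) temp=(\d+(?:\.\d+)?)$")


def parse(records):
    """Yield (sensor, temperature) pairs, skipping lines that do not match."""
    for record in records:
        match = pattern.search(record)
        if match:
            yield match.group(1), float(match.group(2))


# Stage 1: extract well-formed readings. Stage 2: filter by value.
readings = list(parse(lines))
warm = [(sensor, temp) for sensor, temp in readings if temp > 20.0]
```

Writing each stage as a generator or a comprehension keeps the stages separate, in the same spirit as chaining grep, sed and awk in a shell pipeline.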
Extracting information:
- Descriptive statistics
- Introductory data visualisation
- Tools (e.g. numpy, scipy, matplotlib/seaborn)
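For a sense of what computing descriptive statistics looks like in practice, the sketch below uses Python's standard `statistics` module on a small made-up sample. NumPy and SciPy provide equivalent functions for larger arrays; the data values here are purely illustrative.

```python
import statistics

# Hypothetical sample of eight measurements.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)        # arithmetic mean
median = statistics.median(data)    # middle value (average of the two central values)
mode = statistics.mode(data)        # most frequent value
spread = statistics.pstdev(data)    # population standard deviation
```

The mean and median differ slightly here because the sample is skewed towards larger values, which is the kind of observation exploratory analysis is meant to surface.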
Relational data:
- The relational model
+ Relations, domains, attributes, keys, dependencies
+ Normalisation
- Practical SQL
+ Querying
+ Simple joins
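The relational topics above can be tried out with Python's built-in `sqlite3` module. The following is a minimal sketch, assuming a hypothetical two-table schema (`student` and `mark`) that is not part of the module materials; it creates the tables, inserts rows, and runs a query with a simple join.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: each mark references a student through a foreign key.
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE mark (
    student_id INTEGER NOT NULL REFERENCES student(id),
    component  TEXT NOT NULL,
    score      REAL NOT NULL
);
INSERT INTO student (id, name) VALUES (1, 'Ada');
INSERT INTO student (id, name) VALUES (2, 'Grace');
INSERT INTO mark (student_id, component, score) VALUES (1, 'coursework', 68.0);
INSERT INTO mark (student_id, component, score) VALUES (1, 'exam', 72.0);
INSERT INTO mark (student_id, component, score) VALUES (2, 'exam', 55.0);
""")

# Join the two tables on the key and keep only scores of 60 or above.
rows = conn.execute("""
SELECT s.name, m.component, m.score
FROM student AS s
JOIN mark AS m ON m.student_id = s.id
WHERE m.score >= 60
ORDER BY m.component
""").fetchall()

conn.close()
```

Splitting names and marks into separate tables linked by a key avoids repeating the student name on every mark, which is the motivation behind normalisation.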
The wider context:
- The role of EDA as an initial step in understanding the data related to a problem
Learning and Teaching
Teaching and learning methods
The module consists of:
- Lectures
- Guided self-study
- Labs as part of the AICE Lab Programme which will cover practical aspects
Type | Hours |
---|---|
Specialist Laboratory | 8 |
Guided independent study | 74 |
Completion of assessment task | 24 |
Revision | 12 |
Lecture | 32 |
Total study time | 150 |
Assessment
Summative
This is how we’ll formally assess what you have learned in this module.
Method | Percentage contribution |
---|---|
Lab work | 10% |
Coursework | 50% |
Exam | 40% |
Referral
This is how we’ll assess you if you don’t meet the criteria to pass this module.
Method | Percentage contribution |
---|---|
Lab Marks carried forward | 10% |
Exam | 50% |
Coursework | 40% |
Repeat
An internal repeat is where you take all of your modules again, including any you passed. An external repeat is where you only re-take the modules you failed.
Method | Percentage contribution |
---|---|
Coursework | 40% |
Lab Marks carried forward | 10% |
Exam | 50% |