Module overview
Data analysis is changing. New sources of data in a wide range of formats contain valuable information, but extracting this information is often challenging using traditional tools. This module introduces modern techniques for mining such data and demonstrates how they may be put into action. Methods for handling structured and unstructured data are discussed, including techniques for the analysis of textual data.
Aims and Objectives
Learning Outcomes
Transferable and Generic Skills
Having successfully completed this module, you will be able to:
- write technical reports that present results in a clear and reproducible manner.
 
Subject Specific Practical Skills
Having successfully completed this module, you will be able to:
- manage and extract information from unstructured data;
- use appropriate techniques to obtain data from the web in an ethical manner.
 
Knowledge and Understanding
Having successfully completed this module, you will be able to demonstrate knowledge and understanding of:
- storing, searching and analysing structured and unstructured data in a variety of formats.
 
Subject Specific Intellectual and Research Skills
Having successfully completed this module, you will be able to:
- select and apply methods for the analysis of textual data in Python, and interpret your results.
 
Syllabus
This module will cover:
- Data modalities
- Analysing structured and unstructured datasets
- Web scraping and Web crawling
- Feature extraction
- Indexing and information extraction
- Analysing unstructured text data; topic modelling
Learning and Teaching
Teaching and learning methods
Depending on feasibility, teaching may be delivered face to face intensively over a week, or online using a mixture of synchronous and asynchronous methods, which may include lectures, discussion boards, workshop activities, exercises, and videos. A range of resources will also be provided for further self-directed study.
| Type | Hours |
|---|---|
| Independent Study | 73 |
| Teaching | 27 |
| Total study time | 100 |
Resources & Reading list
Textbooks
Han, Jiawei; Kamber, Micheline; Pei, Jian (2012). Data Mining. Waltham, MA, USA: Elsevier.
Segaran, Toby (2007). Programming Collective Intelligence. Sebastopol, CA, USA: O'Reilly.
Assessment
Assessment strategy
100% Coursework
Summative
This is how we’ll formally assess what you have learned in this module.
| Method | Percentage contribution | 
|---|---|
| Assignment | 100% | 
Referral
This is how we’ll assess you if you don’t meet the criteria to pass this module.
| Method | Percentage contribution | 
|---|---|
| Assignment | 100% | 
Repeat Information
Repeat type: Internal & External