Programming for Data Science

Course details

Code
O19P732COW
Fees
From £285.00
Credit
10 CATS points

Dates
22 Jan 2020 - 25 Mar 2020
Sessions
10
Day of week
Wednesday
Time
7:00-9:00pm

Overview

Data science is a discipline that uses scientific methods, processes and algorithms to extract meaningful information, knowledge and insights from structured and unstructured data.

The aim of this course is to provide insight into intermediate and advanced data science topics using the Python programming language. The course will explore concepts such as statistical modelling, machine learning and data management from a practical, hands-on point of view. The focus will be on tools and methods rather than on their theoretical basis, so that the course is accessible to an audience with a minimal mathematical background.

Experience of using a programming or scripting language is a must. Students should have mastered all the concepts covered in the course "Introduction to Python - Programming for Data Science".

In order to complete the assignment (and in order to get the full benefit from the course) students will need access to a computer capable of running the open source software used in the course and access to the Internet. A limited amount of class time will be allocated to working on the class assignment, so students should ensure that they have access to a computer outside of class.

Programme details

Term starts: 22nd January

Week 1:   Introduction to probability and distributions. Regression.

Week 2:   Introduction to statistical tests: t-test and one-way ANOVA; SciPy and statsmodels

Week 3:   Machine Learning: supervised learning with scikit-learn

Week 4:   Machine Learning: unsupervised learning with scikit-learn

Week 5:   Deep Learning: image recognition with Keras

Week 6:   Deep Learning: text manipulation with Keras

Week 7:   Principles of data management and metadata tracking systems for data science

Week 8:   Big Data: the MapReduce Abstraction

Week 9:   Principles of study design I + final assignment

Week 10:  Principles of study design II + final assignment

Certification

Students who register for CATS points will receive a Record of CATS points on successful completion of their course assessment.

To earn credit (CATS points) you will need to register and pay an additional £10 fee per course. You can do this by ticking the relevant box at the bottom of the enrolment form or when enrolling online.

Coursework is an integral part of all weekly classes and everyone enrolled will be expected to do coursework in order to benefit fully from the course. Only those who have registered for credit will be awarded CATS points for completing work at the required standard.

Students who do not register for CATS points during the enrolment process can register either before the start of their course or retrospectively, between January 1st and July 31st, after the current academic year has been completed. If you are enrolled on the Certificate of Higher Education, you need to indicate this on the enrolment form, but there is no additional registration fee.

Fees

Description                         Costs
Course Fee                          £285.00
Take this course for CATS points    £10.00

Tutor

Dr Massi Izzo

Massimiliano Izzo is a Research Software Engineer in the Department of Engineering Science, University of Oxford. He has a Doctorate in Biomedical Engineering from the University of Genoa, Italy, and currently works in the department's FAIR Data Science team on developing innovative data models for the life sciences.

Course aims

1. To become familiar with statistical modelling and statistical tests in Python.

2. To learn how to use a variety of machine learning algorithms to extract features from data using Python libraries.

3. To gain insight into how to handle scaling issues in a "big data" scenario.

4. To become familiar with a few core concepts of study design and experiment planning for data science.

Teaching methods

Each week's session will consist of lectures, hands-on programming exercises, class discussions and interactive programming demonstrations by the lecturer.

Learning outcomes

At the end of the course, students will be able:

  • to identify the correct statistical test for a specific scientific hypothesis
  • to use Python machine learning libraries to build, train and test simple classifiers
  • to collect the essential metadata to ensure reproducibility of the analysis they have run
  • to distribute a task over multiple machines in a "big data" scenario using the MapReduce model

Assessment methods

Students will be asked to complete a capstone project for their coursework assignment. Two hours of class time will be allocated to working on the capstone project.

In order to complete the assignment (and in order to get the full benefit from the course) students will need access to a computer capable of running the open source software used in the course and access to the Internet. Only a limited amount of class time will be allocated to working on the assignment, so students should ensure that they have access to a computer outside of class.

Students must submit a completed Declaration of Authorship form at the end of term when submitting their final piece of work. CATS points cannot be awarded without this form.

Application

To earn credit (CATS points) for your course you will need to register and pay an additional £10 fee per course. You can do this by ticking the relevant box at the bottom of the enrolment form or when enrolling online.

Please use the 'Book' or 'Apply' button on this page. Alternatively, please complete an application form.

Level and demands

Experience of using a programming or scripting language is a must. Students should have mastered all the concepts covered in the course "Introduction to Python - Programming for Data Science".

Most of the Department's weekly classes have 10 or 20 CATS points assigned to them. 10 CATS points at FHEQ Level 4 usually consist of ten 2-hour sessions. 20 CATS points at FHEQ Level 4 usually consist of twenty 2-hour sessions. It is expected that, for every two hours of tuition you are given, you will engage in eight hours of private study.

Credit Accumulation and Transfer Scheme (CATS)