R for Statistical Significance Tests

Overview

R is a programming language that is approachable even for those with no previous programming experience. It is a high-level language, which means its instructions and commands are human-readable. It is known for its large worldwide community and its strength in data and statistical analysis.

In this online day school you will learn how to perform several widely used hypothesis and statistical significance tests and how to interpret the results correctly.

The day assumes prior knowledge of how to use R and RStudio. It covers the explanation and implementation of several tests, such as:

  • t-test to determine whether there is a significant difference between the means of two groups (which may be independent or related, as in paired samples), or whether the mean of a single sample differs from a given value. Variations of the t-test are included.
  • Kolmogorov-Smirnov test to test whether a variable follows a given distribution.
  • A/B test to establish which of two treatments, products, procedures, or the like is superior.
  • Permutation test to compare an observed statistic to a resampled distribution and determine whether an observed difference between samples might occur by chance.
  • ANOVA to test whether the means of several groups differ, that is, whether groupings in the data are a meaningful way to understand its structure.
  • Chi-squared test to test for an association between the categorical variables in a contingency table.
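
To give a flavour of what these tests look like in practice, here is a minimal sketch in base R. It is not taken from the course material: the data are simulated, and the variable names (group_a, group_b) and the chosen means and standard deviations are assumptions made purely for illustration.

```r
# Simulated data; values and names are illustrative only.
set.seed(42)
group_a <- rnorm(30, mean = 10, sd = 2)
group_b <- rnorm(30, mean = 12, sd = 2)

# One-sample t-test: does the mean of group_a differ from 10?
one_sample <- t.test(group_a, mu = 10)

# Two-sample t-test (Welch's version is the default in R):
# do the means of the two groups differ?
two_sample <- t.test(group_a, group_b)

# Paired t-test: treats the i-th value of each vector as a
# measurement on the same unit.
paired <- t.test(group_a, group_b, paired = TRUE)

# Kolmogorov-Smirnov test: is group_a compatible with a normal
# distribution with the stated mean and standard deviation?
ks <- ks.test(group_a, "pnorm", mean = 10, sd = 2)

# Each result is a list-like object; the p-value is one component.
print(two_sample$p.value)
print(ks$p.value)
```

A small p-value suggests that the observed data would be unusual if the null hypothesis were true; what counts as "small" depends on the significance level chosen before running the test.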

This training covers various theoretical and practical aspects of several hypothesis and statistical significance tests. You will learn what a test does, when to use it, how to use it and how to interpret its results.

By the end of the day you will have access to all course material (e.g. slides and code examples).

Please note: this event will close to enrolments at 23:59 UTC on 14 February 2024.

Programme details

All times GMT (UTC)

10am:  
Introduction and basic concepts:

  • What are hypothesis and statistical significance tests? Why do we need them?
  • What is a p-value? How do we interpret it?
  • The two types of error (type I and type II errors)
  • Paired sample vs independent sample
  • Data types and distributions
  • Kolmogorov-Smirnov test
  • Shapiro-Wilk test

11.20am:
Break

11.40am:
T-test, A/B and permutation tests:

  • One-sample t-test
  • Two-sample t-test
  • A/B testing
  • Why have a control group?
  • Resampling and resampling techniques
  • Permutation test
  • More on the p-value and how to interpret it
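
As an illustration of the resampling idea behind this session, the following is a minimal sketch of a two-sample permutation test in base R. The measurements are made up for the example, and the names (control, treatment, n_perm) are hypothetical rather than taken from the course material.

```r
# Made-up measurements for two groups (illustrative only).
set.seed(1)
control   <- c(12.1, 9.8, 11.4, 10.2, 10.9, 9.5, 11.0, 10.4)
treatment <- c(12.9, 11.7, 13.2, 10.8, 12.4, 11.9, 13.0, 12.2)

# Observed statistic: difference between the group means.
observed <- mean(treatment) - mean(control)

# Under the null hypothesis the group labels are exchangeable,
# so we shuffle the pooled values and recompute the statistic.
pooled  <- c(control, treatment)
n_treat <- length(treatment)
n_perm  <- 5000

perm_diffs <- replicate(n_perm, {
  shuffled <- sample(pooled)
  mean(shuffled[1:n_treat]) - mean(shuffled[-(1:n_treat)])
})

# Two-sided p-value: share of shuffles at least as extreme as the
# observed difference (the +1 counts the observed arrangement itself).
p_value <- (sum(abs(perm_diffs) >= abs(observed)) + 1) / (n_perm + 1)
print(p_value)
```

The p-value here is an estimate based on random shuffles, so it varies slightly with the seed and with the number of permutations.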

1pm:
Lunch break

2pm:
ANOVA, Chi-squared and Fisher’s Exact Tests:

  • Does the data contain groups?
  • What is the ANOVA test?
  • Multivariate ANOVA (MANOVA)
  • Contingency tables and how to create them
  • Chi-squared test
  • Fisher’s exact test
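
For orientation, here is a minimal sketch of building a contingency table and testing it in base R. The counts and the labels (group, outcome) are hypothetical and chosen only to show the function calls.

```r
# Hypothetical 2 x 2 table of counts (illustrative only).
counts <- matrix(
  c(30, 20,
    15, 35),
  nrow = 2, byrow = TRUE,
  dimnames = list(group = c("A", "B"),
                  outcome = c("success", "failure"))
)

# Chi-squared test of independence between group and outcome.
# For 2 x 2 tables R applies Yates' continuity correction by default.
chi <- chisq.test(counts)

# Fisher's exact test, often preferred when expected counts are small.
fisher <- fisher.test(counts)

# Expected counts under independence; a common rule of thumb is that
# the chi-squared approximation is reasonable when these are all >= 5.
print(chi$expected)
print(chi$p.value)
print(fisher$p.value)
```

With raw data rather than counts, table() applied to two factor columns produces the same kind of contingency table.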

3.20pm:
Break

3.40pm:
Kruskal-Wallis and Mann-Whitney tests and how to select a test:

  • More on data groupings
  • The Kruskal-Wallis test
  • The Mann-Whitney U test
  • A strategy for selecting the correct test for your data
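
The two rank-based tests in this session can be sketched in base R as follows. The data are simulated and the group names (low, mid, high) are assumptions made for the example, not part of the course material.

```r
# Simulated scores for three groups (illustrative only).
set.seed(7)
low  <- rnorm(15, mean = 5)
mid  <- rnorm(15, mean = 6)
high <- rnorm(15, mean = 8)

scores <- data.frame(
  value = c(low, mid, high),
  group = factor(rep(c("low", "mid", "high"), each = 15))
)

# Kruskal-Wallis test: a rank-based alternative to one-way ANOVA
# for comparing more than two independent groups.
kw <- kruskal.test(value ~ group, data = scores)

# Mann-Whitney U test: a rank-based alternative to the two-sample
# t-test; in R it is run with wilcox.test() on two independent samples.
mw <- wilcox.test(low, high)

print(kw$p.value)
print(mw$p.value)
```

Both tests avoid the normality assumption of their parametric counterparts, which is one consideration when selecting a test.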

4.40pm:
Conclusion and wrap up

5pm:
End of day

Fees

Course fee: £115.00

Funding

If you are in receipt of a UK state benefit or are a full-time student in the UK you may be eligible for a reduction of 50% of tuition fees.

Concessionary fees for short courses

Tutor

Dr Noureddin Sadawi

Dr Noureddin Sadawi specialises in machine/deep learning and data science. He has several years’ experience in various areas involving data manipulation and analysis. He received his PhD from the University of Birmingham. He is the winner of two international scientific software development contests, at TREC2011 and CLEF2012.

Noureddin is an avid scientific software researcher and developer, and an experienced data analyst, with a passion for learning and teaching new technologies. Over the last few years, he has been using R and Python as his preferred programming languages.

He has also been involved in several projects spanning a variety of fields such as bioinformatics, textual/image/video data analysis, drug discovery, omics data analysis and computer network security. He has taught at multiple universities in the UK and has worked as a software engineer in different roles. He currently holds the following part-time roles: senior content developer and lecturer at the University of London; international trainer with O'Reilly and Pearson; short course trainer and instructor at Goldsmiths, University of London; and lecturer at the University of Oxford. He is the founder of SoftLight LTD, a London-based company that specialises in data science and machine/deep learning, where he works as a consultant providing advice and expertise in these areas. He is also a member of the organising committee of this international conference: https://ilcict.ly/. A list of his publications can be found here.

Application

Please use the 'Book' button on this page. Alternatively, please contact us to obtain an application form. 

IT requirements

The University of Oxford uses Microsoft Teams for its learning environment, where students and tutors discuss and interact in real time. Joining instructions will be sent out prior to the start date. We recommend that you join the session at least 10-15 minutes prior to the start time – just as you might arrive a bit early at our lecture theatre for an in-person event.

If you have not used the Microsoft Teams app before, you will be invited to download it (this is free) once you click the joining link. Once you have downloaded the app, please test it before the start of your course. If you are using a laptop or desktop computer, you will also be offered the option of connecting using a web browser. If you connect via a web browser, Chrome is recommended.

Please note that this course will not be recorded.