# R for Statistical Significance Tests

## Overview

R is a programming language that is approachable even for those with no previous programming experience. It is a high-level language, which means its instructions and commands are human-readable. It is known for its large worldwide community and its strength in data and statistical analysis.

In this online day school you will learn how to perform several widely used hypothesis and statistical significance tests and how to interpret the results correctly.

The day assumes prior knowledge of how to use R and RStudio. It covers the explanation and implementation of several tests, such as:

• t-test to determine whether there is a significant difference between the means of two groups (which may be related in certain features), or whether the mean of a vector differs from a given value. Variations of the t-test are included.
• Kolmogorov-Smirnov test to test whether a variable follows a given distribution.
• A/B test to establish which of two treatments, products, procedures, or the like is superior.
• Permutation test to compare an observed statistic to a resampled distribution and determine whether an observed difference between samples might occur by chance.
• ANOVA to test whether group means differ, that is, whether groupings in the data are meaningful ways to understand its structure.
• Chi-squared test to test for differences across a contingency table.
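
To give a flavour of the first item in the list above, here is a minimal sketch of the t-test variations in base R. It is an illustration only and is not taken from the course material; it assumes the built-in `sleep` dataset, and the variable names are chosen for this example.

```r
# Illustrative sketch only; not taken from the course material.
# 'sleep' ships with base R: extra hours of sleep for 10 patients,
# each measured under two drugs (group 1 and group 2).
g1 <- sleep$extra[sleep$group == 1]
g2 <- sleep$extra[sleep$group == 2]

# One-sample t-test: does the mean of g1 differ from 0?
one_sample <- t.test(g1, mu = 0)

# Two-sample (Welch) t-test: treats the two groups as independent.
welch <- t.test(g1, g2)

# Paired t-test: uses the fact that the same patients appear in both groups.
paired <- t.test(g1, g2, paired = TRUE)

# Collect the p-values side by side for comparison.
p_values <- c(one_sample = one_sample$p.value,
              welch      = welch$p.value,
              paired     = paired$p.value)
print(round(p_values, 4))
```

On these data the paired test gives a much smaller p-value than the Welch test, which is why it matters whether the samples are paired or independent.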

This training covers various theoretical and practical aspects of several hypothesis and statistical significance tests. You will learn what a test does, when to use it, how to use it and how to interpret its results.

By the end of the day you will have access to all course material, including slides and code examples.

Please note: this event will close to enrolments at 23:59 UTC on 14 February 2024.

## Programme details

All times GMT (UTC)

10am:
Introduction and basic concepts:

• What are hypothesis and statistical significance tests? Why do we need them?
• What is a p-value? How do we interpret it?
• The two types of error (type I and type II errors)
• Paired sample vs independent sample
• Data types and distributions
• Kolmogorov-Smirnov test
• Shapiro-Wilk test
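
As a rough illustration of the two distribution tests named in this session, the sketch below applies them to simulated data in base R. The sample sizes, seed and distribution parameters are assumptions made for this example, not course content.

```r
# Illustrative sketch only; simulated data, not course material.
set.seed(42)
x_norm <- rnorm(200, mean = 5, sd = 2)  # drawn from a normal distribution
x_exp  <- rexp(200, rate = 1)           # drawn from a skewed distribution

# Kolmogorov-Smirnov test against a fully specified normal distribution.
# The parameters are known here; estimating them from the same sample
# would make the reported p-value unreliable.
ks_norm <- ks.test(x_norm, "pnorm", mean = 5, sd = 2)

# Shapiro-Wilk test of normality for each sample.
sw_norm <- shapiro.test(x_norm)
sw_exp  <- shapiro.test(x_exp)

print(ks_norm$p.value)
print(sw_norm$p.value)
print(sw_exp$p.value)
```

A small p-value is evidence against the assumed distribution; a large one means only that the test found no evidence against it.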

11.20am:
Break

11.40am:
T-test, A/B and permutation tests:

• One-sample t-test
• Two-sample t-test
• A/B testing
• Why have a control group?
• Resampling and resampling techniques
• Permutation test
• More on the p-value and how to interpret it
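
The permutation test in this session can be sketched in a few lines of base R. This is one possible implementation under stated assumptions, not the course's own code: it uses the built-in `mtcars` dataset, the difference in mean `mpg` between transmission types as the statistic, and 5,000 random relabellings.

```r
# Illustrative sketch only; not taken from the course material.
set.seed(1)
mpg <- mtcars$mpg
am  <- mtcars$am   # 0 = automatic, 1 = manual

# Observed difference in mean mpg between the two groups.
observed <- mean(mpg[am == 1]) - mean(mpg[am == 0])

# Shuffle the group labels many times and recompute the statistic,
# building the distribution expected if the labels did not matter.
n_perm <- 5000
perm_diffs <- replicate(n_perm, {
  shuffled <- sample(am)
  mean(mpg[shuffled == 1]) - mean(mpg[shuffled == 0])
})

# Two-sided p-value: share of shuffles at least as extreme as observed.
# Adding 1 to numerator and denominator avoids reporting exactly zero.
p_value <- (sum(abs(perm_diffs) >= abs(observed)) + 1) / (n_perm + 1)
print(p_value)
```

Because the labels are shuffled rather than the values modelled, this approach makes no assumption about the shape of the underlying distribution.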

1pm:
Lunch break

2pm:
ANOVA, Chi-squared and Fisher’s Exact Tests:

• Do the data contain groups?
• What is the ANOVA test?
• Multivariate ANOVA (MANOVA)
• Contingency tables and how to create them
• Chi-squared test
• Fisher’s exact test
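
A minimal base R sketch of the tests in this session follows. It is an illustration rather than course content, and it assumes the built-in `PlantGrowth` and `mtcars` datasets; the object names are chosen for this example.

```r
# Illustrative sketch only; not taken from the course material.

# One-way ANOVA: does mean plant weight differ across the three groups?
fit <- aov(weight ~ group, data = PlantGrowth)
anova_p <- summary(fit)[[1]][["Pr(>F)"]][1]

# Contingency table of transmission type against engine shape.
tab <- table(transmission = mtcars$am, engine = mtcars$vs)
print(tab)

# Chi-squared test of independence (Yates' correction is applied
# by default for 2 x 2 tables).
chi <- chisq.test(tab)

# Fisher's exact test on the same table, often preferred when
# expected cell counts are small.
fis <- fisher.test(tab)

print(c(anova = anova_p, chi_squared = chi$p.value, fisher = fis$p.value))
```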

3.20pm:
Break

3.40pm:
Kruskal-Wallis and Mann-Whitney tests and how to select a test:

• More on data groupings
• The Kruskal-Wallis test
• The Mann-Whitney U test
• A strategy for selecting the correct test for your data
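
The two rank-based tests in this session can be sketched in base R as follows. This is an illustration under stated assumptions, not course material: it reuses the built-in `PlantGrowth` and `mtcars` datasets, and `exact = FALSE` is set because `mpg` contains tied values.

```r
# Illustrative sketch only; not taken from the course material.

# Kruskal-Wallis test: a rank-based alternative to one-way ANOVA
# for comparing three or more groups.
kw <- kruskal.test(weight ~ group, data = PlantGrowth)

# Mann-Whitney U test (called the Wilcoxon rank-sum test in R):
# a rank-based alternative to the two-sample t-test.
# exact = FALSE requests the normal approximation, since ties in mpg
# rule out an exact p-value.
mw <- wilcox.test(mpg ~ am, data = mtcars, exact = FALSE)

print(c(kruskal_wallis = kw$p.value, mann_whitney = mw$p.value))
```

Both tests work on ranks rather than raw values, so they are common choices when a normality assumption is doubtful.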

4.40pm:
Conclusion and wrap up

5pm:
End of day

## Fees

| Description | Costs |
|---|---|
| Course Fee | £115.00 |

## Funding

If you are in receipt of a UK state benefit or are a full-time student in the UK you may be eligible for a reduction of 50% of tuition fees.

Concessionary fees for short courses

## Tutor

Dr Noureddin Sadawi specialises in machine/deep learning and data science. He has several years’ experience in various areas involving data manipulation and analysis. He received his PhD from the University of Birmingham. He is the winner of two international scientific software development contests, at TREC2011 and CLEF2012.

Noureddin is an avid scientific software researcher and developer with a passion for learning and teaching new technologies, and an experienced data analyst. Over the last few years, R and Python have been his preferred programming languages.

He has also been involved in several projects spanning a variety of fields such as bioinformatics, textual/image/video data analysis, drug discovery, omics data analysis and computer network security. He has taught at multiple universities in the UK and has worked as a software engineer in different roles. He currently holds the following part-time roles: senior content developer and lecturer at the University of London; international trainer with O'Reilly and Pearson; short course trainer and instructor at Goldsmiths University, London; and lecturer at the University of Oxford. He is the founder of SoftLight LTD, a London-based company specialising in data science and machine/deep learning, where he works as a consultant providing advice and expertise in these areas. He is also a member of the organising committee of this international conference: https://ilcict.ly/. A list of his publications can be found here.