Year
2020
Units
4.5
Contact
2 x 50-minute lectures weekly
1 x 50-minute workshop weekly
1 x 90-minute computer lab weekly
Enrolment not permitted
1 of BUSN1009, STAT1122, STAT1412, STAT8721 has been successfully completed
Topic description
  1. Design of experiments: experiments, observational studies, sample surveys; measurements & variables; replication & pseudo-replication
  2. Descriptive statistics: graphical & numerical summaries; the shape of a distribution; data screening & outliers
  3. Exploring relationships: predictor-response data; the least-squares line; residuals & transformations; prediction; the sample correlation coefficient; time series
  4. Probability: basic concepts; conditional probability & independence; random variables, the binomial distribution, the normal distribution
  5. Statistical inference: samples & populations; estimation, confidence intervals, hypothesis testing; inference for normal samples (one-sample); inference for proportions
  6. Data management, including selection, sampling and cleaning; big data storage and management issues; tools and techniques for exploratory data analysis and prediction; limitations and interpretations of inductive inference; case studies
Educational aims
This topic is directed towards students with little quantitative experience.

It serves as an introduction to the interdisciplinary and emerging field of data science. Students will learn how to combine tools and techniques from statistics, computer science and data visualization. The topic aims to impart an understanding of the key issues in the analysis of statistical data, together with practical experience in using a modern statistical package to perform elementary statistical analysis in a wide range of applications.
Expected learning outcomes
At the completion of this topic, students are expected to be able to:

  1. Gain an understanding of the statistical concepts and techniques presented
  2. Acquire quantitative confidence
  3. Gain an understanding of the management, visualization and preservation of large collections of data