# Methods in Biostatistics II – JHSPH 140.652.01

This 2006 JHSPH OpenCourseWare graduate course is the second half of a course presenting fundamental concepts in applied probability, exploratory data analysis, and statistical inference, focusing on probability and analysis of one and two samples.

Topics include discrete and continuous probability models; expectation and variance; the central limit theorem; inference, including hypothesis testing and confidence intervals for means, proportions, and counts; maximum likelihood estimation; sample size determination; elementary non-parametric methods; graphical displays; and data transformations.

**Learning Objectives**

Over the course of this educational session you'll:

1) reacquaint yourself with the mathematical, computational, statistical, and probability background needed to complete the course.

2) be introduced to the display and communication of statistical data. This will include graphical and exploratory data analysis using tools like scatterplots, boxplots, and the display of multivariate data. This objective requires extensive writing.

3) learn the distinctions between the fundamental paradigms underlying statistical methodology.

4) learn the basics of maximum likelihood.

5) learn the basics of frequentist methods: hypothesis testing, confidence intervals.

6) learn basic Bayesian techniques, interpretation and prior specification.

7) learn the creation and interpretation of P values.

8) learn estimation, testing, and interpretation for single group summaries such as means, medians, variances, correlations, and rates.

9) learn estimation, testing, and interpretation for two group comparisons such as odds ratios, relative risks, and risk differences.

10) learn the basic concepts of ANOVA.

**Course Organization**

This course consists of 14 lectures. Note that the lecture numbers continue from the first course, running from 15 to 28. PDFs of lecture notes are available for all of the lectures, and additional handouts are available for a few of them. However, no homework, video, or audio files are included. A final exam was likely given for this course, but it is not available through OCW.

**Further Reading**

The following texts are listed as required or recommended for the course:

*Required*

- Rosner, Bernard. *Fundamentals of Biostatistics*. 6th edition, 2006.

This is a practical introduction to the methods, techniques, and computation of statistics with human subjects. It prepares students for their future courses and careers by introducing the statistical methods most often used in medical literature. Rosner minimizes the amount of mathematical formulation (algebra-based) while still giving complete explanations of all the important concepts. As in previous editions, a major strength of this book is that every new concept is developed systematically through completely worked out examples from current medical research problems. (Chapters 1–3 found here as a PDF.)

Sections of the Rosner text are assigned as reading for most lectures. See each session for what was assigned.

*Recommended*

- Rice, John A. *Mathematical Statistics and Data Analysis*. 2nd edition, 1995. (PDF)

This is the first text in a generation to re-examine the purpose of the mathematical statistics course. The book's approach interweaves traditional topics with data analysis and reflects the use of the computer with close ties to the practice of statistics. The author stresses analysis of data, examines real problems with real data, and motivates the theory. The book's descriptive statistics, graphical displays, and realistic applications stand in strong contrast to traditional texts which are set in abstract settings.

- Venables, WN; Ripley, BD. *Modern Applied Statistics with S*. 4th edition, 2002. (PDF)

This book is a guide to using S-PLUS to perform statistical analyses and provides both an introduction to the use of S-PLUS and a course in modern statistical methods. S-PLUS is available for both Windows and UNIX workstations, and both versions are covered in depth. The aim of the book is to show how to use S-PLUS as a powerful and graphical data analysis system. Readers are assumed to have a basic grounding in statistics; as such, the book is intended for would-be users of S-PLUS and for both students and researchers using statistics. Throughout, the emphasis is on presenting practical problems and full analyses of real data sets.

- McDavid, James C. *Program Evaluation and Performance Measurement: An Introduction to Practice*. Thousand Oaks, CA: SAGE Publications, 2005.

- Marshall, Elliot. *Flawed Statistics in Murder Trial May Cost Expert His Medical License*. Science 2005; 309:543.

- Wainer, Howard. *How to Display Data Badly.* The American Statistician 1984; 38:137-147. (PDF)

**Other Requirements**

Calculus, linear algebra, and a moderate level of mathematical literacy are prerequisites for this class. Note that simply having the prerequisites for this class does not necessarily mean that it is the correct class for you.

**Additional Course Resources**

The following resources may be useful to you as you progress throughout the course:

» R for Windows and Mac: R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows, and MacOS. To download R, choose your preferred CRAN mirror.

» WinEdt: WinEdt (shareware) is an ASCII editor and shell for MS Windows with a strong predisposition towards the creation of [La]TeX documents.

» John Verzani's Notes: simpleR - Using R for Introductory Statistics (PDF)

» J.H. Maindonald's Notes: Using R for Data Analysis and Graphics: Introduction, Code and Commentary (PDF)

» Julian J. Faraway's Notes: Practical Regression and ANOVA using R (PDF)

» Patrick Burns' Notes: A Guide for the Unwilling S User (PDF)

» Jonathan Baron's R Reference Card (PDF)

» Tom Short's R Reference Card (PDF)

» Karl Broman's R Site (Archived)
