Preface

In 1948, a small city in Massachusetts made medical history, not by developing a new drug or performing a landmark surgery, but by simply deciding to watch.

The Framingham Heart Study enrolled over 5,000 men and women and committed to following them for the rest of their lives. Researchers recorded blood pressure, cholesterol, and smoking habits, then waited to see who developed disease. What they discovered transformed medicine, proving that cardiovascular disease was not just a fact of ageing, but a collection of risk factors that could be managed. This represents the absolute peak of Observational Epidemiology: learning by watching the world exactly as it is.

However, the science of human health does not stop at passive observation; it demands active intervention. Once a problem is identified, we must test a solution.

To teach the rigorous mechanics of Experimental Epidemiology, we contrast the massive Framingham cohort with a classic 1993 Anorexia Nervosa clinical trial. If Framingham represents the macro-scale of physical disease tracking, this anorexia dataset represents the micro-scale of psychological intervention. It tracks a small, specific cohort to determine whether targeted treatments, such as Family Therapy or Cognitive Behavioural Therapy, can successfully alter a patient’s clinical trajectory. While the heart study teaches us how to identify the slow accumulation of risk, the anorexia trial teaches us how to test statistically whether a treatment actually works.

Why This Book Exists

Most biostatistics textbooks are written for mathematics students. As educators who have taught biostatistics and research methods across several international universities, we have seen firsthand how traditional, formula-heavy approaches can alienate health science students. This book was born directly from our classrooms. It is the culmination of our collective teaching experiences, designed to overcome the common hurdles our students face every semester.

We wrote this book for people who want to understand disease and prevent it. We believe that statistics must be taught through real variables with real clinical meanings. In these pages, a regression slope is not just a mathematical line; it is a tool that tells you something about human survival. A p-value is not just a decimal; it is a piece of evidence used to decide if a life-saving therapy is truly effective.

The Datasets

Every row of data in this book represents a real person.

  • The Framingham Subset: 500 participants, 22 variables. This foundational dataset provides our landscape for teaching descriptive statistics, population risk, and long-term cardiovascular outcomes.

  • The Anorexia Subset: 72 participants, 3 variables. This focused, small-sample dataset allows us to explore the vital nuances of clinical trials, paired measurements, and the efficacy of psychological health interventions.

How to Use This Book

Each chapter follows a consistent structure: a plain-English introduction to the theory, worked examples utilising our two datasets, and a practical lab manual. We deliberately chose to feature two powerful open-source software platforms, PSPP and R, ensuring that every student has free and equitable access to professional analytical tools.

You do not need a mathematics background to succeed here. You only need a genuine curiosity about human health and the patience to work through the data. Statistics, taught well, is not a mathematical requirement to be endured. It is the language in which the story of disease, and the story of recovery from it, is told.

Payton Yau, Suhirthakumar Puvanendran, and Jarunee Nualyong
March 2026