Preface
In 1948, a small city in Massachusetts made medical history — not by developing a new drug or performing a landmark surgery, but by simply deciding to watch.
The Framingham Heart Study enrolled over 5,000 men and women and committed to following them for the rest of their lives. Researchers recorded blood pressure, cholesterol, and smoking habits, then waited to see who developed disease. What they discovered transformed medicine, proving that high blood pressure and smoking were not just facts of life, but risk factors that could be managed. This represents the peak of Observational Epidemiology: learning by watching the world as it is.
However, public health does not stop at observation. Once a problem is identified, we must test the solution. This requires Experimental Epidemiology, or the clinical trial. In this book, we complement the Framingham data with a classic Anorexia Clinical Trial dataset. While Framingham teaches us how to identify the risks that lead to heart disease, the anorexia data teaches us how to measure whether a psychological intervention, such as Family Therapy or Cognitive Behavioral Therapy, actually works to improve a patient’s health.
Why This Book Exists
Most biostatistics textbooks are written for mathematics students. As educators who have taught biostatistics and research methods across several universities internationally, we have seen firsthand how traditional, formula-heavy approaches can alienate health science students. This book was born directly from our classrooms. It is the culmination of our collective teaching experiences, designed to overcome the common hurdles our students face every semester.
We wrote this book for people who want to understand disease and prevent it. We believe that statistics should be taught through real variables with real clinical meanings. In these pages, a regression slope is not just a line; it is a tool that tells you something about survival. A p-value is not just a decimal; it is a piece of evidence used to decide if a therapy is effective.
The Datasets
Every row of data in this book represents a real person.
The Framingham Subset: 500 participants, 22 variables. This dataset is the foundation for our lessons on descriptive statistics, risk, and long-term cardiovascular outcomes.
The Anorexia Subset: 72 participants, 3 variables. This “small-sample” dataset allows us to explore the nuances of clinical trials, paired measurements, and psychological health interventions.
How to Use This Book
Each chapter follows a consistent structure: a plain-English introduction to the theory, worked examples using our two datasets, and a practical lab manual for both PSPP and R. You do not need a mathematics background to succeed here. You only need curiosity about human health and the patience to work through the data.
Statistics, taught well, is not a requirement to be endured. It is the language in which the story of disease, and of recovery from it, is told.
Payton Yau, Suhirthakumar Puvanendran, and Jarunee Nualyong
March 2026