Understanding Biostatistics Interpretation


Introduction

A basic understanding of statistical concepts is necessary to evaluate existing literature effectively. Statistical results alone do not, however, allow one to determine the clinical applicability of published findings. Statistical results can be used to make inferences about the probability of an event among a given population. Careful interpretation by the clinician is required to determine the value of the data as it applies to an individual or group of patients.[1]

Good research studies provide a clear, testable hypothesis, or prediction, about what they expect to find in the relationships being tested.[2] The hypothesis is grounded in the empirical literature and based on clinical observations or expertise. It should be innovative in testing a novel relationship or confirming a prior study. There are at least 2 hypotheses in any study: (1) the null hypothesis assumes there is no difference or no effect, and (2) the experimental or alternative hypothesis predicts an event or outcome. Often, the null hypothesis is not stated or is assumed. Hypotheses are tested by examining relationships between independent variables, or those thought to have some effect, and dependent variables, or those thought to be moved or affected by the independent variable. These are also called predictor and outcome variables, respectively.

Statistics are used to test a study’s alternative or experimental hypothesis. Statistical models are fitted based on the dataset's nature, type, and other characteristics. Data are classified by level of measurement, which determines the type of statistical models that can be applied to test a hypothesis.[3] Nominal data are variables containing 2 or more categories without underlying order or value. Examples of nominal data include indicators of group membership, such as male or female. Ordinal data are categorical data that include an order or rank but have undefined spacing between groups or levels, such as faculty ranking or educational level. Interval data are ordered data with clearly defined spacing between the intervals and no absolute zero point. An example of interval data is temperature measured in degrees Celsius or Fahrenheit, as the magnitude of the difference between intervals is consistent and measurable (one degree), but zero does not mean the absence of temperature. Ratio data are interval data that include an absolute zero, such as the amount of student loan debt. Nominal and ordinal data are categorical, where entities are divided into distinct groups, whereas interval and ratio data are considered continuous, giving each observation a distinct score.[4]
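The four measurement levels can be summarized in a short Python sketch. This is only an illustration of the definitions above; the names Level, is_categorical, has_meaningful_order, and supports_ratios are hypothetical and do not come from any statistical package.

```python
from enum import Enum


class Level(Enum):
    """The four measurement levels, from least to most informative."""
    NOMINAL = 1   # categories with no order (eg, group membership)
    ORDINAL = 2   # ordered categories with undefined spacing (eg, educational level)
    INTERVAL = 3  # equal spacing, no absolute zero (eg, degrees Celsius)
    RATIO = 4     # equal spacing with an absolute zero (eg, student loan debt)


def is_categorical(level: Level) -> bool:
    """Nominal and ordinal data are categorical; interval and ratio are continuous."""
    return level in (Level.NOMINAL, Level.ORDINAL)


def has_meaningful_order(level: Level) -> bool:
    """Every level except nominal carries an order or rank."""
    return level is not Level.NOMINAL


def supports_ratios(level: Level) -> bool:
    """Statements such as 'twice as much' need an absolute zero (ratio data only)."""
    return level is Level.RATIO
```

For example, is_categorical(Level.ORDINAL) is true, while supports_ratios(Level.INTERVAL) is false: 20 degrees Celsius is not "twice as warm" as 10 degrees Celsius.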

It is up to the researcher to appropriately apply statistical models when testing hypotheses. Several approaches can be used to analyze the same dataset, and the choice depends heavily on the wording of the researcher’s hypothesis.[5] Various statistical software packages can be used to analyze data; some are available for free, while others charge annual license fees. Nearly all packages require the user to understand the types of data and the appropriate application of statistical models for each type. More sophisticated packages require the user to use the program’s proprietary coding language to perform hypothesis tests. These can require much time to learn, and errors can easily slip past the untrained eye. It is strongly recommended that unfamiliar users consult a statistical analyst when designing and running statistical models. Biostatistician consultations can occur at any time during a study, but earlier consultations are wise to prevent the introduction of accidental bias into study data and to help ensure that data are accurate and that collection methods are adequate to allow for tests of hypotheses.

Issues of Concern

Statistical Significance

The p-value is the probability of obtaining a test statistic at least as extreme as the one observed if the null hypothesis were true, that is, by chance alone. By convention, if the p-value is less than .05, the result is called statistically significant, and the null hypothesis is rejected in favor of the experimental hypothesis.[4] A p-value is often loosely described as the probability that the null hypothesis is true, but this is not strictly correct: a p-value below .05 means that a difference as large as the one observed would arise less than 5% of the time if there were no true difference. When interpreting statistical results, the p-value alone is not enough.[6] Significant does not always equate to important. Very small, potentially unimportant effects can turn out to be statistically significant.[7]
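The point that very small effects can reach statistical significance can be illustrated with a minimal Python sketch. It assumes a two-sample z-test with a known, common standard deviation and equal group sizes, a simplification chosen for illustration. The function name two_sample_z_test and the blood pressure figures are hypothetical and are not taken from any study.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(mean_a, mean_b, sd, n_per_group):
    """Two-sided p-value for a difference in means, assuming a known common
    standard deviation and equal group sizes (illustrative simplification)."""
    standard_error = sd * sqrt(2.0 / n_per_group)
    z = (mean_a - mean_b) / standard_error
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: a 0.5 mmHg difference in systolic blood pressure, SD 15 mmHg.
# The difference stays the same; only the sample size changes.
for n in (100, 10_000, 100_000):
    z, p = two_sample_z_test(120.5, 120.0, sd=15.0, n_per_group=n)
    print(f"n per group = {n:>7}: z = {z:.2f}, p = {p:.4f}")
```

With 100 patients per group the difference is far from significant, yet the same 0.5 mmHg difference falls below the .05 cutoff with 10,000 patients per group. The effect has not become more important; only the sample has grown.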

Clinical Significance

To evaluate the clinical relevance or importance of a significant result, one must be certain to consider the size of the effect.[8] Effect measures are standardized to allow application across different scales of measurement.[9] The following are some of the more common ways effect sizes can be estimated:

  1. Conducting a review of the literature and examining reported results
  2. Conducting pilot studies to get an indication of effects that might be seen in larger studies
  3. Making educated guesses based on what is clinically or practically meaningful and informed by experience
  4. Using conventional recommendations for effect size measures

One common measure of effect is the correlation coefficient, r. The square of a correlational r-value indicates the proportion of variance explained by the relationship tested. By convention, r=.10 is considered a small effect, explaining 1% of the total variance; r=.30 is considered a medium effect, explaining 9%; and r=.50 is considered a large effect, explaining 25% of the variance and holding greater clinical relevance.

Similarly, confidence intervals offer a way to determine the clinical strength or magnitude of observed effects.[10] A 95% confidence interval is a range of plausible values around an estimate (eg, a mean or odds ratio), calculated so that, across repeated samples, 95% of such intervals would capture the true value in the population being studied.[4] Confidence intervals also provide information about precision: narrower intervals suggest greater precision, whereas wider intervals may suggest a high level of variability. It has been recommended that, at a minimum, studies report estimates of effect and confidence intervals to allow for appropriate interpretation of their results.[9]

It is also important to note that although a study's design and statistical tests may seem to suggest causation (eg, longitudinal observations of change over time), only studies that employ a randomized and/or controlled design permit causal claims to be made from their results.[11]
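The relationship between r, the variance explained, and the width of a confidence interval can be sketched in a few lines of Python. This is an illustrative sketch, not a method taken from the cited references: the interval uses the Fisher z-transformation, a standard large-sample approximation, and the function names and sample sizes are hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def variance_explained(r):
    """Proportion of variance explained by a correlation: the square of r."""
    return r * r


def correlation_confidence_interval(r, n, level=0.95):
    """Approximate confidence interval for a correlation coefficient using
    the Fisher z-transformation. Requires n > 3 and -1 < r < 1."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = atanh(r)                        # move r to an approximately normal scale
    standard_error = 1.0 / sqrt(n - 3)  # shrinks as the sample grows
    critical = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    return tanh(z - critical * standard_error), tanh(z + critical * standard_error)


for r in (0.10, 0.30, 0.50):
    print(f"r = {r:.2f} explains {variance_explained(r):.0%} of the variance")

# Same observed correlation, two hypothetical sample sizes.
for n in (30, 300):
    low, high = correlation_confidence_interval(0.30, n)
    print(f"r = 0.30, n = {n:>3}: 95% CI {low:.2f} to {high:.2f}")
```

Under these assumptions, the interval for r=.30 with 30 observations is wide and includes zero, whereas with 300 observations it is much narrower and excludes zero. The estimate is the same in both cases; the interval shows how precisely it has been measured.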

Enhancing Healthcare Team Outcomes

Statistical analysis is essential for any clinical research. It is even more important to understand the clinical significance of reported results and to determine whether those results can be extrapolated to the general population. Understanding the definitions and methods described above should make published statistical results easier for medical professionals and students to understand and use.


Details

Editor:

Sameh W. Boktor

Updated:

3/13/2023 3:49:36 PM

References


[1]

Psoter KJ, Roudsari BS, Dighe MK, Richardson ML, Katz DS, Bhargava P. Biostatistics primer for the radiologist. AJR. American journal of roentgenology. 2014 Apr:202(4):W365-75. doi: 10.2214/AJR.13.11657. [PubMed PMID: 24660735]


[2]

Nizamuddin SL, Nizamuddin J, Mueller A, Ramakrishna H, Shahul SS. Developing a Hypothesis and Statistical Planning. Journal of cardiothoracic and vascular anesthesia. 2017 Oct:31(5):1878-1882. doi: 10.1053/j.jvca.2017.04.020. Epub 2017 Apr 13 [PubMed PMID: 28778775]


[3]

Garrocho-Rangel JA, Ruiz-Rodríguez MS, Pozos-Guillén AJ. Fundamentals in Biostatistics for Research in Pediatric Dentistry: Part I - Basic Concepts. The Journal of clinical pediatric dentistry. 2017:41(2):87-94. doi: 10.17796/1053-4628-41.2.87. [PubMed PMID: 28288291]


[4]

Winters R, Winters A, Amedee RG. Statistics: a brief overview. Ochsner journal. 2010 Fall:10(3):213-6 [PubMed PMID: 21603381]

Level 3 (low-level) evidence

[5]

West CP, Dupras DM. 5 ways statistics can fool you--tips for practicing clinicians. Vaccine. 2013 Mar 15:31(12):1550-2. doi: 10.1016/j.vaccine.2012.11.086. Epub 2012 Dec 11 [PubMed PMID: 23246309]


[6]

Ferrill MJ, Brown DA, Kyle JA. Clinical versus statistical significance: interpreting P values and confidence intervals related to measures of association to guide decision making. Journal of pharmacy practice. 2010 Aug:23(4):344-51. doi: 10.1177/0897190009358774. Epub 2010 Apr 13 [PubMed PMID: 21507834]


[7]

Wellek S. A critical evaluation of the current "p-value controversy". Biometrical journal. Biometrische Zeitschrift. 2017 Sep:59(5):854-872. doi: 10.1002/bimj.201700001. Epub 2017 May 15 [PubMed PMID: 28504870]


[8]

Ialongo C. Understanding the effect size and its measures. Biochemia medica. 2016:26(2):150-63. doi: 10.11613/BM.2016.015. [PubMed PMID: 27346958]

Level 3 (low-level) evidence

[9]

Cohen J. A power primer. Psychological bulletin. 1992 Jul:112(1):155-9 [PubMed PMID: 19565683]


[10]

Fethney J. Statistical and clinical significance, and how to use confidence intervals to help interpret both. Australian critical care : official journal of the Confederation of Australian Critical Care Nurses. 2010 May:23(2):93-7. doi: 10.1016/j.aucc.2010.03.001. Epub 2010 Mar 29 [PubMed PMID: 20347326]


[11]

Rothman KJ. Six persistent research misconceptions. Journal of general internal medicine. 2014 Jul:29(7):1060-4. doi: 10.1007/s11606-013-2755-z. Epub 2014 Jan 23 [PubMed PMID: 24452418]