Types of Variables and Commonly Used Statistical Designs

Jacob Shreffler; Martin Huecker

Author: Jacob Shreffler Editor: Martin R. Huecker Updated: 3/6/2023 2:33:06 PM

Definition/Introduction

A suitable statistical design is critical to drawing valid inferences from any research or scientific study.[1] Advances in software for extensive data analysis have made numerous statistical designs practical to implement.[1] Healthcare providers must possess some statistical knowledge to interpret new studies and provide up-to-date patient care. We present an overview of the types of variables and commonly used designs to facilitate this understanding.[2]

Issues of Concern

Researchers who choose an inappropriate design may select the wrong test and draw flawed conclusions. This could lead to work being rejected for publication or, worse, to erroneous clinical decision-making and unsafe practice.[1] By understanding the types of variables and choosing tests that are appropriate to the data, individuals can draw sound conclusions and advance their work toward application.[3]

Variables

To determine which statistical design is appropriate for the data and research plan, one must first examine the scale of each measurement.[4] The types of variables involved determine the appropriate design.

Ordinal data (also sometimes referred to as discrete) provide ranks and thus levels of degree between measurements.[5] Likert items can serve as ordinal variables, but the Likert scale, the result of adding all the items, can be treated as a continuous variable.[6] For example, on a 20-item scale with each item ranging from 1 to 5, each item can be treated as an ordinal variable, whereas the sum of all items ranges from 20 to 100 and can be treated as continuous. A general guideline for distinguishing ordinal from continuous variables: if the variable has more than ten options, it can be treated as a continuous variable.[7] The following are examples of ordinal variables:

Likert items
Cancer stages
Residency Year
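
The 20-item scoring example above can be sketched in a few lines of Python. This is a minimal illustration only: the responses are made up, and the function name likert_scale_score is a hypothetical helper, not part of any published instrument.

```python
# Hypothetical responses of one participant to a 20-item scale (each item scored 1-5).
item_responses = [4, 3, 5, 2, 4, 4, 3, 5, 1, 2, 3, 4, 5, 4, 3, 2, 4, 5, 3, 4]


def likert_scale_score(items, low=1, high=5):
    """Sum item responses after checking that each lies within the item range."""
    if any(not (low <= x <= high) for x in items):
        raise ValueError("item response outside the allowed range")
    return sum(items)


n_items = len(item_responses)
min_score = n_items * 1  # every item answered with the lowest option
max_score = n_items * 5  # every item answered with the highest option
total = likert_scale_score(item_responses)

# Each item has only 5 options (ordinal); the summed scale spans far more
# than ten possible values, so it is often treated as continuous.
print(f"possible range: {min_score}-{max_score}, observed total: {total}")
```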

Nominal, Categorical, Dichotomous, Binary

Other types of variables have interchangeable terms. Nominal and categorical variables describe samples in groups based on counts that fall within each category, have no quantitative relationships, and cannot be ranked.[8] Examples of these variables include:

Service (e.g., emergency, internal medicine, psychiatry)
Ethnicity
Mode of Arrival (ambulance, helicopter, car)

A dichotomous or binary variable is in the same family as nominal/categorical, but this type has only two options. Binary logistic regression, which will be discussed below, has two options for the outcome of interest/analysis. These variables are often coded as yes/no. Examples of dichotomous or binary variables include:

Alive (yes vs. no)
Insurance (yes vs. no)
Readmitted (yes vs. no)

With this overview of the types of variables provided, we will present commonly used statistical designs for different scales of measurement. Importantly, before deciding on a statistical test, individuals should perform exploratory data analysis to ensure there are no issues with the data and should consider type I and type II errors and power analysis. Furthermore, investigators should ensure that the assumptions of the chosen test are met.[9][10] For example, parametric tests, including some discussed below (t-tests, analysis of variance (ANOVA), correlation, and regression), require the data to have a normal distribution and the variances within each group to be similar.[6][11] After resolving any issues identified during exploratory data analysis and reducing the likelihood of committing type I and type II errors, a statistical test can be chosen. Below is a brief introduction to each of the commonly used statistical designs, with an example of each type. Table 1 provides further examples by applying each type of statistical design to a single research focus.

Commonly Used Statistical Designs

Independent Samples T-test

An independent samples t-test allows a comparison of two groups of subjects on one (continuous) variable. Examples in biomedical research include comparing results of treatment vs. control group and comparing differences based on gender (male vs. female).

Example: Does adherence to the ketogenic diet (yes/no; two groups) have a differential effect on total sleep time (minutes; continuous)?
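
As a rough sketch of the calculation behind this design, the snippet below computes a pooled-variance (Student's) t statistic for two independent groups using only the Python standard library. The sleep times are invented for illustration, and independent_t is a hypothetical helper; a statistics package would normally be used in practice and would also report the p-value.

```python
import math
import statistics

# Hypothetical total sleep time (minutes) for two independent groups.
keto = [412, 398, 430, 405, 420, 415]
non_keto = [380, 395, 372, 401, 388, 390]


def independent_t(a, b):
    """Return (t statistic, degrees of freedom) assuming equal group variances."""
    n1, n2 = len(a), len(b)
    m1, m2 = statistics.mean(a), statistics.mean(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)  # sample variances
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / std_error, n1 + n2 - 2


t_stat, df = independent_t(keto, non_keto)
print(f"t = {t_stat:.2f}, df = {df}")
```

The t statistic is then compared with the t distribution on the stated degrees of freedom to obtain a p-value.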

Paired T-test

A paired t-test analyzes one sample population, measuring the same variable on two different occasions; this is often useful for intervention and educational research.

Example: Does participating in a research curriculum (one group with intervention) improve resident performance on a test to measure research competence (continuous)?
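
A paired t-test reduces to a one-sample test on the within-person differences. The sketch below illustrates this with made-up pre- and post-curriculum test scores; paired_t is a hypothetical helper written for this example, not a library function.

```python
import math
import statistics

# Hypothetical research-competence scores for the same six residents.
pre_scores = [62, 70, 55, 68, 74, 60]
post_scores = [70, 75, 63, 69, 80, 66]


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]  # one difference per subject
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_error = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / std_error, n - 1


t_stat, df = paired_t(pre_scores, post_scores)
print(f"t = {t_stat:.2f}, df = {df}")
```

Because each participant serves as their own control, only the variability of the differences enters the standard error.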

One-Way Analysis of Variance (ANOVA)

Analysis of variance (ANOVA), as an extension of the t-test, determines whether differences exist among more than two groups (the levels of an independent variable) on a dependent variable.[11] ANOVA is preferable to conducting multiple t-tests as it reduces the likelihood of committing a type I error.

Example: Are there differences in length of stay in the hospital (continuous) based on the mode of arrival (car, ambulance, helicopter, three groups)?
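
The F statistic at the heart of a one-way ANOVA compares variability between group means with variability within groups. The following is a minimal sketch with invented length-of-stay values; one_way_anova is a hypothetical helper, and a statistics package would normally be used to obtain the p-value.

```python
import statistics

# Hypothetical length of stay (hours) by mode of arrival.
car = [20, 24, 18, 22, 26]
ambulance = [30, 28, 35, 27, 33]
helicopter = [48, 52, 45, 50, 55]


def one_way_anova(groups):
    """Return (F statistic, df between groups, df within groups)."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    # Variability of group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of observations around their own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_b, df_w = one_way_anova([car, ambulance, helicopter])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```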

Repeated Measures ANOVA

Another procedure commonly used if the data for individuals are recurrent (repeatedly measured) is a repeated-measures ANOVA.[1] In these studies, multiple measurements of the dependent variable are collected from the study participants.[11] A within-subjects repeated measures ANOVA determines effects based on the treatment variable alone, whereas mixed ANOVAs allow both between-group and within-subjects effects to be considered.

Within-Subjects Example: How does ketamine affect mean arterial pressure (continuous variable) over time (repeated measurement)?

Mixed Example: Does mean arterial pressure (continuous) differ between males and females (two groups; mixed) on ketamine throughout a surgical procedure (over time; repeated measurement)?

Nonparametric Tests

Nonparametric tests, such as the Mann-Whitney U test (two groups; nonparametric t-test), the Kruskal-Wallis test (multiple groups; nonparametric ANOVA), and Spearman’s rho (nonparametric correlation coefficient), can be used when data are ordinal or lack normality.[3][5] Because they do not require normality, these tests allow skewed data to be analyzed; they require fewer assumptions to be met.[11]

Example: Is there a relationship between insurance status (two groups) and cancer stage (ordinal)?
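
For the example above, the Mann-Whitney U statistic can be obtained by counting, across all pairs of observations from the two groups, how often a value in one group exceeds a value in the other (ties count as one half). The sketch below uses invented cancer stages and a hypothetical helper, mann_whitney_u; it returns only the U statistics, not a p-value.

```python
# Hypothetical cancer stages (1-4, ordinal) by insurance status.
insured = [1, 1, 2, 2, 2, 3]
uninsured = [2, 3, 3, 4, 4, 4]


def mann_whitney_u(a, b):
    """Return (U for group a, U for group b) by direct pairwise comparison."""
    u_a = sum(
        1.0 if x > y else (0.5 if x == y else 0.0)
        for x in a
        for y in b
    )
    u_b = len(a) * len(b) - u_a  # the two U values always sum to n_a * n_b
    return u_a, u_b


u_insured, u_uninsured = mann_whitney_u(insured, uninsured)
print(f"U (insured) = {u_insured}, U (uninsured) = {u_uninsured}")
```

Only the ordering of the values is used, which is why the test suits ordinal or skewed data.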

Chi-square

A chi-square test examines the relationship between categorical variables by comparing the frequencies and proportions with which observations fall into each category.[11] As with other tests discussed, variants and extensions of the chi-square test (e.g., Fisher’s exact test, McNemar’s test) may be suitable depending on the variables.[8]

Example: Is there a relationship between individuals with methamphetamine in their system (yes vs. no; dichotomous) and gender (male or female; dichotomous)?
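
The chi-square statistic compares observed cell counts with the counts expected if the two variables were unrelated. The sketch below applies this to an invented 2x2 table; chi_square is a hypothetical helper, and no continuity correction is applied.

```python
# Hypothetical counts: rows = methamphetamine (yes, no), columns = gender (male, female).
observed_table = [
    [30, 20],
    [15, 35],
]


def chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


chi2, df = chi_square(observed_table)
print(f"chi-square = {chi2:.2f}, df = {df}")
```

When expected counts are small, an alternative such as Fisher's exact test, mentioned above, may be more suitable.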

Correlation

Correlations (used interchangeably with ‘associations’) signal patterns in data between variables.[1] A positive association occurs if values in one variable increase as values in another also increase. A negative association occurs if values in one variable decrease as values in the other increase. A correlation coefficient, expressed as r, describes the strength of the relationship: a value of 0 means no relationship, and the relationship strengthens as r approaches 1 (positive association) or -1 (negative association).[5]

Example: Is there a relationship between age (continuous) and satisfaction with life survey scores (continuous)?
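
The correlation coefficient r described above can be computed directly from its definition. The following sketch uses invented ages and survey scores, and pearson_r is a hypothetical helper written for this illustration.

```python
import math
import statistics

# Hypothetical ages (years) and satisfaction-with-life scores.
ages = [25, 32, 41, 48, 55, 63, 70]
life_satisfaction = [18, 20, 22, 21, 25, 27, 26]


def pearson_r(x, y):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    co_variation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return co_variation / (spread_x * spread_y)


r = pearson_r(ages, life_satisfaction)
print(f"r = {r:.2f}")
```

A value near 1 or -1 indicates a strong linear relationship, and a value near 0 indicates little or none.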

Linear Regression

Regression allows researchers to determine the degree of relationship between a dependent variable and independent variables and results in an equation for prediction.[11] Regression methods can accommodate a large number of variables.

Example: Which hospital admission metrics (multiple continuous) best predict the total length of stay (minutes; continuous)?
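
To show what "an equation for prediction" means, the sketch below fits a least-squares line with a single predictor. This is a simplification of the multi-predictor example above; the data are invented, and fit_line and the variable names are hypothetical.

```python
import statistics

# Hypothetical admission heart rate (beats/min) and length of stay (minutes).
heart_rate = [70, 82, 95, 101, 110, 124]
length_of_stay = [300, 420, 510, 600, 640, 780]


def fit_line(x, y):
    """Return (slope, intercept) of the ordinary least-squares line y = intercept + slope * x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(heart_rate, length_of_stay)
predicted = intercept + slope * 100  # predicted stay for a heart rate of 100
print(f"length_of_stay = {intercept:.1f} + {slope:.2f} * heart_rate")
print(f"prediction at heart_rate = 100: {predicted:.0f} minutes")
```

With several predictors, the same idea extends to multiple regression, which is usually fitted with statistical software.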

Binary Logistic Regression

This type of regression, which aims to predict an outcome, is appropriate when the dependent variable or outcome of interest is binary or dichotomous (yes/no; cured/not cured).[12]

Example: Which panel results (multiple of continuous, ordinal, categorical, dichotomous) best predict whether or not an individual will have a positive blood culture (dichotomous/binary)?
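
The sketch below illustrates the idea with a single continuous predictor: a logistic model turns a linear score into a probability between 0 and 1. It is fitted here by plain gradient descent purely for illustration. The lactate values, outcomes, learning rate, and the helper names sigmoid and fit_logistic are all assumptions made for this example; statistical software typically uses more refined fitting methods.

```python
import math

# Hypothetical lactate values (mmol/L) and blood culture results (1 = positive).
lactate = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
culture_positive = [0, 0, 0, 1, 0, 1, 1, 1]


def sigmoid(z):
    """Map a linear score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, learning_rate=0.1, steps=5000):
    """Return (intercept, slope) minimising the average log-loss by gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        grad0 = grad1 = 0.0
        for xi, yi in zip(x, y):
            error = sigmoid(b0 + b1 * xi) - yi  # predicted probability minus outcome
            grad0 += error
            grad1 += error * xi
        b0 -= learning_rate * grad0 / n
        b1 -= learning_rate * grad1 / n
    return b0, b1


b0, b1 = fit_logistic(lactate, culture_positive)
p_low = sigmoid(b0 + b1 * 0.5)
p_high = sigmoid(b0 + b1 * 4.0)
print(f"P(positive | lactate = 0.5) = {p_low:.2f}")
print(f"P(positive | lactate = 4.0) = {p_high:.2f}")
```

The fitted slope is on the log-odds scale; exponentiating it gives the odds ratio per unit increase in the predictor.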

The table provides further examples by applying each type of statistical design to a single research focus (see Table. Types of Variables and Statistical Designs).

Clinical Significance

Though numerous other statistical designs and extensions of the methods covered in this article exist, the above information provides a starting point for healthcare providers to become acquainted with variables and commonly used designs. Researchers should study types of variables before determining statistical tests to obtain relevant measures and valid study results.[6] Consulting a statistician is recommended to ensure that the statistical design is appropriate for the variables and that its assumptions are upheld.[1] Given the variety of statistical software available, investigators must understand a priori which statistical tests they will use when designing a study.[13] All providers must interpret and scrutinize journal publications to make evidence-based clinical decisions, and this ability is enhanced by even a limited but sound understanding of variables and commonly used study designs.[14]

Nursing, Allied Health, and Interprofessional Team Interventions

All interprofessional healthcare team members need to be familiar with study design and the variables used in studies so that they can accurately evaluate new studies as they are published, apply the latest data to patient care, and drive optimal outcomes.

Media

Types of Variables and Statistical Designs.

Contributed by M Huecker, MD, and J Shreffler, PhD

References

[1] Cassidy LD. Basic concepts of statistical analysis for surgical research. The Journal of surgical research. 2005 Oct:128(2):199-206 [PubMed PMID: 16140341]

[2] Scales CD Jr, Peterson B, Dahm P. Interpreting statistics in the urological literature. The Journal of urology. 2006 Nov:176(5):1938-45 [PubMed PMID: 17070214]

[3] Oliver D, Mahon SM. Reading a research article part II: parametric and nonparametric statistics. Clinical journal of oncology nursing. 2005 Apr:9(2):238-40 [PubMed PMID: 15853166]
Level 2 (mid-level) evidence

[4] McCrum-Gardner E. Which is the correct statistical test to use? The British journal of oral & maxillofacial surgery. 2008 Jan:46(1):38-41 [PubMed PMID: 17961892]

[5] Neideen T, Brasel K. Understanding statistical tests. Journal of surgical education. 2007 Mar-Apr:64(2):93-6 [PubMed PMID: 17462209]
Level 3 (low-level) evidence

[6] Mayya SS, Monteiro AD, Ganapathy S. Types of biological variables. Journal of thoracic disease. 2017 Jun:9(6):1730-1733. doi: 10.21037/jtd.2017.05.75 [PubMed PMID: 28740689]

[7] Mishra P, Pandey CM, Singh U, Gupta A. Scales of measurement and presentation of statistical data. Annals of cardiac anaesthesia. 2018 Oct-Dec:21(4):419-422. doi: 10.4103/aca.ACA_131_18 [PubMed PMID: 30333338]

[8] Taub PJ, Westheimer E. Biostatistics. Plastic and reconstructive surgery. 2009 Aug:124(2):200e-208e. doi: 10.1097/PRS.0b013e3181addcd9 [PubMed PMID: 19644245]
Level 2 (mid-level) evidence

[9] Oliver D, Mahon SM. Reading a research article part I: Types of variables. Clinical journal of oncology nursing. 2005 Feb:9(1):110-2 [PubMed PMID: 15751506]

[10] Chew BH. Planning and Conducting Clinical Research: The Whole Process. Cureus. 2019 Feb 20:11(2):e4112. doi: 10.7759/cureus.4112. Epub 2019 Feb 20 [PubMed PMID: 31058006]

[11] Giuliano KK, Polanowicz M. Interpretation and use of statistics in nursing research. AACN advanced critical care. 2008 Apr-Jun:19(2):211-22. doi: 10.1097/01.AACN.0000318124.33889.6e [PubMed PMID: 18560290]

[12] Jupiter DC. A variety of variables. The Journal of foot and ankle surgery : official publication of the American College of Foot and Ankle Surgeons. 2014 Jan-Feb:53(1):124-5. doi: 10.1053/j.jfas.2013.06.001. Epub 2013 Jun 18 [PubMed PMID: 23790408]
Level 2 (mid-level) evidence

[13] De Muth JE. Preparing for the first meeting with a statistician. American journal of health-system pharmacy : AJHP : official journal of the American Society of Health-System Pharmacists. 2008 Dec 15:65(24):2358-66. doi: 10.2146/ajhp070007 [PubMed PMID: 19052282]

[14] Rickard CM. Statistics for clinical nursing practice: an introduction. Australian critical care : official journal of the Confederation of Australian Critical Care Nurses. 2008 Nov:21(4):216-9. doi: 10.1016/j.aucc.2008.08.004. Epub 2008 Oct 15 [PubMed PMID: 18926715]