Introduction
In biostatistical research, understanding the simultaneous influence of multiple independent variables on a single dependent variable is essential. Multiple regression analysis serves as one of the most powerful statistical tools for this purpose. It enables researchers to model complex biological relationships by quantifying how multiple predictors jointly explain variation in a dependent outcome.
This article presents a comprehensive journal-style interpretation of a multiple regression analysis performed in MedCalc. The dependent variable in this study is Systolic Blood Pressure (SBP, mmHg), while the independent variables include Age (years), Body Mass Index (BMI, kg/m²), and Cholesterol (mg/dL).
The analysis aims to determine the relative contribution of these biological predictors toward explaining variability in systolic blood pressure among 25 subjects. Key parameters such as coefficients, standard errors, t-values, P-values, R², F-ratio, and residual analysis are discussed in depth. The interpretation follows a structured, journal-ready format suitable for publication in biomedical and statistical journals.
Objective
To evaluate the combined and individual effects of age, BMI, and cholesterol levels on systolic blood pressure (SBP) using multiple regression analysis in MedCalc software.
Methodology Overview
- Software Used: MedCalc Statistical Software
- Dependent Variable: Systolic_BP (mmHg)
- Independent Variables:
  - Age (years)
  - BMI (kg/m²)
  - Cholesterol (mg/dL)
- Sample Size (n): 25
- Regression Method: Least Squares Multiple Regression (Enter method)
- Assumption Testing: Shapiro-Wilk test for normality of residuals
Summary of Statistical Output
| Parameter | Result |
|---|---|
| Sample Size (n) | 25 |
| R² (Coefficient of Determination) | 0.9980 |
| Adjusted R² | 0.9978 |
| Multiple Correlation Coefficient (R) | 0.9990 |
| Residual Standard Deviation | 0.7199 |
| F-ratio | 3569.8186 |
| Significance Level (p) | < 0.0001 |
| Normality (Shapiro–Wilk Test) | W = 0.9780, p = 0.8436 (normal) |
Regression Equation
The fitted multiple regression equation from MedCalc is:
Systolic BP (mmHg) = 66.4063 + 0.7029 × Age − 0.3667 × BMI + 0.2356 × Cholesterol
Each coefficient represents the expected change in systolic blood pressure for a one-unit change in the respective predictor, holding all other variables constant.
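To make the equation concrete, the following minimal sketch evaluates it for a single subject. The coefficients are copied from the fitted equation; the function name `predict_sbp` and the example subject (age 50, BMI 25.0, cholesterol 200) are illustrative assumptions, not values from the study data.

```python
# Coefficients copied from the fitted equation reported by MedCalc.
INTERCEPT = 66.4063
B_AGE = 0.7029
B_BMI = -0.3667
B_CHOL = 0.2356


def predict_sbp(age_years, bmi, cholesterol_mg_dl):
    """Return the predicted systolic blood pressure (mmHg)."""
    return (INTERCEPT
            + B_AGE * age_years
            + B_BMI * bmi
            + B_CHOL * cholesterol_mg_dl)


# Hypothetical subject (illustrative values, not taken from the dataset).
baseline = predict_sbp(50, 25.0, 200)

# Same subject one year older, with BMI and cholesterol held constant.
one_year_older = predict_sbp(51, 25.0, 200)

print(f"Predicted SBP: {baseline:.2f} mmHg")
print(f"Change per additional year of age: {one_year_older - baseline:.4f} mmHg")
```

The difference between the two predictions equals the age coefficient, which is what "holding all other variables constant" means in practice.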
Interpretation of Coefficients
| Independent Variable | Coefficient (B) | Std. Error | t-value | P-value | Partial r | VIF | Interpretation |
|---|---|---|---|---|---|---|---|
| Age (years) | 0.7029 | 0.1191 | 5.901 | <0.0001 | 0.7898 | 172.032 | Statistically significant positive effect on SBP. |
| BMI (kg/m²) | -0.3667 | 0.3525 | -1.040 | 0.3100 | -0.2214 | 72.688 | Negative but non-significant effect on SBP. |
| Cholesterol (mg/dL) | 0.2356 | 0.1278 | 1.844 | 0.0793 | 0.3733 | 358.132 | Positive but marginally non-significant effect on SBP. |
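The t-values and partial correlations in the table can be checked from the coefficients and standard errors alone. The sketch below assumes the standard relationships t = B / SE and partial r = t / √(t² + df), with df = n − p − 1 = 21 residual degrees of freedom; the helper name `t_and_partial_r` is hypothetical. Small differences in the last digit are expected because the inputs are rounded to four decimals.

```python
import math

RESIDUAL_DF = 21  # n - p - 1 = 25 - 3 - 1

# (coefficient B, standard error) as reported in the coefficient table
coefficients = {
    "Age": (0.7029, 0.1191),
    "BMI": (-0.3667, 0.3525),
    "Cholesterol": (0.2356, 0.1278),
}


def t_and_partial_r(b, se, df):
    """Return the t-statistic and the partial correlation implied by it."""
    t = b / se
    partial_r = t / math.sqrt(t * t + df)
    return t, partial_r


derived = {
    name: t_and_partial_r(b, se, RESIDUAL_DF)
    for name, (b, se) in coefficients.items()
}

for name, (t, partial_r) in derived.items():
    print(f"{name}: t = {t:.3f}, partial r = {partial_r:.4f}")
```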
Interpretation Summary
- Age (years) is the only statistically significant predictor (p < 0.0001). This indicates that with every additional year of age, systolic blood pressure increases by approximately 0.70 mmHg, keeping BMI and cholesterol constant.
- BMI (kg/m²) shows a slight negative relationship (B = -0.3667), though the effect is not significant (p = 0.3100). In this dataset, BMI shows no detectable independent effect on SBP once age and cholesterol are in the model.
- Cholesterol (mg/dL) displays a positive but marginal effect (B = 0.2356, p = 0.0793), suggesting a potential trend where higher cholesterol levels may elevate SBP, although statistical evidence is insufficient at the 5% level.
Model Performance and Goodness of Fit
The coefficient of determination (R² = 0.9980) implies that 99.8% of the total variation in systolic blood pressure can be explained by the combined effect of age, BMI, and cholesterol levels.
This extremely high R² indicates a very close fit to the observed data. The adjusted R² (0.9978) is nearly identical, showing that the fit is not an artifact of the number of predictors relative to the sample size.
The F-ratio (3569.8186, p < 0.0001) further supports that the overall regression model is highly significant, meaning at least one predictor has a statistically significant relationship with SBP.
Analysis of Variance (ANOVA)
| Source | DF | Sum of Squares | Mean Square | F-ratio | P-value |
|---|---|---|---|---|---|
| Regression | 3 | 5549.6777 | 1849.8926 | 3569.8186 | <0.0001 |
| Residual (Error) | 21 | 10.8823 | 0.5182 | — | — |
| Total | 24 | — | — | — | — |
The ANOVA table indicates that the regression model is statistically significant (p < 0.0001). The large F-ratio (3569.8186) reflects a regression mean square (1849.8926) far larger than the residual mean square (0.5182), meaning the variance explained by the model greatly exceeds the unexplained variance.
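The summary statistics reported earlier all follow from the two sums of squares in the ANOVA table. The sketch below recomputes them using the usual definitions (F = MS regression / MS residual, R² = SS regression / SS total, adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1)); the variable names are illustrative. Because the sums of squares are rounded to four decimals, the recomputed F-ratio agrees with the reported value only to about two decimal places.

```python
import math

n = 25  # sample size
p = 3   # number of predictors

# Sums of squares copied from the ANOVA table
ss_regression = 5549.6777
ss_residual = 10.8823

df_regression = p
df_residual = n - p - 1
ss_total = ss_regression + ss_residual

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

f_ratio = ms_regression / ms_residual
r_squared = ss_regression / ss_total
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / df_residual
residual_sd = math.sqrt(ms_residual)

print(f"F-ratio: {f_ratio:.2f}")
print(f"R-squared: {r_squared:.4f}")
print(f"Adjusted R-squared: {adjusted_r_squared:.4f}")
print(f"Residual standard deviation: {residual_sd:.4f}")
```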
Correlation Matrix
| Variables | Systolic_BP | Age (years) | BMI (kg/m²) | Cholesterol (mg/dL) |
|---|---|---|---|---|
| Systolic_BP | 1.0000 | 0.9988 | 0.9804 | 0.9963 |
| Age (years) | 0.9988 | 1.0000 | 0.9801 | 0.9960 |
| BMI (kg/m²) | 0.9804 | 0.9801 | 1.0000 | 0.9905 |
| Cholesterol (mg/dL) | 0.9963 | 0.9960 | 0.9905 | 1.0000 |
All three predictors are highly correlated with SBP, particularly age (r = 0.9988). However, the predictors are also strongly correlated with one another (r = 0.9801 to 0.9960), and the resulting multicollinearity (VIF values from 72.7 to 358.1) can inflate standard errors and reduce the reliability of individual coefficients.
Residual and Assumption Analysis
Normality of Residuals
The Shapiro–Wilk test (W = 0.9780, p = 0.8436) indicates no significant deviation from normality, satisfying one of the main assumptions of multiple regression.
Multicollinearity
The Variance Inflation Factors (VIF) are notably high (ranging from 72.7 to 358.1), signaling severe multicollinearity among age, BMI, and cholesterol. In journal discussion, it’s important to note that such multicollinearity may:
- Inflate standard errors
- Reduce the precision of estimated coefficients
- Make individual predictors appear statistically non-significant even when they are related to the outcome
To resolve this, variable reduction or principal component analysis could be applied in future research.
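The reported VIFs can be approximately reproduced from the predictor correlations in the correlation matrix, since the VIF of predictor j is the j-th diagonal element of the inverse of the predictor correlation matrix. The sketch below uses the closed-form cofactor expression for a 3×3 matrix; the variable names are illustrative. Because the matrix is nearly singular and the correlations are rounded to four decimals, the results are expected to be close to, but not identical to, MedCalc's values.

```python
# Pairwise correlations among the predictors, from the correlation matrix
r_age_bmi = 0.9801
r_age_chol = 0.9960
r_bmi_chol = 0.9905

# Determinant of the 3x3 predictor correlation matrix
det = (1
       + 2 * r_age_bmi * r_age_chol * r_bmi_chol
       - r_age_bmi ** 2
       - r_age_chol ** 2
       - r_bmi_chol ** 2)

# Diagonal of the inverse matrix: cofactor of each diagonal entry / determinant
vif = {
    "Age": (1 - r_bmi_chol ** 2) / det,
    "BMI": (1 - r_age_chol ** 2) / det,
    "Cholesterol": (1 - r_age_bmi ** 2) / det,
}

for name, value in vif.items():
    print(f"{name}: VIF is approximately {value:.1f}")
```

A determinant this close to zero is itself a sign of near-linear dependence among the predictors, which is why small rounding differences in the correlations shift the VIFs noticeably.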
Biological Interpretation
The results indicate that age is the most influential determinant of systolic blood pressure among the selected predictors. This finding is biologically plausible since arterial stiffness and vascular resistance typically increase with age, contributing to higher blood pressure.
The non-significant relationship between BMI and SBP may arise from limited sample size or overlapping effects with age and cholesterol. Cholesterol shows a mild positive trend, suggesting that lipid metabolism may influence vascular health and pressure regulation.
Together, these results reinforce the multifactorial nature of hypertension, where age-related physiological changes play a dominant role.
Discussion
The exceptionally high R² indicates that the model fits the dataset extremely well. However, caution should be exercised: with severe multicollinearity and only 25 subjects, the individual coefficient estimates are unstable, and the model may not generalize to other samples.
This regression model demonstrates how multiple physiological predictors jointly relate to blood pressure, emphasizing the value of multivariate analysis in biostatistics. For publication purposes, it’s crucial to discuss these limitations transparently to maintain analytical integrity.
Conclusion
The multiple regression analysis performed in MedCalc demonstrates a significant relationship between systolic blood pressure and age. Cholesterol shows a positive but non-significant effect (p = 0.0793), and BMI shows a negative, non-significant effect (p = 0.3100).
Key findings include:
- Age is a highly significant predictor (p < 0.0001).
- Model fit is excellent (R² = 0.9980, Adjusted R² = 0.9978).
- Residuals show no significant departure from normality (Shapiro–Wilk p = 0.8436), supporting the normality assumption.
- Multicollinearity is substantial and should be addressed in larger or more diversified datasets.
Overall, this model provides a strong statistical basis for understanding age-related increases in systolic blood pressure and supports the importance of multivariate biostatistical approaches in biomedical research.



