How to Write a Data Analysis Report

A data analysis report communicates statistical findings to a mixed audience. It follows the analysis from the initial question and dataset through cleaning, analysis, and visualisation to clear, evidence-based conclusions. This guide covers every stage for students of STEM, data science, and research methods.

What Is a Data Analysis Report?

A data analysis report documents the full analytical process applied to a dataset — from the research question and data source through preparation, statistical testing, and visualisation, to conclusions and recommendations. It is different from a research paper (which reports original primary research) in that it often uses existing or secondary data, and the primary contribution is the analytical insight, not data collection.

Data analysis reports are common in: statistics and research methods modules, data science and machine learning projects, public health and epidemiology studies, environmental monitoring, engineering quality control, and industry internship or placement reports.

Report Structure

| Section | Content |
| --- | --- |
| Title | Dataset name, analytical question, date |
| Executive Summary | Question, key finding, main recommendation (1 page) |
| Introduction | Background, research question, why this analysis matters |
| Data Description | Source, structure, variables, collection method |
| Data Cleaning | Missing values, outliers, transformations: what was done and why |
| Exploratory Data Analysis (EDA) | Descriptive statistics, distributions, initial visualisations |
| Statistical Analysis | Tests performed, with assumptions and outputs |
| Results | Findings from the analysis with supporting figures/tables |
| Discussion | Interpretation, limitations, comparison to prior work |
| Conclusions / Recommendations | Answers to the research question, actionable outputs |
| References | Dataset sources, statistical method references |
| Appendices | Full code, additional plots, raw output tables |

Step 1 — Define the Research Question

Every analysis must answer a specific question. Vague questions produce vague conclusions. Before touching the data, state:

Step 2 — Describe the Data

Before any analysis, describe the dataset transparently:

Step 3 — Data Cleaning

Data cleaning is the most time-consuming phase of any real data analysis. Document every decision — your report must be reproducible. Key issues to address and report:

Always report what you started with and what you ended with, for example: "The original dataset contained 12,450 records. After removing 340 duplicates and 128 records with missing outcome data, the analysis dataset contained 11,982 records." This transparency is a mark of professional-quality reporting.
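One way to keep those record counts accurate is to log them as each cleaning step runs, rather than reconstructing them afterwards. The sketch below is a minimal illustration using plain Python; the function names, the `id` and `outcome` fields, and the four-row dataset are all hypothetical and stand in for whatever your own data contains.

```python
def clean_records(records, key_fields, outcome_field):
    """Drop duplicate rows and rows with a missing outcome, logging the counts."""
    log = {"original": len(records)}

    # Step 1: keep only the first row seen for each combination of key fields.
    seen = set()
    deduplicated = []
    for row in records:
        key = tuple(row.get(field) for field in key_fields)
        if key in seen:
            continue
        seen.add(key)
        deduplicated.append(row)
    log["duplicates_removed"] = len(records) - len(deduplicated)

    # Step 2: drop rows where the outcome variable is missing (None).
    complete = [row for row in deduplicated if row.get(outcome_field) is not None]
    log["missing_outcome_removed"] = len(deduplicated) - len(complete)
    log["final"] = len(complete)
    return complete, log


def describe_cleaning(log):
    """Turn the cleaning log into a sentence for the Data Cleaning section."""
    return (
        f"The original dataset contained {log['original']:,} records. "
        f"After removing {log['duplicates_removed']:,} duplicates and "
        f"{log['missing_outcome_removed']:,} records with missing outcome data, "
        f"the analysis dataset contained {log['final']:,} records."
    )


# Hypothetical toy dataset: one duplicate id and one missing outcome.
records = [
    {"id": 1, "outcome": 4.2},
    {"id": 2, "outcome": None},
    {"id": 1, "outcome": 4.2},
    {"id": 3, "outcome": 5.1},
]
cleaned, log = clean_records(records, key_fields=["id"], outcome_field="outcome")
print(describe_cleaning(log))
```

Because the sentence is generated from the log, the reported numbers cannot drift out of step with what the code actually did.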

Step 4 — Exploratory Data Analysis (EDA)

EDA is the phase where you understand the data before applying formal statistical tests. Report:

EDA findings should inform your choice of statistical test. If the data is heavily skewed, you may need non-parametric tests. If groups are very unequal in size, this may affect power calculations.
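As a rough illustration of how a skewness check might feed into that choice, the sketch below computes a few descriptive statistics with Python's standard library. The sample values are made up, the skewness formula is the simple moment-based coefficient (third central moment divided by the 1.5 power of the second), and the cut-off of 1 is only a common rule of thumb, not a fixed standard.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics, including moment-based skewness."""
    n = len(values)
    mean = statistics.fmean(values)
    # Second and third central moments (population form, dividing by n).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return {
        "n": n,
        "mean": mean,
        "median": statistics.median(values),
        "sd": statistics.stdev(values),  # sample standard deviation
        "skewness": m3 / m2 ** 1.5,
    }


# Hypothetical right-skewed sample: one large value pulls the mean upwards.
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 30]
summary = describe(values)

# Assumed rule of thumb: |skewness| > 1 suggests a heavily skewed distribution.
heavily_skewed = abs(summary["skewness"]) > 1
print(summary)
print("Consider a non-parametric test" if heavily_skewed else "Skewness looks modest")
```

A mean well above the median, as in this sample, is a quick informal sign of right skew. A histogram should still accompany the numbers in the report.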

Step 5 — Statistical Analysis

Report each statistical test with the test name, the null hypothesis, the test statistic, the p-value, the effect size, and a confidence interval. Always check and report whether the test's assumptions were met.
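The sketch below shows what assembling those reporting elements could look like for a comparison of two groups. To stay within Python's standard library it uses a permutation test and a percentile bootstrap interval instead of a t-test; that is a substitution made for this example, not a recommendation over the tests in this guide. The two groups are invented data and every function name is hypothetical.

```python
import random
import statistics


def mean_difference(a, b):
    return statistics.fmean(a) - statistics.fmean(b)


def cohens_d(a, b):
    """Effect size: mean difference divided by the pooled standard deviation."""
    n_a, n_b = len(a), len(b)
    pooled_var = (
        (n_a - 1) * statistics.variance(a) + (n_b - 1) * statistics.variance(b)
    ) / (n_a + n_b - 2)
    return mean_difference(a, b) / pooled_var ** 0.5


def permutation_p_value(a, b, n_permutations=5000, seed=0):
    """Two-sided permutation test of H0: no difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean_difference(a, b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean_difference(pooled[:len(a)], pooled[len(a):]))
        if diff >= observed - 1e-12:  # small tolerance for floating-point ties
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


def bootstrap_ci(a, b, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean difference."""
    rng = random.Random(seed)
    diffs = sorted(
        mean_difference(rng.choices(a, k=len(a)), rng.choices(b, k=len(b)))
        for _ in range(n_resamples)
    )
    lower_index = int((1 - level) / 2 * n_resamples)
    upper_index = int((1 + level) / 2 * n_resamples) - 1
    return diffs[lower_index], diffs[upper_index]


# Hypothetical measurements for two groups.
group_a = [5.1, 5.0, 5.6, 5.8, 6.0, 5.5, 5.3, 5.9]
group_b = [4.2, 4.8, 4.5, 4.1, 4.9, 4.4, 4.6, 4.3]

diff = mean_difference(group_a, group_b)
d = cohens_d(group_a, group_b)
p = permutation_p_value(group_a, group_b)
ci_low, ci_high = bootstrap_ci(group_a, group_b)

print(
    "Permutation test (H0: no difference in group means): "
    f"mean difference = {diff:.2f}, p = {p:.4f}, Cohen's d = {d:.2f}, "
    f"95% bootstrap CI [{ci_low:.2f}, {ci_high:.2f}]"
)
```

The final line gathers the test name, null hypothesis, estimate, p-value, effect size, and interval into one sentence, which is the form a reader of the Results section needs.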

Choosing the right test

| Question type | Data type | Appropriate test |
| --- | --- | --- |
| Compare two groups | Continuous, normal | Independent t-test |
| Compare two groups | Continuous, non-normal | Mann-Whitney U |
| Compare 3+ groups | Continuous, normal | One-way ANOVA |
| Compare 3+ groups | Non-normal | Kruskal-Wallis |
| Association between two categorical variables | Categorical | Chi-squared test |
| Correlation between two continuous variables | Continuous, normal | Pearson's r |
| Correlation between two variables | Ordinal or non-normal | Spearman's ρ |
| Predict a continuous outcome | Mixed | Linear regression |
| Predict a binary outcome | Mixed | Logistic regression |
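If you want the choice of test to be traceable in your code appendix, the table above can be encoded as a simple lookup. This is only a sketch: the dictionary name, the function name, and the lower-case key format are hypothetical, and a lookup does not replace checking each test's assumptions.

```python
# Keys are (question type, data type) pairs copied from the table, in lower case.
TEST_GUIDE = {
    ("compare two groups", "continuous, normal"): "Independent t-test",
    ("compare two groups", "continuous, non-normal"): "Mann-Whitney U",
    ("compare 3+ groups", "continuous, normal"): "One-way ANOVA",
    ("compare 3+ groups", "non-normal"): "Kruskal-Wallis",
    ("association between two categorical variables", "categorical"): "Chi-squared test",
    ("correlation between two continuous variables", "continuous, normal"): "Pearson's r",
    ("correlation between two variables", "ordinal or non-normal"): "Spearman's rho",
    ("predict a continuous outcome", "mixed"): "Linear regression",
    ("predict a binary outcome", "mixed"): "Logistic regression",
}


def suggest_test(question_type, data_type):
    """Look up the test the table suggests for a question and data type."""
    key = (question_type.strip().lower(), data_type.strip().lower())
    return TEST_GUIDE.get(key, "No entry in the guide; consult a statistics reference")


print(suggest_test("Compare two groups", "Continuous, non-normal"))
print(suggest_test("Predict a binary outcome", "Mixed"))
```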

Visualisation — Best Practice

Effective data visualisation communicates findings that text and tables cannot. Rules for STEM data analysis reports:

Common Mistakes

Frequently Asked Questions

Should I include my code in the report?

In most academic submissions, code goes in an appendix — not in the main body. The main report should be readable by someone who does not write code; the appendix satisfies the reproducibility requirement. Some data science-specific modules ask for a Jupyter notebook or R Markdown document instead of or alongside a Word/PDF report — check your brief.

How do I handle a dataset with no significant results?

Report the results honestly and discuss why the expected effect was not found. Null results are scientifically valid — they tell us something. Discuss whether the sample was large enough (statistical power), whether the measurement was sensitive enough, and what the result does or does not imply about the research question. Never manipulate the analysis to produce significance (p-hacking).
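To make the statistical power point concrete, the sketch below estimates the power of a two-sided, two-group comparison using a normal approximation. It assumes equal group sizes and a standardised effect size (Cohen's d). An exact calculation based on the t distribution would give slightly lower values for small samples, and the function name is hypothetical.

```python
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means.

    Normal approximation: the test statistic is treated as normal with mean
    effect_size * sqrt(n_per_group / 2) and unit variance.
    """
    standard_normal = NormalDist()
    z_critical = standard_normal.inv_cdf(1 - alpha / 2)
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability of landing in either rejection region under the alternative.
    return standard_normal.cdf(shift - z_critical) + standard_normal.cdf(-shift - z_critical)


# Hypothetical scenario: a medium effect (d = 0.5) at several sample sizes.
for n in (20, 64, 100):
    print(f"n per group = {n}: approximate power = {approx_power(0.5, n):.2f}")
```

If the power for a plausible effect size is well below the conventional 0.80, a non-significant result says little about whether the effect exists. That belongs in the Discussion section.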

What citation style should I use?

For data sources: cite the dataset itself (with DOI or URL, access date, and publisher/depositor). For statistical methods: cite the original paper that introduced the test or model. For software: cite the software package (e.g., "R version 4.3.0 (R Core Team, 2023)" or "Python 3.11 with scikit-learn 1.3"). Use APA or the citation style specified in your module guide.
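For Python users, the version details needed for a software citation can be read from the environment instead of typed by hand. The sketch below is one possible approach; the function name is hypothetical and the package list is only an example. Packages that are not installed are flagged rather than guessed.

```python
import platform
from importlib import metadata


def software_citation(packages=()):
    """Build a 'Python X with package Y' string from the running environment."""
    versions = []
    for name in packages:
        try:
            versions.append(f"{name} {metadata.version(name)}")
        except metadata.PackageNotFoundError:
            versions.append(f"{name} (not installed)")
    citation = f"Python {platform.python_version()}"
    if versions:
        citation += " with " + ", ".join(versions)
    return citation


print(software_citation(["scikit-learn"]))
```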