Data Science & Bioinformatics
London, United Kingdom
Biomedical scientist with 7+ years in clinical and genomic laboratories, now focused on bioinformatics and the analysis of sequencing data.
I specialise in building NGS pipelines and analysing their output, combining wet-lab expertise with computational workflows to extract biologically meaningful insights.
- Clinical experience: UCLH, CooperGenomics (NGS, ATMPs, reporting)
- Strong in Python, R, SQL, and Bash for genomic data workflows
- Experience with RNA-seq, variant calling, and pipeline development
Languages: Python · R · SQL · Bash
Bioinformatics: HISAT2 · STAR · SAMtools · BEDtools · bcftools · DESeq2
Data Viz: Tableau · ggplot2 · seaborn · matplotlib
Cloud: AWS · Google Cloud · GKE (Kubernetes)
| Project | Description | Tools |
|---|---|---|
| M. tuberculosis WGS Variant Analysis Workflow | Galaxy-based workflow for QC, trimming, alignment, coverage assessment, variant calling, annotation, and IGV-supported review of resistance-associated loci in Mycobacterium tuberculosis | Galaxy · BWA-MEM2 · Picard · SAMtools · mosdepth · bcftools · SnpEff · SnpSift · MultiQC · IGV |
| RNA-seq Pipeline | Containerised RNA-seq workflow built with Nextflow and Docker, covering QC, trimming, alignment, quantification, MultiQC reporting, and differential expression analysis | Nextflow · Docker · FastQC · Cutadapt · STAR · featureCounts · MultiQC · DESeq2 |
| Genomic Data Science | End-to-end RNA-seq & variant analysis using HISAT2, StringTie, and DESeq2 | Python · R · Bash · Bioconductor |
| Salifort Motors | Predictive modelling to understand drivers of employee turnover and inform retention strategy | XGBoost · NumPy · SciPy · scikit-learn · Pandas · Statsmodels |
| TikTok Project | Exploratory analysis of engagement metrics to uncover content trends and optimisation levers | Matplotlib · Seaborn · Plotly · SciPy |
| Bellabeat Case Study | Fitbit data analysis and Tableau dashboard | R · dplyr · Tableau · SQL |
| AWS Solution Architecture | Cloud deployment diagrams & IaC design | AWS · ECS · S3 · Aurora |
| Fiber Business Intelligence Capstone | Data integration and visualisation for business insights | BigQuery · Tableau · SQL |
| Portfolio Website | Personal website showcasing bioinformatics and data projects | HTML · CSS · JS |
MSc Bioinformatics - Atlantic Technological University (Remote) - 2025-Present
MSc Cell & Gene Therapy - University College London - 2021-2023
BSc Biomedical Science - University of Catania - 2014-2017
Machine Learning & AI Certifications:
IBM: Machine Learning · AI Engineering
Data, Analytics, and Cloud Certifications:
Google: Data Analytics · Advanced Data Analytics · IT Automation with Python · Project Management · Business Intelligence
Google Cloud: Architecting with Google Kubernetes Engine
Amazon Web Services (AWS): Cloud Practitioner Essentials · Cloud Solutions Architect
Bioinformatics Certifications:
Johns Hopkins University: Genomic Data Science Specialization
Wellcome: Bioinformatics for Biologists: An Introduction to Linux, Bash Scripting, and R; Analysing and Interpreting Genomics Datasets
Programming & Data Science Courses:
freeCodeCamp: Data Analysis with Python; Relational Databases; Scientific Computing with Python & Databases
DE&lt;code&gt;LIFE: Genomes, Networks & Pathways; Data Science & Machine Learning with Python
