Studying for an MSc in Data Science and Machine Learning at UCL, focused on NLP and computer vision. Previously: an MSc in Astrophysics at Imperial College London, then NLP work at LemonAI, an AI startup in London. I'm driven to build models that are both genuinely useful and technically rigorous.
Open to ML and data science roles, ideally as an ML engineer.
Benchmarked cross-lingual factual recall across 100K Wikidata facts and 12 languages; led a mechanistic interpretability analysis that identified language-specific MLP neurons and showed that GRPO training shifts processing to deeper, more general layers.
Three-branch CNN trained on 1.5M+ augmented frames with custom data preprocessing, achieving 1.4 cm MAE on a held-out test set; deployed as a real-time mobile prototype.
Transfer learning on DenseNet169 with data augmentation, achieving 81% test accuracy on a class-balanced dataset of 40K+ labelled images. Presented at Le Wagon Demo Day.
Hybrid rule-based + Claude API data augmentation pipeline that synthesises realistic transcription noise (disfluencies, phonetic substitutions, filled pauses, repetitions) to expand the training distribution and reduce out-of-distribution (OOD) degradation in production speech NLP models.
Computer Vision · Statistical Data Science · Applied ML/DL · Reinforcement Learning · Bayesian DL · Statistical NLP
General Relativity · Cosmology · Research Computing
Thesis: Evaluating FLARES Simulations for High Redshift 'Little Red Dots'