Yousef Qaddura

Quantitative Analyst, Mathematics PhD

Statement

A highly motivated quantitative analyst and applied mathematician extensively trained in machine learning and data science. In my PhD research, I used geometry in representation theory to develop, analyze and implement symmetry-aware machine learning models.


With a proven track record of analyzing complex data sets and developing intricate machine learning models, I am committed to helping companies advance by developing strategic plans grounded in data-driven predictive modeling.

Work Experience

QAP Associate — Fraud Modeling

Jan 2026 – Present
Python, Spark, Synthetic Data Generation, Diffusion Models, Transformers
  • Conduct literature review and research on synthetic data generation for highly imbalanced commercial fraud datasets (~0.07% fraud rate), evaluating diffusion models, transformer-based generative models, and GenAI-based augmentation strategies for downstream model training and evaluation.

QAP Associate — Commercial Stress Testing Validation

July 2025 – Jan 2026
GenAI, Python, Software Development, Statistical Testing
  • Built an end-to-end validation automation platform combining backend services, frontend GUIs, and generative AI-driven report generation to streamline model validation and documentation workflows, reducing validation cycle time by ~40%.
  • Mentored a junior team member on quantitative validation and software tooling development.
  • Statistically validated macroeconomic models using backtesting, stability testing, and residual diagnostics.

Quantitative Analytics Program (QAP) Intern — Credit Card Modeling

Summer 2024
Python, SQL, DataRobot, Machine Learning
  • Designed and executed an extensive permutation-importance-based feature selection process, reducing a high-dimensional dataset of ~2000 features to a refined subset.
  • Developed a definition-based feature clustering tool, enhancing feature diversity and interpretability.
  • Trained and evaluated 1000+ machine learning models in DataRobot to optimize selection performance.

Skills

Projects

Food Recommendation System

Python, pandas, Latent Factor Model, Collaborative Teamwork
  • Completed as part of the Fall 2023 Erdős Institute Data Science Bootcamp.
  • Implemented a recommender system that uses the Yelp Dataset to suggest new restaurants to a user based only on that user's numerical restaurant ratings.
  • Tuned the mixing hyperparameter of a mixture model that combines a latent factor recommender with a robust averaging baseline.
View on GitHub

Stomper & Wombat's Database Design

SQL, EER Diagrams, Collaborative Teamwork
  • Wrote a requirements document with clearly stated, categorized business rules.
  • Formulated a conceptual design using an EER diagram and derived a relational model in Boyce-Codd normal form (BCNF).
  • Compiled relevant external views and reports for stakeholders.
  • Implemented and tested the database on a MySQL server.
View on GitHub

Palmer Penguins Regression Analysis

R, R Markdown, ggplot, Regression Analysis, Bootstrap, Cross-Validation
  • Conducted an exploratory data analysis of the Palmer Penguins dataset.
  • Carried out a best-subsets regression to obtain a preferred model.
  • Assessed the model with cross-validation and interpreted bootstrap confidence intervals.
View on GitHub

Discriminating 7 and 9 with LDA

R, R Markdown, ggplot, Dimensionality Reduction, LDA
  • Applied principal component analysis and linear discriminant analysis to train a classifier that distinguishes images of the digits 7 and 9.
  • Selected the number of components based on three types of error rates: apparent, test, and leave-one-out.
View on GitHub

Education

The Ohio State University (OSU) | Columbus, OH

July 2020 – May 2025

PhD in Mathematics, Summa Cum Laude

MSc in Applied Statistics (Dual Degree, Graduated 2023), Summa Cum Laude

Swarthmore College | Swarthmore, PA

August 2016 – May 2020

BA in Mathematics and Computer Science, Summa Cum Laude

Research

Estimating the Euclidean distortion of an orbit space

(In collaboration with Ben Blum-Smith, Harm Derksen, Dustin G. Mixon and Brantley Vose)
Invariant Machine Learning, Bilipschitz Invariant Theory, Euclidean Contortion
  • Developed general tools for bounding the distortion of metric quotients.
  • Applied these tools to various families of quotients by groups of Euclidean isometries.
View arXiv Preprint

Group-Invariant Max Filtering

(In collaboration with my PhD advisor, Dustin G. Mixon)
MATLAB, Invariant Machine Learning, Probabilistic Methods
  • Used linear algebra and Voronoi analysis to obtain sufficient conditions for the stability and injectivity of max filtering, a recently proposed symmetry-invariant data embedding in machine learning.
  • Developed MATLAB scripts that use linear programming to compute the G-Voronoi characteristic, a newly introduced quantity associated with a representation of a finite group and its max filtering theory.
View Publication

Stable Weighted Phase Retrieval

Invariant Machine Learning, Differential & Semi-Algebraic Geometry
  • Generalized the phase retrieval problem to the weighted setting, highlighting its connections to a nearest neighbor problem in cryogenic electron microscopy (cryo-EM), a leading technique in molecular imaging.
  • Used Voronoi analysis, semi-algebraic geometry, and differential geometry to establish sufficient conditions for stability in weighted phase retrieval. This is a special case of a broader theory on the local regular stability of max filtering, which the work also develops.
View Publication

Vector-Borne Disease Spread DDE Modeling

(In collaboration with my undergraduate mentor, Nsoki Mavinga)
Delay Differential Equations, MATLAB, Linear Stability Analysis
  • Reframed a traditional model predicting the spread of infectious vector-borne diseases.
  • Employed linearization methods to prove stability conditions.
  • Developed MATLAB interactive applications to simulate numerical solutions of various delay differential equations with real-time parameter tuning.
View Publication