Yousef Qaddura

Math PhD Candidate & Data Scientist

Statement

A highly motivated applied mathematician extensively trained in statistics and data science. In my PhD research, I used geometry in representation theory to develop, analyze and implement symmetry-aware machine learning models.


With a proven track record of analyzing complex data sets and developing intricate machine learning models, I am committed to helping companies advance by developing strategic plans grounded in data-driven predictive modeling.

Work Experience

Quantitative Analytics Intern - Summer 2024
Quantitative Analytics Specialist - Starting July 2025

Skills

Projects

Food Recommendation System

Python, pandas, Latent Factor Model, Collaborative Teamwork
  • Completed as part of the Fall 2023 Erdős Institute Data Science Bootcamp.
  • Implemented a recommender system that uses the Yelp Dataset to suggest new restaurants to a user based solely on their numerical restaurant ratings.
  • Tuned the mixing hyperparameter of a model that blends a latent factor recommender with a robust averaging baseline.
View on GitHub

Palmer Penguins Regression Analysis

R, R Markdown, ggplot, Regression Analysis, Bootstrap, Cross-Validation
  • Conducted an exploratory data analysis of the Palmer Penguins dataset.
  • Carried out a best-subsets regression to obtain a preferred model.
  • Assessed the model with cross-validation and interpreted bootstrap confidence intervals.
View on GitHub

Stomper & Wombat's Database Design

SQL, EER Diagrams, Collaborative Teamwork
  • Wrote a requirements document with clearly stated, categorized business rules.
  • Formulated a conceptual design using an EER diagram and derived a relational model in Boyce-Codd normal form (BCNF).
  • Compiled relevant external views and reports for stakeholders.
  • Implemented and tested the database in MySQL.
View on GitHub

Discriminating 7 and 9 with LDA

R, R Markdown, ggplot, Dimensionality Reduction, LDA
  • Applied principal component analysis and linear discriminant analysis to train a classifier that distinguishes images of the digits 7 and 9.
  • Selected the number of components based on three types of error rates: apparent, test, and leave-one-out.
View on GitHub

Education

The Ohio State University (OSU) | Columbus, OH

July 2020 - May 2025

PhD in Mathematics, Summa Cum Laude

MSc in Applied Statistics (Dual Degree, Graduated 2023), Summa Cum Laude

Swarthmore College | Swarthmore, PA

August 2016 - May 2020

BA in Mathematics and Computer Science, Summa Cum Laude

Research

Group-Invariant Max Filtering

(In collaboration with my PhD advisor, Dustin G. Mixon)
MATLAB, Invariant Machine Learning, Probabilistic Methods
  • Used linear algebra and Voronoi analysis to obtain sufficient conditions for the stability and injectivity of max filtering, a recently proposed symmetry-invariant data embedding in machine learning.
  • Developed MATLAB scripts that use linear programming to compute the G-Voronoi characteristic, a newly introduced quantity associated with a representation of a finite group and its max filtering theory.
View arXiv Preprint

Stable Weighted Phase Retrieval

Invariant Machine Learning, Differential & Semi-Algebraic Geometry
  • Generalized the phase retrieval problem to the weighted setting, highlighting its connections to a nearest neighbor problem in cryogenic electron microscopy (cryo-EM), a leading technique in molecular imaging.
  • Used Voronoi analysis, semi-algebraic geometry, and differential geometry to establish sufficient conditions for stability in weighted phase retrieval. This result is a special case of a broader theory on the local regular stability of max filtering, which the work also addresses.
View arXiv Preprint

Vector-Borne Disease Spread DDE Modeling

(In collaboration with my undergraduate mentor, Nsoki Mavinga)
Delay Differential Equations, MATLAB, Linear Stability Analysis
  • Reframed a traditional model for predicting the spread of infectious, vector-borne diseases.
  • Employed linearization methods to prove stability conditions.
  • Developed interactive MATLAB applications that simulate numerical solutions of various delay differential equations with real-time parameter tuning.
View Publication