I'm passionate about using data to drive impact. After earning my degree in Data Science and Statistics from Miami University, I’ve gained hands-on experience across research, industry, and higher ed, from supporting research studies to analyzing clinical trial data. Now, as a Data Analyst at the University of Toledo, I build and maintain Power BI reports and modernize legacy systems using Oracle SQL. Whether it’s education, healthcare, or beyond, I’m always excited to explore new challenges where data can make a difference.
➤ Developed 50+ Power BI reports to monitor financial transactions &
budgets, student registration and performance, etc.
➤ Automated the university’s reporting pipelines and migrated 40+ legacy
SAP Crystal Reports using Oracle SQL and Microsoft Power BI
Report Builder, eliminating manual SQL queries and data processing.
➤ Rebuilt a system of Power BI reports for the Health Science College to
automate input filtering, eliminating manual .NET coding requirements.
➤ Developed advanced SQL queries in Snowflake & integrated them
into Power BI dashboards to automate analysis of enrollment
patterns, streamlining the production of study performance deliverables.
➤ Developed interactive visuals in R and RShiny for analysis of
sponsors’ proposals, offering insights for proposal strategies.
➤ Designed and built a Power BI app for bid defenses, backed by an
intensive ETL process in Snowflake SQL.
➤ Conducted in-depth analysis of sponsors’ proposals (enrollment
modeling, feasibility analysis, site & country selection, etc.)
for proposal strategies, offering insights into site recommendations,
global study locations, & potential competition.
➤ Conducted sentiment analysis on customer interactions, identifying
sentiment distribution & customer segmentation to enhance satisfaction.
➤ Consolidated data from billing, claims, etc., using Tableau
to detect revenue loss, cost drivers, & profitability trends.
➤ Used R, SQL, and analysis techniques (diagnostic, time series,
etc.) to track agent performance & report KPIs, enhancing team
responsiveness and optimizing business processes by 20%.
➤ Conducted analysis on diverse datasets (Neuronal Phases, Linguistics,
Chronic Cough Study, etc.) for over 50 clients’ research papers and
theses by processing data and applying complex statistical methods
(Fisher’s Exact, ANOVA, etc.).
➤ Co-authored three research studies: on the effectiveness of a
“Minimize Risk, Maximize Life” course administered by Envision
Partnerships, on ageism among Korean dental hygiene students, and on
the effects of posture among students.
Performed data cleaning, feature engineering, and exploratory analysis on hospital EHR data to support stroke risk prediction. Evaluated multiple models using cross-validation and hyperparameter tuning to improve performance, and analyzed key predictors to identify high-risk patient groups and support clinical decision-making.
Conducted data preprocessing and feature engineering, then evaluated multiple models (logistic regression, elastic nets, random forests, and neural networks) using cross-validation and hyperparameter tuning to predict hospital readmission. Assessed variable importance to gain insights and develop strategies for reducing readmission rates.
This project identifies trends and assesses customer satisfaction in the legal services industry using the ABA's raw data. The research explores customer demographics, question categories, sentiment analysis of conversations, and the determinants of commonly asked questions. Machine learning techniques, such as elastic net, random forests, and neural networks, are employed to develop a robust classification model that predicts the most popular question categories based on relevant predictor variables.
This application analyzes trends in education over the years across the 50 US states. The app was built with RShiny. A full user guide is available on GitHub.
This program trains a model that predicts Spotify popularity scores for 70K tracks using 23 known predictor variables from raw RData. A full user guide is available on GitHub.
This application processes 14 million movie tags and 27 million user ratings associated with 58,000 movies, then suggests a user-specified number of movies for a chosen genre according to the user’s preferences (rating scores, release year). A user guide is available on GitHub.
Kwon, J. H., Hughes, M., Vo, A., Park, J.-R., Kim, D. R., & Kang, K.-H. (2025). Are you anxious about aging? Unpacking the roots of ageism among Korean dental hygiene students. Educational Gerontology, 1–11. https://doi.org/10.1080/03601277.2025.2500724
Orwig, W., Bellaiche, L., Spooner, S., Vo, A., Baig, Z., Ragnhildstveit, A., Schacter, D. L., Barr, N., & Seli, P. (2024). Using AI to generate visual art: Do individual differences in creativity predict AI-assisted art quality? Creativity Research Journal, 1–12. https://doi.org/10.1080/10400419.2024.2440691
Kwon, J., Hughes, M., & Vo, A. (2023). Association between anxiety about aging and ageism toward older adults among Korean dental hygiene workforce. Innovation in Aging, 7(Supplement_1), 835–836. https://doi.org/10.1093/geroni/igad104.2694
Nguyen, M., Vu, D., Vo, A., Liang, L., & Giabbanelli, P. J. (2023). Preserving simulation insight while removing data: Verification of compressed simulation traces via machine learning. Annual Modeling and Simulation Conference (ANNSIM), Hamilton, ON, Canada, 345–356.
Gnanabharathi, B., Fahoum, S.-R. H., & Blitz, D. M. (2024, June 27). Neuropeptide modulation enables biphasic internetwork coordination via a dual-network neuron. eNeuro. https://pmc.ncbi.nlm.nih.gov/articles/PMC11211724/
Simonsen, R. (2023). Exploring similarities in 'seem' constructions with experiencers in English and Spanish. Language, 99(1), e18–e34.