Ming Xia

I'm a Data Scientist with a strong foundation in Computer Science and a passion for building intelligent, data-driven solutions. My work spans a wide range of technical domains—from deep learning for medical imaging and mortgage risk prediction to interactive data visualization and full-stack development. I specialize in applying machine learning models to real-world challenges with an emphasis on both predictive accuracy and model interpretability. With hands-on experience in tools like Python, SQL, TensorFlow, Tableau, and cloud platforms such as AWS and GCP, I enjoy transforming complex datasets into actionable insights. Outside of my professional work, I lead several community clubs focused on technology, tea culture, and automotive adventures, where I build lifelong relationships and foster opportunities for growth and collaboration.

Experience

Data Science Intern

EchoPlus LLC
  • Integrated six large datasets from Yelp, covering 1.32M users and 5.26M reviews, using Python and advanced text mining techniques.
  • Performed sentiment analysis and vectorization to uncover insights affecting elite user retention; designed 3D root cause analysis reports.
  • Delivered actionable insights to boost business growth and improve retention strategies.
  • Analyzed behavioral patterns of 200K+ news app users for click prediction and resolved cold-start issues, boosting CTR by 20%.
  • Led A/B testing on mobile gaming data (90K+ players), optimizing gate placements and improving retention through statistical analysis and predictive modeling.
Jan 2025 – Feb 2025

Founder & Development Manager

Ling Qi Technology
  • Led a 20-engineer team to build a cutting-edge facial recognition system with 15% faster recognition than industry benchmarks.
  • Bridged technical and client-facing roles, ensuring delivery of tailored machine learning solutions across various industries.
  • Developed responsive websites and WeChat mini-programs in PHP/HTML, reducing operational costs and improving UX.
  • Maintained scalable systems serving over 10,000 active users.
Oct 2016 – Dec 2024

Founder & Sales Manager

BBW Baby Store
  • Grew retail operations from 1 to 3 locations, serving over 1,000 families annually with specialized baby products.
  • Raised $1 million through new business strategies and expansion initiatives.
Oct 2017 – Dec 2024

Founder

Ling Qi Construction
  • Led commercial construction projects for major clients including China Forestry Group and China Energy Engineering Group.
  • Participated in infrastructure development at Yangzhou Airport Plant, contributing to regional logistics expansion.
Oct 2020 – Dec 2024

Education

University of Pennsylvania

Master of Science in Engineering
Artificial Intelligence

Gained a robust foundation in machine learning, deep learning, computer vision, and data-driven systems through a curriculum jointly administered by the Departments of Electrical and Systems Engineering (ESE) and Computer and Information Science (CIS). Developed skills to build and optimize intelligent systems that address real-world engineering challenges, with hands-on experience in algorithm design, big data processing, and ethical AI deployment.

May 2027

Harvard University

Master of Liberal Arts
Data Science

Achieved a cumulative GPA of 3.88/4.00 and was listed on the Dean’s List for Academic Achievement. Coursework included Database Management, Data Visualization, Machine Learning, Data Mining, and Statistical Analysis, which gave me advanced analytical skills and the technical proficiency to manipulate and interpret complex datasets.

Dec 2024

Washington State University

Bachelor of Science
Computer Science

May 2022

Skills

Programming Languages & Tools
  • Python (Pandas/Polars, Scikit-Learn, NumPy, TensorFlow, Flask, Django, Scrapy)
  • C, Java
  • JavaScript, HTML
  • SQL, R
  • Swift, Android
Data Visualization & Business Intelligence
  • Tableau, Power BI, PyGWalker
  • Matplotlib, Seaborn
Machine Learning & Data Science
  • Supervised: Linear/Multiple Linear/Logistic Regression, SVM, Decision Trees, Random Forests, XGBoost
  • Unsupervised: K-Means Clustering, PCA
  • Deep Learning: CNN, RNN, GANs
  • Reinforcement Learning: Q-Learning, SARSA, DQN
  • Natural Language Processing: BERT, GPT
  • A/B Testing, ELT (Extract, Load, Transform)
Cloud Platforms
  • AWS (SageMaker)
  • Google Cloud Platform
Database Technologies
  • SQL, SQLite, MySQL
  • Apache Spark
Microsoft Office
  • Word, PowerPoint, Excel (including VBA)

Interests

Outside of my professional work in data science and technology, I’m passionate about automotive culture, community leadership, and cultural exploration. As the Vice President of the Mercedes Car Club and Sichuan-Tibet Line Club in Yangzhou, I’ve organized road trips across China and created meaningful connections among car enthusiasts and local businesses.

I’m also a green tea enthusiast and a member of the Tea Club in my hometown of Yangzhou, Jiangsu, China, where I enjoy curating and sharing global tea experiences. I value building lifelong relationships and creating opportunities for collaboration through shared interests.

In my free time, I explore business opportunities, travel for inspiration, and enjoy learning about the intersection of technology, culture, and community development.

Projects

Machine Learning Projects

Health and Medical AI
  • Autism Prediction
  • Disease Prediction
  • Heart Disease Prediction
  • Lung Cancer Detection
  • Parkinson’s Disease Prediction
  • FaceMask Detection Using TensorFlow
  • Detection of COVID-19 from X-Ray Images
  • Cancer Cell Classification Using Scikit-learn
Finance and Risk Modeling
  • Sales Forecast Prediction
  • Loan Eligibility Prediction
  • Credit Card Fraud Detection
  • Medical Insurance Price Prediction
  • Analyzing Selling Price of Used Cars
  • Mortgage Risk Assessment and Climate Change
  • Stock Market Prediction Using Machine Learning
Real Estate and Urban Analytics
  • Seattle Housing Price Prediction
  • Weather Forecasting with Machine Learning
Security, Compliance and Authentication
  • Fake News Classification
  • License Plate Recognition with OpenCV and Tesseract OCR
NLP, Social Media and Language
  • Twitter Sentiment Analysis
  • Language Detection
Computer Vision and Image Recognition
  • Clothing Classifier
  • Gesture Recognition
  • Food Image Classification
  • Visual Question Answering
  • Facial Emotion Recognition
  • Pothole Detection Using YOLO
  • Handwritten Digit Classification
Other
  • The Predictive Summit

Data Science and Analysis

  • Optimizing Product Recommendations Through A/B Testing
  • A Paradigm Shift in Purchase and Redemption Predictions
  • Unleashing the Potential of User History for Click Prediction
  • A Comprehensive Data Analysis of Yelp Dataset
  • Seattle Housing Price Prediction
  • Movies Review Scraping And Analysis
  • YouTube Channel Videos Web Scraping
  • Market Basket Analysis
  • Uber Trips Data Analysis
  • World Happiness Report Analysis and Visualization
  • Quote Scraping
  • News Scraping and Analysis
  • Real-time Share Price Scraping and Analysis

SQL

  • Fraud Detection with SQL
  • Exploring Squirrel Census Data
  • Designing and Creating a Database
  • Analyzing CIA Factbook Data Using SQL
  • Customers and Products Analysis Using SQL
  • SQL Window Functions for Northwind Traders

Python

OS Projects

  • Mini Unix

Mobile Development

  • Crazy Alarm
  • Othello Game
  • Lazy Calculator
  • Calendar Note Share
  • Astronomy Picture of the Day App

Personal Projects

  • DeepSeek-ChatBot
  • YouTube Downloader
  • EEG-based Machine Learning Methods in Seizure Prediction

Contact

Mail Address

PO Box 55211, Portland, OR 97220