Mingyang Li received two Bachelor's degrees studying nanomaterials. Just when he thought he'd create his own self-healing superhero armor, he found himself in love with data crunching more than with magnetron sputtering1. Following his heart, soon after joining the Scientific Computering (SCMP) program at UPenn, he submatriculated into DATS.
I am a Research Associate at Wharton Research Data Services (WRDS), where I build pipelines to parse SEC filings (Form 10-K, etc.).
I also conduct social science research at the World Well-Being Project (WWBP), mainly focusing on comparing Chinese and American cultures using microblog data.
I am joining Google as a software engineer in late September, 2019.
Word embedding models trained on 5,000,000 Weibo posts from each year of 2012-2018. Posts are deduplicated, in Mandarin, and in Simplified Chinese. Models are 10-fold fastText per year -- thus, a total of 70 models. Request collaboration here.
Sectionalized Form 10-Ks in plain text. Each filing is split into separate items. We have the Item IDs, too. Request collaboration via email.
Shared here are my review notes and summaries for some courses I have taken. Hope you will find one helpful.
"Learn a course as if you are to teach it." -- Mingyang Li
CIS545 Big Data Analytics - Spring 2018
This was a fun course to attend.
Term Project Report: Titled "A Quantitative Study On How Emojis' Meanings Shift From West To East," this report innovatively used radar chart to compare how each emoji projects (as in linear algrebra) to five of human's basic emotions (see Ekman’s List of Basic Emotions, 1972) across English-speaking American users on Twitter and Mandarin-speaking Chinese users on Weibo.
Lambda Ratio In ANOVA: A hand-written page detailing how we came to the F distribution deriving from the lambda ratio for the ANOVA example. This is perhaps the most general case we could consider for normally distributed random variables.
Lambda Ratio In Regression: A hand-written page detailing how we came to the T distribution deriving from the lambda ratio for the linear regression example.
CIS550 Database - Fall 2017
"A SQL query goes into a bar, walks up to two tables and asks, 'can I join you?'" -- Anonymous
How To Properly Answer Questions: Lacking formal former education in computer sicence, I found myself in need of explicitly writing down how I should phrase myself for answering many questions. Source.