About Me

I am currently a postdoctoral researcher in the Department of Computer and Information Science at the University of Pennsylvania, where I have been supervised by Prof. Mayur Naik since September 2021. Previously, I was a PhD student in the same department, working with Prof. Susan Davidson from August 2016 to August 2021. I received my bachelor's degree from Tsinghua University in 2016. My research interests lie at the intersection of data science, data management, and machine learning.

[CV]

Education

Ph.D.

University of Pennsylvania

Department of Computer and Information Science

August, 2016 - August, 2021

Bachelor

Tsinghua University

Department of Automation

August, 2012 - July, 2016

Experience

Publications

  • Do Machine Learning Models Learn Statistical Rules Inferred from Data?
    Aaditya Naik, Yinjun Wu, Mayur Naik, Eric Wong
    ICML 2023 [Code][Paper]
  • Learning to Select Pivotal Samples for Meta Re-weighting
    Yinjun Wu, Adam Stein, Jacob Gardner, Mayur Naik
    AAAI 2023 (oral) [Code][Paper][Slides]
  • CHEF: A Cheap and Fast Pipeline for Iteratively Cleaning Label Uncertainties
    Yinjun Wu, James Weimer, Susan B. Davidson [Technical report][Code]
    In Proceedings of the VLDB Endowment 14, no. 11 (2021): 2410-2418.
  • Dynamic Gaussian Mixture based Deep Generative Model For Robust Forecasting on Sparse Multivariate Time Series
    Yinjun Wu, Jingchao Ni, Wei Cheng, Bo Zong, Dongjin Song, Zhengzhang Chen, Yanchi Liu, Xuchao Zhang, Haifeng Chen, Susan B. Davidson [Paper][Full version][Code]
    In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 1, pp. 651-659. 2021
  • DeltaGrad: Rapid retraining of machine learning models
    Yinjun Wu, Edgar Dobriban, Susan B. Davidson [Paper][Slides][Code]
    In International Conference on Machine Learning (ICML), pp. 10355-10366. PMLR, 2020.
  • Lessons learned from the early performance evaluation of Intel Optane DC Persistent Memory in DBMS
    Yinjun Wu, Kwanghyun Park, Rathijit Sen, Brian Kroth, Jaeyoung Do [Paper][Technical report]
    In Proceedings of the 16th International Workshop on Data Management on New Hardware, pp. 1-3. 2020.
  • PrIU: A provenance-based approach for incrementally updating regression models
    Yinjun Wu, Val Tannen, Susan B. Davidson [Paper][Slides]
    In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, pp. 447-462. 2020.
  • ProvCite: Provenance-based Data Citation [Paper][Slides]
    Yinjun Wu, Abdussalam Alawini, Daniel Deutch, Tova Milo, Susan B. Davidson
    In Proceedings of the VLDB Endowment (2019), 12(7)
  • Data Citation: Giving Credit where Credit is Due [Paper][Slides]
    Yinjun Wu, Abdussalam Alawini, Susan B. Davidson, Gianmaria Silvello
    In Proceedings of the 2018 International Conference on Management of Data (SIGMOD conference), pp. 99-114. ACM, 2018.
  • Data Citation: A New Provenance Challenge [Paper]
    Abdussalam Alawini, Susan B. Davidson, Gianmaria Silvello, Val Tannen, Yinjun Wu (authors sorted alphabetically)
    IEEE Data Eng. Bull. 41(1): 27-38 (2018)
  • Automating Data Citation in CiteDB [Paper]
    Abdussalam Alawini, Susan Davidson, Wei Hu, Yinjun Wu (authors sorted alphabetically)
    Proceedings of the VLDB Endowment 10.12 (2017): 1881-1884.
  • BAH: A Bitmap Index Compression Algorithm for Fast Data Retrieval [Paper]
    Chenxing Li, Zhen Chen, Wenxun Zheng, Yinjun Wu, Junwei Cao
    Local Computer Networks (LCN), 2016 IEEE 41st Conference on, 697-705
  • CAMP: A New Bitmap Index for Data Retrieval in Traffic Archival [Paper]
    Yinjun Wu, Zhen Chen, Junwei Cao, Haoxun Li, Chenxing Li, Yijie Wang, Wenxun Zheng, Jiahui Chang, Jing Zhou, Ziwei Hu, Jinghong Guo
    IEEE Communications Letters 20 (6), 1128-1131
  • Combat: A new bitmap index coding algorithm for big data [Paper]
    Yinjun Wu, Zhen Chen, Yuhao Wen, Wenxun Zheng, Junwei Cao
    Tsinghua Science and Technology 21 (2), 136-145
  • A general analytical model for spatial and temporal performance of bitmap index compression algorithms in Big Data [Paper]
    Yinjun Wu, Zhen Chen, Yuhao Wen, Junwei Cao, Wenxun Zheng, Ge Ma
    Computer Communication and Networks (ICCCN), 2015 24th International Conference on
  • A survey of bitmap index compression algorithms for big data [Paper]
    Zhen Chen, Yuhao Wen, Junwei Cao, Wenxun Zheng, Jiahui Chang, Yinjun Wu, Ge Ma, Mourad Hakmaoui, Guodong Peng
    Tsinghua Science and Technology 20 (1), 100-115

My Projects

Teaching