Dongkuan (DK) Xu / 胥栋宽
Hello! I am an Assistant Professor at NC State CS, leading the NCSU Reliable & Efficient Computing Lab (web under construction) and working on deep learning, machine learning, and natural language processing. I received my Ph.D. at Penn State, where I was advised by Xiang Zhang. I received my M.S. in Optimization and B.E. at the University of Chinese Academy of Sciences and Renmin University of China, respectively.
I have been collaborating with Microsoft Research on neural architecture search (NAS) and hyperparameter optimization (HPO) for Foundation Models, and with Google Research on scalable and adaptive learning for Vision-Language Models. I was a research scientist at Moffett AI, investigating low-resource model compression. I also spent some wonderful time at NEC Labs America working on contrastive learning and multi-task learning.
Outside of work, I am a big fan of American football. I love the Nittany Lions, New York Giants, and Dallas Cowboys. I also enjoy working out and playing soccer.
Email / CV (Oct 2022) / Twitter / Google Scholar / LinkedIn
I'm actively looking for multiple PhD students / interns to work on Reliable & Efficient AI (一亩三分地 I, 一亩三分地 II). Feel free to send me your CV. Once we have committed to each other, trust me, I will do my best to help you!
(I've received an amazingly large number of applications. Super thanks for everyone's interest! Interviews are in progress. Good luck~)   
|
|
Research
I'm interested in reliable, efficient, and landable deep learning for AI at scale, investigating how to achieve Pareto optimality between decision reliability (uncertainty, robustness, adaptability), computational resources (data, parameters, computation), and model performance (inference, training) of deep learning systems. My long-term research goal is to free AI from the data-parameter-computation hungry beast and democratize AI to serve a broader range of populations and real-world domains.
Reliable & Scalable Deep Learning with Theoretical Guarantees
Efficient Large-scale Training & Inference Algorithms
Algorithm-hardware Co-design for AI Acceleration
Applications: Natural Language Processing, Computer Vision, Sciences
|
Call for Papers
The First Workshop on Deep Learning-Hardware Co-Design for AI Acceleration (link), to be held at AAAI’23 (Workshop Chair)
Welcome topics include: model compression, deep learning acceleration, hardware-software co-design, applications, etc.
The workshop is non-archival and permits under-review or concurrent submissions. Best Paper Awards will be selected.
Submission Deadline: November 30th, 2022 (link)
|
News
|
|
Accelerating Dataset Distillation via Model Augmentation
L. Zhang*, J. Zhang*, B. Lei, S. Mukherjee, X. Pan, B. Zhao, C. Ding, Y. Li, D. Xu
[CVPR 2023] The IEEE/CVF Conference on Computer Vision and Pattern Recognition
Highlight Paper (2.5%)
PDF
We propose two model augmentation techniques, i.e., using early-stage models and weight perturbation, to learn an informative synthetic set at significantly reduced training cost. Extensive experiments demonstrate that our method achieves up to a 20× speedup.
|
|
You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model
S. Tang, Y. Wang, Z. Kong, T. Zhang, Y. Li, C. Ding, Y. Wang, Y. Liang, D. Xu
[CVPR 2023] The IEEE/CVF Conference on Computer Vision and Pattern Recognition
PDF
We propose a novel early exiting strategy based on cascading input similarity, with validated assumptions about saturation states in vision-language models; this is a pioneering exploration of extending early exit selection to both the encoders and decoders of sequence-to-sequence architectures.
|
|
Calibrating the Rigged Lottery: Making All Tickets Reliable
B. Lei, R. Zhang, D. Xu, B. K. Mallick
[ICLR 2023] The 11th International Conference on Learning Representations
PDF
We identify and study, for the first time, the reliability problem of sparse training and find that sparse training exacerbates the over-confidence problem of DNNs. We then develop a new sparse training method, CigL, to produce more reliable sparse models, which can simultaneously maintain or even improve accuracy with only a slight increase in computational and storage burden.
|
|
Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off
S. Huang, B. Lei, D. Xu, H. Peng, Y. Sun, M. Xie, C. Ding
[DAC 2023] The 60th Design Automation Conference
PDF
To support explainable sparse training, we propose exploitation of important weights and exploration of weight coverage to characterize sparse training. Our method does not need to train dense models, achieving up to a 95% sparsity ratio and even higher accuracy than dense training with the same number of iterations.
|
|
Neurogenesis Dynamics-inspired Spiking Neural Network Training Acceleration
S. Huang, H. Fang, K. Mahmood, B. Lei, N. Xu, B. Lei, Y. Sun, D. Xu, W. Wen, C. Ding
[DAC 2023] The 60th Design Automation Conference
We propose an energy-efficient spiking neural network training workflow and design a new drop-and-grow strategy that decreases the number of non-zero weights while dynamically updating the sparse mask. We demonstrate strong model performance at extremely high sparsity (i.e., 99%) on SNN-based vision tasks.
|
|
Efficient Informed Proposals for Discrete Distributions via Newton’s Series Approximation
Y. Xiang*, D. Zhu*, B. Lei, D. Xu, R. Zhang
[AISTATS 2023] The 26th International Conference on Artificial Intelligence and Statistics
PDF
We develop a gradient-like proposal for any discrete distribution without requiring a differentiable target. Built upon a locally balanced proposal, our method efficiently approximates the discrete likelihood ratio via a Newton’s series expansion to enable large and efficient exploration in discrete spaces.
|
|
Balance is Essence: Accelerating Sparse Training via Adaptive Gradient Correction
B. Lei, D. Xu, R. Zhang, S. He, B. K. Mallick
arXiv preprint arXiv:2301.03573 (2023)
PDF
We develop an adaptive gradient correction method to accelerate and stabilize the convergence of sparse training. Theoretically, we prove that our method can accelerate the convergence rate of sparse training.
|
|
A Survey for Efficient Open Domain Question Answering
Q. Zhang, S. Chen, D. Xu, Q. Cao, X. Chen, T. Cohn, M. Fang
arXiv preprint arXiv:2211.07886 (2023)
PDF
We provide a survey of recent advances in the efficiency of ODQA models, walk through the ODQA models, and summarize the core techniques for efficiency. We also give a quantitative analysis of memory cost, processing speed, and accuracy, along with an overall comparison.
|
|
Improving Long-tailed Classification by Disentangled Variance Transfer
Y. Tian, W. Gao, Q. Zhang, P. Sun, D. Xu
Internet of Things
PDF
We propose a class-based covariance transfer method, from a disentanglement perspective, to transfer covariance information in long-tailed classification tasks.
|
|
Auto-CAM: Label-Free Earth Observation Imagery Composition and Masking Using Spatio-Temporal Dynamics
Y. Xie, Z. Li, H. Bao, X. Jia, D. Xu, X. Zhou, S. Skakun
[AAAI 2023] The 37th AAAI Conference on Artificial Intelligence
PDF / Code / Supp
We propose an autonomous image composition and masking method for cloud masking, a fundamental task in Earth observation problems across social sectors such as agriculture, energy, and water.
|
|
Time Series Contrastive Learning with Information-Aware Augmentations
D. Luo, W. Cheng, Y. Wang, D. Xu, J. Ni, W. Yu, X. Zhang, Y. Liu, Y. Chen, H. Chen, X. Zhang
[AAAI 2023] The 37th AAAI Conference on Artificial Intelligence
PDF / Code / Supp
We propose an adaptive data augmentation method to avoid ad-hoc choices or painstakingly trial-and-error tuning for time series representation learning.
|
|
AutoDistil: Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models
D. Xu, S. Mukherjee, X. Liu, D. Dey, W. Wang, X. Zhang, A. H. Awadallah, J. Gao
[NeurIPS 2022] The 36th Conference on Neural Information Processing Systems
PDF / Code / Supp / Slides
We develop a few-shot task-agnostic NAS framework, AutoDistil, for distilling large language models into compressed students with variable computational cost. AutoDistil outperforms leading baselines with up to 3x additional reduction in computational cost and negligible loss in task performance.
|
|
S4: a High-sparsity, High-performance AI Accelerator
I. E. Yen, Z. Xiao, D. Xu
[SNN 2022] Sparsity in Neural Networks 2022 Workshop
PDF / Code / Supp / Slides
We introduce S4, the first commercial hardware platform supporting high-degree sparsity acceleration of up to 32 times. S4 provides a (sparse) equivalent computation power of 944 TOPS in INT8 and 472 TFLOPS in BF16, and has 20 GB of LPDDR4 memory with up to 72 GB/s memory bandwidth in a low 70-watt power envelope. We demonstrate a several-fold practical inference speedup on S4 over mainstream inference platforms such as the Nvidia T4.
|
|
An Automatic and Efficient BERT Pruning for Edge AI Systems
S. Huang, N. Liu, Y. Liang, H. Peng, H. Li, D. Xu, M. Xie, C. Ding
[ISQED 2022] The 23rd International Symposium on Quality Electronic Design
Video / PDF / Code / Supp / Slides
We propose AE-BERT, an automatic and efficient pruning framework. On a Xilinx Alveo U200 FPGA board, AE-BERT achieves a single BERT-BASE encoder inference time that is 1.83x faster than an Intel(R) Xeon(R) Gold 5218 (2.30GHz) CPU.
|
|
Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm
S. Huang*, D. Xu*, I. E. Yen, S. Chang, B. Li, C. Ding, et al.
[ACL 2022] The 60th Annual Meeting of the Association for Computational Linguistics
PDF / Code / Supp / Slides
We study network pruning of Transformer-based language models under the pre-training and fine-tuning paradigm and propose a counter-traditional hypothesis that pruning increases the risk of overfitting when performed during the fine-tuning phase.
|
|
InfoGCL: Information-Aware Graph Contrastive Learning
D. Xu, W. Cheng, D. Luo, H. Chen, X. Zhang
[NeurIPS 2021] The 35th Conference on Neural Information Processing Systems
PDF / Code / Supp / Slides
We propose an information-aware contrastive learning framework for graph-structured data, and show for the first time that all recent graph contrastive learning methods can be unified by our framework.
|
|
(SparseBERT) Rethinking Network Pruning - under the Pre-train and Fine-tune Paradigm
Dongkuan Xu, Ian En-Hsu Yen, Jinxi Zhao, Zhibin Xiao
[NAACL-HLT 2021] 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics
PDF / Code / Supp / Slides
We study how knowledge is transferred and lost during the pre-train, fine-tune, and pruning process, and propose a knowledge-aware sparse pruning process that achieves significantly better results than the existing literature.
|
|
Data Augmentation with Adversarial Training for Cross-Lingual NLI
Xin Dong, Yaxin Zhu, Zuohui Fu, Dongkuan Xu, Gerard de Melo
[ACL 2021] The 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing
PDF / Code / Supp / Slides
We study data augmentation for cross-lingual natural language inference and propose two methods of training a generative model to induce synthesized examples to reflect more diversity in a semantically faithful way.
|
|
Deep Multi-Instance Contrastive Learning with Dual Attention for Anomaly Precursor Detection
Dongkuan Xu, Wei Cheng, Jingchao Ni, Dongsheng Luo, Masanao Natsumeda, Dongjin Song, Bo Zong, Haifeng Chen, Xiang Zhang
[SDM 2021] The 21st SIAM International Conference on Data Mining
PDF / Code / Supp / Slides
We utilize multi-instance learning to model the uncertainty of the precursor period, and design a contrastive loss to address the issue that annotated anomalies are few.
|
|
Multi-Task Recurrent Modular Networks
Dongkuan Xu, Wei Cheng, Xin Dong, Bo Zong, Wenchao Yu, Jingchao Ni, Dongjin Song, Xuchao Zhang, Haifeng Chen, Xiang Zhang
[AAAI 2021] The 35th AAAI Conference on Artificial Intelligence
PDF / Code / Supp / Slides
We propose MT-RMN to dynamically learn task relationships and accordingly learn to assemble composable modules into complex layouts to jointly solve multiple sequence processing tasks.
|
|
Transformer-Style Relational Reasoning with Dynamic Memory Updating for Temporal Network Modeling
Dongkuan Xu, Junjie Liang, Wei Cheng, Hua Wei, Haifeng Chen, Xiang Zhang
[AAAI 2021] The 35th AAAI Conference on Artificial Intelligence
PDF / Code / Supp / Slides
We propose TRRN to model temporal networks by employing transformer-style self-attention to reason over a set of memories.
|
|
How Do We Move: Modeling Human Movement with System Dynamics
Hua Wei, Dongkuan Xu, Junjie Liang, Zhenhui Li
[AAAI 2021] The 35th AAAI Conference on Artificial Intelligence
PDF / Code / Supp / Slides
We propose MoveSD to model state transition in human movement from a novel perspective, by learning the decision model and integrating the system dynamics.
|
|
Longitudinal Deep Kernel Gaussian Process Regression
Junjie Liang, Yanting Wu, Dongkuan Xu, Vasant Honavar
[AAAI 2021] The 35th AAAI Conference on Artificial Intelligence
PDF / Code / Supp / Slides
We introduce longitudinal deep kernel Gaussian process regression to fully automate the discovery of complex multi-level correlation structure from longitudinal data.
|
|
Parameterized Explainer for Graph Neural Network
Dongsheng Luo, Wei Cheng, Dongkuan Xu, Wenchao Yu, Bo Zong, Haifeng Chen, Xiang Zhang
[NeurIPS 2020] The 34th Conference on Neural Information Processing Systems
PDF / Code / Supp / Slides
We propose to adopt deep neural networks to parameterize the generation process of explanations, which enables a natural approach to multi-instance explanations.
|
|
Leveraging Adversarial Training in Self-Learning for Cross-Lingual Text Classification
Xin Dong, Yaxin Zhu, Yupeng Zhang, Zuohui Fu, Dongkuan Xu, Sen Yang, Gerard de Melo
[SIGIR 2020] The 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
PDF / Code / Supp / Slides
We propose a semi-supervised adversarial perturbation framework that encourages the model to be more robust towards such divergence and better adapt to the target language.
|
|
Tensorized LSTM with Adaptive Shared Memory for Learning Trends in Multivariate Time Series
Dongkuan Xu, Wei Cheng, Bo Zong, Dongjin Song, Jingchao Ni, Wenchao Yu, Yanchi Liu, Haifeng Chen, Xiang Zhang
[AAAI 2020] The 34th AAAI Conference on Artificial Intelligence
PDF / Code / Poster / Slides
We propose a deep architecture for learning trends in multivariate time series, which jointly learns both local and global contextual features for predicting the trend of time series.
|
|
Longitudinal Multi-Level Factorization Machines
Junjie Liang, Dongkuan Xu, Yiwei Sun, Vasant Honavar
[AAAI 2020] The 34th AAAI Conference on Artificial Intelligence
PDF / Code / Supp
We propose longitudinal multi-level factorization machines, to the best of our knowledge the first model to address these challenges in learning predictive models from longitudinal data.
|
|
Adaptive Neural Network for Node Classification in Dynamic Networks
Dongkuan Xu, Wei Cheng, Dongsheng Luo, Yameng Gu, Xiao Liu, Jingchao Ni, Bo Zong, Haifeng Chen, Xiang Zhang
[ICDM 2019] The 19th IEEE International Conference on Data Mining
PDF / Slides
We propose an adaptive neural network for node classification in dynamic networks, which is able to consider the evolution of both node attributes and network topology.
|
|
Spatio-Temporal Attentive RNN for Node Classification in Temporal Attributed Graphs
Dongkuan Xu, Wei Cheng, Dongsheng Luo, Xiao Liu, Xiang Zhang
[IJCAI 2019] The 28th International Joint Conference on Artificial Intelligence
PDF / Code / Poster / Slides
We propose a spatio-temporal attentive RNN model, which aims to learn node representations for classification by jointly considering both the temporal and spatial patterns of the node.
|
|
Deep Co-Clustering
Dongkuan Xu, Wei Cheng, Dongsheng Luo, Xiao Liu, Xiang Zhang
[SDM 2019] The 19th SIAM International Conference on Data Mining
PDF / Code / Supp / Poster / Slides
DeepCC utilizes the deep autoencoder for dimension reduction, and employs a variant of Gaussian mixture model to infer the cluster assignments. A mutual information loss is proposed to bridge the training of instances and features.
|
|
Co-Regularized Deep Multi-Network Embedding
Jingchao Ni, Shiyu Chang, Xiao Liu, Wei Cheng, Haifeng Chen, Dongkuan Xu and Xiang Zhang
[WWW 2018] The 27th International Conference on World Wide Web
PDF / Code
DMNE coordinates multiple neural networks (one for each input network data) with a co-regularized loss function to manipulate cross-network relationships, which can be many-to-many, weighted and incomplete.
|
|
Multiple Instance Learning Based on Positive Instance Graph
Dongkuan Xu, Wei Zhang, Jia Wu, Yingjie Tian, Qin Zhang, Xindong Wu
arXiv preprint
Most multi-instance learning (MIL) methods that study true positive instances ignore 1) the global similarity among positive instances and 2) that negative instances are non-i.i.d. We propose a MIL method based on positive instance graph updating to address these issues.
|
|
A Review of Multi-Instance Learning Research
Yingjie Tian, Dongkuan Xu, Chunhua Zhang
Operations Research Transactions, 2018
PDF
This paper reviews the research progress of multi-instance learning (MIL), introduces different assumptions, and categorizes MIL methods into instance-level, bag-level, and embedded-space approaches. Extensions and major applications in various areas are discussed at the end.
|
|
SALE: Self-Adaptive LSH Encoding for Multi-Instance Learning
Dongkuan Xu, Jia Wu, Dewei Li, Yingjie Tian, Xingquan Zhu, Xindong Wu
Pattern Recognition, 2017
PDF
We propose a self-adaptive locality-sensitive hashing encoding method for multi-instance learning (MIL), which efficiently deals with large MIL problems.
|
|
Metric Learning for Multi-Instance Classification with Collapsed Bags
Dewei Li, Dongkuan Xu, Jingjing Tang, Yingjie Tian
[IJCNN 2017] The 30th IEEE International Joint Conference on Neural Networks
PDF
We propose a metric learning method for multi-instance classification, aiming to find an instance-dependent metric by maximizing the relative distance at the neighborhood level.
|
|
PIGMIL: Positive Instance Detection via Graph Updating for Multiple Instance Learning
Dongkuan Xu, Jia Wu, Wei Zhang, Yingjie Tian
arXiv preprint arXiv:1612.03550, 2016
PDF
We propose a positive instance detection method based on multiple instance learning; the core idea is that true positive instances should not only be globally similar to each other but also robustly distinct from negative instances.
|
|
Multi-Metrics Classification Machine
Dewei Li, Wei Zhang, Dongkuan Xu, Yingjie Tian
[ITQM 2016] The 4th International Conference on Information Technology and Quantitative Management
PDF
(Best Paper Award)
We propose a metric learning approach called multi-metrics classification machine. We establish an optimization problem for each class (each metric) to learn multiple metrics independently.
|
|
A Comprehensive Survey of Clustering Algorithms
Dongkuan Xu, Yingjie Tian
Annals of Data Science, 2015
PDF
(1100 citations)
We introduce the definition of clustering and the basic elements involved in the clustering process, and categorize clustering algorithms into traditional and modern ones. All the algorithms are discussed comprehensively.
|
|
A Support Vector Machine-based Ensemble Prediction for Crude Oil Price with VECM and STEPMRS
Dongkuan Xu, Tianjia Chen, Wei Xu
International Journal of Global Energy Issues, 2015
PDF
This paper proposes a support vector machine-based ensemble model to forecast crude oil price based on VECM and stochastic time effective pattern modelling and recognition system (STEPMRS).
|
|
A Neural Network-Based Ensemble Prediction Using PMRS and ECM
Dongkuan Xu, Yi Zhang, Cheng Cheng, Wei Xu, Likuan Zhang
[HICSS 2014] The 47th Hawaii International Conference on System Science
PDF
This paper presents an integrated model to forecast crude oil prices, where a pattern modelling & recognition system is used to model the price trend and an error correction model is used to forecast errors. A neural network layer is employed to integrate the results.
|
Professional Services
- Academic Committee Member:
- Conference/Workshop Chair:
  - The First Workshop on DL-Hardware Co-Design for AI Acceleration @ AAAI2023
  - International Workshop on Resource-Efficient Learning for Knowledge Discovery (RelKD'23) @ KDD2023
  - The First Conference on Machine Learning Algorithms & Natural Language Processing (MLNLP'22)
- Session Chair:
  - Research Track of KDD'22
  - ADS Track of KDD'22
- Senior Program Committee Member:
- Program Committee Member:
  - ICLR'21, 22, 23
  - ICML'21, 22
  - NeurIPS'20, 21, 22
  - AAAI'20, 21, 22
  - ISQED'23
  - KDD'20, 21, 22
  - ACL Rolling Review'22
  - LoG'22
  - IJCAI'20, 22
  - NAACL'21
  - EMNLP'20, 21
  - COLING'22
  - WSDM'22, 23
  - SDM'22
  - EACL'21
  - ACM CIKM'20, 21, 22
  - AACL-IJCNLP'20, 22
  - IJCNN'18, 19, 20, 21
- Journal Reviewer:
  - IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
  - Communications of the ACM
  - IEEE Transactions on Neural Networks and Learning Systems (TNNLS)
  - IEEE Transactions on Knowledge and Data Engineering (TKDE)
  - IEEE Transactions on Cybernetics
  - Information Fusion
  - ACM Transactions on Knowledge Discovery from Data (TKDD)
  - Pattern Recognition
  - Neural Networks
  - Neurocomputing
  - ACM Transactions on Asian and Low-Resource Language Information Processing
  - IEEE Access
  - Neural Computation
  - Complexity
  - Soft Computing
  - Complex & Intelligent Systems
  - Multimedia Tools and Applications
  - Big Data
- External Conference Reviewer:
  - AAAI'18, 19, 20, KDD'18, 19, 20, 21, TheWebConf (WWW)'20, 21, 22, WSDM'20, 21, ICDM'18, 19, 21, SDM'18, 19, 20, 21, 22, ACM CIKM'18, 19, Big Data'18, IJCNN'16, 17, ITQM'16, 17
- Conference Volunteer:
  - The Annual Conference of NAACL-HLT, 2021
  - Backing up SDM Session Chairs, 2021
  - The 35th AAAI Conference on Artificial Intelligence, 2021
  - The 26th SIGKDD Conference on Knowledge Discovery and Data Mining, 2020
|
Teaching Experiences
- Teaching Assistant at Penn State
- Guest Lecturer
  - COSI 133A: Graph Mining, Brandeis University, 2021 Fall
  - COSI 165B: Deep Learning, Brandeis University, 2021 Spring
|
Supervised Students/Interns
Shaoyi Huang, Ph.D. at University of Connecticut
Topic I: Sparse Neural Architecture Search
Topic II: Few-shot Foundation Model Distillation
Bowen Lei, Ph.D. at Texas A&M University
Topic: Theoretical Foundations of Sparse Training
Xiangxiang Gao, Ph.D. at Shanghai Jiao Tong University
Topic: Theoretical Foundations of Sparse Training
Shuren He, Ph.D. at Texas A&M University
Topic: Theoretical Foundations of Sparse Training
Shengze Xu, Undergraduate at Zhejiang University
Topic: Theoretical Foundations of Sparse Neural Networks
Shijian Wu, Master at Georgia Institute of Technology
Topic: Theoretical Foundations of Sparse Neural Networks
Zhenglun Kong, Ph.D. at Northeastern University
Topic: Efficient Transformer Architecture Search
Xukun Liu, Undergraduate at Southern University of Science and Technology
Topic: Efficient Transformer Architecture Search
Haoze Lv, Undergraduate at Southern University of Science and Technology
Topic: Efficient Transformer Architecture Search
Wei Zhang, Undergraduate at Renmin University of China
(Now Ph.D. at City University of Hong Kong)
Topic: Cost-Sensitive Multi-Instance Learning
Jie Zhang, Master at Zhejiang University
Topic: Efficient Data-centric AI
Lei Zhang, Master at Zhejiang University
Topic: Efficient Data-centric AI
Xiang Pan, Master at New York University
Topic: Efficient Data-centric AI
Dongyao Zhu, Undergraduate at University of California San Diego
Topic: Efficient Data-centric AI
Shitian Zhang, Master at University of Chinese Academy of Sciences
Topic: Adaptive Code Generation
Jiasheng Gu, Master at University of Southern California
Topic: Adaptive Code Generation
Ziqing Wang, Undergraduate at Sun Yat-sen University
Topic: Efficient Co-design AI
Shuya Li, Master at Tsinghua University
Topic: Efficient Intelligent Traffic Learning
Shengkun Tang, Undergraduate at Wuhan University
Topic: Efficient Multi-modal Learning
Tianchi Zhang, Master at University of Michigan - Ann Arbor
Topic: Efficient Multi-modal Learning
Xuelin Kong, Master at National University of Singapore
Topic: Efficient Multi-modal Learning
Hongye Fu, Undergraduate at Zhejiang University
Topic: Efficient Multi-modal Learning
Chengyuan Liu, Ph.D. at NC State University
Topic: Robust Generalized Model Compression
Weizhi Gao, Master at University of Chinese Academy of Sciences
Topic: Robust Generalized Model Compression
Jianwei Li, Master at San Jose State University
Topic: Robust Generalized Model Compression
Yanbo Fang, Master at Rutgers University
Topic: Robust Generalized Model Compression
|
Patent Applications
System and Method for Knowledge-Preserving Neural Network Pruning.
Enxu Yan, Dongkuan Xu, and Zhibin Xiao.
U.S. Patent 11,200,497. Dec. 2021.
Bank-balanced-sparse Activation Feature Maps for Neural Network Models.
Enxu Yan and Dongkuan Xu.
U.S. Patent App. 17/038,557. Apr. 2022.
Neural Network Pruning Method and System via Layerwise Analysis.
Enxu Yan and Dongkuan Xu.
U.S. Patent App. 17/107,046. Nov. 2020.
Unsupervised Multivariate Time Series Trend Detection for Group Behavior Analysis.
Wei Cheng, Haifeng Chen, Jingchao Ni, Dongkuan Xu, and Wenchao Yu.
U.S. Patent App. 16/987,734. Mar. 2021.
Tensorized LSTM with Adaptive Shared Memory for Learning Trends in Multivariate Time Series.
Wei Cheng, Haifeng Chen, Jingchao Ni, Dongkuan Xu, and Wenchao Yu.
U.S. Patent App. 16/987,789. Mar. 2021.
Adaptive Neural Networks for Node Classification in Dynamic Networks.
Wei Cheng, Haifeng Chen, Wenchao Yu, and Dongkuan Xu.
U.S. Patent App. 16/872,546. Nov. 2020.
Spatio Temporal Gated Recurrent Unit.
Wei Cheng, Haifeng Chen, and Dongkuan Xu.
U.S. Patent App. 16/787,820. Aug. 2020.
Automated Anomaly Precursor Detection.
Wei Cheng, Dongkuan Xu, Haifeng Chen, and Masanao Natsumeda.
U.S. Patent App. 16/520,632. Feb. 2020.
|
Talks
Parameter Efficiency: Democratizing AI at Scale (Slides)
Waltham, MA, USA, Dec. 2021.
Brandeis University.
Chasing Efficiency of Pre-trained Language Models
Redmond, Washington, USA, Jun. 2021.
Microsoft Research Lab.
BERT Pruning: Structural vs. Sparse (Slides)
Waltham, MA, USA, Apr. 2021.
Brandeis University.
BERT, Compression and Applications (Slides)
Mountain View, USA, Apr. 2021.
Xpeng Motors.
BERT Architecture and Computation Analysis (Slides)
Los Altos, USA, May. 2020.
Moffett.AI.
Learning Trends in Multivariate Time Series (Slides)
New York, USA, Feb. 2020.
AAAI 2020.
Node Classification in Dynamic Networks (Slides)
Beijing, China, Nov. 2019.
ICDM 2019.
Anomaly Precursor Detection via Deep Multi-Instance RNN (Slides)
Princeton, USA, May. 2019.
NEC Laboratories America.
Deep Co-Clustering (Slides)
Calgary, Canada, May 2019.
SDM 2019.
Efficient Multiple Instance Learning (Slides)
Princeton, USA, May. 2018.
NEC Laboratories America.
|
Honors and Awards
- Doctor of Philosophy (Ph.D.)
  - College of IST Award for Excellence in Teaching Support (top 2), 2019
  - Third place winner (Eng.) in the 37th annual PSU Graduate Exhibition (News), 2022
  - NAACL Scholarship, 2021
  - SIAM Student Travel Award, 2021
  - IST Travel Awards, Spring 2021, Fall 2021
  - College of IST Award for Excellence in Teaching Support, Finalist, 2021
  - KDD Student Registration Award, 2020
  - AAAI Student Scholarship, 2020
  - IST Travel Award, Spring 2020
  - IST Travel Award, Spring 2019
- Master of Science (M.S.)
  - ITQM Best Paper, 2016
  - President’s Fellowship of Chinese Academy of Sciences (the most prestigious award), 2016
  - National Graduate Scholarship, China (2% in university), 2016
  - Graduate Student Academic Scholarship, 2017
  - Graduate Student Academic Scholarship, 2016
  - Graduate Student Academic Scholarship, 2015
- Bachelor of Engineering (B.E.)
  - First-class Scholarship of Sashixuan Elite Fund, China (5% in university), 2014
  - Kwang-hua Scholarship of RUC, China, 2014
  - Second-class Scholarship of Excellent Student Cadre, 2014
  - Meritorious Winner in Mathematical Contest in Modeling, 2013
  - First-class Scholarship of Social Work and Volunteer Service of RUC, 2013
|
- ACM (Association for Computing Machinery) Student Membership, 2021-Present
- ACL (Association for Computational Linguistics) Membership, 2021-Present
- AAAI (Association for the Advancement of Artificial Intelligence) Student Membership, 2019-Present
- SIAM (Society for Industrial and Applied Mathematics) CAS Student Member, 2016-Present
- President of Youth Volunteers Association of School of Information of RUC, 2012-2013
- Volunteer of Beijing Volunteer Service Federation (BVF), 2012-Present
- Leader of National Undergraduate Training Programs for Innovation and Entrepreneurship, 2011-2012