I am currently an assistant professor and master’s supervisor at the School of Computer Science, conducting fundamental research on reinforcement learning and multi-agent systems in the AI lab led by Professor Lei Chen (陈蕾). I am also recruiting graduate students; if you are interested, feel free to email me at sdyang@njupt.edu.cn.
I received my bachelor’s degree from the School of Information Science and Technology, Southwest Jiaotong University (西南交通大学信息学院) in 2013 and my Ph.D. from the Department of Computer Science and Technology, Nanjing University (南京大学计算机系) in 2020, advised by Professor Yang Gao (高阳).
My research interests include reinforcement learning, multi-agent systems, and multi-armed bandits. I have published more than 20 papers in top international AI conferences and journals, such as AAAI, AAMAS, TNNLS, and TCYB.
🔥 News
- 2024.08: One paper on “Multi-agent reinforcement learning” was accepted by ESWA.
- 2024.06: One paper on “Multi-agent reinforcement learning” was accepted by TITS.
- 2024.04: One paper on “Multi-agent reinforcement learning” was accepted by IJCAI 2024.
- 2024.04: Two papers on “Reinforcement learning” and “Multi-agent reinforcement learning” were accepted by the Journal of Software.
- 2024.04: One paper on “Reinforcement learning” was accepted by the Chinese Journal of Computers.
- 2023.12: One paper on “Multi-agent reinforcement learning” was accepted by ICASSP 2024.
- 2023.10: One paper on “Multi-agent reinforcement learning” was accepted by TCYB.
📝 Publications
🐕 Reinforcement Learning
- Shangdong Yang, Dingyuanhao Sun, Xingguo Chen. Off-Policy Temporal Difference Learning with Bellman Residuals, Mathematics, 2024.
- Xingguo Chen, Guang Yang, Shangdong Yang, Huihui Wang, Shaokang Dong, Yang Gao. Online Attentive Kernel-Based Temporal Difference Learning, Knowledge-Based Systems, 2023.
- Shangdong Yang, Miaoying Yu, Xingguo Chen, Wenbin Li, Lei Chen. Group-wise Contrastive Learning Based Sequence-aware Skill Discovery, CCFAI 2023 (Outstanding Paper Award) / Journal of Software, 2024.
- Hongye Cao, Xiao Liu, Shaokang Dong, Shangdong Yang, Jing Huo, Wenbin Li, Yang Gao. A Survey of Interpretability Research Methods for Reinforcement Learning, Chinese Journal of Computers, 2024, 8: 1853-1882.
- Hongye Cao, Shangdong Yang, Jing Huo, Xingguo Chen, Yang Gao. Enhancing OOD Generalization in Offline Reinforcement Learning with Energy-Based Policy Optimization, ECAI 2023.
- Xingguo Chen, Xingzhou Ma, Yang Li, Guang Yang, Shangdong Yang, Yang Gao. Modified Retrace for Off-Policy Temporal Difference Learning, UAI 2023.
- Shangdong Yang, Huihui Wang, Shaokang Dong, Xingguo Chen. Leveraging Transition Exploratory Bonus for Efficient Exploration in Hard-Transiting Reinforcement Learning Problems, Future Generation Computer Systems, 2023, 145: 442-453.
- Xiao Liu, Shuyang Liu, Bo An, Yang Gao, Shangdong Yang, Wenbin Li. Effective Interpretable Policy Distillation via Critical Experiences Identification, IEEE Intelligent Systems, 2023.
- Xingguo Chen, Dingyuanhao Sun, Guang Yang, Shangdong Yang, Yang Gao. A Survey of Reinforcement Learning Algorithms from a Fixed Point Perspective, Chinese Journal of Computers, 2023.
- Shangdong Yang, Yang Gao, Bo An, Hao Wang, Xingguo Chen. Efficient Average Reward Reinforcement Learning Using Constant Shifting Values, AAAI 2016.
🧑🏻🤝🧑🏼 Multi-agent Systems
- Shaokang Dong, Chao Li, Shangdong Yang, Wenbin Li, Yang Gao. Decentralized Counterfactual Value with Threat Detection for Multi-Agent Reinforcement Learning in Mixed Cooperative and Competitive Environments, Expert Systems with Applications, 2024, 257: 125116.
- Wubing Chen, Shangdong Yang, Wenbin Li, Yujing Hu, Xiao Liu, Yang Gao. Learning Multi-intersection Traffic Signal Control via Coevolutionary Multi-Agent Reinforcement Learning, IEEE Transactions on Intelligent Transportation Systems, 2024.
- Chao Li, Yujing Hu, Shangdong Yang, Tangjie Lv, Changjie Fan, Wenbin Li, Chongjie Zhang, Yang Gao. STAR: Spatio-Temporal State Compression for Multi-Agent Tasks with Rich Observations, IJCAI 2024.
- Shaokang Dong, Chao Li, Guang Yang, Zhenxing Ge, Hongye Cao, Wubing Chen, Shangdong Yang, Xingguo Chen, Wenbin Li, Yang Gao. Survey on Solutions and Applications for Mixed-motive Games, Journal of Software, 2024.
- Chao Li, Shaokang Dong, Shangdong Yang, Hongye Cao, Wenbin Li, Yang Gao. Multi-agent Sparse Interaction Modeling Is an Anomaly Detection Problem, ICASSP 2024.
- Shaokang Dong, Hangyu Mao, Shangdong Yang, Shengyu Zhu, Wenbin Li, Jianye Hao, Yang Gao. WToE: Learning When to Explore in Multi-Agent Reinforcement Learning, IEEE Transactions on Cybernetics, 2023.
- Yunkai Zhuang, Shangdong Yang, Wenbin Li, Yang Gao. Convergence Analysis of Graphical Game-based Nash Q-learning Using the Interaction Detection Signal of N-step Return, ICASSP 2023.
- Wubing Chen, Wenbin Li, Xiao Liu, Shangdong Yang, Yang Gao. Learning Explicit Credit Assignment for Cooperative Multi-Agent Reinforcement Learning via Polarization Policy Gradient, AAAI 2023.
- Zhenxing Ge, Shangdong Yang, Pinzhuo Tian, Zixuan Chen, Yang Gao. Modeling Rationality: Toward Better Performance Against Unknown Agents in Sequential Games, IEEE Transactions on Cybernetics, 2023.
🎰 Multi-armed Bandits
- Shangdong Yang, Yang Gao. An Optimal Algorithm for the Stochastic Bandits While Knowing the Near-optimal Mean Reward, IEEE Transactions on Neural Networks and Learning Systems, 2021, 32(5): 2285-2291.
- Shangdong Yang, Hao Wang, Chenyu Zhang, Yang Gao. Contextual Bandits with Hidden Features to Online Recommendation Via Sparse Interactions, IEEE Intelligent Systems, 2020, 35(5): 62-72.
- Chenyu Zhang, Hao Wang, Shangdong Yang, Yang Gao. A Contextual Bandit Approach to Personalized Online Recommendation via Sparse Interactions, PAKDD 2019.
- Shangdong Yang, Hao Wang, Yang Gao, Xingguo Chen. An Optimal Algorithm for the Stochastic Bandits with Knowing Near-optimal Mean Reward, AAMAS 2018.
🍀 Others
- Fan Meng, Qunli Yang, Zhengda He, Shangdong Yang*, Weidong Tang. GUARD: Multigranularity-based Unsupervised Anomaly Detection Algorithm for Multivariate Time Series, CCIS 2022.
- Yansheng Wu, Chengju Li, Shangdong Yang*. New Galois Hulls of Generalized Reed-Solomon Codes, Finite Fields and Their Applications, 2022, 83: 102084.
- Chenyu Zhang, Hao Wang, Shangdong Yang, Yang Gao. Incremental Nonnegative Matrix Factorization Based on Matrix Sketching and k-means Clustering, IDEAL 2016.
🏆 Honors and Awards
- 2023.07, Outstanding Paper Award, CCFAI 2023
- 2022.09, Youth Fund of the National Natural Science Foundation of China, NSFC
- 2022.07, State Key Laboratory of Novel Software Technology Project, Nanjing University
- 2020.10, Double-Innovation Doctor Project, Jiangsu Province
- 2018.10, Innovation Ability Improvement Plan for Excellent Doctoral Students, Nanjing University
- 2017.10, CETC The 14th Research Institute Glarun Scholarship, Nanjing University
- 2010.10, National Scholarship, Southwest Jiaotong University
👨🎓 Education
- 2013.09 - 2020.06, Ph.D., Department of Computer Science and Technology, Nanjing University, Nanjing.
- 2016.10 - 2016.11, Visiting student, Data Science Lab led by Professor Longbing Cao, University of Technology Sydney (UTS).
- 2009.09 - 2013.06, Bachelor’s degree, School of Information Science and Technology, Southwest Jiaotong University, Chengdu.
- 2006.09 - 2009.06, Huaiyin High School, Huai’an.
💬 Invited Talks
- 2018.08, Excellent Paper Report, the 5th CCF Seminar on Agents and Multi-Agent Systems
📚 Courses
- Python Programming (Blended Teaching) (for NJUPT undergraduate students, Fall, 2024)
- Python Programming (Blended Teaching) (for NJUPT undergraduate students, Fall, 2023)
- Data Structures (for NJUPT undergraduate students, Fall and Spring, 2021-2022)