Machine Learning Seminar
About Joint Machine Learning Seminar Series: Our newly launched Joint Machine Learning Seminar Series is a collaborative initiative across three schools at the University of Sydney, co-organized by Dr. Chang Xu (School of Computer Science), Prof. Dmytro Matsypura (Business School), and Prof. Yiming Ying (School of Mathematics & Statistics). The goal of this initiative is to foster interdisciplinary interaction and collaboration on cutting-edge research in Machine Learning (ML) and Artificial Intelligence (AI). We welcome suggestions of potential future speakers for this seminar series.
Future Seminars: To maintain a high-quality seminar series, we aim to feature speakers with impactful contributions to ML and AI research. If no suitable speaker is available for a given session, we will instead organize seminar talks within the School of Mathematics and Statistics, focusing on the mathematical and statistical aspects of machine learning, ensuring continuous engagement with fundamental and advanced topics in the field. We invite researchers, faculty, and students across disciplines to join us for these engaging talks and networking opportunities over coffee!
Please direct enquiries about this seminar series to Yiming Ying.
Seminars
Monday, Apr 20, 2026, 1pm-2pm SMRI Seminar Room (A12-03-301) A12 Macleay Building, Level 3, Room 301.
Speaker: Dong Gong (UNSW)
Title: A Thousand Faces of Continual Learning: Self-Improving AI with Modular Learning and Memory
Abstract:
No matter how capable, a static AI cannot meet the demands of real-world deployment, where tasks shift, knowledge ages, users differ, and environments evolve. Yet the vast majority of machine learning methods operate under an essentially static paradigm: train once, freeze, deploy. Enabling models, including today's foundation models, to keep learning is the central promise of continual learning, and, more ambitiously, of *self-improving AI*. It is also where one of the field's most stubborn challenges lives: catastrophic forgetting, the tendency of models to lose what they have already learned as soon as they acquire something new. Continual learning is therefore not optional but necessary. And yet, in the current landscape, it is often dismissed from two opposite directions — either declared impossible, or quietly claimed to be "already solved". In this talk, I want to present a more realistic picture — *a thousand faces of continual learning*. It is not a single algorithm but a spectrum of mechanisms, spanning training-time updates, parameter editing, modular expansion, associative memory, and test-time adaptation, embodying different expectations about an AI system. At its core, continual learning is fundamentally a question of *learning* and *memory*: what to change, what to preserve, and how knowledge lives. I will share my perspective on modular learning and memory that localise change, mitigate catastrophic forgetting, and scale naturally to foundation models. I will then present several of our recent works that instantiate this philosophy: dynamic Mixture-of-Experts for model expansion, rank-1 fine-grained memory for precise knowledge injection, on-demand expansion driven by task difficulty, and dynamic test-time self-adaptation. Together, these span LLMs, multimodal LLMs, diffusion models, and agentic systems, pointing toward AI that continues to evolve after pre-training ends.
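As a toy illustration of the "localise change" idea above (not the speaker's actual method), a rank-1 edit to a linear layer can retarget one chosen input direction while provably leaving all orthogonal directions untouched:

```python
import numpy as np

# Toy rank-1 parameter edit: W' = W + u v^T is chosen so that a single
# "key" input maps to a new target output, while any input orthogonal to
# the key is mapped exactly as before -- the change is localised by
# construction. All values below are illustrative.

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))            # original layer weights

key = np.array([1.0, 0.0, 0.0, 0.0])       # input direction whose output we edit
target = np.array([1.0, 2.0, 3.0, 4.0])    # desired new output for the key

u = target - W @ key                       # required output correction
v = key / (key @ key)                      # normalised so that v^T key = 1
W_edit = W + np.outer(u, v)                # the rank-1 edit

other = np.array([0.0, 1.0, -1.0, 0.5])    # orthogonal to key: unaffected
```

Since `v @ other == 0`, the edited layer agrees with the original on every input orthogonal to the key; only the one targeted direction changes.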
Speaker Bio: Dr. Dong Gong is a Senior Lecturer and ARC DECRA Fellow (2023–2026) at the School of Computer Science and Engineering, University of New South Wales (UNSW), where he leads a research group. He also holds an adjunct position at the Australian Institute for Machine Learning (AIML), where he was previously a Research Fellow after completing his PhD in December 2018. His research focuses on building self-improving AI systems and continual learning methods that can learn continuously, adapt reliably, and scale efficiently in open-ended and non-stationary environments, with a particular interest in realistic applications to LLMs, VLMs, and diffusion generative models. He has actively served the research community as an Area Chair and reviewer for conferences such as CVPR, NeurIPS, ICML, ICLR, ICCV, and ACM MM, and was recognized as an outstanding reviewer for NeurIPS'18 and an outstanding Area Chair for ACM MM'24. Homepage: https://donggong1.github.io/
Monday, Apr 13, 2026, 1pm-2pm SMRI Seminar Room (A12-03-301) A12 Macleay Building, Level 3, Room 301.
Speaker: Li Chen (USYD)
Title: Managing Inventory and Pricing with Contextual Robust Optimization
Abstract:
Multiproduct inventory and pricing problems are traditionally approached by estimating a presumed "sufficiently accurate" demand model and then optimizing with it to determine optimal inventory and pricing decisions. However, obtaining an accurate demand model is nearly impossible due to unobservable parameters, resulting in parameter uncertainty; meanwhile, the unknown distribution of the error term in the stochastic demand model introduces residual ambiguity. Additionally, the predicted demand is endogenously affected by pricing, leading to decision-dependent predictions that often bring about intractable bilinear optimization problems. We introduce a contextual robust optimization model that addresses these challenges simultaneously. Our proposed model possesses attractive finite-sample performance guarantees and can be effectively approached using an enhanced affine recourse adaptation to address intractability. Our framework can be readily extended to broader contextual decision-making problems under mild conditions. Extensive numerical studies demonstrate the effectiveness of our approach, showing that it outperforms the conventional estimate-then-optimize approach and the residual-based robust optimization approach that does not account for parameter uncertainty, particularly when the available data is limited. Notably, our proposed model exhibits greater resilience when contextual information is disregarded, reflecting practical situations in which collecting such information might be impossible or costly.
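For readers new to robust optimization, a much simpler cousin of the talk's model, a single-product robust newsvendor with interval demand uncertainty, can be sketched as follows (all numbers illustrative; the talk's contextual model is far richer):

```python
import numpy as np

# Robust newsvendor sketch: demand is only known to lie in [d_lo, d_hi].
# We choose the order quantity q maximising the worst-case profit
#     min_{d in [d_lo, d_hi]}  p * min(q, d) - c * q
# by grid search. Since sales min(q, d) are increasing in d, the worst
# case is d = d_lo, and the robust optimum orders exactly d_lo.

p, c = 10.0, 4.0             # unit selling price and unit cost (illustrative)
d_lo, d_hi = 20.0, 60.0      # demand uncertainty interval

qs = np.linspace(0.0, 100.0, 1001)

def worst_profit(q):
    d_grid = np.linspace(d_lo, d_hi, 101)
    return np.min(p * np.minimum(q, d_grid) - c * q)

q_robust = qs[int(np.argmax([worst_profit(q) for q in qs]))]
```

The contextual version in the talk replaces the fixed interval with a data-driven, covariate-dependent uncertainty set and handles price-dependent (endogenous) demand, which is where the bilinear difficulty arises.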
Monday, Mar 23, 2026, 1pm-2pm SMRI Seminar Room (A12-03-301) A12 Macleay Building, Level 3, Room 301.
Speaker: Dingxuan Zhou (USYD)
Title: Mathematical theory of structured deep neural networks
Abstract:
Deep learning has been widely applied and has brought breakthroughs in speech recognition, computer vision, natural language processing, and many other domains. The involved deep neural network architectures and computational issues have been well studied in machine learning. But there is much less theoretical understanding of the modelling, approximation, or generalization abilities of deep learning models with network architectures. Important families of structured deep neural networks include deep convolutional neural networks induced by convolutions and transformers induced by attention. These architectures create essential differences between such structured networks and fully-connected ones. This talk describes some approximation and generalization analysis of deep convolutional neural networks and transformers.
Monday, Mar 16, 2026, 1pm-2pm SMRI Seminar Room (A12-03-301) A12 Macleay Building, Level 3, Room 301.
Speaker: Pulin Gong (USYD)
Title: Shared principles of biological and artificial neural networks
Abstract:
Biological neural networks in the brain and deep neural networks (DNNs) in AI are both built from the interactions of large numbers of basic units (neurons), from which powerful information-processing capabilities emerge. In this presentation, I will discuss several principles shared by these two classes of complex systems. First, I will show that heterogeneous synaptic weights, as observed in the brain and in pretrained DNNs, give rise to a distinct dynamical regime in which neural representations display a multiscale mixture of localized and delocalized features. This suggests a common organizational principle underlying representation and computation in both biological and artificial networks. I will then turn to a second shared principle: rich neural sampling dynamics. In the brain, such dynamics support flexible cognitive functions such as attention, while in deep learning they help optimizers navigate complex loss landscapes and find solutions with good generalization performance.
Monday, March 2, 2026, 1pm-2pm SMRI Seminar Room (A12-03-301) A12 Macleay Building, Level 3, Room 301.
Speaker: Dr. Andi Han (University of Sydney)
Title: Random submanifold methods for efficient optimization on matrix manifolds
Abstract:
Riemannian optimization provides a principled framework for solving constrained optimization problems. Its central idea is to compute a search direction in the tangent space and then update the variables via a retraction. A key computational bottleneck, however, is the retraction step, which is necessary to ensure the feasibility of updates on the manifold. As the problem dimension grows, this cost can limit the applicability of Riemannian optimization to large-scale problems. This talk introduces approaches that perform each update on randomly selected submanifolds, thereby significantly reducing the per-iteration computational complexity while achieving similar convergence guarantees.
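The tangent-space-step-plus-retraction pattern described above can be sketched on the simplest matrix manifold, the unit sphere (this is plain Riemannian gradient descent for illustration, not the randomized submanifold method of the talk):

```python
import numpy as np

# Riemannian gradient descent on the unit sphere S^{n-1}, minimizing the
# Rayleigh quotient f(x) = x^T A x. The minimizer is an eigenvector for the
# smallest eigenvalue of A. Each iteration: Euclidean gradient -> project
# onto the tangent space -> step -> retract (renormalize) onto the sphere.
# (Illustrative only; the talk restricts updates to random submanifolds.)

def riemannian_gd_sphere(A, x0, lr=0.05, iters=5000):
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        egrad = 2.0 * A @ x                 # Euclidean gradient of x^T A x
        rgrad = egrad - (x @ egrad) * x     # projection onto tangent space at x
        x = x - lr * rgrad                  # step in the tangent space
        x = x / np.linalg.norm(x)           # retraction: pull back onto the sphere
    return x

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2.0                         # symmetric test matrix
x = riemannian_gd_sphere(A, rng.standard_normal(5))
```

The renormalization in the last line of the loop is exactly the retraction whose cost the talk's randomized submanifold approach seeks to reduce in high dimensions.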
Speaker Bio: Andi Han recently joined the School of Mathematics and Statistics, University of Sydney, as a Lecturer in Data Science. Before joining USYD, he was a Postdoctoral Researcher at RIKEN AIP, Continuous Optimization Team. He completed his PhD in Business Analytics at USYD. His research broadly covers large generative models, optimization (on manifolds), efficiency of foundation models, and graph neural networks with applications to biology and chemistry.
Monday, Feb 23, 2026, 1pm-2pm SMRI Seminar Room (A12-03-301) A12 Macleay Building, Level 3, Room 301.
Speaker: Rafael Oliveira (CSIRO)
Title: Thompson Sampling in Function Spaces via Neural Operators
Abstract:
We propose an extension of Thompson sampling to optimisation problems in function spaces where the objective is a known functional of an unknown operator's output. We assume that queries to the operator (such as running a high-fidelity simulator or physical experiment) are costly, while functional evaluations given the operator's output are inexpensive. Our algorithm employs a sample-then-optimise approach using neural operator surrogates. This strategy avoids explicit uncertainty quantification by treating trained neural operators as approximate samples from a Gaussian process (GP) posterior. We derive regret bounds and theoretical results connecting neural operators with GPs in infinite-dimensional settings. Experiments benchmark our method against other Bayesian optimisation baselines on functional optimisation tasks involving partial differential equations of physical systems, demonstrating better sample efficiency and significant performance gains.
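The sample-then-optimise loop can be illustrated with a Bayesian linear model standing in for the neural-operator surrogate (a minimal sketch; the quadratic feature map, prior, and toy objective are assumptions for illustration only):

```python
import numpy as np

# Thompson sampling, sample-then-optimise style: each round we draw ONE
# posterior sample of the surrogate's parameters, optimise that sampled
# model over the candidate set, pay for a single costly query there, and
# update the posterior. A conjugate Bayesian linear model plays the role
# of the neural-operator surrogate from the talk.

rng = np.random.default_rng(1)
X = np.linspace(-2.0, 2.0, 41)
Phi = np.stack([np.ones_like(X), X, X**2], axis=1)   # toy feature map
theta_true = np.array([0.5, 1.0, -1.0])              # truth: f(x) = 0.5 + x - x^2
noise = 0.1

A = np.eye(3)                                        # posterior precision (prior N(0, I))
b = np.zeros(3)
for t in range(50):
    cov = np.linalg.inv(A)
    theta = rng.multivariate_normal(cov @ b, cov)    # one posterior sample
    i = int(np.argmax(Phi @ theta))                  # optimise the sampled model
    y = Phi[i] @ theta_true + noise * rng.standard_normal()  # costly query
    A += np.outer(Phi[i], Phi[i]) / noise**2         # conjugate Bayesian update
    b += Phi[i] * y / noise**2

best = X[int(np.argmax(Phi @ np.linalg.solve(A, b)))]  # posterior-mean recommendation
```

In the talk, the posterior sample is replaced by a trained neural operator treated as an approximate draw from a GP posterior, and the inner argmax becomes an optimisation of a known functional of the operator's output.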
Monday, August 18, 2025, 1pm-2pm (Different time this semester!) Carslaw 375
Speaker: Dr. Sergey Dolgov (University of Bath)
Title: Low-rank approximations for large-scale nonlinear feedback control
Abstract:
Computation of the optimal feedback law for general (nonlinear/unstable/stochastic) dynamical systems requires solving the Hamilton-Jacobi-Bellman Partial Differential Equation (PDE), which suffers from the curse of dimensionality. We develop a unified framework for computing a fast surrogate model of the feedback control function based on low-rank decompositions of matrices and tensors. Firstly, we propose a Statistical Proper Orthogonal Decomposition (SPOD) for Model Order Reduction of very high-dimensional systems, such as the discretized Navier-Stokes equation or other PDEs, by compressing snapshots corresponding to random samples of all parameters in the system, initial condition and time. Secondly, we compute a low-rank Functional Tensor Train (TT) approximation of the feedback control function for the reduced model. The pre-trained TT representation of the control function of the reduced state can then be used for real-time online generation of the control signal. Using the proposed SPOD and TT approximations, we demonstrate a controller computable in milliseconds that achieves lower vorticity of the Navier-Stokes flow with random inflow compared to using the mean inflow to produce reduced bases or controllers.
Speaker Bio: TBA
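The snapshot-compression step underlying POD can be sketched with a truncated SVD (a generic illustration on synthetic data; the SPOD described in the talk additionally randomises over system parameters, the initial condition, and time when collecting snapshots):

```python
import numpy as np

# POD sketch: stack snapshots of a high-dimensional state into a matrix,
# take a truncated SVD, and keep the leading left singular vectors as the
# reduced basis. The synthetic snapshots below live in a 3-dimensional
# subspace of a 1000-dimensional state space, so POD recovers rank 3.

rng = np.random.default_rng(0)
n, m = 1000, 50                                  # state dimension, snapshot count
t = np.linspace(0.0, 1.0, n)
modes = np.stack([np.sin((k + 1) * np.pi * t) for k in range(3)])  # (3, n)
S = rng.standard_normal((m, 3)) @ modes          # (m, n) snapshot matrix

U, sv, _ = np.linalg.svd(S.T, full_matrices=False)
r = int(np.sum(sv > 1e-10 * sv[0]))              # numerical rank of the snapshots
V = U[:, :r]                                     # reduced basis, shape (n, r)

# a new state in the same subspace is reconstructed (almost) exactly
x_new = modes.T @ np.array([1.0, -2.0, 0.5])
err = np.linalg.norm(x_new - V @ (V.T @ x_new)) / np.linalg.norm(x_new)
```

The feedback law is then trained on the r-dimensional reduced coordinates `V.T @ x` rather than the full state, which is what makes the subsequent TT approximation tractable.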
Monday, June 2, 2025, 11am-12pm Carslaw 173
Speaker: Dr. Pinak Mandal (University of Sydney)
Title: Learning Dynamical Systems with Hit-and-Run Random Feature Maps
Abstract:
Forecasting chaotic dynamical systems is a central challenge across science and engineering. In this talk, we will explore how random feature maps can be adapted to deliver remarkably strong performance on this task. A key ingredient for successful forecasting is ensuring that the features produced by the model lie in the nonlinear region of the activation function. We will see how this can be achieved through careful selection of the internal weights in a data-driven way using a hit-and-run algorithm. With a few additional modifications, such as increasing the depth of the model and introducing localization, we achieve state-of-the-art forecasting results on a variety of high-dimensional chaotic systems, reaching up to 512 dimensions. Our method produces accurate short-term trajectory predictions, as well as reliable estimates of long-term statistical behavior in the test cases.
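The basic recipe, fixed random internal weights plus a trained linear readout, can be sketched on a simple chaotic system (a generic random-feature fit; the hit-and-run internal-weight selection that is the subject of the talk is not implemented here):

```python
import numpy as np

# Random-feature surrogate for a one-step map: learn the chaotic logistic
# map x_{n+1} = 4 x_n (1 - x_n) with fixed random tanh features and a
# ridge-regression readout. Only the readout weights c are trained; the
# internal weights W, b are drawn once and frozen.

rng = np.random.default_rng(0)
N = 2000
x = np.empty(N + 1)
x[0] = 0.3
for n in range(N):
    x[n + 1] = 4.0 * x[n] * (1.0 - x[n])         # training trajectory

D = 300                                          # number of random features
W = rng.uniform(-4.0, 4.0, size=D)               # internal weights: never trained
b = rng.uniform(-2.0, 2.0, size=D)

def features(u):
    return np.tanh(np.outer(u, W) + b)           # (len(u), D)

Phi = features(x[:-1])
reg = 1e-6                                       # ridge regularisation
c = np.linalg.solve(Phi.T @ Phi + reg * np.eye(D), Phi.T @ x[1:])

pred = features(np.array([0.6])) @ c             # one-step forecast from x = 0.6
# exact next value is 4 * 0.6 * 0.4 = 0.96
```

In the talk's setting, the internal weights are instead selected in a data-driven way so that the features land in the nonlinear region of the activation, which is what makes the approach competitive on high-dimensional systems.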
Speaker Bio: Pinak Mandal is a postdoctoral researcher at the University of Sydney, specializing in machine learning and dynamical systems. His work spans several topics, including unlearning in generative models, learning dynamical systems from data, deep learning for solving PDEs, and data assimilation. He also develops open-source tools for scientific computing and visualization.
Monday, May 19, 2025, 11:00am, J12 (CS building) lecture theatre 123
Speaker: Dino Sejdinovic (University of Adelaide)
Title: Squared Neural Probabilistic Models
Abstract:
We describe a new class of probabilistic models, squared families, where densities are defined by squaring a linear transformation of a statistic and normalising with respect to a base measure. Key quantities, such as the normalising constant and certain statistical divergences, admit a helpful parameter-integral decomposition giving a closed form normalising constant in many cases of interest. Parametrising the statistic using neural networks results in highly expressive yet tractable models, with universal approximation properties. This approach naturally extends to other probabilistic settings, such as modelling point processes. We illustrate the effectiveness of squared neural probabilistic models on a variety of tasks, demonstrating their ability to represent complex distributions while maintaining analytical and computational advantages. Joint work with Russell Tsuchida, Jiawei Liu, and Cheng Soon Ong.
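In the simplest scalar-statistic sketch consistent with the abstract (our notation, not necessarily the speakers'; the papers treat a more general vector-valued form), the density and its closed-form normalising constant read:

```latex
% \varphi: the chosen statistic, \mu: the base measure, \theta: parameters
p_\theta(x) \;=\; \frac{\bigl(\theta^\top \varphi(x)\bigr)^{2}}{Z(\theta)},
\qquad
Z(\theta) \;=\; \theta^\top M \,\theta,
\qquad
M \;=\; \int \varphi(x)\,\varphi(x)^\top \,\mathrm{d}\mu(x).
```

The parameter-integral decomposition is visible here: the integral defining the second-moment matrix M does not involve the parameters, so it can be evaluated once (often in closed form), after which Z(θ) is a cheap quadratic form for every θ.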
Speaker Bio: Dino Sejdinovic is a Professor of Statistical Machine Learning at the University of Adelaide (since 2022), where he is affiliated with the Australian Institute for Machine Learning (AIML) and the Responsible AI Research Centre (RAIR). He also holds visiting appointments at Nanyang Technological University, Singapore, and the Institute of Statistical Mathematics, Tokyo. He was previously an Associate Professor at the Department of Statistics, University of Oxford, and a Turing Faculty Fellow of the Alan Turing Institute. He held postdoctoral positions at University College London and the University of Bristol and received a PhD in Electrical and Electronic Engineering from the University of Bristol (2009). His research spans a wide variety of topics at the interface between machine learning and statistical methodology, including large-scale nonparametric and kernel methods, robust and trustworthy machine learning, causal inference, and uncertainty quantification.
