Second International Workshop on Trusted Automated Decision-Making
An ETAPS 2025 Workshop
ETAPS is the premier European conference on the theory and practice of software systems.
The Second International Workshop on Trusted Automated Decision-Making (TADM 2025) elicits discussion from researchers on what constitutes trustworthiness in automated decision-making. The landscape of AI safety has continued to evolve and expand rapidly. From hallucinations to black-box algorithms, AI requires a paradigmatic shift in how we think about safety in software design and development. The aim of this workshop is to bring researchers together to advance the technical aspects of attaining trustworthiness and to ask how to engineer safe and robust AI. Continuing the conversation on safety is critical to ensuring the responsible development and deployment of automated decision-making systems. Topics of interest include, but are not limited to, Adversarial Testing, Explainability, Interpretability, Counterfactual Reasoning, Bias Audits, Multi-Disciplinary Collaboration, and Safety by Design.
Location:
Michael DeGroote Centre for Learning and Discovery (MDCL) at McMaster University
1280 Main St W, Hamilton, ON L8S 4K1, Canada
Time | Event |
8:45 - 9:00 | Opening Welcome (Ramesh Bharadwaj) |
9:00 - 9:30 | Alan Wassyng |
9:30 - 10:00 | Omri Isac, “Neural Network Verification with Proof Production” |
10:00 - 10:30 | Coffee Break |
Keynote | |
10:30 - 11:30 | David Parnas via Zoom (45 min talk + 15 min Q & A), “Never Trust Software that Imitates Intelligence! What Software Can We Trust?” |
Invited Talks (30 minutes each) | |
11:30 - 12:00 | Mark Lawford |
12:00 - 12:30 | Andrés Corrada, “Who Grades the Graders?” |
12:30 - 14:00 | Lunch |
14:00 - 14:30 | Chris Drummond, “Questioning the Centrality of Reproducibility in Scientific Evaluation” |
14:30 - 15:00 | TADM speaker in person or Parnas via Zoom |
15:00 - 15:30 | Debate |
15:30 - 16:00 | Discussion |
16:00 - 16:30 | Break |
TADM Invited Talks (30 minutes each) | |
16:30 - 17:00 | Bill Farmer, “An Alternative Approach to Formal Mathematics” |
17:00 - 17:30 | Lingyang Chu, “Boosted Adversarial Attacks: A Divide-and-Conquer Strategy for Revealing Vulnerabilities in Deep Learning Models” |
17:30 - 18:00 | Sandhya Saisubramanian via Zoom, “User-Aligned Assessment of Autonomous Systems” |
We particularly encourage research of a nascent or speculative nature, in order to chart a way forward. Two kinds of submissions are solicited: abstracts and position papers. TADM fosters the future of collaborative research for attaining the highest levels of trust in automated decision-making. Abstracts and position papers will be peer reviewed by an interdisciplinary team, with a minimum of two reviewers per submission. The top six submissions will be invited for presentation, and their camera-ready versions will be included in the pre-proceedings.
Proceedings will be published as video recordings on the TADM website.
Although AI is often presented as a new field that has made great strides recently, we should have a sense of déjà vu. There were university courses on AI more than fifty years ago. Then, as now, it was a promising field, a field in which researchers made great promises. It was then, and remains, a field that produces untrustworthy programs. Why is that, and what can we do about it?
David Lorge Parnas began studying professional software development in 1969. He is best known for the concept of information hiding, methods of precise software documentation, and advocacy for professionalism. He has taught at Carnegie Mellon, Technische Hochschule Darmstadt, McMaster University, and the University of Limerick, among others. He has received four honorary doctorates and is a Fellow of the Royal Society of Canada, the Royal Irish Academy, the Canadian Academy of Engineering, the Gesellschaft für Informatik, the ACM, and the IEEE.
Mark Lawford is the Director of the McMaster Centre for Software Certification (McSCert), recipient of the 2024 IEEE CS TCSE Distinguished Synergy Award. He joined McMaster University in 1998, where he helped develop the software engineering and mechatronics engineering programs. Throughout his career he has worked with industry on the practical application of formal methods and model-based software engineering to safety-critical real-time software. After working with industry partners in the nuclear and medical domains earlier in his career, over the past decade he has increasingly focused on working with OEMs on assurance of automotive systems.
Title: Software Defined Vehicle Safety: More than an AI problem
Abstract: Automotive innovation is increasingly software driven. With a competitive environment demanding yearly releases of new automotive models featuring the latest driver-assistance features and innovations in electrification, safety-critical software is being developed at an unprecedented scale on extremely tight deadlines. To be competitive, automotive manufacturers and parts suppliers need to change how they develop and assure software-intensive systems. Increasingly, this software makes use of Machine Learning (ML) to provide the new functionality. Further, researchers are advocating for the use of ML to automate the safety assurance of these systems. I will examine automotive accidents involving software and ML components in safety-critical automotive systems, discuss how the accidents could have been prevented, and comment on potential issues with the use of ML in safety systems and in the development of the safety assurance of those systems.
Dr. Corrada-Emmanuel trained as a physicist at Harvard and the University of Massachusetts at Amherst. He transitioned into industrial work by joining Paul Bamberg, one of his instructors at Harvard, at Dragon Systems. There he was part of the R&D team that released the first consumer continuous speech recognition product, the still-continuing Naturally Speaking line. The problem he encountered there, of errors in human annotators' transcriptions of recorded speech, was the genesis of his research into evaluating experts without ground truth. In 2008 he started working with Howard Schultz at UMass Amherst on the problem of fusing multiple noisy maps without ground truth. They developed an algebraic method that used Compressed Sensing techniques to recover the average error in sparsely-correlated maps. This was followed by a classification version in 2010 that uses Algebraic Geometry. More than a decade of research with this algebraic approach to evaluation culminated in 2023 with the realization that it is part of a wider logic of unsupervised evaluation. He recently founded NTQR Logic to research and promote the use of this logic for formal verification of expert evaluations in unsupervised settings.
Title: Who grades the graders? Logics of unsupervised evaluation for formal verification of monitoring experts
Abstract: Using AI agents to monitor the work of other agents is a safety design pattern actively being researched. In unsupervised settings - no human in the loop - this results in open monitoring chains with unspecified ambiguities at the end. This talk discusses a formal verification framework for assessments of monitoring experts via multiple-choice exams. These tests can be interpreted semantically, as classification evaluations. They can also be treated as semantics-free: we can consider the mathematics of mappings between observed counts of disagreements between experts on a test and the only set of correctness statistics that explain them. In unsupervised settings no answer keys are available for the exams experts take. As a consequence, formally verifying the logical soundness of monitoring experts is not possible. But formal verification of their logical consistency is possible, and it can be used to construct purely logical alarms that detect when at least one member of a monitoring ensemble is malfunctioning. The practical use of the alarms is demonstrated by having the latest reasoning LLMs grade other LLMs completing Dyck language expressions. The BIG-Bench-Mistake dataset used for the experiments provides a rich sandbox for demonstrating how the logic of multiple-choice exams is useful for grading experts carrying out more difficult evaluations.
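As a concrete illustration of the purely logical alarms mentioned above, here is a minimal sketch (with hypothetical names, not the NTQR formalism itself): on any question where two graders disagree, at least one of them is wrong, so the number of disagreements can never exceed the sum of the error counts their claimed accuracies allow. The check below applies that bound to every pair in an ensemble, with no answer key required.

```python
from itertools import combinations

def logical_alarm(claimed_accuracies, disagreements, n_questions):
    """Purely logical consistency check for an ensemble of graders.

    claimed_accuracies: dict grader -> claimed accuracy in [0, 1]
    disagreements: dict (grader_a, grader_b) -> number of questions on
        which the pair gave different answers
    n_questions: total number of questions on the test

    Returns the pairs whose claims are logically impossible: on every
    disagreement at least one grader is wrong, so
    disagreements(a, b) <= errors(a) + errors(b) must always hold.
    """
    max_errors = {g: (1.0 - acc) * n_questions
                  for g, acc in claimed_accuracies.items()}
    violations = []
    for a, b in combinations(claimed_accuracies, 2):
        observed = disagreements.get((a, b), disagreements.get((b, a), 0))
        if observed > max_errors[a] + max_errors[b]:
            violations.append((a, b))
    return violations

# Example: both graders claim 95% accuracy on a 100-question exam, yet
# they disagree on 20 questions -- impossible, so the alarm fires.
print(logical_alarm({"llm_a": 0.95, "llm_b": 0.95},
                    {("llm_a", "llm_b"): 20}, 100))
```

If the returned list is non-empty, at least one grader's self-report is logically inconsistent with the observed disagreements, and no ground-truth labels were needed to establish that.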
Alan Wassyng is a Professor in the Department of Computing and Software at McMaster University. He has been working on safety-critical software-intensive systems for over 30 years. His research focuses on how we develop such systems so that they are safe and dependable, and how we certify this. As a consultant to Ontario Hydro in the 1990s, he helped develop a new methodology for building safety-critical software at Ontario Hydro (now Ontario Power Generation) and was a senior member of the team that used that methodology to redesign the shutdown systems. That methodology, slightly modified, is still in use today. After running a software consulting business for 15 years, he returned to academic life, joining the Department of Computing and Software at McMaster University in 2002. His research at McMaster focuses on safety assurance of safety-critical software-intensive systems. With colleagues Tom Maibaum and Mark Lawford at McMaster, he founded the McMaster Centre for Software Certification (McSCert) in 2009. He was the Director of McSCert from 2009 to 2017. McSCert was the winner of the 2024 IEEE CS TCSE Distinguished Synergy Award, "for outstanding and sustained contributions to the software engineering community". Alan's graduate students publish primarily on developing and assuring safety of software-intensive systems, especially in the automotive and medical device domains. He was also one of the founders of the Software Certification Consortium (SCC), and has been the Chair of the SCC Steering Committee since its inception. The SCC held its first meeting in August 2007 and its 21st meeting in May 2024. Alan has collaborated with the US Nuclear Regulatory Commission (NRC) and with the US Food and Drug Administration on safety of digital systems. He was one of the 10 researchers selected worldwide by the NRC to participate in their Expert Clinic on Safety of Digital Instrumentation & Control in 2010.
I am a PhD student advised by Prof. Guy Katz at the School of Computer Science and Engineering at the Hebrew University of Jerusalem. My research is centered on the verification of deep neural networks (DNNs), focusing on proof production. In particular, I am working on: (1) enabling DNN verifiers to produce proofs that attest to their correctness or reveal potential errors in their implementation; (2) reliably checking proofs produced by verifiers; and (3) leveraging proofs to improve the scalability of DNN verifiers.
Title: Neural Network Verification with Proof Production
Abstract: The rapid rise of AI in computer science has led to the widespread deployment of deep neural networks (DNNs), including in safety-critical domains. This growing dependence on DNNs creates an urgent need to ensure their correctness. In response, the verification community has devised various techniques and tools for verifying DNNs. When DNN verifiers discover an input that triggers an error, that input is easy to confirm; but when they report that no error exists, it is much more difficult to ensure that the verification tool itself is not flawed. As multiple errors have already been observed in DNN verification tools, the reliability of DNN verification has been called into question. One prominent approach to addressing this issue is proof production, i.e., the generation of mathematical objects that witness the correctness of the verification process.
In this talk, I will provide a brief overview of DNN verification and introduce a method for augmenting a class of verifiers with proof production capabilities. Specifically, we will explore: 1) the construction of such a mechanism based on a constructive adaptation of the well-known Farkas lemma and mechanisms for handling piecewise-linear functions; 2) how to reliably check these proofs; and 3) how proof production could be leveraged to improve the performance of DNN verifiers, using conflict clauses.
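To make the proof-production idea concrete, here is a minimal sketch of the kind of certificate the Farkas lemma supplies; it is a hypothetical NumPy illustration, not the speaker's verifier. If a linear system A x <= b has no solution, then there exists y >= 0 with A^T y = 0 and b^T y < 0, and checking a supplied y against those conditions is far simpler than re-running the verification itself.

```python
import numpy as np

def check_farkas_certificate(A, b, y, tol=1e-9):
    """Check a Farkas certificate of infeasibility for the system A x <= b.

    By (a variant of) the Farkas lemma, A x <= b has no solution exactly
    when some y >= 0 satisfies A^T y = 0 and b^T y < 0.  Verifying a given
    y against these three conditions independently confirms a verifier's
    "infeasible" answer without trusting the verifier's internals.
    """
    A, b, y = np.asarray(A, float), np.asarray(b, float), np.asarray(y, float)
    nonnegative = np.all(y >= -tol)                      # y >= 0
    combination_vanishes = np.allclose(A.T @ y, 0.0, atol=tol)  # A^T y = 0
    strictly_negative = float(b @ y) < -tol              # b^T y < 0
    return nonnegative and combination_vanishes and strictly_negative

# Example: x <= 0 together with -x <= -1 (i.e., x >= 1) is infeasible.
A = [[1.0], [-1.0]]
b = [0.0, -1.0]
y = [1.0, 1.0]   # adding the two rows yields 0 <= -1, a contradiction
print(check_farkas_certificate(A, b, y))   # True: certificate accepted
```

The same idea underlies proof checking for DNN verifiers: the verifier does the hard search, and a small, independently trusted checker validates the resulting certificates.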
Sandhya Saisubramanian is an Assistant Professor in the School of EECS at Oregon State University, where she leads the Intelligent and Reliable Autonomous Systems (IRAS) research group. Her research focuses on safe and reliable decision-making in autonomous systems that operate in complex, unstructured environments. She received her Ph.D. from the University of Massachusetts Amherst. Saisubramanian is a recipient of the Outstanding Program Committee Member award at ICAPS 2022 and a Distinguished Paper Award at IJCAI 2020.
Title: User-Aligned Assessment of Autonomous Systems
Abstract: As autonomous systems become increasingly embedded in real-world settings, evaluating their performance through aggregate metrics like task completion rate or accuracy is often insufficient. Such measures can obscure critical misalignments between system behavior and user expectations. In this talk, I will present two approaches, each focusing on different sources of errors: (1) validating the agent's learning process through explanations; and (2) detecting and categorizing errors, including identifying scenarios where successful task completion is inherently infeasible. By aligning assessment methods with what users truly value, we move toward more reliable and trustworthy autonomous systems.
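As a toy illustration of the second approach, the sketch below (a hypothetical example, not the speaker's method) separates a failed navigation episode into "the task was inherently infeasible" versus "the agent made a poor decision" by running a reachability check in a known environment model.

```python
from collections import deque

def categorize_failure(grid, start, goal):
    """Toy error categorization for a failed navigation episode.

    grid: list of strings, '#' = obstacle, '.' = free cell.
    If the goal is unreachable from the start in the model, the failure
    is inherently infeasible (no policy could have succeeded); otherwise
    the failure points at the agent's own decision-making.
    """
    rows, cols = len(grid), len(grid[0])
    seen, frontier = {start}, deque([start])
    while frontier:
        r, c = frontier.popleft()
        if (r, c) == goal:
            return "agent error: task was feasible but not completed"
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != '#' and (nr, nc) not in seen):
                seen.add((nr, nc))
                frontier.append((nr, nc))
    return "infeasible task: goal unreachable in the environment model"

# A wall separates the start from the goal, so no policy could succeed.
print(categorize_failure(["..#.",
                          "..#.",
                          "..#."], (0, 0), (0, 3)))
```

Reporting the two categories separately, rather than a single success rate, is one simple way an assessment can stay aligned with what a user actually needs to know.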
Title: An Alternative Approach to Formal Mathematics
Abstract: Formal mathematics is mathematics done with the aid of a formal logic. It offers mathematics practitioners -- mathematicians, computing professionals, engineers, and scientists -- four major benefits over traditional mathematics: (1) greater rigor, (2) software support, (3) mechanically checked proofs, and (4) mathematical knowledge as a formal structure. The standard approach to formal mathematics focuses on certification: mathematics is done with the help of a proof assistant, and all details are formally proved and mechanically checked. Although there is a good argument that the standard approach has been a big success, it has had very little impact on mathematical practice. Only a small slice of the millions of mathematics practitioners have ever used a formal logic or proof assistant in their work!
We propose, as an alternative to the standard approach, an approach to formal mathematics that focuses on two goals: communication and accessibility. We argue that to achieve these goals the approach needs to satisfy the following requirements: (1) the underlying logic is as close to mathematical practice as possible, (2) theories, notation, and proofs are constructed to maximize communication, (3) mathematical knowledge is organized using the little theories method, and (4) there are several levels of supporting software. We believe that formal mathematics can be made more useful, accessible, and natural to a wider range of mathematics practitioners by pursuing this alternative approach. Moreover, this approach can be used as a stepping stone for helping mathematics practitioners cross the great void between traditional mathematics and fully formal mathematics.
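As a small illustrative sketch, assuming Lean 4 as the proof assistant (the talk does not prescribe one), the snippet below combines two of the ingredients listed above: a mechanically checked proof and the little theories method. A tiny axiomatic theory is stated once, a lemma is proved inside it, and the theory is then instantiated.

```lean
-- A minimal sketch (assumed Lean 4, not the approach proposed in the talk):
-- a "little theory" of monoids, one lemma proved inside it, and an instance.

structure LittleMonoid (α : Type) where
  op : α → α → α
  e  : α
  op_assoc : ∀ a b c, op (op a b) c = op a (op b c)
  e_op : ∀ a, op e a = a
  op_e : ∀ a, op a e = a

-- Proved once in the little theory, reusable in every interpretation.
theorem identity_unique {α : Type} (M : LittleMonoid α) (e' : α)
    (h : ∀ a, M.op e' a = a) : e' = M.e :=
  calc e' = M.op e' M.e := (M.op_e e').symm
    _ = M.e := h M.e

-- One interpretation of the theory: natural numbers under addition.
def natAdd : LittleMonoid Nat where
  op := Nat.add
  e := 0
  op_assoc := Nat.add_assoc
  e_op := Nat.zero_add
  op_e := Nat.add_zero
```

Every theorem proved in the little theory, such as identity_unique above, transfers automatically to each interpretation, which is the organizational payoff the little theories method aims for.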
Dr. Farmer's research interests are logic, mathematical knowledge management, mechanized mathematics, and formal methods. One of his most significant achievements is the design and implementation of the IMPS proof assistant, which was done at MITRE in partnership with Dr. Joshua Guttman and Dr. Javier Thayer. His work on IMPS has led to research on developing practical logics based on simple type theory and NBG set theory and on organizing mathematical knowledge as a network of interconnected axiomatic theories. He has also collaborated with Dr. Jacques Carette for several years at McMaster on developing a framework for integrating axiomatic and algorithmic mathematics. As part of this research, Dr. Farmer has investigated how to reason about the interplay of syntax and semantics, as exhibited in syntax-based mathematical algorithms like symbolic differentiation, within a logic equipped with global quotation and evaluation operators. Dr. Farmer is currently working on developing a communication-oriented approach to formal mathematics as an alternative to the standard certification-oriented approach employed using proof assistants.
Dr. Lingyang Chu is an Assistant Professor in the Department of Computing and Software at McMaster University. He earned his Ph.D. in Computer Science from the University of Chinese Academy of Sciences in 2015. Following his doctorate, Dr. Chu was a Postdoctoral Fellow at Simon Fraser University under the supervision of Prof. Jian Pei from 2015 to 2018. He then served as a Principal Researcher at Huawei Technologies Canada before joining McMaster University on January 1, 2021.
Throughout his career, Dr. Chu has made significant contributions to both academia and industry. His scholarly output includes 56 research works accumulating over 1,976 citations. His research has been published in top-tier venues, and one of his works on interpretable AI was highlighted by a mainstream AI research portal in China in 2018. In the industry sector, many of his research outcomes have been successfully deployed as services and products. Notably, a personalized federated learning system he worked on was implemented as a core service on tens of millions of smartphones running Harmony OS.
Title: Boosted Adversarial Attacks: A Divide-and-Conquer Strategy for Revealing Vulnerabilities in Deep Learning Models
Abstract: Ensuring the robustness of deep learning models requires a systematic understanding of how models may fail under adversarial conditions. In this talk, I present two of my recent works that explore the concept of boosted attack—a principled adversarial methodology that combines multiple diversified attack strategies and coordinates them through a learned selection mechanism. This divide-and-conquer approach produces stronger and more adaptive adversarial behaviors, which provides an effective lens to uncover hidden vulnerabilities in modern deep neural networks. Specifically, the first work studies the strength of the boosted attack method against image-based neural networks. We propose the Cocktail Universal Adversarial Attack (CUAP), which jointly trains a set of diverse universal perturbations and a lightweight selection network that assigns the most effective perturbation to attack each input image. This method significantly improves attack success rates while preserving real-time efficiency. The second work further studies the strength of the boosted attack method against graph neural networks. We introduce the Multifaceted Anchor Nodes Attack (MFAN), which constructs multiple sets of anchor nodes and trains an assignment network to select the optimal set for each target node. This strategy achieves high attack effectiveness under strict budget constraints and demonstrates strong performance in structural attacks on graphs. Together, these studies establish boosted attack as a general and scalable strategy for adversarial evaluation. The findings reveal critical structural weaknesses in deep learning models and point to new directions for designing models with stronger security and robustness.
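The following is a schematic sketch of the divide-and-conquer pattern the abstract describes, with hypothetical shapes, names, and training details rather than the actual CUAP or MFAN implementations: a bank of diverse universal perturbations is trained jointly with a lightweight selector network that decides which perturbation to apply to each input.

```python
import torch
import torch.nn as nn

class PerturbationBank(nn.Module):
    """Schematic 'boosted attack' pattern: a set of universal perturbations
    plus a lightweight selector that picks one per input.
    (Hypothetical shapes and training loop, not the CUAP/MFAN code.)"""

    def __init__(self, n_perturbations=4, image_shape=(3, 32, 32), epsilon=8 / 255):
        super().__init__()
        self.epsilon = epsilon
        # The bank of universal (input-independent) perturbations.
        self.deltas = nn.Parameter(torch.zeros(n_perturbations, *image_shape))
        # Lightweight selection network: which perturbation fits this input?
        self.selector = nn.Sequential(
            nn.Flatten(),
            nn.Linear(int(torch.tensor(image_shape).prod()), 64),
            nn.ReLU(),
            nn.Linear(64, n_perturbations),
        )

    def forward(self, x):
        weights = torch.softmax(self.selector(x), dim=1)          # (B, K) soft choice
        deltas = torch.clamp(self.deltas, -self.epsilon, self.epsilon)
        chosen = torch.einsum("bk,kchw->bchw", weights, deltas)   # blend during training
        return torch.clamp(x + chosen, 0.0, 1.0)

# Training step sketch: maximize the victim model's loss on perturbed inputs.
victim = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # stand-in model
attack = PerturbationBank()
opt = torch.optim.Adam(attack.parameters(), lr=1e-2)

x = torch.rand(16, 3, 32, 32)
y = torch.randint(0, 10, (16,))
loss = -nn.functional.cross_entropy(victim(attack(x)), y)  # ascend the victim's loss
opt.zero_grad()
loss.backward()
opt.step()
```

At attack time a hard argmax over the selector's output would replace the soft blend; the soft weighting here simply keeps the sketch differentiable end to end, which is the "divide" (diverse perturbations) and "conquer" (learned assignment) structure the talk examines for both images and graphs.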
Please feel free to reach out to Ramesh Bharadwaj with any questions you may have.