I am currently a PhD candidate in the Cybersecurity and AI working group at Freie Universität Berlin under the supervision of Prof. Gerhard Wunder. My research centers on responsible AI, with a particular emphasis on explainability, fairness, and robustness of machine learning and AI systems.

Building on several published works in feature attribution, I am now actively seeking to extend my research to broader forms of explanation and investigate their utility, including higher-order feature interactions, data attribution, and their practical impact on model training and debugging.

Explainable AI · Feature Attribution · LLMs

Experience

Academic roles
PhD Student, Research Assistant in Explainable AI
Freie Universität Berlin · Berlin, Germany
2022 – Present
Research Assistant in Explainable AI
Leibniz Universität Hannover · Hannover, Germany
2020 – 2022
Student Research Assistant in Semi-supervised Machine Learning
Leibniz Universität Hannover · Hannover, Germany
2017 â€“ 2018

Research & Publications

Selected work
Rethinking Explanation Evaluation under the Retraining Scheme
Accepted at AAAI'26

Revisits the widely used retraining-based evaluation for feature attribution, demonstrates its pitfalls, and proposes efficient, distortion-free alternatives for assessing explanation quality.

GEFA: A General Feature Attribution Framework Using Proxy Gradient Estimation
ICML'25

Derives a feature attribution framework based on proxy gradient estimation. The proposed method provides an unbiased estimate of Shapley values and applies generally across model architectures and data modalities.

On Gradient-like Explanation under a Black-box Setting
ICML'24

Explores the possibility of delivering gradient-like attributions with only query-level access. The resulting method preserves a set of desirable axioms of white-box approaches and demonstrates competitive performance in empirical experiments.

For a full list of publications, please visit my Google Scholar profile.

Projects

The projects that fund(ed) my research
Opportunities and risks of generative AI in cybersecurity (AIgenCY)
2024 – Present

AIgenCY conducts fundamental research on existing and emerging threats posed to and by generative AI, in particular large language models and foundation models. In parallel, it develops measures to improve the detection of and defence against such cyber attacks.

Center for Trustworthy AI (ZVKI)
2022 – 2024

The independent think tank iRights.Lab founded the centre in partnership with the Fraunhofer institutes AISEC and IAIS as well as Freie Universität Berlin. As a forum for debate in Germany, the ZVKI makes societal questions surrounding artificial intelligence and algorithmic systems tangible.

Responsible Artificial Intelligence
2020 – 2022

Artificial intelligence (AI) technologies are a driving force behind digitization. Given their enormous social relevance, the responsible use of AI is particularly important. Responsible AI is a young discipline that requires bundling research activities from different fields in order to design and apply AI systems in a reliable, transparent, secure, and legally sound way.

KISWIND
2021 – 2022

The demand for automated damage detection in civil infrastructure is high, for both economic and safety reasons. This collaborative research project aims to contribute to the further development of automated damage detection in wind turbines based on acoustic emission testing (AET) and machine learning.

Contact

Reach out for collaborations, talks, or questions