Explainable AI reveals Clever Hans effects in unsupervised learning models

Brown, T. B. et al. Language models are few-shot learners. In Advances in Neural Information Processing Systems, NeurIPS Vol. 33 (eds Larochelle, H. et al.) 1877–1901 (Curran Associates, 2020).
Ruff, L. et al. A unifying review of deep and shallow anomaly detection. Proc. IEEE 109, 756–795 (2021).
Krishnan, R., Rajpurkar, P. & Topol, E. J. Self-supervised learning in medicine and healthcare. Nat. Biomed. Eng. 6, 1346–1352 (2022).
Li, A. et al. Unsupervised analysis of transcriptomic profiles reveals six glioma subtypes. Cancer Res. 69, 2091–2099 (2009).
Jiang, L., Xiao, Y., Ding, Y., Tang, J. & Guo, F. Discovering cancer subtypes via an accurate fusion strategy on multiple profile data. Front. Genet. 10, 20 (2019).
Eberle, O. et al. Historical insights at scale: a corpus-wide machine learning analysis of early modern astronomic tables. Sci. Adv. 10, eadj1719 (2024).
Rettig, L., Khayati, M., Cudré-Mauroux, P. & Piórkowski, M. in Applied Data Science 289–312 (Springer, 2019).
Eskin, E., Arnold, A., Prerau, M. J., Portnoy, L. & Stolfo, S. J. in Applications of Data Mining in Computer Security, Advances in Information Security 77–101 (Springer, 2002).
Bergmann, P., Batzner, K., Fauser, M., Sattlegger, D. & Steger, C. The MVTec anomaly detection dataset: a comprehensive real-world dataset for unsupervised anomaly detection. Int. J. Comput. Vis. 129, 1038–1059 (2021).
Zipfel, J. et al. Anomaly detection for industrial quality assurance: a comparative evaluation of unsupervised deep learning models. Comput. Ind. Eng. 177, 109045 (2023).
Bommasani, R. et al. On the opportunities and risks of foundation models. Preprint at https://arxiv.org/abs/2108.07258 (2021).
Chen, T., Kornblith, S., Swersky, K., Norouzi, M. & Hinton, G. E. Big self-supervised models are strong semi-supervised learners. In Advances in Neural Information Processing Systems, NeurIPS Vol. 33 (eds Larochelle, H. et al.) 22243–22255 (Curran Associates, 2020).
Radford, A. et al. Learning transferable visual models from natural language supervision. In ICML Proc. Machine Learning Research Vol. 139, 8748–8763 (PMLR, 2021).
Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616, 259–265 (2023).
Dippel, J. et al. RudolfV: a foundation model by pathologists for pathologists. Preprint at https://arxiv.org/abs/2401.04079 (2024).
Lapuschkin, S. et al. Unmasking Clever Hans predictors and assessing what machines really learn. Nat. Commun. 10, 1096 (2019).
Geirhos, R. et al. Shortcut learning in deep neural networks. Nat. Mach. Intell. 2, 665–673 (2020).
Schramowski, P. et al. Making deep neural networks right for the right scientific reasons by interacting with their explanations. Nat. Mach. Intell. 2, 476–486 (2020).
DeGrave, A. J., Janizek, J. D. & Lee, S.-I. AI for radiographic COVID-19 detection selects shortcuts over signal. Nat. Mach. Intell. 3, 610–619 (2021).
Anders, C. J. et al. Finding and removing Clever Hans: using explanation methods to debug and improve deep models. Inf. Fusion 77, 261–295 (2022).
Linhardt, L., Müller, K.-R. & Montavon, G. Preemptively pruning Clever-Hans strategies in deep neural networks. Inf. Fusion 103, 102094 (2024).
Winkler, J. K. et al. Association between surgical skin markings in dermoscopic images and diagnostic performance of a deep learning convolutional neural network for melanoma recognition. JAMA Dermatol. 155, 1135–1141 (2019).
Montavon, G., Samek, W. & Müller, K.-R. Methods for interpreting and understanding deep neural networks. Digit. Signal Process. 73, 1–15 (2018).
Gunning, D. et al. XAI—explainable artificial intelligence. Sci. Robot. 4, eaay7120 (2019).
Arrieta, A. B. et al. Explainable artificial intelligence (XAI): concepts, taxonomies, opportunities and challenges toward responsible AI. Inf. Fusion 58, 82–115 (2020).
Samek, W., Montavon, G., Lapuschkin, S., Anders, C. J. & Müller, K.-R. Explaining deep neural networks and beyond: a review of methods and applications. Proc. IEEE 109, 247–278 (2021).
Klauschen, F. et al. Toward explainable artificial intelligence for precision pathology. Annu. Rev. Pathol. 19, 541–570 (2024).
Bach, S. et al. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE 10, e0130140 (2015).
Montavon, G., Binder, A., Lapuschkin, S., Samek, W. & Müller, K.-R. in Lecture Notes in Computer Science Vol. 11700 (eds Samek, W. et al.) 193–209 (Springer, 2019).
Kauffmann, J., Müller, K.-R. & Montavon, G. Towards explaining anomalies: a deep Taylor decomposition of one-class models. Pattern Recognit. 101, 107198 (2020).
Eberle, O. et al. Building and interpreting deep similarity models. IEEE Trans. Pattern Anal. Mach. Intell. 44, 1149–1161 (2022).
Vielhaben, J., Lapuschkin, S., Montavon, G. & Samek, W. Explainable AI for time series via virtual inspection layers. Pattern Recognit. 150, 110309 (2024).
Chormai, P., Herrmann, J., Müller, K.-R. & Montavon, G. Disentangled explanations of neural network predictions by finding relevant subspaces. IEEE Trans. Pattern Anal. Mach. Intell. 46, 7283–7299 (2024).
Zhou, C. et al. LIMA: less is more for alignment. In Advances in Neural Information Processing Systems, NeurIPS Vol. 36 (eds Oh, A. et al.) 55006–55021 (Curran Associates, 2023).
Muttenthaler, L., Dippel, J., Linhardt, L., Vandermeulen, R. A. & Kornblith, S. Human alignment of neural network representations. In Proc. International Conference on Learning Representations (ICLR) (OpenReview.net, 2023).
Wang, X. et al. ChestX-Ray8: hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 3462–3471 (IEEE, 2017).
Cohen, J. P. et al. COVID-19 image data collection: prospective predictions are the future. Preprint at https://arxiv.org/abs/2006.11988 (2020).
Azizi, S. et al. Big self-supervised models advance medical image classification. In Proc. International Conference on Computer Vision (ICCV) 3458–3468 (IEEE, 2021).
Eslami, S., Meinel, C. & de Melo, G. PubMedCLIP: how much does CLIP benefit visual question answering in the medical domain? In Proc. Findings of the Association for Computational Linguistics (EACL) 1151–1163 (Association for Computational Linguistics, 2023).
Huang, S.-C. et al. Self-supervised learning for medical image classification: a systematic review and implementation guidelines. npj Digit. Med. 6, 74 (2023).
Chen, T., Kornblith, S., Norouzi, M. & Hinton, G. E. A simple framework for contrastive learning of visual representations. In ICML Proc. Machine Learning Research Vol. 119, 1597–1607 (PMLR, 2020).
Zbontar, J., Jing, L., Misra, I., LeCun, Y. & Deny, S. Barlow twins: self-supervised learning via redundancy reduction. In ICML Proc. Machine Learning Research Vol. 139, 12310–12320 (PMLR, 2021).
Deng, J. et al. ImageNet: a large-scale hierarchical image database. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 248–255 (IEEE, 2009).
Chen, T., Luo, C. & Li, L. Intriguing properties of contrastive losses. In Advances in Neural Information Processing Systems, NeurIPS Vol. 34 (eds Ranzato, M. et al.) 11834–11845 (Curran Associates, 2021).
Robinson, J. et al. Can contrastive learning avoid shortcut solutions? In Advances in Neural Information Processing Systems, NeurIPS Vol. 34 (eds Ranzato, M. et al.) 4974–4986 (Curran Associates, 2021).
Dippel, J., Vogler, S. & Höhne, J. Towards fine-grained visual representations by combining contrastive learning with image reconstruction and attention-weighted pooling. In ICML Workshop: Self-Supervised Learning for Reasoning and Perception (2021).
Li, T. et al. Addressing feature suppression in unsupervised visual representations. In Proc. Winter Conference on Applications of Computer Vision (WACV) 1411–1420 (IEEE, 2023).
Roth, K. et al. Towards total recall in industrial anomaly detection. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 14298–14308 (IEEE, 2022).
Batzner, K., Heckler, L. & König, R. EfficientAD: accurate visual anomaly detection at millisecond-level latencies. In Proc. Winter Conference on Applications of Computer Vision (WACV) 127–137 (IEEE, 2024).
Harmeling, S., Dornhege, G., Tax, D., Meinecke, F. & Müller, K.-R. From outliers to prototypes: ordering data. Neurocomputing 69, 1608–1618 (2006).
Aggarwal, C. C. Outlier Analysis (Springer, 2013).
Chandola, V., Banerjee, A. & Kumar, V. Anomaly detection: a survey. ACM Comput. Surv. 41, 15:1–15:58 (2009).
Parzen, E. On estimation of a probability density function and mode. Ann. Math. Stat. 33, 1065–1076 (1962).
Kim, J. & Scott, C. D. Robust kernel density estimation. J. Mach. Learn. Res. 13, 2529–2565 (2012).
Schölkopf, B., Platt, J. C., Shawe-Taylor, J., Smola, A. J. & Williamson, R. C. Estimating the support of a high-dimensional distribution. Neural Comput. 13, 1443–1471 (2001).
Montavon, G., Kauffmann, J. R., Samek, W. & Müller, K.-R. in Lecture Notes in Computer Science Vol. 13200 (eds Holzinger, A. et al.) 117–138 (Springer, 2020).
Yu, Y., Qian, J. & Wu, Q. Visual saliency via multiscale analysis in frequency domain and its applications to ship detection in optical satellite images. Front. Neurorobot. 15, 767299 (2022).
Parmar, G., Zhang, R. & Zhu, J. On aliased resizing and surprising subtleties in GAN evaluation. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 11400–11410 (IEEE, 2022).
Kirichenko, P., Izmailov, P. & Wilson, A. G. Last layer re-training is sufficient for robustness to spurious correlations. In Proc. International Conference on Learning Representations (ICLR) (OpenReview.net, 2023).
Sugiyama, M., Krauledat, M. & Müller, K.-R. Covariate shift adaptation by importance weighted cross validation. J. Mach. Learn. Res. 8, 985–1005 (2007).
Sugiyama, M. & Kawanabe, M. Machine Learning in Non-stationary Environments: Introduction to Covariate Shift Adaptation (MIT Press, 2012).
Iwasawa, Y. & Matsuo, Y. Test-time classifier adjustment module for model-agnostic domain generalization. In Advances in Neural Information Processing Systems, NeurIPS Vol. 34 (eds Ranzato, M. et al.) 2427–2440 (Curran Associates, 2021).
Esposito, C., Landrum, G. A., Schneider, N., Stiefl, N. & Riniker, S. Ghost: adjusting the decision threshold to handle imbalanced data in machine learning. J. Chem. Inf. Model. 61, 2623–2640 (2021).
Niven, T. & Kao, H. Probing neural network comprehension of natural language arguments. In Proc. Conference of the Association for Computational Linguistics (eds Korhonen, A. et al.) 4658–4664 (Association for Computational Linguistics, 2019).
Heinzerling, B. NLP’s Clever Hans moment has arrived. J. Cogn. Sci. 21, 161–170 (2020).
Braun, M. L., Buhmann, J. M. & Müller, K.-R. On relevant dimensions in kernel feature spaces. J. Mach. Learn. Res. 9, 1875–1908 (2008).
Basri, R. et al. Frequency bias in neural networks for input of non-uniform density. In ICML Proc. Machine Learning Research Vol. 119, 685–694 (PMLR, 2020).
Fridovich-Keil, S., Lopes, R. G. & Roelofs, R. Spectral bias in practice: the role of function frequency in generalization. In Advances in Neural Information Processing Systems, NeurIPS Vol. 35 (eds Koyejo, S. et al.) (Curran Associates, 2022).
Arras, L. et al. in Lecture Notes in Computer Science Vol. 11700 (eds Samek, W. et al.) 211–238 (Springer, 2019).
Schnake, T. et al. Higher-order explanations of graph neural networks via relevant walks. IEEE Trans. Pattern Anal. Mach. Intell. 44, 7581–7596 (2022).
Ali, A. et al. XAI for transformers: better explanations through conservative propagation. In ICML Proc. Machine Learning Research Vol. 162, 435–451 (PMLR, 2022).
Jafari, F. R., Montavon, G., Müller, K.-R. & Eberle, O. MambaLRP: explaining selective state space sequence models. In Advances in Neural Information Processing Systems, NeurIPS Vol. 37 (eds Globerson, A. et al.) 118540–118570 (Curran Associates, 2024).
Munir, M., Siddiqui, S. A., Dengel, A. & Ahmed, S. DeepAnT: a deep learning approach for unsupervised anomaly detection in time series. IEEE Access 7, 1991–2005 (2019).
Devlin, J., Chang, M., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT) (eds Burstein, J. et al.) 4171–4186 (Association for Computational Linguistics, 2019).
Pelka, O., Koitka, S., Rückert, J., Nensa, F. & Friedrich, C. M. in Lecture Notes in Computer Science Vol. 11043 (eds Stoyanov, D. et al.) 180–189 (Springer, 2018).
PubMedCLIP Hugging Face (2024).
OpenAI. CLIP. GitHub https://github.com/openai/CLIP (2024).
Facebook Research. barlowtwins. GitHub https://github.com/facebookresearch/barlowtwins (2022).
CXR8 NIHCC (2017).
ieee8023. covid-chestxray-dataset. GitHub https://github.com/ieee8023/covid-chestxray-dataset (2020).
Pang, G., Shen, C., Cao, L. & van den Hengel, A. Deep learning for anomaly detection: a review. ACM Comput. Surv. 54, 38:1–38:38 (2022).
Rippel, O., Mertens, P. & Merhof, D. Modeling the distribution of normal data in pre-trained deep features for anomaly detection. In ICPR 6726–6733 (IEEE, 2020).
Jelinek, F., Mercer, R. L., Bahl, L. R. & Baker, J. K. Perplexity—a measure of the difficulty of speech recognition tasks. J. Acoust. Soc. Am. 62, S63–S63 (1977).
Amazon Science. patchcore-inspection. GitHub https://github.com/amazon-science/patchcore-inspection (2022).
Bradski, G. The OpenCV library. Dr. Dobb's Journal of Software Tools 120, 122–125 (2000).
Torchvision: PyTorch's computer vision library. GitHub https://github.com/pytorch/vision (2016).
Samek, W., Montavon, G., Vedaldi, A., Hansen, L. K. & Müller, K.-R. Explainable AI: Interpreting, Explaining And Visualizing Deep Learning Vol. 11700 (Springer, 2019).
zennit. GitHub https://github.com/chr5tphr/zennit (2021).
Kauffmann, J. et al. Explainable AI reveals Clever Hans effects in unsupervised learning models: code. Zenodo (2024).
ml-workgroup. COVID-19 image repository. GitHub https://github.com/ml-workgroup/covid-19-image-repository (2020).