Explainable AI reveals Clever Hans effects in unsupervised learning models

  • Brown, T. B. et al. Language models are few-shot learners. In Advances in Neural Information Processing Systems, NeurIPS Vol. 33 (eds Larochelle, H. et al.) 1877–1901 (Curran Associates, 2020).

  • Ruff, L. et al. A unifying review of deep and shallow anomaly detection. Proc. IEEE 109, 756–795 (2021).


  • Krishnan, R., Rajpurkar, P. & Topol, E. J. Self-supervised learning in medicine and healthcare. Nat. Biomed. Eng. 6, 1346–1352 (2022).


  • Li, A. et al. Unsupervised analysis of transcriptomic profiles reveals six glioma subtypes. Cancer Res. 69, 2091–2099 (2009).


  • Jiang, L., Xiao, Y., Ding, Y., Tang, J. & Guo, F. Discovering cancer subtypes via an accurate fusion strategy on multiple profile data. Front. Genet. 10, 20 (2019).


  • Eberle, O. et al. Historical insights at scale: a corpus-wide machine learning analysis of early modern astronomic tables. Sci. Adv. 10, eadj1719 (2024).


  • Rettig, L., Khayati, M., Cudré-Mauroux, P. & Piórkowski, M. in Applied Data Science 289–312 (Springer, 2019).

  • Eskin, E., Arnold, A., Prerau, M. J., Portnoy, L. & Stolfo, S. J. in Applications of Data Mining in Computer Security, Advances in Information Security 77–101 (Springer, 2002).

  • Bergmann, P., Batzner, K., Fauser, M., Sattlegger, D. & Steger, C. The MVTec anomaly detection dataset: a comprehensive real-world dataset for unsupervised anomaly detection. Int. J. Comput. Vis. 129, 1038–1059 (2021).


  • Zipfel, J. et al. Anomaly detection for industrial quality assurance: a comparative evaluation of unsupervised deep learning models. Comput. Ind. Eng. 177, 109045 (2023).


  • Bommasani, R. et al. On the opportunities and risks of foundation models. Preprint at (2021).

  • Chen, T., Kornblith, S., Swersky, K., Norouzi, M. & Hinton, G. E. Big self-supervised models are strong semi-supervised learners. In Advances in Neural Information Processing Systems, NeurIPS Vol. 33 (eds Larochelle, H. et al.) 22243–22255 (Curran Associates, 2020).

  • Radford, A. et al. Learning transferable visual models from natural language supervision. In ICML Proc. Machine Learning Research Vol. 139, 8748–8763 (PMLR, 2021).

  • Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616, 259–265 (2023).


  • Dippel, J. et al. RudolfV: a foundation model by pathologists for pathologists. Preprint at (2024).

  • Lapuschkin, S. et al. Unmasking Clever Hans predictors and assessing what machines really learn. Nat. Commun. 10, 1096 (2019).


  • Geirhos, R. et al. Shortcut learning in deep neural networks. Nat. Mach. Intell. 2, 665–673 (2020).


  • Schramowski, P. et al. Making deep neural networks right for the right scientific reasons by interacting with their explanations. Nat. Mach. Intell. 2, 476–486 (2020).


  • DeGrave, A. J., Janizek, J. D. & Lee, S.-I. AI for radiographic COVID-19 detection selects shortcuts over signal. Nat. Mach. Intell. 3, 610–619 (2021).


  • Anders, C. J. et al. Finding and removing Clever Hans: using explanation methods to debug and improve deep models. Inf. Fusion 77, 261–295 (2022).


  • Linhardt, L., Müller, K.-R. & Montavon, G. Preemptively pruning Clever-Hans strategies in deep neural networks. Inf. Fusion 103, 102094 (2024).


  • Winkler, J. K. et al. Association between surgical skin markings in dermoscopic images and diagnostic performance of a deep learning convolutional neural network for melanoma recognition. JAMA Dermatol. 155, 1135–1141 (2019).


  • Montavon, G., Samek, W. & Müller, K.-R. Methods for interpreting and understanding deep neural networks. Digit. Signal Process. 73, 1–15 (2018).


  • Gunning, D. et al. XAI—explainable artificial intelligence. Sci. Robot. 4, eaay7120 (2019).


  • Arrieta, A. B. et al. Explainable artificial intelligence (XAI): concepts, taxonomies, opportunities and challenges toward responsible AI. Inf. Fusion 58, 82–115 (2020).


  • Samek, W., Montavon, G., Lapuschkin, S., Anders, C. J. & Müller, K.-R. Explaining deep neural networks and beyond: a review of methods and applications. Proc. IEEE 109, 247–278 (2021).


  • Klauschen, F. et al. Toward explainable artificial intelligence for precision pathology. Annu. Rev. Pathol. 19, 541–570 (2024).


  • Bach, S. et al. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE 10, e0130140 (2015).


  • Montavon, G., Binder, A., Lapuschkin, S., Samek, W. & Müller, K.-R. in Lecture Notes in Computer Science Vol. 11700 (eds Samek, W. et al.) 193–209 (Springer, 2019).

  • Kauffmann, J., Müller, K.-R. & Montavon, G. Towards explaining anomalies: a deep Taylor decomposition of one-class models. Pattern Recognit. 101, 107198 (2020).


  • Eberle, O. et al. Building and interpreting deep similarity models. IEEE Trans. Pattern Anal. Mach. Intell. 44, 1149–1161 (2022).


  • Vielhaben, J., Lapuschkin, S., Montavon, G. & Samek, W. Explainable AI for time series via virtual inspection layers. Pattern Recognit. 150, 110309 (2024).


  • Chormai, P., Herrmann, J., Müller, K.-R. & Montavon, G. Disentangled explanations of neural network predictions by finding relevant subspaces. IEEE Trans. Pattern Anal. Mach. Intell. 46, 7283–7299 (2024).


  • Zhou, C. et al. LIMA: less is more for alignment. In Advances in Neural Information Processing Systems, NeurIPS Vol. 36 (eds Oh, A. et al.) 55006–55021 (Curran Associates, 2023).

  • Muttenthaler, L., Dippel, J., Linhardt, L., Vandermeulen, R. A. & Kornblith, S. Human alignment of neural network representations. In Proc. International Conference on Learning Representations (ICLR) (2023).

  • Wang, X. et al. ChestX-Ray8: hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 3462–3471 (IEEE, 2017).

  • Cohen, J. P. et al. COVID-19 image data collection: prospective predictions are the future. Preprint at (2020).

  • Azizi, S. et al. Big self-supervised models advance medical image classification. In Proc. International Conference on Computer Vision (ICCV) 3458–3468 (IEEE, 2021).

  • Eslami, S., Meinel, C. & de Melo, G. PubMedCLIP: how much does CLIP benefit visual question answering in the medical domain? In Proc. Findings of the Association for Computational Linguistics (EACL) 1151–1163 (Association for Computational Linguistics, 2023).

  • Huang, S.-C. et al. Self-supervised learning for medical image classification: a systematic review and implementation guidelines. npj Digit. Med. 6, 74 (2023).


  • Chen, T., Kornblith, S., Norouzi, M. & Hinton, G. E. A simple framework for contrastive learning of visual representations. In ICML Proc. Machine Learning Research Vol. 119, 1597–1607 (PMLR, 2020).

  • Zbontar, J., Jing, L., Misra, I., LeCun, Y. & Deny, S. Barlow twins: self-supervised learning via redundancy reduction. In ICML Proc. Machine Learning Research Vol. 139, 12310–12320 (PMLR, 2021).

  • Deng, J. et al. ImageNet: a large-scale hierarchical image database. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 248–255 (IEEE, 2009).

  • Chen, T., Luo, C. & Li, L. Intriguing properties of contrastive losses. In Advances in Neural Information Processing Systems, NeurIPS Vol. 34 (eds Ranzato, M. et al.) 11834–11845 (Curran Associates, 2021).

  • Robinson, J. et al. Can contrastive learning avoid shortcut solutions? In Advances in Neural Information Processing Systems, NeurIPS Vol. 34 (eds Ranzato, M. et al.) 4974–4986 (Curran Associates, 2021).

  • Dippel, J., Vogler, S. & Höhne, J. Towards fine-grained visual representations by combining contrastive learning with image reconstruction and attention-weighted pooling. In ICML Workshop: Self-Supervised Learning for Reasoning and Perception (2021).

  • Li, T. et al. Addressing feature suppression in unsupervised visual representations. In Proc. Winter Conference on Applications of Computer Vision (WACV) 1411–1420 (IEEE, 2023).

  • Roth, K. et al. Towards total recall in industrial anomaly detection. In IEEE/CVF Conference on Computer Vision and Pattern Recognition 14298–14308 (IEEE, 2022).

  • Batzner, K., Heckler, L. & König, R. EfficientAD: accurate visual anomaly detection at millisecond-level latencies. In Proc. Winter Conference on Applications of Computer Vision (WACV) 127–137 (IEEE, 2024).

  • Harmeling, S., Dornhege, G., Tax, D., Meinecke, F. & Müller, K.-R. From outliers to prototypes: ordering data. Neurocomputing 69, 1608–1618 (2006).


  • Aggarwal, C. C. Outlier Analysis (Springer, 2013).

  • Chandola, V., Banerjee, A. & Kumar, V. Anomaly detection: a survey. ACM Comput. Surv. 41, 15:1–15:58 (2009).


  • Parzen, E. On estimation of a probability density function and mode. Ann. Math. Stat. 33, 1065–1076 (1962).


  • Kim, J. & Scott, C. D. Robust kernel density estimation. J. Mach. Learn. Res. 13, 2529–2565 (2012).


  • Schölkopf, B., Platt, J. C., Shawe-Taylor, J., Smola, A. J. & Williamson, R. C. Estimating the support of a high-dimensional distribution. Neural Comput. 13, 1443–1471 (2001).


  • Montavon, G., Kauffmann, J. R., Samek, W. & Müller, K.-R. in Lecture Notes in Computer Science Vol. 13200 (eds Holzinger, A. et al.) 117–138 (Springer, 2020).

  • Yu, Y., Qian, J. & Wu, Q. Visual saliency via multiscale analysis in frequency domain and its applications to ship detection in optical satellite images. Front. Neurorobot. 15, 767299 (2022).


  • Parmar, G., Zhang, R. & Zhu, J. On aliased resizing and surprising subtleties in GAN evaluation. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 11400–11410 (IEEE, 2022).

  • Kirichenko, P., Izmailov, P. & Wilson, A. G. Last layer re-training is sufficient for robustness to spurious correlations. In Proc. International Conference on Learning Representations (ICLR) (2023).

  • Sugiyama, M., Krauledat, M. & Müller, K.-R. Covariate shift adaptation by importance weighted cross validation. J. Mach. Learn. Res. 8, 985–1005 (2007).


  • Sugiyama, M. & Kawanabe, M. Machine Learning in Non-stationary Environments: Introduction to Covariate Shift Adaptation (MIT Press, 2012).

  • Iwasawa, Y. & Matsuo, Y. Test-time classifier adjustment module for model-agnostic domain generalization. In Advances in Neural Information Processing Systems, NeurIPS Vol. 34 (eds Ranzato, M. et al.) 2427–2440 (Curran Associates, 2021).

  • Esposito, C., Landrum, G. A., Schneider, N., Stiefl, N. & Riniker, S. GHOST: adjusting the decision threshold to handle imbalanced data in machine learning. J. Chem. Inf. Model. 61, 2623–2640 (2021).


  • Niven, T. & Kao, H. Probing neural network comprehension of natural language arguments. In Proc. Conference of the Association for Computational Linguistics (eds Korhonen, A. et al.) 4658–4664 (Association for Computational Linguistics, 2019).

  • Heinzerling, B. NLP’s Clever Hans moment has arrived. J. Cogn. Sci. 21, 161–170 (2020).

  • Braun, M. L., Buhmann, J. M. & Müller, K.-R. On relevant dimensions in kernel feature spaces. J. Mach. Learn. Res. 9, 1875–1908 (2008).


  • Basri, R. et al. Frequency bias in neural networks for input of non-uniform density. In ICML Proc. Machine Learning Research Vol. 119, 685–694 (PMLR, 2020).

  • Fridovich-Keil, S., Lopes, R. G. & Roelofs, R. Spectral bias in practice: the role of function frequency in generalization. In Advances in Neural Information Processing Systems, NeurIPS Vol. 35 (eds Koyejo, S. et al.) (Curran Associates, 2022).

  • Arras, L. et al. in Lecture Notes in Computer Science Vol. 11700 (eds Samek, W. et al.) 211–238 (Springer, 2019).

  • Schnake, T. et al. Higher-order explanations of graph neural networks via relevant walks. IEEE Trans. Pattern Anal. Mach. Intell. 44, 7581–7596 (2022).


  • Ali, A. et al. XAI for transformers: better explanations through conservative propagation. In ICML Proc. Machine Learning Research Vol. 162, 435–451 (PMLR, 2022).

  • Jafari, F. R., Montavon, G., Müller, K.-R. & Eberle, O. MambaLRP: explaining selective state space sequence models. In Advances in Neural Information Processing Systems, NeurIPS Vol. 37 (eds Globerson, A. et al.) 118540–118570 (Curran Associates, 2024).

  • Munir, M., Siddiqui, S. A., Dengel, A. & Ahmed, S. DeepAnT: a deep learning approach for unsupervised anomaly detection in time series. IEEE Access 7, 1991–2005 (2019).


  • Devlin, J., Chang, M., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT) (eds Burstein, J. et al.) 4171–4186 (Association for Computational Linguistics, 2019).

  • Pelka, O., Koitka, S., Rückert, J., Nensa, F. & Friedrich, C. M. in Lecture Notes in Computer Science Vol. 11043 (eds Stoyanov, D. et al.) 180–189 (Springer, 2018).

  • PubMedCLIP Hugging Face (2024).

  • openaiCLIP GitHub (2024).

  • Facebook Research. barlowtwins GitHub (2022).

  • CXR8 NIHCC (2017).

  • ieee8023 COVID-chestxray-dataset GitHub (2020).

  • Pang, G., Shen, C., Cao, L. & van den Hengel, A. Deep learning for anomaly detection: a review. ACM Comput. Surv. 54, 38:1–38:38 (2022).


  • Rippel, O., Mertens, P. & Merhof, D. Modeling the distribution of normal data in pre-trained deep features for anomaly detection. In ICPR 6726–6733 (IEEE, 2020).

  • Jelinek, F., Mercer, R. L., Bahl, L. R. & Baker, J. K. Perplexity—a measure of the difficulty of speech recognition tasks. J. Acoust. Soc. Am. 62, S63–S63 (2005).


  • Amazon Science. patchcore-inspection GitHub (2022).

  • Bradski, G. The OpenCV library. Dr. Dobb’s Journal of Tools 120, 122–125 (2000).

  • Torchvision: PyTorch’s computer vision library (2016).

  • Samek, W., Montavon, G., Vedaldi, A., Hansen, L. K. & Müller, K.-R. Explainable AI: Interpreting, Explaining And Visualizing Deep Learning Vol. 11700 (Springer, 2019).

  • zennit. GitHub (2021).

  • Kauffmann, J. et al. Explainable AI reveals Clever Hans effects in unsupervised learning models: code. Zenodo (2024).

  • Ml-workgroup. COVID-19 image repository. GitHub (2020).

  • 2025-03-17 00:00:00
