Goals as reward-producing programs | Nature Machine Intelligence

  • Dweck, C. S. Article commentary: the study of goals in psychology. Psychol. Sci. 3, 165–167 (1992).

  • Austin, J. T. & Vancouver, J. B. Goal constructs in psychology: structure, process, and content. Psychol. Bull. 120, 338–375 (1996).

  • Elliot, A. J. & Fryer, J. W. in Handbook of Motivation Science Vol. 638 (ed. Shah, J. Y.) 235–250 (The Guilford Press, 2008).

  • Hyland, M. E. Motivational control theory: an integrative framework. J. Pers. Soc. Psychol. 55, 642–651 (1988).

  • Eccles, J. S. & Wigfield, A. Motivational beliefs, values, and goals. Annu. Rev. Psychol. 53, 109–132 (2002).

  • Brown, L. V. Psychology of Motivation (Nova Science Publishers, 2007).

  • Fishbach, A. & Ferguson, M. J. in Social Psychology: Handbook of Basic Principles Vol. 2 (eds Kruglanski, A. W. & Higgins, E. T.) 490–515 (The Guilford Press, 2007).

  • Pervin, L. A. Goal Concepts in Personality and Social Psychology (Taylor & Francis, 2015).

  • Moskowitz, G. B. & Grant, H. The Psychology of Goals Vol. 548 (Guilford Press, 2009).

  • Molinaro, G. & Collins, A. G. E. A goal-centric outlook on learning. Trends Cogn. Sci. 27, 1150–1164 (2023).

  • Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction (MIT Press, 2018).

  • Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015).

  • Chu, J., Tenenbaum, J. B. & Schulz, L. E. In praise of folly: flexible goals and human cognition. Trends Cogn. Sci. 28, 628–642 (2024).

  • Chu, J. & Schulz, L. E. Play, curiosity, and cognition. Annu. Rev. Dev. Psychol. 2, 317–343 (2020).

  • Lillard, A. S. in Handbook of Child Psychology and Developmental Science Vol. 3 (eds Liben, L. & Mueller, U.) 425–468 (Wiley-Blackwell, 2015).

  • Andersen, M. M., Kiverstein, J., Miller, M. & Roepstorff, A. Play in predictive minds: a cognitive theory of play. Psychol. Rev. 130, 462–479 (2023).

  • Oudeyer, P.-Y., Kaplan, F. & Hafner, V. V. Intrinsic motivation systems for autonomous mental development. IEEE Trans. Evol. Comput. 11, 265–286 (2007).

  • Nguyen, C. T. Games: Agency as Art (Oxford Univ. Press, 2020).

  • Kolve, E. et al. AI2-THOR: an interactive 3D environment for visual AI. Preprint (2017).

  • Fodor, J. A. The Language of Thought (Harvard Univ. Press, 1979).

  • Goodman, N. D., Tenenbaum, J. B., Feldman, J. & Griffiths, T. L. A rational analysis of rule-based concept learning. Cogn. Sci. 32, 108–154 (2008).

  • Piantadosi, S. T., Tenenbaum, J. B. & Goodman, N. D. Bootstrapping in a language of thought: a formal model of numerical concept learning. Cognition 123, 199–217 (2012).

  • Rule, J. S., Tenenbaum, J. B. & Piantadosi, S. T. The child as hacker. Trends Cogn. Sci. 24, 900–915 (2020).

  • Wong, L. et al. From word models to world models: translating from natural language to the probabilistic language of thought. Preprint (2023).

  • Ghallab, M. et al. PDDL—The Planning Domain Definition Language Tech Report CVC TR-98-003/DCS TR-1165 (Yale Center for Computational Vision and Control, 1998).

  • Chopra, S., Hadsell, R. & LeCun, Y. Learning a similarity metric discriminatively, with application to face verification. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition 539–546 (IEEE, 2005).

  • Le-Khac, P. H., Healy, G. & Smeaton, A. F. Contrastive representation learning: a framework and review. IEEE Access 8, 193907–193934 (2020).

  • Pugh, J. K., Soros, L. B. & Stanley, K. O. Quality diversity: a new frontier for evolutionary computation. Front. Robot. AI (2016).

  • Chatzilygeroudis, K., Cully, A., Vassiliades, V. & Mouret, J. B. Quality-diversity optimization: a novel branch of stochastic optimization. Springer Optim. Appl. 170, 109–135 (2020).

  • Mouret, J.-B. & Clune, J. Illuminating search spaces by mapping elites. Preprint (2015).

  • Ward, T. B. Structured imagination: the role of category structure in exemplar generation. Cogn. Psychol. 27, 1–40 (1994).

  • Allen, K. R. et al. Using games to understand the mind. Nat. Hum. Behav. (2024).

  • Liu, M., Zhu, M. & Zhang, W. Goal-conditioned reinforcement learning: problems and solutions. In Proc. 31st International Joint Conference on Artificial Intelligence: Survey Track (ed. De Raedt, L.) 5502–5511 (IJCAI, 2022).

  • Colas, C., Karch, T., Sigaud, O. & Oudeyer, P.-Y. Autotelic agents with intrinsically motivated goal-conditioned reinforcement learning: a short survey. J. Artif. Intell. Res. 74, 1159–1199 (2022).

  • Icarte, R. T., Klassen, T. Q., Valenzano, R. & McIlraith, S. A. Reward machines: exploiting reward function structure in reinforcement learning. J. Artif. Intell. Res. 73, 173–208 (2022).

  • Pell, B. Metagame in Symmetric Chess-Like Games UCAM-CL-TR-277 (Univ. Cambridge, Computer Laboratory, 1992).

  • Hom, V. & Marks, J. Automatic design of balanced board games. In Proc. AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment Vol. 3 (eds Schaeffer, J. & Mateas, M.) 25–30 (AAAI Press, 2007).

  • Browne, C. & Maire, F. Evolutionary game design. IEEE Trans. Comput. Intell. AI Games 2, 1–16 (2010).

  • Togelius, J. & Schmidhuber, J. An experiment in automatic game design. In 2008 IEEE Symposium On Computational Intelligence and Games 111–118 (IEEE, 2008).

  • Smith, A. M., Nelson, M. J. & Mateas, M. Ludocore: a logical game engine for modeling videogames. In Proc. 2010 IEEE Conference on Computational Intelligence and Games 91–98 (IEEE, 2010).

  • Zook, A. & Riedl, M. Automatic game design via mechanic generation. In Proc. AAAI Conference on Artificial Intelligence Vol. 28 (AAAI Press, 2014).

  • Khalifa, A., Green, M. C., Perez-Liebana, D. & Togelius, J. General video game rule generation. In 2017 IEEE Conference on Computational Intelligence and Games 170–177 (IEEE, 2017).

  • Lake, B. M., Ullman, T. D., Tenenbaum, J. B. & Gershman, S. J. Building machines that learn and think like people. Behav. Brain Sci. 40, e253 (2017).

  • Cully, A. Autonomous skill discovery with quality-diversity and unsupervised descriptors. In Proc. Genetic and Evolutionary Computation Conference (ed. López-Ibáñez, M.) 81–89 (Association for Computing Machinery, 2019).

  • Grillotti, L. & Cully, A. Unsupervised behavior discovery with quality-diversity optimization. IEEE Trans. Evol. Comput. 26, 1539–1552 (2022).

  • Ullman, T. D., Spelke, E., Battaglia, P. & Tenenbaum, J. B. Mind games: game engines as an architecture for intuitive physics. Trends Cogn. Sci. 21, 649–665 (2017).

  • Chen, T., Allen, K. R., Cheyette, S. J., Tenenbaum, J. & Smith, K. A. ‘Just in time’ representations for mental simulation in intuitive physics. In Proc. Annual Meeting of the Cognitive Science Society Vol. 45 (UC Merced, 2023).

  • Tang, H., Key, D. & Ellis, K. WorldCoder, a model-based LLM agent: building world models by writing code and interacting with the environment. Preprint (2024).

  • Reed, S. et al. A generalist agent. Trans. Mach. Learn. Res. (2022).

  • Gallouédec, Q., Beeching, E., Romac, C. & Dellandréa, E. Jack of all trades, master of some, a multi-purpose transformer agent. Preprint (2024).

  • Florensa, C., Held, D., Geng, X. & Abbeel, P. Automatic goal generation for reinforcement learning agents. In Proc. 35th International Conference on Machine Learning Vol. 80 (eds Dy, J. & Krause, A.) 1515–1528 (PMLR, 2018).

  • Open Ended Learning Team et al. Open-ended learning leads to generally capable agents. Preprint (2021).

  • Du, Y. et al. Guiding pretraining in reinforcement learning with large language models. In Proc. 40th International Conference on Machine Learning (eds Krause, A. et al.) 8657–8677 (JMLR, 2023).

  • Colas, C., Teodorescu, L., Oudeyer, P.-Y., Yuan, X. & Côté, M.-A. Augmenting autotelic agents with large language models. Preprint (2023).

  • Littman, M. L. et al. Environment-independent task specifications via GLTL. Preprint (2017).

  • Leon, B. G., Shanahan, M. & Belardinelli, F. In a nutshell, the human asked for this: latent goals for following temporal specifications. In 10th International Conference on Learning Representations (OpenReview, 2022).

  • Ma, Y. J. et al. Eureka: Human-Level Reward Design via Coding Large Language Models (ICLR, 2023).

  • Faldor, M., Zhang, J., Cully, A. & Clune, J. OMNI-EPIC: open-endedness via models of human notions of interestingness with environments programmed in code. In 12th International Conference on Learning Representations (OpenReview, 2024).

  • Colas, C. et al. Language as a cognitive tool to imagine goals in curiosity-driven exploration. In Proc. 34th International Conference on Neural Information Processing Systems (NIPS ’20) (eds Larochelle, H. et al.) 3761–3774 (Curran Associates, 2020).

  • Wu, C. M., Schulz, E., Speekenbrink, M., Nelson, J. D. & Meder, B. Generalization guides human exploration in vast decision spaces. Nat. Hum. Behav. 2, 915–924 (2018).

  • Ten, A. et al. in The Drive for Knowledge: The Science of Human Information Seeking (eds. Dezza, I. C. et al.) 53–76 (Cambridge Univ. Press, 2022).

  • Berlyne, D. E. Novelty and curiosity as determinants of exploratory behaviour. Br. J. Psychol. Gen. Sect. 41, 68–80 (1950).

  • Gopnik, A. Empowerment as causal learning, causal learning as empowerment: a bridge between Bayesian causal hypothesis testing and reinforcement learning. PhilSci-Archive (2024).

  • Addyman, C. & Mareschal, D. Local redundancy governs infants’ spontaneous orienting to visual-temporal sequences. Child Dev. 84, 1137–1144 (2013).

  • Du, Y. et al. What can AI learn from human exploration? Intrinsically-motivated humans and agents in open-world exploration. In NeurIPS 2023 Workshop: Information-Theoretic Principles in Cognitive Systems (OpenReview, 2023).

  • Ruggeri, A., Stanciu, O., Pelz, M., Gopnik, A. & Schulz, E. Preschoolers search longer when there is more information to be gained. Dev. Sci. 27, e13411 (2024).

  • Liquin, E. G., Callaway, F. & Lombrozo, T. Developmental change in what elicits curiosity. In Proc. Annual Meeting of the Cognitive Science Society Vol. 43 (UC Merced, 2021).

  • Taffoni, F. et al. Development of goal-directed action selection guided by intrinsic motivations: an experiment with children. Exp. Brain Res. 232, 2167–2177 (2014).

  • Ten, A., Kaushik, P., Oudeyer, P.-Y. & Gottlieb, J. Humans monitor learning progress in curiosity-driven exploration. Nat. Commun. 12, 5972 (2021).

  • Baldassarre, G. et al. Intrinsic motivations and open-ended development in animals, humans, and robots: an overview. Front. Psychol. 5, 985 (2014).

  • Spelke, E. S. & Kinzler, K. D. Core knowledge. Dev. Sci. 10, 89–96 (2007).

  • Jara-Ettinger, J., Gweon, H., Schulz, L. E. & Tenenbaum, J. B. The naïve utility calculus: computational principles underlying commonsense psychology. Trends Cogn. Sci. 20, 589–604 (2016).

  • Liu, S., Brooks, N. B. & Spelke, E. S. Origins of the concepts cause, cost, and goal in prereaching infants. Proc. Natl Acad. Sci. USA 116, 17747–17752 (2019).

  • Jara-Ettinger, J. Theory of mind as inverse reinforcement learning. Curr. Opin. Behav. Sci. 29, 105–110 (2019).

  • Arora, S. & Doshi, P. A survey of inverse reinforcement learning: challenges, methods and progress. Artif. Intell. 297, 103500 (2021).

  • Baker, C., Saxe, R. & Tenenbaum, J. Bayesian theory of mind: modeling joint belief–desire attribution. In Proc. Annual Meeting of the Cognitive Science Society Vol. 33 (UC Merced, 2011).

  • Velez-Ginorio, J., Siegel, M. H., Tenenbaum, J. B. & Jara-Ettinger, J. Interpreting actions by attributing compositional desires. In Proc. Annual Meeting of the Cognitive Science Society Vol. 39 (eds Gunzelmann, G. et al.) (UC Merced, 2017).

  • Ho, M. K. & Griffiths, T. L. Cognitive science as a source of forward and inverse models of human decisions for robotics and control. Annu. Rev. Control Robot. Auton. Syst. 5, 33–53 (2022).

  • Palan, S. & Schitter, C. Prolific.ac—a subject pool for online experiments. J. Behav. Exp. Finance 17, 22–27 (2018).

  • Icarte, R. T., Klassen, T., Valenzano, R. & McIlraith, S. Using reward machines for high-level task specification and decomposition in reinforcement learning. In Proc. 35th International Conference on Machine Learning Vol. 80 (eds Dy, J. & Krause, A.) 2107–2116 (PMLR, 2018).

  • Brants, T., Popat, A. C., Xu, P., Och, F. J. & Dean, J. Large language models in machine translation. In Proc. 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (ed. Eisner, J.) 858–867 (Association for Computational Linguistics, 2007).

  • Rothe, A., Lake, B. M. & Gureckis, T. M. Question asking as program generation. In Advances in Neural Information Processing Systems 30 (eds Von Luxburg, U. et al.) 1047–1056 (Curran Associates, 2017).

  • LeCun, Y., Chopra, S., Hadsell, R., Ranzato, M.’A. & Huang, F. J. in Predicting Structured Data (eds Bakir, G. et al.) Ch. 10 (MIT Press, 2006).

  • van den Oord, A., Li, Y. & Vinyals, O. Representation learning with contrastive predictive coding. Preprint (2018).

  • Charity, M., Green, M. C., Khalifa, A. & Togelius, J. Mech-elites: illuminating the mechanic space of GVG-AI. In Proc. 15th International Conference on the Foundations of Digital Games (eds Yannakakis, G. N. et al.) 8 (Association for Computing Machinery, 2020).

  • GPT-4 Technical Report (OpenAI, 2023).

  • Mann, H. B. & Whitney, D. R. On a test of whether one of two random variables is stochastically larger than the other. Ann. Math. Stat. 18, 50–60 (1947).

  • Castro, S. Fast Krippendorff: fast computation of Krippendorff’s alpha agreement measure. GitHub (2017).

  • Raudenbush, S. W. & Bryk, A. S. Hierarchical Linear Models: Applications and Data Analysis Methods 2nd edn (Sage Publications, 2002).

  • Hox, J., Moerbeek, M. & van de Schoot, R. Multilevel Analysis (Techniques and Applications) 3rd edn (Routledge, 2018).

  • Agresti, A. Categorical Data Analysis 3rd edn (Wiley, 2018).

  • Greene, W. H. & Hensher, D. A. Modeling Ordered Choices: A Primer (Cambridge Univ. Press, 2010).

  • Christensen, R. H. B. ordinal—regression models for ordinal data. R package version 2023.12-4 (2023).

  • R Core Team. R: A Language and Environment for Statistical Computing Version 4.3.2 (R Foundation for Statistical Computing, 2023).

  • Long, J. A. jtools: analysis and presentation of social scientific data. J. Open Source Softw. 9, 6610 (2024).

  • Lenth, R. V. emmeans: estimated marginal means, aka least-squares means. R package version 1.10.0 (2024).

  • Davidson, G., Todd, G., Togelius, J., Gureckis, T. M. & Lake, B. M. guydav/goals-as-reward-producing-programs: release for DOI. Zenodo (2024).

  • 2025-02-21 00:00:00

    Related Articles

    Back to top button