Goals as reward-producing programs | Nature Machine Intelligence
Dweck, C. S. Article commentary: the study of goals in psychology. Psychol. Sci. 3, 165–167 (1992).
Austin, J. T. & Vancouver, J. B. Goal constructs in psychology: structure, process, and content. Psychol. Bull. 120, 338–375 (1996).
Elliot, A. J. & Fryer, J. W. in Handbook of Motivation Science Vol. 638 (ed. Shah, J. Y.) 235–250 (The Guilford Press, 2008).
Hyland, M. E. Motivational control theory: an integrative framework. J. Pers. Soc. Psychol. 55, 642–651 (1988).
Eccles, J. S. & Wigfield, A. Motivational beliefs, values, and goals. Annu. Rev. Psychol. 53, 109–132 (2002).
Brown, L. V. Psychology of Motivation (Nova Science Publishers, 2007).
Fishbach, A. & Ferguson, M. J. in Social Psychology: Handbook of Basic Principles Vol. 2 (eds Kruglanski, A. W. & Higgins, E. T.) 490–515 (The Guilford Press, 2007).
Pervin, L. A. Goal Concepts in Personality and Social Psychology (Taylor & Francis, 2015).
Moskowitz, G. B. & Grant, H. The Psychology of Goals Vol. 548 (Guilford Press, 2009).
Molinaro, G. & Collins, A. G. E. A goal-centric outlook on learning. Trends Cogn. Sci. 27, 1150–1164 (2023).
Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction (MIT Press, 2018).
Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015).
Chu, J., Tenenbaum, J. B. & Schulz, L. E. In praise of folly: flexible goals and human cognition. Trends Cogn. Sci. 28, 628–642 (2024).
Chu, J. & Schulz, L. E. Play, curiosity, and cognition. Annu. Rev. Dev. Psychol. 2, 317–343 (2020).
Lillard, A. S. in Handbook of Child Psychology and Developmental Science Vol. 3 (eds Liben, L. & Mueller, U.) 425–468 (Wiley-Blackwell, 2015).
Andersen, M. M., Kiverstein, J., Miller, M. & Roepstorff, A. Play in predictive minds: a cognitive theory of play. Psychol. Rev. 130, 462–479 (2023).
Oudeyer, P.-Y., Kaplan, F. & Hafner, V. V. Intrinsic motivation systems for autonomous mental development. IEEE Trans. Evol. Comput. 11, 265–286 (2007).
Nguyen, C. T. Games: Agency as Art (Oxford Univ. Press, 2020).
Kolve, E. et al. AI2-THOR: an interactive 3D environment for visual AI. Preprint at (2017).
Fodor, J. A. The Language of Thought (Harvard Univ. Press, 1979).
Goodman, N. D., Tenenbaum, J. B., Feldman, J. & Griffiths, T. L. A rational analysis of rule-based concept learning. Cogn. Sci. 32, 108–154 (2008).
Piantadosi, S. T., Tenenbaum, J. B. & Goodman, N. D. Bootstrapping in a language of thought: a formal model of numerical concept learning. Cognition 123, 199–217 (2012).
Rule, J. S., Tenenbaum, J. B. & Piantadosi, S. T. The child as hacker. Trends Cogn. Sci. 24, 900–915 (2020).
Wong, L. et al. From word models to world models: translating from natural language to the probabilistic language of thought. Preprint at (2023).
Ghallab, M. et al. PDDL—The Planning Domain Definition Language Tech Report CVC TR-98-003/DCS TR-1165 (Yale Center for Computational Vision and Control, 1998).
Chopra, S., Hadsell, R. & LeCun, Y. Learning a similarity metric discriminatively, with application to face verification. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition 539–546 (IEEE, 2005).
Le-Khac, P. H., Healy, G. & Smeaton, A. F. Contrastive representation learning: a framework and review. IEEE Access 8, 193907–193934 (2020).
Pugh, J. K., Soros, L. B. & Stanley, K. O. Quality diversity: a new frontier for evolutionary computation. Front. Robot. AI (2016).
Chatzilygeroudis, K., Cully, A., Vassiliades, V. & Mouret, J. B. Quality-diversity optimization: a novel branch of stochastic optimization. Springer Optim. Appl. 170, 109–135 (2020).
Mouret, J.-B. & Clune, J. Illuminating search spaces by mapping elites. Preprint at (2015).
Ward, T. B. Structured imagination: the role of category structure in exemplar generation. Cogn. Psychol. 27, 1–40 (1994).
Allen, K. R. et al. Using games to understand the mind. Nat. Hum. Behav. (2024).
Liu, M., Zhu, M. & Zhang, W. Goal-conditioned reinforcement learning: problems and solutions. In Proc. 31st International Joint Conference on Artificial Intelligence: Survey Track (ed. De Raedt, L.) 5502–5511 (IJCAI, 2022).
Colas, C., Karch, T., Sigaud, O. & Oudeyer, P.-Y. Autotelic agents with intrinsically motivated goal-conditioned reinforcement learning: a short survey. J. Artif. Intell. Res. 74, 1159–1199 (2022).
Icarte, R. T., Klassen, T. Q., Valenzano, R. & McIlraith, S. A. Reward machines: exploiting reward function structure in reinforcement learning. J. Artif. Intell. Res. 73, 173–208 (2022).
Pell, B. Metagame in Symmetric Chess-Like Games UCAM-CL-TR-277 (Univ. Cambridge, Computer Laboratory, 1992).
Hom, V. & Marks, J. Automatic design of balanced board games. In Proc. AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment Vol. 3 (eds Schaeffer, J. & Mateas, M.) 25–30 (AAAI Press, 2007).
Browne, C. & Maire, F. Evolutionary game design. IEEE Trans. Comput. Intell. AI Games 2, 1–16 (2010).
Togelius, J. & Schmidhuber, J. An experiment in automatic game design. In 2008 IEEE Symposium On Computational Intelligence and Games 111–118 (IEEE, 2008).
Smith, A. M., Nelson, M. J. & Mateas, M. Ludocore: a logical game engine for modeling videogames. In Proc. 2010 IEEE Conference on Computational Intelligence and Games 91–98 (IEEE, 2010).
Zook, A. & Riedl, M. Automatic game design via mechanic generation. In Proc. AAAI Conference on Artificial Intelligence Vol. 28 (AAAI Press, 2014).
Khalifa, A., Green, M. C., Perez-Liebana, D. & Togelius, J. General video game rule generation. In 2017 IEEE Conference on Computational Intelligence and Games 170–177 (IEEE, 2017).
Lake, B. M., Ullman, T. D., Tenenbaum, J. B. & Gershman, S. J. Building machines that learn and think like people. Behav. Brain Sci. 40, e253 (2017).
Cully, A. Autonomous skill discovery with quality-diversity and unsupervised descriptors. In Proc. Genetic and Evolutionary Computation Conference (ed. López-Ibáñez, M.) 81–89 (Association for Computing Machinery, 2019).
Grillotti, L. & Cully, A. Unsupervised behavior discovery with quality-diversity optimization. IEEE Trans. Evol. Comput. 26, 1539–1552 (2022).
Ullman, T. D., Spelke, E., Battaglia, P. & Tenenbaum, J. B. Mind games: game engines as an architecture for intuitive physics. Trends Cogn. Sci. 21, 649–665 (2017).
Chen, T., Allen, K. R., Cheyette, S. J., Tenenbaum, J. & Smith, K. A. ‘Just in time’ representations for mental simulation in intuitive physics. In Proc. Annual Meeting of the Cognitive Science Society Vol. 45 (UC Merced, 2023).
Tang, H., Key, D. & Ellis, K. WorldCoder, a model-based LLM agent: building world models by writing code and interacting with the environment. Preprint at (2024).
Reed, S. et al. A generalist agent. Trans. Mach. Learn. Res. (2022).
Gallouédec, Q., Beeching, E., Romac, C. & Dellandréa, E. Jack of all trades, master of some, a multi-purpose transformer agent. Preprint at (2024).
Florensa, C., Held, D., Geng, X. & Abbeel, P. Automatic goal generation for reinforcement learning agents. In Proc. 35th International Conference on Machine Learning Vol. 80 (eds Dy, J. & Krause, A.) 1515–1528 (PMLR, 2018).
Open Ended Learning Team et al. Open-ended learning leads to generally capable agents. Preprint at (2021).
Du, Y. et al. Guiding pretraining in reinforcement learning with large language models. In Proc. 40th International Conference on Machine Learning (eds Krause, A. et al.) 8657–8677 (PMLR, 2023).
Colas, C., Teodorescu, L., Oudeyer, P.-Y., Yuan, X. & Côté, M.-A. Augmenting autotelic agents with large language models. Preprint at (2023).
Littman, M. L. et al. Environment-independent task specifications via GLTL. Preprint at (2017).
Leon, B. G., Shanahan, M. & Belardinelli, F. In a nutshell, the human asked for this: latent goals for following temporal specifications. In 10th International Conference on Learning Representations (OpenReview, 2022).
Ma, Y. J. et al. Eureka: human-level reward design via coding large language models (ICLR, 2023).
Faldor, M., Zhang, J., Cully, A. & Clune, J. OMNI-EPIC: open-endedness via models of human notions of interestingness with environments programmed in code. In 12th International Conference on Learning Representations (OpenReview, 2024).
Colas, C. et al. Language as a cognitive tool to imagine goals in curiosity-driven exploration. In Proc. 34th International Conference on Neural Information Processing Systems (NIPS ’20) (eds Larochelle, H. et al.) 3761–3774 (Curran Associates, 2020).
Wu, C. M., Schulz, E., Speekenbrink, M., Nelson, J. D. & Meder, B. Generalization guides human exploration in vast decision spaces. Nat. Hum. Behav. 2, 915–924 (2018).
Ten, A. et al. in The Drive for Knowledge: The Science of Human Information Seeking (eds Dezza, I. C. et al.) 53–76 (Cambridge Univ. Press, 2022).
Berlyne, D. E. Novelty and curiosity as determinants of exploratory behaviour. Br. J. Psychol. Gen. Sect. 41, 68–80 (1950).
Gopnik, A. Empowerment as causal learning, causal learning as empowerment: a bridge between Bayesian causal hypothesis testing and reinforcement learning. Preprint at PhilSci-Archive (2024).
Addyman, C. & Mareschal, D. Local redundancy governs infants’ spontaneous orienting to visual-temporal sequences. Child Dev. 84, 1137–1144 (2013).
Du, Y. et al. What can AI learn from human exploration? Intrinsically-motivated humans and agents in open-world exploration. In NeurIPS 2023 Workshop: Information-Theoretic Principles in Cognitive Systems (OpenReview, 2023).
Ruggeri, A., Stanciu, O., Pelz, M., Gopnik, A. & Schulz, E. Preschoolers search longer when there is more information to be gained. Dev. Sci. 27, e13411 (2024).
Liquin, E. G., Callaway, F. & Lombrozo, T. Developmental change in what elicits curiosity. In Proc. Annual Meeting of the Cognitive Science Society Vol. 43 (UC Merced, 2021).
Taffoni, F. et al. Development of goal-directed action selection guided by intrinsic motivations: an experiment with children. Exp. Brain Res. 232, 2167–2177 (2014).
Ten, A., Kaushik, P., Oudeyer, P.-Y. & Gottlieb, J. Humans monitor learning progress in curiosity-driven exploration. Nat. Commun. 12, 5972 (2021).
Baldassarre, G. et al. Intrinsic motivations and open-ended development in animals, humans, and robots: an overview. Front. Psychol. 5, 985 (2014).
Spelke, E. S. & Kinzler, K. D. Core knowledge. Dev. Sci. 10, 89–96 (2007).
Jara-Ettinger, J., Gweon, H., Schulz, L. E. & Tenenbaum, J. B. The naïve utility calculus: computational principles underlying commonsense psychology. Trends Cogn. Sci. 20, 589–604 (2016).
Liu, S., Brooks, N. B. & Spelke, E. S. Origins of the concepts cause, cost, and goal in prereaching infants. Proc. Natl Acad. Sci. USA 116, 17747–17752 (2019).
Jara-Ettinger, J. Theory of mind as inverse reinforcement learning. Curr. Opin. Behav. Sci. 29, 105–110 (2019).
Arora, S. & Doshi, P. A survey of inverse reinforcement learning: challenges, methods and progress. Artif. Intell. 297, 103500 (2021).
Baker, C., Saxe, R. & Tenenbaum, J. Bayesian theory of mind: modeling joint belief–desire attribution. In Proc. Annual Meeting of the Cognitive Science Society Vol. 33 (UC Merced, 2011).
Velez-Ginorio, J., Siegel, M. H., Tenenbaum, J. B. & Jara-Ettinger, J. Interpreting actions by attributing compositional desires. In Proc. Annual Meeting of the Cognitive Science Society Vol. 39 (eds Gunzelmann, G. et al.) (UC Merced, 2017).
Ho, M. K. & Griffiths, T. L. Cognitive science as a source of forward and inverse models of human decisions for robotics and control. Annu. Rev. Control Robot. Auton. Syst. 5, 33–53 (2022).
Palan, S. & Schitter, C. Prolific.ac—a subject pool for online experiments. J. Behav. Exp. Finance 17, 22–27 (2018).
Icarte, R. T., Klassen, T., Valenzano, R. & McIlraith, S. Using reward machines for high-level task specification and decomposition in reinforcement learning. In Proc. 35th International Conference on Machine Learning Vol. 80 (eds Dy, J. & Krause, A.) 2107–2116 (PMLR, 2018).
Brants, T., Popat, A. C., Xu, P., Och, F. J. & Dean, J. Large language models in machine translation. In Proc. 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (ed. Eisner, J.) 858–867 (Association for Computational Linguistics, 2007).
Rothe, A., Lake, B. M. & Gureckis, T. M. Question asking as program generation. In Advances in Neural Information Processing Systems 30 (eds Von Luxburg, U. et al.) 1047–1056 (Curran Associates, 2017).
LeCun, Y., Chopra, S., Hadsell, R., Ranzato, M.’A. & Huang, F. J. in Predicting Structured Data (eds Bakir, G. et al.) Ch. 10 (MIT Press, 2006).
van den Oord, A., Li, Y. & Vinyals, O. Representation learning with contrastive predictive coding. Preprint at (2018).
Charity, M., Green, M. C., Khalifa, A. & Togelius, J. Mech-elites: illuminating the mechanic space of GVG-AI. In Proc. 15th International Conference on the Foundations of Digital Games (eds Yannakakis, G. N. et al.) 8 (Association for Computing Machinery, 2020).
GPT-4 Technical Report (OpenAI, 2023).
Mann, H. B. & Whitney, D. R. On a test of whether one of two random variables is stochastically larger than the other. Ann. Math. Stat. 18, 50–60 (1947).
Castro, S. Fast Krippendorff: fast computation of Krippendorff’s alpha agreement measure. GitHub (2017).
Raudenbush, S. W. & Bryk, A. S. Hierarchical Linear Models: Applications and Data Analysis Methods 2nd edn (Sage Publications, 2002).
Hox, J., Moerbeek, M. & van de Schoot, R. Multilevel Analysis: Techniques and Applications 3rd edn (Routledge, 2018).
Agresti, A. Categorical Data Analysis 3rd edn (Wiley, 2018).
Greene, W. H. & Hensher, D. A. Modeling Ordered Choices: A Primer (Cambridge Univ. Press, 2010).
Christensen, R. H. B. ordinal—regression models for ordinal data. R package version 2023.12-4 (2023).
R Core Team. R: A Language and Environment for Statistical Computing Version 4.3.2 (R Foundation for Statistical Computing, 2023).
Long, J. A. jtools: analysis and presentation of social scientific data. J. Open Source Softw. 9, 6610 (2024).
Lenth, R. V. emmeans: estimated marginal means, aka least-squares means. R package version 1.10.0 (2024).
Davidson, G., Todd, G., Togelius, J., Gureckis, T. M. & Lake, B. M. guydav/goals-as-reward-producing-programs: release for DOI. Zenodo (2024).