Telescopes Drive Themselves: Optimizing Cosmic Survey Scheduling with Reinforcement Learning

Date:

The planning and execution of observational cosmology campaigns have undergone a substantial increase in complexity, particularly for advanced telescopes like the Rubin Observatory’s LSST, JWST, and the Nancy Grace Roman telescope. Traditionally, astronomical observatories have relied on manual planning to scan a predefined list of astronomical objects, which usually results in suboptimal observations. We are developing a framework for statistical learning-based optimization of telescope pointings to gather data that is most useful for a pre-defined scientific reward. We frame the observational campaign as a Markovian Decision Process, which captures the nature of sequential decision-making. We implement this through reinforcement learning (RL), which has emerged in the field of artificial intelligence as a powerful approach to training autonomous systems. In this study, we focus on the application of RL algorithms on an offline dataset containing simulated observations with a discrete set of sky locations the telescope is allowed to visit, referred to as the “action space.” Two key aspects are investigated: 1) the preprocessing of the dataset using normalization techniques and potential observation space reduction, and 2) the application of value-based networks for decision-making. Considering the range of well-known RL algorithms, this study has mainly targeted value-based networks, and in particular Deep Q-Networks (DQNs), since they outperform policy-based networks on the offline dataset. Our experimental results demonstrate that the combination of preprocessing techniques, along with value-based networks, yields high performances and capabilities to generalize on unseen data for our task. Furthermore, the analysis highlighted how varying certain hyperparameters led to a significant impact on the obtained results. Our results contribute to the advancement of autonomous systems, specifically in the context of process scheduling.

Indico Presentation