- Understanding sequential decision problems and the exploration-exploitation dilemma: multi-armed bandits
- Algorithms for earning while learning: epsilon-greedy, upper confidence bound, Thompson sampling
- Modelling the state-space: Markov processes and Markov decision processes
- Solution methods for dynamic programming: Bellman equation, value iteration and policy iteration
- Learning methods: temporal-difference learning and deep learning
- Many applications from business, economics, and social sciences
- Unit 1:
  - Multi-armed bandits and learning algorithms. Watch video lectures to learn about sequential decision problems and the exploration-exploitation dilemma. Application: How to maximize survey responses (~60 min).
  - Interactive Online Session (~60 min).
- Unit 2:
  - Solution methods for dynamic programming. Watch video lectures to understand and apply value iteration methods. Application: How to make your firm rich and famous (~60 min).
  - Interactive Online Session (~60 min).
- Unit 3:
  - Learning in the state-action-space. Watch video lectures about relevant theory and demonstrations of Monte Carlo and temporal difference learning. Application: Cliff Walking (~60 min).
  - Interactive Online Session (~60 min).
- Unit 4:
  - Deep Learning. Watch video lectures about how neural networks work and how to apply them to learning problems (~60 min).
  - Interactive Online Session (~60 min).
Reinforcement Learning for Business, Economics, and Social Sciences (2026)
09.06.2026 16:30 - 17:30
How do machines learn to make the right choices? How can individuals, firms, organizations, and researchers use automated decision-making?
This course introduces the design of algorithms that allow machines to learn from reinforcement, and it builds intuition for how these algorithms work. Reinforcement learning yields decision rules that are learned from partial, implicit, and delayed feedback.
This is particularly useful in sequential decision-making tasks where a machine repeatedly interacts with an environment or with users. Reinforcement learning is well known for its successes in robotic control, autonomous vehicles, game playing, and chat agents. In business, economics, and the social sciences, applications have grown rapidly in recent years, including temporal difference learning and adaptive experiments.
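To make the idea of learning from partial feedback concrete, here is a minimal sketch of an epsilon-greedy multi-armed bandit, one of the algorithms listed among the course topics. It is not taken from the course materials. The function name `epsilon_greedy`, the three response rates, and all parameter values are hypothetical and chosen only for illustration, loosely in the spirit of the survey-response application.

```python
import random


def epsilon_greedy(true_rates, steps=5000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy learner on a Bernoulli bandit.

    true_rates: hypothetical response probability of each arm
                (for example, three survey invitation variants).
    Returns pull counts, estimated rates, and the total reward.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    counts = [0] * n_arms      # how often each arm was chosen
    values = [0.0] * n_arms    # running mean reward per arm
    total_reward = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: values[a])

        # Partial feedback: only the chosen arm's outcome is observed.
        reward = 1 if rng.random() < true_rates[arm] else 0

        counts[arm] += 1
        # Incremental update of the sample mean for the chosen arm.
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward

    return counts, values, total_reward


counts, values, total = epsilon_greedy([0.05, 0.10, 0.20])
print("pulls per arm:", counts)
print("estimated rates:", [round(v, 3) for v in values])
print("total responses:", total)
```

The parameter `epsilon` controls the trade-off: a larger value spends more pulls on exploration, while a smaller value relies more heavily on the current estimates.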
Application
The extended deadline to apply for a seat in this free course is May 8, 2026 (CEST).
As the number of participants is limited, and to ensure the best fit between the course’s content and its participants, we will ask you a few questions in the application form about your background and interest in the course. Given that demand usually exceeds our seat capacity, we encourage you to be as detailed as possible in your responses. A specific description of your experience and interests strongly assists our selection process and helps us to identify the candidates who will likely benefit most from the course.
We will review all applications after the deadline and notify you of the outcome by mid-May. You will receive access to the course materials no later than one week before the first meeting, because you are expected to review the videos and materials of the first unit beforehand.
The course is open to all researchers. However, as part of our role within the consortium for business, economics, and related data, priority will be given to researchers working in these fields.
The course is completely free of charge. As part of this opportunity, we kindly ask all participants to actively contribute to our evaluation process. This helps us to continuously improve the course.
Topics
Format
This is an online course.
After Unit 2 and Unit 4, you will complete practical assignments that apply the course content.
Weekly Meetings
The course includes four live online meetings in which you will discuss the week’s contents with the instructor and fellow participants:
Meeting 1: June 09, 4:30pm-5:30pm CEST
Meeting 2: June 16, 4:30pm-5:30pm CEST
Meeting 3: June 30, 4:30pm-5:30pm CEST
Meeting 4: July 07, 4:30pm-5:30pm CEST
Prerequisites
Anyone interested in reinforcement learning is welcome to take the course. It is designed for graduate students in their second year or beyond (advanced master’s students, advanced PhD students) to support them during their research phase.
Required: A Zoom account to participate in the online meetings.
Recommended: Knowledge of basic statistics and previous experience with a programming language are helpful but not required.