Reinforcement Learning without Optimality
The research area of reinforcement learning develops algorithms that can learn complex behavior (such as driving or playing a computer or board game). Some of the learning problems considered aim at optimal behavior, where the goal is to do something as well as possible. For example, when learning to play a computer game, the goal might be to score the maximum number of points. Most reinforcement learning algorithms are indeed based on optimization, that is, they aim to maximize rewards (such as the points scored in a computer game). However, many learning problems do not actually contain an optimization component. For example, an autonomous car that is to get us to work need neither be as fast as possible nor take the shortest route. It would usually be sufficient for it to arrive on time. For most currently available learning algorithms, such a problem would still have to be formulated as an optimization problem before they can be applied. This not only means additional work; the resulting optimization problems are usually also hard to solve. For example, computing the shortest or fastest route to work (down to inches or seconds) is practically infeasible. Accordingly, most learning algorithms are hardly applicable to typical real-world problems. This project aims to find algorithms that do not solve problems optimally but just well enough, and do so much faster. In a first step, it will be necessary to work on suitable mathematical models, for which, in a second step, we shall develop learning algorithms that are more widely applicable to real-world problems.
| Title | Year(s) | DOI / Link |
|---|---|---|
| Online Regret Bounds for Satisficing in Markov Decision Processes. Mathematics of Operations Research | 2025 | 10.1287/moor.2023.0275 |
No additional funding sources recorded.