ongoing

Principal Investigator Projects

From Satisficing to Optimization in Reinforcement Learning

Von Satisfizierung zur Optimierung im Verstärkungslernen

View on FWF Research Radar

Principal Investigator

Name: Ronald Ortner
Role: Project lead
ORCID: 0000-0001-6033-2208
Institution: Montanuniversität Leoben

Grant Details

Approval Date: 3 Mar 2025
Start Date: 8 Sept 2025
End Date: 7 Sept 2027
Approved Amount: €187,741

Keywords & Classification

Keywords

Reinforcement Learning, Satisficing, Regret, Multi-Armed Bandit, Computational Learning Theory, Markov decision process

Research Disciplines

Operations research, Machine learning

Project Summary

Reinforcement learning develops algorithms that can learn complex behavior, such as driving or playing a computer or board game. Many of the learning problems considered aim at optimal behavior, where the goal is to do something as well as possible. For example, when learning to play a computer game, the goal might be to score the maximum number of points. Most reinforcement learning algorithms are indeed based on optimization: they aim to maximize rewards, such as the points scored in a computer game.

However, many learning problems do not actually contain an optimization component. For example, an autonomous car that is supposed to get us to work need neither be as fast as possible nor take the shortest route; it is usually sufficient if it arrives on time. Most currently available learning algorithms can nevertheless be applied only after the problem has been formulated as an optimization problem. This not only means additional work; the resulting optimization problems are usually also hard to solve. For example, computing the shortest or fastest route to work (up to inches or seconds) is practically infeasible. Accordingly, most learning algorithms are hardly applicable to typical real-world problems.

A precursor project investigated whether there is an advantage in solving problems not optimally but only sufficiently well. While it was known that an optimal strategy can only be learned approximately, the project showed that a strategy that is sufficient with respect to a given satisficing level can be learned exactly. Remarkably, this also means that an optimal strategy can be learned exactly if the learner knows a satisficing level that is satisfied only by the optimal strategy. Accordingly, the current project looks at reinforcement learning algorithms that try to determine such an appropriate satisficing level adaptively. These algorithms may be able to learn more efficiently, and hence much faster, in real-world problems.
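To make the difference between optimizing and satisficing concrete, here is a minimal sketch for a multi-armed bandit. It is an illustration under stated assumptions, not the project's algorithm: the function names, the arm means, and the definition of satisficing regret used here (the shortfall of the chosen arm's mean reward below the satisficing level, floored at zero) are assumptions chosen for this example.

```python
from typing import Sequence


def standard_regret(means: Sequence[float], pulls: Sequence[int]) -> float:
    """Regret against the best arm: sum of (best mean - chosen arm's mean)."""
    best = max(means)
    return sum(best - means[arm] for arm in pulls)


def satisficing_regret(
    means: Sequence[float], pulls: Sequence[int], level: float
) -> float:
    """Assumed definition: only the shortfall below `level` counts as loss."""
    return sum(max(0.0, level - means[arm]) for arm in pulls)


# Hypothetical example: three arms with known mean rewards.
means = [0.3, 0.6, 0.9]
pulls = [1] * 10  # the learner always plays the middle arm

# The middle arm meets a satisficing level of 0.5, so it incurs no
# satisficing regret, although it is not the optimal arm.
print(satisficing_regret(means, pulls, level=0.5))

# If the level is met only by the optimal arm, satisficing regret
# coincides with standard regret.
print(satisficing_regret(means, pulls, level=max(means)))
print(standard_regret(means, pulls))
```

With a level of 0.5 the learner is never penalized for playing the middle arm, whereas setting the level to the best mean makes the satisficing criterion as demanding as optimization. This mirrors the observation in the summary that a level satisfied only by the optimal strategy turns satisficing into optimization.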

Research Outputs (0)

No outputs linked

Further Funding (0)