
Theory of Combinatorial Algorithms

Prof. Emo Welzl and Prof. Bernd Gärtner

Mittagsseminar (in cooperation with A. Steger, D. Steurer and B. Sudakov)

Mittagsseminar Talk Information

Date and Time: Tuesday, September 20, 2022, 12:15 pm

Duration: 30 minutes

Location: OAT S15/S16/S17

Speaker: Maxime Larcher

A Random Walk Algorithm for the 2-Armed Bandit

In the 2-Armed Bandit Problem, an agent faces two slot machines. In each round 1, 2, ..., T, the agent pulls the arm of their choice and receives a random reward, sampled according to the (hidden) distribution of that arm. Naturally, the goal of the agent is to minimise the total regret, i.e. the total expected missed reward over the T rounds.
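To make the notion of regret concrete, here is a minimal Python sketch of the setting described above. It assumes Bernoulli rewards with fixed means, and the names `simulate_two_armed_bandit`, `always_arm` and `uniform_random` are hypothetical helpers introduced only for illustration. The policies shown are trivial baselines, not the algorithm presented in the talk.

```python
import random


def simulate_two_armed_bandit(means, choose_arm, rounds, seed=0):
    """Play `rounds` rounds and return the total expected missed reward.

    Assumption: each arm pays 1 with probability means[arm], else 0.
    """
    rng = random.Random(seed)
    best_mean = max(means)
    pulls = [0, 0]            # number of times each arm was pulled
    reward_sums = [0.0, 0.0]  # total observed reward per arm
    regret = 0.0
    for _ in range(rounds):
        arm = choose_arm(pulls, reward_sums, rng)
        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        reward_sums[arm] += reward
        # Expected reward missed by not pulling the best arm this round.
        regret += best_mean - means[arm]
    return regret


def always_arm(arm):
    """Baseline policy that ignores all feedback and pulls a fixed arm."""
    return lambda pulls, reward_sums, rng: arm


def uniform_random(pulls, reward_sums, rng):
    """Baseline policy that pulls an arm uniformly at random."""
    return rng.randrange(2)
```

Under these assumptions, always pulling the better arm gives regret 0, always pulling the worse arm gives regret T times the gap between the two means, and any reasonable learning algorithm should land well below the latter.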

When the reward distribution is allowed to change up to L times over the T rounds (without the agent knowing when such changes happen), it was shown by Auer et al. '02 that no algorithm can achieve regret better than Ω((LT)^(1/2)). In 2019, Auer et al. used Azuma's inequality to bound the probability of 'bad events' and presented an algorithm achieving regret O((LT log T)^(1/2)).

We present a new algorithm based on random walks that achieves regret O((LT)^(1/2)) when L is known and O((LT log L)^(1/2)) when L is unknown. In particular, our algorithm is optimal when L is known. We also obtain improved bounds for the general K-Armed Bandit for a wide range of K.
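As a rough numerical illustration of how the bounds quoted in this abstract compare, the sketch below evaluates each growth rate with all hidden constants set to 1. This is an assumption made purely for comparison, and `regret_orders` is a hypothetical helper; the values show orders of magnitude only, not actual regret.

```python
import math


def regret_orders(T, L):
    """Evaluate the growth rates from the abstract with constants set to 1.

    Assumption: T >= 2 and L >= 2, so that both logarithms are positive.
    """
    if T < 2 or L < 2:
        raise ValueError("this comparison assumes T >= 2 and L >= 2")
    return {
        "lower_bound": math.sqrt(L * T),               # Omega((LT)^(1/2))
        "auer_2019": math.sqrt(L * T * math.log(T)),   # O((LT log T)^(1/2))
        "known_L": math.sqrt(L * T),                   # O((LT)^(1/2))
        "unknown_L": math.sqrt(L * T * math.log(L)),   # O((LT log L)^(1/2))
    }


orders = regret_orders(T=10_000, L=4)
for name, value in orders.items():
    print(f"{name}: {value:.1f}")
```

Since L is at most T, the log L factor in the unknown-L bound is never larger than the log T factor in the 2019 bound, and it is much smaller when the number of changes is small compared to the horizon.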
