Mittagsseminar (in cooperation with A. Steger, D. Steurer and B. Sudakov)

Mittagsseminar Talk Information

Date and Time: Thursday, September 22, 2011, 12:15 pm

Duration: 30 minutes

Location: OAT S15/S16/S17

Speaker: Thomas Dueholm Hansen (Aarhus University)

Strategy iteration is strongly polynomial for 2-player turn-based stochastic games with a constant discount factor

A fundamental model of operations research is the finite, but infinite-horizon, discounted Markov decision process. Ye recently showed that the simplex method with Dantzig's pivoting rule, as well as Howard's policy iteration algorithm, solves discounted Markov decision processes with a constant discount factor in strongly polynomial time. More precisely, Ye showed that for both algorithms the number of iterations required to find the optimal policy is bounded by a polynomial in the number of states and actions. We improve Ye's analysis in two respects. First, we show a tighter bound for Howard's policy iteration algorithm. Second, we show that the same bound applies to the number of iterations performed by the strategy iteration (or strategy improvement) algorithm used for solving 2-player turn-based stochastic games with discounted zero-sum rewards. This provides the first strongly polynomial algorithm for solving these games.
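For readers unfamiliar with the algorithm whose iteration count is being bounded, the following is a minimal sketch of Howard's policy iteration for a small discounted Markov decision process. It is an illustration only and is not taken from the talk. The data layout (`mdp[s][a] = (reward, transition_probabilities)`), the function names, the tie-breaking rule, and the two-state example are all assumptions made here. Exact rational arithmetic is used so that the test for an improving switch is not affected by rounding.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination (exact when entries are Fractions)."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def evaluate(mdp, policy, gamma):
    """Values of a fixed policy: solve (I - gamma * P_policy) v = r_policy."""
    n = len(mdp)
    A, b = [], []
    for s in range(n):
        reward, probs = mdp[s][policy[s]]
        A.append([(1 if s == t else 0) - gamma * probs[t] for t in range(n)])
        b.append(reward)
    return solve(A, b)

def howard_policy_iteration(mdp, gamma):
    """Switch every improvable state to a best action until no switch improves."""
    n = len(mdp)
    policy = [0] * n  # assumed starting policy: first action in every state
    iterations = 0
    while True:
        v = evaluate(mdp, policy, gamma)
        new_policy = []
        for s in range(n):
            def q(a):
                reward, probs = mdp[s][a]
                return reward + gamma * sum(p * v[t] for t, p in enumerate(probs))
            best = max(range(len(mdp[s])), key=q)
            # Keep the current action on ties so the loop terminates.
            new_policy.append(best if q(best) > q(policy[s]) else policy[s])
        if new_policy == policy:
            return policy, v, iterations
        policy = new_policy
        iterations += 1

# Hypothetical two-state example with discount factor 1/2.
F = Fraction
example_mdp = [
    [(F(0), [F(1), F(0)]), (F(1), [F(0), F(1)])],  # state 0: stay, or earn 1 and move to state 1
    [(F(0), [F(0), F(1)]), (F(2), [F(1), F(0)])],  # state 1: stay, or earn 2 and move to state 0
]
policy, values, iterations = howard_policy_iteration(example_mdp, F(1, 2))
print(policy, values, iterations)
```

The quantity `iterations` counts the improvement rounds, which is the quantity the bounds discussed in the abstract refer to. The sketch covers only the one-player case; the strategy iteration algorithm for 2-player games is not shown here.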

Markov decision processes and 2-player turn-based stochastic games define Acyclic Unique Sink Orientations of cubes, and in this abstract framework the strategy iteration algorithm is sometimes referred to as the Bottom-Antipodal algorithm. We also present a conjecture by Hansen and Zwick, stating that the number of iterations for an n-dimensional cube is at most the (n+2)-th Fibonacci number.

Joint work with Peter Bro Miltersen and Uri Zwick.

Theory of Combinatorial Algorithms
