Commit e997f549 authored by Yasser Gonzalez

Fix V initialization in PolicyIterationModified

V was initialized with the wrong dimensions for undiscounted MDPs,
which caused an exception to be raised in _bellmanOperator.
parent 3d20ccb4
@@ -852,7 +852,7 @@ class PolicyIterationModified(PolicyIteration):
             self.thresh = self.epsilon
         if self.discount == 1:
-            self.V = _np.zeros((self.S, 1))
+            self.V = _np.zeros(self.S)
         else:
             Rmin = min(R.min() for R in self.R)
             self.V = 1 / (1 - self.discount) * Rmin * _np.ones((self.S,))
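
The shape mismatch can be reproduced outside the library. The sketch below is a simplified stand-in for a Bellman backup, not the actual `_bellmanOperator`; the names `bellman_backup`, `P`, `R`, `S` and `A` and the random MDP are assumptions made for illustration. It shows one way a column-vector `V` of shape `(S, 1)` can fail where a flat vector of shape `(S,)` works.

```python
import numpy as np

S, A = 3, 2  # hypothetical numbers of states and actions
rng = np.random.default_rng(0)

# Random transition matrices, one (S, S) matrix per action,
# normalised so that every row is a probability distribution.
P = rng.random((A, S, S))
P /= P.sum(axis=2, keepdims=True)

# Random reward vectors, one length-S vector per action.
R = rng.random((A, S))


def bellman_backup(V, discount=1.0):
    """One Bellman backup (illustrative only, not the library's code)."""
    Q = np.empty((A, S))
    for a in range(A):
        # With V of shape (S,), the right-hand side has shape (S,).
        # With V of shape (S, 1), it broadcasts to (S, S) and the
        # assignment into the length-S row Q[a] raises ValueError.
        Q[a] = R[a] + discount * P[a].dot(V)
    return Q.argmax(axis=0), Q.max(axis=0)


# Flat initial value vector, as in the fixed line of the diff.
policy, value = bellman_backup(np.zeros(S))
print(value.shape)

# Column-vector initial value, as in the removed line of the diff.
try:
    bellman_backup(np.zeros((S, 1)))
except ValueError as exc:
    print("column vector rejected:", exc)
```

The exception raised by the real `_bellmanOperator` may differ from the one in this sketch; the point is only that the two branches of the `if` must produce `V` with the same one-dimensional shape.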