Commit 9347d6a9 authored by Steven Cordwell

update the documentation

parent ca769613
v4.0b1 - first beta release created on the testing branch
v4.0a4 - Package is usable with Python 3.3; used the 2to3 script to convert. Still
needs a bit of cleaning up.
The first alpha release of v4.0 will be tagged today, which indicates that the
documentation and testing are starting to approach a level suitable for a final
release. A 4.0 release date has not been set, but once there are unittests for
all classes and functions and the documentation is in a stable state, it can be
tagged.
The release of v0.14 will be made today. With this release all the functionality
should be available to use. The initial release of version 4.0 will be made once
the documentation has been finalised and unit tests have been written to cover
all of the classes. The first release will be 4.0 to match the MATLAB, Octave,
Scilab and R MDP Toolboxes that are available at
@@ -6,6 +6,22 @@ discrete-time Markov Decision Processes. The list of algorithms that have been
implemented includes backwards induction, linear programming, policy iteration,
q-learning and value iteration along with several variations.
The classes and functions were developed based on the
`MATLAB <>`_
`MDP toolbox <>`_ by the
`Biometry and Artificial Intelligence Unit <>`_ of
`INRA Toulouse <>`_ (France). There are editions
available for MATLAB, GNU Octave, Scilab and R.
- Eight MDP algorithms implemented
- Fast array manipulation using `NumPy <>`_
- Full sparse matrix support using
`SciPy's sparse package <>`_
- Optional linear programming support using
`cvxopt <>`_
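The sparse support mentioned above means the transition matrices need not be stored densely. As a rough illustration only (assuming SciPy is installed; the exact formats the toolbox accepts are documented in its docstrings), here is what sparse storage of one action's transition matrix looks like:

```python
import numpy as np
from scipy import sparse

# "Cut" action of the small forest-management example: every state
# returns to state 0, so only one entry per row is non-zero.
dense = np.array([[1.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0]])

# CSR form stores only the non-zero entries plus index arrays.
P_cut = sparse.csr_matrix(dense)
print(P_cut.nnz)  # 3 stored values instead of 9
```

For large state spaces with mostly-zero transition rows, this is where the memory savings come from.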
Documentation is available as docstrings in the module code and as html in the
@@ -22,7 +38,7 @@ Installation
``tar -xzvf pymdptoolbox-<VERSION>.tar.gz``
``unzip pymdptoolbox-<VERSION>``
3. Change to the PyMDPtoolbox directory
``cd pymdptoolbox``
4. Install via Distutils, either to the filesystem or to a home directory
@@ -33,11 +49,24 @@ Quick Use
Start Python in your favourite way. Then follow the example below to import the
module, set up an example Markov decision problem using a discount value of 0.9,
solve it using the value iteration algorithm, and then check the optimal policy.
>>> import mdptoolbox.example
>>> P, R = mdptoolbox.example.forest()
>>> vi = mdptoolbox.mdp.ValueIteration(P, R, 0.9)
>>> vi.policy
(0, 0, 0)
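Under the hood, value iteration repeatedly applies the Bellman backup until the value function stabilises. As a minimal self-contained sketch (plain NumPy, not the toolbox's implementation), the same forest example can be solved with the default transition and reward arrays written out by hand:

```python
import numpy as np

# Default forest-management problem (3 states, fire probability 0.1):
# these arrays mirror what mdptoolbox.example.forest() returns.
P = np.array([
    [[0.1, 0.9, 0.0],    # action 0 (wait): grow, or burn back to state 0
     [0.1, 0.0, 0.9],
     [0.1, 0.0, 0.9]],
    [[1.0, 0.0, 0.0],    # action 1 (cut): always return to state 0
     [1.0, 0.0, 0.0],
     [1.0, 0.0, 0.0]],
])
R = np.array([[0.0, 0.0],   # rows: states, columns: actions
              [0.0, 1.0],
              [4.0, 2.0]])

def value_iteration(P, R, discount, tol=1e-8, max_iter=10000):
    """Repeated Bellman backups until the value function converges."""
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Q[a, s] = R[s, a] + discount * sum_s' P[a, s, s'] * V[s']
        Q = R.T + discount * P.dot(V)
        V_new = Q.max(axis=0)
        if np.abs(V_new - V).max() < tol:
            V = V_new
            break
        V = V_new
    policy = tuple(int(a) for a in Q.argmax(axis=0))
    return policy, V

policy, V = value_iteration(P, R, 0.9)
print(policy)  # (0, 0, 0): always wait, matching the toolbox result above
```

This is only meant to show what the algorithm computes; in practice the toolbox's ``ValueIteration`` class handles sparse matrices, stopping criteria and the other details.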
Issue Tracker:
Source Code:
The project is licensed under the BSD license. See LICENSE.txt for details.
1. Improve the documentation and rewrite it as necessary.
2. Implement a nicer linear programming interface to cvxopt, or write own
linear programming code.
3. Write unittests for all the classes.
4. Implement own exception class.
5. Move evalPolicy* functions to be functions of the util module, as these are
useful for checking policies in cases other than policy iteration algorithms.