### [rewards] Handle sparse rewards better

Handling of sparse rewards was broken. This commit adds several functions to the MDP class to make the reward computation code more modular and, hopefully, more correct. Rewards given as sparse matrices are now converted to a dense vector. Future work will ensure that rewards given in sparse format remain sparse. Fixes #7.
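
For illustration only, here is a minimal sketch of the kind of sparse-to-dense conversion described above. It is not the code from this commit. It assumes that, for a single action, the rewards form an S x S matrix `R` and the transitions an S x S matrix `P`, and that the dense vector holds the expected reward per state. The helper name `expected_reward_vector` is hypothetical; the NumPy and SciPy calls are standard.

```python
import numpy as np
import scipy.sparse as sp


def expected_reward_vector(transition, reward):
    """Collapse an S x S reward matrix into a dense length-S vector.

    Hypothetical helper (not part of the MDP class): entry s is the
    sum over s' of transition[s, s'] * reward[s, s'].
    """
    if sp.issparse(reward):
        # Element-wise product of the sparse rewards with the transitions;
        # the row sums are then flattened into a dense 1-D array.
        weighted = reward.multiply(transition)
        return np.asarray(weighted.sum(axis=1)).ravel()
    # Dense rewards: densify the transitions if needed, then weight and sum.
    if sp.issparse(transition):
        transition = transition.toarray()
    weighted = np.multiply(np.asarray(transition), np.asarray(reward))
    return weighted.sum(axis=1)


# Two-state example for one action (values chosen for illustration).
P = np.array([[0.5, 0.5],
              [0.0, 1.0]])
R = sp.csr_matrix(np.array([[0.0, 2.0],
                            [0.0, 1.0]]))
r = expected_reward_vector(P, R)  # dense vector with one entry per state
```

In this sketch the result is always a dense vector, which matches the behaviour described here; keeping the rewards sparse end to end would need a different return type.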

