How to define an action space that changes with each state in Julia's ReinforcementLearning.jl?
I want to implement an environment in Julia's ReinforcementLearning.jl that has a continuous action space that changes as a function of the state. The state is a positive integer n <= nmax for a given nmax. The action space is the n-dimensional unit hypercube [0, 1]^n; that is, an action is a vector of length n whose elements are in [0, 1].
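To make the setup concrete, here is a minimal sketch in plain Julia of what "the action space depends on the state" means here. It deliberately uses no ReinforcementLearning.jl calls; the names `is_valid_action` and `sample_action` are hypothetical helpers for illustration only, and `nmax = 5` is an arbitrary example value.

```julia
using Random

# Hypothetical helpers for illustration; not part of ReinforcementLearning.jl.

# An action is valid in state n when it is a length-n vector
# with every entry in the closed interval [0, 1].
is_valid_action(n::Integer, a::AbstractVector{<:Real}) =
    length(a) == n && all(x -> 0 <= x <= 1, a)

# Draw a random action for state n: n independent uniform Float32 values.
sample_action(rng::AbstractRNG, n::Integer) = rand(rng, Float32, n)

rng = MersenneTwister(0)
nmax = 5  # arbitrary example value

# The dimension of a valid action changes with the state n.
for n in 1:nmax
    a = sample_action(rng, n)
    @assert is_valid_action(n, a)
end
```

In this sketch, an action sampled for state n = 2 would be rejected in state n = 3, because the required vector length differs between the two states.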
So far I have implemented RLBase.action_space(env::MyEnv), which is essentially:
using ReinforcementLearning
using IntervalSets # for ClosedInterval
RLBase.action_space(env::MyEnv) = Space(ClosedInterval{Float32}[0..1 for _ in 1:state(env)])
# state(env) is an integer between 1 and nmax.
I think this implementation is incomplete, because the ReinforcementLearning.jl documentation says that legal_action_space and legal_action_space_mask should be implemented when ActionStyle is FULL_ACTION_SET.
How should I implement legal_action_space and legal_action_space_mask, and should I use ActionTransformedEnv when defining my environment?