I've been working on these libraries for some time and can share some of my experiments.
As an example of a custom environment, let's first consider a text environment: https://github.com/openai/gym/blob/master/gym/envs/toy_text/hotter_colder.py
For a custom environment, a few things should be defined (a minimal sketch follows the list):
- Constructor (the __init__ method)
- Action space
- Observation space (see https://github.com/openai/gym/tree/master/gym/spaces for all available gym spaces; a space is a kind of data structure that describes the valid values for observations and actions)
- _seed method (I'm not sure it's mandatory)
- _step method that accepts an action as a parameter and returns the observation (the state after the action), the reward (for the transition to the new observation state), done (a boolean flag), and optionally some additional info
- _reset method that implements the logic of starting a fresh episode
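To make this concrete, here is a minimal sketch of such an environment, written against the old-style gym API of that time (underscore-prefixed methods that gym.Env dispatches to). The class name and the toy dynamics are my own placeholders, not something shipped with gym:

    import gym
    from gym import spaces
    from gym.utils import seeding

    class MyTextEnv(gym.Env):  # hypothetical name, just for illustration
        def __init__(self):
            self.action_space = spaces.Discrete(3)       # three discrete actions
            self.observation_space = spaces.Discrete(4)  # four possible observations
            self._seed()

        def _seed(self, seed=None):
            # gym's seeding helper returns a seeded numpy RandomState
            self.np_random, seed = seeding.np_random(seed)
            return [seed]

        def _reset(self):
            self.state = 0
            self.steps = 0
            return self.state

        def _step(self, action):
            assert self.action_space.contains(action)
            self.state = self.np_random.randint(4)    # toy transition: random next observation
            self.steps += 1
            reward = 1.0 if self.state == 3 else 0.0  # toy reward
            done = self.steps >= 10                   # end the episode after 10 steps
            return self.state, reward, done, {}

The public env.reset() and env.step(action) calls dispatch to the underscore versions, so the usual episode loop works unchanged.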
Optionally, you can create a _render method with something like
    import sys
    from io import StringIO

    def _render(self, mode='human', **kwargs):
        # 'ansi' mode writes to a string buffer, any other mode to stdout
        outfile = StringIO() if mode == 'ansi' else sys.stdout
        outfile.write('State: ' + repr(self.state) + ' Action: ' + repr(self.action_taken) + '\n')
        return outfile
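If you then call env.render(mode='ansi'), you get the StringIO back and can read its text with getvalue(); in 'human' mode the line just goes to the terminal.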
Also, for better code flexibility, you can define the logic of your reward in a _get_reward method and the changes to the observation state caused by taking an action in a _take_action method.
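For example, _step from the sketch above could delegate to those helpers like this (_take_action and _get_reward are just suggested method names, not part of the gym API, and the dynamics are again toy placeholders):

    def _step(self, action):
        self._take_action(action)     # apply the action's effect to self.state
        reward = self._get_reward()   # score the transition that just happened
        done = self.state == 3        # example termination condition
        return self.state, reward, done, {}

    def _take_action(self, action):
        # toy transition: the action shifts the observation
        self.state = (self.state + action) % 4

    def _get_reward(self):
        # toy reward: pay out only in the "goal" observation
        return 1.0 if self.state == 3 else 0.0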