I am a newbie to reinforcement learning, working on a college project. The project is about optimizing the power consumption of x86 hardware. I am running proprietary software on a Linux distribution (16.04). The goal is to use reinforcement learning to optimize the power consumption of the system while keeping the performance degradation of the software to a minimum. The proprietary software is a cellular network.
As we already know, the primary functional blocks of reinforcement learning are the agent and the environment. The basic idea is to use the cellular network running on the x86 hardware as the RL environment. This environment interacts with the agent through states, actions, and rewards.
From reading different materials, I understand that I need to wrap my software as a custom environment from which I can retrieve the state features. The state features are application-layer KPIs such as latency and throughput. The action space may include instructions to Linux to change the power state (I can use a predefined set of power options). I have not yet decided on the reward function.
I read this post and decided that I should use OpenAI Gym to create my custom environment.
My doubt is whether using OpenAI Gym to create a custom environment is correct for this type of setup. Am I going in the right direction, or are there alternative/better tools for creating a custom environment? Any tutorial or pointers for creating this custom environment would be appreciated.
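To make the question concrete, here is a minimal sketch of what such a custom environment could look like. This is only an assumption of how the pieces might fit together: the KPI readings are faked with placeholder numbers, the power options and the SLA threshold are hypothetical, and the reward shape (minimize power, penalize latency above a target) is just one possible choice. In a real implementation the class would subclass `gym.Env` and declare `observation_space`/`action_space` with `gym.spaces`; the method names and `step` return signature below follow the classic Gym API.

```python
import random

# Hypothetical predefined power options (e.g. Linux cpufreq governors).
POWER_OPTIONS = ["performance", "ondemand", "powersave"]

class PowerEnv:
    """Sketch of an environment wrapping the cellular-network software.

    In practice this would subclass gym.Env, with
    action_space = gym.spaces.Discrete(len(POWER_OPTIONS)) and an
    observation_space describing the KPI vector.
    """

    def __init__(self, latency_sla_ms=10.0):
        self.latency_sla_ms = latency_sla_ms  # assumed performance target
        self.power_idx = 0                    # index into POWER_OPTIONS

    def _read_kpis(self):
        # Placeholder: in practice, query the application layer for real
        # KPIs. Fake plausible numbers here so the sketch is runnable:
        # more aggressive power saving -> lower power draw, worse latency.
        latency = 5.0 + 3.0 * self.power_idx + random.random()
        throughput = 100.0 - 20.0 * self.power_idx
        power_w = 50.0 - 15.0 * self.power_idx
        return latency, throughput, power_w

    def reset(self):
        self.power_idx = 0
        latency, throughput, _ = self._read_kpis()
        return (latency, throughput)  # observation = KPI features

    def step(self, action):
        # Action: index of a predefined power option; in practice this
        # would issue the corresponding command to Linux.
        self.power_idx = action
        latency, throughput, power_w = self._read_kpis()
        # One possible reward: save power, penalize SLA violations.
        reward = -power_w - 10.0 * max(0.0, latency - self.latency_sla_ms)
        obs = (latency, throughput)
        done = False  # continuing task; episodes could be time-limited
        info = {"power_w": power_w, "option": POWER_OPTIONS[action]}
        return obs, reward, done, info
```

A typical interaction loop would then be `obs = env.reset()` followed by repeated `obs, reward, done, info = env.step(action)` calls, which is exactly the interface Gym-based RL libraries expect.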