I have just started studying Q-learning and am exploring whether it can be used to solve my problem.
Problem: I need to detect certain combinations of data. Four matrices act as the inputs to my system, and I have already categorised each input as either Low (L) or High (H). I need to detect certain input types, for example LLLH, LLHH, HHHH, etc.
NOTE: 1) LLLH means the first input is L, the second input is L, the third input is L and the fourth input is H. 2) I have labelled each input type as a state, for example LLLL is state 1, LLLH is state 2, and so on.
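To make the labelling concrete, here is a minimal sketch of how the input types could be mapped to state numbers. Only LLLL = 1 and LLLH = 2 are fixed by my description; the ordering of the remaining combinations (last input changing fastest) and the names `COMBINATIONS` and `STATE_OF` are assumptions made for illustration.

```python
from itertools import product

LEVELS = ("L", "H")  # L listed first so that LLLL becomes state 1

# All 2**4 = 16 combinations of four L/H inputs, in the order
# LLLL, LLLH, LLHL, LLHH, ..., HHHH (assumed ordering).
COMBINATIONS = ["".join(p) for p in product(LEVELS, repeat=4)]

# 1-based state labels: LLLL -> 1, LLLH -> 2, and so on.
STATE_OF = {combo: i for i, combo in enumerate(COMBINATIONS, start=1)}

print(STATE_OF["LLLL"], STATE_OF["LLLH"], STATE_OF["HHHH"])
```

With this ordering the four inputs alone give 16 states, so any further states in my system would need labels after these.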
From what I have studied, most Q-learning examples have a single goal (only one state is the goal), which makes it easier for the agent to learn and to build the Q-matrix from the R-matrix. In my problem I have many goals (many states act as goals and need to be detected). I don't know how to design the states, how to create the reward matrix with many goals, or how the agent will learn. Can you please help me understand how to use Q-learning in this kind of situation, taking into account that I have about 16 goals in 20+ states?
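To show what I mean, this is a rough sketch of the only extension of the single-goal setup I can think of: give every goal state the same reward in the R-matrix and end an episode as soon as any goal is reached. I am not sure this is the right design. The number of states, which states are goals, the reward of 100, the discount factor, and the assumption that the agent can move from any state to any other state are all placeholders, not facts about my system.

```python
import random

N_STATES = 20            # placeholder: I only know it is "20+"
GOALS = set(range(16))   # placeholder: which 16 states are goals
GAMMA = 0.8              # placeholder discount factor
GOAL_REWARD = 100        # placeholder reward for reaching a goal

def build_reward_matrix(n_states, goals, goal_reward=GOAL_REWARD):
    """R[s][a] is the reward for moving from state s to state a.
    Every goal column gets the same reward, so no goal is preferred."""
    return [[goal_reward if a in goals else 0 for a in range(n_states)]
            for _ in range(n_states)]

def train(R, goals, gamma=GAMMA, episodes=500, seed=0):
    """Tabular Q-learning with random exploration and learning rate 1."""
    rng = random.Random(seed)
    n = len(R)
    Q = [[0.0] * n for _ in range(n)]
    for _ in range(episodes):
        s = rng.randrange(n)
        while True:
            a = rng.randrange(n)   # action = "move to state a"
            if a in goals:
                # Goals are terminal, so there is no future value to add.
                Q[s][a] = R[s][a]
            else:
                Q[s][a] = R[s][a] + gamma * max(Q[a])
            s = a
            if s in goals:         # episode ends at the first goal reached
                break
    return Q

R = build_reward_matrix(N_STATES, GOALS)
Q = train(R, GOALS)
```

If some goals matter more than others, the same R-matrix could presumably hold a different reward per goal column, but I do not know whether that is the usual practice.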
As mentioned above, I understand what Q-learning is, how the states and the goal work, and how the Q-matrix is calculated (how it learns). The problem is that I now have many goals and don't know how to relate my problem to Q-learning: how many states I need, and how to assign the rewards given that I have many goals.
At the very least, I need help with how to create a reward matrix with many goals.