As long as the score keeps improving on both train and test, you are on the right track and moving toward a local/global minimum. When the two curves diverge (train loss keeps going down while test loss starts going up), or when both scores stagnate, it's time to stop.
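As a rough illustration, the stopping rule above can be sketched as a simple "patience" check on the validation loss history (names like `should_stop` and `patience` are my own choices, not from any particular library):

```python
def should_stop(val_losses, patience=3):
    """Stop when validation loss has not improved for `patience` epochs."""
    if len(val_losses) <= patience:
        return False
    best_so_far = min(val_losses[:-patience])
    # If none of the last `patience` epochs beat the earlier best, stop.
    return min(val_losses[-patience:]) >= best_so_far

# Validation loss improves, then stagnates/rises -> stop.
history = [0.90, 0.70, 0.55, 0.56, 0.57, 0.58]
print(should_stop(history))  # True
```

Most frameworks ship a ready-made version of this (e.g. an early-stopping callback), so in practice you rarely write it by hand.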
BUT
If you use accuracy as your evaluation metric, the model can show anomalous behavior that accuracy hides. For example: the network may output the majority class for every input and still score well. This problem can be caught by using another evaluation metric such as F1 or log loss, which will expose such issues during training.
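A small sketch of why accuracy misleads here: a degenerate model that always predicts the majority class gets high accuracy but zero F1 (metrics computed by hand below; in practice you'd use a library like scikit-learn):

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1(y_true, y_pred, positive=1):
    """Binary F1 score for the `positive` class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# 95 negatives, 5 positives; model always predicts the majority class 0
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100
print(accuracy(y_true, y_pred))  # 0.95 -- looks great
print(f1(y_true, y_pred))        # 0.0  -- reveals the problem
```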
Also, for imbalanced data you can use various strategies to counteract the negative effects of the imbalance, such as per-class weights in softmax cross-entropy in TensorFlow. An implementation can be found there.
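The idea behind class weighting can be sketched independently of TensorFlow: scale each example's cross-entropy by the weight of its true class, so mistakes on the rare class cost more (the weight values here are hypothetical, e.g. inverse class frequency):

```python
import math

def weighted_cross_entropy(probs, label, class_weights):
    """Per-example cross-entropy scaled by the weight of the true class."""
    return -class_weights[label] * math.log(probs[label])

# Hypothetical weights: upweight the rare class 1 (e.g. inverse class frequency)
class_weights = {0: 1.0, 1: 19.0}

# Two examples with the same 0.3 confidence in the true class
loss_majority = weighted_cross_entropy([0.7, 0.3], 0, class_weights)
loss_minority = weighted_cross_entropy([0.7, 0.3], 1, class_weights)
print(loss_minority > loss_majority)  # True: errors on the rare class cost more
```

In TensorFlow the same effect is achieved by multiplying the per-example loss tensor by a weight derived from the label before averaging.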