You could make an argument for either case, but I would say that variant 2 makes more sense - a:link, a:visited, a:hover, a:focus, a:active.
Going from link to active, we move from less intent to more intent. Focus implies a bit more intent than hover does, since focus means that the next action taken (such as pressing Enter) applies to that element. A hover merely indicates a sort of flirting interest, if you will. A user may pass their cursor over many elements as they move it around a page, but when they actually enter a textbox (focus), that implies a far greater degree of intent.
Additionally, you generally can't progress to active without also reaching the focus state. Take a submit button. To trigger a submit, the user may either (a) tab to the button and press Enter, or (b) simply click on it. In case (a) the user first focuses the button by tabbing to it, and then activates it by pressing Enter (no hover in this case). In case (b) the user hovers over the button and then clicks. On click the button goes to the active state, and in most browsers it also receives focus, which fires any event handlers attached to focus.
The idea of ascending intent doesn't apply as clearly to link vs visited. Here I would argue that visited is mostly a variant of link. Nevertheless, visited does imply a somewhat greater degree of intent, as it indicates an action (a click or page visit) that was taken before and is perhaps more likely to be taken again. Even if you don't agree with this logic, if you think of link as the base and visited as a variant of it, it still makes sense to put visited (the variant) after link (the base).
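As a rough sketch of what this ordering looks like in a stylesheet (the colors and the outline are arbitrary values picked for illustration, not a recommendation):

```css
/* Illustrative values only: the colors and outline are assumptions. */
a:link    { color: #0645ad; }   /* base state: unvisited link */
a:visited { color: #663399; }   /* variant of the base: already visited */
a:hover   { color: #b35900; }   /* pointer is over the link */
a:focus   { color: #b35900;     /* link will receive the next key press */
            outline: 2px solid #b35900; }
a:active  { color: #cc0000; }   /* link is being pressed right now */
```

All five selectors have the same specificity, so when more than one of them matches, the rule that appears last wins for any property they both set. With this order, a link that is hovered, focused and pressed at the same time ends up with the active color, which matches the idea of the state with the most intent taking priority.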