I am trying to understand the ranking loss (a.k.a. the Maximum Margin Objective Function, MarginRankingLoss, ...) based on the CS 224D: Deep Learning for NLP lecture notes.
In the notes, the cost is defined as follows: J = 1 + s_c − s
where s = f(θ, x) and s_c = f(θ, x_c); x is the correct input and x_c is the corrupted (wrong) input.
So s is the score of the correct example and s_c is the score of the corrupted one.
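For concreteness, here is a minimal sketch of this cost (the scores 2.0 and 1.5 are made up for illustration). As far as I understand, PyTorch's MarginRankingLoss with margin=1 and target y=1 computes the same quantity, just clamped at zero once the margin is satisfied:

```python
import torch
import torch.nn as nn

# Hypothetical scores, purely for illustration:
s = torch.tensor([2.0])    # s   = f(theta, x),   score of the correct input
s_c = torch.tensor([1.5])  # s_c = f(theta, x_c), score of the corrupted input

# The cost as written in the notes (margin = 1):
J = 1 + s_c - s
print(J)  # tensor([0.5000])

# MarginRankingLoss(margin=1) with target y=1 computes
# max(0, 1 - (s - s_c)) = max(0, 1 + s_c - s),
# i.e. the same value, clamped at zero when the margin already holds.
loss_fn = nn.MarginRankingLoss(margin=1.0)
print(loss_fn(s, s_c, torch.tensor([1.0])))  # tensor(0.5000)
```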
My question is this: to update the weights, do I have to compute ∂J/∂θ or ∂s/∂θ?
I thought I had to compute ∂J/∂θ to update θ.
Since J = 1 + s_c − s, that gives ∂J/∂θ = ∂s_c/∂θ − ∂s/∂θ.
So I thought ∂s_c/∂θ and ∂s/∂θ each had to be computed separately.
The lecture notes, however, compute ∂J/∂s = −1 and use this value to update the network.
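To make my confusion concrete, here is a toy sketch of the two routes I am comparing (the linear scorer θ·x and the numbers are invented; they are not from the notes):

```python
import torch

# Toy scorer f(theta, x) = theta . x, invented purely for illustration.
theta = torch.tensor([1.0, -2.0], requires_grad=True)
x = torch.tensor([0.5, 1.0])    # correct input
x_c = torch.tensor([1.0, 0.5])  # corrupted input

s = theta @ x      # s   = f(theta, x)
s_c = theta @ x_c  # s_c = f(theta, x_c)

# Route 1 (my reading): differentiate J with respect to theta directly.
J = 1 + s_c - s
J.backward()
print(theta.grad)  # dJ/dtheta = ds_c/dtheta - ds/dtheta = x_c - x

# Route 2 (how I read the notes): take dJ/ds = -1 (and dJ/ds_c = +1)
# and push them back through the scorer with the chain rule:
# dJ/dtheta = (dJ/ds) * ds/dtheta + (dJ/ds_c) * ds_c/dtheta.
manual_grad = (-1.0) * x + (+1.0) * x_c
print(manual_grad)
```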
What am I doing wrong?