An answer to a related, but different, question here describes how FPUs implement such instructions:
> Once you've reduced your argument, most chips use a CORDIC algorithm to compute the sines and cosines. You may hear people say that computers use Taylor series. That sounds reasonable, but it's not true. The CORDIC algorithms are much better suited to efficient hardware implementation. (Software libraries may use Taylor series, say on hardware that doesn't support trig functions.) There may be some additional processing, using the CORDIC algorithm to get fairly good answers but then doing something else to improve accuracy.
Note, though, that it says "most chips": chip manufacturers naturally strive to improve performance, accuracy, or (ideally) both, so there will be differences between their implementations.
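To make the quote's point about CORDIC concrete, here is a minimal sketch of the rotation-mode CORDIC algorithm in Python. It's illustrative only, not what any particular FPU does: real hardware works in fixed point, where each iteration needs only shifts, adds, and a small angle table, which is exactly why CORDIC suits hardware so well.

```python
import math

def cordic_sin_cos(theta, iterations=32):
    """Approximate (sin(theta), cos(theta)) for |theta| <= pi/2 using
    rotation-mode CORDIC. A hardware version would use fixed-point
    arithmetic where the *2**-i factors are just bit shifts."""
    # Table of rotation angles atan(2^-i), and the accumulated gain
    # correction K = 1 / prod(sqrt(1 + 2^-2i)).
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    K = 1.0
    for i in range(iterations):
        K /= math.sqrt(1.0 + 2.0 ** (-2 * i))

    # Start from the unit vector (1, 0) and rotate it toward theta,
    # choosing the direction d of each micro-rotation from the sign
    # of the remaining angle z.
    x, y, z = 1.0, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y * K, x * K  # (sin, cos)
```

Each iteration adds roughly one bit of accuracy, so 32 iterations lands close to single-precision; the "additional processing" the quote mentions would refine a result like this further.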
Those differences may well lead to greater performance at the cost of accuracy, or vice versa (and of course an implementation can simply be bad at both, since we live in an imperfect world), so there are times when one might favour performing the algorithm on the CPU (as happens if you code the algorithm yourself) rather than delegating to the FPU, as `fsin` does.
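"Coding the algorithm yourself" might look something like the following Taylor-series sine, the kind of approach the quote says software libraries may take on hardware without trig support. This is a simplified sketch: it assumes the input has already been range-reduced to roughly [-π, π], which real libraries handle with considerable care.

```python
def taylor_sin(x, terms=10):
    """Sine via its Taylor series about 0:
       x - x^3/3! + x^5/5! - ...
    Assumes x is already range-reduced to about [-pi, pi]; the series
    converges slowly (or overflows intermediate terms) far from 0."""
    result = 0.0
    term = x  # current term, starting with x^1 / 1!
    for n in range(terms):
        result += term
        # Next odd-power term: multiply by -x^2 and the next two
        # factorial factors.
        term *= -x * x / ((2 * n + 2) * (2 * n + 3))
    return result
```

Whether such a software routine beats `fsin` in accuracy depends entirely on the quality of its range reduction and polynomial, which is exactly the trade-off described above.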
This archived blog post talks of how Sun's implementation of the JVM on Intel only uses a plain call to `fsin` for inputs within a certain range, because of flaws in that implementation. The paper linked from that article presumably discusses that implementation of `fsin`, and its issues, in more detail, but you'll need to be a subscriber or pay to read it (which I have therefore not done).