If in doubt, try the interactive tool linked below and judge afterwards.
TL;DR:
@Richard's view is based on Amdahl's argument, yet it neither fits nor answers your original question:
"How to apply Amdahl's law on a given piece of code?"
This question deserves a two-fold answer:
a)
The original, overhead-naive, resources-agnostic formulation of Amdahl's law (if applied to a given piece of code) answers only the principal limit of the "Speedup" an idealised process-flow is subject to, if operated in an "improved" organisation (using more, parallel, mutually independent lines that allow some parts of the original process to be organised better, possibly processing them in parallel, and thus improve the overall end-to-end process duration). Amdahl's argument was therefore not about CPUs, CPU-cores, or tools for spawning more threads et al. Dr. Gene M. AMDAHL expressed that original formula for general process-flow orchestrations, re-using and acknowledging with admiration the prior work of prof. Kenneth E. KNIGHT, Stanford School of Business Administration, published in September 1966.
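For illustration, the overhead-naive formulation above can be sketched in a few lines (a minimal sketch; `p` stands for the fraction of the original process duration that can be run in parallel, `N` for the number of mutually independent lines):

```python
def amdahl_speedup(p: float, N: int) -> float:
    """Classical (overhead-naive, resources-agnostic) Speedup ceiling.

    p : fraction of the process duration that can be organised in parallel
    N : number of mutually independent processing lines
    """
    return 1.0 / ((1.0 - p) + p / N)

# Even with "infinitely" many independent lines, the ceiling is 1 / (1 - p):
print(amdahl_speedup(0.90, 8))      # only about 4.7x, not 8x
print(amdahl_speedup(0.90, 10**9))  # approaches 10x, never reaches it
```

Note that even with 90 % of the process parallelisable, 8 lines yield well under a 5x Speedup, and no number of lines can ever exceed 10x.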
b)
No matter how tempting the other question is, Amdahl's law does not answer how fast the code will actually run. It only states a principal Speedup limit, a limit that will remain a limit one never achieves, not even under the most abstract and extremely idealised conditions (zero latency, zero jitter, zero add-on overhead times, zero add-on data-SER/DES overhead times, zero SER-scheduling of PAR-work-segment batches, and many more to name here).
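One illustrative way to see why the naive ceiling is never reached is to add overhead terms to the denominator (this is a sketch of the overhead-strict idea discussed below, not Dr. Amdahl's original formula; the parameter names `o_setup` and `o_ser` are illustrative assumptions, expressed as fractions of the original end-to-end duration):

```python
def overhead_strict_speedup(p: float, N: int,
                            o_setup: float, o_ser: float) -> float:
    """Illustrative overhead-aware Speedup estimate (a sketch only).

    The parallel section gains p / N, but pays a per-line setup cost
    (N * o_setup) plus a data-SER/DES cost (o_ser), both as fractions
    of the original end-to-end process duration.
    """
    return 1.0 / ((1.0 - p) + p / N + N * o_setup + o_ser)

# With non-zero overheads, adding ever more lines eventually *hurts*:
for N in (2, 4, 8, 64, 1024):
    print(N, round(overhead_strict_speedup(0.90, N, 0.001, 0.01), 2))
```

With these (made-up) overhead values the Speedup peaks somewhere below 64 lines and drops below 1.0 (slower than pure-[SERIAL]!) at 1024 lines, something the naive formula can never predict.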
Somewhere in 2017 there was published both a criticism of the weaknesses of using the original Amdahl's argument in contemporary contexts and an extended formulation of the original Amdahl's argument, to better reflect those naive-usage weaknesses. After helping for about three years to indeed "Learn more...", as was explicitly written on the click-through link, it was "redacted".
There is also a visual GUI-tool one may interact and play with, changing parameters and immediately seeing their visual impact on the resulting principal-Speedup ceiling. This may help one test and see the hard impacts far better than by just reading the rest of this answer.
Your second question:
"How much will the following code speed up if we run it simultaneously on 8 threads?"
is practical and common in real-world problems, yet Amdahl's law, even in its overhead-strict, resources- and atomicity-of-work aware re-formulated version, does not directly answer it.
We can (and shall) do our professional duty and profile the key phases of the real-hardware process-flow, if we aim to seriously answer this second question, no matter how fuzzy and jitter-dependent our observations might get (scale, background workloads, CPU-cores' thermal-throttling effects and other inter-related dependencies always matter; "less" so on small scales, but an ill-performed or missing add-on overhead analysis may get our HPC-scheduled processing killed for running out of HPC-quota):
- what is the add-on overhead cost of (re)-instantiating a thread or a pool of threads (sometimes even a whole Python-interpreter process, including n-many (re)-replications of its internal state and all of its current data-structures into new RAM allocations, sometimes thus starting O/S resources' suffocation and swap-thrashing) ... in [ns]
- what are the add-on overhead costs associated with data (parameters) interexchange ... in [ns]
- what are the resources that potentially block the achievable level of concurrent / parallel processing (sharing, false-sharing, limits of I/O, ...) and cause independent barriers to processing orchestration, even if we have "infinitely"-many free CPU-cores; this reduces the denominator that decides the achievable effects one may expect from the actual co-existence of independently flowing process-(code)-executions (we can claim that, having 6 Ferrari cars, we can move "data" in full parallel, with a ( PAR / 6 ) improvement over going one after another in a pure [SERIAL]-fashion; yet if the road from start to end crosses a bridge with only 2 lanes, the net effect degrades to only a ( PAR / 2 ) "improvement" on the PAR-section (still not speaking about the other overhead costs of loading and unloading our "data" onto and from our sextet of Ferrari "sport-wagens"))
More thoughts on recursive use in the real-world realm are here; they are always device- and instruction-mix specific (architecture & Cache details matter an awful lot).