Calling str on a string object is pretty cheap: it just returns the original string object. Calling isinstance explicitly first will generally be slower.
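A quick way to convince yourself of the first claim, as a sketch: in CPython, calling str on an exact str instance hands back the very same object rather than a copy. The identity test below relies on that CPython behaviour, which is an implementation detail rather than a language guarantee, and the variable names are only for illustration.

```python
# Assumes CPython: str() returns an exact str instance unchanged.
s = "hello world"
same_object = str(s) is s      # no new string is built

n = 12345
converted = str(n)             # a real conversion creates a new string

print(same_object, repr(converted))
```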
If you want to test this on real data, take a look at the timeit module.
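As a rough sketch of what that might look like, the snippet below times both approaches on a small mixed list. The data list and the two helper function names are made up for illustration; substitute your own data and loop count.

```python
import timeit

# Hypothetical sample data: replace with your real list.
data = [1, "two", 3.0, "four", None] * 200

def convert_all(arr):
    return [str(val) for val in arr]

def convert_non_strings(arr):
    return [val if isinstance(val, str) else str(val) for val in arr]

for func in (convert_all, convert_non_strings):
    # timeit accepts a zero-argument callable; number is the loop count.
    seconds = timeit.timeit(lambda: func(data), number=200)
    print("{0:20} {1:.4f}".format(func.__name__, seconds))
```

Which one wins depends on the Python version and on how many of the items are already strings, so it's worth running this on data that resembles yours.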
BTW, you should eliminate the not from your 2nd version:
[val if isinstance(val, basestring) else str(val) for val in arr]
And you can speed things up slightly by caching str:
def arr_2_strarr(arr, str=str):
    return [str(val) for val in arr]
Happy micro-optimizing. :)
Why cache str? Well, each time you use a name, Python has to look for it. If you're inside a function, first it looks in the local namespace, and if it can't find the name there it looks in the globals, and finally in the built-ins. Built-in names like str aren't copied into each function's local namespace (it would be inefficient to "import" all the built-ins into every function), so an ordinary reference to str falls through the local and global lookups before it is found in the built-ins. By doing
def arr_2_strarr(arr, str=str)
we create a local name str that gets bound to the built-in str type, and because it's a default argument that search & bind process happens once, when the function definition is executed, not each time the function is called.
So each time we call arr_2_strarr the interpreter will immediately find that local str, which will save a tiny amount of time.
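To make the difference visible, here's a small sketch that inspects the compiled code objects. It uses two hypothetical single-value functions (plain_one and cached_one, not part of the code above) instead of list comprehensions, because the way comprehensions are compiled differs between Python versions.

```python
def plain_one(val):
    return str(val)            # str is searched for outside the function

def cached_one(val, str=str):
    return str(val)            # str is a local variable (a parameter)

# Names the code looks up outside the function vs. its local variable names.
print(plain_one.__code__.co_names)
print(cached_one.__code__.co_varnames)

# The default value was bound once, when the def statement ran.
print(cached_one.__defaults__)
```

In plain_one the name str shows up in co_names, so it has to be searched for on every call; in cached_one it's listed among the local variables instead.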
Here's some timeit code that compares the various strategies. It runs on both Python 2 & Python 3, although on Python 3 it substitutes str for basestring, since basestring doesn't exist in Python 3.
This code runs the functions on lists of various sizes first with integer data, then with string data which is created by converting the integer data to strings.
Each line of output gives the times, in seconds, for 3 repetitions of the given number of loops, sorted from fastest to slowest. As the docs for timeit's repeat mention, the main number to look at in each run is the smallest one.
The results for all functions on a given list size and type are also sorted from fastest to slowest.
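If you only care about that smallest number, a minimal sketch along these lines reduces each set of repetitions to its minimum and ranks the functions by it. The list size and loop count here are arbitrary choices for illustration, and only two of the functions are included to keep it short.

```python
from timeit import Timer

def plain(arr):
    return [str(val) for val in arr]

def cached(arr, str=str):
    return [str(val) for val in arr]

arr = list(range(64))
best = {}
for func in (plain, cached):
    # Passing a callable avoids the need for a setup string.
    timer = Timer(lambda: func(arr))
    runs = timer.repeat(repeat=3, number=2000)
    # The minimum is the run least disturbed by other processes.
    best[func.__name__] = min(runs)

# Rank the functions from fastest to slowest by their best run.
for name in sorted(best, key=best.get):
    print('{0:12} {1:.6f}'.format(name, best[name]))
```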
''' Compare the speeds of direct string conversion
    with testing first via isinstance
    See https://stackoverflow.com/q/44439323/4014959
    Written by PM 2Ring 2017.06.09
    Python 2 / 3 compatible
'''

from __future__ import print_function, division
from timeit import Timer
import sys

# Python 3 doesn't have basestring
if sys.version_info[0] > 2:
    basestring = str

# The functions to test
def plain(arr):
    return [str(val) for val in arr]

def cached(arr, str=str):
    return [str(val) for val in arr]

def teststr(arr):
    return [val if isinstance(val, str) else str(val) for val in arr]

def testbase(arr):
    return [val if isinstance(val, basestring) else str(val) for val in arr]

def testbasenot(arr):
    return [str(val) if not isinstance(val, basestring) else val for val in arr]

funcs = (
    plain,
    cached,
    teststr,
    testbase,
    testbasenot,
)

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

def verify(arr):
    results = [func(arr) for func in funcs]
    first, results = results[0], results[1:]
    return all(first == u for u in results)

def time_test(loops, reps):
    ''' Print timing stats for all the functions '''
    timings = []
    for func in funcs:
        fname = func.__name__
        setup = 'from __main__ import arr, ' + fname
        cmd = fname + '(arr)'
        t = Timer(cmd, setup)
        result = t.repeat(reps, loops)
        result.sort()
        timings.append((result, fname))

    timings.sort()
    for result, fname in timings:
        print('{0:12} {1}'.format(fname, result))

# Check that all functions return the same results
if 0:
    print('Testing all functions')
    arr = list(range(10))
    print(arr, verify(arr))
    arr = list('abcdefghij')
    print(arr, verify(arr))

# Do the timing tests
reps = 3
loops = 1 << 16
for i in range(1, 11):
    n = 1 << i
    # Build a data array of integers
    arr = range(n)
    print('\n{0}: Size={1}, Loops={2}'.format(i, n, loops))
    print('* Integer')
    time_test(loops, reps)

    # Convert the data array contents to strings
    arr = cached(arr)
    print('\n* String')
    time_test(loops, reps)
    loops >>= 1
Typical Python 2 output
1: Size=2, Loops=65536
* Integer
cached [0.17268610000610352, 0.19634914398193359, 0.2058720588684082]
plain [0.17906594276428223, 0.18797492980957031, 0.24009895324707031]
teststr [0.32513308525085449, 0.33270597457885742, 0.35080599784851074]
testbasenot [0.32793092727661133, 0.33176803588867188, 0.33498501777648926]
testbase [0.32964491844177246, 0.33154511451721191, 0.33760714530944824]
* String
cached [0.1619560718536377, 0.1628870964050293, 0.16448402404785156]
teststr [0.16335082054138184, 0.16484308242797852, 0.17012500762939453]
plain [0.16956901550292969, 0.1711430549621582, 0.18457293510437012]
testbase [0.22378706932067871, 0.2255101203918457, 0.22593879699707031]
testbasenot [0.22855901718139648, 0.22941207885742188, 0.23271608352661133]
2: Size=4, Loops=32768
* Integer
cached [0.12796807289123535, 0.12807202339172363, 0.12817001342773438]
plain [0.13622713088989258, 0.14297294616699219, 0.14868402481079102]
teststr [0.27701020240783691, 0.27812099456787109, 0.2795259952545166]
testbasenot [0.27815794944763184, 0.28220701217651367, 0.29373884201049805]
testbase [0.2804868221282959, 0.28186416625976562, 0.31699705123901367]
* String
cached [0.12131500244140625, 0.12241697311401367, 0.13379192352294922]
teststr [0.12839889526367188, 0.1314079761505127, 0.14053797721862793]
plain [0.13051795959472656, 0.14696002006530762, 0.18504786491394043]
testbase [0.18404412269592285, 0.1844489574432373, 0.19633579254150391]
testbasenot [0.18416285514831543, 0.18494606018066406, 0.18553614616394043]
3: Size=8, Loops=16384
* Integer
cached [0.10957002639770508, 0.11252093315124512, 0.11768913269042969]
plain [0.11848998069763184, 0.11958003044128418, 0.1292269229888916]
testbase [0.26231694221496582, 0.26471304893493652, 0.26625895500183105]
teststr [0.26410102844238281, 0.2641758918762207, 0.26569199562072754]
testbasenot [0.26910495758056641, 0.26967120170593262, 0.2741539478302002]
* String
cached [0.102294921875, 0.10357999801635742, 0.1050269603729248]
teststr [0.10852217674255371, 0.10861611366271973, 0.1127161979675293]
plain [0.11173510551452637, 0.11183404922485352, 0.12115597724914551]
testbasenot [0.16488981246948242, 0.16509699821472168, 0.16648602485656738]
testbase [0.16622614860534668, 0.16688108444213867, 0.16962814331054688]
4: Size=16, Loops=8192
* Integer
cached [0.10548806190490723, 0.10568594932556152, 0.10611891746520996]
plain [0.11526799201965332, 0.1160120964050293, 0.12486004829406738]
teststr [0.25309896469116211, 0.25549888610839844, 0.25838899612426758]
testbasenot [0.25410699844360352, 0.27252411842346191, 0.32510590553283691]
testbase [0.25414609909057617, 0.26968812942504883, 0.27393984794616699]
* String
cached [0.092885017395019531, 0.096045970916748047, 0.10643196105957031]
teststr [0.098433017730712891, 0.098783016204833984, 0.10051798820495605]
plain [0.10081005096435547, 0.10222005844116211, 0.12018895149230957]
testbasenot [0.15373396873474121, 0.15472292900085449, 0.15676999092102051]
testbase [0.15490198135375977, 0.15572404861450195, 0.15599799156188965]
5: Size=32, Loops=4096
* Integer
cached [0.10568094253540039, 0.10743498802185059, 0.1115870475769043]
plain [0.1163330078125, 0.11633419990539551, 0.12796401977539062]
teststr [0.25122308731079102, 0.26527810096740723, 0.26579189300537109]
testbase [0.25309586524963379, 0.25563716888427734, 0.25917816162109375]
testbasenot [0.25465011596679688, 0.25907588005065918, 0.26110982894897461]
* String
cached [0.085406064987182617, 0.086378097534179688, 0.08651280403137207]
teststr [0.092473983764648438, 0.09324193000793457, 0.093439817428588867]
plain [0.096549034118652344, 0.097501993179321289, 0.10462403297424316]
testbase [0.14794015884399414, 0.14966106414794922, 0.15016818046569824]
testbasenot [0.14796280860900879, 0.14940309524536133, 0.15308189392089844]
6: Size=64, Loops=2048
* Integer
cached [0.10838603973388672, 0.1089630126953125, 0.11129999160766602]
plain [0.11764693260192871, 0.11851096153259277, 0.12583494186401367]
teststr [0.2550208568572998, 0.25540995597839355, 0.26316595077514648]
testbase [0.25723910331726074, 0.25930881500244141, 0.26207089424133301]
testbasenot [0.25864100456237793, 0.25901007652282715, 0.26875495910644531]
* String
cached [0.086635112762451172, 0.087384939193725586, 0.099885940551757812]
plain [0.096493959426879883, 0.12469196319580078, 0.13684391975402832]
teststr [0.096681118011474609, 0.098448991775512695, 0.10569310188293457]
testbase [0.14573216438293457, 0.14696693420410156, 0.14700508117675781]
testbasenot [0.14776277542114258, 0.14852094650268555, 0.15462112426757812]
7: Size=128, Loops=1024
* Integer
cached [0.10915207862854004, 0.11011981964111328, 0.1127631664276123]
plain [0.11721491813659668, 0.11830401420593262, 0.1254270076751709]
testbase [0.25789499282836914, 0.26130795478820801, 0.26179313659667969]
teststr [0.25840306282043457, 0.25889492034912109, 0.26300287246704102]
testbasenot [0.26443600654602051, 0.26498103141784668, 0.26691412925720215]
* String
cached [0.083537101745605469, 0.084954023361206055, 0.086431980133056641]
teststr [0.091158866882324219, 0.09123992919921875, 0.091590166091918945]
plain [0.091225862503051758, 0.092115163803100586, 0.099261045455932617]
testbase [0.14569401741027832, 0.14622306823730469, 0.14650607109069824]
testbasenot [0.14774990081787109, 0.14930200576782227, 0.15020990371704102]
8: Size=256, Loops=512
* Integer
cached [0.10824894905090332, 0.10865211486816406, 0.10895800590515137]
plain [0.11750102043151855, 0.12690877914428711, 0.12890195846557617]
teststr [0.25457501411437988, 0.25542402267456055, 0.25692200660705566]
testbasenot [0.25513482093811035, 0.25664496421813965, 0.25999689102172852]
testbase [0.25680398941040039, 0.25924396514892578, 0.26179695129394531]
* String
cached [0.080662012100219727, 0.081827878952026367, 0.081900119781494141]
teststr [0.089673995971679688, 0.097939014434814453, 0.15471792221069336]
plain [0.094327926635742188, 0.095342159271240234, 0.097375154495239258]
testbasenot [0.14262199401855469, 0.14278602600097656, 0.14302182197570801]
testbase [0.14464497566223145, 0.14674210548400879, 0.16207790374755859]
9: Size=512, Loops=256
* Integer
cached [0.10789299011230469, 0.1092069149017334, 0.110015869140625]
plain [0.11702799797058105, 0.1181950569152832, 0.12698101997375488]
testbase [0.25504207611083984, 0.25520896911621094, 0.25734806060791016]
testbasenot [0.25715017318725586, 0.25747489929199219, 0.25850796699523926]
teststr [0.25783085823059082, 0.25882315635681152, 0.26154208183288574]
* String
cached [0.078849077224731445, 0.079813003540039062, 0.084489107131958008]
teststr [0.086745977401733398, 0.087059974670410156, 0.087485074996948242]
plain [0.088322877883911133, 0.088804960250854492, 0.097378969192504883]
testbasenot [0.14128994941711426, 0.14266705513000488, 0.1427910327911377]
testbase [0.14152097702026367, 0.14231991767883301, 0.14392399787902832]
10: Size=1024, Loops=128
* Integer
cached [0.10892415046691895, 0.11003899574279785, 0.11008000373840332]
plain [0.1192779541015625, 0.12048506736755371, 0.12956619262695312]
teststr [0.25335502624511719, 0.25642204284667969, 0.25892996788024902]
testbase [0.25525593757629395, 0.25550699234008789, 0.25794696807861328]
testbasenot [0.25932693481445312, 0.25960803031921387, 0.26134610176086426]
* String
cached [0.078451156616210938, 0.080369949340820312, 0.080511093139648438]
teststr [0.084844112396240234, 0.085949897766113281, 0.096578836441040039]
plain [0.086302042007446289, 0.087638139724731445, 0.096364974975585938]
testbase [0.14068913459777832, 0.14274501800537109, 0.15559101104736328]
testbasenot [0.14075493812561035, 0.15553092956542969, 0.19578790664672852]
Typical Python 3 output
1: Size=2, Loops=65536
* Integer
plain [0.2957206170030986, 0.2959696320031071, 0.2991539639988332]
cached [0.3058611470005417, 0.30598287599787, 0.3073535650000849]
testbase [0.38803433800057974, 0.39307209699836676, 0.393392562000372]
testbasenot [0.3888578799997049, 0.3951267439988442, 0.42909636100011994]
teststr [0.41290506400036975, 0.41541150199918775, 0.4488242949992127]
* String
testbase [0.23906823500146857, 0.23946705200069118, 0.24624350399972172]
testbasenot [0.24037985899849446, 0.24200722000023234, 0.2462738950016501]
plain [0.25742501500280923, 0.2644229819998145, 0.26711930600140477]
teststr [0.2635171010006161, 0.3559218000009423, 0.3784064870014845]
cached [0.2687887559986848, 0.2711959320004098, 0.38138879500183975]
2: Size=4, Loops=32768
* Integer
cached [0.21332427200104576, 0.21363574399947538, 0.21528891600246425]
plain [0.22395663199858973, 0.22762144099760917, 0.23422862100051134]
testbasenot [0.31939790100295795, 0.32413787499899627, 0.32422161499926005]
testbase [0.3209382370005187, 0.3213516770010756, 0.3215230670029996]
teststr [0.3372085839982901, 0.33786465500088525, 0.33847540900023887]
* String
testbasenot [0.17031173299983493, 0.17143720199965173, 0.17724975699820789]
testbase [0.170390128998406, 0.17118954800025676, 0.18865150499914307]
cached [0.18190538799899514, 0.18262020299880533, 0.183105569001782]
plain [0.18666503399799694, 0.18781541300268145, 0.1955128590016102]
teststr [0.18973677000030875, 0.19112570400102413, 0.19168143299975782]
3: Size=8, Loops=16384
* Integer
cached [0.17012267099926248, 0.18160372200145503, 0.2275817529989581]
plain [0.1890079689983395, 0.1963043950017891, 0.2016476179996971]
testbasenot [0.28168991999700665, 0.2821743839995179, 0.286649605997809]
testbase [0.28295213199817226, 0.28760008400058723, 0.2906435440017958]
teststr [0.2958552290001535, 0.2989299110013235, 0.31747390199961956]
* String
testbase [0.13354753000021446, 0.13377505199969164, 0.14039257600234123]
cached [0.1352838150014577, 0.1353432000032626, 0.13798289999976987]
testbasenot [0.14252334699995117, 0.14301740500013693, 0.1445914210016781]
plain [0.15130633899752866, 0.15166569000211894, 0.1616801599993778]
teststr [0.15267008800219628, 0.1545946529986395, 0.15590016200076207]
4: Size=16, Loops=8192
* Integer
cached [0.144755126999371, 0.14782401300180936, 0.1484048439997423]
plain [0.1726092749995587, 0.1740606339990336, 0.1815100200001325]
testbase [0.26685525399807375, 0.27029573199979495, 0.2716258750006091]
testbasenot [0.2702714350016322, 0.2723204169997189, 0.27288546099953237]
teststr [0.28515160999813816, 0.28523068700087606, 0.2878553769987775]
* String
cached [0.11515368599793874, 0.11579233700103941, 0.11688366999806021]
testbase [0.12178990400207113, 0.13090817400006927, 0.13304468899877975]
testbasenot [0.13121789299839293, 0.14976675499929115, 0.1521548589989834]
teststr [0.13410512400150765, 0.1354981399999815, 0.147247362001508]
plain [0.13691626099898713, 0.1384456069972657, 0.1426525679999031]
5: Size=32, Loops=4096
* Integer
cached [0.13246865899782279, 0.13320018100057496, 0.134628559997509]
plain [0.1636957459995756, 0.16763203899972723, 0.1752369269997871]
testbase [0.26010187700012466, 0.2606812570011243, 0.2647345440018398]
testbasenot [0.2620696090016281, 0.26230394700178294, 0.26258907899682526]
teststr [0.27685887300322065, 0.2787095199964824, 0.28293989099984174]
* String
cached [0.10246079200078384, 0.10416977099885116, 0.10755630499988911]
testbasenot [0.10829716499938513, 0.10918466699877172, 0.10935586699997657]
testbase [0.11739019699962228, 0.11808202800239087, 0.11899654000080773]
plain [0.12601002500014147, 0.12718953500007046, 0.13454839599944535]
teststr [0.13366336599938222, 0.13407608800116577, 0.13510101700012456]
6: Size=64, Loops=2048
* Integer
cached [0.12591946799875586, 0.127094235002005, 0.13223557899982552]
plain [0.160616523000499, 0.16232994500023779, 0.1691623620026803]
testbase [0.2534341589998803, 0.2556092949998856, 0.2571690379991196]
testbasenot [0.2560774869998568, 0.2574564010028553, 0.2606996459981019]
teststr [0.268248238000524, 0.2702014210008201, 0.27107579600124154]
* String
cached [0.09791737100022146, 0.09819723300097394, 0.10752435399990645]
testbasenot [0.1057888709983672, 0.10588572099732119, 0.16173565400094958]
testbase [0.10636284599968349, 0.1179599219976808, 0.12130766799964476]
plain [0.12285572399923694, 0.12589510299949325, 0.13114397300159908]
teststr [0.13122114399811835, 0.13273253399893292, 0.14575592999972287]
7: Size=128, Loops=1024
* Integer
cached [0.12404713899741182, 0.12496110600113752, 0.12496385000122245]
plain [0.15980284800025402, 0.16046370399999432, 0.16711239899814245]
testbasenot [0.25531527800194453, 0.25563639699976193, 0.2586420219995489]
testbase [0.25544935799916857, 0.2558138679996773, 0.257172014000389]
teststr [0.2699256220003008, 0.2712909309993847, 0.27702098800000385]
* String
cached [0.09376715399776003, 0.09393715400074143, 0.09975314399707713]
testbasenot [0.10510071799944853, 0.10511873200084665, 0.10523289399861824]
testbase [0.11240010600158712, 0.11325187799957348, 0.11632439300228725]
plain [0.12139380200096639, 0.12202585699924384, 0.1315958569975919]
teststr [0.12834531499902369, 0.12949470400053542, 0.12955383699954837]
8: Size=256, Loops=512
* Integer
cached [0.12225364700134378, 0.12283446399669629, 0.1285843859986926]
plain [0.15971405900199898, 0.16198832800000673, 0.16777605400056927]
testbase [0.2507534860014857, 0.2527904779999517, 0.25378678199922433]
testbasenot [0.25323686200135853, 0.2547167230004561, 0.25919888999851537]
teststr [0.2652072370001406, 0.2658402630004275, 0.2674206650008273]
* String
cached [0.0906629850032914, 0.0985801380011253, 0.09929232800277532]
testbase [0.10155730300175492, 0.1042869699995208, 0.11276149599871133]
testbasenot [0.10197166099897004, 0.11451221999959671, 0.15595895300066331]
plain [0.11898361400017166, 0.12018223199993372, 0.12760113599870238]
teststr [0.12645652200080804, 0.12671815700014122, 0.14095144699967932]
9: Size=512, Loops=256
* Integer
cached [0.12672984500022721, 0.1462409830019169, 0.2653043659993273]
plain [0.161721200998727, 0.17296033000093303, 0.19699998799842433]
testbase [0.25432757399903494, 0.25851125400004094, 0.258548003002943]
testbasenot [0.25619441399976495, 0.25656893900304567, 0.25998359599907417]
teststr [0.2719232039999042, 0.2744571339972026, 0.2751794379983039]
* String
cached [0.08841608199873008, 0.08848714099804056, 0.09124958899701596]
testbasenot [0.09962382599769626, 0.10016373899998143, 0.10028601600060938]
testbase [0.10713129000214394, 0.10752918499929365, 0.10952026399900205]
plain [0.1163020489984774, 0.12190789400119684, 0.1264930679972167]
teststr [0.1242994140011433, 0.12458201900153654, 0.12523995000083232]
10: Size=1024, Loops=128
* Integer
cached [0.12827690600170172, 0.1294701549995807, 0.13387694999983069]
plain [0.16636216699771467, 0.16866590399877168, 0.17549873600000865]
testbasenot [0.25435296399882645, 0.25515673799964134, 0.2605281959986314]
testbase [0.26351416900070035, 0.26398584699927596, 0.2651360300005763]
teststr [0.26816077799958293, 0.26908816800278146, 0.2715630999991845]
* String
cached [0.08827024300262565, 0.09090095799911069, 0.09729095900183893]
testbase [0.10063145499952952, 0.1010660120009561, 0.10904535399822635]
testbasenot [0.10313185999984853, 0.11444468399713514, 0.14796407999892836]
plain [0.11569941500056302, 0.11579339799936861, 0.12615105800068704]
teststr [0.12353994099976262, 0.12515813500067452, 0.13752399999793852]
These timings were performed on a rather old 32 bit single core 2GHz machine with 2GB of RAM running a Debian derivative of Linux. I used Python 2.6.6 and Python 3.6.0. Your results may vary. ;) In any case, these results should only be used as a rough guide. timeit does a pretty good job of only timing the stuff we want to time, but it has no control over other processes that also want to use the CPU.