-1

Would this:

os.makedirs(item_number)
os.makedirs(os.path.join(item_number, 'Project'))
os.makedirs(os.path.join(item_number, 'Project', 'Images'))
os.makedirs(os.path.join(item_number, 'Project', 'Area Templates'))

be more efficient and create fewer objects if written this way:

project = "Project"
os.makedirs(os.path.join(item_number, project))
os.makedirs(os.path.join(item_number, project, 'Images'))
os.makedirs(os.path.join(item_number, project, 'Area Templates'))

Or does Python understand that those "Project" strings should be the same object, making the extra name needlessly verbose? (I am asking regardless of whether the real-world efficiency gain is significant enough to warrant it.)

3 Answers

1

You are micro-optimising; you should not worry about how many string objects are created here. A few string objects here and there will not affect the memory footprint to any noticeable extent, and because you are touching the filesystem, any performance gain from one look-up method over another melts into insignificance next to waiting for I/O to complete. Instead, focus on readability and maintainability.

Your example could be simplified further:

project_dir = os.path.join(item_number, 'Project')
os.makedirs(os.path.join(project_dir, 'Images'))
os.makedirs(os.path.join(project_dir, 'Area Templates'))

os.makedirs() creates the intervening directories for you, so there is no need to call it quite so often.
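To see the intermediate directories being created, here is a minimal sketch that runs inside a temporary directory so nothing is left behind. The item number "12345" is a made-up placeholder, and exist_ok=True (Python 3.2+) is an addition of this sketch so that re-running it does not raise.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # "12345" is a hypothetical item number, used only for illustration.
    item_number = os.path.join(root, "12345")
    project_dir = os.path.join(item_number, "Project")

    # Neither item_number nor project_dir exists yet; makedirs creates
    # every missing parent on the way to each leaf directory.
    for leaf in ("Images", "Area Templates"):
        os.makedirs(os.path.join(project_dir, leaf), exist_ok=True)

    created = sorted(os.listdir(project_dir))
    parents_exist = os.path.isdir(item_number) and os.path.isdir(project_dir)

print(created, parents_exist)
```

Two calls are enough here because each one fills in whatever part of the path is still missing.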

CPython can and will re-use immutable objects created from literals, such as tuples and strings, within the same code object. That means that if you use 'Project' multiple times in one function, Python can opt to store only one such object in the code object's constants:

>>> def demo():
...     project = "Project"
...     os.makedirs(os.path.join(item_number, project))
...     os.makedirs(os.path.join(item_number, project, 'Images'))
...     os.makedirs(os.path.join(item_number, project, 'Area Templates'))
...
>>> demo.__code__.co_consts
(None, 'Project', 'Images', 'Area Templates')

See "About the changing id of a Python immutable string" for the full details. You should not rely on this behaviour, however.
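If you want to check the constant sharing on your own interpreter, a sketch along these lines repeats the literal three times and counts how often it shows up in the constants. The function name build_paths is made up for this illustration, and the result reflects a CPython implementation detail rather than a language guarantee.

```python
import os

def build_paths(item_number):
    # The literal 'Project' is written three times on purpose.
    base = os.path.join(item_number, 'Project')
    images = os.path.join(item_number, 'Project', 'Images')
    templates = os.path.join(item_number, 'Project', 'Area Templates')
    return base, images, templates

# co_consts holds the constants the compiler stored for this function.
consts = build_paths.__code__.co_consts
project_count = sum(1 for const in consts if const == 'Project')

print(consts)
print(project_count)
```

On CPython the count should come out as one, because the compiler merges equal constants within a single code object.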

Martijn Pieters
0

I suspect the project = "Project" variant may be slower, because Python has to look up the value of the project variable each time you use it.

The other variant might create only one "Project" string, because that is something Python can intern (see, for example, this question).

However, joining the base path once up front might be faster still, and then you don't need to repeat the string at all:

project_base = os.path.join(item_number, "Project")
os.makedirs(project_base)
os.makedirs(os.path.join(project_base, 'Images'))
os.makedirs(os.path.join(project_base, 'Area Templates'))
MSeifert
  • Isn't it an example of *premature optimisation is the root of all evil* ? – Gribouillis Dec 20 '16 at 19:23
  • @Gribouillis If you mean my last example: why would you think that? It's far better to reduce repetition than to focus on the fastest way to repeat something. – MSeifert Dec 20 '16 at 19:28
  • That's why discussing whether a `project` variable is desirable is premature optimisation. – Gribouillis Dec 20 '16 at 19:32
  • Within a function, where immutable literals are stored as constants, there is no difference whatsoever in lookup speed for either `'Project'` or `project`. In a *global context* (outside functions, at a module top-level), the literal is probably *faster*, because a global name lookup requires a dictionary lookup (requiring hashing), while bytecode constants and locals are array lookups (simple indexing into a memory location). – Martijn Pieters Dec 20 '16 at 19:41
  • That said, this is micro-optimising and not worth fretting over. Code in critical loops may benefit from using locals and literals, code creating filesystem directories definitely does not. The filesystem is the bottleneck here. – Martijn Pieters Dec 20 '16 at 19:44
  • I totally agree with both of you. That's why I presented the variant (see code in my answer) without repeating the string at all. :-) – MSeifert Dec 20 '16 at 19:45
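
The lookup difference described in the comments above can be inspected with the dis module. This is only a sketch: the three function names are made up for illustration, and the exact opcode names vary between CPython versions, so the helper keeps only the LOAD_* instructions.

```python
import dis

PROJECT = "Project"  # module-level name, so reading it is a global lookup

def use_literal():
    return "Project".upper()   # literal stored in the code object's constants

def use_local(project):
    return project.upper()     # local variable, stored in a fast-locals slot

def use_global():
    global PROJECT             # explicit here; a plain read compiles the same way
    return PROJECT.upper()     # resolved through the module dictionary

def load_ops(func):
    """Return the names of the LOAD_* instructions compiled for func."""
    return [ins.opname for ins in dis.get_instructions(func)
            if ins.opname.startswith("LOAD_")]

for func in (use_literal, use_local, use_global):
    print(func.__name__, load_ops(func))
```

On CPython the literal and the local should show up as constant and fast-local loads (array indexing), while the global shows up as LOAD_GLOBAL (a dictionary lookup). None of this matters next to the cost of creating directories.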
-1

Yes, the strings are the same. Try this:

a = 'project'
b = 'project'
a is b # True

This is safe, since strings are immutable and changing any of them creates a new string, so other variables holding the same value are not affected.
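
As a hedged counterpoint to the snippet above: identity between equal strings is an implementation detail, and a string built at runtime is usually a separate object. The sketch below assumes CPython. Only the equality check and the sys.intern() result are dependable; the plain identity check is not.

```python
import sys

a = "project"                  # literal, typically interned by CPython
b = "".join(["pro", "ject"])   # equal value, but built at runtime

same_value = (a == b)          # equality compares contents
same_object = (a is b)         # identity; not guaranteed either way

# sys.intern() returns one canonical object per distinct string value.
interned_same = sys.intern(a) is sys.intern(b)

print(same_value, same_object, interned_same)
```

Compare strings with ==, and reach for sys.intern() only when identity genuinely matters, for example to speed up repeated dictionary lookups.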

blue_note
  • I think this answers the issue, at least for one implementation of python. – Gribouillis Dec 20 '16 at 19:21
  • With several caveats: for example long strings don't get interned and only string literals _might_ be interned. – MSeifert Dec 20 '16 at 19:26
  • 2
    This is highly dependent on the context and not always the case. See [About the changing id of a Python immutable string](//stackoverflow.com/a/24245514) – Martijn Pieters Dec 20 '16 at 19:31