
I have a huge collection of strings in my program. I know for a fact that they are heavily duplicated, so I would like to use string interning to reduce memory usage. However, because they are mostly created by other libraries in many different places, I cannot call intern() myself when they are created. Is there a way to force CPython (2.7) to intern strings by default? Or is there some clever workaround, such as overriding the str class and somehow making Python use my new str class everywhere str is called, without modifying the source of the calling programs and libraries?
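
To illustrate the manual approach I am trying to avoid, here is a minimal sketch of interning strings after the fact. The helper name `intern_all` and the sample data are hypothetical, and this does not make CPython intern by default; it only shows what calling intern() on already-created strings does.

```python
import sys

# Python 2 exposes intern() as a builtin; Python 3 moved it to sys.intern.
try:
    intern_str = intern  # Python 2
except NameError:
    intern_str = sys.intern  # Python 3


def intern_all(strings):
    """Return a new list in which equal strings share a single object.

    Hypothetical helper: only plain str instances can be interned,
    so unicode objects (Python 2) or str subclasses would raise TypeError.
    """
    return [intern_str(s) for s in strings]


# Build equal strings at runtime, as a library might, so they are
# separate objects until they are interned.
raw = ["".join(["user", "-", str(n % 2)]) for n in range(4)]
deduped = intern_all(raw)
```

After this, `deduped[0]` and `deduped[2]` refer to the same object, but the duplicates held inside the libraries themselves are untouched, which is exactly the problem.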

C Snover
Yanshuai Cao
  • **Possible** duplicate of [What does python sys.intern do, and when should it be used?](http://stackoverflow.com/questions/1136826/what-does-python-sys-intern-do-and-when-should-it-be-used) (if not, it might help you?) Also, this might be a perfect opportunity to use `memcache`? – Torxed Mar 27 '14 at 08:48
  • I don't think it's a duplicate... that describes what it does; this is asking to do it all the time. One thing to keep in mind is that if these strings are created by other libraries, they might even be creating the strings in C code, i.e. making interning the default might not even help you in that case. – Corley Brigman Mar 27 '14 at 13:41
  • @Torxed, this is not a duplicate at all, for the reason mentioned by CorleyBrigman above. – Yanshuai Cao Mar 27 '14 at 17:42
  • @caoy Np, thought it might help if nothing else :) Guess it didn't. – Torxed Mar 27 '14 at 21:15

0 Answers