I'm pretty new to regular expressions but decided to use them to unserialize PHP arrays. Here's some background info:
I rewrote a database-backed website for companies in Django; the original was written in PHP. There is a many-to-many (M2M) relation between companies and industries. In the old model this was handled by storing serialized PHP arrays, so I now have to sync everything over correctly. My first attempt used a lot of splitting and slicing and was really ugly, so I decided to dive into regular expressions. Here is what I have now (it works perfectly fine):
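import re

def unserialize_array(serialized_array):
    # Capture everything between the first value's opening quote and the last value's closing quote.
    matched_sub = re.search(r'^a:\d+:\{i:\d+;s:\d+:"(.*?)";\}$', serialized_array).group(1)
    # Replace each separator between values with a marker, then split on that marker.
    industry_list = re.sub(r'";i:\d+;s:\d+:"', "? ", matched_sub).split("? ")
    new_dict = dict(enumerate(industry_list))
    return new_dict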
I was wondering, however, whether I could do all of this with a single regular expression instead of two.
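To make the question concrete, here is a minimal sketch of what a single-pattern version might look like, using `re.findall` to match each `i:<index>;s:<length>:"<value>";` entry directly. The names `ENTRY_PATTERN` and `unserialize_array_single` and the sample string are made up for illustration. The sketch assumes no value contains the sequence `";`. Unlike the anchored pattern above, it does not check the overall `a:N:{...}` structure, and it takes the keys from the serialized indices rather than from `enumerate`.

```python
import re

# Hypothetical pattern: one match per `i:<index>;s:<length>:"<value>";` entry.
ENTRY_PATTERN = re.compile(r'i:(\d+);s:\d+:"(.*?)";')

def unserialize_array_single(serialized_array):
    # With two capture groups, findall returns one (index, value) tuple per entry.
    return {int(index): value
            for index, value in ENTRY_PATTERN.findall(serialized_array)}

# Made-up sample input in the same shape as the serialized arrays described above.
sample = 'a:2:{i:0;s:7:"Banking";i:1;s:9:"Insurance";}'
print(unserialize_array_single(sample))  # {0: 'Banking', 1: 'Insurance'}
```

Since nothing here validates the input, a malformed string simply yields fewer entries, or an empty dict, instead of raising an error.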