Assuming the strings are always delimited by /'s, here's how I would do it in Python:
start1 = "A/B/C/D"
start2 = "B/D/E/A/B"
start3 = "D/A/A/B/D/C"
start4 = "C"
startList = [start1, start2, start3, start4]
print "startList: ", startList
fields = []
for start in startList:
    # split each string on '/' and collect every field in one flat list
    for field in start.split('/'):
        fields.append(field)
print "fields: ", fields
countDict = dict.fromkeys(fields)
print "countDict 1: ", countDict
for entry in countDict.keys():
    # replace the None placeholder with the number of occurrences
    countDict[entry] = fields.count(entry)
print "countDict 2: ", countDict
Here is what the print statements output:
startList: ['A/B/C/D', 'B/D/E/A/B', 'D/A/A/B/D/C', 'C']
fields: ['A', 'B', 'C', 'D', 'B', 'D', 'E', 'A', 'B', 'D', 'A', 'A', 'B', 'D', 'C', 'C']
countDict 1: {'A': None, 'C': None, 'B': None, 'E': None, 'D': None}
countDict 2: {'A': 4, 'C': 3, 'B': 4, 'E': 1, 'D': 4}
However, if the starting string is giant (millions of entries) and speed really matters, Python is probably not your best choice. It's easy to learn and very readable (and my favorite language), but it's just not as fast as compiled languages like C. That being said, it's fast enough for the vast majority of applications.
A note on this particular method: there are plenty of 'fancier' ways to count the entries in a list. Many are faster and more "pythonic" (this one rescans the whole list once per distinct entry), but it should suffice for your purposes. If you want to see these methods, just do a quick search around the site. If anything in this method is unclear, let me know. Hope this helps!
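For reference, one of those 'fancier' ways is collections.Counter from the standard library (Python 2.7 and later), which tallies everything in a single pass. This is only a minimal sketch of that alternative, not the method used above; the function name count_fields and its sep parameter are made up for illustration.

```python
from collections import Counter


def count_fields(strings, sep="/"):
    """Count how often each field occurs across all delimited strings.

    count_fields and sep are hypothetical names chosen for this sketch.
    """
    counts = Counter()
    for s in strings:
        # Counter.update adds one to the tally of every element it is given
        counts.update(s.split(sep))
    return counts


startList = ["A/B/C/D", "B/D/E/A/B", "D/A/A/B/D/C", "C"]
counts = count_fields(startList)
# sort the items so the printed order does not depend on the Python version
print(sorted(counts.items()))
# [('A', 4), ('B', 4), ('C', 3), ('D', 4), ('E', 1)]
```

The result behaves like a normal dict, so counts['A'] gives 4 just as countDict['A'] does in the version above.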
If what you want is the number of unique entries in each string, this is what you're looking for:
start1 = "A/B/C/D"
start2 = "B/D/E/A/B"
start3 = "D/A/A/B/D/C"
start4 = "C"
startList = [start1, start2, start3, start4]
print "startList: ", startList
countDict = dict.fromkeys(startList)
print "countDict 1: ", countDict
for start in startList:
    # a set drops duplicates, so its length is the number of unique fields
    countDict[start] = len(set(start.split('/')))
print "countDict 2: ", countDict
Here is what the print statements output:
startList: ['A/B/C/D', 'B/D/E/A/B', 'D/A/A/B/D/C', 'C']
countDict 1: {'B/D/E/A/B': None, 'A/B/C/D': None, 'C': None, 'D/A/A/B/D/C': None}
countDict 2: {'B/D/E/A/B': 4, 'A/B/C/D': 4, 'C': 1, 'D/A/A/B/D/C': 4}
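If you would rather skip the None placeholders, the same per-string result can be built in one step with a dict comprehension. Again, this is just a sketch under the same assumption that / is the only delimiter; unique_counts and sep are names invented for the example.

```python
def unique_counts(strings, sep="/"):
    """Map each string to the number of distinct fields it contains.

    unique_counts and sep are hypothetical names chosen for this sketch.
    """
    # set() removes repeated fields; len() then counts what is left
    return {s: len(set(s.split(sep))) for s in strings}


startList = ["A/B/C/D", "B/D/E/A/B", "D/A/A/B/D/C", "C"]
result = unique_counts(startList)
print(result["D/A/A/B/D/C"])
# 4
```

Note that identical input strings collapse into a single key here, exactly as they do with dict.fromkeys above.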