0

These are two strings:

a=  PVT corner         |    TYP_25    |    SLOW_125 |    SLOW_0_   |    SLOW_40 |    FAST_125 

b= Description         |  RD   |  WR | A   |  RD     |  WR     |  RD     |  WR     |  RD     |  WR     |  RD     |  WR     

I need to check the width of each item in "a" and compare it with "b". If any "|" is found within the span of an item in "b", the pieces of "b" have to be concatenated with the corresponding item of "a"; for example, "RD", "WR" and "A" should be concatenated with TYP_25 of "a". How can I merge the two strings based on this condition?

no1
  • Have you tried anything, yet? – filmor Sep 05 '13 at 09:18
  • I tried to split and concatenate, but it didn't work. – no1 Sep 05 '13 at 09:20
  • It would be better if you posted your code. Is the number of sub-items in string b under a given item in string a always the same, that is 3 for the first and 2 for the others? I need more examples of possible string A's. Also, is the field length in strings a and b identical? – chryss Sep 05 '13 at 09:20
  • I would search for | in a, keep track of each range (e.g. first result to 2nd result, 2nd + 1 to 3rd, and so on), take the substring of b for that same range, split it on '|', and concatenate the results onto the substring from that range of a; then do whatever you want with the result. EDIT: if this is fixed width then it becomes much simpler: create a and b sublists for each column (http://stackoverflow.com/questions/6372228/how-to-parse-a-list-or-string-into-chucks-of-fixed-length), then split, map trim, and join each piece of b and add it to the corresponding a element. – ameer Sep 05 '13 at 09:26
  • What does "concatenate" mean? – Bor Sep 05 '13 at 09:26
  • 1
    This looks like a fixed-width format; are all the columns the same width perhaps? – Martijn Pieters Sep 05 '13 at 09:26
  • What's the expected output? – Ashwini Chaudhary Sep 05 '13 at 09:43

3 Answers

4

You have a | character every 20 positions; split your strings into sections of 20 characters, and pair up the results:

def by_width(line, width=20, stripchars='|'):
    i = 0
    while i < len(line):
        yield line[i:i+width].strip(stripchars)
        i += width

Zipping the results together gives:

>>> for column_a, column_b in zip(by_width(a), by_width(b)):
...     print [column_a.strip()] + [v.strip() for v in column_b.split('|')]
... 
['PVT corner', 'Description']
['TYP_25_0P85', 'RD', 'WR', 'A']
['SLOW_125_0P765', 'RD', 'WR']
['SLOW_0_0P765', 'RD', 'WR']
['SLOW_M40_0P765', 'RD', 'WR']
['FAST_125_0P935', 'RD', 'WR']

From there on out you can do what you want with the columns; in the above sample I merely put them together into lists of whitespace-stripped strings.
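As one possible follow-up, here is a minimal sketch of turning the paired columns into merged strings. The sample rows are hypothetical: they are built so that every cell is 19 characters wide and followed by a `|`, which puts a separator on every 20th position. The `merged` list and the `key: values` output format are assumptions for illustration, not something the question specifies.

```python
def by_width(line, width=20, stripchars='|'):
    # Same idea as the generator above: fixed-width slices, edge separators removed.
    for i in range(0, len(line), width):
        yield line[i:i + width].strip(stripchars)

# Hypothetical sample rows: each cell is padded to 19 characters and followed
# by '|', so a separator falls on every 20th position.
a_cells = ['PVT corner', 'TYP_25', 'SLOW_125']
b_cells = ['Description', 'RD | WR | A', 'RD | WR']
a = ''.join(cell.center(19) + '|' for cell in a_cells)
b = ''.join(cell.center(19) + '|' for cell in b_cells)

merged = []
for column_a, column_b in zip(by_width(a), by_width(b)):
    # Split the b slice on its inner separators and attach the parts to the a item.
    parts = [v.strip() for v in column_b.split('|')]
    merged.append('%s: %s' % (column_a.strip(), ', '.join(parts)))

for line in merged:
    print(line)
# PVT corner: Description
# TYP_25: RD, WR, A
# SLOW_125: RD, WR
```

Building the rows with `center(19)` keeps the sketch independent of hand-counted spacing; with real data the width would have to match the actual file.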

Martijn Pieters
1

If the format is confirmed, I think this would be OK:

tmp_flag1 = "#"
tmp_flag2 = "|"
tmp_str1 = a.replace(tmp_flag1, "")
tmp_str2 = b.replace(tmp_flag1, "")

tmp_str3 = ""
tmp_pos_head = 0
tmp_pos_tail = 0
tmp_is_equal = False

tmp_ret = tmp_str1.find(tmp_flag2)
while tmp_ret != -1:
        tmp_pos_tail = tmp_ret
        if tmp_str2[tmp_ret] == tmp_flag2:
                tmp_buf1 = tmp_str1[tmp_pos_head:tmp_pos_tail].replace(tmp_flag2, "")
                tmp_buf2 = tmp_str2[tmp_pos_head:tmp_pos_tail].replace(tmp_flag2, "")
                tmp_str3 += tmp_buf1 + ":" + tmp_buf2 + "\n"
                tmp_pos_head = tmp_ret + 1
                tmp_is_equal = True

        tmp_ret = tmp_str1.find(tmp_flag2, tmp_ret + 1)

if tmp_is_equal == True:
        tmp_buf1 = tmp_str1[tmp_pos_tail:].replace(tmp_flag2, "")
        tmp_buf2 = tmp_str2[tmp_pos_tail:].replace(tmp_flag2, "")
else:
        tmp_buf1 = tmp_str1[tmp_pos_head:].replace(tmp_flag2, "")
        tmp_buf2 = tmp_str2[tmp_pos_head:].replace(tmp_flag2, "")
tmp_str3 += tmp_buf1 + ":" + tmp_buf2

print tmp_str3
Mark
1

Using zip works regardless of whether the column widths are equal.

a = "  PVT corner         |    TYP_25    |    SLOW_125    SLOW_0_   |    SLOW_M40|    FAST_12 "
b = " Description         |  RD   |  WR | A   |  RD     |  WR     |  RD     |  WR     |  RD     |  WR     |  RD     |  WR     "
head = 0
res = []
for i,(s,t) in enumerate(zip(a,b)):
    if (s,t) == ("|","|"):
        res.append([a[head:i].strip()]+[m.strip() for m in b[head:i].split("|")])
        head = i + 1
res.append([a[head:].strip()]+[m.strip() for m in b[head:].split("|")])

for r in res:
    print r

The output is

['PVT corner', 'Description']
['TYP_25', 'RD', 'WR', 'A']
['SLOW_125', 'RD', 'WR']
['SLOW_0', 'RD', 'WR']
['SLOW_40', 'RD', 'WR']
['FAST_125', 'RD', 'WR']
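For variable-width columns, the same pairing can be sketched without visiting every character, by letting `str.find` jump from one separator in `a` to the next and checking whether `b` has a separator at the same index. The function name `merge_rows` and the sample columns below are hypothetical; the rows are generated so that the outer separators of both strings line up.

```python
def merge_rows(a, b, sep='|'):
    """Split a and b wherever both have sep at the same index and pair the pieces."""
    rows = []
    head = 0
    pos = a.find(sep)
    while pos != -1:
        # Only treat this as a column boundary if b has a separator here too.
        if pos < len(b) and b[pos] == sep:
            rows.append([a[head:pos].strip()] +
                        [m.strip() for m in b[head:pos].split(sep)])
            head = pos + 1
        pos = a.find(sep, pos + 1)
    # Whatever follows the last shared separator forms the final column.
    rows.append([a[head:].strip()] + [m.strip() for m in b[head:].split(sep)])
    return rows

# Hypothetical sample data: each column is as wide as its longest cell plus
# two spaces, so widths differ between columns but match between a and b.
columns = [('Corner', 'Description'), ('TYP_25', 'RD | WR | A'), ('SLOW_125', 'RD | WR')]
a = '|'.join(top.center(max(len(top), len(bottom)) + 2) for top, bottom in columns)
b = '|'.join(bottom.center(max(len(top), len(bottom)) + 2) for top, bottom in columns)

for row in merge_rows(a, b):
    print(row)
# ['Corner', 'Description']
# ['TYP_25', 'RD', 'WR', 'A']
# ['SLOW_125', 'RD', 'WR']
```

This still assumes the outer separators sit at identical indexes in both strings; if they do not, no boundary is detected and everything ends up in a single row.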
Alexander
Bor
  • 1
    This is a very inefficient solution, however. This loops over the inputs character by character instead of using more efficient slicing methods. Use this only if your column widths are variable, not fixed. – Martijn Pieters Sep 05 '13 at 11:11