
What I want to do

I am trying to parse the geometry information of a Nastran file using Python. My current attempts use NumPy and regular expressions. The data must be read quickly, and the result must be a NumPy array.

Nastran file format

A Nastran file can look like the following:

GRID           1        3268.616-30.0828749.8656    
GRID           2        3268.781  -3.-14749.8888
GRID           3        3422.488580.928382.49383
GRID           4        3422.488     10.-2.49383
...

I am only interested in the right part of each line, where the x, y and z coordinates are stored in consecutive fields of 8 characters each. A common representation of the coordinates above would be

3268.616, -30.0828, 749.8656    
3268.781,  -3.e-14, 749.8888
3422.488, 580.9283, 82.49383
3422.488,      10., -2.49383

What I tried so far

So far, I have tried to combine regular expressions and NumPy, avoiding Python for loops wherever possible so that the data is processed as fast as possible. After reading the complete file into memory and storing it in the fContent variable, I tried:

vertices = np.array(re.findall("^.{24}(.{8})(.{8})(.{8})", fContent, re.MULTILINE), dtype=float)

However, this fails for expressions like -3.-14. One solution would be to loop over the string tuples returned by the regex, substitute every .- with .e-, and then create the NumPy array from the resulting list of string tuples (not shown in the code above). However, I think this approach would be slow, since it loops over all tuples found by the regular expression and performs a substitution on each.

What I am looking for

I am looking for any fast way to read in the data. My current hopes are on a smart regular expression that successfully deals with the "-3.-14" problem. The regex would need to substitute all .- characters with .e- but only if the . is not at the end of an 8 character block. Up until now, I was not able to create such a regular expression. But as I said, any other fast way of reading in the data is also very welcome.
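One way to sidestep the block-boundary problem is to separate the fields first and mark the exponents afterwards. The following is only a minimal sketch under assumptions: the helper name `read_vertices` and the two pattern names are hypothetical, every line of interest has a 24-character prefix followed by three populated 8-character fields, and a sign that directly follows a digit or a dot inside a field is always the start of an exponent.

```python
import re
import numpy as np

# Assumed layout, taken from the sample lines: a 24-character prefix followed
# by three 8-character fields (x, y, z), all of them populated.
FIELDS = re.compile(r"^.{24}(.{8})(.{8})(.{8})", re.MULTILINE)

# Inside a single field, a sign directly after a digit or a dot is assumed to
# start an exponent, e.g. "-3.-14" is read as -3.e-14.
EXPONENT = re.compile(r"(?<=[0-9.])([+-])")


def read_vertices(content):
    """Hypothetical helper: return an (n, 3) float array of coordinates."""
    rows = FIELDS.findall(content)  # list of (x, y, z) string tuples
    # Separate the fields with spaces so that the leading sign of a
    # neighbouring field is never preceded by a digit or a dot.
    text = " ".join(map(" ".join, rows))
    # One substitution over the whole text instead of one per tuple.
    text = EXPONENT.sub(r"e\1", text)
    return np.array(text.split(), dtype=float).reshape(-1, 3)
```

Because the fields are already separated when the substitution runs, the `10.` followed by `-2.49383` in the last sample line is left alone, while `-3.-14` becomes `-3.e-14`. Lines shorter than 48 characters do not match the field pattern and are skipped. Blank coordinate fields are not handled by this sketch.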

Woltan
  • Possible duplicate of [Import nastran nodes deck in Python using numpy](http://stackoverflow.com/questions/33254351/import-nastran-nodes-deck-in-python-using-numpy) – WombatPM Aug 05 '16 at 07:01
  • @WombatPM This looks very much like a duplicate. Thank you for the reference. However, the linked post does not deal with the "`-3.-14`" problem. Otherwise, it is very much like how I do it at the moment, except that I have to add `converters` to the `genfromtxt` call. – Woltan Aug 05 '16 at 07:21

1 Answer


Would something like this work fine? Match the .- and replace with .e-.

Regex: (\.-)(?!(.{7})?$)

DEMO
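To see what this regex does and does not cover, here is a small check. It is only a sketch: the `re.MULTILINE` flag is an assumption (so that `$` matches at the end of every line), and the third input line is a hypothetical case that is not in the question, where `10.` ends the x field instead of the y field.

```python
import re

# Regex from the answer; MULTILINE is an assumption so that `$` matches at
# the end of each line rather than only at the end of the whole text.
pattern = re.compile(r"(\.-)(?!(.{7})?$)", re.MULTILINE)

prefix = "GRID".ljust(8) + "{:>8}" + " " * 8  # 24-character prefix

lines = "\n".join([
    prefix.format(2) + "3268.781  -3.-14749.8888",
    prefix.format(4) + "3422.488     10.-2.49383",
    # Hypothetical line: "10." closes the x field, "-30.0828" is the y field.
    prefix.format(5) + "     10.-30.0828749.8656",
])

fixed = pattern.sub(".e-", lines).splitlines()
for line in fixed:
    print(line)
```

With these inputs, `-3.-14` is rewritten to `-3.e-14` and the `10.` before `-2.49383` is left alone, because exactly 7 characters follow the `.-` before the end of the line. The hypothetical third line is rewritten to `10.e-30.0828...`, since the lookahead only protects a `.` that sits 8 characters before the end of the line. Trailing whitespace after the z field would shift that boundary as well.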

Anshul Rai
  • Thank you for your answer. I think your regex falls short, since it would also substitute the `10.` in the last line of my example above. This is actually a `10` and not a `10e-2`, because the 8-character block ends at the `.`, so it must not be substituted. – Woltan Aug 05 '16 at 06:54
  • Actually, I made changes only at Grid numbers `5` & `6`, where I changed the length of the second character block (the one containing `10.`). Number `4` is the example you provided, where `10.` is NOT matched. – Anshul Rai Aug 05 '16 at 08:58