I am trying to split a compound English sentence into sub-sentences using Python, and I need help figuring out the right way to do this.
I have looked at a similar question, "Finding meaningful sub-sentences from a sentence". I am asking a new question because, once I have identified the sub-statements in the original sentence, I also want to rephrase each of them as a new sentence.
Here is the minimal code I am currently using:
import re

def create_SLEW_Messages(req):
    print(req)
    # Each pass/fail statement has the form "Verify that " + InitPart + " will " + action.
    # First split the requirement on the "shall" keyword; the text before it is the subject.
    InitPart = re.split('shall', req, flags=re.IGNORECASE)[0].strip()
    # The first word after "shall" is the verb.
    verbWord = re.split('shall', req, flags=re.IGNORECASE)[1].split()[0]
    # The text between "shall" and "and" is the first action.
    TempPassFailStatement1 = re.split('and', req, flags=re.IGNORECASE)[0]
    PassFailStatement1 = re.split('shall', TempPassFailStatement1, flags=re.IGNORECASE)[1].strip()
    # The text after "and", up to the next comma, is the second action.
    PassFailStatement2 = re.split('and', req, flags=re.IGNORECASE)[1].split(",")[0].strip()
    returnPassFailStatement1 = "Verify that " + InitPart + " will " + PassFailStatement1
    print(returnPassFailStatement1)
    # Reuse the verb when the second action does not already contain it.
    if verbWord in PassFailStatement2:
        returnPassFailStatement2 = "Verify that " + InitPart + " will " + PassFailStatement2
    else:
        returnPassFailStatement2 = "Verify that " + InitPart + " will " + verbWord + " " + PassFailStatement2
    print(returnPassFailStatement2)
    return returnPassFailStatement1, returnPassFailStatement2
Precondition: the statement passed to the above function will always contain the keyword "shall".
Data 1 ==> Input 1 - The alpha tape shall move down for increasing xyz_alphaBeta and up for decreasing xyz_alphaBeta.
Output (actual):
Verify that The alpha tape will move down for increasing xyz_alphaBeta
Verify that The alpha tape will move up for decreasing xyz_alphaBeta.
The above output is what I need. However, when I pass a similar sentence with a more complex structure, my algorithm fails to detect the correct sub-sentences and produces incorrect or incomplete sentences, as shown below for Data 2.
Data 2 ==> Input 2 - The Minimum Data Bug shall be positioned on the alpha tape, move upwards for increasing fac_alphaV3 and downwards for decreasing fac_alphaV3.
Output (actual):
Verify that The Minimum Data Bug will be positioned on the alpha tape, move upwards for increasing fac_alphaV3
Verify that The Minimum Data Bug will be downwards for decreasing fac_alphaV3.
Output (required):
Verify that The Minimum Data Bug will be positioned on the alpha tape, and move upwards for increasing fac_alphaV3
Verify that The Minimum Data Bug will be positioned on the alpha tape, and move downwards for decreasing fac_alphaV3.
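To make the required behaviour concrete, here is a minimal rule-based sketch that produces the required output for the two inputs above. It is not a general solution. The function name `split_requirement` is hypothetical, and the sketch assumes that the sentence contains exactly one "shall", that any clause shared by both statements ends at the last comma, and that the two alternatives are joined by a single "and".

```python
import re

def split_requirement(req):
    """Hypothetical helper: build "Verify that ..." statements from one requirement."""
    # Subject is the text before "shall"; the predicate is everything after it.
    subject, predicate = re.split(r'\bshall\b', req, maxsplit=1, flags=re.IGNORECASE)
    subject, predicate = subject.strip(), predicate.strip()

    # Assumption: a clause before the last comma is shared by both statements.
    prefix = ""
    rest = predicate
    if "," in predicate:
        shared, rest = predicate.rsplit(",", 1)
        prefix = shared.strip() + ", and "

    # Split the remaining text into alternatives on the whole word "and".
    parts = [p.strip() for p in re.split(r'\band\b', rest, maxsplit=1, flags=re.IGNORECASE)]

    # Carry the verb of the first alternative over to the second if it is missing.
    verb = parts[0].split()[0]
    actions = [p if p.split()[0] == verb else verb + " " + p for p in parts]

    return ["Verify that " + subject + " will " + prefix + a for a in actions]

for statement in split_requirement(
    "The Minimum Data Bug shall be positioned on the alpha tape, "
    "move upwards for increasing fac_alphaV3 and downwards for decreasing fac_alphaV3."
):
    print(statement)
```

The sketch still relies on fixed keywords and punctuation, so a sentence with a different structure (for example, more than two alternatives, or a comma inside an alternative) would need a different rule.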
PS: I understand that regex or string splitting is not a robust way to split natural-language text, which can vary from one form to another. That is why I am looking for suggestions on a better approach.
Any suggestions or inputs are welcome!
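One concrete example of that fragility: a plain 'and' pattern also matches inside longer words. The snippet below uses a made-up requirement sentence (it is not one of my real inputs) to compare the plain pattern with a word-boundary pattern, `\band\b`.

```python
import re

# Hypothetical requirement; "standby" contains the letters "and".
req = "The standby indicator shall flash and dim."

# The plain pattern splits inside "standby" as well as on the conjunction.
naive = re.split('and', req, flags=re.IGNORECASE)

# With word boundaries, only the standalone word "and" is a split point.
bounded = re.split(r'\band\b', req, flags=re.IGNORECASE)

print(naive)    # ['The st', 'by indicator shall flash ', ' dim.']
print(bounded)  # ['The standby indicator shall flash ', ' dim.']
```

Word boundaries remove this particular failure, but they do not help with deciding which words each sub-sentence should share.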