I'm fairly new to Python. I have data constantly being piped to a Python script that reads each line with sys.stdin.readline() and then uses re.search to filter for a specific piece of information. The problem is that it only reads the first string that comes in and then exits.

while True:
    the_line = sys.stdin.readline()
    m = re.search(',"data":"(.+?)}]}', the_line)
    if m:
        print(m.group(1))

Sample input (sorry, I know it is messy):

stat update: {"stat":{"time":"2018-02-03 19:37:59       GMT","lati":6.81661,"long":-       58.11185,"alti":0,"rxnb":0,"rxok":0,"rxfw":0,"ackr":0.0,"dwnb":0,"txnb":0,"pfrm":"Single Channel Gateway","mail":"kevic.lall@yahoo.com","desc":"433 MHz          gateway test project 1.0"}}
Packet RSSI: -56, RSSI: -97, SNR: 9, Length: 10
rxpk update: {"rxpk":                                                                                                                                                                  [{"tmst":4153364745,"chan":0,"rfch":0,"freq":433.000000,"stat":1,"modu":"LORA"   ,"datr":"SF7BW125","codr":"4/5","lsnr":9,"rssi":-   56,"size":10,"data":"aGVsbG8gMzA1Nw=="}]}
 Packet RSSI: -49, RSSI: -96, SNR: 9, Length: 10
rxpk update: {"rxpk":[{"tmst":4155404009,"chan":0,"rfch":0,"freq":433.000000,"stat":1,"modu":"LORA","datr":"SF7BW125","codr":"4/5","lsnr":9,"rssi":-49,"size":10,"data":"aGVsbG8gMzA1OA=="}]}
Packet RSSI: -51, RSSI: -97, SNR: 9, Length: 10
....

These are just a couple of lines of what is constantly streaming.

NOTE: The input does not arrive all at once as shown here; it appears line by line as the program I am piping into the Python script continues to run.

Thus, the output I want should be:

aGVsbG8gMzA1Nw=="
aGVsbG8gMzA1OA=="
....

constantly streaming

But instead of that, nothing is printed; the program just hangs until I manually hit Ctrl+C.

The first string just exits because it doesn't contain the required information. Even if I change the pattern to filter for something that is there, it prints the output I want, then exits, and stops the program being piped to the Python script as well. Is there a more efficient way to read and filter the information? Can I still use the re.search function?

Also, the reason I am reading it line by line with sys.stdin.readline() is that I want to filter each line to send via MQTT.

Edited for clarity

kl27
  • Please provide some sample input, the actual output, and the desired output (and of course any error messages). This helps us understand the problem and test solutions in line with your expectations. – sorak Feb 12 '18 at 03:40
  • recently edited – kl27 Feb 12 '18 at 03:56
  • if you horizontally scroll to the end of the line you will see "data" there; sorry, it's a bit messy – kl27 Feb 12 '18 at 04:11
  • Your code works for me, with the only change being indenting "print..." because it is inside the conditional for m – sorak Feb 12 '18 at 04:27
  • Strange, but do remember that the input does not appear complete like this; rather, it arrives a line at a time. Sorry if I wasn't clear about that before. What happens is that when the 1st line appears, the script is unable to find what is specified (as it is not in that line) and so just hangs there – kl27 Feb 12 '18 at 04:30
  • Does it work if you indent the print line? – sorak Feb 12 '18 at 04:32
  • I will try it in the morning; I'm shutting down for the night, but I will let you know how it goes – kl27 Feb 12 '18 at 04:33
  • Actually, the way the print appears there is a mistake from when I pasted it, sorry. It still doesn't work :( Just edited it – kl27 Feb 12 '18 at 04:35

3 Answers


I am not seeing the same behavior.

import sys
import re
while True:
    the_line = sys.stdin.readline()
    m = re.search('he(.+?)you', the_line)
    if m:
        print(m.group(1))

I run the program and am prompted to type and hit Enter. Your regex is tested against whatever I type, and the matching group is printed. Then the prompt is returned to me, and I can type another string of characters all over again. The program does not stop for me, and from your code there should be no reason for the program to end.

Your code is fairly efficient. There are other ways to prompt for input in Python just as there are other ways to search strings. Check out:

It depends on what you are searching for. If you are searching for a pattern that varies, you are not going to get much faster than re.search(). But if you know the exact phrase, or are looking for a small group of exact phrases, str.find() or the in operator may be faster.
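As a rough sketch of that idea, a cheap substring test with the in operator can reject most lines before the regex runs at all. The helper name extract_data is made up for illustration, and whether this is actually faster for your stream is an assumption you would want to measure (for example with timeit).

```python
import re

# Same pattern as in the question, compiled once up front.
DATA_RE = re.compile(r',"data":"(.+?)}]}')

def extract_data(line):
    """Return the captured payload, or None if the line has no "data" field.

    extract_data is a hypothetical helper name, not part of any library.
    """
    # Cheap substring test first; lines such as "Packet RSSI: ..." fail it
    # and never reach the regex engine.
    if '"data":"' not in line:
        return None
    m = DATA_RE.search(line)
    return m.group(1) if m else None
```

Calling extract_data(the_line) inside the read loop would then replace the direct re.search call.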

coltoneakins

Try it like this using this pattern: (?<="data":")[\w=]+(?=")

import sys
import re

# Lookarounds match only the value between "data":" and the closing quote.
regex = r'(?<="data":")[\w=]+(?=")'
while True:
    text = sys.stdin.readline()
    if not text:  # readline() returns '' at end of input
        break
    for match in re.finditer(regex, text):
        print(match.group())
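As a quick check, here is the same pattern applied to a shortened version of one of the sample lines from the question. The sample string is trimmed for readability; the variable names are only for illustration.

```python
import re

regex = r'(?<="data":")[\w=]+(?=")'

# Shortened copy of an "rxpk update" line from the question.
sample = 'rxpk update: {"rxpk":[{"rssi":-49,"size":10,"data":"aGVsbG8gMzA1OA=="}]}'

# The lookbehind and lookahead are not part of the match, so only the
# value itself is returned, without the surrounding quotes.
matches = [m.group() for m in re.finditer(regex, sample)]
print(matches)
```

Unlike the pattern in the question, this does not capture the trailing quote. One caveat: [\w=] does not include + or /, which standard Base64 can contain, so a payload with those characters would not match unless the character class is widened.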
wp78de

The following script, which contains minor modifications, works well for me:

import fileinput
import re

for the_line in fileinput.input():                                              
    m = re.search(',"data":"(.+?)}]}', the_line)
    if m:
        print (m.group(1))

Output:

aGVsbG8gMzA1Nw=="
aGVsbG8gMzA1OA=="
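If the script's own output is piped onward (for example to something that publishes over MQTT), output buffering may delay what you see. The sketch below wraps the same filter in a function and flushes after every match. That buffering is the cause of the hang is only an assumption here; the producing program may also buffer its output when it writes to a pipe. The name filter_lines is hypothetical.

```python
import re
import sys

# Same pattern as above, compiled once.
DATA_RE = re.compile(r',"data":"(.+?)}]}')

def filter_lines(lines, out=sys.stdout):
    """Print the captured group for each matching line.

    lines can be any iterable of strings, such as sys.stdin or
    fileinput.input(). filter_lines is a hypothetical helper name.
    """
    for line in lines:
        m = DATA_RE.search(line)
        if m:
            # flush=True pushes each match out immediately instead of
            # waiting for the output buffer to fill.
            print(m.group(1), file=out, flush=True)
```

To use it on piped input, call filter_lines(sys.stdin); the loop ends on its own when the input reaches end of file.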
sorak