I'm trying to read only the last line of a message and then detect a specific piece of text that sits between particular spaces on that line.

Message:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed quis lacus efficitur, efficitur mauris et, vulputate sapien. Duis pellentesque semper diam, vel lacinia nisl facilisis quis. 

Morbi placerat, elit ut finibus faucibus, nisi metus luctus quam, ut scelerisque neque lectus sed lorem. Maecenas eu lectus tincidunt, hendrerit ex quis, ullamcorper arcu.

Trade BTC/USDT on Cryptocom

I only want to look at the last line, "Trade BTC/USDT on Cryptocom", and extract just "BTC/USDT" from it.

I tried the following code:

message = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed quis lacus efficitur, efficitur mauris et, vulputate sapien. Duis pellentesque semper diam, vel lacinia nisl facilisis quis. 
    
Morbi placerat, elit ut finibus faucibus, nisi metus luctus quam, ut scelerisque neque lectus sed lorem. Maecenas eu lectus tincidunt, hendrerit ex quis, ullamcorper arcu.
    
Trade BTC/USDT on Cryptocom"
    
pairs = ["TC/USDT", "OTC/USDT", "BTC/USDT"]

for pair in pairs:
    if pair in message:
        print(f"PAIR FOUND: {pair}")

The problem with the output is that it also reports similar-looking pairs such as TC/USDT as found, because `in` does substring matching and "TC/USDT" is contained in "BTC/USDT".

I'd like to know how to detect only "BTC/USDT" in "Trade BTC/USDT on Cryptocom" (that is, the text delimited by the spaces after "Trade" and before "on"), ideally with a reliable method that does not get confused by similar pairs.

What's the proper way?

Cassano
  • 1
    You need to use triple quotes to write a string literal with newlines in it. You should be getting a syntax error. – Barmar Feb 28 '22 at 19:50
  • That message was a demonstration, and thank you for telling me. The actual message is a response, which I then add to a variable like this: variable = response so it's doing fine, but I understand what you mean! – Cassano Feb 28 '22 at 19:54

2 Answers

2

Use splitlines() to turn the message into a list of lines. Then you can test just the last element of the list.

To match only whole words, use a regular expression with \b to match word boundaries.

import re

lastline = message.splitlines()[-1]

for pair in pairs:
    if re.search(fr"\b{re.escape(pair)}\b", lastline):
        print(f"PAIR FOUND: {pair}")
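Put together, the two steps above might look like the following self-contained sketch. The message is a shortened version of the one in the question, and the helper name `find_pairs` is made up for illustration.

```python
import re

# Shortened sample message; triple quotes allow the embedded newlines
message = """Lorem ipsum dolor sit amet.

Morbi placerat, elit ut finibus faucibus.

Trade BTC/USDT on Cryptocom"""

pairs = ["TC/USDT", "OTC/USDT", "BTC/USDT"]


def find_pairs(text, candidates):
    """Return the candidates that appear as whole words on the last line of text."""
    lines = text.splitlines()
    if not lines:  # empty message: nothing to search
        return []
    last_line = lines[-1]
    return [
        pair
        for pair in candidates
        # \b on both sides rejects "TC/USDT" inside "BTC/USDT"
        if re.search(rf"\b{re.escape(pair)}\b", last_line)
    ]


found = find_pairs(message, pairs)
print(found)  # ['BTC/USDT']
```

Note that if the message can end with blank lines, the last element of `splitlines()` may be empty, so you may want to strip the message first.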
Barmar
  • 1
    I think OP's actual question was "how to prevent `PAIR FOUND: TC/USDT` when the last line contains `BTC/USDT`" – Pranav Hosangadi Feb 28 '22 at 19:55
  • The thing is, I don't want to iterate over "pairs" and instead find "BTC/USDT" directly such as going to the last line and then detect text between "Trade" and "on" which is "BTC/USDT"... to be reliable as it gets confused otherwise and also throws out similar pairs that end like BTC/USDT, such as TC/USDT, OTC/USDT... you get the idea. – Cassano Feb 28 '22 at 19:57
  • 2
    I've updated the answer to show how to do it with a regular expression. – Barmar Feb 28 '22 at 19:58
1

Once you've found the last line using Barmar's answer, you could split on spaces and see if any of the words in that line are equal to your pair:

lastline_words = lastline.split()

for pair in pairs:
    if any(pair == word for word in lastline_words):
        print(f"PAIR FOUND: {pair}")

This prints

PAIR FOUND: BTC/USDT
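If the list of pairs is long, a variation on the same idea is to turn it into a set and loop over the words of the line instead. This is only a sketch: `lastline` is hard-coded here so the snippet runs on its own, and `known_pairs` is a name I made up.

```python
lastline = "Trade BTC/USDT on Cryptocom"
pairs = ["TC/USDT", "OTC/USDT", "BTC/USDT"]

# A set gives fast membership tests, which helps when there are many pairs
known_pairs = set(pairs)

# Keep only the words of the last line that are exactly equal to a known pair
matches = [word for word in lastline.split() if word in known_pairs]
print(matches)  # ['BTC/USDT']
```

Because this compares whole words, a pair with punctuation attached (for example "BTC/USDT,") would not match unless you strip the punctuation first.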

Alternatively, you could use a regular expression with word boundaries around your search term:

import re

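pairs_re = [re.compile(rf"\b{re.escape(pair)}\b") for pair in pairs]

# Iterate over each pair together with its compiled pattern so the right name is printed
for pair, rexp in zip(pairs, pairs_re):
    if rexp.search(lastline):
        print(f"PAIR FOUND: {pair}")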
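One caveat with word boundaries: `\b` treats any non-word character as a boundary, so a pair can still match when it is glued to other text by punctuation. If that matters for your messages, whitespace lookarounds are a stricter alternative. The line below containing "ETH-BTC/USDT" is a made-up example, not something from the question.

```python
import re

pair = "BTC/USDT"
line = "Trade ETH-BTC/USDT on Cryptocom"  # hypothetical input for illustration

# \b sees a boundary between "-" and "B", so this matches inside "ETH-BTC/USDT"
word_boundary = re.search(rf"\b{re.escape(pair)}\b", line)

# (?<!\S) and (?!\S) require whitespace or the edge of the string on both sides
whitespace_delimited = re.search(rf"(?<!\S){re.escape(pair)}(?!\S)", line)

print(bool(word_boundary))         # True
print(bool(whitespace_delimited))  # False
```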

Re. your comment:

I'm trying to not iterate over a list of pairs to find if that pair exists, instead, I'm trying to go to last line, and then directly detect the text between the words "Trade" and "on"... which will effectively result in what I'm looking for

If you want to simply look for whatever exists between "Trade" and "on", you can use a regular expression to capture that:

match = re.search(r"Trade (.*?) on", lastline)
if match:
    print(f"PAIR FOUND: {match.group(1)}")

The regex simply captures everything between the literal words "Trade " and " on". If you want it to be more specific, e.g. if you always want the second part of the captured group to be "/USDT", you can do:

match = re.search(r"Trade (.*?/USDT) on", lastline)
if match:
    print(f"PAIR FOUND: {match.group(1)}")

If the search terms aren't found in your lastline, match will be None.
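As a rough end-to-end sketch, you could wrap this in a small function that finds the last non-empty line and returns `None` when nothing matches. The function name `extract_pair` and the stricter pattern are my own assumptions: the pattern assumes a pair is made of uppercase letters or digits on both sides of a slash.

```python
import re

# Assumed pair shape: uppercase letters/digits, a slash, uppercase letters/digits
PAIR_PATTERN = re.compile(r"Trade\s+([A-Z0-9]+/[A-Z0-9]+)\s+on\b")


def extract_pair(message):
    """Return the pair between "Trade" and "on" on the last non-empty line, or None."""
    lines = [line for line in message.splitlines() if line.strip()]
    if not lines:  # empty or whitespace-only message
        return None
    match = PAIR_PATTERN.search(lines[-1])
    return match.group(1) if match else None


print(extract_pair("Some text.\n\nTrade BTC/USDT on Cryptocom"))  # BTC/USDT
print(extract_pair("No trade line here"))  # None
```

Returning `None` keeps the "not found" case explicit, so the caller can decide whether to skip the message or log it.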

Pranav Hosangadi
  • That's a good answer, but I'm trying to not iterate over a list of pairs to find if that pair exists, instead, I'm trying to go to last line, and then directly detect the text between the words "Trade" and "on"... which will effectively result in what I'm looking for. – Cassano Feb 28 '22 at 20:06
  • 1
    @Cassano See the edit I just made. – Pranav Hosangadi Feb 28 '22 at 23:14
  • 1
    That's the best answer. So much more to learn! Thank you bunches! – Cassano Feb 28 '22 at 23:18