I wrote a simple Python script that scrapes a particular website.

Here is the sample code:

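import requests
site = 'http://www.example.com'  # requests needs the URL scheme
page = requests.get(site)
contents = page.content  # response body as raw bytes
# Append in binary mode, since contents is bytes
with open("text.txt", "ab") as f:
    f.write(contents)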

After that, I filtered the data to extract the text from a particular tag using this code (not the best approach, though):

words = []
with open("text.txt", "rb") as f:
    for line in f:
        # Keep only lines that start with an <li> tag
        if line.startswith(b"<li>"):
            try:
                words.append(line.decode("utf-8"))
            except UnicodeDecodeError:
                pass  # skip lines that are not valid UTF-8
for a in words:
    print(a)

Although I can fetch the data I want, a problem arises when I try to fetch text containing an emoji.

Here is a snippet from my output:

I am pretty happy ☺ coz i can easily recall this ☝stuff
#x1f60f;&#x1f60f;

So, any idea how to convert this "#x1f60f" into an emoji?

P.S. I am also trying to save this in Firebase, but these "#x1f60f" sequences still show up there.

  • Use the decode function, check out this [answer](https://stackoverflow.com/questions/41604811/python-unicode-character-conversion-for-emoji#answer-41605038) –  Sep 27 '17 at 08:18

1 Answer

  1. Take the part after #x (#x1f60f -> 1f60f).

  2. Pad it with leading zeros to 8 hex digits, as Python's \U escape requires: 1f60f -> 0001f60f.

  3. Use the padded value in a \U escape:

emoji = "\U0001f60f"
print(emoji)
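
The literal above only works when the code point is known in advance. For scraped text, the same steps could be done at runtime; the following is a minimal sketch, assuming Python 3. The helper names `entity_to_char` and `unescape_entities` are made up for illustration. The first follows the steps above by hand, and the second relies on the standard library's `html.unescape`, which decodes complete references such as `&#x1f60f;`.

```python
import html


def entity_to_char(entity):
    """Convert '#x1f60f' or '&#x1f60f;' to the character it names."""
    # Step 1: drop the surrounding '&' and ';' (if any) and the '#x' prefix
    hex_digits = entity.strip("&;").lower().replace("#x", "", 1)
    # Steps 2 and 3: int() parses the hex digits directly, so no zero-padding
    # is needed here; chr() returns the character for that code point
    return chr(int(hex_digits, 16))


def unescape_entities(text):
    """Replace every complete HTML character reference in text."""
    return html.unescape(text)


print(entity_to_char("#x1f60f"))
print(unescape_entities("I am pretty happy &#x1f60f;&#x1f60f;"))
```

Note that `html.unescape` only recognises a reference that still has its leading `&`, so a bare `#x1f60f;` would be left unchanged by it. The unescaped string, rather than the raw scraped one, is presumably what should then be written to Firebase.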

SwiftStudier