I am trying to create a Python script that reads data from a text file and checks whether each line starts with a dot followed by exactly two letters, which will tell me whether it is a country code. I have tried using split and other methods but have not gotten it to work. Here is the code I have so far:
# Python program to
# demonstrate reading files
# using for loop
import re
file2 = open('contry.txt', 'w')
file3 = open('noncountry.txt', 'w')
# Opening file
file1 = open('myfile.txt', 'r')
count = 0
noncountrycount = 0
countrycounter = 0
# Using for loop
print("Using for loop")
for line in file1:
    count += 1
    pattern = re.compile(r'^\.\w{2}\s')
    if pattern.match(line):
        print(line)
        countrycounter += 1
    else:
        print("fail", line)
        noncountrycount += 1
print(noncountrycount)
print(countrycounter)
file1.close()
file2.close()
file3.close()
The text file contains this:
.aaa generic American Automobile Association, Inc.
.aarp generic AARP
.abarth generic Fiat Chrysler Automobiles N.V.
.abb generic ABB Ltd
.abbott generic Abbott Laboratories, Inc.
.abbvie generic AbbVie Inc.
.abc generic Disney Enterprises, Inc.
.able generic Able Inc.
.abogado generic Minds + Machines Group Limited
.abudhabi generic Abu Dhabi Systems and Information Centre
.ac country-code Internet Computer Bureau Limited
.academy generic Binky Moon, LLC
.accenture generic Accenture plc
.accountant generic dot Accountant Limited
.accountants generic Binky Moon, LLC
.aco generic ACO Severin Ahlmann GmbH & Co. KG
.active generic Not assigned
.actor generic United TLD Holdco Ltd.
.ad country-code Andorra Telecom
.adac generic Allgemeiner Deutscher Automobil-Club e.V. (ADAC)
.ads generic Charleston Road Registry Inc.
.adult generic ICM Registry AD LLC
.ae country-code Telecommunication Regulatory Authority (TRA)
.aeg generic Aktiebolaget Electrolux
.aero sponsored Societe Internationale de Telecommunications Aeronautique (SITA INC USA)
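As a quick sanity check, the pattern can be tried on a few of these lines in isolation, without reading any file. This is only a minimal sketch: `is_country_code` is a hypothetical helper name, and it uses `[a-z]{2}` instead of `\w{2}` on the assumption that only letters should count (`\w` would also accept digits and underscores).

```python
import re

# A dot at the start of the line, exactly two letters, then whitespace.
COUNTRY_CODE = re.compile(r'^\.[a-z]{2}\s', re.IGNORECASE)

def is_country_code(line):
    """Hypothetical helper: True if the line begins with a two-letter TLD."""
    return COUNTRY_CODE.match(line) is not None

# A few lines copied from the sample data above.
samples = [
    ".aaa generic American Automobile Association, Inc.",
    ".ac country-code Internet Computer Bureau Limited",
    ".ad country-code Andorra Telecom",
    ".adac generic Allgemeiner Deutscher Automobil-Club e.V. (ADAC)",
]

# Keep only the TLD (first whitespace-separated field) of matching lines.
matches = [s.split()[0] for s in samples if is_country_code(s)]
print(matches)  # ['.ac', '.ad']
```

Because the pattern requires whitespace right after the two letters, longer names such as `.aaa` or `.adac` do not match.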
I am getting this error now:

  File "C:/Users/tyler/Desktop/Python Class/findcountrycodes/Test.py", line 15, in <module>
    for line in file1:
  File "C:\Users\tyler\AppData\Local\Programs\Python\Python36\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 8032: character maps to <undefined>
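The traceback shows the file being decoded with cp1252, in which byte 0x90 is not assigned to any character. One possible way around this is to pass an explicit encoding to `open`. The sketch below assumes the file is really UTF-8 (the traceback does not confirm this) and uses `errors='replace'` so that any byte that still cannot be decoded becomes a placeholder character instead of raising. `split_tlds` is a hypothetical function name, and the demo writes its own small input file with one made-up non-ASCII line so that it runs on its own.

```python
import os
import re
import tempfile

COUNTRY_CODE = re.compile(r'^\.[a-z]{2}\s', re.IGNORECASE)

def split_tlds(src_path, country_path, other_path, encoding='utf-8'):
    """Copy country-code lines to one file and all other lines to another.

    Returns (country_count, other_count). The 'utf-8' default is an assumption.
    """
    country = other = 0
    with open(src_path, 'r', encoding=encoding, errors='replace') as src, \
         open(country_path, 'w', encoding=encoding) as out_cc, \
         open(other_path, 'w', encoding=encoding) as out_other:
        for line in src:
            if COUNTRY_CODE.match(line):
                out_cc.write(line)
                country += 1
            else:
                out_other.write(line)
                other += 1
    return country, other

# Demo input: two lines from the sample data plus one made-up non-ASCII line.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, 'myfile.txt')
with open(src_path, 'w', encoding='utf-8') as f:
    f.write(".ac country-code Internet Computer Bureau Limited\n")
    f.write(".example generic Café Registry (made-up line)\n")
    f.write(".ad country-code Andorra Telecom\n")

counts = split_tlds(
    src_path,
    os.path.join(workdir, 'country.txt'),
    os.path.join(workdir, 'noncountry.txt'),
)
print(counts)  # (2, 1)
```

The `with` statement closes all three files even if an error occurs partway through, so the explicit `close()` calls are not needed in this version.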