
I have text in this format:

"TEXT1";" TEXT2";"TEXT3";"TEXT4";"TEXT5 ";"";"TEXT6"
"TEXT7";" TEXT8";"TEXT9";"TEXT10";"TEXT11";"";"TEXT12"

I got this data by web scraping, and I want to delete the empty string at index [5] of each row. How can I do this inside the loop? The code:

for record in table.find_all('tr', class_="mytable"):
    temp_data = []
    for data in record.find_all("td"):
        temp_data.append(data.text.encode('latin-1'))
    datatable.append(temp_data)
    #how can I delete the [5] here?
Klaus D.
tardos93
  • Duplicate? https://stackoverflow.com/questions/3845423/remove-empty-strings-from-a-list-of-strings – Darkaird Jul 28 '17 at 07:33
  • This is a slightly different question. – tardos93 Jul 28 '17 at 07:34
  • Does the text include the quotes? Also, is it on multiple lines? Which variable stores the text? – cs95 Jul 28 '17 at 07:40
  • Not in the CSV file, but if I open it in Notepad++ I see these quotes. Yes, every row contains this empty string; that's why I want to delete item [5]. – tardos93 Jul 28 '17 at 07:42
  • I like how you said this was different to the linked dupe then accepted an answer which is identical to the accepted answer in that dupe :) – SiHa Jul 28 '17 at 07:55
  • Possible duplicate of [Remove empty strings from a list of strings](https://stackoverflow.com/questions/3845423/remove-empty-strings-from-a-list-of-strings) – SiHa Jul 28 '17 at 07:56

3 Answers

2

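If you want to remove empty strings from a list, you can simply use filter. Passing None as the function drops every falsy item, which includes "" (and b"" for encoded text). In Python 3, filter returns an iterator, so wrap it in list():

newlist = list(filter(None, oldlist))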
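
As a small sketch of what filter(None, ...) does, assuming a row shaped like the one in the question (the name `row` and the sample values are illustrative only): it drops every falsy item, not just empty strings, so a comparison-based list comprehension may be the safer choice when values such as 0 must be kept.

```python
row = ["TEXT1", " TEXT2", "", "TEXT6"]

# filter(None, ...) keeps only truthy items; list() is needed on Python 3
without_empty = list(filter(None, row))

# List comprehension that removes only empty strings
only_non_empty = [cell for cell in row if cell != ""]

# With mixed data the two differ: filter(None, ...) also drops 0 and None
mixed = ["a", "", 0, None, "b"]
filtered_mixed = list(filter(None, mixed))
compared_mixed = [item for item in mixed if item != ""]
```

For rows that contain only strings, both approaches give the same result.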
Mohamed Thasin ah
0
a = ["a", "b", "", "d"]
i = 0
while i < len(a):
    if a[i] == "":
        del a[i]    # the next item shifts into position i, so do not advance
    else:
        i += 1
# a is now ["a", "b", "d"]
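
A shorter in-place alternative to the index loop, sketched with the same sample list: slice assignment replaces the contents of the existing list object, so any other reference to that list sees the change too. The name `alias` is only there to illustrate this.

```python
a = ["a", "b", "", "d"]
alias = a  # second reference to the same list object

# Replace the list's contents in place, keeping only non-empty strings
a[:] = [item for item in a if item != ""]
```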
Ahmad
0

Instead of deleting the empty string, you could just not append it.

The code could look like this:

datatable = []
for record in table.find_all('tr', class_="mytable"):
    temp_data = []
    for data in record.find_all("td"):
        if data.text != "":     #Check if the data is an empty string or not
            temp_data.append(data.text.encode('latin-1'))   #Append the data if it is not an empty string
    datatable.append(temp_data)
#print(datatable)
#[["TEXT1"," TEXT2","TEXT3","TEXT4","TEXT5 ","TEXT6"],["TEXT7"," TEXT8","TEXT9","TEXT10","TEXT11","TEXT12"]]
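
If a scraped cell might contain only whitespace rather than a truly empty string, the same check can be done on the stripped text. This is a hedged variant that uses plain strings in place of the td elements; the `cells` list is made-up sample data, not output from the original page.

```python
cells = ["TEXT1", " TEXT2", "   ", "", "TEXT6"]  # stand-ins for data.text values

temp_data = []
for text in cells:
    # strip() is used only for the emptiness test; the stored value is unchanged
    if text.strip() != "":
        temp_data.append(text)
```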

This works well, but if you would rather build the full list first and remove the empty strings afterwards, you can do it as follows:

datatable = []
for record in table.find_all('tr', class_="mytable"):
    temp_data = []
    for data in record.find_all("td"):
        temp_data.append(data.text.encode('latin-1'))
    datatable.append(temp_data)
#print(datatable)
#[["TEXT1"," TEXT2","TEXT3","TEXT4","TEXT5 ","","TEXT6"],["TEXT7"," TEXT8","TEXT9","TEXT10","TEXT11","","TEXT12"]]

#Now remove the empty strings from datatable
for i in range(len(datatable)):
    #filter returns a list in Python 2 but an iterator in Python 3;
    #wrapping it in list() works in both
    datatable[i] = list(filter(None, datatable[i]))
#print(datatable)
#[["TEXT1"," TEXT2","TEXT3","TEXT4","TEXT5 ","TEXT6"],["TEXT7"," TEXT8","TEXT9","TEXT10","TEXT11","TEXT12"]]
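
If the empty cell is always at the same position, as the question suggests with index 5, it can also be removed by position instead of by value. A minimal sketch, assuming the rows have already been collected into a plain list of lists (the sample data and the name `EMPTY_INDEX` are illustrative, not from the original code):

```python
EMPTY_INDEX = 5  # position of the always-empty cell, per the question

datatable = [
    ["TEXT1", " TEXT2", "TEXT3", "TEXT4", "TEXT5 ", "", "TEXT6"],
    ["TEXT7", " TEXT8", "TEXT9", "TEXT10", "TEXT11", "", "TEXT12"],
]

for row in datatable:
    # Guard against short rows and against deleting a cell that holds data
    if len(row) > EMPTY_INDEX and row[EMPTY_INDEX] == "":
        del row[EMPTY_INDEX]
```

The guard is a design choice: it leaves a row untouched if the cell at that position turns out to contain text.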
gaurav