
I want to edit a text document that has a page number after every 10-12 lines (the text was converted from a PDF, so each page carries its page number on a line of its own). I want to remove only these page-number integers, not numbers in the text: there can be a page number 50, but there can also be a line of text that contains the integer 50. So I want to remove only the lines that consist of a page number.

Example of text document:

1 





militant Muslims use scriptures such as the 
Genesis story describing the destruction of 
Sodom and Gomorrah as justification (from Allah) 
for the hatred they vent on all things  non-
Muslim and especially on gay men.  

2 


A Word from the Author 

Today, in the 21st Century the majority of Muslims 
hold middle 

3 


Into The Darkness 


the driver assured the exhausted travelers who 
were dozing fitfully in the rear of the van, they

4 


down. It blocked the narrow road.  
Ali Azzizi was the other man accompanying 
the women. 
5 

I want to remove the page numbers 1-5, but if the same numbers appear anywhere inside a line of text, they should not be removed.

My code

filename = input('filename: ')

# Read every line, keeping the line endings so the rest of the text is unchanged
with open(filename, 'r', encoding="utf8") as file:
    lines = file.readlines()

# Drop lines that hold nothing but an integer (the page numbers);
# numbers inside a line of text are left alone
kept = [line for line in lines if not line.strip().isdigit()]

with open(filename, 'w', encoding="utf8") as file:
    file.writelines(kept)
halfer
Nicky Manali
  • Do you have some code? You should at least try to solve your problem yourself before asking here. You could try a regex, since the numbers you want to remove seem to sit between newlines (except for number 5...). – user2393256 Nov 17 '16 at 19:41
  • What have you tried so far? Where is the problem you are running into? [Reading the file?](https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files) [Detecting type?](http://stackoverflow.com/questions/2225038/determine-the-type-of-a-python-object). Show that you have already put in the effort... – Albert Rothman Nov 17 '16 at 19:41
  • Read everything into memory, edit the text, and write it all back to the file. Or read line by line and decide which lines to write to a new file, then delete the old file and rename the new file to the old name. – furas Nov 17 '16 at 19:45
  • Sorry, I forgot. Please see my edited question. – Nicky Manali Nov 17 '16 at 20:07
  • I'm confused by this question, since it is nearly a copy+paste of [another of your questions](https://stackoverflow.com/questions/40664801/how-to-read-and-remove-line-in-java), except the other one is for Java. Why do you want to do the same thing in different languages? – halfer Dec 14 '16 at 23:04
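
The line-by-line approach suggested in the comments (write the kept lines to a new file, then swap it in for the old one) could be sketched roughly as follows. The function name filter_file_in_place is hypothetical, and the rule "a page number is a line containing only digits" is an assumption taken from the sample text.

```python
import os
import tempfile


def filter_file_in_place(path):
    """Rewrite `path` without the lines that contain only an integer."""
    # Create the temporary file next to the original so the final
    # rename stays on the same filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, "r", encoding="utf8") as src, tempfile.NamedTemporaryFile(
        "w", encoding="utf8", dir=directory, delete=False
    ) as dst:
        for line in src:
            # Assumption: a page number is a line holding only digits
            # (ignoring surrounding whitespace).
            if not line.strip().isdigit():
                dst.write(line)
        temp_path = dst.name
    # Replace the old file with the filtered copy.
    os.replace(temp_path, path)
```

Because the file is streamed one line at a time, this variant does not need to hold the whole document in memory.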

1 Answer


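If using Python is not mandatory, you can use grep -v -E '^[0-9]+[[:space:]]*$' test.txt. The pattern is anchored at both ends, so it drops only the lines that consist of digits followed by optional whitespace; lines that merely start with or contain a number are kept.

cristian@nb:~/$ grep -v -E '^[0-9]+[[:space:]]*$' test.txt 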





militant Muslims use scriptures such as the 
Genesis story describing the destruction of 
Sodom and Gomorrah as justification (from Allah) 
for the hatred they vent on all things  non-
Muslim and especially on gay men.  



A Word from the Author 

Today, in the 21st Century the majority of Muslims 
hold middle 



Into The Darkness 


the driver assured the exhausted travelers who 
were dozing fitfully in the rear of the van, they



down. It blocked the narrow road.  
Ali Azzizi was the other man accompanying 
the women. 
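
If the job has to stay in Python, the same filter can be approximated with the re module. This is only a minimal sketch: the name remove_page_number_lines is hypothetical, and it assumes a page number is any line made up solely of digits and whitespace, as in the sample.

```python
import re

# A line counts as a page number when it holds only digits plus optional
# surrounding whitespace (an assumption based on the sample text).
PAGE_NUMBER_LINE = re.compile(r"\s*\d+\s*")


def remove_page_number_lines(text):
    """Return `text` without the lines that consist solely of an integer."""
    kept = [
        line
        for line in text.splitlines(keepends=True)
        if not PAGE_NUMBER_LINE.fullmatch(line)
    ]
    return "".join(kept)


sample = "1 \n\nhold middle \n\n3 \nIn 21st Century 50 men\nthe women. \n5 \n"
cleaned = remove_page_number_lines(sample)
```

Using fullmatch instead of search is what keeps lines such as "In 21st Century 50 men" intact: the whole line has to be a number, not just part of it.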
cjungel