I've written this Python code, which is supposed to take a text file with thousands of RSS URLs (one per line) and create one XML file per 500 lines.

Here is what I have so far:

urls = open('file.txt','r').read().splitlines()
opml = open('opml.xml','w')


opml.write('<?xml version="1.0" encoding="UTF-8"?>\n<opml version="1.0">\n<head>\n<title>My Rss list</title>\n</head>\n<body>\n')
for i in xrange(500):
    opml.write('<outline title="RSS Site %d" type="rss" xmlUrl = "%s"/>\n'%(i + 1, urls[i]))
opml.write('</body>\n</opml>')
opml.close() 

But it has problems. The resulting file ends up looking like this:

<?xml version="1.0" encoding="UTF-8"?>
<opml version="1.0">
<head>
<title>My Rss list</title>
</head>
<body>

It is missing the closing body and the closing opml, and obviously it's missing the outline bits.

When I run the script in the Python shell, I get this error:

Traceback (most recent call last):
  File "C:/Users/nzbit/Desktop/Convert To OPML Python Test/converttoopml.py", line 6, in <module>
    for i in xrange(500):
NameError: name 'xrange' is not defined

UPDATE: Here is the latest issue in the produced XML files:

<outline title="RSS Site 1" type="rss" xmlUrl = "��tp://www.gardenleisurepools.com/forum/external?type=rss2"/>
<outline title="RSS Site 2" type="rss" xmlUrl = ""/>
<outline title="RSS Site 3" type="rss" xmlUrl = "http://www.meguiarsonline.com/forum/external.php?type=RSS2"/>
<outline title="RSS Site 4" type="rss" xmlUrl = ""/>
<outline title="RSS Site 5" type="rss" xmlUrl = "http://www.newportri.com/board/external?type=rss2"/>
<outline title="RSS Site 6" type="rss" xmlUrl = ""/>
<outline title="RSS Site 7" type="rss" xmlUrl = "http://www.abandonware-forums.org/forums/external?type=rss2"/>
<outline title="RSS Site 8" type="rss" xmlUrl = ""/>
<outline title="RSS Site 9" type="rss" xmlUrl = "https://www.lrcsite.com/forum/external.php?type=RSS2"/>
<outline title="RSS Site 10" type="rss" xmlUrl = ""/>
<outline title="RSS Site 11" type="rss" xmlUrl = "http://www.accutane-recall.com/forums/external?type=rss2"/>
  • it seems that there is no content in your `file.txt`? – Tiny.D Jun 28 '17 at 06:11
  • There definitely is, but remember it's not just the content from the file.txt that's missing, it's also the closing body and the closing opml etc – A C Marston Jun 28 '17 at 06:12
  • If `urls[i]` raises an error, the code after it will not execute. – Tiny.D Jun 28 '17 at 06:15
  • In the python shell, when I run the script I get this error: Traceback (most recent call last): File "C:/Users/nzbit/Desktop/Convert To OPML Python Test/converttoopml.py", line 6, in for i in xrange(500): NameError: name 'xrange' is not defined – A C Marston Jun 28 '17 at 06:18
  • Possible duplicate of [NameError: global name 'xrange' is not defined in Python 3](https://stackoverflow.com/questions/17192158/nameerror-global-name-xrange-is-not-defined-in-python-3) – SiHa Jun 28 '17 at 07:15

2 Answers

I believe you are running your code with Python 3, not Python 2. xrange() no longer exists in Python 3, so you have to change it to range():

for i in range(500):
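
The NameError probably also explains the truncated file: the header is written before the loop starts, and once the exception is raised nothing after it runs. Here is a small sketch of that effect, using an in-memory io.StringIO buffer as a stand-in for opml.xml and a shortened header (both are my assumptions, not your actual setup):

```python
import io

out = io.StringIO()  # in-memory stand-in for opml.xml
out.write('<body>\n')  # shortened header, written before the loop
error = None
try:
    for i in xrange(3):  # noqa: F821 - deliberately undefined on Python 3
        out.write('<outline title="RSS Site %d"/>\n' % (i + 1))
    out.write('</body>\n')  # never reached once the NameError is raised
except NameError as exc:
    error = str(exc)

# Only the header made it into the buffer.
print(repr(out.getvalue()))
print(error)
```

On Python 3 the buffer ends up holding only the opening line, which matches what you see in your file.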

Update: here is a rough idea based on your code. It creates multiple files (opml1.xml, opml2.xml, opml3.xml, ...) and writes up to 500 lines to each. Your file.txt is encoded as UCS-2, which is effectively UTF-16, so open it with encoding='utf-16' like this and the stray special characters should disappear.

with open('file.txt', encoding='utf-16') as f:  # file.txt is UTF-16 (UCS-2) encoded
    urls = f.read().splitlines()

# Write one file per chunk of up to 500 URLs: opml1.xml, opml2.xml, ...
for file_num, start in enumerate(range(0, len(urls), 500), 1):
    # The XML declaration says UTF-8, so write the file as UTF-8 explicitly
    with open('opml' + str(file_num) + '.xml', 'w', encoding='utf-8') as opml:
        opml.write('<?xml version="1.0" encoding="UTF-8"?>\n<opml version="1.0">\n<head>\n<title>My Rss list</title>\n</head>\n<body>\n')
        # Slicing past the end of the list is safe, so the last file just gets the remainder
        for j, url in enumerate(urls[start:start + 500], start + 1):
            opml.write('<outline title="RSS Site %d" type="rss" xmlUrl = "%s"/>\n' % (j, url))
        opml.write('</body>\n</opml>')
    # No explicit close() needed: the with-block closes each file
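
To see why the wrong encoding produces a garbled first URL and an empty entry on every other line, here is a minimal sketch. The two example URLs, the Windows line endings, and the use of latin-1 as the single-byte codec are all assumptions on my part; your platform default may be a different codec, but the effect should be similar.

```python
import codecs

# Hypothetical two-line file, stored as UTF-16 little-endian with a BOM
# and Windows line endings (assumptions about how file.txt was saved).
text = 'http://a.example/rss\r\nhttp://b.example/rss\r\n'
raw = codecs.BOM_UTF16_LE + text.encode('utf-16-le')

# A single-byte codec keeps the BOM and the NUL bytes in the text. '\r' and
# '\n' are then separated by a NUL, so every line ending splits twice.
wrong = raw.decode('latin-1').splitlines()

# Decoding as UTF-16 consumes the BOM and restores the original lines.
right = raw.decode('utf-16').splitlines()

print(len(wrong), len(right))
print(right)
```

The extra entries in the wrong list contain only a NUL character, which would show up as an apparently empty xmlUrl.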
Tiny.D

This works for me without issues:

# Read every URL; the with-block closes the file automatically
with open('file.txt', 'r') as f:
    urls = f.read().splitlines()

with open('opml.xml', 'w') as opml:
    opml.write('<?xml version="1.0" encoding="UTF-8"?>\n<opml version="1.0">\n<head>\n<title>My Rss list</title>\n</head>\n<body>\n')
    # Slicing stops at the end of the list, so fewer than 500 URLs is fine
    for i, url in enumerate(urls[:500]):
        opml.write('<outline title="RSS Site %d" type="rss" xmlUrl = "%s"/>\n' % (i + 1, url))
    opml.write('</body>\n</opml>')

Run the file with Python 3.
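
One thing to watch for when building XML with string formatting: a URL containing & would make the file invalid. A possible sketch of a fix uses quoteattr from the standard library's xml.sax.saxutils, which escapes the value and adds the surrounding quotes. The outline() helper name and the example URL are hypothetical, just for illustration.

```python
from xml.sax.saxutils import quoteattr


def outline(index, url):
    """Build one <outline> element; quoteattr escapes and quotes the URL."""
    return '<outline title="RSS Site %d" type="rss" xmlUrl=%s/>' % (index, quoteattr(url))


# Hypothetical URL with a query string that needs escaping.
print(outline(1, 'http://example.com/feed?a=1&b=2'))
```

Note that the format string has no quotes around %s, because quoteattr supplies them.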

IslamTaha