Parsing files (ics/ icalendar) using Python

Question

I have a .ics file in the following format. What is the best way to parse it? I need to retrieve the Summary, Description, and Time for each of the entries.

BEGIN:VCALENDAR
X-LOTUS-CHARSET:UTF-8
VERSION:2.0
PRODID:-//Lotus Development Corporation//NONSGML Notes 8.0//EN
METHOD:PUBLISH
BEGIN:VTIMEZONE
TZID:India
BEGIN:STANDARD
DTSTART:19500101T020000
TZOFFSETFROM:+0530
TZOFFSETTO:+0530
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID="India":20100615T111500
DTEND;TZID="India":20100615T121500
TRANSP:OPAQUE
DTSTAMP:20100713T071035Z
CLASS:PUBLIC
DESCRIPTION:Emails\nDarlene\n Murphy\nDr. Ferri\n

UID:12D3901F0AD9E83E65257743001F2C9A-Lotus_Notes_Generated
X-LOTUS-UPDATE-SEQ:1
X-LOTUS-UPDATE-WISL:$S:1;$L:1;$B:1;$R:1;$E:1;$W:1;$O:1;$M:1
X-LOTUS-NOTESVERSION:2
X-LOTUS-APPTTYPE:0
X-LOTUS-CHILD_UID:12D3901F0AD9E83E65257743001F2C9A
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID="India":20100628T130000
DTEND;TZID="India":20100628T133000
TRANSP:OPAQUE
DTSTAMP:20100628T055408Z
CLASS:PUBLIC
DESCRIPTION:
SUMMARY:smart energy management
LOCATION:8778/92050462
UID:07F96A3F1C9547366525775000203D96-Lotus_Notes_Generated
X-LOTUS-UPDATE-SEQ:1
X-LOTUS-UPDATE-WISL:$S:1;$L:1;$B:1;$R:1;$E:1;$W:1;$O:1;$M:1
X-LOTUS-NOTESVERSION:2
X-LOTUS-NOTICETYPE:A
X-LOTUS-APPTTYPE:3
X-LOTUS-CHILD_UID:07F96A3F1C9547366525775000203D96
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID="India":20100629T110000
DTEND;TZID="India":20100629T120000
TRANSP:OPAQUE
DTSTAMP:20100713T071037Z
CLASS:PUBLIC
SUMMARY:meeting
UID:6011DDDD659E49D765257751001D2B4B-Lotus_Notes_Generated
X-LOTUS-UPDATE-SEQ:1
X-LOTUS-UPDATE-WISL:$S:1;$L:1;$B:1;$R:1;$E:1;$W:1;$O:1;$M:1
X-LOTUS-NOTESVERSION:2
X-LOTUS-APPTTYPE:0
X-LOTUS-CHILD_UID:6011DDDD659E49D765257751001D2B4B
END:VEVENT

score 98 · Answer 1 · edited Nov 27 '17 at 19:47

The icalendar package looks nice.

For instance, to write a file:

from icalendar import Calendar, Event
from datetime import datetime
from pytz import UTC # timezone

cal = Calendar()
cal.add('prodid', '-//My calendar product//mxm.dk//')
cal.add('version', '2.0')

event = Event()
event.add('summary', 'Python meeting about calendaring')
event.add('dtstart', datetime(2005,4,4,8,0,0,tzinfo=UTC))
event.add('dtend', datetime(2005,4,4,10,0,0,tzinfo=UTC))
event.add('dtstamp', datetime(2005,4,4,0,10,0,tzinfo=UTC))
event['uid'] = '20050115T101010/27346262376@mxm.dk'
event.add('priority', 5)

cal.add_component(event)

f = open('example.ics', 'wb')
f.write(cal.to_ical())
f.close()

Tadaaa, you get this file:

BEGIN:VCALENDAR
PRODID:-//My calendar product//mxm.dk//
VERSION:2.0
BEGIN:VEVENT
DTEND;VALUE=DATE:20050404T100000Z
DTSTAMP;VALUE=DATE:20050404T001000Z
DTSTART;VALUE=DATE:20050404T080000Z
PRIORITY:5
SUMMARY:Python meeting about calendaring
UID:20050115T101010/27346262376@mxm.dk
END:VEVENT
END:VCALENDAR

But what lies in this file?

g = open('example.ics','rb')
gcal = Calendar.from_ical(g.read())
for component in gcal.walk():
    print(component.name)
g.close()

You can see it easily:

>>> 
VCALENDAR
VEVENT
>>>

What about parsing the data about the events:

g = open('example.ics','rb')
gcal = Calendar.from_ical(g.read())
for component in gcal.walk():
    if component.name == "VEVENT":
        print(component.get('summary'))
        print(component.get('dtstart'))
        print(component.get('dtend'))
        print(component.get('dtstamp'))
g.close()

Now you get:

>>> 
Python meeting about calendaring
20050404T080000Z
20050404T100000Z
20050404T001000Z
>>>

edited Nov 27 '17 at 19:47 by scls
answered Aug 04 '10 at 18:17 by Wok

However, it seems to return datetimes as naive datetimes, which don't have a utcoffset. :( – kojiro Jun 03 '11 at 14:07
@BradMontgomery It looks like the icalendar package has changed maintainers and version 3.0 is available under the BSD license here: https://github.com/collective/icalendar – mpdaugherty May 09 '12 at 11:59
@mpdaugherty this is awesome news! It's good to see that code getting some maintenance :) – Brad Montgomery May 09 '12 at 15:50
similar examples with the new API (and pytz for UTC handling) can be found here: http://icalendar.readthedocs.org/en/latest/examples.html – x29a Nov 26 '13 at 12:23
@x29a Updated link to examples: http://icalendar.readthedocs.org/en/latest/usage.html#example – OK. Jun 05 '14 at 20:12
1

UTC doesn't seem to be a package within icalendar any more. – Savara Aug 17 '15 at 22:01
I updated the original content to point to the right UTC (well right, per another SO answer), also the as_string didn't work anymore as well, and that has been updated to to_ical() – onaclov2000 Aug 26 '16 at 17:59
That's nice trivia but doesn't answer the question: how do you read (parse) an ics file? I might be blind but I couldn't find any hint in the documentation either. – zvyn Jan 12 '17 at 00:23
@zvyn `print(component.get('dtstart'))` gives me an object representation rather than a readable value. I need to do `print(component.get('dtstart').dt)` to turn that into a datetime object, e.g. `2019-11-06 10:00:00-01:00`. – AstroFloyd Nov 10 '19 at 12:38
I found a more complete example to start from here: https://github.com/collective/icalendar/blob/master/src/icalendar/cli.py – poleguy Jan 04 '23 at 22:50
Unfortunately, icalendar does not seem to handle repeating events easily, and the icalevents project has few good examples and doesn't use a current version of icalendar. – poleguy Jan 05 '23 at 17:58
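
On the naive-datetime point raised in the comments above: if a parser hands back a datetime without a utcoffset, one option is to attach the offset yourself. The sketch below uses only the standard library and assumes a fixed +05:30 offset, read off the TZOFFSETTO line of the question's VTIMEZONE block. The names `INDIA` and `localize` are made up for illustration and are not part of icalendar.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed +05:30 offset, taken from TZOFFSETTO in the question's VTIMEZONE block
INDIA = timezone(timedelta(hours=5, minutes=30), name="India")


def localize(naive, tz=INDIA):
    """Attach tz to a naive datetime without shifting the wall-clock time."""
    return naive.replace(tzinfo=tz)


# DTSTART of the first event in the question: 20100615T111500
start = localize(datetime(2010, 6, 15, 11, 15))
print(start.isoformat())                           # 2010-06-15T11:15:00+05:30
print(start.astimezone(timezone.utc).isoformat())  # 2010-06-15T05:45:00+00:00
```

A fixed offset is enough here because the VTIMEZONE block declares no daylight-saving rule; for zones with DST, a real timezone database would be the safer choice.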

score 17 · Answer 2 · answered Jun 24 '11 at 15:32

You could probably also use the vobject module for this: http://pypi.python.org/pypi/vobject

If you have a sample.ics file you can read its contents like so:

import vobject

# read the data from the file
with open("sample.ics") as f:
    data = f.read()

# parse the first top-level component (the VCALENDAR) with vobject
cal = vobject.readOne(data)

# Get Summary (cal.vevent is the first VEVENT in the calendar)
print('Summary:', cal.vevent.summary.valueRepr())
# Get Description
print('Description:', cal.vevent.description.valueRepr())

# Get Time
print('Time (as a datetime object):', cal.vevent.dtstart.value)
print('Time (as a string):', cal.vevent.dtstart.valueRepr())

`readOne` will parse only one vevent. Give example of `readComponents` — Khurshid Alam, May 11 '17 at 16:17

score 10 · Answer 3 · answered Jan 17 '19 at 03:35

I'm new to Python; the comments above were very helpful, so I wanted to post a more complete sample.

# ics to csv example
# dependency: https://pypi.org/project/vobject/

import vobject
import csv

def value_of(component, name):
    # return '' when the event lacks the property (e.g. no ATTENDEE)
    prop = getattr(component, name, None)
    return prop.valueRepr() if prop is not None else ''

# newline='' keeps the csv module from writing blank rows on Windows
with open('sample.csv', mode='w', newline='') as csv_out, open('sample.ics') as ics_in:
    csv_writer = csv.writer(csv_out, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    csv_writer.writerow(['WHAT', 'WHO', 'FROM', 'TO', 'DESCRIPTION'])

    # iterate through the contents
    for cal in vobject.readComponents(ics_in.read()):
        for component in cal.components():
            if component.name == "VEVENT":
                # write to csv
                csv_writer.writerow([value_of(component, name) for name in
                                     ('summary', 'attendee', 'dtstart', 'dtend', 'description')])

score 5 · Answer 4 · edited Nov 27 '17 at 19:48

Four years later and understanding ICS format a bit better, if those were the only fields I needed, I'd just use the native string methods:

import io

# Probably not a valid .ics file, but we don't really care for the example
# it works fine regardless
file = io.StringIO('''
BEGIN:VCALENDAR
X-LOTUS-CHARSET:UTF-8
VERSION:2.0
DESCRIPTION:Emails\nDarlene\n Murphy\nDr. Ferri\n

SUMMARY:smart energy management
LOCATION:8778/92050462
DTSTART;TZID="India":20100629T110000
DTEND;TZID="India":20100629T120000
TRANSP:OPAQUE
DTSTAMP:20100713T071037Z
CLASS:PUBLIC
SUMMARY:meeting
UID:6011DDDD659E49D765257751001D2B4B-Lotus_Notes_Generated
X-LOTUS-UPDATE-SEQ:1
X-LOTUS-UPDATE-WISL:$S:1;$L:1;$B:1;$R:1;$E:1;$W:1;$O:1;$M:1
X-LOTUS-NOTESVERSION:2
X-LOTUS-APPTTYPE:0
X-LOTUS-CHILD_UID:6011DDDD659E49D765257751001D2B4B
END:VEVENT
'''.strip())

parsing = False
for line in file:
    field, _, data = line.partition(':')
    if field in ('SUMMARY', 'DESCRIPTION', 'DTSTAMP'):
        parsing = True
        print(field)
        print('\t'+'\n\t'.join(data.split('\n')))
    elif parsing and not data:
        print('\t'+'\n\t'.join(field.split('\n')))
    else:
        parsing = False

Storing the data and parsing the datetime is left as an exercise for the reader (DTSTAMP is always in UTC)
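
For the datetime part of that exercise, a minimal standard-library sketch could look like the following. The helper name `parse_ics_datetime` is hypothetical. It treats a trailing `Z` as UTC and returns a naive datetime otherwise, leaving any TZID parameter for the caller to interpret.

```python
from datetime import datetime, timezone


def parse_ics_datetime(value):
    """Parse an iCalendar DATE-TIME string such as '20100713T071037Z'."""
    if value.endswith("Z"):
        # a trailing Z marks UTC, so return an aware datetime
        return datetime.strptime(value, "%Y%m%dT%H%M%SZ").replace(tzinfo=timezone.utc)
    # no Z: local or TZID-qualified time, returned naive
    return datetime.strptime(value, "%Y%m%dT%H%M%S")


stamp = parse_ics_datetime("20100713T071037Z")
print(stamp.isoformat())  # 2010-07-13T07:10:37+00:00
```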

old answer below

You could use a regex:

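import re

with open("sample.ics") as f:
    text = f.read()  # your text
print(re.search("SUMMARY:.*?:", text, re.DOTALL).group())
print(re.search("DESCRIPTION:.*?:", text, re.DOTALL).group())
# non-greedy like the two patterns above, so it stops at the next colon
print(re.search("DTSTAMP:.*?:", text, re.DOTALL).group())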

I'm sure it may be possible to skip the first and last words, I'm just not sure how to do it with regex. You could do it this way though:

print(' '.join(re.search("SUMMARY:.*?:", text, re.DOTALL).group().replace(':', ' ').split()[1:-1]))
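
One way to skip the field name with a regex alone is a capture group anchored to the start of a line. This is a sketch rather than a full parser: the inline `text` is a trimmed sample built from the question's data, and it assumes LF line endings and no folded (continued) lines.

```python
import re

# Trimmed sample based on the question's events (an assumption for illustration)
text = """BEGIN:VEVENT
DTSTAMP:20100628T055408Z
SUMMARY:smart energy management
END:VEVENT
BEGIN:VEVENT
DTSTAMP:20100713T071037Z
SUMMARY:meeting
END:VEVENT
"""

# re.MULTILINE makes ^ and $ match at every line boundary;
# the parentheses capture only the value after the colon
summaries = re.findall(r"^SUMMARY:(.*)$", text, re.MULTILINE)
stamps = re.findall(r"^DTSTAMP:(.*)$", text, re.MULTILINE)

print(summaries)
print(stamps)
```

Because `findall` returns one entry per match, this also collects the field from every event instead of only the first one.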

@Dirk I think it is beneficial for the community to have multiple ways of doing things. Who knows, maybe in some case the ics parser will not work correctly and Wayne's answer will save someone's day! — Jonathan Komar, Aug 31 '16 at 11:46
@Dirk definitely don't reinvent the wheel, but also don't add anything more than you need. If you just need a couple of simple fields, you don't really need anything more than the std lib. If I was doing much more than this, I probably *would* just go ahead and install a library though - especially if I was actually trying to *create* appointments. — Wayne Werner, Aug 31 '16 at 12:17

score 4 · Answer 5 · answered Sep 01 '21 at 19:00

In case anyone else is looking at this, the ics package seems like it's updated better than any others mentioned in the thread. https://pypi.org/project/ics/

Here's some sample code I'm using:

from ics import Calendar

in_file = 'sample.ics'  # path to your .ics file

with open(in_file, 'r') as file:
    ics_text = file.read()

c = Calendar(ics_text)
for e in c.events:
    print(e.name)

answered Sep 01 '21 at 19:00 by clementzach

"updated better" is relative. from the docs: "ics.py always uses UTC for internal representation of dates. This is wrong and leads to many problems." – minusf Jun 16 '22 at 12:20

score -2 · Answer 6 · answered Aug 04 '10 at 17:47

I'd parse line by line and do a search for your terms, then get the index and extract that and X number of characters further (however many you think you'll need). Then parse that much smaller string to get it to be what you need.
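
A rough sketch of that line-by-line idea, using only the standard library. Instead of extracting a fixed number of characters, it splits each line at the first colon, which is a swapped-in simplification. The function names and the inline sample are hypothetical, and the sketch ignores edge cases such as colons inside quoted parameter values.

```python
import re

TEXT_FIELDS = {"SUMMARY", "DESCRIPTION"}


def unfold(text):
    """Join folded lines: a line starting with a space or tab continues the previous one."""
    lines = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]
        elif raw:
            lines.append(raw)
    return lines


def unescape(value):
    """Decode the backslash escapes used in iCalendar text values."""
    return re.sub(r"\\([\\;,nN])",
                  lambda m: "\n" if m.group(1) in "nN" else m.group(1),
                  value)


def parse_events(text, wanted=("SUMMARY", "DESCRIPTION", "DTSTART")):
    """Return one dict per VEVENT holding only the wanted properties."""
    events, current = [], None
    for line in unfold(text):
        if line == "BEGIN:VEVENT":
            current = {}
        elif line == "END:VEVENT":
            if current is not None:
                events.append(current)
            current = None
        elif current is not None:
            name, _, value = line.partition(":")
            key = name.split(";", 1)[0].upper()  # drop parameters such as ;TZID=...
            if key in wanted:
                current[key] = unescape(value) if key in TEXT_FIELDS else value
    return events


# Inline sample modelled on the question's file (an assumption for illustration)
sample = "\n".join([
    "BEGIN:VCALENDAR",
    "BEGIN:VEVENT",
    'DTSTART;TZID="India":20100615T111500',
    "DESCRIPTION:Emails\\nDarlene\\n",
    " Murphy",
    "END:VEVENT",
    "BEGIN:VEVENT",
    'DTSTART;TZID="India":20100628T130000',
    "SUMMARY:smart energy management",
    "END:VEVENT",
    "END:VCALENDAR",
])

events = parse_events(sample)
for event in events:
    print(event)
```

Keeping one dict per event means a missing field (the first event has no SUMMARY) shows up as an absent key rather than shifting values between events.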

answered Aug 04 '10 at 17:47 by Brian
