RTSP and RTP streams can be complex and challenging to display, especially RTSP, as many server vendors interpret (or misinterpret) the protocol in their own way. Writing your own network stack and demuxer and feeding the video bitstream to VideoToolbox is certainly possible, but it will take time, especially if you also care about audio and want to play both in sync.
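To give an idea of what even the very first step of a hand-written stack involves, here is a minimal sketch in C of parsing the fixed RTP header as laid out in RFC 3550. The names `rtp_header` and `rtp_parse` are made up for illustration; this is not code from VLC or VLCKit, and it covers nothing of RTSP session handling, packet reordering, depacketization of the codec payload, or A/V sync.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct for illustration: fields of the fixed RTP header. */
typedef struct {
    uint8_t  version;        /* must be 2 */
    uint8_t  padding;        /* P bit */
    uint8_t  extension;      /* X bit */
    uint8_t  csrc_count;     /* number of CSRC identifiers, 0..15 */
    uint8_t  marker;         /* M bit, meaning depends on the payload format */
    uint8_t  payload_type;   /* 7-bit payload type */
    uint16_t sequence;       /* sequence number */
    uint32_t timestamp;      /* media timestamp in clock-rate units */
    uint32_t ssrc;           /* synchronization source identifier */
    size_t   payload_offset; /* where the payload starts inside the packet */
    size_t   payload_length; /* payload size with trailing padding removed */
} rtp_header;

/* Read a 32-bit big-endian (network byte order) value. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Parse one RTP packet. Returns 0 on success, -1 if it is malformed. */
int rtp_parse(const uint8_t *buf, size_t len, rtp_header *out)
{
    if (buf == NULL || out == NULL || len < 12)
        return -1;

    out->version = buf[0] >> 6;
    if (out->version != 2)
        return -1;
    out->padding      = (buf[0] >> 5) & 1;
    out->extension    = (buf[0] >> 4) & 1;
    out->csrc_count   = buf[0] & 0x0F;
    out->marker       = buf[1] >> 7;
    out->payload_type = buf[1] & 0x7F;
    out->sequence     = (uint16_t)((buf[2] << 8) | buf[3]);
    out->timestamp    = read_be32(buf + 4);
    out->ssrc         = read_be32(buf + 8);

    /* Skip the CSRC list (4 bytes per entry). */
    size_t offset = 12 + 4 * (size_t)out->csrc_count;
    if (offset > len)
        return -1;

    /* Skip the optional header extension: 16-bit profile id,
       16-bit length in 32-bit words, then the extension data. */
    if (out->extension) {
        if (len - offset < 4)
            return -1;
        size_t ext_words = ((size_t)buf[offset + 2] << 8) | buf[offset + 3];
        offset += 4;
        if (len - offset < 4 * ext_words)
            return -1;
        offset += 4 * ext_words;
    }

    /* With the P bit set, the last byte holds the padding length,
       which includes that byte itself. */
    size_t end = len;
    if (out->padding) {
        if (offset >= len)
            return -1;
        size_t pad = buf[len - 1];
        if (pad == 0 || pad > len - offset)
            return -1;
        end = len - pad;
    }

    out->payload_offset = offset;
    out->payload_length = end - offset;
    return 0;
}
```

You would call `rtp_parse` on every received datagram and hand `buf + payload_offset` to a codec-specific depacketizer. Everything after that point (reassembling frames, building the format description for VideoToolbox, clock recovery) is where most of the work actually is.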
I recommend having a look at the MobileVLCKit framework, which in its latest (pre-release) version includes a VideoToolbox hardware decoder and can transparently fall back to a software decoder as needed (for example on iOS 7, where VT is not available, or if the codec profile used in the stream does not match the capabilities of the device's hardware decoder).
VLCKit is licensed under LGPLv2.1, which is perfectly safe to deploy on the iOS App Store as long as you follow the license (attribution, repackaging, publication of any patches you make, ...). It comes in static and dynamic flavors as needed.
To try the current development version, use the CocoaPod "MobileVLCKit-unstable" in version "3.0.0a7". We expect to ship a final version by the end of the summer.
Full disclosure: I'm one of the main authors of the aforementioned library. Happy to help on this topic in general :)