Skip to main content

Shifu IoT Knowledge Classroom: RTSP 101

· 4 min read

Shifu can easily connect to devices using the RTSP protocol, making IoT development more efficient. Shifu is a Kubernetes-native IoT development framework that allows developers to easily connect, monitor, and control any IoT device. Shifu brings Kubernetes to the edge computing scenarios of the Internet of Things, helping to achieve scalability and high availability of IoT applications.

RTSP is used by most camera in the world, as it is efficient and classic. Well then why the hell is TikTok using RTMP to deliver us real time trolling videos? It is because Abode, the supreme leader of trolling media creation, developed it. Today we are going to have a high level overview of what RTSP is and what it does. First, one thing you should know is that to stream a video, RTSP (Real-Time Streaming Protocol) is not the only protocol you will need to have - RTP (Real-Time Transport Protocol) and RTCP (Real-Time Control Protocol) are important too.

Roasted Turkey with Sweet Potato

RTSP is used to handle the request and request acknowledgement between client and server, just like HTTP.

You send a request in this format:

<method> <url> <version>
CSeq: <seq>

For example, we can have this:

DESCRIBE rtsp://114.514.19.19:810 RTSP/1.0
CSeq: 2
Accept: application/sdp

And you can expect a response in this format:

<version> <code>
CSeq: <seq>

For example, we can have this:

RTSP/1.0 200 OK
CSeq: 2
Content-Length: 155
Content-Type: application/sdp

In the example above, we send a DESCRIBE method to the rtsp server, asking for the description of all streams on the server; and the response contains an SDP file describing what we just asked, containing a list of descriptions of all streams, like the address, type, communication protocol, encoding, etc. We will talk about the SDP file later in this series.

For a compelete receipe about what each method does, take a look at this official menu:

Note that RTSP itself is nothing about transmitting videos - it just tells the stakeholders that "I'm gonna do this".

The real work is done by RTP and RTCP.

Red Trout Poached

RTP takes care of transmitting the data of video. It uses UDP by default (of course for a real time data), packages the video data (video and audio) and transmits them.

It's header is like this:

Woah too many new terms! But you just need to know that it is a typical header of RTP to identify the data being transmitted. We will go over the details in the future article about RTP in this series.

Resins and Tomato Cooked in Pumpkin

RTCP is all about synchornizing. While RTP is transmitting videos, RTCP is sending the status/metadata of the data being transmitted by RTP. You use that for monitoring the video quality, controlling the load, throttling, etc.

It has five types of messages: Sender Report (SR), Receiver Report (RR), Source Description Items (SDES), BYE, and APP. Well, many of them are self-explanatory as those are acronyms. BYE is just to say this is the end of the transmission; and APP can be considered as "customized", as it is telling us the format and type are not registered and are experimental, so techinally you won't need it.

We will discuss more about each type in the following article about RTCP in this series.

Millions of questions must have arisen. What the hell is SDP? What does an RR message look like? How on the earth can I build an RTSP server? Can I stream myself playing Genshin Impact so I can collect enough donations to pull my next waifu? Just wait a little bit for the next articles...