Preamble:
In this post I will explore how to stream a video and audio capture from one computer to another using ffmpeg and netcat, with a latency below 100ms, which is good enough for presentations and general purpose remote display tasks on a local network.
The problem:
Streaming low-latency live content is quite hard, because most software-based video codecs are designed for the best compression rather than the lowest latency. This makes sense, because most movies are encoded once and decoded often, so it is a good trade-off to spend more time on encoding than on decoding.
However, some encoders, particularly the NVENC-based h264_nvenc and hevc_nvenc, handle low-latency encoding very well (they have dedicated encoder presets tuned for it), which makes them a very good fit for this problem.
I wrote a quick Python script that prints the current time in milliseconds, which can be used to measure the combined system and encoder latency:
#!/usr/bin/python3
import time
import sys
while True:
    time.sleep(0.001)
    print('%s\r' % (int(time.time() * 1000) % 10000), end='')
    sys.stdout.flush()
Using this script, you can encode and decode the stream on the same desktop at the same time. Taking a screenshot that shows the original timer and its streamed copy side by side gives you the total video latency: it is the difference between the two readings.
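Since the timer script prints the time modulo 10000, the two readings in a screenshot can straddle a wrap-around. A minimal sketch of the subtraction that handles this case (latency_ms is a hypothetical helper, and the sample readings are made up for illustration):

```python
WRAP_MS = 10_000  # the timer script prints milliseconds modulo 10000


def latency_ms(original_reading: int, streamed_reading: int, wrap: int = WRAP_MS) -> int:
    """Return how far the streamed counter lags behind the original one.

    Both readings are taken from the same screenshot. The modulo covers the
    case where the counter wrapped around between the two values.
    """
    return (original_reading - streamed_reading) % wrap


if __name__ == "__main__":
    print(latency_ms(4217, 4139))  # counter did not wrap: 78
    print(latency_ms(12, 9950))    # counter wrapped past 9999: 62
```

This only works while the real latency stays well below ten seconds, which is the case for the setup described here.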
Solution:
We will be using ffmpeg for the desktop and audio capture on the local system:
ffmpeg \
-f x11grab -s 1920x1080 -framerate 60 -i :0.0 \
-f alsa -ac 2 -ar 44100 -i hw:Loopback,1,0,0 \
-c:v h264_nvenc -preset:v llhq \
-rc:v vbr_minqp -qmin:v 19 \
-f mpegts - | nc -l -p 9000
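If you want to script this rather than type it, the command can be assembled as an argument list. The sketch below is only an illustration: build_capture_command and its parameters are hypothetical names of mine, while the flags are the ones from the command above. ffmpeg applies each option to the next input or output file on the command line, so both inputs are listed before the encoder options.

```python
import shlex


def build_capture_command(size="1920x1080", fps=60, display=":0.0",
                          audio_device="hw:Loopback,1,0,0"):
    """Return the ffmpeg argument list for the NVENC capture (hypothetical helper)."""
    video_input = ["-f", "x11grab", "-s", size, "-framerate", str(fps), "-i", display]
    audio_input = ["-f", "alsa", "-ac", "2", "-ar", "44100", "-i", audio_device]
    encoder = ["-c:v", "h264_nvenc", "-preset:v", "llhq",
               "-rc:v", "vbr_minqp", "-qmin:v", "19"]
    output = ["-f", "mpegts", "-"]  # MPEG-TS on stdout, ready to pipe into nc
    # Inputs first, encoder and output options last.
    return ["ffmpeg"] + video_input + audio_input + encoder + output


if __name__ == "__main__":
    # Print a shell-ready command line for a 30 fps capture.
    print(shlex.join(build_capture_command(fps=30)))
```

The printed line can be pasted in front of `| nc -l -p 9000`, or the list can be handed to subprocess.Popen with stdout connected to a pipe.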
For further NVENC encoder options, check out this guide.
On capturing the Audio stream from a running application:
Note that in the provided examples, we are capturing the audio output of a running application on the desktop using the snd_aloop module. Read more on its usage here so you can tune it as you see fit.
On the client:
We will use netcat (nc) and mplayer to view (play back) the generated live stream from the remote host:
nc <host_ip_address> 9000 | mplayer -benchmark -
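Alternatively, the client can listen on the port itself, here over UDP (-u). In that case the sending side has to connect out to the client, i.e. pipe ffmpeg into nc -u <client-ip> 9000 instead of nc -l -p 9000: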
nc -l -u -p 9000 <host-ip> | mplayer - -benchmark
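You can even save the livestream to a file on the client while watching it. Because the sender muxes with -f mpegts, the saved file is an MPEG transport stream, so give it a .ts extension; you can remux it to MP4 or any other container format supported by ffmpeg afterwards if you so desire:
nc <host_ip_address> 9000 | tee file_containing_the_video.ts | mplayer -benchmark -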
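The receive, tee and play pipeline can also be sketched in Python if you want more control than the shell gives you. This is a minimal sketch, not a replacement for netcat: relay is a hypothetical helper that copies whatever arrives on a connected TCP socket to any number of writable binary streams.

```python
import socket
from typing import BinaryIO, Iterable


def relay(sock: socket.socket, sinks: Iterable[BinaryIO], chunk_size: int = 65536) -> int:
    """Copy everything received on sock to every sink; return the byte count."""
    sinks = list(sinks)
    total = 0
    while True:
        chunk = sock.recv(chunk_size)
        if not chunk:  # the sender closed the connection
            break
        for sink in sinks:
            sink.write(chunk)
            sink.flush()  # keep the player fed without extra buffering
        total += len(chunk)
    return total
```

To use it like the command above, you would open the connection with socket.create_connection((host, 9000)), open the capture file in "wb" mode, start mplayer with subprocess.Popen(["mplayer", "-benchmark", "-"], stdin=subprocess.PIPE), and pass the file and the player's stdin as the two sinks.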
Note: It is advisable to use the -benchmark flag on the client-side. -framedrop might help as well, especially on slower clients where video decode may present a challenge. Ensure that netcat's specified ports are open on the firewalls on both the local and the remote hosts, and that both the local and the remote netcat instances are using the same port.
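To check the firewall part before starting a stream, a quick TCP probe from the client is usually enough. The sketch below uses only the standard library; tcp_port_open is a hypothetical helper name. Run it against a throwaway listener (for example a plain nc -l -p 9000 with no ffmpeg attached), because netcat typically serves a single connection and then exits, so probing a live stream would end it. It does not apply to the UDP variant, which has no connection to test.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, filtered (timed out) or unreachable
        return False
```

A False result with a listener running on the other side usually points at a firewall rule or at the two ends using different ports.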
Experimental: Using the h264_vaapi encoder on Intel-based hardware:
If you have a supported SKU (an Intel Core, or a supported Pentium, Atom or Core M, with an integrated Ivy Bridge, Haswell, Broadwell, Skylake or newer GPU), you may also use the VAAPI-based encoders (h264_vaapi and, where supported, hevc_vaapi) as you see fit:
ffmpeg \
-vaapi_device /dev/dri/renderD128 \
-f x11grab -s 1920x1080 -framerate 60 -i :0.0 \
-f alsa -ac 2 -ar 44100 -i hw:Loopback,1,0,0 \
-vf 'format=nv12,hwupload' \
-c:v h264_vaapi -threads 4 \
-f mpegts - | nc -l -p 9000
Depending on your platform's hardware, your video quality may differ greatly. As a rule of thumb, Sandy Bridge gives you considerably worse video quality, whereas Haswell and above have greatly improved QuickSync engines and offer a better trade-off between quality and encoder performance. In the same vein, encoder latency for QuickSync may be higher than with the likes of NVENC, as tested on a multi-GPU system at my disposal.
Extra tips:
If you have lower network bandwidth and/or a much weaker processor and GPU combination (Intel's case applies here), you can halve the frame rate to 30, at the cost of higher latency spikes. Using a software-based encoder implementation (say, libx264) will generally result in higher quality at comparable settings, but with a much higher system load.
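One reason a lower frame rate costs latency is simple arithmetic: a change on screen can wait up to one full frame interval before the next frame is even captured. A minimal sketch of that calculation (frame_interval_ms is a hypothetical helper; it deliberately ignores encode, network and decode time):

```python
def frame_interval_ms(fps: float) -> float:
    """Time between two consecutive captured frames, in milliseconds."""
    return 1000.0 / fps


if __name__ == "__main__":
    for fps in (60, 30):
        print(f"{fps} fps -> one frame every {frame_interval_ms(fps):.1f} ms")
```

At 60 fps the interval is about 16.7 ms, at 30 fps about 33.3 ms, which is already a third of the 100 ms budget before any encoding has happened.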
If you want to tweak this setup even further, you can pipe the encoder directly into the player on the same machine instead of going over the network, using mplayer's -quiet option to suppress its status line so you can see what the encoder is up to:
ffmpeg <capture and encoder options as above> -f mpegts - | mplayer -quiet -benchmark -
Have fun out there :-)
PS: You may refer to netcat's advanced usage options here.