H264 HD Encoding at 30FPS

Hello,
There has been a lot of talk about H.264/MPEG-4 decoding on the BeagleBoard, but
I am struggling to find any information on the encoding side. I am
trying to encode a live webcam stream with gstreamer-ti on a
BeagleBoard-xM, using the precompiled packages from the Angstrom distribution.
Everything works well, but only at resolutions up to 320x240. At anything
higher, my CPU usage caps at 99%. Is that normal, or in other
words, is this all the BeagleBoard-xM can do? I am clocking the CPU at
1 GHz. I am not sure what DSP frequency I am running, but according to
dmaiperf the DSP is not the bottleneck. Here are the commands I am using:

if [ $# -lt 6 ]
then
    # $3 is used as the encoder bit rate (bitRate=$3 below).
    echo "Usage: h264_server_command.sh <width> <height> <encoding <destination_ip> <framerate> <deviceID>"
else
    # Capture YUY2 from the webcam, convert to UYVY, encode on the DSP, discard the output.
    gst-launch v4l2src always-copy=FALSE ! \
        "video/x-raw-yuv,format=(fourcc)YUY2,width=$1,height=$2,framerate=$5/1" ! \
        videorate ! ffmpegcolorspace ! \
        dmaiperf print-arm-load=true engine-name=codecServer ! \
        TIVidenc1 codecName=h264enc engineName=codecServer frameRate=$5 \
            resolution=$1x$2 genTimeStamps=false encodingPreset=3 \
            iColorSpace=UYVY bitRate=$3 ! \
        filesink location=/dev/null
fi

The dmaiperf output for 320x240 resolution:
bps: 4500567; fps: 29; CPU: 24; DSP: 15;

For 640x480:
fps: 27; CPU: 100; DSP: 15;

Is the DSP figure in the same unit as the CPU figure? *ips? It seems the DSP is doing something, but is it the task you want?

Hello, I believe the units of CPU and DSP are both % utilization.

In addition, omitting the TIVidenc1 element from the pipeline above
drops the CPU usage to 15% for 640x480 and 3% for 320x240. That gives the
baseline usage, so the rest of the CPU usage should come straight
from the encoder stage.

Hello again,
I think I found the problem, thanks to Chris, who PM'ed me earlier. The
problem was the ffmpegcolorspace conversion, not the H.264 encoder.
Omitting the encoder and adding explicit caps to force the conversion did the
trick: the CPU usage went up to 100%. Now the issue is how to remove
the need for the color conversion. The webcam (Logitech) can only
stream YUY2 and MJPEG, while TIVidenc1 can only take UYVY, Y8C8,
NV16, and NV12. Does anyone know of any other solutions I might want
to consider?
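
One way to sketch the isolation test described above, assuming the same elements as the original script: drop TIVidenc1 and put explicit UYVY caps after ffmpegcolorspace so the conversion is forced to run. The helper name build_conversion_test is hypothetical, and it only prints the command line rather than launching it; the caps strings follow the style already used in the script.

```shell
#!/bin/sh
# Sketch only: prints a gst-launch command line instead of running it,
# so it can be inspected on a machine without gstreamer-ti installed.

# build_conversion_test WIDTH HEIGHT FPS   (hypothetical helper)
# Same capture and conversion stages as the original script, but with
# TIVidenc1 removed and explicit UYVY caps after ffmpegcolorspace, so
# the YUY2 -> UYVY conversion cannot be negotiated away.
build_conversion_test() {
    printf "gst-launch v4l2src always-copy=FALSE"
    printf " ! 'video/x-raw-yuv,format=(fourcc)YUY2,width=%s,height=%s,framerate=%s/1'" "$1" "$2" "$3"
    printf " ! videorate ! ffmpegcolorspace"
    printf " ! 'video/x-raw-yuv,format=(fourcc)UYVY'"
    printf " ! dmaiperf print-arm-load=true engine-name=codecServer"
    printf " ! filesink location=/dev/null\n"
}

# Print the command for a 640x480, 30 fps test.
build_conversion_test 640 480 30
```

On the board, the printed line could be pasted into a shell; if the CPU figure from dmaiperf is already near 100% here, the encoder is not the part to blame.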

The DSP runs at 320 MHz, IIRC. Hardware like DSPs is normally rated in polygons/sec or something similar rather than ips…

-Alex

Mikhail wrote:

> Hello,
> There has been a lot of talk about H.264/MPEG-4 decoding on the BeagleBoard, but
> I am struggling to find any information on the encoding side. I am
> trying to encode a live webcam stream with gstreamer-ti on a
> BeagleBoard-xM, using the precompiled packages from the Angstrom distribution.
> Everything works well, but only at resolutions up to 320x240. At anything
> higher, my CPU usage caps at 99%. Is that normal, or in other
> words, is this all the BeagleBoard-xM can do? I am clocking the CPU at
> 1 GHz. I am not sure what DSP frequency I am running, but according to
> dmaiperf the DSP is not the bottleneck. Here are the commands I am using:

In terms of capability, the BB-xM CPU "could" encode H.264 720p Baseline
profile at around 4-6 Mbit/s.

But that is neither with gstreamer-ti nor with the freely available
codecs, which are capped at SD size.

The free H.264 encoder should be able to encode VGA, but that of
course does not take any gst or other overhead into account.

There is also the issue that USB webcam input into the BB is
not very efficient, and people have struggled to input VGA at
decent frame rates.

So your finding that the CPU is loaded while the DSP is bored
is correct.

Try not encoding at all and just inputting frames to see how much
load that produces...
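
A capture-only pipeline along these lines might look like the sketch below. It assumes the same v4l2src caps and dmaiperf settings as the script at the top of the thread and ends in fakesink, so no conversion or encoding takes place. build_capture_only is a hypothetical helper that prints the command instead of running it.

```shell
#!/bin/sh
# Sketch only: prints the command line, it does not launch GStreamer.

# build_capture_only WIDTH HEIGHT FPS   (hypothetical helper)
# Webcam capture straight into fakesink: whatever CPU load dmaiperf
# reports for this pipeline is the cost of getting frames in over USB.
build_capture_only() {
    printf "gst-launch v4l2src always-copy=FALSE"
    printf " ! 'video/x-raw-yuv,format=(fourcc)YUY2,width=%s,height=%s,framerate=%s/1'" "$1" "$2" "$3"
    printf " ! dmaiperf print-arm-load=true engine-name=codecServer"
    printf " ! fakesink\n"
}

# Print the command for VGA at 30 fps.
build_capture_only 640 480 30
```

Comparing the CPU figure from this run with the full pipeline shows how much of the load comes from capture alone.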

Mikhail wrote:

> Hello again,
> I think I found the problem, thanks to Chris who PM'ed me earlier. The
> problem was the ffmpegcolorspace conversion - not the h264 encoder.

Right, I missed that in your pipeline. As far as I know, ffmpegcolorspace
is a very poorly performing component and has no NEON or even ARM
optimization. FFmpeg is not happy that it still carries its name.

> Omitting the encoder and adding explicit caps to force the conversion did the
> trick: the CPU usage went up to 100%. Now the issue is how to remove
> the need for the color conversion. The webcam (Logitech) can only
> stream YUY2 and MJPEG, while TIVidenc1 can only take UYVY, Y8C8,
> NV16, and NV12. Does anyone know of any other solutions I might want
> to consider?

As said, a more optimized color space conversion could help. Converting
from YUY2 to e.g. UYVY is just shuffling some color components around
and can be done in ~3 ms per VGA frame using NEON. Even if the color space
conversion took a significant share of the frame time, it could still
be done in parallel with the DSP encoding the previous frame, but I guess
teaching gst to do that is not trivial...
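
To illustrate how simple the shuffle is, here is a minimal sketch in shell, not the NEON routine mentioned above. YUY2 packs a pixel pair as Y0 U Y1 V and UYVY packs it as U Y0 V Y1, so swapping every adjacent byte pair converts one into the other, which is what dd conv=swab does. The function name yuy2_to_uyvy is made up for this example, and it is meant for a raw frame dumped to a file, not for a real-time path.

```shell
#!/bin/sh
# yuy2_to_uyvy (hypothetical name): read packed YUY2 bytes on stdin,
# write UYVY bytes on stdout.
# YUY2 byte order per pixel pair: Y0 U  Y1 V
# UYVY byte order per pixel pair: U  Y0 V  Y1
# Swapping each adjacent byte pair therefore maps one onto the other.
yuy2_to_uyvy() {
    dd conv=swab 2>/dev/null
}

# Demo with printable stand-ins for one pixel pair: a=Y0, U, b=Y1, V.
printf 'aUbV' | yuy2_to_uyvy   # prints UaVb
echo
```

A 640x480 YUY2 frame is 640*480*2 = 614400 bytes, so something like `yuy2_to_uyvy < frame.yuy2 > frame.uyvy` (file names hypothetical) would convert one dumped frame.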

No idea, I’m no gst expert …

Hi Mikhail,

I'm very interested in the same topic. Recently, I came across the
same issue with colorspace conversion eating a lot of CPU.
Theoretically, there is a solution to this problem in gstreamer-ti
trunk, as they introduced a new DSP-accelerated element, TIPrepEncBuf,
for color space conversion specifically for TIVidenc1. I haven't actually
checked whether it can convert the formats provided by Logitech cameras,
so no idea here.

If you have the time and energy, you could try to get the following patch
working under OE:

http://patchwork.openembedded.org/patch/3558/

which will get you the ddompe branch containing this new element.

I would really appreciate it if you posted your results here.

Regards,
Maksym.

Hi Mikhail,
I also want to do H.264 video encoding at 30 FPS on an embedded platform. I have the following options: Tegra K1, Tegra X1, BeagleBone, and Raspberry Pi. I am a beginner in this field. Could you kindly tell me whether you were successful in doing H.264 encoding on the BeagleBone?

I have done H.264 and H.265 encoding on a CPU using FFmpeg, and H.264 encoding on a GPU using Nvidia NVENC, but I haven't tried it on an embedded platform.

Kind Regards,
Hassan

30 FPS? But at what video resolution and framerate?

Those other devices have dedicated video hardware; we don't have that
on the Bone.

Regards,

Wait for the X-15?

Yeah, it worked pretty well for my application. I needed low latency and low bandwidth at 640x480 resolution, and the BeagleBoard worked well for this. I don't think the BeagleBone would work well here, since it does not have a DSP. I'm not sure about other platforms and what people have done with them. I was generally impressed with the amount of open source tooling and support TI provides for GStreamer. Good luck!