OpenSSL with Crypto Acceleration on BBB

I’m excited to say I’ve got OpenSSL working with crypto acceleration on the BBB under Debian! (At least, I’m pretty sure, based on my OpenSSL tests ;) )

The quick instructions are:

  1. Download R. Nelson’s kernel headers for Debian (since that’s what I was using).
  2. Make the cryptodev kernel module*
  3. Recompile OpenSSL to use CRYPTODEV

*I had to tweak arch/arm/include/asm/timex.h, changing the line
#include <mach/timex.h>
to read:
#include <usr/src/linux-headers-3.8.13-bone26/arch/arm/include/asm/timex.h>

Detailed instructions are on my website:
http://datko.net/2013/10/03/howto_crypto_beaglebone_black/

Josh

Here’s the output of the openssl-tests:

Without cryptodev

debian@arm:~/openssl-1.0.1e/cryptodev-linux-1.6$ time openssl speed -evp aes-128-cbc
Doing aes-128-cbc for 3s on 16 size blocks: 2666405 aes-128-cbc’s in 2.99s
Doing aes-128-cbc for 3s on 64 size blocks: 905987 aes-128-cbc’s in 3.00s
Doing aes-128-cbc for 3s on 256 size blocks: 240811 aes-128-cbc’s in 2.99s
Doing aes-128-cbc for 3s on 1024 size blocks: 61145 aes-128-cbc’s in 3.00s
Doing aes-128-cbc for 3s on 8192 size blocks: 7677 aes-128-cbc’s in 3.00s
OpenSSL 1.0.1e 11 Feb 2013
built on: Mon Mar 18 21:48:12 UTC 2013
options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) blowfish(ptr)
compiler: gcc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wa,--noexecstack -Wall
The ‘numbers’ are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc     14268.39k    19327.72k    20617.93k    20870.83k    20963.33k

real 0m15.114s
user 0m15.031s
sys 0m0.041s

With cryptodev

debian@arm:/usr/local/ssl/bin$ time /usr/local/ssl/bin/openssl speed -evp aes-128-cbc
Doing aes-128-cbc for 3s on 16 size blocks: 28166 aes-128-cbc’s in 0.04s
Doing aes-128-cbc for 3s on 64 size blocks: 22445 aes-128-cbc’s in 0.03s
Doing aes-128-cbc for 3s on 256 size blocks: 29933 aes-128-cbc’s in 0.05s
Doing aes-128-cbc for 3s on 1024 size blocks: 16018 aes-128-cbc’s in 0.04s
Doing aes-128-cbc for 3s on 8192 size blocks: 4861 aes-128-cbc’s in 0.02s
OpenSSL 1.0.1e 11 Feb 2013
built on: Fri Oct 4 01:48:18 UTC 2013
options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr)
compiler: gcc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DHAVE_CRYPTODEV -DUSE_CRYPTDEV_DIGESTS -march=armv7-a -Wa,--noexecstack -DTERMIO -O3 -Wall -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DGHASH_ASM
The ‘numbers’ are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc     11266.40k    47882.67k   153256.96k   410060.80k  1991065.60k

real 0m15.326s
user 0m0.225s
sys 0m5.990s
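
A note on reading these numbers: each figure in the summary table can be reproduced from the "Doing ..." lines as count x block size / reported seconds / 1000. The sketch below (the helper name throughput_k is made up for illustration) shows that the cryptodev figures are divided by the small CPU time openssl reported (0.02s to 0.05s), not by the 3s each test ran for. Dividing by a nominal 3s is only my assumption about what a wall-clock figure would look like; running openssl speed with -elapsed measures that directly.

```shell
#!/bin/sh
# throughput_k COUNT BLOCK_BYTES SECONDS
# Prints throughput in 1000s of bytes per second, the same unit
# the openssl speed summary table uses.
throughput_k() {
    awk -v n="$1" -v b="$2" -v s="$3" \
        'BEGIN { printf "%.2fk\n", n * b / s / 1000 }'
}

# 8192-byte blocks with cryptodev, using the 0.02s openssl reported.
# This reproduces the 1991065.60k entry in the table.
throughput_k 4861 8192 0.02

# Same operation count spread over a nominal 3s run
# (assumption: the test really ran for about 3s of wall-clock time).
throughput_k 4861 8192 3

# 8192-byte blocks without cryptodev, from the first run.
throughput_k 7677 8192 3.00
```

The "real", "user" and "sys" times point the same way: with cryptodev the run still takes about 15s of wall-clock time, but almost none of it is counted as user CPU time.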

Eh, AES is AES: given the key, the encrypted bits sent on the network
must be the same whether generated in software or hardware. The
breakage, if real, is in the infrastructure, such as the RNG for key
generation.

I have not yet tried to get the HWRNG working on the BBB. According
to the TI Crypto page [1], you just need to reconfigure your kernel
and it should add /dev/hwrng support. If anybody has gotten this
working recently, I'd like to know :)

Josh

[1] http://processors.wiki.ti.com/index.php/Cryptography_Users_Guide
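
For anyone trying this, a minimal sketch of how one might check whether a rebuilt kernel actually exposes the device. The function name hwrng_status is hypothetical, and the default path simply assumes the /dev/hwrng node mentioned above.

```shell
#!/bin/sh
# hwrng_status [DEVICE]
# Prints "present" if DEVICE (default /dev/hwrng) exists and is a
# character device, otherwise "absent".
hwrng_status() {
    dev="${1:-/dev/hwrng}"
    if [ -c "$dev" ]; then
        echo "present"
    else
        echo "absent"
    fi
}

# Usage: hwrng_status
# When a driver is registered, the kernel also lists it in
# /sys/class/misc/hw_random/rng_available.
```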

I would be very interested in this topic as well. The application I have in mind for this BBB relies on making varying numbers of SSL connections. In a test today I believe I ran the pool out of entropy: a typical SSL handshake takes about half a second, but some would hang for 2 to 4 seconds before completing. With that performance I can’t use it for the desired application, but getting the hwrng working would likely change everything. This unit runs “headless”, with no keyboard, mouse, or anything else except a network connection. I finally broke down and installed rng-tools to use /dev/urandom to seed /dev/random, but I see that as a quick and dirty workaround. I even tried randomsound to add some entropy, but that didn’t seem to make any difference.
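
To confirm that stalls like these line up with a drained pool, the kernel's entropy estimate can be read from /proc/sys/kernel/random/entropy_avail. A rough sketch follows; the function name entropy_status and the 200-bit threshold are assumptions for illustration, not recommended values.

```shell
#!/bin/sh
# entropy_status [FILE] [THRESHOLD]
# Reads the kernel's entropy estimate (in bits) from FILE and prints
# "low: N bits" when it is below THRESHOLD, otherwise "ok: N bits".
entropy_status() {
    file="${1:-/proc/sys/kernel/random/entropy_avail}"
    threshold="${2:-200}"   # hypothetical cutoff, tune for your workload
    avail=$(cat "$file") || return 2
    if [ "$avail" -lt "$threshold" ]; then
        echo "low: $avail bits"
    else
        echo "ok: $avail bits"
    fi
}

# Usage: run entropy_status in a loop while the SSL test is running
# and see whether "low" readings coincide with the slow handshakes.
```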

Hmm, looks like:

apt-get install haveged

was all I needed. While it isn’t the hwrng, it is a nice entropy generator.

http://freecode.com/projects/haveged

A hardware RNG is a nice thing to have, but the idea that /dev/urandom is
somehow 'bad' or 'insecure' is completely flawed: it is a PRNG and is designed
to generate an infinite random stream that is indistinguishable from a 'true'
RNG, once it is seeded properly.

I don't know the details of how Angstrom, etc. do this, but typically Linux
seeds the PRNG from /dev/random, mixing in system state. So as long as you
are not getting any errors while seeding the PRNG, using /dev/urandom is
perfectly fine.
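
As a small illustration of drawing random bytes from /dev/urandom without blocking, here is a sketch that prints N bytes as hex. The helper name random_hex is made up, and it assumes a head that supports -c, as the GNU one on Debian does.

```shell
#!/bin/sh
# random_hex N
# Reads N bytes from /dev/urandom and prints them as 2*N lowercase
# hex digits followed by a newline.
random_hex() {
    head -c "$1" /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
}

# Usage: random_hex 16   (prints 32 hex digits, e.g. for a 128-bit value)
```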

Thanks George!

The haveged project looks interesting, I'll have to check it out.

In case anyone wants to help pick this up, these are my notes so far:

1. There are various documents that mention the HW_RNG.[1][2]

2. I've tried to configure the TRNG in kernel 3.8.13-bone26 and I did
not see Devices->Char Devices->Hardware Random (etc..)->OMAP4 Random

3. OMAP RNG appears to depend on CONFIG_HW_RANDOM && (
CONFIG_ARCH_OMAP16XX || CONFIG_ARCH_OMAP2PLUS ).[3] I believe both
CONFIG_HW_RANDOM and CONFIG_ARCH_OMAP2PLUS are set.

4. There were some patches back in August[4] on this. But it wasn't
clear to me whether they are in 3.8.13 or not.

I was planning on the following approach:

1. Try building the kernel to see if omap-rng builds a .o or a .ko
2. Try grabbing a later kernel to see if it's there.
3. Try applying the patches directly.
4. Re-visit my assumption that CONFIG_ARCH_OMAP2PLUS is enabled,
because if it is not, then based on the logic above I shouldn't see
the OMAP RNG option.
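
Steps 1 and 4 can be roughly checked against a kernel .config before rebuilding anything. The sketch below is only that: the function names are hypothetical, the dependency expression is the one quoted from [3], and CONFIG_HW_RANDOM_OMAP is the symbol name as listed on that page.

```shell
#!/bin/sh
# config_value CONFIG_FILE SYMBOL
# Prints y or m if SYMBOL is set that way in CONFIG_FILE, otherwise n.
config_value() {
    val=$(sed -n "s/^$2=\([ym]\)\$/\1/p" "$1" | head -n 1)
    echo "${val:-n}"
}

# omap_rng_report CONFIG_FILE
# Line 1: whether HW_RANDOM && (ARCH_OMAP16XX || ARCH_OMAP2PLUS) holds.
# Line 2: whether the OMAP RNG driver is built in, a module, or off.
omap_rng_report() {
    cfg="$1"
    hw=$(config_value "$cfg" CONFIG_HW_RANDOM)
    o16=$(config_value "$cfg" CONFIG_ARCH_OMAP16XX)
    o2p=$(config_value "$cfg" CONFIG_ARCH_OMAP2PLUS)
    rng=$(config_value "$cfg" CONFIG_HW_RANDOM_OMAP)

    if [ "$hw" != n ] && { [ "$o16" != n ] || [ "$o2p" != n ]; }; then
        echo "dependencies met"
    else
        echo "dependencies not met"
    fi

    case "$rng" in
        y) echo "omap-rng built in (.o)" ;;
        m) echo "omap-rng built as a module (.ko)" ;;
        *) echo "omap-rng not enabled" ;;
    esac
}

# Usage (assumes the running kernel's config is installed under /boot):
#   omap_rng_report "/boot/config-$(uname -r)"
```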

If I make any breakthroughs, I'll keep the list posted. If others are
working on this, please do the same!

Josh

Footnotes:

[1] http://www.ti.com/lit/wp/spry198/spry198.pdf

[2] http://processors.wiki.ti.com/index.php/Cryptography_Users_Guide

[3] http://cateee.net/lkddb/web-lkddb/HW_RANDOM_OMAP.html

[4] http://comments.gmane.org/gmane.linux.ports.arm.omap/102520