R or Python cross-compiler for BeagleBone

I want to implement machine learning algorithms on the BeagleBone (ARM). So I need to build a cross-compiler for R (preferably) or Python on my system (Ubuntu / Windows). I am very new to this topic of “cross-compilers”. Stepwise instructions to build the cross-compiler and to run a sample code using the cross-compiler will be much appreciated. Thanks in advance :)

On Tue, 16 Feb 2016 05:37:30 -0800 (PST),
fiem.arghyaxyz@gmail.com declaimed the
following:

I want to implement machine learning algorithms on the BeagleBone (ARM). So I
need to build a cross-compiler for R (preferably) or Python on my
system (Ubuntu / Windows). I am very new to this topic of "cross-compilers".
Stepwise instructions to *build the cross-compiler* and *run a sample
code using the cross-compiler* will be much appreciated. Thanks in advance :)

  The first thing to understand is that normally Python is a byte-code
interpreted language, which "compiles" the source to byte-code when the
program is executed. The only need for a "cross-compiler" might be if you
are trying to build /extension libraries/ using the C language that have to
run on a different target.
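
If you do eventually need to cross-compile C code from Ubuntu, the basic loop is short. A minimal sketch (the toolchain package name is Ubuntu's armhf cross-compiler; the debian user and 192.168.7.2 address assume a stock BBB Debian image on the USB network):

sudo apt-get install gcc-arm-linux-gnueabihf   # armhf cross toolchain
printf 'int main(void) { return 0; }\n' > hello.c
arm-linux-gnueabihf-gcc -o hello hello.c       # compile for ARM on x86
file hello                                     # should report: ELF 32-bit LSB executable, ARM
scp hello debian@192.168.7.2:~/                # copy the binary to the board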

  Python, itself, is already installed on the BBB.

  I don't have enough experience with the R system, but suspect it is
similar -- most scripts are in source form, so one would just need to get
the R interpreter/run-time installed on the BB (and that might be available
via package manager). It again would be extensions written in C or such
that might need compiling for a target.

Thank you for replying.

The BBB has limited resources to build and install R packages (R-base and some ML packages, or any other packages for that matter). Also, I would like to test the build before actually porting the binaries to the BBB. Since debugging and testing may also require more resources than are available on the BeagleBone Black, cross-compilation can be less involved and less prone to errors than native compilation.

Since debugging and testing may also require more resources than are available on the BeagleBone Black, cross-compilation can be less involved and less prone to errors than native compilation.

This is incorrect. Cross-compiling anything introduces added complexity, and thus is more prone to errors. The main problem with compiling natively is the time it takes for the BeagleBone hardware to finish compiling from source. Another problem is the minimal amount of RAM on the system, which can be mitigated somewhat by using a USB hard drive as a swap device.

Anyway, I think I know what you’re saying. So know that when compiling “natively” you’re not limited to using a BeagleBone only. “Natively” in this context means anything that can run, or is running, the same OS with the same ABI, which in this case is armhf. In other words . . . most / all armv7 systems, such as the Raspberry Pi 2, Wandboard, BeagleBoard-X15, etc., can be made to natively compile a binary for the BeagleBone Black. This works great, but it does take some time to set up, and does cost additional money (to purchase another board).

Additionally, I personally write C applications on x86 / i386 a lot. These applications will, 99% of the time, directly port to an ARM system with nothing more than copying and compiling the source on that system. C is a true compiled language, whereas Python and R (as far as I know) are byte-code / interpreted languages. The point here is that if C can be used in this manner, surely Python and R can too.

What does this mean? It means that once you have a run-time for an interpreted language working on a platform, code written on one system should run just fine on another, assuming there are no major version barriers interfering. As Dennis mentioned above, Debian, regardless of ABI, is going to have a Python interpreter, period, as Python, C, and Ruby are all needed in order to build, install, and otherwise set up a new Debian (and very probably any Linux) system.
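
As a trivial sketch of that portability (the debian user and 192.168.7.2 address are assumptions based on a stock BBB image):

# on the x86 machine: write a script and copy it over unchanged
echo 'import platform; print("running on " + platform.machine())' > hello.py
scp hello.py debian@192.168.7.2:~/

# then on the BeagleBone: same source, different architecture
python hello.py    # prints: running on armv7l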

So this means you already have a Python run-time available at minimum through the APT package manager. Probably R too but let’s have a look . . . googling “debian R runtime”

https://packages.debian.org/wheezy/r-base-core
william@beaglebone:~$ apt-cache search r-base-core
r-base-core - GNU R core of statistical computation and graphics system
r-base-core-dbg - GNU R debug symbols for statistical comp. language and environment
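
So natively installing the R run-time should really just be (a sketch, assuming the stock Debian image with apt configured):

sudo apt-get update
sudo apt-get install r-base-core
Rscript -e 'R.version.string'    # sanity check: prints the installed R version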

So you need to cross-compile what, and why?

@William
Thanks for clarifying in detail. Yes, I have checked that Python code
written on x86 can be run on an ARM system by just copying the code.

There are certain barriers, as you pointed out, like slow compiling and limited RAM size.
In my algorithm, I have to deal with a continuous data stream, so the limited RAM size may
affect the computation heavily. Also, fast computation on large datasets is badly needed
in my algorithm. That is why I was thinking about cross-compiling.
Also, as you suggested using a swap drive (the virtual memory concept),
can you elaborate on how to implement it?

"cross-compiling" isn't going to help you there, as you still need to
run your "algorithm" on the ARM system right???

In the end, native vs cross building should give you the same final
binary (assuming your cross/native compilers are the same version, etc.).

Regards,

@William
Thanks for clarifying in detail. Yes, I have checked that Python code
written on x86 can be run on an ARM system by just copying the code.

OK, so what you need to understand is that a run-time is just an abstraction layer. It’s this abstraction layer that handles all the system-level gory details. So once it is in place, everything else being equal, it’ll “just work”. As Robert alluded to, however, and I think I also mentioned in my last post, compiler / run-time versions need to be the same, or very close, for the best results.
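
One quick sanity check along those lines (assuming Python on both ends) is to compare interpreter versions on both machines before moving code:

# run on both the x86 box and the BeagleBone; the major.minor versions
# should match, or at least be close, before expecting identical behavior
python -c 'import sys; print(sys.version)'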

There are certain barriers, as you pointed out, like slow compiling and limited RAM size.
In my algorithm, I have to deal with a continuous data stream, so the limited RAM size may
affect the computation heavily. Also, fast computation on large datasets is badly needed
in my algorithm. That is why I was thinking about cross-compiling.

So, compilers, toolchains, etc. are getting very complex nowadays. Sometimes, just setting up a cross-compile system for a certain situation can take a considerable amount of time. So one needs to weigh this possibility against how long it might actually take to compile natively. If neither possibility is acceptable, then one should look into buying a “bigger and better” system to use as a build system for the BeagleBone. It’s been done, and is what is referred to as solving a problem by “throwing” money at it. A perfectly acceptable practice, for some.

Also, as you suggested using a swap drive (the virtual memory concept),
can you elaborate on how to implement it?

This is something I would have to write a guide for and put up on my blog site, which I might actually do soon. The problem here is that because this is not a PC-type computer system, the guides for that on the internet will not work for this situation. These guides can be modified . . . but it can be complex. Better for me to write and test a guide I know will work. One thing to keep in mind, however: the USB drive has to have its own power supply, as the BeagleBone will not supply enough power for the drive, especially at “spin up”, where some drives can draw as much as 3A . . .
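
In the meantime, the rough shape of it is the standard mkswap / swapon dance. A sketch only, assuming the powered USB drive enumerates as /dev/sda with a spare partition /dev/sda1 (check with lsblk first; mkswap destroys that partition’s contents):

lsblk                        # confirm which device node the USB drive got
sudo mkswap /dev/sda1        # format the partition as swap (destructive)
sudo swapon -p 5 /dev/sda1   # enable it at priority 5
swapon -s                    # verify the new swap space is active
free -m                      # total swap should now include the drive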

Another thing to point out, which I forgot to mention in my last post: depending on what you’re doing with said code, Python can be dog slow when compared to many other languages in several situations. Performance-wise, Python is way down the list. I only mention this as it seems to me that you have a performance constraint for your situation. R, I would assume, will not be much better, though possibly better than Python. Honestly, 99% of the other languages out there will outperform Python in many cases.

So, if you really need performance, you’re probably going to want to use another language. But to put things in order . . .

  1. Assembly
  2. C
  3. Sometimes C++, sometimes JavaScript (V8 engine).

Seriously, JavaScript in the context of Google’s V8 engine (Nodejs) really is that fast. However, one thing I have noticed personally: Nodejs, when written to run as a command-line executable, does have noticeable latency. Meaning, if you need a tool that executes once when run and then exits, Nodejs probably is not the best tool for that type of job. If the code is run once and runs for long periods of time, however . . . Nodejs will perform really well. Still not as fast as C, and depending on what the code is doing, perhaps not as fast as C++ either.

Anyway, C is probably best known as the universal embedded language of choice for many embedded developers. So perhaps you may want to consider which language you use. C++ is also making its way into many embedded projects. On a personal note, however, I think C++ is a great language, especially when it comes to generics and templates. However, because of all this “coolness” C++ brings, it also brings complexity with it. So for me personally, I tend to stick with straight C whenever possible. Which has worked out to be always, for me.

Thank you @William and @Robert
The thing is pretty clear to me now.
Since the code has to run on the BeagleBone, it doesn’t matter
where the binary file was created.

@William, please do share your blog site when you finish
writing the guide. That will be a great help.

@William, please do share your blog site when you finish
writing the guide. That will be a great help.

http://www.embeddedhobbyist.com/ is my blog. I’ve actually been thinking about adding a different kind of blog for the last couple of days, but this one should not take too long. A couple of hours, maybe. Perhaps I’ll get around to that tonight. We’ll see. No promises.

@William, I love to work with C.
But the thing is, I want to implement machine learning algorithms on the BeagleBone.
So it is much easier to implement them with Python or R, as there are some
dedicated packages to analyze the data.

Going through your blog site. Nice job @William. It will become a good guide for my future work.

The benefit of Python is the easy interface to C code, which means that when your algorithm is slow, you can reimplement that code in C but keep the benefit of writing in a more feature-rich language like Python.
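
For example, here is a minimal ctypes sketch of that workflow (fast.c, libfast.so, and square are made-up names for illustration):

# build a tiny C shared library . . .
cat > fast.c <<'EOF'
double square(double x) { return x * x; }
EOF
gcc -shared -fPIC -o libfast.so fast.c

# . . . then call it from Python with no extension boilerplate at all
python - <<'EOF'
import ctypes
lib = ctypes.CDLL("./libfast.so")        # load the shared library
lib.square.restype = ctypes.c_double     # declare the C signature
lib.square.argtypes = [ctypes.c_double]
print(lib.square(3.0))                   # 9.0
EOF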

Regards,
John

First, “the interface” is slow: meaning the run-time. But if you have to “reinvent” a Python API / BCL because it is slow, and do that in C, why not just use C to begin with?

Aside from Python forcing coding guidelines on its users, it’s a fine language. What it is not, though, is a good language when you need pure all-out performance. But not all code needs to be fast. Sometimes fast enough works. Probably even most of the time.

The OP said he has algorithms already written in Python, so I was pointing out that if he needs to improve performance, it is easy to do. Anyway, you know that most build systems like OpenEmbedded, Yocto, etc. all use Python? If it was so slow, they wouldn’t be using it. Now I’m wondering why they just don’t write the whole thing in C ;)

Regards,
John

The OP said he has algorithms already written in Python, so I was pointing out that if he needs to improve performance, it is easy to do. Anyway, you know that most build systems like OpenEmbedded, Yocto, etc. all use Python? If it was so slow, they wouldn’t be using it. Now I’m wondering why they just don’t write the whole thing in C ;)

Regards,

John

I kind of like the idea behind interfaces in C for Python . . . I just don’t like Python. So John, if you read my posts above, you’d know that I know that Python is a requirement for building Linux systems. Python, Perl, and C are all required. I said “Ruby” above, but meant Perl. In my mind they’re both equivalent (garbage, but useful, I guess), hence my mistake. Anyway . . .

There are machine learning libraries for C . . . No idea if a specific one out there would fit the bill for the OP though. Not my job to look into all that.

Quick worklog, but do note that I used a 4GB USB thumb drive to make things easier (so as not to have to format an existing USB hard drive): http://pastebin.com/1zEFAYM3

Do note that the only thing you need to add to fstab is:

UUID=1808bf31-3313-44e6-b063-2d4d81f02e9e none swap sw,pri=5 0 0

But do keep in mind your UUID will be different.

Anyway, I’ll probably write up a blog post on this to have a link for additional queries. But it’ll probably take me at minimum a few hours to get around to it.

OK, so on further checking, the listing in fstab does not seem to work. I’ll have to look into it when I get some free time.

It’s in relation to using a UUID instead of /dev/sdX. I’ve never had good luck using UUIDs, probably because I do not really know much in the way of using them properly.

Also, I used the drive device directly when I probably should have targeted the partition instead.
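
For anyone following along, the variant that sidesteps the UUID question entirely is to list the partition by device node. A sketch, assuming the drive still enumerates as /dev/sda1 (device names can change between boots, which is the very problem UUIDs are meant to solve):

sudo blkid /dev/sda1    # the UUID must come from the swap partition itself, not the whole drive

# /etc/fstab entry by device node instead of UUID:
/dev/sda1 none swap sw,pri=5 0 0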