BeagleBone Black PRU: Hello World

This article presents what is meant to be the simplest possible example of using the PRU (programmable realtime unit) on the BeagleBone Black single-board computer. The example program has no inputs and no outputs; it does nothing other than delay for a fixed duration then exit.

What is the BeagleBone Black? It’s a tiny single-board computer that’s usually used to run Linux. Surprisingly capable for the US$45 price tag, it’s got lots of general-purpose IO, HDMI video output and on-board flash storage. Lots of specs and additional information can be had from the manufacturer’s website.
What’s a PRU? The PRU (programmable realtime unit, also known as the PRU-ICSS or PRUSSv2) is a subsystem of the AM335x ARM Cortex-A8 processor on the ‘bone. It is an independent CPU with its own memory and instruction set. It can run its own program, completely independent of the Linux kernel on the main CPU. It’s fast (200MHz clock), all the instructions take known constant times and you have it all to yourself, so you can use it for things that require a hard realtime response. The ‘bone has two PRUs.

If you are just getting started with the BeagleBone Black, the information here is not going to be terribly helpful — this is not a fun or instructive first project.

However, if you’re running into a situation where just banging bits into the GPIO registers using memory-mapped IO is too slow or too imprecise for the particular hardware you’re trying to talk to, then using the PRU might solve your problem without requiring extra off-board hardware.

There’s lots of great information about using the PRU available on the ‘net — please see the References section, below. However, in a lot of cases it’s overload. Many of the examples are doing something fairly complicated. Some of the information you need is buried in a 20MB processor reference document, or in the middle of a long forum thread.

The goals here are:

Make simple stuff simple. I want to have all the information I need to start working with the PRU, organized in a reasonable way, in one place. And who knows, maybe it’ll help somebody else.
Get a development environment set up and in a known-working state. Working with the PRU is hard enough without having to worry about problems with your tools or fundamental approach.
See a result (even if it isn’t one that’s practical or especially exciting). I’ve previously mentioned “Fail early, fail often.” It boils down to wanting to learn about problems early before I invest lots of time in an approach that’ll never work. That means being able to see something real happen at lots of intermediate steps on the way to a goal.

I’m specifically and deliberately avoiding dealing with any of the device-tree stuff you’ll need to worry about if you want to do real work with the PRU. (But I do intend to cover that in a future post.)

Advantages and Disadvantages

One of the main points I’d like to make up front is that using the PRU — even in a simple way — is kind of a butt-pain. If you can avoid it, you probably should (unless you just want to learn, in which case: go for it!). For most projects, the PRU is overkill. If you’re just trying to flash an LED or measure the temperature or something, there are much easier ways to do it. If you’re thinking about adding an off-board microcontroller (or FPGA) because you can’t get tight enough timing otherwise, using the PRU just might be a better alternative.

What do you gain by using the PRU?

speed — You get an entire processor to dedicate to a single task; it isn’t just getting time slices in a preemptive multitasking system.
predictability — The PRU instruction set is relatively small, every instruction takes a known, fixed number of clock cycles, and you’re not going to swap context or wait for memory paging in the middle of your delicately-tuned timing loop.
offloading — If you can do something CPU-intensive on the PRU, it’s not going to be bogging down the Linux userland and making interactive performance sluggish.
fast IO — Many of the pins have special IO modes for direct access by the PRU. These work much faster than memory-mapped IO from the main processor.

What does using the PRU cost you?

no GCC — Update: TI has made available a PRU C compiler as part of their “Code Composer Studio” (CCS) IDE. However, it is not free as in speech, only free as in beer for limited use, requires registration, and is ITAR regulated. The registration form expects you to be a company or university, not an individual. They also want to know details of what you’re using it for. It is possible to download just the TI PRU tools, without having to download CCS or use the CCS GUI. There’s a great writeup of how to actually use the TI compiler from fabien le mentec at EmbeddedRelated.com.
extra development overhead — You’ll need to install some additional tools, headers and libraries, above and beyond what you need for normal userland software development.
hardware resources — The ‘bone only has two PRUs, and they can’t be virtualized (at least not without sacrificing all the advantages of using a PRU in the first place).
security — To load code into the PRU from user space, you need to be root. While you can certainly develop secure applications that use the PRU, it requires careful thought. There’s no kernel enforcing the rules in PRU-land.
learning curve — Presumably, the whole reason you’re reading this page, nyet?

Assumptions

I’m going to assume that the reader is an experienced Linux user and C programmer.

The Ångström distro gives me hives. Just trying to figure out how to set a static IP with connman made me want to punch a baby. Busybox is a great piece of software, but if I’m using it, it’s because I have to, not because I want to. So: I’ve put Debian on my ‘bone, and some of my instructions are Debian-specific (like using apt-get to install things). If you use something else, that’s fine, but you’ll have to adapt what’s here to your particular system.

All of my examples involve developing natively on the BeagleBone Black itself. Cross-compiling on your big desktop system is certainly an option, but it adds an extra layer of complexity and the ‘bone is fast enough that it isn’t really necessary.

You’ll need root. You may need physical access to the ‘bone (to power cycle it if you screw up badly enough to wedge it — unlikely but possible).

Finally, I assume that you have a basic native development environment on the ‘bone, you can compile and link C programs and that you have things like “make” and a reasonable text editor handy.

Development Tools

In order to write programs that use the PRU, you’ll need to install the pasm assembler, the libprussdrv library and the header files for that library. While there isn’t (at the time of this writing) a convenient Debian package you can install with apt-get, building them from source isn’t terribly hard:

1. Get a copy of the am335x_pru_package software from the GitHub repository. There are various ways to do this, but the simplest is probably to use wget from the command line to download the package tree as a .zip file from https://github.com/beagleboard/am335x_pru_package/archive/master.zip. Note: If wget complains about problems checking the site certificate, it is likely because you don’t have the root certificates installed. You can either use the --no-check-certificate option or (much better) just: sudo apt-get install ca-certificates
2. If you downloaded the archive (as opposed to cloning the git tree), unpack it somewhere under your home directory.
3. Make a new directory /usr/include/pruss/ and copy the files prussdrv.h and pruss_intc_mapping.h into it (from am335x_pru_package-master/pru_sw/app_loader/include). Check the permissions; if you used the .zip file, these headers will likely have the execute bits on. It doesn’t really hurt anything, but is certainly not what you want.
4. Change directory to am335x_pru_package-master/pru_sw/app_loader/interface then run: CROSS_COMPILE= make (note the space between the = and the command).
5. The previous step should have created four files in am335x_pru_package-master/pru_sw/app_loader/lib: libprussdrv.a, libprussdrvd.a, libprussdrvd.so and libprussdrv.so. Copy these all to /usr/lib then run ldconfig.
6. Change directory to am335x_pru_package-master/pru_sw/utils/pasm_source then run source ./linuxbuild (or sh ./linuxbuild) to create a pasm executable one directory level up. The leading ./ matters: a shell running in POSIX mode, such as bash invoked as sh, does not search the current directory for a sourced file, so a bare source linuxbuild can fail with “file not found”. Copy the pasm executable to /usr/bin and make sure you can run it. If you invoke it with no arguments, you should get a usage statement.

There are certainly other places you can install stuff if you feel strongly about it. Using /usr/bin, /usr/lib and /usr/include is simple and works.

I suggest you keep the tree where you unpacked the am335x_pru_package; it contains documentation and example code you’ll probably want to look at later.

Enable the PRU

Before we can use the PRU, we need to enable it, in much the same manner as we would for a UART or a GPIO pin. This means adding it to the device tree. Fortunately, we can use a device-tree fragment that’s already been created for us.

Example command (to be run as root):

echo BB-BONE-PRU-01 >/sys/devices/bone_capemgr.9/slots

This should complete without any error messages. If you cat(1) the slots file, you should see an entry like:

8: ff:P-O-L Override Board Name,00A0,Override Manuf,BB-BONE-PRU-01

(Your result may show a different slot number; I have a couple other unrelated override boards loaded on my ‘bone.)

The output of lsmod should show the uio_pruss module loaded. (This module is a tiny driver that lets us talk to the PRU from user-space. It’s used by the libprussdrv library we installed earlier.)

The dmesg output should show something like:

bone-capemgr bone_capemgr.9: part_number 'BB-BONE-PRU-01', version 'N/A'
bone-capemgr bone_capemgr.9: slot #8: generic override
bone-capemgr bone_capemgr.9: bone: Using override eeprom data at slot 8
bone-capemgr bone_capemgr.9: slot #8: 'Override Board Name,00A0,Override Manuf,BB-BONE-PRU-01'
bone-capemgr bone_capemgr.9: slot #8: Requesting part number/version based 'BB-BONE-PRU-01-00A0.dtbo
bone-capemgr bone_capemgr.9: slot #8: Requesting firmware 'BB-BONE-PRU-01-00A0.dtbo' for board-name 'Override Board Name', version '00A0'
bone-capemgr bone_capemgr.9: slot #8: dtbo 'BB-BONE-PRU-01-00A0.dtbo' loaded; converting to live tree
bone-capemgr bone_capemgr.9: slot #8: #2 overlays
omap_hwmod: pruss: failed to hardreset
bone-capemgr bone_capemgr.9: slot #8: Applied #2 overlays.

I’m not sure what’s going on with the “failed to hardreset” message, but stuff seems to work anyway…

Create the PRU Program

My goal for the PRU equivalent of the “Hello, world” program was to create the simplest possible program that still produces some kind of visible evidence that it’s working. (That excludes a program that just immediately halts; how would you know it ran at all?)

Most of the options (like manipulating IO pins or communicating back to the host processor) involve extra complexity that I didn’t want for this initial attempt — mostly, dealing with the device tree in a non-trivial way. I settled on creating a program that would busy-loop for a fixed length of time, then exit.

You can download all my example code in a single archive, but here is the PRU assembly source if you want to take a quick look in your browser: example.p

Some additional explanation of the assembly source:

The source is written for the TI-provided “pasm” macro assembler. The authoritative documentation for this is in Section 5.3 of the AM335x PRU-ICSS Reference Guide (1.3MB PDF). This document explains the command-line arguments, input syntax and processor instruction set.

The .origin directive tells where the code is loaded into the PRU memory. (The PRU has its own 8KB instruction memory. I don’t know of any reason you’d use another .origin for the entry point of your code.)

The .entrypoint directive is only for the debugger. Changing this won’t change where the PRU starts executing instructions once your program is loaded.

The only thing my example program does that’s even slightly tricky is the way it signals the host processor to indicate that it has finished. The register r31 is magic; if you write a ‘1’ into bit 5, simultaneously with some value into bits 0-3, then an event will be sent to the host (by way of some excessively complicated mapping in the interrupt controller aka INTC).

The reference guide section 5.2.2.2 talks about how r31 works in this regard, while the INTC is discussed in sections 6 and 7. It’s admittedly some heavy reading; for the purposes of getting started this may be something you can take on faith.

The last instruction halts the PRU. Without this, it would happily continue into whatever happened to be in instruction memory after the end of your program.

Create the Host-side Program

To load our program into the PRU, we’ll use a program that runs in Linux user-space on the host CPU, and interfaces with a TI-supplied kernel module and library. A quick link to the example program: example.c

A minimal outline of what the program does:

initialize the library
set up the interrupt we want to use
load example.bin from the filesystem into PRU instruction memory and start the PRU
wait until the PRU asserts the interrupt, telling us the program has completed
clean up

Most of those steps are reducible to one (or a few) calls into library functions that are part of the TI-supplied libprussdrv library.

The library functions are documented in the AM335x PRU Linux Application Loader User Guide document (632KB PDF). Example code is in the pru_sw/example_apps/ directory within the am335x_pru_package software. (However, the examples are doing more complicated things than I am here.)

If you’re really curious about what the library is doing internally, it comes with source — in the pru_sw/app_loader/interface/ directory within the am335x_pru_package stuff.

Build and Test

Update: An archive containing the complete example including a Makefile is available for download: pru-helloworld.tar.gz (4KB).

To assemble example.p into example.bin and to compile and link example.c into example, just run make with the supplied Makefile. There’s nothing complicated going on there.

To run, invoke the example program (as root):

sudo ./example

This should take approximately five seconds to run, and produce output like:

waiting for interrupt from PRU0...
PRU program completed, event number 1

(The first line should appear almost immediately; the second, after the delay.) If it returns right away, stalls forever, or produces some kind of error message, then there’s a problem. The “event number” you see in the second line may differ; it will increase with each run, and reset on reboot.

References

I never would have known about the PRU in the first place, much less ever figured out how to use it, without the help of a number of other information sources. For further reading:

Element 14 Forum Thread: BBB – Working with the PRU-ICSS/PRUSSv2 — extremely valuable, and where I got started. Suffers a little for having to sift through a lot of volume (including some outdated stuff) to find what you need.
AM335x PRU-ICSS Reference Guide (1.3MB PDF) — definitive and comprehensive, but not as helpful as some other sources when you’re first getting started. Includes vital documentation of pasm assembler syntax and command line, and the PRU instruction set, as well as information about register usage and memory mapping.
The am335x_pru_package project on GitHub — contains necessary library code and headers, helpful examples and additional documentation.
The AM335x PRU Linux Application Loader User Guide (632KB PDF) (from the am335x_pru_package) — documents the libprussdrv library API.
The AM335x Technical Reference Guide (20MB PDF) — the full processor documentation, from the TI product page; what you need is in there… somewhere.

Updated 17Feb2014 by DGH: Added link to example archive including Makefile. Thanks to John Cutler for telling me about the omission.

Updated 01Jun2014 by DGH: Added mention of and link to TI PRU C compiler. Thanks again to John Cutler.

Updated 07Jun2014 by DGH: Added link to EmbeddedRelated article with TI compiler tutorial. We have a John Cutler trifecta!

By dhenke

Email: dhenke@mythopoeic.org

16 comments

longqi says:

September 24, 2014 at 01:32

You really help me out. Thank you so much.
For the debug, may be you can check this http://www.digikey.com/product-search/en?x=0&y=0&lang=en&site=us&keywords=FTR-110-03-G-D-06
and http://www.supermicros.org/bbb-blog/jtag-and-emulation-on-the-beaglebone-black
for PRU Debug by using JLink.
Joseph Muziki says:

September 24, 2014 at 08:21

In step 6 of Development tools, how are you supposed to run source linuxbuild. I get the following error:

root@beaglebone:/home/am335x_pru_package-master/pru_sw/utils/pasm_source# source linuxbuild
-sh: source: linuxbuild: file not found
dhenke says:

September 24, 2014 at 12:22

Joseph @#2:
In the directory am335x_pru_package-master/pru_sw/utils/pasm_source, is there a file called “linuxbuild”?

If not, re-download the latest version from GitHub. (I did just now to check, and it has the file.) The URL is: https://github.com/beagleboard/am335x_pru_package/archive/master.zip

If you have a linuxbuild file, check the contents. They should look like:

#!/bin/sh
gcc -Wall -D_UNIX_ pasm.c [ a bunch of other stuff ]

If the contents don’t look like that, then you’ve got a corrupt archive or a bad unzip tool or something of that nature.

If so: Try the command
ls -l /bin/sh

You should see something like:
lrwxrwxrwx 1 root root 4 Feb 29 2012 /bin/sh -> dash*

(If not: are you running Debian on your ‘bone?)

If so: Try the command
which gcc

You should see:
/usr/bin/gcc

If not, you don’t have the compiler installed. The easiest fix for that is probably:
sudo apt-get install build-essential

If none of those are the problem, I’m not sure what’s wrong…
dhenke says:

September 24, 2014 at 12:26

longqi @#1: Thanks for the links! The ability to connect through JTAG and debug interactively is very powerful, and probably essential for doing anything complicated. (The alternative is “caveman” debugging where you use IO pins or memory mailboxes to drop clues about your PRU program’s state.)
longqi says:

October 2, 2014 at 07:02

I have tried jtag already, it can not work. when you are running the PRU program, the registers in the CCS will show you unable to read the value, it will show only after the pru stop running. it’s so bad. I will still use the caveman debugging that can be used anywhere.
Ali says:

November 26, 2014 at 20:15

Extremely helpful, thanks a lot! Do you have any other tutorials about native development (in general, not specifically PRU)?
Arvind says:

December 11, 2014 at 05:01

In step 6 of Development tools, when I run “source linuxbuild” the following error appears-
/usr/bin/ld: can not open output file ../pasm: Permission denied
dhenke says:

December 11, 2014 at 09:58

Arvind at #7: Check that you’re running “source linuxbuild” as a user that has permissions to write to the am335x_pru_package-master/pru_sw/utils/ directory. Also check that there isn’t already a ../pasm file owned by someone else.

That error message means just what it says: ld is trying to create a file named “pasm” one level up from where you ran “source linuxbuild” and can’t because of filesystem permissions.

(Aside: Sorry that your comment didn’t appear right away. This little obscure blog gets titanic levels of spam, and I have to resort to automated tools to keep it at bay. The anti-spam software wasn’t sure about your post, and hurled it into a moderation queue that I had to manually check.)
Thuy says:

May 5, 2015 at 06:31

Hi, I got an error message as below:
-sh: source: linuxbuild: file not found

I check all your steps and it seems that nothing wrong with my linux:
root@beaglebone:~/Desktop/am335x_pru_package-master/pru_sw/utils/pasm_source# source linuxbuild
-sh: source: linuxbuild: file not found
root@beaglebone:~/Desktop/am335x_pru_package-master/pru_sw/utils/pasm_source# source linuxbuild
-sh: source: linuxbuild: file not found
root@beaglebone:~/Desktop/am335x_pru_package-master/pru_sw/utils/pasm_source# source linuxbuild
-sh: source: linuxbuild: file not found
root@beaglebone:~/Desktop/am335x_pru_package-master/pru_sw/utils/pasm_source# ls -l /bin/sh
lrwxrwxrwx 1 root root 4 Mar 18 2013 /bin/sh -> bash
root@beaglebone:~/Desktop/am335x_pru_package-master/pru_sw/utils/pasm_source# which gcc
/usr/bin/gcc
root@beaglebone:~/Desktop/am335x_pru_package-master/pru_sw/utils/pasm_source# source linuxbuild
-sh: source: linuxbuild: file not found

Please tell me how I can fix it

Thanks

Thuy
dhenke says:

May 5, 2015 at 13:25

Thuy @#9:
I just re-downloaded the master.zip file from the GitHub link above. My pasm_source/ directory contains the following files:

dosbuild.bat*
.gitignore
linuxbuild*
macbuild*
pasm.c
pasmdbg.h
pasmdot.c
pasmexp.c
pasm.h
pasmmacro.c
pasmop.c
pasmpp.c
pasm.sln
pasmstruct.c
pasm.vcproj
path_utils.c
path_utils.h
pru_ins.h
test/
vsbuild.bat

Do you see the same list of files?

If not, try re-downloading the master.zip file, and make sure you unpack it using “unzip” on your Beaglebone. (If you unpack it elsewhere and move the files, there are lots of ways to lose files or get the paths wrong.)

If you do see that same list of files, please post the output of the following commands:

ls -l linuxbuild
head -1 linuxbuild

It’s a bit of a long shot, but it may be worth trying:

sh ./linuxbuild

(That’s sh space dot slash linuxbuild.)
Marcelo Gobetti says:

May 29, 2015 at 09:17

Hi! Thanks for the post. I was hoping to see how you managed to make the INTC mappings that you mention in your post, however your program does use only the standard EVTOUT0, without any extra mapping done there. I’m trying to understand why I can’t signal the ARM using any event other than EVTOUT0 and EVTOUT1. Moreover, EVTOUT1 signals a lot faster than EVTOUT0 and I’d like to understand that too. I’ve opened a google groups discussion on the topic too: https://groups.google.com/forum/#!topic/beaglebone/m3yJg6UXJPA
So far no one has even clicked on it so it seems the topic is just way too complex :(
dhenke says:

June 4, 2015 at 14:37

Marcelo @#11:
I posted a reply to your discussion on Google Groups, but I’ll summarize for anyone that happens to be reading here:

Your code did this:
MOV R31, pruX_vec_valid | EVTOUT0
MOV R31, pruX_vec_valid | EVTOUT1

My speculation was that this caused the following sequence of events:
* you assert valid and EVTOUT0
* PRU raises the EVTOUT0 interrupt line
* PRU clears R31
* you assert valid and EVTOUT1
* PRU raises the EVTOUT1 interrupt line and lowers EVTOUT0
* PRU clears R31

Thus, the interrupt line associated with EVTOUT0 is only high for (about) a single PRU clock, and the main CPU may or may not “see” it.
John DeLuca says:

September 5, 2015 at 17:42

Douglas, you have taken me from What-da-heck to Hello World regarding these PRUs. Followed your advice top to bottom and it works! More importantly, I know I have the tools and have begun to see the deeper logic. This page has saved me days of frustration!
Cheers and thanks,
-john
Ryan says:

October 10, 2015 at 11:29

Douglas, thanks so much.

-ryan
David R says:

March 8, 2016 at 19:36

Thanks for taking the time to put this documentation together. It got me out of an endless trial and error loop.
Phillip M says:

July 29, 2016 at 08:25

I can confirm that this works for

Linux beaglebone 4.1.28-bone-rt-r22 #1 PREEMPT RT Sun Jul 17 09:01:04 UTC 2016 armv7l GNU/Linux

The system image came from
https://rcn-ee.com/rootfs/bb.org/release/2015-03-01/lxde-4gb/BBB-eMMC-flasher-debian-7.8-lxde-4gb-armhf-2015-03-01-4gb.img.xz
but in order to avoid a prussdev open error, I had to change the kernel from 4.4.9-ti-r25 to 4.1.28-bone-rt-r22 as discussed in
https://groups.google.com/forum/#!category-topic/beagleboard/pru/3iJ-J-x0-Ko
where 4.1.28 was the latest 4.1.x kernel i.e.

apt-get install linux-image-4.1.28-bone-rt-r22
apt-get install linux-firmware-image-4.1.28-bone-rt-r22

The /boot/uEnv.txt contained

uname_r=4.1.28-bone-rt-r22
# -change- Enable access to many IO pins and resources
dtb=am335x-boneblack-emmc-overlay.dtb

Cheers,
-pm