Utilize IP core on APU in software applications
From RCSWiki
PowerPC Instruction Set Extension
Because the user IP core is attached to the Fabric Coprocessor Bus (FCB) of the PPC405 Auxiliary Processor Unit (APU), the registers in the FCM core are not memory-mapped. The only way to move data into or out of the FCM core is therefore through the PowerPC instruction set extension.
Here are some useful instructions:
- ldfcmx Load Double Indexed (Fabric Co-processor Module)
ldfcmx FCM5, rA, rB
An effective address (EA) is calculated by adding an index to a base address, which is formed as follows:
- The contents of register rB are used as the index.
- The contents of register rA are used as the base address. If rA is 0, then 0 is used as the base address.
The two words at EA and EA + 4 are loaded into the FCM register(s) designated by FCM5.
- lqfcmx Load Quad Indexed (Fabric Co-processor Module)
lqfcmx FCM5, rA, rB
An effective address (EA) is calculated by adding an index to a base address, which is formed as follows:
- The contents of register rB are used as the index.
- The contents of register rA are used as the base address. If rA is 0, then 0 is used as the base address.
The four words at EA through EA + 12 are loaded into the FCM register(s) designated by FCM5.
- stdfcmx Store Double Indexed (Fabric Co-processor Module)
stdfcmx FCM5, rA, rB
An effective address (EA) is calculated by adding an index to a base address, which is formed as follows:
- The contents of register rB are used as the index.
- The contents of register rA are used as the base address. If rA is 0, then 0 is used as the base address.
The contents of the FCM register designated by FCM5 are stored as a double word starting at EA. The source register is expected to be 64 bits wide.
The notation FCM5 in this document indicates a five-bit immediate value. Its interpretation is left to the Fabric Coprocessor Module (FCM); typically it is the number of a register on the FCM.
For more detailed information about the PowerPC Instruction Set Extension, refer to the Xilinx PowerPC Instruction Set Extension Guide (Edk71i ppc405 isaext guide.pdf).
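The effective-address rule shared by the three instructions can be sketched as a small host-side model in C. This is only an illustration under stated assumptions: the function names are hypothetical and not part of any Xilinx API, memory is stood in for by a byte array, "rA is 0" is read in the usual PowerPC sense (the rA field of the instruction is 0, so the constant 0 replaces the base), and words are read in big-endian order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side model of the addressing described above.
 * None of these names exist in the Xilinx headers. */

/* EA = (rA|0) + rB: if the rA field is 0, the base is the constant 0
 * instead of a register value; rB always supplies the index. */
uint32_t fcm_effective_address(unsigned ra_field, uint32_t ra_value,
                               uint32_t rb_value)
{
    uint32_t base = (ra_field == 0) ? 0u : ra_value;
    return base + rb_value; /* wraps modulo 2^32, like 32-bit addresses */
}

/* Read one 32-bit word in big-endian byte order from the byte array
 * that stands in for memory. */
uint32_t read_word_be(const uint8_t *mem, uint32_t addr)
{
    return ((uint32_t)mem[addr] << 24) | ((uint32_t)mem[addr + 1] << 16) |
           ((uint32_t)mem[addr + 2] << 8) | (uint32_t)mem[addr + 3];
}

/* Copy nwords consecutive words starting at EA into out[].
 * nwords = 2 models ldfcmx (EA, EA + 4);
 * nwords = 4 models lqfcmx (EA through EA + 12). */
void fcm_load_words(const uint8_t *mem, uint32_t ea, size_t nwords,
                    uint32_t *out)
{
    assert(mem != NULL && out != NULL);
    for (size_t i = 0; i < nwords; i++) {
        out[i] = read_word_be(mem, ea + 4u * (uint32_t)i);
    }
}
```

Calling fcm_load_words with nwords set to 4 on the address returned by fcm_effective_address mirrors what lqfcmx hands to the FCM; what the FCM does with those words is up to the core.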
Sample Stand Alone Application Code
- Here is a simple stand-alone test application for the Double Precision Floating Point Unit on the APU. It loads the values of src[0] and src[1] into the FPU for calculation, stores the result to dst[0], and then prints the result.
///////////////////////////////////////////////////////////////////////////////////
//Name: Double Precision Floating Point Unit on APU stand alone test application //
//Author: Liu Hu //
//Email: lhu8@uncc.edu //
//Last modified Date: 02/22/2009 //
///////////////////////////////////////////////////////////////////////////////////
#include "xbasic_types.h"
#include "xcache_l.h"
#include "xparameters.h"
#include "xpseudo_asm.h"
#include "xutil.h"
#include <math.h>
#include <stdio.h>
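// Assembly mnemonics
// The "memory" clobber tells the compiler that these instructions read or
// write memory it cannot see through the register operands.
#define lqfcmx(rn, base, adr) __asm__ __volatile__(\
"lqfcmx " #rn ",%0,%1\n"\
: : "b" (base), "r" (adr) : "memory"\
)
#define stdfcmx(rn, base, adr) __asm__ __volatile__(\
"stdfcmx " #rn ",%0,%1\n"\
: : "b" (base), "r" (adr) : "memory"\
)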
// Data structures
double src[4] = {11,12,3,20};
double dst[2] = {23,32};
int main(void)
{
    // initialize caches
    XCache_EnableDCache(0x80000001);
    XCache_EnableICache(0x80000001);
    // enable the APU by setting the APU-available bit in the MSR
    mtmsr(XREG_MSR_APU_AVAILABLE);
    printf("Starting FCM load and store test1 \r\n");
    // print contents of src[0], src[1]
    printf(" Value of src: \r\n");
    printf(" src[0] = %f\r\n", src[0]);
    printf(" src[1] = %f\r\n", src[1]);
    lqfcmx(0, src, 0); // load quad word (src[0] and src[1]) into FCM load reg 0
    printf(" Load Ok! \r\n");
    stdfcmx(0, dst, 0); // store double word from FCM store reg 0 to dst[0]
    // print contents of dst[0] after the store
    printf(" Final value of dst: \r\n");
    printf(" dst[0] = %f\r\n", dst[0]);
    return 0;
}
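To see what the quad-word load in the sample actually transfers, the following host-side sketch splits the two doubles src[0] and src[1] into the four 32-bit words that would sit at EA through EA + 12. It assumes IEEE 754 doubles and the big-endian word order of the PPC405 (high word of each double at the lower address); pack_quad_words is a hypothetical helper for illustration, not part of the sample or of any Xilinx library.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: lay out two doubles as the four 32-bit words a
 * big-endian processor would find at EA, EA + 4, EA + 8 and EA + 12.
 * Assumes the host represents double as a 64-bit IEEE 754 value. */
void pack_quad_words(const double src[2], uint32_t words[4])
{
    assert(sizeof(double) == sizeof(uint64_t));
    for (int i = 0; i < 2; i++) {
        uint64_t bits;
        /* memcpy reinterprets the bit pattern without aliasing problems */
        memcpy(&bits, &src[i], sizeof bits);
        words[2 * i]     = (uint32_t)(bits >> 32);         /* high word */
        words[2 * i + 1] = (uint32_t)(bits & 0xFFFFFFFFu); /* low word  */
    }
}
```

With the sample's values 11 and 12, the first word of each pair carries the sign, exponent and upper mantissa bits, and the second word is zero because both values have short mantissas.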
References
- PowerPC Instruction Set Extension Guide
- Xapp717 Accelerated System Performance with the APU Controller and XtremeDSP Slices
- Xilinx DS335 Floating-Point Operator v4.0
