Homebrew Coding: 2008

Monday, December 15, 2008

Upgrading to devkitARM r24 (on Linux)

The latest version of the devkitARM toolchain was released last week, and it was a biggie. A whole new API for sprites and backgrounds was added to libnds. A new default ARM7 binary was added that automatically handles wifi, sound and sleep mode. The other big addition was a newfangled sound library that adds mod playback and fancy sound effects.

This is my post on the upgrade process for Bunjalloo on Linux (Ubuntu 8.04 in particular). Hey, if you read all this and get through it, then you could probably help out hacking on Bunjalloo too :-)

A summary for the impatient: not without hassles, but the gain is worth the short term pain.

I assume you have a directory devkitPro somewhere. This will contain devkitARM and libnds. You should set an environment variable DEVKITPRO to point to this directory. Something like this will do:

```
export DEVKITPRO=$HOME/devkitpro_r24
mkdir -p $DEVKITPRO
```

I recommend versioning this release at the devkitpro directory level. Release r24 contains some breaking changes and by having the possibility to change between r23 and r24 you may save yourself some headaches. At the very least, it will ensure you don't mix versions, which would be a big no-no.

Download the devkitARM r24 files from the sf.net project page
- devkitARM_r24-i686-linux.tar.bz2
- libnds-src-1.3.1.tar.bz2
- dswifi-src-0.3.5.tar.bz2
- maxmod-src-1.0.1.tar.bz2
- default_arm7-src-20081210.tar.bz2
- nds-examples-20081210.tar.bz2
- libfat-src-1.0.3.tar.bz2
I saved them all in $HOME/Downloads, and that's what I've put in this post. Change that path as you need.

Install the devkitARM toolchain. This provides the compiler and C/C++ standard libraries. I usually install it in a versioned directory, but if you have already named your devkitpro directory "devkitpro_r24" then there's no need.
```
cd $DEVKITPRO
tar xvf ~/Downloads/devkitARM_r24-i686-linux.tar.bz2
mv devkitARM devkitARM_r24
```

Set the DEVKITARM environment variable to point to our installed toolkit
```
export DEVKITARM=$DEVKITPRO/devkitARM_r24
```

Install libnds
!!! CARE !!! these source tars don't have a top level directory, so you need to create one manually for each of them. This is the case for all of the devkitPro tarballs, except the main devkitARM_r24 one. They will splurge their files in the current directory, instead of creating their own top level one. Yuck!
```
mkdir libnds-1.3.1
cd libnds-1.3.1
tar xjf ~/Downloads/libnds-src-1.3.1.tar.bz2
```
Now to compile libnds - if you have DEVKITPRO and DEVKITARM set correctly then this should compile the library successfully.
```
make -j 3 install
```
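
If you'd rather check a tarball before it splurges everywhere, here's a minimal Python sketch that lists the top level entries of an archive. The function names are made up for this post, and it assumes nothing about the devkitPro files beyond them being ordinary tar archives:

```python
import tarfile


def top_level_entries(tar_path):
    """Return the set of first path components found in a tar archive."""
    tops = set()
    with tarfile.open(tar_path) as archive:  # compression is auto-detected
        for name in archive.getnames():
            # Drop "" and "." so "./Makefile" and "Makefile" look the same.
            parts = [p for p in name.split("/") if p not in ("", ".")]
            if parts:
                tops.add(parts[0])
    return tops


def needs_own_directory(tar_path):
    """True if unpacking would drop more than one entry into the cwd."""
    return len(top_level_entries(tar_path)) > 1
```

If needs_own_directory() returns True for a download, do the mkdir and cd dance shown above before untarring it.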

libfat is optional, but recommended. Open up the tar, compile and install it too, if you like. libfat for the nds depends on having libnds installed, so you'll have to do the previous step (installing libnds) first.
```
cd $DEVKITPRO
mkdir libfat-1.0.3
cd libfat-1.0.3
tar xvf ~/Downloads/libfat-src-1.0.3.tar.bz2
make nds-install
```

Now install dswifi 0.3.5 in a similar way - untar the release files, compile and install. dswifi depends on libnds, so you have to do the previous steps before this one.
```
cd $DEVKITPRO
mkdir dswifi-0.3.5
cd dswifi-0.3.5
tar xvf ~/Downloads/dswifi-src-0.3.5.tar.bz2
make -j 3 install
```

Install maxmod. Again, this depends on libnds and won't compile if you have skipped a step.

```
cd $DEVKITPRO
mkdir maxmod-1.0.1
cd maxmod-1.0.1
tar xvf ~/Downloads/maxmod-src-1.0.1.tar.bz2
make -j 3 install-nds
```

Install the new default ARM7 core. This requires dswifi and maxmod; if either is missing then you will get compile or link errors.
```
cd $DEVKITPRO
mkdir default_arm7-20081210
cd default_arm7-20081210
tar xvf ~/Downloads/default_arm7-src-20081210.tar.bz2
make install
```

Try out the nds examples! Now that everything is installed, you can compile and run some examples.

```
cd $DEVKITPRO
mkdir nds-examples-20081210
cd nds-examples-20081210
tar xvf ~/Downloads/nds-examples-20081210.tar.bz2
cd audio/maxmod
make
# test the nds files on your DS!
```

Write your own program :-) This part is a bit trickier!

Now once I had all this installed, I needed to update Bunjalloo to use the new code (Elite DS coming soon!). The first step was to see what would compile without drastic changes. The list of changes required in my code was:

- Remove irqInit() calls - this is now done by libnds before your main is called
- Register name changes: DISPLAY_CR -> REG_DISPCNT, BG0_X0 -> REG_BG0HOFS, BG0_CR -> REG_BG0CNT (and their SUB equivalents, where applicable)
- Function name changes: powerON -> powerOn, powerOFF -> powerOff, touchReadXY() -> touchRead(touch_structure)
- Deprecated header: nds/jtypes.h -> nds/ndstypes.h
- Look-up table changes: COS[angle] -> cosLerp(angle), SIN[angle] -> sinLerp(angle)
- Some #defines have gone: BG_16_COLOR -> BG_COLOR_16, BG_256_COLOR -> BG_COLOR_256, SOUND_8BIT -> SOUND_FORMAT_8BIT
- powerOn/Off now expects a PM_Bits enum value, not an int
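
The straight one-to-one renames in that list are easy to script. Here's a rough Python sketch that applies them to a chunk of source text. It only knows the names listed above, and the migrate() helper is something I've made up for illustration. The look-up tables, touchReadXY() and the irqInit() removal are deliberately left alone, since they need a human to check arguments and call sites:

```python
import re

# Straight one-to-one renames taken from the list above.
RENAMES = {
    "DISPLAY_CR": "REG_DISPCNT",
    "BG0_X0": "REG_BG0HOFS",
    "BG0_CR": "REG_BG0CNT",
    "powerON": "powerOn",
    "powerOFF": "powerOff",
    "nds/jtypes.h": "nds/ndstypes.h",
    "BG_16_COLOR": "BG_COLOR_16",
    "BG_256_COLOR": "BG_COLOR_256",
    "SOUND_8BIT": "SOUND_FORMAT_8BIT",
}

# Match whole names only, so SUB_DISPLAY_CR is not half-renamed by accident.
_PATTERN = re.compile(
    r"(?<![\w/])(" + "|".join(re.escape(k) for k in RENAMES) + r")(?![\w/])"
)


def migrate(source_text):
    """Apply the simple r23 -> r24 renames to a chunk of C/C++ source."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source_text)
```

Run it over a copy of your code and diff the result before trusting it. The SUB equivalents are skipped on purpose, because the list above doesn't spell out their new names.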

Nothing major there... the big breaking changes are on the ARM7 side. Here trying to use your own hand-coded arm7 causes plenty of problems - most of the old inter-processor communications code has been removed from libnds, or has changed completely. Unfortunately, that means it's quite tricky to migrate to the new libnds and use your own ARM7 core. The easiest thing to do here is ignore your ARM7 code completely and use the new default_arm7. There's no one-stop solution, since most IPC code has been built in an ad hoc way, but here are some pointers.

For wifi code, your old wifi init code should be replaced by a single call to Wifi_InitDefault(true/false) - no need to faff about setting up the interrupts and timers. Then you can either now use sockets (if you passed "true", which means connect using the firmware settings) or connect using a detected AP first - all this is documented in the dswifi headers and by the new examples. Pretty neat, and cuts down on maintenance for everyone.

If you have a debug console, then consoleInitDefault() is a bit trickier to update. It has disappeared, pretty much. In most cases consoleDemoInit() will probably suffice. If you are doing anything much more complicated, then you will have to get to grips with the PrintConsole structure. This seems to be quite a powerful new feature, for example it allows printing to both screens from within the same program, but will take a while to get used to. I doubt many people used the old consoleInitDefault in "real" programs, and the new API looks like it might be possible to use the printf stuff on something other than a black/white screen.

Sound has seen a rather large shakeup in this release. The new MaxMod sound engine seems to fill a huge gap in the homebrewer's library arsenal. It isn't without its drawbacks though, if you did just basic NDS sound effects. Previously you could play individual samples by sending the raw data to the playGenericSound() function, after having previously set the sample rate, volume, panning and format via the setGenericSound() function. That has now been completely removed, replaced by new MaxMod sound engine functions. This is a flexible sound and music system, but requires that the sound effects are in a special format. A tool (mmutil) is provided to convert wav and mod file formats to the expected static structure. Alternatively, a more complex streaming system can be used. This latter does not require input sounds to be converted, and is the only way to play sound from a file, but is slightly trickier to use than the straightforward playGenericSound() function.

The sound streaming approach requires you the coder to implement an audio-filling callback and to call the mmUpdateStream() function often enough to keep the sound buffer full. Fortunately there are some good examples of all the MaxMod library's usage in the nds-examples pack, and the new system is a great addition. Besides music, you can do looped samples, proper panning, mixing, etc, etc. Bah, I'm just miffed because I'll have to rewrite some of my SDL sound code ;-)

Needless to say, on the ARM7 side of things the sound handling stuff has completely changed - this alone is a good enough reason to abandon any hand-rolled arm7 code. Added to the new sleep function - no more will-it-wont-it wondering when you close the lid - and all in all I think this is a great release.

Monday, December 01, 2008

Things you never wanted to know about assembler but were too afraid to ask

When you write homebrew for the Nintendo DS, or the Game Boy Advance, you can choose to use a friendly language like C or C++. This makes life a bit easier. It's comfortable, familiar. But at the back of your mind there's this nagging doubt... Shouldn't I be writing some of this in assembler? After all, if I'm writing C code, I may as well code for a PC and use SDL or Allegro. But then you think... Assembler, that sounds a bit tricky, I wouldn't know where to begin!

And that's where this post comes in.

Getting started with assembler is probably the hardest part. There aren't many articles on best practices, what not to do, which idioms to use, and so on, as there are for other languages. Many of the ideas you have about C can't be applied to assembler, or at least that's what you might think at first, so you probably brush it off as impossible. This is a bit sad, because these Nintendo consoles use an ARM processor and ARM assembler has a nice syntax. Unlike assembler for Intel processors, say, the ARM syntax is fairly small, has few real surprises and can be learned quite easily.

Now, I don't want to claim that I'm some assembler guru. I've written about 6,000 lines of ARM for Pocket Beeb and Elite AGB, plus some in a current "top secret" project I'm working on. It's not really that much, especially compared to the millions of lines of C and C++ I've probably written. But it is a good enough amount to get a feel for how to code in this way, I think.

Before we dive in, let's look at how we go from C to "the magic" that runs on a DS. The first step is to write a function in C. We then use a compiler, which converts this to a file in the ELF format. This binary file is joined together with all the other compiled modules in your program, and the output is a "pure" binary blob with some header information. So that's C to object file, then object files to NDS "ROM" file.

Let's rewind to those first steps. The C function is compiled to an ELF file, which typically ends in ".o", for "object file". This step actually skips out quite a lot. You can see the intermediate files involved by passing --save-temps to your compilation step. Let's try this. Here's a useless function that does some pointless thing. We pass in a couple of variables, and add or subtract them depending on their relative values.


int my_function(int x, int y) {
  if (x < y) {
    return x + y;
  } else {
    return x - y;
  }
}

If we compile this like so: "arm-eabi-gcc --save-temps -c -o my_function.o my_function.c" then the output will be 3 files. These are my_function.o (as expected), my_function.i and my_function.s. The .i file is the preprocessed output - what you get after running the C preprocessor on the file. This finds-and-replaces any #defines, pastes in any #include'd files and adds in compiler-specific information. The .s file is what we are after. This is the output of the C code converted to ARM assembler. It contains a lot of "cruft" that you wouldn't write if you were to write this function by hand, but some of the features are important:


	.text
	.align  2
	.global my_function
	.type   my_function, %function
my_function:

The lines that start with a dot are called directives. They are pseudo-instructions that tell the assembler to do extra things.

The .text directive adds headers to let the linker know whereabouts the code should be placed. It's not too important on the DS, more so on the GBA, where if you omit it then the code may be placed in RAM. You can also force code to go into fast iwram on the GBA by using the directive .section .iwram,"ax",%progbits. Don't ask what it means, just use the voodoo.

The .align directive ensures that our code is padded to 4 byte boundaries. The "2" means the number of bits that must be zero in the location counter at this point. So 2 bits signifies aligned by 4 bytes. Since ARM assembler instructions are 32 bits big (4 bytes) we need to align the code to 4-byte boundaries, or Very Bad Things can happen.
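
To make the "number of zero bits" idea concrete, here's a tiny Python sketch of the rounding that the paragraph above describes. The align_up() name is just for illustration, and the addresses in the comments are made-up examples:

```python
def align_up(address, power):
    """Round address up to the next multiple of 2**power bytes."""
    boundary = 1 << power  # .align 2 -> 4-byte boundary
    return (address + boundary - 1) & ~(boundary - 1)


# An address that is already aligned stays put:
#   align_up(0x02000004, 2) == 0x02000004
# One that is not gets padded up to the next boundary:
#   align_up(0x02000001, 2) == 0x02000004
```

The bottom "power" bits of the result are always zero, which is exactly what the 2 in ".align 2" asks for.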

The .global directive makes the symbol name (my_function) visible outside the compilation unit, so you can use it in other parts of your program. Without .global, the symbol would be the equivalent of using "static" in a C definition.

The .type directive is pretty pointless, but I tend to use it to mark what are functions and what are just helper "scraps" of assembler with a named label. Its use is supposed to be for interoperability with other assemblers, but we always use GCC from devkitPro so it's a moot point.

Finally, "my_function:" is a label - this is like a goto label in C and is where the program jumps to when we call this function.

The rest of the ARM code spat out by GCC is as good as could be for this example, if you compile with -O2. Normally, the advantages of assembler tend to be minimal for small functions; they are better when you do things that can't be easily done in C, such as unrolling loops, keeping often-used memory addresses in a single register, that kind of thing. Also, hand coded ARM assembler can make better use of the registers in some cases (more on registers in a minute), whereas GCC's assembler tends to make more use of the stack. This tends to manifest itself only in more complex functions. Anyway, let's just say that it'd complicate matters needlessly to copy paste the rest of the code here, and for this example GCC does a good job.

Registers

I mentioned the registers back there. The ARM processor in the NDS and GBA has 16 registers - usually named r0, r1, r2... up to r12, then sp, lr, pc instead of r13, r14 and r15. I say usually, because there is an alternate naming scheme that some documents use. Here certain registers are given special names:


  std | alt
 -----+-----
  r0  | a1
  r1  | a2
  r2  | a3
  r3  | a4
  r4  | v1
  r5  | v2
  r6  | v3
  r7  | v4
  r8  | v5
  r9  | v6
  r10 | v7 or sl
  r11 | v8 or fp
  r12 | ip
  r13 | sp
  r14 | lr
  r15 | pc

It's probably best to use the r0-r12,sp,lr,pc naming scheme as that is what the ARM documentation uses. The trick is to pick one scheme and stick to it. sp is a mnemonic for "stack pointer" and points to, you guessed it, the stack. lr means "link register" and is used as the return address for function calls. pc is the program counter and is the address of the current instruction (i.e. where we are in the program).

The stack is an area of RAM. Initially it points to the end of the RAM area and grows downwards as programs allocate memory (on the stack). When writing assembler I find it's not that usual to use the stack - generally I only use it for storing register values that I may need later, such as the lr when calling functions recursively.

Code set up

Remember how gcc spat out a .s file? Well, that means "pure assembler code". The common way to store ARM code is in a file ending in ".S". This is interpreted by gcc as a source file containing assembler and pre-processor directives. So here we can use #defines and #includes if we really need to. It also distinguishes hand coded assembler files from any --save-temps left-overs.

Let's try writing our my_function in assembler. First I fire up an $EDITOR and write out the preamble to my_function.S. Now let's think about what we want. This function has the following C prototype:

int my_function(int, int);

So that means it accepts 2 input values and returns a value. The input values in ARM are in the first 4 registers - any more than 4 inputs are stored on the stack. The original function wanted an addition if x was less than y, else it subtracted y from x. Here's my final code to do this:


	.text
	.align  2
	.global my_function
	.type   my_function, %function
my_function:
	cmp	r0, r1
	addlt	r0, r0, r1
	subge	r0, r0, r1
	bx	lr

The header is as we saw before. The my_function label is the start of the function. The r0 and r1 registers contain the input values.

The cmp instruction means "compare r0 to r1 and set the flags accordingly"... now I don't really want to copy out the whole ARM assembler guide here, but suffice to say that this sets some flags based on the mathematical result of performing "r0 - r1". The instructions that follow can then test those flags, using condition codes like the "lt" and "ge" tacked onto the end of the add and sub instructions. This is one of the neat parts about ARM - just about every instruction can be made conditional!

The default condition is "al" for "always", but may be omitted for obvious reasons. We don't want to write al after every instruction, right? So here we say, if r0 is less than (lt) r1, then add r0 to r1 and store the result in r0. A normal add would just be "add", but here we write "addlt". This makes the processor skip over the instruction if the "less than" state is not set. It's important to note that conditional instructions like this still have some overhead, the processor still has to read the operation in some way, so if you have more than say 3 or 4 conditional instructions with the same condition, it may be faster to use a branch to skip the code completely.

If the result was greater than or equal (ge) then the subtraction is done and the result stored in r0, followed by a return. The bx instruction is used to return from a function - it means "branch to the address in the register lr". As you see, we use the link register to return to the calling code.
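
If the flag business seems a bit abstract, here's a toy Python model of those four instructions. It is only a sketch: cmp_flags() and to_signed() are names I've invented, and it models just the N, Z, C and V flags for 32-bit values, with lt meaning "N differs from V" and ge meaning "N equals V":

```python
MASK = 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit pattern as a signed two's complement integer."""
    value &= MASK
    return value - (1 << 32) if value & 0x80000000 else value


def cmp_flags(a, b):
    """Return (N, Z, C, V) as set by 'cmp a, b', i.e. by computing a - b."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    n = bool(result & 0x80000000)                  # result is negative
    z = result == 0                                # result is zero
    c = a >= b                                     # no unsigned borrow
    v = bool((a ^ b) & (a ^ result) & 0x80000000)  # signed overflow
    return n, z, c, v


def my_function(r0, r1):
    n, z, c, v = cmp_flags(r0, r1)  # cmp   r0, r1
    if n != v:                      # lt: run the add
        r0 = (r0 + r1) & MASK       # addlt r0, r0, r1
    if n == v:                      # ge: run the sub
        r0 = (r0 - r1) & MASK       # subge r0, r0, r1
    return to_signed(r0)            # bx    lr
```

Note that the add and sub don't touch the flags (there's no "s" suffix), so both conditions are judged against the single cmp. In this model my_function(1, 2) gives 3 and my_function(100, 20) gives 80.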

Finally, the result is always stored in r0 in this example. This is part of the ARM binary interface - a result is returned in r0, with r1 holding the upper half if the value is 64 bits wide. This convention allows C and ARM code to interoperate sensibly. If you are returning to your own ARM function, then you can invent any convention you like - return values in r4, r7 and r11 if you want - but sticking to the "C way" means that you can use the function from C later, if that becomes necessary.

You may have noticed that addition and subtraction seem to work "backwards" compared to C convention. Instead of reading left-to-right and the result being stored at the end, the result is actually stored in the first register given. This is fairly common in other flavours of assembler code too, and once you get used to it, it isn't so bad.

We can write a simple test program to run this code:

#include <nds.h>
#include <stdio.h>

int my_function(int x, int y);

int main(void) {
 consoleDemoInit();
 int x = 1;
 int y = 2;
 iprintf("my_function(%d, %d) returns %d\n", x, y, my_function(x, y));
 x = 100;
 y = 20;
 iprintf("my_function(%d, %d) returns %d\n", x, y, my_function(x, y));

 while(1) { swiWaitForVBlank(); }
 return 0;
}

This prints out the results on the DS screen. Nothing very exciting, but you can see that combining C and ARM assembler in a project is not as difficult as you first thought!

Tips and tricks

There are quite a few traps for the unwary - apart from the shift in thought process needed to code assembler, of course. The first of these is that loading a value into a register requires some thought. You can only load values with "mov" that are 8 bit values rotated by an even number of bits - 0xff0, 0x1c00, but not 0x101, for example. There is a pseudo opcode to get around this easily - "ldr r1, =0x101" for example - but it may bite you if you didn't know this. The error would look like this "Error: invalid constant (101) after fixup", just in case you were wondering.
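
Here's a small Python sketch of that rule, in case you want to check a constant before the assembler shouts at you. The is_arm_immediate() name is made up, and it only tests the plain "8 bits rotated by an even amount" encoding. The assembler may still accept some other constants by quietly swapping in a related instruction, which this doesn't model:

```python
def is_arm_immediate(value):
    """True if value is an 8 bit number rotated right by an even amount."""
    value &= 0xFFFFFFFF
    for rotation in range(0, 32, 2):
        # Rotating left undoes a rotate-right by the same amount.
        rotated = ((value << rotation) | (value >> (32 - rotation))) & 0xFFFFFFFF
        if rotated < 0x100:
            return True
    return False
```

With this, 0xff0 and 0x1c00 pass, while 0x101 fails because its two set bits are 9 bits apart and can never fit in an 8 bit window.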

When you write ARM, always try to maximise what each instruction does! Shifts can be added onto the end of instructions, so often instead of a shift then logical orr, you can do the lot in one go:

orr r3, r0, r1, lsl r2

This is the same as r3 = r0 | (r1 << r2) all in one instruction! Similarly you can load data from arrays and structures using offsets with ldr, with "ldr r0, [r1, r2, lsl #2]". That means r0 = *(r1 + r2*4), which is r1[r2] if r1 points to an array of 32-bit words - useful for loading 32-bit values into registers. Remember that in assembler pointer arithmetic works as if all pointers were to char; there are no data types. Adding 1 to a pointer really just adds 1 to it, not like in C where it can increase by the size of the data type.
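
A quick Python model of both tricks, just to show the arithmetic. The helper names are invented, the shift model only covers shift amounts below 32, and memory is a plain little-endian byte buffer standing in for RAM:

```python
import struct

MASK = 0xFFFFFFFF


def orr_lsl(r0, r1, r2):
    """Model 'orr r3, r0, r1, lsl r2' for shift amounts below 32."""
    return (r0 | (r1 << r2)) & MASK


def ldr_scaled(memory, r1, r2):
    """Model 'ldr r0, [r1, r2, lsl #2]' on a little-endian byte buffer."""
    address = r1 + (r2 << 2)  # a byte address, nothing is scaled for you
    return struct.unpack_from("<I", memory, address)[0]


# Four 32-bit words laid out back to back, as a C array of int would be.
words = struct.pack("<4I", 10, 20, 30, 40)
third = ldr_scaled(words, 0, 2)  # base 0, index 2 -> byte offset 8
```

Here third ends up as 30. Using r1 = 4 as the base with the same index reads from byte offset 12 instead, which shows that the base register is a plain byte address.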

Generally, make use of all the registers you have available and avoid pushing/popping to and from the stack. You can use r0-r12 however you want, make the most of these and you'll find some nice shortcuts that provide speedy, optimised code. Hopefully. If you need to push multiple registers to the stack, then the idiom to use is the stmfd/ldmfd one (store/load multiple full descending). This stores a comma separated list and/or range of registers to the stack, and decrements/increments the stack pointer by the right amount. It is written as follows, to store here registers r0 to r7 and lr (r14):


	stmfd sp!,{r0-r7, lr}
... do stuff with registers r0-r7 and lr ...
	ldmfd sp!,{r0-r7, lr}
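
What "full descending" means is easier to see in a toy model. This Python sketch is purely illustrative (the Stack class and REG_ORDER are my own invention): sp always points at the last word pushed, the stack grows towards lower addresses, and the lowest numbered register lands at the lowest address:

```python
REG_ORDER = ["r%d" % i for i in range(13)] + ["sp", "lr", "pc"]


class Stack:
    """Toy model of a full descending stack of 32-bit words."""

    def __init__(self, top):
        self.sp = top     # sp points at the last word pushed
        self.memory = {}  # byte address -> 32-bit word

    def stmfd(self, regs, names):
        """stmfd sp!, {names}: store registers and move sp down."""
        ordered = sorted(names, key=REG_ORDER.index)
        self.sp -= 4 * len(ordered)
        for offset, name in enumerate(ordered):
            self.memory[self.sp + 4 * offset] = regs[name]

    def ldmfd(self, regs, names):
        """ldmfd sp!, {names}: reload registers and move sp back up."""
        ordered = sorted(names, key=REG_ORDER.index)
        for offset, name in enumerate(ordered):
            regs[name] = self.memory[self.sp + 4 * offset]
        self.sp += 4 * len(ordered)
```

Pushing r0-r7 and lr moves sp down by 9 words (36 bytes), and the matching ldmfd puts it back exactly where it started.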

Always try and reduce duplication - not only is it a maintenance problem, but each copied line is a wasted cycle. Try and reduce code to the minimum number of instructions, that's the name of the game here. Ah, but remember: get a working version first, then optimise heavily. No point having a fast bit of code that doesn't do what you want :-)

I'd recommend this PDF cheat sheet for a quick guide to the ARM instruction set. Keep in mind that the DS doesn't support all the instructions on there - rbit, bfc, and some others - but they are generally the more exotic operations that are used less. If in doubt, compile a quick test file. You'll get the error "Error: selected processor does not support..." if the instruction is not supported by the CPU.

Conclusions.

So why would you want to do all this? Isn't this just for head cases?

I think learning how to write code at this level leads to a better understanding of why things are done in certain ways in high level languages. You'll certainly grok pointers, if you haven't already. Learning new coding styles - not just using C-like languages - will make you a better coder too. Heck, maybe a better person even ;-)

Friday, November 07, 2008

Recovering history with git reflog

I use git-svn quite a bit at work. One of the nice things about this is that I can create perfect patch sets - instead of committing a whole raft of random chunks, I can polish a change until it works. Quite often I'll reset the HEAD to a few commits back, to get rid of changes that add crap like "printf" or were just bug-fix dead ends. You know, the usual cargo cult "I'll change this to see what happens" changes that, once you understand the problem, you realise had nothing really to do with the root cause.

One of the side effects of this is that, after a while I tend to end up with lots of "dead" commits. While using gitk I can see these as little commit balls that aren't joined to anything, like the headless branchy thing in the image. If I close gitk, or click the "Reload" option then the commit tree is cleaned up to show only commits that have a head.

Occasionally, once I've finished a change, squashed all my commits and committed them to subversion, it turns out that one of those lost commits really should have been part of the patch set. Oops! I've closed gitk by now, so all those lost commits can't be seen. How can I get them back? Here's where git reflog comes to the rescue.

I didn't discover the wonders of git reflog until quite recently. Its man page is not too helpful in describing what it does: "Manage reflog information". What it actually does is print out a list of commits that are in a branch, or are in limbo. It doesn't distinguish between the two. So you can use it, together with gitk, to show your lost commits. For example I use this:

gitk --all `git reflog | cut -c1-7`

That's it. It shows all branches (--all) and all the commits in the reflog, so all those lost commits are reachable, cherry-pickable and branchable again. I just wanted to share that with the internet, because I know I'll forget it by next week :-)
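
One caveat with the cut -c1-7 trick is that it assumes every abbreviated hash is exactly 7 characters long. A slightly more forgiving take is to grab the first field of each line instead. Here's a Python sketch of that idea, with function names I've made up, assuming the usual "hash HEAD@{n}: action: message" layout of git reflog output:

```python
import subprocess


def reflog_hashes(reflog_text):
    """Pull the leading abbreviated hash from each line of 'git reflog'."""
    hashes = []
    for line in reflog_text.splitlines():
        fields = line.split()
        if fields and fields[0] not in hashes:  # keep order, drop repeats
            hashes.append(fields[0])
    return hashes


def show_lost_commits():
    """Open gitk on all branches plus everything mentioned in the reflog."""
    text = subprocess.check_output(["git", "reflog"], universal_newlines=True)
    subprocess.call(["gitk", "--all"] + reflog_hashes(text))
```

Calling show_lost_commits() from inside a repository should do the same job as the one-liner above, as long as git and gitk are on your PATH.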

Monday, October 13, 2008

Alternatives to Make, Part II

Last time I described how somebody insane enough could use CMake for their cross compiling needs. This entry I'll consider the next entry on the list of interesting Make replacements: SCons.

SCons

When using SCons as your build system you have 2 options: install scons on the system or include a version in your source tree. The easiest way is to install a version of scons on your system, and mark your build scripts as requiring that version of scons or newer. This does put a burden on end users as they must get hold of their own copy of SCons, but it means your repository of code isn't full of 3rd party stuff. Additionally, the machine doing the compiling must have Python installed. On Linux this is no problem, but on Windows it's Yet Another Dependency. This is because SCons, unlike CMake, uses an already-existing programming language to describe the build process. In this case, Python. The build scripts are called "SConstruct" and, apart from some oddities, work like normal Python modules.

So on to the cross compiling issue - compiling our C or C++ code into a DS program. Unlike CMake, there's no set way to handle cross compilation. In fact, there are no real "best working practices" outlined anywhere in the (quite extensive) SCons documentation - you're free to do what you like. My recommendation here is to take a leaf out of CMake's book and separate out as much as possible into a toolchain file.

As with the CMake example, let's start off with the ansi_console example from devkitPro. This has a source sub directory and a main.c file. First, create a SConstruct file that will read in a toolchain called "arm":

env = Environment(tools=['arm'], toolpath=['.'])

The toolpath means "look for the file in the current directory". The toolchain file is a normal python module. This means a file called "arm.py". In order to make it a true SCons tool file we need to add 2 functions, "generate" and "exists". So that would be:

def generate(env, **kwargs):
   pass

def exists(env):
 return 1

I get the feeling that exists() is not actually used - it certainly isn't in SCons 1.0.1, but the documentation says it is required. The generate function is called when we create the Environment, passing the Environment into the generate function so we can mess about with the innards. So let's do the usual setup, which is to check DEVKITARM and DEVKITPRO are set in the user's environment. SCons by default doesn't pull all the environment variables into its own Environment object. This is good, as we don't really want a build to depend "randomly" on some rogue variable. But! we do want to use some of them. Anyway, on to the code:

from os import environ
from os.path import pathsep,join

def find_devkitarm(env):
   # not shown in the original post - a guess at a minimal version that
   # looks for the cross compiler on the PATH that SCons will use
   return env.WhereIs('arm-eabi-gcc')

def check_devkit(env):
   ENV = env['ENV']
   for var in ('DEVKITARM', 'DEVKITPRO'):
       if var not in environ:
           print('Please set %s. export %s=/path/to/%s'%(var, var, var))
           env.Exit(1)
       ENV[var] = environ[var]
   ENV['PATH'] = ENV['PATH']+pathsep+join(environ['DEVKITARM'], 'bin')
   if not find_devkitarm(env):
       print('DevkitARM was not found')
       env.Exit(1)

def generate(env, **kwargs):
   check_devkit(env)

That is pretty straightforward - it checks the environ(ment) and adds $DEVKITARM/bin to the path. Now we need to find out if we have a suitable compiler. Here things get a bit icky. Because SCons isn't a priori designed for cross compilation, it assumes that your GNU-based compiler is called just "gcc". This means that, in order to X-compile, you'll need to install gcc, since we use the gcc tool detection and environment set up and overwrite parts with our arm-eabi-gcc. A bit of an irritating flaw, and one which CMake has understood and implemented correctly. The alternative, of course, is to copy paste the entire SCons gcc tools and replace gcc with arm-eabi-gcc - or suitable prefix. In fact, this alternative approach may well work out better... anyway. For now we'll use the approach that assumes vanilla gcc and overwrites with arm-eabi.

prefix = 'arm-eabi-'  # toolchain prefix, as in arm-eabi-gcc

def setup_tools(env):
   gnu_tools = ['gcc', 'g++', 'gnulink', 'ar', 'gas']
   for tool in gnu_tools:
       env.Tool(tool)
   # overwrite the vanilla tool names with the cross compiler versions
   env['CC'] = prefix+'gcc'
   env['CXX'] = prefix+'g++'
   env['AR'] = prefix+'ar'
   env['AS'] = prefix+'as'
   env['OBJCOPY'] = prefix+'objcopy'
   env['PROGSUFFIX'] = '.elf'

def generate(env, **kwargs):
   check_devkit(env)
   setup_tools(env)

So that sets up the environment for compiling, then overwrites the tool names with the arm-eabi equivalent. We also set the PROGSUFFIX (program suffix) to .elf - this makes life easier for the objcopy step. Now we need the "magic flags" that cause our Nintendo DS program to compile.

def add_flags(env):
   # add arm flags
   env.Append(CCFLAGS='-march=armv5te -mtune=arm946e-s'.split())
   env.Append(CPPDEFINES='ARM9')
   env.Append(LIBS='nds9')
   env.Append(LINKFLAGS=['-specs=ds_arm9.specs'])
   # add libnds
   libnds = join(environ['DEVKITPRO'], 'libnds')
   env.Append(LIBPATH=[ join(libnds, 'lib')])
   env.Append(CPPPATH=[ join(libnds, 'include')])

def generate(env, **kwargs):
   check_devkit(env)
   setup_tools(env)
   add_flags(env)

These flags are the usual ARM9 flags and libnds include path for compiling C code, and libnds9 and the specs flag when linking. Without these devkitARM complains (this is as expected, it is a multi-platform ARM compiler, so you need to tell it the exact platform you want to use). There's one more step here - we need a way to build the nds file from an .elf file - but first let's go back to the SConstruct. If you recall, at the top of the post there, we had just created an Environment. Now we have added a load of ARM stuff to that Environment. The next step here is to create a program. In SCons we can do this as follows:

import os
env.Program('ansi_console', os.path.join('source','main.c'))

That's it! Except, no it isn't. This produces ansi_console.elf, which needs to be objcopy'd and ndstool'd. To do that we can go back to our arm.py tool file. Here we add in new "Builders" to do the work. A Builder is like the Program method we saw in the SConstruct - it takes the names of the source files and produces the outputs... so we add in the builders to our arm.py file:

import SCons.Builder

def add_builders(env):
   def generate_arm(source, target, env, for_signature):
       return '$OBJCOPY -O binary %s %s'%(source[0], target[0])
   def generate_nds(source, target, env, for_signature):
       if len(source) == 2:
           return "ndstool -c %s -7 %s -9 %s"%(target[0], source[0], source[1])
       else:
           return "ndstool -c %s -9 %s"%(target[0], source[0])
   env.Append(BUILDERS={
       'Ndstool': SCons.Builder.Builder(
                   generator=generate_nds,
                   suffix='.nds',
                   src_suffix='.arm'),
       'Objcopy': SCons.Builder.Builder(
                   generator=generate_arm,
                   suffix='.arm',
                   src_suffix='.elf')})
def generate(env, **kwargs):
   [f(env) for f in (check_devkit, setup_tools, add_flags, add_builders)]

That's a fair amount of typing! What does it do? Well, the "generate_arm" function is called when we want to generate an .arm file from an .elf file. It returns a SCons-y string that will be executed to do the actual work. Here it is our old OBJCOPY string - the $OBJCOPY is replaced automagically by SCons with the equivalent Environment variable. The "generate_nds" function is called when we want to generate an .nds from an .arm, it too returns the command line that will be executed. There's a bit of a trick there that checks if we need to combine an ARM7 and ARM9 core, or just use the default ARM7, but apart from that it is straightforward. The "env.Append(BUILDERS=...)" bit creates new functions called Ndstool and Objcopy that can be used like Program. Passing in a generator function means we use functions - you could use a fixed string and pass it as an action too.
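
To see what the two generator functions actually hand back, you can call them outside SCons with plain strings. This is only a sketch: in a real build SCons passes in lists of its own file objects, which turn into path names when formatted with %s, and the env and for_signature arguments are given defaults here purely so the functions can be called by hand:

```python
def generate_arm(source, target, env=None, for_signature=False):
    """Command line that turns an .elf into a raw .arm binary."""
    return '$OBJCOPY -O binary %s %s' % (source[0], target[0])


def generate_nds(source, target, env=None, for_signature=False):
    """Command line that wraps one or two .arm files into an .nds."""
    if len(source) == 2:
        # two sources: the first is the ARM7 core, the second the ARM9
        return 'ndstool -c %s -7 %s -9 %s' % (target[0], source[0], source[1])
    # one source: ARM9 only, ndstool supplies the default ARM7
    return 'ndstool -c %s -9 %s' % (target[0], source[0])


objcopy_cmd = generate_arm(['ansi_console.elf'], ['ansi_console.arm'])
ndstool_cmd = generate_nds(['ansi_console.arm'], ['ansi_console.nds'])
```

Here objcopy_cmd still contains the literal text $OBJCOPY. SCons fills that in from the Environment just before it runs the command.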

Armed with our new methods, let's go back to the SConstruct. We can objcopy and ndstool the Program/elf file as follows:

env.Ndstool(env.Objcopy(env.Program('ansi_console', join('source','main.c'))))

That's it. There are a couple of things we can do to make this better. For example, rather than splurge the build artifacts all over the source tree, we can use a build directory. To do this we need to use SCons recursively. The recursive file is called a "SConscript", and all we have to remember is that to pass objects (like the env created in the top level SConstruct) down to the other SConscripts, we have to use an Export/Import mechanism. A bit confusing, but the code's easy enough:

#in SConstruct
env = Environment(tools=['arm'], toolpath=['.'])
SConscript(join('source','SConscript'),
           variant_dir='build',
           duplicate=0,
           exports='env')

# in source/SConscript
Import('env')
env.Ndstool(env.Objcopy(env.Program('ansi_console', 'main.c')))

Passing exports to the SConscript file exports the named variables. Using Import imports them into the local namespace. Magical! There's also a Return() function to return variables from sub scripts. That's useful when tracking local library dependencies.

OK, so what about "dual core" programs? It turns out that it is a bit of effort. More even than CMake, I'd say. We can either create a second Environment and set up the variables for ARM9 or ARM7 according to the core in question, or use a single Environment instance, set flags that should be used for ARM9 or ARM7, and use these appropriately for each Program call that we make. The first approach ends up a bit messy, with ifs for processor type in several places. The second approach is cleaner, but means there is a bit more code used when calling env.Program. I use the latter approach in Elite DS, so that's what I'll outline here. You have to set the compiler flags for arm9 and arm7 in the env variable. This can be done as follows, modifying the add_flags() function of our arm.py tool:

THUMB_FLAGS = ' -mthumb -mthumb-interwork '
PROCESSOR_CFLAGS = {
 '9': ' -march=armv5te -mtune=arm946e-s',
 '7': ' -mcpu=arm7tdmi -mtune=arm7tdmi'
}

PROCESSOR_LDFLAGS = THUMB_FLAGS + \
       ' -specs=ds_arm%c.specs -g -mno-fpu -Wl,-Map,${TARGET.base}.map  -Wl,-gc-sections'
EXTRA_FLAGS = ' -Wno-strict-aliasing -fomit-frame-pointer -ffast-math '

def add_flags(env):
   ccflags = ' '.join([EXTRA_FLAGS, THUMB_FLAGS])
   CCFLAGS_ARM9 = ' '.join([ccflags, PROCESSOR_CFLAGS['9']])
   CCFLAGS_ARM7 = ' '.join([ccflags, PROCESSOR_CFLAGS['7']])
   CPPDEFINES_ARM9 = 'ARM9'
   CPPDEFINES_ARM7 = 'ARM7'
   LIBS_ARM9 = ['fat', 'nds9']
   LIBS_ARM7 = ['nds7']
   LINKFLAGS_ARM9 = PROCESSOR_LDFLAGS%'9'
   LINKFLAGS_ARM7 = PROCESSOR_LDFLAGS%'7'

   env.Append(CCFLAGS_ARM9=CCFLAGS_ARM9)
   env.Append(CCFLAGS_ARM7=CCFLAGS_ARM7)
   env.Append(CPPDEFINES_ARM9=CPPDEFINES_ARM9)
   env.Append(CPPDEFINES_ARM7=CPPDEFINES_ARM7)
   env.Append(LIBS_ARM9=LIBS_ARM9)
   env.Append(LIBS_ARM7=LIBS_ARM7)
   env.Append(LINKFLAGS_ARM9=LINKFLAGS_ARM9)
   env.Append(LINKFLAGS_ARM7=LINKFLAGS_ARM7)

   # add libnds
   libnds = join(environ['DEVKITPRO'], 'libnds')
   env.Append(LIBPATH=[ join(libnds, 'lib')])
   env.Append(CPPPATH=[ join(libnds, 'include')])

def generate(env, **kwargs):
   [f(env) for f in (check_devkit, setup_tools, add_flags, add_builders)]

Most of those flags are taken from the devkitPro examples and should be pretty familiar. In order to actually use them, we can override the flags used for each individual program. So in the SConscript, the env.Program line would become:

env.Program('ansi_console', os.path.join('source','main.c'),
       CCFLAGS=env['CCFLAGS_ARM9'],
       CPPDEFINES=env['CPPDEFINES_ARM9'],
       LIBS=env['LIBS_ARM9'],
       LINKFLAGS=env['LINKFLAGS_ARM9'])

More typing, but there's no need to pass extra Environments about. The equivalent for an ARM7 program would obviously swap the 9s for 7s.
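If the extra typing gets tedious, the overrides can be gathered in one place. The following is only a sketch: core_overrides is a hypothetical helper name, not something SCons provides, and a plain dict with shortened placeholder flags stands in for the Environment so that the snippet runs on its own.

```python
# Hypothetical helper, not part of SCons: gather the per-core overrides
# so the four keyword arguments only have to be spelled out once.
def core_overrides(env, core):
    """Return the env.Program keyword overrides for core '9' or '7'."""
    names = ('CCFLAGS', 'CPPDEFINES', 'LIBS', 'LINKFLAGS')
    return dict((name, env['%s_ARM%s' % (name, core)]) for name in names)

# A plain dict stands in for the SCons Environment (assumption of this sketch);
# the flag values are shortened placeholders, not the full set from arm.py.
fake_env = {
    'CCFLAGS_ARM9': '-mthumb -march=armv5te',
    'CCFLAGS_ARM7': '-mthumb -mcpu=arm7tdmi',
    'CPPDEFINES_ARM9': 'ARM9',
    'CPPDEFINES_ARM7': 'ARM7',
    'LIBS_ARM9': ['fat', 'nds9'],
    'LIBS_ARM7': ['nds7'],
    'LINKFLAGS_ARM9': '-specs=ds_arm9.specs',
    'LINKFLAGS_ARM7': '-specs=ds_arm7.specs',
}

arm7 = core_overrides(fake_env, '7')
arm9 = core_overrides(fake_env, '9')
print(arm7['LIBS'], arm9['CPPDEFINES'])
```

In a SConscript the call would then shrink to something like env.Program('ansi_console', 'main.c', **core_overrides(env, '9')), since the Environment can be indexed by key in the same way as the dict above.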

As before with my CMake entry, here is a quick summary to help you decide whether or not to use SCons as your build system for cross compiling NDS programs.

The Bad

Scalability: I haven't mentioned this yet, but the biggest problem SCons has is that when you reach several hundred source files, the time that it takes to parse the SConscripts and generate dependencies becomes considerable. If you also use SCons autoconf-like Configure functions, then the configuration step is, by default, run each and every time you compile. The results are cached, but it takes several seconds to go through the tests. This became an issue for me on Bunjalloo, which used to use SCons. Before compiling anything SCons would typically sit around in silence for 15 seconds contemplating the build. I've seen builds that do "nothing" for minutes at a time that, when changed to CMake or whatever, got the same thinking done in seconds. KDE and OpenMoko are two famous SCons "deserters" due to performance problems. On the other hand, for Elite DS, which only has around a hundred source files, the do-nothing time is negligible and SCons works great.

Not really cross-compiling friendly: Unlike CMake, SCons is not built with cross compiling in mind. This is demonstrated by the hard-coding of "gcc", etc, in the gcc.py SCons-tools. This means you probably will have to install native tools as well as the cross tools in order to compile anything. (Note: the new 1.1.0 may have fixed this, it was released as I was writing this post!)

Syntax can get complicated: When writing scripts that use multiple local libraries, the Return() and Import/Export() mechanisms that SCons uses can get a bit unwieldy. You end up with lots of global variable names that you have to Import() and Export() across SConscripts, or else Return() everything to the parent file and let it sort out the mess.

Python can look intimidating: Unlike CMake, which still looks a bit Makefile-like, SCons scripts can have any old Python code in them. Without discipline, this can result in build scripts that are difficult to follow, especially if Python is not one of your strong languages.

Dependencies: As with CMake, you still need to install stuff - namely the SCons build system itself. As more projects use scons this will become less of a problem, but at the moment it can be annoying for users ("oh no! where is my Makefile?")

The Good

Python code: If you are familiar with Python already, you don't need to learn yet another language. There are no hacks to get loops working. Proper data structures can be used to describe the build. Any of the Python libraries can be imported and used. This is a very powerful advantage, as well as being multi-platform (providing you stick to cross platform safe python code, of course).

Stable API: SCons is backwards compatible with the practically stone-age Python 1.5.2 and its APIs for building things change infrequently, first deprecating functions and rarely removing anything. This makes changing from an older version to a newer one fairly painless. A full suite of tests means that regressions on new versions are pretty rare - if something goes wrong, it is likely to be due to a bug in your SConscript, rather than a bug in SCons.

Great documentation: There are loads of good docs on the SCons website. The manual is particularly well done, with an easy step-by-step guide. Installing SCons also gives you a very complete man page (on Linux, FreeBSD, etc at least) that contains examples as well as the full API.

Conclusion

As with CMake, you probably need a good reason to not use a bog-standard Makefile. For me, being able to code the build logic in Python is the deciding factor. I think SCons is pretty good, and use it for Elite DS. I'd probably use SCons for other projects I do too, depending on their size (and my mood). The Bunjalloo build became a bit too slow with SCons, but I'll talk more about that next time when I discuss Waf. There seems to be quite an uptake of SCons for new OSS projects (Google's Chrome browser for one), especially in favour of autotools, and hopefully we'll see some improvement in the speed because of this (the latest release, 1.1.0, boasts improved memory use, but I have yet to see any benchmarks). Until then, you could probably do a lot worse than download the latest release and have a mess around to see what it's all about.

Monday, October 06, 2008

DSi and thoughts on the future

Nintendo have announced a new version of the DS, calling it the DSi. No idea what the i means, a nod at the good work Apple are doing presumably. But what does this mean for Bunjalloo development?

I think it is time to call it a day.

The thing is, this new console includes a web browser and, I assume, it will be kitted out with a lot more RAM than the current-gen DS. Pure speculation, but you can't do much with 4 megabytes nowadays. Especially on the ever-expanding internet. While the current DS can also run a full-blown browser, if you bought Opera, having one by default is the killer. Especially if it is free (not Free, but free "as in beer") and doesn't need a RAM expansion pack. Additionally, the rumour mill is already speculating about the security upgrades on the new DSi, like it'll have a software black list to disallow homebrew carts, upgradable firmware like the PSP and so on. I don't think anyone actually buys DS games anymore. Everyone I've seen IRL with a DS has a flash cartridge thingy of some sort. It just makes economic sense, really. So Nintendo will want to end this rampant piracy with the next gen console, and at the same time make it pretty tricky for normal folk to run homebrew games for a while. At least until a new hack comes out and the cartridge manufacturers take it mainstream. Oh, and region locking a handheld? Grrr...

Anyhow, that doesn't mean I'll give up on Bunjalloo completely, but it will enter Maintenance Mode, where only bug fixes will go in with no new stuff. See? You should have sent me patches to add those cool features, duh! There's a new release out now, 0.7.1, that fixes some of the bugs in 0.7, just to show I'm still hacking around.

Saturday, September 27, 2008

Alternatives to Make, Part 1

In the Nintendo DS scene the build tool that everyone uses is plain ol' Make. There are several alternatives for the brave developer, however. I'm going to write a few posts describing how you can use CMake, SCons or Waf to compile your code. Yes, I'm a bit of a build system junkie. This week, viewers, we'll look at CMake.

CMake

CMake is a build system that generates files for use with the build tool on your platform. You write a configuration file, called CMakeLists.txt, run the cmake program to generate a load of Makefiles or project files, then run make (or whatever) to compile your project. On Windows, CMake generates project files for use with Visual Studio. It can also generate project files for CodeBlocks, Eclipse CDT and KDevelop.

A hefty obstacle for using CMake with the DS is that currently CMake doesn't support building multiple target architectures in one build tree. So compiling ARM7 and ARM9 cores and combining the result into the final .nds file requires a bit of fiddling. More on that later, as it can be done.

The first thing you will need is the latest version of CMake. The features needed to compile on the DS were only added quite recently (version 2.6 onwards).

After you've installed CMake, you can start hacking away. First we need to create a Toolchain File that CMake uses to understand your target platform. Normally CMake autodetects the local platform type and configures the compiler, linker, compilation flags and so on, to the correct values. When cross compiling for the DS, you have to tell CMake which compiler it should use. We don't want it to try and guess based on any native compiler installed.

As described in the CMake wiki we first have to turn off the auto detection. This is done by setting the variable CMAKE_SYSTEM_NAME to "Generic". Here we come across one of the oddities that takes a bit of getting used to. Instead of the usual BNF variable = value style syntax for assignment, CMake uses a function. So we have set(variable value). Whatever. The other 2 variables set here are optional:

set(CMAKE_SYSTEM_NAME Generic)
set(CMAKE_SYSTEM_VERSION 1)
set(CMAKE_SYSTEM_PROCESSOR arm-eabi)

A note on style here. Commands like if, message, endif, set and so on can be written either in ALL CAPS or in lower case. The keywords inside commands have to be in UPPER CASE. I prefer to write the commands in lowercase and everything else in upper case.

So, after those first set() lines we begin to describe which compiler we'll use. The approach I'm going to use doesn't stray from the standard DEVKITARM and DEVKITPRO environment variables, which should point to where you have the devkitARM toolchain and libnds installed. We set 2 CMake variables based on environment variables as follows:

set(DEVKITARM $ENV{DEVKITARM})
set(DEVKITPRO $ENV{DEVKITPRO})

Actually this is quite good - it means that CMake by default doesn't drag the whole environment into the namespace. This means there is a better chance of our build being repeatable without having to set up a million environment variables. After this, we can check if they are set with these lines of code:

if(NOT DEVKITARM)
 message(FATAL_ERROR "Please set DEVKITARM in your environment")
endif(NOT DEVKITARM)

And similarly for DEVKITPRO. So now we need to check if the compiler is installed. This is done by setting the CMAKE_C(XX)_COMPILER variables as follows. Also we set the CMAKE_FIND_ROOT_PATH, which lets us use objcopy, ar, ranlib and friends. The other flags tell CMake to just use the compiler headers and libraries, to not search DEVKITARM for other libs and headers. Really we could create an install base for the DS, with lib, include and so on, but that is not the usual way to do things. It would probably break a lot of other hand-coded Makefiles that people use, so we'll stick with the $DEVKITPRO/libnds convention. Anyway, here is the code for setting up the compilers:

set(CMAKE_C_COMPILER ${DEVKITARM}/bin/arm-eabi-gcc)
set(CMAKE_CXX_COMPILER ${DEVKITARM}/bin/arm-eabi-g++)
set(CMAKE_FIND_ROOT_PATH ${DEVKITARM})
set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)

Here if DEVKITARM points somewhere that doesn't really have the arm-eabi compiler installed, then CMake will throw up an error message when we run it:

CMake Error: your C compiler: "/path/to/devkitARM/bin/arm-eabi-gcc" was not found.   Please set CMAKE_C_COMPILER to a valid compiler path or name.
CMake Error: your CXX compiler: "/path/to/devkitARM/bin/arm-eabi-g++" was not found.   Please set CMAKE_CXX_COMPILER to a valid compiler path or name.

Apart from the compiler, we also need to indicate where libnds lives. This is done by telling CMake how to construct a library name - prepending lib, and adding .a at the end - and where to search for the library and headers:

set(CMAKE_FIND_LIBRARY_PREFIXES lib)
set(CMAKE_FIND_LIBRARY_SUFFIXES .a)
include_directories(${DEVKITPRO}/libnds/include)
link_directories(${DEVKITPRO}/libnds/lib)
find_library(NDS9 nds9)
find_library(NDS7 nds7)

OK, that's it. The rest is now "vanilla" CMake. When we run the cmake program we need to tell it to use cross compiling. This is done with the CMAKE_TOOLCHAIN_FILE variable. We can set it with a gcc-like "-D" flag:

cmake -DCMAKE_TOOLCHAIN_FILE=/full/path/to/devkitArm.cmake .

So now we have this, we can take an example from the nds-examples that are distributed with devkitArm and try to have it compile using CMake. Let's use the ansi_console example. Unzip the examples, and copy ansi_console to some place - it's in the Graphics/2D directory. Now create a file ansi_console/CMakeLists.txt and start editing it. The first thing to do here is to give the project a name:

project(NDS-EXAMPLES)

In order to compile correctly when using libnds, we need to set a C define ARM9. For this, we use the add_definitions macro. I'll also create a variable EXE_NAME that we can use instead of copying ansi_console everywhere. This way if we want to change the name of the binary later it won't be too much hassle.

add_definitions(-DARM9)
set(EXE_NAME ansi_console)

Now we can describe how to build the "executable". The next line tells CMake to create an executable from the file source/main.c

add_executable(${EXE_NAME} source/main.c )

That alone isn't enough, as we also need to link with libnds for the ARM9 and pass in the required -specs flag.

target_link_libraries(${EXE_NAME} nds9)
set_target_properties(${EXE_NAME}
 PROPERTIES
 LINK_FLAGS -specs=ds_arm9.specs
 COMPILE_FLAGS "-mthumb -mthumb-interwork")

The set_target_properties macro allows us to fiddle with the compilation and link flags. A bit later, if you decide to use CMake for your NDS projects, you could make a wrapper around this to set up the usual flags for a NDS binary, rather than copying the above lines everywhere.

Up to here, this would normally be enough on a platform that has an operating system and knows how to load the executable file format. We could run cmake, passing in the devkitArm toolchain file and the path to the ansi_console directory. The DS doesn't have an OS, however, and we need to strip out all the "elfyness" - the ELF headers and whatnot - and add the special DS header. Converting the ELF to a pure binary is done with objcopy. Adding the header is done with ndstool, provided as part of the devkitArm distribution. We can write macros to help with these steps, and that allows me to write this kind of thing to finish off the process:

objcopy_file(${EXE_NAME})
ndstool_file(${EXE_NAME})

So where do these come from? Well, we have 2 choices. Either add the macro inline in the CMakeLists.txt file or create a module. The first is easier, but the second approach lets us reuse the module elsewhere. So in a new file called ndsmacros.cmake I'll add the following lines to define the macros OBJCOPY_FILE and NDSTOOL_FILE

macro(OBJCOPY_FILE EXE_NAME)
 set(FO ${CMAKE_CURRENT_BINARY_DIR}/${EXE_NAME}.bin)
 set(FI ${CMAKE_CURRENT_BINARY_DIR}/${EXE_NAME})
 message(STATUS ${FO})
 add_custom_command(
  OUTPUT "${FO}"
  COMMAND ${CMAKE_OBJCOPY}
  ARGS -O binary ${FI} ${FO}
  DEPENDS ${FI})
 get_filename_component(TGT "${EXE_NAME}" NAME)
 add_custom_target("TargetObjCopy_${TGT}" ALL DEPENDS ${FO} VERBATIM)
 get_directory_property(extra_clean_files ADDITIONAL_MAKE_CLEAN_FILES)
 set_directory_properties(
  PROPERTIES
  ADDITIONAL_MAKE_CLEAN_FILES "${extra_clean_files};${FO}")
 set_source_files_properties("${FO}" PROPERTIES GENERATED TRUE)
endmacro(OBJCOPY_FILE)

if(NOT NDSTOOL_EXE)
 message(STATUS "Looking for arm-eabi-objcopy")
 find_program(NDSTOOL_EXE ndstool ${DEVKITARM}/bin)
 if(NDSTOOL_EXE)
  message(STATUS "Looking for arm-eabi-objcopy -- ${NDSTOOL_EXE}")
 endif(NDSTOOL_EXE)
endif(NOT NDSTOOL_EXE)

if(NDSTOOL_EXE)
 macro(NDSTOOL_FILE EXE_NAME)
  set(FO ${CMAKE_CURRENT_BINARY_DIR}/${EXE_NAME}.nds)
  set(I9 ${CMAKE_CURRENT_BINARY_DIR}/${EXE_NAME}.bin)
  add_custom_command(
   OUTPUT ${FO}
   COMMAND ${NDSTOOL_EXE}
   ARGS -c ${FO} -9 ${I9}
   MAIN_DEPENDENCY ${I9}
   )
  get_filename_component(TGT "${EXE_NAME}" NAME)
  add_custom_target("Target9_${TGT}" ALL DEPENDS ${FO} VERBATIM)
  get_directory_property(extra_clean_files ADDITIONAL_MAKE_CLEAN_FILES)
  set_directory_properties(
   PROPERTIES
   ADDITIONAL_MAKE_CLEAN_FILES "${extra_clean_files};${FO}")
  set_source_files_properties(${FO} PROPERTIES GENERATED TRUE)
 endmacro(NDSTOOL_FILE)
endif(NDSTOOL_EXE)

The first macro that defines OBJCOPY_FILE uses the built-in CMake command "add_custom_command". This has lots of options, but the ones used here say that cmake should run the CMAKE_OBJCOPY command passing in the given arguments (ARGS). CMAKE_OBJCOPY is defined automagically from our CMAKE_FIND_ROOT_PATH in the devkitArm toolchain file; it corresponds to arm-eabi-objcopy in the devkitArm installation. The "add_custom_target" isn't really needed here, but we will need it later for combined arm7/9 cores. It adds a new top level target called TargetObjCopy_[name of exe], which depends on the output file, to the "all" target, so when we run "make all" it will build the binary file, if needed. Whew! That was all a bit complicated. The rest of the macro adds the output file to the clean target and marks the output as a generated file.

The "if" block in the middle is a bit like the autoconf functionality from the GNU Build System - it tries to find the ndstool program. If ndstool cannot be found, the NDSTOOL_FILE macro is never defined, so any later call to it fails when cmake runs. Otherwise we get a variable NDSTOOL_EXE that we can use to run the program. (The status messages say "arm-eabi-objcopy" although the search is for ndstool; that is a copy-and-paste slip, and it shows up in the cmake output as well.) The NDSTOOL_FILE macro uses the ndstool exe to create a "single core" binary. Actually, it uses a default arm7 core, which is why we don't have to provide one. Again, we add a custom target to make sure the thing gets built when we run "make all" and add the nds file to the list of things to get cleaned.
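For comparison with the SCons approach, the same "look in one directory for a tool" check can be sketched in a few lines of Python using shutil.which from the standard library. This is an illustration only: find_tool is a hypothetical name, and a temporary directory with a dummy script stands in for $DEVKITARM/bin.

```python
import os
import shutil
import stat
import tempfile

def find_tool(name, directory):
    """Look for an executable called name in the given directory only."""
    return shutil.which(name, path=directory)

# A temporary directory with a dummy executable stands in for $DEVKITARM/bin.
bin_dir = tempfile.mkdtemp()
tool = os.path.join(bin_dir, 'ndstool')
with open(tool, 'w') as handle:
    handle.write('#!/bin/sh\n')
os.chmod(tool, os.stat(tool).st_mode | stat.S_IXUSR)

found = find_tool('ndstool', bin_dir)  # present and executable
absent = find_tool('grit', bin_dir)    # never created, so None
print(found)
print(absent)
```

Restricting the search to one directory mirrors the way the find_program call above is pointed at ${DEVKITARM}/bin rather than at whatever happens to be on the PATH.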

So we have ndsmacros.cmake. To include it there are 2 ways that vary subtly. The best approach is to set the CMAKE_MODULE_PATH variable to the directory containing our macro file and then include the macro as a module:

set(CMAKE_MODULE_PATH ${CMAKE_SOURCE_DIR})
include(ndsmacros)

That assumes you have the macro file at the top of the ansi_console directory, which for a small example like this is fine. On a bigger project, if you want lots of extra tools, you would probably park them off in a sub-directory to avoid clutter.

So that's more or less it. Let's run this and see what happens. Ah! first I should just mention that we will compile in a separate build tree, keeping the source tree free from object files and other build artifacts. This is just good practice.

$ cd ansi_console
$ mkdir build
$ cd build
$ cmake -DCMAKE_TOOLCHAIN_FILE=$(pwd)/../devkitArm.cmake ..
-- The C compiler identification is GNU
-- The CXX compiler identification is GNU
-- Check for working C compiler: /path/to/devkitARM/bin/arm-eabi-gcc
-- Check for working C compiler: /path/to/devkitARM/bin/arm-eabi-gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /path/to/devkitARM/bin/arm-eabi-g++
-- Check for working CXX compiler: /path/to/devkitARM/bin/arm-eabi-g++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Looking for grit
-- Looking for grit -- /home/rich/src/ndsdev/devkitARM/bin/grit
-- Looking for arm-eabi-objcopy
-- Looking for arm-eabi-objcopy -- /path/to/devkitARM/bin/ndstool
-- In data
-- Configuring done
-- Generating done
-- Build files have been written to: /path/to/examples-cmake/ansi_console/build

Now if we look at what has been produced we see that there's a normal Makefile in the build directory. So let's run it...

$ make
Scanning dependencies of target ansi_console
[ 25%] Building C object ansi_console/CMakeFiles/ansi_console.dir/source/main.c.obj
Linking C executable ansi_console
[ 25%] Built target ansi_console
Scanning dependencies of target Target9_ansi_console
[ 25%] Generating ansi_console.bin
[ 25%] Generating ansi_console.nds
Nintendo DS rom tool 1.38 - May 14 2008
by Rafael Vuijk, Dave Murphy, Alexei Karpenko
[ 75%] Built target Target9_ansi_console
Scanning dependencies of target TargetObjCopy_ansi_console
[100%] Built target TargetObjCopy_ansi_console

Nice coloured output there. And we have our ansi_console.nds. That's it, job done. As you can see, the Makefile tracks dependencies. For this simple example it isn't too bad - everything really only depends on the main.c file. But for larger projects, not having to think about generating the dependencies manually is a big plus. Sadly we are limited to timestamp checks for updates to dependencies, the make legacy showing through. We can do better than that with hashes. I'll mention that when I discuss SCons next time.
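The difference between the two checks is easy to demonstrate. The sketch below writes a file, "touches" it so that only the timestamp changes, and then asks both questions. It is a toy model of the idea rather than how any particular build tool implements it, and SHA-256 is an arbitrary choice of hash for the example.

```python
import hashlib
import os
import tempfile

def content_hash(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    with open(path, 'rb') as handle:
        return hashlib.sha256(handle.read()).hexdigest()

# Write a source file and record both kinds of "signature".
path = os.path.join(tempfile.mkdtemp(), 'main.c')
with open(path, 'w') as handle:
    handle.write('int main(void) { return 0; }\n')
old_mtime = os.path.getmtime(path)
old_hash = content_hash(path)

# "Touch" the file: bump the timestamp without changing the contents.
os.utime(path, (old_mtime + 10, old_mtime + 10))

timestamp_says_rebuild = os.path.getmtime(path) > old_mtime
hash_says_rebuild = content_hash(path) != old_hash
print(timestamp_says_rebuild, hash_says_rebuild)
```

A timestamp check rebuilds after the touch even though nothing changed; a content hash does not, at the cost of reading the file.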

There are still things missing from this, but armed with the basics we can improve the build without too much hassle. For example, we'd need a tool to convert images to data using grit. There are more flags that we need to pass in for C++ code to turn off exceptions and RTTI, for example. And we still have the dual core problem. The grit tool would be pretty much the same as our objcopy_file example, except it'd change a png file to a c source file. A bit of a limitation that, CMake doesn't compile assembler into executables, only C/C++.

Dual core is not too bad - we already have the basics. If we follow the above but for the "combined" example, then the main CMakeLists.txt would have the following lines:

add_subdirectory(arm9)
add_subdirectory(arm7)

We wouldn't actually use the ADD_EXECUTABLE macro here. In the arm7/CMakeLists.txt file, we'd have something like this:

include_directories(${CMAKE_CURRENT_SOURCE_DIR})
add_definitions(-DARM7)
add_executable(combined_arm7 arm7.c)
target_link_libraries(combined_arm7 nds7)
set_target_properties(combined_arm7
 PROPERTIES
 LINK_FLAGS -specs=ds_arm7.specs)
objcopy_file(combined_arm7)

This layout is mandatory because the ADD_DEFINITIONS macro applies the -D flag in the directory where it is used and for all sub directories. I tried using REMOVE_DEFINITIONS to work around the problem, but that does not work:

add_definitions(-DARM7)
add_executable(combined_arm7 arm7/arm7.c)
remove_definitions(-DARM7)
add_definitions(-DARM9)
add_executable(combined_arm9 arm9/arm9.c)

CMake uses the final set of definitions for all the executables. So we'd end up with the -DARM9 flag for the ARM7 binary too, which is not what we want. Ok, so the arm9/CMakeLists.txt file would have similar content to the arm7/CMakeLists.txt one, just switching all "7"s for "9"s. At the end we'd use the following macro to combine the 2 binaries into the nds file, in combined/CMakeLists.txt:

ndstool_files(arm7/combined_arm7 arm9/combined_arm9 combined)

You have probably guessed that this macro is added to our ndsmacros.cmake file and would take 3 arguments:

macro(NDSTOOL_FILES arm7_NAME arm9_NAME exe_NAME)
set(FO ${CMAKE_CURRENT_BINARY_DIR}/${exe_NAME}.nds)
set(I9 ${CMAKE_CURRENT_BINARY_DIR}/${arm9_NAME}.bin)
set(I7 ${CMAKE_CURRENT_BINARY_DIR}/${arm7_NAME}.bin)
add_custom_command(
 OUTPUT ${FO}
 COMMAND ${NDSTOOL_EXE}
 ARGS -c ${FO} -9 ${I9} -7 ${I7})
get_filename_component(TGT "${exe_NAME}" NAME)
get_filename_component(TGT7 "${arm7_NAME}" NAME)
get_filename_component(TGT9 "${arm9_NAME}" NAME)
add_custom_target("Target97_${TGT}" ALL DEPENDS ${FO} VERBATIM)
add_dependencies("Target97_${TGT}"
 "TargetObjCopy_${TGT7}"
 "TargetObjCopy_${TGT9}")
get_directory_property(extra_clean_files ADDITIONAL_MAKE_CLEAN_FILES)
set_directory_properties(
 PROPERTIES
 ADDITIONAL_MAKE_CLEAN_FILES "${extra_clean_files};${FO}")
endmacro(NDSTOOL_FILES)

This is a bit more hacky than I would have liked; that add_dependencies line shouldn't be needed. But if we use DEPENDS in the add_custom_command macro as the documentation suggests, then we only see errors like "No rule to make target `combined/arm7/combined_arm7.bin'" - I imagine it is because of the use of sub directories. We need sub directories though, otherwise we cannot set flags per ARM core, so we'll have to live with the rather hacky solution here. Unless someone knows better and posts a comment! ;-)

That's the lot! Now we can compile either single or dual core nds files and know how to check for the installed tools and libraries. So should you use CMake? Here is my round up to help you decide!

The Bad

Documentation: CMake sometimes requires what feels like cargo cult coding to get the desired result. It took me a lot longer than I would have liked to get this working. Part of the problem as to why I couldn't figure this out was the lack of good documentation. The CMake wiki is, like most wikis, not very well organised. Most help ends up pointing you towards the main site's "documentation" page, which only tells you how to buy a CMake book. The cynical side of me believes that this is a deliberate strategy - why have good free docs when you can sell books?

Syntax: I think the whole mixed case thing is a bit yucky. ALL IN CAPS makes things tricky to read, but at least you know you'll get it right. Mixing case is easier on the eye, but has the potential for cock ups. Meh.

Make's limitations: As we've seen here, there are problems when we push CMake to do things it possibly wasn't meant to do. Often the lowest-common-denominator build tool underneath leaks through the abstraction CMake provides. When I was trying to get the nds file to depend on the 2 binaries built in the sub directories, all the errors came from make - the cmake script looked correct. And those time stamped dependency checks... are you from the past?

Needs installing: Before we can do anything we have to install CMake itself, as well as the bog-standard make (or platform equivalent). This is extra bootstrapping that is not always necessary - waf manages to get round this on Linux, at least.

The Good

Cross platform: we can generate build scripts native to the host platform. Cool!

Configure checks: We can check that the building machine has everything we need to compile our project. If not, we can provide useful info on how to obtain the code. This is better than having the build explode with "error: foo.h not found" followed by a billion other errors.

Compact syntax: The syntax of CMake is streamlined for building stuff. It's less wordy than make alone and more to-the-point than the other competitors in this field.

Starting to gain critical mass: I think this is important. There's no point using something if you're going to be the only one that uses it. CMake is now gaining enough popularity that it is fairly well tested, has a stable API and enough features to make it worthwhile.

Conclusion

Nobody is going to move away from regular Makefiles to compile the standard "Hello, World!" example. But when your project starts to grow in size and you introduce more sub-libraries, or unit tests, or want to have better control of the way the code gets built, then it's nice to know there are better options out there. I think CMake is a worthy candidate for consideration. The learning curve is quite shallow, especially if you already are familiar with Make or bash scripting, and it is something that is gradually gaining more mainstream recognition. Hey, at least it isn't as difficult to use as autohell ;-)

Tuesday, August 26, 2008

GMail in new Bunjalloo version

I've made a new release of Bunjalloo. This version adds inline images, amongst other things.

What I really want to discuss though is this comment that gronfeldt posted on the Bunjalloo wiki/discussion thing:

Comment by gronfeldt, Aug 07, 2008

The browser is great! It's great that the project is still active. I'm mostly
stoked about being able to access GMail, but after hours of trying everything I
can think of I'm still not having any luck. I found a blog somewhere that said:

"Luckily there is a work around - once logged in to the Google collective, you
can navigate to http://www.google.com/xhtml and from there to the Mobile GMail
portal. That works fine and has a better layout for the DS to boot."

The trouble is I can't get anywhere near a page that would allow me to log on
without having to wait a very long time only to receive a message saying
"Unable to load:........".

If someone who has been able to access GMail has a minute I would love it if
they could add a comment with the steps to getting access to GMail.

Thanks so much Richard for putting all this work into something for the
community, very admirable of you.

No problem gronfeldt! I hope this problem hasn't put you off. It's not too good that he couldn't access GMail, as it definitely works (for me, at least). A couple of things make it more difficult than it should be I suppose, so here are the instructions based on version 0.7.0.

First of all, navigate to https://mail.google.com. All ok so far.

Once the page loads, you will need to allow Google to set cookies. By default not all domains are allowed to set them, as this is a privacy problem and, on a limited memory device, a memory problem too. So select the options icon, then the cookie add icon. This is explained in the Bunjalloo wiki in more detail, just in case.

Allow the google.com top-level domain by selecting the radio button and clicking Accept. This means mail.google.com and www.google.com will both be able to do the cookie thing, which is required.

Now enter your user name and password as usual. Once entered, click the sign on button.

Now comes the tricky/buggy bit. There's a bug *somewhere* in Bunjalloo which causes the sign-on process to stop. You'll reach a redirect page and nothing seems to be happening (no more page loads), and it is because of the bug. I'm not sure why, but redirects screw up occasionally. Hitting refresh, or the "moved here" link fixes the problem though, and the sign on carries on... until it stops again :-)

Here it is due to the lack of javascript. So click the "no javascript" or "mobile device" link to finally log in to GMail.

That's it! Wasn't it easy? ;-) Next time cookies will be allowed, so you will only have to log-on/refresh/click. I'll keep trying to improve this as it is currently rather annoying, but hey! it (mostly) works. Shame SSL is so slow :-(

There are 2 places with room for improvement here. First, cookies are currently only partially implemented. They should be stored to disk as well as in memory, obeying the correct caching and expiry rules. This would mean that your log in session would be remembered between reboots. There's a lot of work needed there. Secondly, the redirection logic is broken. This is tricky to debug and has been plaguing Bunjalloo pretty much since I started so I would like to fix it "soon".

Wednesday, May 28, 2008

Google Code, Git and Subversion II

There have been a couple of posts on the Google Open Source Blog. The second one is an in-depth article about mirroring git repositories to Google Code's (GC) subversion repo. Having tried this out in the past, and helped a blog-commenter through the steps to mirror his git repo in svn, I can't help but feel that the post causes more problems than it solves.

First off, it's way too complicated! I'm quite experienced at using both subversion and git now, but I really cringe at that last set of commands to push to the svn mirror. Compared to "git push origin master" when using a native git repo, it makes me feel queasy. If I were the cynical type, I'd say that Google are just trying to make git look overly complicated compared to svn by forcing git to jump through hoops. Compare the command-fu there to a normal everyday "add, commit" work cycle and no wonder people have a bad impression of git.

Subversion is a lossy repository format. Author information is lost, date information is munged. The svn repo just has the author who ran dcommit, not the guy who wrote the patch. Nor will it preserve the correct date, just the time when you ran dcommit. The admin user may change subversion revprops (date, message, author) but that's more manual steps and more opportunity for messing up.

If anything goes wrong while pushing to svn - and it will, the Google svn servers fail every now and again, they're only mechanical - then your push-to-svn git branch gets hosed. Like "git reset git-svn" hosed, so you can just send one big patch. If you have to sync more than a few commits, it's a bit risky. The blog hints at this problem ("commit to a local repo and use svnsync"), but the fact is that it's all a bit slow, sucky and error prone.

However, there's an easier way - cut Subversion out of the loop. On your GC page, go to Administer > Tabs and fill in the Source field. Add a wiki file named Source, for example, and in it place instructions on how to clone your proper git repository from github.com, repo.or.cz, or wherever. Then just use svn for the wiki, GC for the downloads and the issue tracker.

Git works OK on Windows too, apart from some strange CRLF problems. Yes, CRLF problems. In 2008. I know, it's crap.

"Why not widen the audience?" the GOSB asks us. And I ask, why not enlighten the audience as to the benefits of moving on to a next gen VCS? Open up the minds of the Subversion-users, whose patch sets will be both easier for them to create and easier for you to integrate via Git.

Not that anyone ever sends me patches, grumble, grumble...

Sunday, March 16, 2008

Forthcoming changes in Bunjalloo

An update on the status of Bunjalloo. I'm fiddling about with options and how to present them at the moment. Last release saw a change in the way I deal with configuration. Up until 0.5.4, the release zip would overwrite the config.ini, search.cfg and allowed cookies file. It was pretty unlikely that many people actually changed these files, since the configurable things have been a bit limited and changing them required a bit of effort, but as options are added it is more and more likely. Well, now it will be possible to change the language and default download location from a configuration screen. The design is typically lo-fi, but seems to work okay.

Next up is to make the list of pages allowed to set cookies editable. The default behaviour of all popular browsers - allow anyone to set any cookie - is not a very wise choice from a privacy point of view. So I've gone the other way and made the list of allowed sites whitelist-based. If you're not on the list, you're not setting cookies. However, in order to add a site to the list one has to use an external editor, which isn't too user friendly. A way to do this from a settings screen is on the list of things to do.
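The whitelist check can be sketched in a few lines of shell. This is only an illustration of the idea, not Bunjalloo's actual code: the function name cookie_allowed and the one-domain-per-line file format are assumptions made for the example. A host may set cookies if it matches a listed domain exactly or is a subdomain of one.

```shell
# Hypothetical sketch: succeed if host $1 may set cookies according to the
# whitelist file $2 (assumed format: one allowed domain per line).
cookie_allowed() (
    host=$1
    whitelist=$2
    # The "|| [ -n ... ]" part also handles a last line with no newline.
    while IFS= read -r domain || [ -n "$domain" ]; do
        # Skip blank lines.
        [ -n "$domain" ] || continue
        case "$host" in
            # Exact match, or any subdomain of the listed domain.
            "$domain"|*."$domain") exit 0 ;;
        esac
    done < "$whitelist"
    # Not on the list: no cookies.
    exit 1
)

# Example usage with a throwaway whitelist file:
#   printf 'google.com\n' > allowed.txt
#   cookie_allowed mail.google.com allowed.txt && echo "cookies allowed"
```

Matching on ".domain" rather than a plain suffix means that allowing google.com covers mail.google.com and www.google.com, but not some unrelated site whose name merely ends in the same letters.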

If the release doesn't drag on, I also want to add an automatic updater, as suggested by Sarvesh. This will probably be semi-automatic - i.e. a button with "check for updates" - since the DS network is not the fastest and overzealous checking would be most irritating. This requires adding (at least) unzip support, so would be quite a big addition. It would also mean that zip files could be opened up from within Bunjalloo, which might be interesting.

If I manage all that, which is not likely, I also want to add configurable quick searches. These are searches that are triggered by typing a single letter followed by the search term into the address entry text box. Currently these are hard wired in the search.cfg file (g, y and w for Google, Yahoo and Wikipedia), but a system similar to Firefox's quick search wouldn't be too tough to add.
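To make the quick search idea concrete, here is a rough shell sketch of the expansion step. It is not how Bunjalloo does it, and it does not reflect the real search.cfg format: the function name quick_search and the three URL prefixes are assumptions chosen for the example.

```shell
# Hypothetical sketch: expand "<letter> <search term>" into a search URL.
# The letter-to-URL table is illustrative, not the real search.cfg contents.
quick_search() (
    input=$1
    key=${input%% *}    # text before the first space
    term=${input#* }    # text after the first space
    # No space at all, or a prefix longer than one letter: not a quick search.
    [ "$key" != "$input" ] && [ ${#key} -eq 1 ] || exit 1
    case "$key" in
        g) prefix='http://www.google.com/search?q=' ;;
        y) prefix='http://search.yahoo.com/search?p=' ;;
        w) prefix='http://en.wikipedia.org/wiki/Special:Search?search=' ;;
        *) exit 1 ;;
    esac
    # Only spaces are encoded here; real code would need full URL escaping.
    term=$(printf '%s' "$term" | tr ' ' '+')
    printf '%s%s\n' "$prefix" "$term"
)

# Example usage:
#   quick_search 'w nintendo ds' || echo "not a quick search, treat as a URL"
```

Making the searches configurable would then mostly be a matter of reading the letter and prefix pairs from a file instead of the hard wired case statement.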

Of course if you would like to see any other features, or help speed up the addition of the above ones, patches are always welcome. I'd be interested to hear from anyone who has tried to compile the source code, and how it went.

Thursday, March 13, 2008

My top 6 git commands.

I've been using git for about 6 months now with my Google Code projects but I am by no means an expert. There's a lot to learn. While the documentation is complete, it can be overwhelming seeing all those options and commands. So here are the 6 git commands that I use most often when working with subversion repositories.

1. git status

My most commonly used command by a fairly large margin. For a subversion user it can look a bit confusing at first. Why have two classes of changed files? Why not just commit all the changes? Hey, at least it can be in colour! But now that I understand the index and when to stage changes, I sorely miss it in svn.

Don't forget to add this alias - it saves a lot of typing.

git config --global alias.st status

Now you can use simply git st.

2. git gui

This is a powerful mix of the status, diff, add and reset commands. By showing the changed files and letting you mix and match which ones to commit, creating meaningful patches and a logical check-in history is a lot easier. It also makes it harder to accidentally check in changes that are just for testing - though not impossible. Did I mention how much I really miss the whole staging area idea when I use subversion?

3. gitk --all

Having branches is nice, but if you can't visualize them it is pointless. This tool makes it dead simple to see what branches you have and, importantly for git-svn users and those who push to proper git repos, to see at a glance where the HEAD is compared to your remote branches.

4. git svn

Usually in the form of git-svn dcommit. I've aliased that one to svnci:

git config --global alias.svnci "svn dcommit"

This makes working with SVN repositories a much nicer experience. Being able to see all the history at a glance, commit changes instantly and "rewrite history" are especially handy.

5. git merge

When I wrote my HOWTO for using git with a Google Code repository, I mentioned that merging between branches was a bad idea. Well, that's not strictly true. At the time I was ignorant of the fact that the merge performs a "fast forward merge" if there are no other changes on the merged-to branch.

Eh?

Basically, with git-svn, if you merge a branch into master and master has not moved on since the other branch was created, the fast forward can cause master to lose its upstream svn remote trunk tracking. The solution is to use the --no-ff flag when merging.

git checkout master
git merge --no-ff mytopic

This creates an extra commit on master corresponding to the merge, which can now be dcommit'ed to subversion. Ideally the other branch would also be in subversion, otherwise the merge commit just says "merged branch 'mytopic'", with no real way of knowing what commits went into it.

Of course should you lose your git repository there would be no way to know which revisions of which branches are merged into master (trunk), even if the branches are stored in the svn server. This is more a limitation due to svn than git, as svn has no built in merge tracking. One solution would be to push the git repository to a separate server too, just in case.
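If you want to see the effect of --no-ff for yourself, it can be tried out in a throwaway repository. The following is a small sketch, assuming git is installed; the function name demo_no_ff, the file name and the identity settings are made up for the demo, and it uses a plain git repository rather than a git-svn one.

```shell
# Sketch: merge a topic branch with --no-ff in a temporary repository and
# report how many parents the resulting commit on the base branch has.
demo_no_ff() (
    set -e
    dir=$(mktemp -d)
    cd "$dir"
    git init -q
    # Local identity so that commits work on an unconfigured machine.
    git config user.name "Example"
    git config user.email "example@example.com"
    echo one > file.txt
    git add file.txt
    git commit -q -m "first commit"
    # Name of the initial branch (master, unless configured otherwise).
    base=$(git symbolic-ref --short HEAD)
    git checkout -q -b mytopic
    echo two >> file.txt
    git commit -q -a -m "topic work"
    git checkout -q "$base"
    # The base branch has not moved, so a plain merge would fast forward.
    git merge -q --no-ff -m "Merge branch 'mytopic'" mytopic
    # rev-list prints the commit hash followed by the hashes of its parents.
    set -- $(git rev-list --parents -n 1 HEAD)
    cd /
    rm -rf "$dir"
    echo "parents: $(($# - 1))"
)
```

With --no-ff the commit at the tip of the base branch should be a real merge commit with two parents. Drop the flag and git simply moves the branch pointer to the topic commit, which has a single parent.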

6. git rebase

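So you've created a local branch that you'd like to "publish" in the subversion repository. Here's how you could go about it using git-rebase and a couple of other tricks. In the commands below, $branch is the name of the local branch and $SVNREPO is the URL of the subversion repository. First find the svn revision that the git branch is based on: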

sha1=$( git rev-list --boundary $branch...master | grep ^- | cut -c2- )

That should hopefully give you the commit sha1 of the branch point. Convert it to a svn revision number:

revision=$( git svn find-rev $sha1 )

Now create a branch in the subversion repository from trunk at that revision:

svn cp -r $revision $SVNREPO/trunk $SVNREPO/branches/$branch

If you run git svn rebase --all at this point, you'll have a new remote branch at the same point as the base of your git branch. Something like the image here. The hiccup now is that [branch] is not associated to the remote branch in any way. Here's where we need to rebase:

git rebase --onto remotes/$branch master $branch

This "moves" our local branch on top of the remote branch, keeping master as the parent of both. Running git-svn dcommit now will "push" the local changes on [branch] into the remote subversion branch. Ok, so maybe I don't use this command that often, but it is handy to know what you can use it for.

Monday, February 25, 2008

Bunjalloo Update

Lately I've read some positive comments about my home-brewed web browser on a few different forums and news sites. It's nice to know that people find the software useful! I've also received some suggestions via email on how to make the browser even better for people. One of the interesting things about having a homebrew browser is that it can do things that the official browser can't, like download files to the memory card or provide translations for languages other than the official 5.

So with this in mind, I'm now working on improving the way Bunjalloo downloads files. Currently it's not very user friendly and requires you to download the file, then save it. This is pretty unintuitive - it'd be better to sniff out the MIME type of the file that the user clicks on, then either download and show it for supported image and text MIME types, or offer to save it to disk for "exotic" MIME types. Y'know, a bit like everyone else does it :-)
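The decision itself is simple enough to sketch in shell. This is just an illustration of the logic described above, not code from Bunjalloo: the function name download_action and the exact list of "supported" types are assumptions for the example.

```shell
# Hypothetical sketch: choose what to do with a download based on the
# Content-Type header value. The supported types listed are illustrative.
download_action() (
    # Drop any parameters such as "; charset=utf-8".
    mime=${1%%;*}
    case "$mime" in
        # Types the browser can render itself: fetch and display.
        text/*|image/png|image/gif|image/jpeg) echo show ;;
        # Anything "exotic": offer to save it to disk instead.
        *) echo save ;;
    esac
)

# Example usage:
#   download_action 'application/zip'
```

A real implementation would probably also want a fallback for servers that send a wrong or missing Content-Type, for example by looking at the file extension.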

As for translations, there's not much user interface to translate at the moment, but what is there has now been pulled out into resource files, defined per language. Currently only the languages supported internally by the DS are selectable, and the language displayed can only be altered by changing the DS's language option from the system settings. But this is only a beginning. If there is enough interest I can extend language support to those not built in to the DS via a configuration option. The actual translations into other languages need doing, so if you want to help out let me know. I have English and Spanish done; Italian, French and German are needed, as is Japanese. Sadly Japanese won't work without changing the font, as the one I have at the moment doesn't include the full range of universal characters.

Speaking of helping out, keep that feedback coming. Star issues that you think are most important, report bugs, even write patches ;-) There's a new mailing list/forum for discussing Bunjalloo if mailing me is too much hassle. I look forward to hearing from you!
