Wednesday, 23 December 2015

MDK OS update on file systems.

I am currently working in parallel on an implementation of the FAT32 file system. Initially the work will only cover reading the file system. Write and modify support will come later.

Now what is the need for implementing it from scratch?
It is to learn how FAT32 is implemented. I will be writing posts on different aspects of the file system and the design decisions taken. This will hopefully act as a reference for people implementing their own, or help them debug their system.

The development will be done outside the MDK OS tree. This is mainly for the ease of development and debugging. Later on it will be integrated with the MDK OS.

Friday, 18 December 2015

Interrupt handling in the MDK OS.

In ARM microprocessors the memory address 0x00000000 is reserved for the vector table, which is a set of 32-bit words. When an interrupt occurs the processor suspends normal execution and starts fetching instructions from the corresponding entry in the exception vector table. Each entry usually contains some form of branch instruction to a particular routine.

The interrupt vector table is as follows:

Vector      Address
Reset       0x00000000
Undefined   0x00000004
SWI         0x00000008
PABT        0x0000000C
DABT        0x00000010
Reserved    0x00000014
IRQ         0x00000018
FIQ         0x0000001C
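Each entry in the table is one 32-bit word, so the address of a vector is simply its index times 4. A minimal C sketch of that arithmetic follows; the enum names and vector_address() are illustrative and do not come from the MDK sources.

```c
#include <stdint.h>

/* Exception vector indices in table order (names are illustrative). */
enum arm_vector {
    VEC_RESET = 0,
    VEC_UNDEF,
    VEC_SWI,
    VEC_PABT,
    VEC_DABT,
    VEC_RESERVED,
    VEC_IRQ,
    VEC_FIQ
};

/* Each vector slot holds one 32-bit instruction, so slots are 4 bytes apart. */
static uint32_t vector_address(enum arm_vector v)
{
    return (uint32_t)v * 4u;
}
```

For example, vector_address(VEC_IRQ) gives 0x18 and vector_address(VEC_FIQ) gives 0x1C.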


In the S3C2440 after a power on reset the initial 4KB of the NAND flash memory will be loaded onto an internal boot SRAM called the "stepping stone" buffer and the boot code present in this memory address will be executed. The loader is flashed onto the NAND flash using supervivi.

Interrupt handling in loader:

The "stepping stone" buffer SRAM memory map address is located at 0x00000000. Hence our MDK loader gets executed from there. The MDK loader has the following code at the start:

.section .text
.code 32
.globl vectors

vectors:
 b reset  /* Reset */
 b fault_state  /* Undefined instruction */
 b fault_state /* Software Interrupt */
 b fault_state  /* Abort prefetch */
 b fault_state  /* Abort data */
 b .  /* Reserved */
 b fault_state /* IRQ */
 b fault_state /* FIQ */

The code is placed in the .text section. The addresses in this section are generated starting from 0x00000000. The relevant fragment of the linker script is below:

MEMORY
{
 sram : org = 0x00000000 , len = 0x1000
 sdram : org = 0x30000000 , len = 0x4000000
}

SECTIONS
{
 .text :
 {
  *(.text);
  . = ALIGN(4);
 } > sram

As shown above, the .text section is placed in the sram region, which starts at 0x00000000 and has a length of 0x1000 (4096) bytes, i.e. 4KB.

Notice that my reset vector contains a branch to the reset label. The reset code fragment is as follows:

reset:
 /* Start by clearing bss section */
 ldr r1, bss_start
 ldr r2, bss_end
 ldr r3, =0

clear_bss:
 cmp r1,r2
 strne r3,[r1],#4  /* store only while r1 has not reached bss_end */
 bne clear_bss

 /* load r13 i.e. stack pointer with stack_pointer */
 ldr r13,stack_pointer

 bl main

Here I load bss_start and bss_end, whose values come from the linker script file. Next, in clear_bss, I compare r1 (the running pointer, which starts at bss_start) with r2 (bss_end). I clear one word of the bss by storing r3 at the address held in r1 and post-incrementing r1 by 4. If r1 has not yet reached r2 the loop continues; otherwise I load the stack pointer into r13 and branch to main. Note that the loop relies on bss_start and bss_end being a multiple of 4 bytes apart. Here main is the main() function in the os_main.c file.
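For readers more comfortable with C, the loop can be sketched as follows. This is only an illustration that runs on a host machine: clear_words() is a hypothetical name, and on the target the two pointers would come from the linker symbols rather than from an array.

```c
#include <stdint.h>

/*
 * Zero every 32-bit word in [start, end).
 * The test happens before the store, so nothing at 'end' is written.
 */
static void clear_words(uint32_t *start, const uint32_t *end)
{
    uint32_t *p = start;

    while (p != end) {
        *p++ = 0u;
    }
}
```

As with the assembly, the loop only terminates if start and end are a whole number of words apart, which the ALIGN(4) statements in the linker script are meant to help with.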

Where do the stack_pointer, bss_start and bss_end variables come from?

The code fragment below explains:


stack_pointer: .word __stack_top__
bss_start : .word __bss_start__
bss_end : .word __bss_end__

Where did the __stack_top__,__bss_start__ and __bss_end__ come from?

The linker script code explains:

SECTIONS
{
 .text :
 {
  *(.text);
  . = ALIGN(4);
 } > sram

 .data :
 {
  __data_start__ = .;
  *(.data)
  . = ALIGN(4);
  __data_end__ = .;
 } > sram

 .bss  :
 {
  __bss_start__ = .;
  *(.bss); *(COMMON)
  __bss_end__ = .;

  __stack_bottom__ = .;
  . += 0x300;
  __stack_top__ = .;

 } > sram

Notice that the linker script variables have global visibility, so we can take the generated addresses and use them in our code. Notice also that there are 0x300 (768) bytes of space between __stack_bottom__ and __stack_top__. Please note that we load __stack_top__ into r13 (SP) because the stack is a descending stack.
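To make the "descending stack" point concrete, here is a small host-side C model of a full descending stack. The struct and function names are hypothetical; only the 0x300 size is taken from the linker script.

```c
#include <stdint.h>

#define STACK_WORDS (0x300 / 4)   /* 0x300 bytes, as reserved in the linker script */

struct fd_stack {
    uint32_t mem[STACK_WORDS];
    uint32_t *sp;                 /* plays the role of r13 */
};

/* Start with sp one past the highest word, like loading __stack_top__. */
static void fd_stack_init(struct fd_stack *s)
{
    s->sp = s->mem + STACK_WORDS;
}

/* Full descending push: decrement first, then store. */
static void fd_push(struct fd_stack *s, uint32_t value)
{
    *--s->sp = value;
}

/* Pop: load, then increment. */
static uint32_t fd_pop(struct fd_stack *s)
{
    return *s->sp++;
}
```

Because every push moves sp downwards, the initial value has to be the top of the reserved area, not the bottom.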

We are not handling any other exceptions in the loader. If any of them occurs we just jump to a fault state as shown below:


fault_state:
 ldr r3,GPBCON
 ldr r4,GPBDAT
 ldr r5,GPBUP

 ldr r6,=0x15400
 str r6,[r3]  @Set to output
 ldr r6,=0x00
 str r6,[r4]  @Set the led
 ldr r6,=0x1E0
 str r6,[r5]  @Disable pullup 

 b .

I have set up the LEDs to glow so that I can tell I am in a fault state.
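The constants in fault_state can be decoded with a little bit arithmetic. The sketch below assumes the LEDs sit on port B pins 5 to 8, which is what the values 0x15400 and 0x1E0 suggest; the helper names are made up for illustration. On the S3C2440, GPBCON uses two bits per pin (01 selects output) while GPBDAT and GPBUP use one bit per pin.

```c
#include <stdint.h>

/* GPBCON uses 2 bits per pin; the value 01 selects output mode. */
static uint32_t gpbcon_outputs(unsigned first_pin, unsigned last_pin)
{
    uint32_t value = 0;
    unsigned pin;

    for (pin = first_pin; pin <= last_pin; pin++) {
        value |= 1u << (2u * pin);
    }
    return value;
}

/* GPBDAT and GPBUP use 1 bit per pin. */
static uint32_t gpb_pin_mask(unsigned first_pin, unsigned last_pin)
{
    uint32_t value = 0;
    unsigned pin;

    for (pin = first_pin; pin <= last_pin; pin++) {
        value |= 1u << pin;
    }
    return value;
}
```

With pins 5 to 8, gpbcon_outputs() yields 0x15400 and gpb_pin_mask() yields 0x1E0, matching the values written to GPBCON and GPBUP. Writing 0 to GPBDAT then drives all four pins low.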

This completes interrupt handling in the loader after a Power on Reset. Next we will see how we will handle this in the MDK OS.

Interrupt handling in MDK OS:

In the MDK OS the interrupt handling is done differently. We face several problems with using the initial vectors to jump directly to a particular interrupt handling routine. The first is that a routine placed in the SDRAM, which starts at address 0x30000000, is too far away: an ARM branch (b) instruction can only reach roughly 32MB in either direction from its own address.
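The range limit can be checked with a few lines of C. This is a sketch of the general ARM rule (a signed 24-bit word offset, measured from the branch address plus 8), not code from the MDK OS; branch_reaches() is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * The ARM 'b' instruction encodes a signed 24-bit word offset relative to
 * the address of the branch plus 8 (pipeline prefetch).
 */
static bool branch_reaches(uint32_t from, uint32_t to)
{
    int64_t offset = (int64_t)to - ((int64_t)from + 8);

    return offset >= -(INT64_C(1) << 25) &&
           offset <=  (INT64_C(1) << 25) - 4;
}
```

A branch from the IRQ vector at 0x18 to anything at 0x30000000 or above fails this check, while a branch between two addresses inside the SDRAM image passes it.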

So how did I fix this? I enabled the MMU and mapped address 0x00000000 to EXCEPTION_INTERRUPT_VECTOR_TABLE_START, which is presently hard coded to 0x33F00000. So now whenever the processor jumps to 0x00000000, the MMU translates the address to 0x33F00000 and the processor executes the content at that address.
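The MMU setup code itself is not shown in this post, so the following is only a sketch of what such a mapping looks like with 1MB sections in the ARMv4/v5 first-level descriptor format. The function names, the AP value 3 and domain 0 are assumptions chosen for illustration.

```c
#include <stdint.h>

#define SECTION_SHIFT 20u                 /* 1MB sections */
#define SECTION_MASK  0xFFF00000u

/*
 * Build a first-level section descriptor (ARMv4/v5 MMU format):
 * bits [31:20] physical base, [11:10] AP, [8:5] domain,
 * bit 4 set, bits [1:0] = 0b10 to mark a section. Cache/buffer bits left 0.
 */
static uint32_t section_descriptor(uint32_t phys, uint32_t ap, uint32_t domain)
{
    return (phys & SECTION_MASK) | ((ap & 3u) << 10) |
           ((domain & 0xFu) << 5) | (1u << 4) | 2u;
}

/* What the MMU does on a section hit: swap the top 12 bits, keep the offset. */
static uint32_t translate(const uint32_t *table, uint32_t virt)
{
    uint32_t desc = table[virt >> SECTION_SHIFT];

    return (desc & SECTION_MASK) | (virt & ~SECTION_MASK);
}
```

With entry 0 of the table set to section_descriptor(0x33F00000, 3, 0), a fetch from the IRQ vector at 0x00000018 is translated to 0x33F00018.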

So how is the implementation done?

First we visit the code where the exception vectors are written(os_vectors.s).


.section .vector_reloc,"ax" //Apparent fix for missing section when objcopy is to have allocatable and executable flags-"ax"

.code 32

.globl exception_vectors

exception_vectors:
 ldr pc,=do_handle_reset  //Reset vector
 ldr pc,=do_handle_undef  //Undefined instruction
 ldr pc,=do_handle_swi   //Software Interrupt
 ldr pc,=do_handle_pabt   //Abort prefetch
 ldr pc,=do_handle_dabt  //Abort data
 ldr pc,=do_handle_reserved //Reserved
 ldr pc,=do_handle_irq  //IRQ
 ldr pc,=do_handle_fiq  //FIQ
.end

The code is put at section .vector_reloc (intuitive name vector relocation).

The exception vector code by itself is very simple. It just loads the PC (Program Counter) register with the different exception handlers.

How is the address generated for the code? It would be EXCEPTION_INTERRUPT_VECTOR_TABLE_START.

How is the above address generation determined? We have to look at the linker script of the MDK OS(mdkos.lds).

First we look at the memory section:


MEMORY
{
 sram : org = 0x00000000 , len = 0x1000
 /*sdram : org = 0x30000000 , len = 0x4000000*/
 sdram : org = 0x30000000 , len = 0x3F00000 /* 63MB RAM */
 vectors : org = 0x33F00000 , len = 0x100000 /* Last 1MB for the isr handlers */
}

I have defined vectors region starting at 0x33F00000.
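The numbers in the MEMORY block can be sanity checked with a little arithmetic: the vectors region should begin exactly where the sdram region ends, and the two together should add up to the 64MB of the original commented-out definition. A small C sketch, with macro and function names invented for this example:

```c
#include <stdint.h>

#define SDRAM_ORG    0x30000000u
#define SDRAM_LEN    0x03F00000u   /* 63MB */
#define VECTORS_ORG  0x33F00000u
#define VECTORS_LEN  0x00100000u   /* 1MB */
#define MB           0x00100000u

/* 1 if the vectors region starts exactly where the sdram region ends. */
static int regions_are_contiguous(void)
{
    return SDRAM_ORG + SDRAM_LEN == VECTORS_ORG;
}

/* Combined size of both regions in MB. */
static uint32_t total_mb(void)
{
    return (SDRAM_LEN + VECTORS_LEN) / MB;
}
```

Both checks hold for the values above: the regions are contiguous and total 64MB.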

Next we see the sections:


SECTIONS
{

 .text :
 {
  *(.text);
  . = ALIGN(4);
 } > sdram


 .data :
 {
  __data_start__ = .;
  *(.data);
  . = ALIGN(4);
  __data_end__ = .;
 } > sdram

/*
 * Note on constant string bug (related to .rodata): 
 * There was a bug initially when printing a string constant would make the 
 * device go into a loop printing nonsense. This was due the fact that .rodata section
 * was left out. Due to this the addresses of the constant was emitted after the interrupt
 * vectors but the actual address of the constant was somewhere in between the file. (it
 * was after the stack setup. All the functions which referred to the string would use
 * the address which was emitted at the end of the isr handlers but the string was sitting
 * way before. It would have worked if after startup the string was moved to the address
 * at the end of the isr handler. Instead of doing this we can create a .rodata section and
 * put in the RAM. Also make sure we don't overwrite the read only section with some 
 * method. We can later write the .rodata to say flash and lock the write and do only a
 * read.
 */
 .rodata :
 {
  __rodata_start__ = .;
  *(.rodata);
  . = ALIGN(4);
  __rodata_end__ = .;
 } > sdram

 .bss  :
 {
  __bss_start__ = .;
  *(.bss); *(COMMON)
  __bss_end__ = .;

  __usr_sys_stack_bottom__ = .;
  . += 0x1000;
  __usr_sys_stack_top__ = .;

  __irq_stack_bottom__ = .;
  . += 0x1000;
  __irq_stack_top__ = .;

  __fiq_stack_bottom__ = .;
  . += 0x1000;
  __fiq_stack_top__ = .;

  __svc_stack_bottom__ = .;
  . += 0x1000;
  __svc_stack_top__ = .;
 } > sdram

 
 .vector_reloc :
 {
  *(.vector_reloc);
 } >vectors AT>sdram 

 /* Get the lma address for the particular section */
 __exception_vector_reloc_startaddr__ = LOADADDR(.vector_reloc);
 __exception_vector_reloc_endaddr__ = LOADADDR(.vector_reloc) + SIZEOF(.vector_reloc);

 /* 
     * Above SDRAM is where it will be stored in the file but address
     * references will be in the addresses of the isr handler section
  */


 .isrhandler :
 {
  *(.isrhandler);
 } >vectors AT>sdram

 __exception_handler_start_addr__ = LOADADDR(.isrhandler);
 __exception_handler_end_addr__ = LOADADDR(.isrhandler) +  SIZEOF(.isrhandler);


 /* 
  * >vma region AT > lma region 
  */

 /* 
  * eg: .data section is linked with LMA in ROM and
  * the VMA pointing to the real RAM versions
  */

 .stab 0 (NOLOAD) : 
 {
  [ .stab ]
 }

 .stabstr 0 (NOLOAD) :
 {
  [ .stabstr ]
 }
}

In the .vector_reloc part (line 65 of the script) I tell the linker to generate addresses in the range defined by vectors, i.e. from 0x33F00000. This will be the VMA region.

Now how do I know where the code is loaded?
The loader places the image at address 0x30000000, which is the start of the SDRAM. Inside the image the vector code is stored after the .bss section. __exception_vector_reloc_startaddr__ and __exception_vector_reloc_endaddr__ contain the start and end load addresses of the exception vector section, so once the image is loaded the code sits at some address 0x30XXXXXX. This is the LMA region. From there the code has to be copied to the EXCEPTION_INTERRUPT_VECTOR_TABLE_START (0x33F00000) region.

The copying of this code is done in the following way (setup_interrupt_vector_table(..) in os/mmu.c):


static void setup_interrupt_vector_table()
{
/*
 * TODO: Optimize it to remove the extra index variables. Unoptimized only for test purposes.
 *
 */

 char *vector_table = (char *)EXCEPTION_INTERRUPT_VECTOR_TABLE_START;

 /* 
  * Need to get the lma of the code.
  * The __exception_vector_reloc_startaddr__ is the lma i.e. the generated 
  * address in the file. I need to use this as the start address for the 
  * later vectors and handlers.
  */

 char *src = (char *)__exception_vector_reloc_startaddr__; 
      
 uint32_t i = 0;

 for(i = (uint32_t)__exception_vector_reloc_startaddr__; 
   i<(uint32_t)__exception_vector_reloc_endaddr__; 
    i++) {
  *vector_table = *src;
  vector_table++;
  src++;
 }


 /* Continue with the same place for handler source  */
 for(i = (uint32_t)__exception_handler_start_addr__; 
     i<(uint32_t)__exception_handler_end_addr__;
     i++) {
  *vector_table = *src;
  vector_table++;
  src++;
 }

}



In line 17 we take the LMA start address of the .vector_reloc section as the source, and the first loop copies everything up to its end address to the vector_table pointer, which points to EXCEPTION_INTERRUPT_VECTOR_TABLE_START, i.e. address 0x33F00000.

After that we continue by copying the contents of the interrupt handlers. The ISR handlers are placed right after the exception vectors.
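Since both loops read from consecutive source bytes and write to consecutive destination bytes, the copy can be expressed with one small helper called once per section (or once for the combined length). This is a sketch of that idea under the assumption that the two sections really are back to back at both the LMA and the VMA; relocate() is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy 'len' bytes from the load address to the run address.
 * A plain byte loop is used on purpose: this runs before any C library
 * is available on a bare-metal target.
 */
static void relocate(volatile uint8_t *dst, const uint8_t *src, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        dst[i] = src[i];
    }
}
```

On the target, dst would be EXCEPTION_INTERRUPT_VECTOR_TABLE_START and len the difference between the end and start symbols of each section.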

The interrupt service handlers are placed in file exception_handler.s under the section .isrhandler

The code fragment is as follows:


.section .isrhandler,"ax"


.code 32


.globl do_handle_reset
do_handle_reset:
 b do_handle_reset

.globl do_handle_undef
do_handle_undef:
 b do_handle_undef

.globl do_handle_swi
do_handle_swi:
 b do_handle_swi

.globl do_handle_pabt
do_handle_pabt:
 b do_handle_pabt

.globl do_handle_dabt
do_handle_dabt:
 b do_handle_dabt

.globl do_handle_reserved
do_handle_reserved:
 b do_handle_reserved


.globl do_handle_irq
do_handle_irq:
 sub lr,lr,#4 @Subtract r14(lr) by 4.
 stmfd sp!, {r0-r12,lr} @Save r0-r12 and lr. 
       @sp! indicates sp will be subtracted by the sizes of the registers saved.
       @Instruction details can be read in ARM System Developers guide book at Pg 65.
 /*
  * Note on disabling and enabling CPU IRQ.
  * ======================================
  * There is no need to disable IRQ when in IRQ mode. When there is 
  * an interrupt the processor switches to IRQ mode with the I bit 
  * enabled which means it is masked.
  *
  * It was tested by printing the cpsr_irq which had the value
  * 0x60000092. The 7th bit is set which means the IRQ flag is set.
  *
  * This is the same case with the FIQ.
  */
 
 ldr r2,INTOFFSET    @Load the INTOFFSET value into r2
 ldr r2,[r2]      @Load the value in the address to r2
 

 ldr r3,=interrupt_handler_jmp_table @Load the address of the interrupt handler jump table.

 mov lr,pc
 ldr pc,[r3,r2,LSL #2] @Load the value which is the interrupt handler jmp table.


// bl handle_irq

 //Clear interrupt source pending

 ldr r2,INTOFFSET    @Load the INTOFFSET value into r2
 ldr r2,[r2]      @Load the value in the address to r2

 mov r3,#1    @move 1 to r3.
 mov r3,r3, LSL r2   @Shift left by INTOFFSET and store it in r3
 
 ldr r4,SRCPND
 str r3,[r4]   @Store the value of r3 in r4 address

 ldr r4,INTPND
 str r3,[r4]   @Store the value of r3 in r4 address

 
 
 ldmfd sp!, {r0-r12,pc}^  @Restore r0 to r12 from the stack and load the saved lr into pc.
       @The ^ indicates the spsr has to be copied to cpsr. The cpsr was copied to spsr
       @when the interrupt was generated.
       @The restoration of CPSR will change the mode to whatever mode was
       @present before the interrupt was called.
 

.globl do_handle_fiq
do_handle_fiq:
  b do_handle_fiq


For example, os_vectors.s contains ldr pc,=do_handle_irq, while the implementation of do_handle_irq itself lives in exception_handler.s.

This concludes the memory juggling needed to execute the interrupts.

Handling of various IRQs:

To get an interrupt you have to enable IRQ and FIQ globally by clearing the I and F mask bits in the CPSR register. This is done as follows:


static void enable_irq_fiq(void)
{
 uint32_t cpsr_val = 0;

 __asm__ __volatile__ (
  "mrs r0,cpsr\n\t"   /* Copy CPSR to r0 */
  "bic r0,r0,#0xC0\n\t"  /* Clear IRQ, FIQ */
  "msr cpsr,r0\n\t"   /* Copy modified value to cpsr */
  "mov %0,r0\n\t"
  : [cpsr_val]"=r"(cpsr_val) /* Output: the updated CPSR value */
  : /* No input */
  : "r0" /* r0 gets clobbered */
 );

 //print_hex_uart(UART0_BA,cpsr_val);
}

An optimization would be to rewrite as a macro.
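The bit manipulation done by the bic instruction can be tried out on a host machine in plain C. The macro and function names below are invented for illustration; the bit positions (I at bit 7, F at bit 6, mode in the low 5 bits) are the standard ARM CPSR layout.

```c
#include <stdint.h>

#define CPSR_I_BIT     0x80u   /* bit 7: IRQ masked when set */
#define CPSR_F_BIT     0x40u   /* bit 6: FIQ masked when set */
#define CPSR_MODE_MASK 0x1Fu
#define CPSR_MODE_IRQ  0x12u
#define CPSR_MODE_SVC  0x13u

/* Same effect as "bic r0,r0,#0xC0": clear both mask bits. */
static uint32_t cpsr_unmask_irq_fiq(uint32_t cpsr)
{
    return cpsr & ~(CPSR_I_BIT | CPSR_F_BIT);
}

static int cpsr_irq_masked(uint32_t cpsr) { return (cpsr & CPSR_I_BIT) != 0; }
static int cpsr_fiq_masked(uint32_t cpsr) { return (cpsr & CPSR_F_BIT) != 0; }
static uint32_t cpsr_mode(uint32_t cpsr)  { return cpsr & CPSR_MODE_MASK; }
```

Applied to the value 0x60000092 quoted in the handler comments, these helpers report IRQ mode with IRQ masked and FIQ not masked.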

Next we will go to the actual handling of the interrupt exception. For this we have to turn to the code in exception_handler.s.

In the do_handle_irq we have :



do_handle_irq:
 sub lr,lr,#4 @Subtract r14(lr) by 4.
 stmfd sp!, {r0-r12,lr} @Save r0-r12 and lr. 
       @sp! indicates sp will be subtracted by the sizes of the registers saved.
       @Instruction details can be read in ARM System Developers guide book at Pg 65.
 /*
  * Note on disabling and enabling CPU IRQ.
  * ======================================
  * There is no need to disable IRQ when in IRQ mode. When there is 
  * an interrupt the processor switches to IRQ mode with the I bit 
  * enabled which means it is masked.
  *
  * It was tested by printing the cpsr_irq which had the value
  * 0x60000092. The 7th bit is set which means the IRQ flag is set.
  *
  * This is the same case with the FIQ.
  */
 
 ldr r2,INTOFFSET    @Load the INTOFFSET value into r2
 ldr r2,[r2]      @Load the value in the address to r2
 

 ldr r3,=interrupt_handler_jmp_table @Load the address of the interrupt handler jump table.

 mov lr,pc
 ldr pc,[r3,r2,LSL #2] @Load the value which is the interrupt handler jmp table.


// bl handle_irq

 //Clear interrupt source pending

 ldr r2,INTOFFSET    @Load the INTOFFSET value into r2
 ldr r2,[r2]      @Load the value in the address to r2

 mov r3,#1    @move 1 to r3.
 mov r3,r3, LSL r2   @Shift left by INTOFFSET and store it in r3
 
 ldr r4,SRCPND
 str r3,[r4]   @Store the value of r3 in r4 address

 ldr r4,INTPND
 str r3,[r4]   @Store the value of r3 in r4 address

 
 
 ldmfd sp!, {r0-r12,pc}^         @Restore r0 to r12 from the stack and load the saved lr into pc.
     @The ^ indicates the spsr has to be copied to cpsr. The cpsr was copied to spsr
     @when the interrupt was generated.
     @The restoration of CPSR will change the mode to whatever mode was
     @present before the interrupt was called.
 

Before going through the code in depth, the first line needs an explanation.
When an exception occurs the link register is set to a specific address based on the current pc. When an IRQ exception is raised, lr points to the last executed instruction plus 8. Care has to be taken that the exception handler does not corrupt lr, because lr is used to return from the handler. The IRQ exception is taken only after the current instruction has finished executing, so the return address has to point to the next instruction, i.e. lr-4.

The following table lists the return address for each exception.

Exception   Return address
Reset       -
Undefined   lr
SWI         lr
PABT        lr-4
DABT        lr-8
Reserved    -
IRQ         lr-4
FIQ         lr-4
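The table translates directly into a small lookup. The following C sketch only restates the table; the enum and function names are illustrative.

```c
#include <stdint.h>

enum arm_exception { EXC_UNDEF, EXC_SWI, EXC_PABT, EXC_DABT, EXC_IRQ, EXC_FIQ };

/* Bytes to subtract from lr to get the return address, per the table. */
static uint32_t return_adjust(enum arm_exception e)
{
    switch (e) {
    case EXC_DABT:
        return 8u;
    case EXC_PABT:
    case EXC_IRQ:
    case EXC_FIQ:
        return 4u;
    case EXC_UNDEF:
    case EXC_SWI:
    default:
        return 0u;
    }
}

static uint32_t return_address(enum arm_exception e, uint32_t lr)
{
    return lr - return_adjust(e);
}
```

This is the same adjustment that "sub lr,lr,#4" performs at the top of do_handle_irq.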



Next we save the registers r0 to r12, together with the adjusted lr, on the stack.
Then we read the interrupt offset from the interrupt offset register (INTOFFSET) and use it as an index into interrupt_handler_jmp_table, loading the program counter with the address of the matching handler.

The code after that cleans up the interrupt by writing the corresponding bit to the source pending (SRCPND) and interrupt pending (INTPND) registers, which clears the pending flag. Finally we restore r0 to r12 from the stack and load the saved lr into pc to continue where we left off.

Note on the jump tables:

There are 2 jump tables present: interrupt_handler_jmp_table and external_interrupt_handler_jmp_table. Both tables are arrays of function pointers of the type void(*handler)(void).
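A C model of the dispatch through such a table might look like the sketch below. The table here is a hypothetical stand-in for interrupt_handler_jmp_table, and the bounds and NULL checks are additions that the assembly version does not perform.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_IRQ_SOURCES 32u

typedef void (*irq_handler_t)(void);

/* Hypothetical stand-in for interrupt_handler_jmp_table. */
static irq_handler_t jmp_table[NUM_IRQ_SOURCES];

/* Call the handler registered for 'offset' (the INTOFFSET value), if any. */
static int dispatch_irq(uint32_t offset)
{
    if (offset >= NUM_IRQ_SOURCES || jmp_table[offset] == NULL) {
        return -1;
    }
    jmp_table[offset]();
    return 0;
}

/* Value written to SRCPND and INTPND to acknowledge source 'offset' (< 32). */
static uint32_t pending_mask(uint32_t offset)
{
    return 1u << offset;
}

/* Example handler used only for demonstration. */
static int timer_ticks;
static void demo_timer_handler(void)
{
    timer_ticks++;
}
```

Registering a handler is then a plain assignment such as jmp_table[10] = demo_timer_handler, after which dispatch_irq(10) calls it and pending_mask(10) gives the bit to write back.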


This completes the generic parts of the interrupt handling by the MDK OS. I will add more details if I see anything lacking.


Restlessness is discontent and discontent is the first necessity of progress. Show me a thoroughly satisfied man and I will show you a failure.  
--Thomas A. Edison
                                               

Thursday, 17 December 2015

Android Lollipop build and install on Wandboard Dual

I took a small break from baremetal development to learn some Android. I had bought a Wandboard Dual for the purpose of learning Android. This board comes with the Freescale i.MX6 Cortex A9 Dual core CPU with 1GB of DDR3 RAM.


My Build setup:

Ubuntu 15.10 Wily
Sun java version "1.8.0_51" and javac 1.8.0_51

Do a prerequisite install of the different packages as mentioned on the Google Android site.

As I recall, the build failed while the kernel was being archived, apparently because the lzop package was not installed. lzop seems to be the compressor the build now uses for the kernel, so please install the package from the repository.

For the install itself, grab the Android Lollipop source archive from the Wandboard site and unzip it.



Modify wandboard.h (bootable/bootloader/uboot-imx/include/configs/wandboard.h) to have the appropriate display resolution. I have a Dell monitor whose ideal mode is 1920x1080@60.
So my code snippet is:


  "if hdmidet; then " \
            "setenv bootargs ${bootargs} " \

                "video=mxcfb${nextcon}:dev=hdmi,1920x1080M@60," \
                    "if=RGB24; " \
            "setenv fbmen fbmem=28M; " \

Either make this change before the first compilation, or compile first, then modify the file and compile again.

Now run:
  1. source ./build/envsetup.sh
  2. lunch and hit tab to see the options. Do -> lunch wandboard-eng
  3. make -j10 (10 because I have a quad core machine; with the rule of 2.5 * number of cores we get 2.5 * 4 = 10)
The build might stop partway because of changes in the Java code that introduce constants not present in the original AOSP. The build suggests running make update_api, so run make update_api and then start again with make -j10.

The compilation of u-boot might also stop with errors. Restart the compilation; this time it will go through without stopping.

My build took at least 3 hours to complete! Using ccache is highly recommended.

Partition of SD Card:

I created the following partition on my SD card using gparted as the fsl-sdcard-partitions.sh (present in ./device/fsl/common/tools) does not work because of "unsupported option -M" in the recent version of sfdisk.

My partition structure is as follows:



Disk /dev/sdb: 29.7 GiB, 31914983424 bytes, 62333952 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x74ea21b6

Device     Boot    Start      End  Sectors  Size Id Type
/dev/sdb1         206848   309247   102400   50M  b W95 FAT32
/dev/sdb2         309248  1357823  1048576  512M 83 Linux
/dev/sdb3        1357824 33310719 31952896 15.2G  5 Extended
/dev/sdb4       33310720 62332927 29022208 13.9G 83 Linux
/dev/sdb5        1359872 11845631 10485760    5G 83 Linux
/dev/sdb6       11847680 16041983  4194304    2G 83 Linux
/dev/sdb7       16044032 18141183  2097152    1G 83 Linux
/dev/sdb8       18143232 22337535  4194304    2G 83 Linux
/dev/sdb9       22339584 33310719 10971136  5.2G 83 Linux

Here
  1. sdb1 is BOOT which is a vfat partition.
  2. sdb2 is RECOVERY. All partitions including this will be ext4.
  3. sdb3 will be an extended partition, as a DOS partition table can hold a maximum of 4 primary partitions.
  4. sdb4 is DATA
  5. sdb5 is SYSTEM
  6. sdb6 is CACHE
  7. sdb7 is DEVICE
  8. sdb8 is MISC
  9. sdb9 will be mounted if encryptable to sdb5 data.

Leave at least 100MB space at the start of the SD Card for the UBoot SPL loader.
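The sector numbers in the fdisk listing can be converted to sizes to confirm this gap. A tiny C sketch, assuming the 512-byte sectors reported above; sectors_to_mib() is a made-up helper name.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u
#define MIB (1024u * 1024u)

/* Convert a sector count (or start sector) from fdisk into whole MiB. */
static uint64_t sectors_to_mib(uint64_t sectors)
{
    return sectors * SECTOR_SIZE / MIB;
}
```

The first partition starts at sector 206848, which works out to 101 MiB from the start of the card, and its 102400 sectors give the 50M size shown in the listing.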

After the build is done we have to flash the SD card.

The first command below seeks 1KB into the device. This is done because the Image Vector Table offset is 0x400, or 1KB, from the start. (Ref: Processor Reference Manual Pg 434).

sudo dd if=bootable/bootloader/uboot-imx/SPL of=/dev/sdb bs=1K seek=1; sync

sudo dd if=./out/target/product/wandboard/system.img of=/dev/sdb5; sync

sudo mount /dev/sdb1 /mnt/wandboard
cd /mnt/wandboard
mkdir boot


For initial ramdisk image creation please do the following:
mkimage -A arm -O linux -T ramdisk -C none -a 0x10800800 -n "Android Root Filesystem" -d out/target/product/wandboard/ramdisk.img /mnt/wandboard/boot/uramdisk.img
 
Copy the following files to the boot directory:

kernel_imx/arch/arm/boot/dts/imx6dl-wandboard.dtb
device/fsl/wandboard/logo/out.bmp.gz
bootable/bootloader/uboot-imx/u-boot.img
device/fsl/wandboard/uenv/uEnv.txt
kernel_imx/arch/arm/boot/uImage
kernel_imx/arch/arm/boot/zImage

Modify uEnv.txt to have the following text:


bootargs_base=console=ttymxc0,115200 init=/init androidboot.console=ttymxc0 androidboot.hardware=freescale vmalloc=400M androidboot.selinux=disabled
video_mode=display0=dev=hdmi,mode=1920x1080M@60,if=RGB24,bpp=32 fbmem=28M
expansion=fwbadapt
baseboard=wand


Insert the SD Card in the wandboard. Make sure you have connected the display and then boot.

You should see "android" on the boot screen. After this you will get the "Updating apps" dialog and then the standard Android launcher screen.

That is all. I will update this HOWTO if I have missed anything.