Friday 18 December 2015

Interrupt handling in the MDK OS.

In ARM microprocessors the memory map address 0x00000000 is reserved for the vector table which is a set of 32bit words. When an interrupt occurs the processors suspends normal execution and starts loading instructions from the exception vector table. It is usually contains a form of branch instruction to a particular routine.

The interrupt vector table is as follows:

Vector Address
Reset 0x00000000
Undefined 0x00000004
SWI 0x00000008
PABT 0x0000000C
DABT 0x00000010
Reserved 0x00000014
IRQ 0x00000018
FIQ 0x00000018


In the S3C2440 after a power on reset the initial 4KB of the NAND flash memory will be loaded onto an internal boot SRAM called the "stepping stone" buffer and the boot code present in this memory address will be executed. The loader is flashed onto the NAND flash using supervivi.

Interrupt handling in loader:

The "stepping stone" buffer SRAM memory map address is located at 0x00000000. Hence our MDK loader gets executed from there. The MDK loader has the following code at the start:

.section .text
.code 32
.globl vectors

vectors:
 b reset  /* Reset */
 b fault_state  /* Undefined instruction */
 b fault_state /* Software Interrupt */
 b fault_state  /* Abort prefetch */
 b fault_state  /* Abort data */
 b .  /* Reserved */
 b fault_state /* IRQ */
 b fault_state /* FIQ */

The code is placed in .text section. The addresses in this section is generated from 0x00000000. The fragment of the loader script is below:

MEMORY
{
 sram : org = 0x00000000 , len = 0x1000
 sdram : org = 0x30000000 , len = 0x4000000
}

SECTIONS
{
 .text :
 {
  *(.text);
  . = ALIGN(4);
 } > sram

As shown above the section .text is loaded onto the sram section which has origin from 0x00000000 with the length of 0x1000(4096) or 4KB.

Notice that my reset vector contains a branch to the reset label. The reset code fragment is as follows:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
reset:
 /* Start by clearing bss section */
 ldr r1, bss_start
 ldr r2, bss_end
 ldr r3, =0

clear_bss:
 cmp r1,r2
 str r3,[r1],#4
 bne clear_bss

 /* load r13 i.e. stack pointer with stack_pointer */
 ldr r13,stack_pointer

 bl main

Here I load the bss_start and bss_end as present in the linker script file. Next in the clear_bss I compare if r1 i.e. the bs_start has reached r2 i.e the bs_end. I clear the bss by storing r3 in r1 memory content and incrementing it by 4. Then if I have not equaled r2 I continue the loop. Else I load the stack pointer in r13 and branch to main. The main is the main() function in os_main.c file.

Where have I got the stack_pointer,bss_start and bss_end variables from?

The code fragment below explains: 


1
2
3
stack_pointer: .word __stack_top__
bss_start : .word __bss_start__
bss_end : .word __bss_end__

Where did the __stack_top__,__bss_start__ and __bss_end__ come from?

The linker script code explains:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
SECTIONS
{
 .text :
 {
  *(.text);
  . = ALIGN(4);
 } > sram

 .data :
 {
  __data_start__ = .;
  *(.data)
  . = ALIGN(4);
  __data_end__ = .;
 } > sram

 .bss  :
 {
  __bss_start__ = .;
  *(.bss); *(COMMON)
  __bss_end__ = .;

  __stack_bottom__ = .;
  . += 0x300;
  __stack_top__ = .;

 } > sram

Notice that the linker script variables has global visibility. Now we can take the generated address and use it in our code. Notice that the __stack_bottom__ and __stack_top__ has 0x300(768) bytes of space. Please note that we are loading __stack_top__ in r13(SP) as the stack is a descending stack.

We are not handling any other interrupts in the loader. So if there are any interrupts that happens we just jump to a fault state as shown below:


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
fault_state:
 ldr r3,GPBCON
 ldr r4,GPBDAT
 ldr r5,GPBUP

 ldr r6,=0x15400
 str r6,[r3]  @Set to output
 ldr r6,=0x00
 str r6,[r4]  @Set the led
 ldr r6,=0x1E0
 str r6,[r5]  @Disable pullup 

 b .

I have setup the LED's to glow so that I understand that I am in a fault state.

This completes interrupt handling in the loader after a Power on Reset. Next we will see how we will handle this in the MDK OS.

Interrupt handling in MDK OS:

In the MDK OS the interrupt handling will be done differently. We face several problems with using the initial vectors to jump to a particular interrupt handling routine. First is that if we want to jump to a routine which is placed in the SDRAM at address 0x30000000 it becomes too far a jump.

So how did I fix this? I enabled the MMU and mapped address 0x00000000 to EXCEPTION_INTERRUPT_VECTOR_TABLE_START which is presently hard coded to 0x33F00000. So now whenever the processor jumps to 0x00000000 it will do an address translation and translates it to 0x33F00000 and executes the content at that address.

So how is the implementation done?

First we visit the code where the exception vectors are written(os_vectors.s).


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
.section .vector_reloc,"ax" //Apparent fix for missing section when objcopy is to have allocatable and executable flags-"ax"

.code 32

.globl exception_vectors

exception_vectors:
 ldr pc,=do_handle_reset  //Reset vector
 ldr pc,=do_handle_undef  //Undefined instruction
 ldr pc,=do_handle_swi   //Software Interrupt
 ldr pc,=do_handle_pabt   //Abort prefetch
 ldr pc,=do_handle_dabt  //Abort data
 ldr pc,=do_handle_reserved //Reserved
 ldr pc,=do_handle_irq  //IRQ
 ldr pc,=do_handle_fiq  //FIQ
.end

The code is put at section .vector_reloc (intuitive name vector relocation).

The exception vector code by itself is very simple. It just loads the PC (Program Counter) register with the different exception handlers.

How is the address generated for the code? It would be EXCEPTION_INTERRUPT_VECTOR_TABLE_START.

How is the above address generation determined? We have to look at the linker script of the MDK OS(mdkos.lds).

First we look the memory section:


1
2
3
4
5
6
7
MEMORY
{
 sram : org = 0x00000000 , len = 0x1000
 /*sdram : org = 0x30000000 , len = 0x4000000*/
 sdram : org = 0x30000000 , len = 0x3F00000 /* 63MB RAM */
 vectors : org = 0x33F00000 , len = 0x100000 /* Last 1MB for the isr handlers */
}

I have defined vectors region starting at 0x33F00000.

Next we see the sections:


  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
SECTIONS
{

 .text :
 {
  *(.text);
  . = ALIGN(4);
 } > sdram


 .data :
 {
  __data_start__ = .;
  *(.data);
  . = ALIGN(4);
  __data_end__ = .;
 } > sdram

/*
 * Note on constant string bug (related to .rodata): 
 * There was a bug initially when printing a string constant would make the 
 * device go into a loop printing nonsense. This was due the fact that .rodata section
 * was left out. Due to this the addresses of the constant was emitted after the interrupt
 * vectors but the actual address of the constant was somewhere in between the file. (it
 * was after the stack setup. All the functions which referred to the string would use
 * the address which was emitted at the end of the isr handlers but the string was sitting
 * way before. It would have worked if after startup the string was moved to the address
 * at the end of the isr handler. Instead of doing this we can create a .rodata section and
 * put in the RAM. Also make sure we don't overwrite the read only section with some 
 * method. We can later write the .rodata to say flash and lock the write and do only a
 * read.
 */
 .rodata :
 {
  __rodata_start__ = .;
  *(.rodata);
  . = ALIGN(4);
  __rodata_end__ = .;
 } > sdram

 .bss  :
 {
  __bss_start__ = .;
  *(.bss); *(COMMON)
  __bss_end__ = .;

  __usr_sys_stack_bottom__ = .;
  . += 0x1000;
  __usr_sys_stack_top__ = .;

  __irq_stack_bottom__ = .;
  . += 0x1000;
  __irq_stack_top__ = .;

  __fiq_stack_bottom__ = .;
  . += 0x1000;
  __fiq_stack_top__ = .;

  __svc_stack_bottom__ = .;
  . += 0x1000;
  __svc_stack_top__ = .;
 } > sdram

 
 .vector_reloc :
 {
  *(.vector_reloc);
 } >vectors AT>sdram 

 /* Get the lma address for the particular section */
 __exception_vector_reloc_startaddr__ = LOADADDR(.vector_reloc);
 __exception_vector_reloc_endaddr__ = LOADADDR(.vector_reloc) + SIZEOF(.vector_reloc);

 /* 
     * Above SDRAM is where it will be stored in the file but address
     * references will be in the addresses of the isr handler section
  */


 .isrhandler :
 {
  *(.isrhandler);
 } >vectors AT>sdram

 __exception_handler_start_addr__ = LOADADDR(.isrhandler);
 __exception_handler_end_addr__ = LOADADDR(.isrhandler) +  SIZEOF(.isrhandler);


 /* 
  * >vma region AT > lma region 
  */

 /* 
  * eg: .data section is linked with LMA in ROM and
  * the VMA pointing to the real RAM versions
  */

 .stab 0 (NOLOAD) : 
 {
  [ .stab ]
 }

 .stabstr 0 (NOLOAD) :
 {
  [ .stabstr ]
 }
}

In line 65 vector_reloc part I tell the linker to generate addresses in the range defined by vectors i.e. from 0x33F00000. This will be the VMA region.

Now how do I know where the code is loaded?
The code is loaded by the loader to address 0x30000000 which is the start of the SDRAM. The code is placed after the .bss section. The __exception_vector_reloc_startaddr__ and __exception_vector_reloc_endaddr__ contains the start and end of the exception handler vector section. So when the code is loaded the place where it be present is 0x30XXXXXX. This will be the LMA region. The code has to be loaded from this region to the EXCEPTION_INTERRUPT_VECTOR_TABLE_START(0x3F000000) region.

The loading of these code is done the following way(setup_interrupt_vector_table(..) in os/mmu.c):


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
static void setup_interrupt_vector_table()
{
/*
 * TODO: Optimize it to remove the extra index variables. Unoptimized only for test purposes.
 *
 */

 char *vector_table = (char *)EXCEPTION_INTERRUPT_VECTOR_TABLE_START;

 /* 
  * Need to get the lma of the code.
  * The __exception_vector_reloc_startaddr__ is the lma i.e. the generated 
  * address in the file. I need to use this as the start address for the 
  * later vectors and handlers.
  */

 char *src = (char *)__exception_vector_reloc_startaddr__; 
      
 uint32_t i = 0;

 for(i = (uint32_t)__exception_vector_reloc_startaddr__; 
   i<(uint32_t)__exception_vector_reloc_endaddr__; 
    i++) {
  *vector_table = *src;
  vector_table++;
  src++;
 }


 /* Continue with the same place for handler source  */
 for(i = (uint32_t)__exception_handler_start_addr__; 
     i<(uint32_t)__exception_handler_end_addr__;
     i++) {
  *vector_table = *src;
  vector_table++;
  src++;
 }

}



In line 17 we get the content from "vectoreloc" start address to end address and we copy it to the vector_table pointer pointing to EXCEPTION_INTERRUPT_VECTOR_TABLE_START i.e. 0x3F000000 address.

Apart from that we continue to copy the contents of the interrupt handlers. The isr handlers are placed right next to the exception handlers.

The interrupt service handlers are placed in file exception_handler.s under the section .isrhandler

The code fragment is as follows:


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
.section .isrhandler,"ax"


.code 32


.globl do_handle_reset
do_handle_reset:
 b do_handle_reset

.globl do_handle_undef
do_handle_undef:
 b do_handle_undef

.globl do_handle_swi
do_handle_swi:
 b do_handle_swi

.globl do_handle_pabt
do_handle_pabt:
 b do_handle_pabt

.globl do_handle_dabt
do_handle_dabt:
 b do_handle_dabt

.globl do_handle_reserved
do_handle_reserved:
 b do_handle_reserved


.globl do_handle_irq
do_handle_irq:
 sub lr,lr,#4 @Subtract r14(lr) by 4.
 stmfd sp!, {r0-r12,lr} @Save r0-r12 and lr. 
       @sp! indicates sp will be subtracted by the sizes of the registers saved.
       @Instruction details can be read in ARM System Developers guide book at Pg 65.
 /*
  * Note on disabling and enabling CPU IRQ.
  * ======================================
  * There is no need to disable IRQ when in IRQ mode. When there is 
  * an interrupt the processor switches to IRQ mode with the I bit 
  * enabled which means it is masked.
  *
  * It was tested by printing the cpsr_irq which had the value
  * 0x60000092. The 7th bit is set which means the IRQ flag is set.
  *
  * This is the same case with the FIQ.
  */
 
 ldr r2,INTOFFSET    @Load the INTOFFSET value into r2
 ldr r2,[r2]      @Load the value in the address to r2
 

 ldr r3,=interrupt_handler_jmp_table @Load the address of the interrupt handler jump table.

 mov lr,pc
 ldr pc,[r3,r2,LSL #2] @Load the value which is the interrupt handler jmp table.


// bl handle_irq

 //Clear interrupt source pending

 ldr r2,INTOFFSET    @Load the INTOFFSET value into r2
 ldr r2,[r2]      @Load the value in the address to r2

 mov r3,#1    @move 1 to r3.
 mov r3,r3, LSL r2   @Shift left by INTOFFSET and store it in r3
 
 ldr r4,SRCPND
 str r3,[r4]   @Store the value of r3 in r4 address

 ldr r4,INTPND
 str r3,[r4]   @Store the value of r3 in r4 address

 
 
 ldmfd sp!, {r0-r12,pc}^  @Restore the stack values to r0 and r12. Next restore lr to pc.
       @The ^ indicates the spsr has to copied to cpsr. The cpsr was copied to spsr
       @when the interrupt was generated.
       @The restoration of CPSR will change the mode to whatever mode was
       @present before the interrupt was called.
 

.globl do_handle_fiq
do_handle_fiq:
  b do_handle_fiq


The code in os_vector.s for e.g. where the ldr pc,=do_handle_irq was done has the code of do_handle_irq in the file exception_handler.s which contains the implementation.

This concludes the memory juggling needed to execute the interrupts.

Handling of various IRQ's:

To get an interrupt you have to enable the global IRQ and FIQ in the CPSR register. This is done as follows:


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
static void enable_irq_fiq(void)
{
 uint32_t cpsr_val = 0;

 __asm__ __volatile__ (
  "mrs r0,cpsr\n\t"   /* Copy CPSR to r0 */
  "bic r0,r0,#0xC0\n\t"  /* Clear IRQ, FIQ */
  "msr cpsr,r0\n\t"   /* Copy modified value to cpsr */
  "mov %0,r0\n\t"
  : [cpsr_val]"=r"(cpsr_val) /* No output */
  : /* No input */
  : "r0" /* r0 gets clobbered */
 );

 //print_hex_uart(UART0_BA,cpsr_val);
}

An optimization would be to rewrite as a macro.

Next we will go to the actual handling of the interrupt exception. For this we have to turn over to the code in exception_hander.s

In the do_handle_irq we have :



 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
do_handle_irq:
 sub lr,lr,#4 @Subtract r14(lr) by 4.
 stmfd sp!, {r0-r12,lr} @Save r0-r12 and lr. 
       @sp! indicates sp will be subtracted by the sizes of the registers saved.
       @Instruction details can be read in ARM System Developers guide book at Pg 65.
 /*
  * Note on disabling and enabling CPU IRQ.
  * ======================================
  * There is no need to disable IRQ when in IRQ mode. When there is 
  * an interrupt the processor switches to IRQ mode with the I bit 
  * enabled which means it is masked.
  *
  * It was tested by printing the cpsr_irq which had the value
  * 0x60000092. The 7th bit is set which means the IRQ flag is set.
  *
  * This is the same case with the FIQ.
  */
 
 ldr r2,INTOFFSET    @Load the INTOFFSET value into r2
 ldr r2,[r2]      @Load the value in the address to r2
 

 ldr r3,=interrupt_handler_jmp_table @Load the address of the interrupt handler jump table.

 mov lr,pc
 ldr pc,[r3,r2,LSL #2] @Load the value which is the interrupt handler jmp table.


// bl handle_irq

 //Clear interrupt source pending

 ldr r2,INTOFFSET    @Load the INTOFFSET value into r2
 ldr r2,[r2]      @Load the value in the address to r2

 mov r3,#1    @move 1 to r3.
 mov r3,r3, LSL r2   @Shift left by INTOFFSET and store it in r3
 
 ldr r4,SRCPND
 str r3,[r4]   @Store the value of r3 in r4 address

 ldr r4,INTPND
 str r3,[r4]   @Store the value of r3 in r4 address

 
 
 ldmfd sp!, {r0-r12,pc}^         @Restore the stack values to r0 and r12. Next restore lr to pc.
     @The ^ indicates the spsr has to copied to cpsr. The cpsr was copied to spsr
     @when the interrupt was generated.
     @The restoration of CPSR will change the mode to whatever mode was
     @present before the interrupt was called.
 

Before we go in depth into the explanation of the code there is a need to explain the first line of the code.
When an exception occurs the link register is set to a specific address based on the current pc. When an IRQ exception is raised the link register lr points to the last executed instruction plus 8. Care has to be taken to make sure the exception handler does not corrupt the lr because lr is used to return from an exception handler. The IRQ exception is taken only after the current instruction is executed, so the return address has to point to the next instruction i.e. lr-4.

The following has useful addresses for the different exceptions.

Exception Address
Reset
Undefined lr
SWI lr
PABT lr-4
DABT lr-8
Reserved
IRQ lr-4
FIQ lr-4



Next we save the registers from r0 to r12.
Next we get the interrupt offset from the interrupt offset register. After this we load the program counter with the index to the handler in the interrupt_handler_jmp_table.

Later code involves interrupt clean up by setting bits in source pending and interrupt pending registers. After this we restore the values r0 to r12 from the stack and load lr to pc to continue where we left off.

Note on the jump tables:

There are 2 jump tables present. The interrupt_handler_jmp_table and external_interrupt_handler_jmp_table. The 2 tables are array of functions pointers of the type void(*handler)(void).


This completes the generic parts of the interrupt handling by the MDK OS. I will add more details if I see anything lacking.


Restlessness is discontent and discontent is the first necessity of progress. Show me a thoroughly satisfied man and I will show you a failure.  
--Thomas A. Edison
                                               

No comments:

Post a Comment