

## Efficiently ensure data consistency

**Background, examples and HowTos** 



## Contents

- Motivation, examples for inconsistent data
- Data consistency with single-core
  - Protection mechanisms **S1..S5**
  - Concepts removing the need for protection altogether
- Data consistency with multi-core
  - **Spinlocks**
  - Solutions **M1..M5**
  - Concepts removing the need for protection altogether
- Runtime consumption (= 'costs')
- Summary



## Motivation, examples for inconsistent data



## Example 1: calculating the absolute value

• Simple function:

• Simple implementation:





## Example 1 compiled for Atmel AVR





## Example 1 compiled for TriCore (AURIX)

    volatile int y = -4;

    int main(void)
    {
        int x;
        if (y < 0)
            x = -y;
        else
            x = y;
        return x;
    }

AURIX: no problem (`y` is loaded once, `abs` then works on the register copy)

    main:
        ld.w    d15,y
        abs     d2,d15
        ret



## Example 2: accessing data structures

- Preemption while reading
- The preempting code updates the data structure.
- When the preempted code continues, it uses inconsistent data (partly old, partly new)





- Let's assume an application needs to know the total number of ISRs executed.
- Problem: sometimes the execution of ISR\_high\_prio is not considered.

```
unsigned int counterISR = 0;

void __interrupt(0x05) ISR_low_prio (void)
{
    __enable(); // globally enable interrupts
    counterISR++; // read-modify-write: can be preempted between read and write
    DoSomething();
}

void __interrupt(0x30) ISR_high_prio (void)
{
    __enable(); // globally enable interrupts
    counterISR++;
    DoSomethingElse();
}
```



## Example 3: how are interrupts 'lost'?
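
One way to see how an interrupt gets 'lost' is to split `counterISR++` into the separate load, add and store steps a compiler may generate for it. The following host-side sketch is only a model, not target code: the function names are hypothetical, and the preemption by the high-priority ISR is simulated with a plain function call placed between the load and the store.

```c
#include <assert.h>
#include <stdbool.h>

static unsigned int counterISR = 0;

/* Models the counting part of ISR_high_prio; it runs to completion. */
void isr_high_prio_body(void)
{
    counterISR++;
}

/* Models the counting part of ISR_low_prio with counterISR++ split
 * into load, add and store (hypothetical helper for illustration). */
void isr_low_prio_body(bool preempted_mid_update)
{
    unsigned int reg = counterISR;   /* 1. load into a register     */
    if (preempted_mid_update) {
        isr_high_prio_body();        /* simulated preemption        */
    }
    reg = reg + 1;                   /* 2. add in the register      */
    counterISR = reg;                /* 3. store: overwrites memory */
}

/* Executes both ISRs exactly once and returns the resulting count. */
unsigned int run_two_isrs(bool preempted_mid_update)
{
    counterISR = 0;
    isr_low_prio_body(preempted_mid_update);
    if (!preempted_mid_update) {
        isr_high_prio_body();        /* runs after the low-prio ISR instead */
    }
    return counterISR;
}
```

In this model `run_two_isrs(false)` returns 2, while `run_two_isrs(true)` returns 1 although two ISRs were executed: the store in step 3 writes back a value based on the stale load and thereby wipes out the increment done by the preempting ISR.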





## Ensuring data consistency on single-core



- Introduce interrupt suspension:

      __disable();
      // Execute critical code with interrupts disabled
      __enable();


- Advantage: easy to implement
- Problems
  - Critical sections often difficult to identify
  - Interrupts/tasks with higher priority get delayed.
  - Even interrupts/tasks which do not write to the critical data/resource get delayed.
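
The effect of interrupt suspension on a counter update can be sketched with a small host-side model. Everything here is an assumption made for illustration: `sim_disable()` and `sim_enable()` stand in for `__disable()` and `__enable()`, and an interrupt request arriving while interrupts are suspended is simply latched and served once they are enabled again.

```c
#include <assert.h>
#include <stdbool.h>

static unsigned int counterISR = 0;
static bool interrupts_enabled = true;  /* simulated global interrupt flag  */
static bool high_prio_pending = false;  /* simulated latched request        */

void isr_high_prio(void)
{
    counterISR++;
}

/* Simulated hardware: serve the request now, or latch it if suspended. */
void raise_high_prio_irq(void)
{
    if (interrupts_enabled) {
        isr_high_prio();
    } else {
        high_prio_pending = true;
    }
}

void sim_disable(void)
{
    interrupts_enabled = false;
}

void sim_enable(void)
{
    interrupts_enabled = true;
    if (high_prio_pending) {         /* the delayed interrupt runs now */
        high_prio_pending = false;
        isr_high_prio();
    }
}

/* Low-priority code updating the counter inside a critical section. */
unsigned int protected_increment_demo(void)
{
    counterISR = 0;
    sim_disable();
    unsigned int reg = counterISR;   /* load                              */
    raise_high_prio_irq();           /* request arrives mid-update: delayed */
    counterISR = reg + 1;            /* store                             */
    sim_enable();
    return counterISR;
}
```

In this model `protected_increment_demo()` returns 2, so no increment is lost. It also makes the price visible: the high-priority ISR does not run when requested but only after `sim_enable()`.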



- Priority Ceiling Protocol: 'suspension up to the lowest required prio'

      GetResource(myResource1);
      // Execute critical code; the CPU runs with
      // the (lowest possible) prio that ensures no
      // other code writing to myResource1 can preempt.
      ReleaseResource(myResource1);

- Advantage compared to interrupt suspension: no delay of tasks/interrupts above the '*Ceiling Prio*'.
- Problems
  - All other problems stated for interrupt suspension remain.
  - OS or at least implementation of the Priority Ceiling Protocol required
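
The idea of raising the priority only as far as necessary can be illustrated with a simplified host-side model. It is a sketch under assumptions, not an OS implementation: `resource_t`, `get_resource()`, `release_resource()` and `can_preempt()` are hypothetical names, a larger number means a higher priority, and the ceiling is taken as the highest priority of any code that uses the resource.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const int *user_prios;   /* priorities of all code using the resource */
    size_t     user_count;
    int        saved_prio;   /* priority to restore on release            */
} resource_t;

static int current_prio = 0; /* simulated priority the CPU runs with */

/* Ceiling = highest priority among the users of the resource. */
int ceiling_prio(const resource_t *res)
{
    assert(res != NULL && res->user_count > 0);
    int ceiling = res->user_prios[0];
    for (size_t i = 1; i < res->user_count; ++i) {
        if (res->user_prios[i] > ceiling) {
            ceiling = res->user_prios[i];
        }
    }
    return ceiling;
}

void get_resource(resource_t *res)
{
    int ceiling = ceiling_prio(res);
    res->saved_prio = current_prio;
    if (ceiling > current_prio) {
        current_prio = ceiling;      /* raise only up to the ceiling */
    }
}

void release_resource(resource_t *res)
{
    current_prio = res->saved_prio;
}

/* Simulated scheduler decision: may code with this prio preempt now? */
int can_preempt(int prio)
{
    return prio > current_prio;
}

/* A task with prio 2 shares myResource1 with code running at prio 5. */
int pcp_demo(void)
{
    static const int users[] = { 2, 5 };
    resource_t myResource1 = { users, 2, 0 };

    current_prio = 2;
    get_resource(&myResource1);
    int other_user_blocked = !can_preempt(5); /* cannot interfere          */
    int unrelated_runs     = can_preempt(8);  /* above ceiling: not delayed */
    release_resource(&myResource1);

    return other_user_blocked && unrelated_runs && (current_prio == 2);
}
```

In this model `pcp_demo()` returns 1: code at priority 5 is held off while the resource is occupied, whereas code at priority 8 can still preempt, which is the advantage over a global interrupt lock.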



- At the beginning of each task, copies of all critical data are created.
   → 'critical' means here: data gets accessed from code with higher priority.
- The actual copy process is protected through interrupt suspension or PCP.
- The task uses the copy of the data only.
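
A minimal host-side sketch of working on a copy could look as follows. All names are hypothetical: `signals_t` is an invented data structure, the two `sim_*` functions are empty placeholders for the real protection (interrupt suspension or PCP), and the preemption by higher-priority code is simulated with a direct call in the middle of the task.

```c
#include <assert.h>
#include <string.h>

typedef struct {
    int speed;
    int torque;
} signals_t;

static signals_t sharedSignals;      /* written by higher-priority code */

void sim_suspend_interrupts(void) { /* placeholder for the real lock   */ }
void sim_resume_interrupts(void)  { /* placeholder for the real unlock */ }

/* Higher-priority code updating both fields together. */
void high_prio_writer(int speed, int torque)
{
    sharedSignals.speed  = speed;
    sharedSignals.torque = torque;
}

/* Task body: copy once under protection, then use only the copy. */
int task_with_copy_demo(void)
{
    signals_t local;

    high_prio_writer(10, 20);        /* initial, consistent data set */

    sim_suspend_interrupts();
    memcpy(&local, &sharedSignals, sizeof local);
    sim_resume_interrupts();

    int speed = local.speed;         /* first read from the copy      */
    high_prio_writer(11, 21);        /* simulated preemption mid-task */
    int torque = local.torque;       /* still matches the first read  */

    return speed + torque;
}
```

In this model `task_with_copy_demo()` returns 30, i.e. both values stem from the same data set (10 and 20). Reading `sharedSignals` directly would have mixed the old speed with the new torque. The cost is the extra RAM for `local` and the protected copy at task start.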





- Typically implemented as part of an 'all-inclusive' solution
  - Automated discovery of critical sections/data
  - Code generation of copies and copy routines
- Example: AUTOSAR RTE (Run-Time Environment)
- Advantage: the user does not need to worry about data consistency, as long as run-time consumption and RAM requirements are irrelevant.
- Problems
  - Often high costs in terms of run-time and RAM
  - Increased latency due to protected copy routines



## Solution S4: polling

Starting point, interrupt-based:

```
#include <avr/io.h>
#include <avr/interrupt.h>

void InitHardware(void)
{
    DDRB = (1<<PB0); /* pin connected to LED is output pin */
    /* initialize timer 1 */
    TCCR1B = (1<<CS11) | (1<<CS10); /* prescaler = clk/64 */
    TIMSK |= (1<<TOIE1); /* enable overflow interrupt */
}

ISR(TIMER1_OVF_vect) /* timer 1 overflow interrupt */
{
    PORTB ^= (1<<PB0); /* toggle LED */
    // DoSomePeriodicalStuff();
}

int main(void)
{
    InitHardware();
    sei(); /* globally enable interrupts */
    while(1) {
        // DoSomeBackgroundStuff();
    }
}
```



## Solution S4: polling

Same functionality with polling, no interrupt used:

```
#include <avr/io.h>
#include <avr/interrupt.h>

void InitHardware(void)
{
    DDRB = (1<<PB0); /* pin connected to LED is output pin */
    /* initialize timer 1 */
    TCCR1B = (1<<CS11) | (1<<CS10); /* prescaler = clk/64 */
}

int main(void)
{
    InitHardware();
    while(1) {
        // DoSomeBackgroundStuff();
        if (TIFR & (1<<TOV1)) {
            TIFR = (1<<TOV1);  /* clear pending flag by writing a
                                  logical 1; a plain assignment leaves
                                  other pending flags untouched */
            PORTB ^= (1<<PB0); /* toggle LED */
            // DoSomePeriodicalStuff();
        }
    }
}
```



- Advantage: interruptions avoided altogether, data consistency conceptually ensured
- Problems
  - Increased latency
  - Big (and thus risky) modification when introduced to existing software.



## Solution S5: cooperative multitasking I







## Solution S5: cooperative multitasking II

- (Cooperative) task switches are possible at 'Schedule-Points' only.
- Schedule-Points: call of OS service function OS\_Schedule();
- OS\_Schedule() call not necessarily required after each runnable/function

```
OS_TASK( Core1_25msTask )
{
    Core1_25msRunnable0( );
    OS_Schedule( );
    Core1_25msRunnable1( );
    OS_Schedule( );
    Core1_25msRunnable2( );
    OS_Schedule( );
    Core1_25msRunnable3( );
    OS_Schedule( );
    Core1_25msRunnable4( );
    OS_Schedule( );
    Core1_25msRunnable5( );
    OS_Schedule( );
    Core1_25msRunnable6( );
    OS_Schedule( );
    Core1_25msRunnable7( );
    OS_Schedule( );
    Core1_25msRunnable8( );
    OS_Schedule( );
    Core1_25msRunnable9( );
}
```



- Advantage
  - No preemption of runnables, i.e. application code; data consistency conceptually ensured
  - RAM required for stack typically drastically reduced.
  - More efficient cache usage → run-time optimization
- Problem
  - Latency of tasks with higher priority depends on execution time of runnables in tasks with lower priority.





## Ensuring data consistency on multi-core



### Previous example 3 now with multi-core



- None of the previous approaches for ensuring data consistency works on multi-core.



#### Interface (services)

    StatusType GetSpinlock     ( SpinlockIdType SpinlockId );
    StatusType ReleaseSpinlock ( SpinlockIdType SpinlockId );
    StatusType TryToGetSpinlock( SpinlockIdType SpinlockId, TryToGetSpinlockType* Success );

#### Usage

    GetSpinlock(spinlock);
    /* Execute critical code here */
    ReleaseSpinlock(spinlock);



### Problem #1 with GetSpinlock





#### **Pseudo solution**

```
DisableAllInterrupts();
GetSpinlock(spinlock);
/* Execute critical code here */
ReleaseSpinlock(spinlock);
EnableAllInterrupts();
```





```
TryToGetSpinlockType success;

DisableOSInterrupts( );
(void)TryToGetSpinlock( spinlock, &success );
while( TRYTOGETSPINLOCK_NOSUCCESS == success )
{
    EnableOSInterrupts( );
    /* Interrupts and high prio tasks can preempt here */
    DisableOSInterrupts( );
    (void)TryToGetSpinlock( spinlock, &success );
}
/* Execute critical code here */
ReleaseSpinlock( spinlock );
EnableOSInterrupts( );
```

#### The previously mentioned problems #1 and #2 are solved.
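
For readers without an AUTOSAR OS at hand, the try-and-retry pattern can be approximated with C11 atomics. The sketch below is not the AUTOSAR spinlock implementation: `sketch_try_lock()` and the other `sketch_*`/`sim_*` names are hypothetical, the interrupt lock is only modelled by a counter, and the self-check runs on a single thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static int irq_lock_depth = 0;  /* simulated OS interrupt lock */

void sim_disable_os_interrupts(void) { irq_lock_depth++; }
void sim_enable_os_interrupts(void)  { irq_lock_depth--; }

/* Non-blocking attempt: returns true if the lock was free and is now ours. */
bool sketch_try_lock(atomic_flag *lock)
{
    return !atomic_flag_test_and_set_explicit(lock, memory_order_acquire);
}

void sketch_unlock(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Acquire with a preemption window between two attempts. */
void sketch_acquire(atomic_flag *lock)
{
    sim_disable_os_interrupts();
    while (!sketch_try_lock(lock)) {
        sim_enable_os_interrupts();
        /* interrupts and high prio tasks could preempt here */
        sim_disable_os_interrupts();
    }
}

void sketch_release(atomic_flag *lock)
{
    sketch_unlock(lock);
    sim_enable_os_interrupts();
}

/* Single-threaded self-check of the lock states. */
int spinlock_demo(void)
{
    atomic_flag lock = ATOMIC_FLAG_INIT;

    sketch_acquire(&lock);                   /* lock was free: taken      */
    bool second = sketch_try_lock(&lock);    /* already taken: must fail  */
    sketch_release(&lock);
    bool third = sketch_try_lock(&lock);     /* free again: must succeed  */
    sketch_unlock(&lock);

    return !second && third && (irq_lock_depth == 0);
}
```

In this model `spinlock_demo()` returns 1. The point of the pattern is that the core never spins with interrupts locked for longer than a single attempt, and never holds the lock with interrupts enabled.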



- Idea: make any data structure behave like an atomic global variable
  - Writing overwrites the old value
  - Reading gives you the last value written
- Accesses are surrounded by Get and Finish methods.
- No blocking (interrupt locks or spinlocks) in *between* these methods!
- Very short sections of blocking within these methods

Free! Code openly available, ask us!

How to write

```
dataToprotect_t* pWr;

pWr = GetWrPtr();
if (NULL != pWr) {
    // write access, e.g.
    // pWr->data[0] = 'A';
    // pWr->data[1] = 'B';
    FinishWr();
}
```

How to read

```
dataToprotect_t* pRd;

pRd = GetRdPtr();
// read access, e.g.
// someVar = pRd->data[0];
// otherVar = pRd->data[1];
FinishRd();
```



## Solution M3: Lamport's bakery algorithm

- Cf. bakery: only one customer is served at the counter. Others have to queue.
- Advantage: works without interrupt locks!
- https://en.wikipedia.org/wiki/Lamport%27s_bakery_algorithm
- https://www.geeksforgeeks.org/bakery-algorithm-in-process-synchronization/
  Compile with `gcc -pthread`

```
// MEMBAR, THREAD_COUNT, tickets[] and choosing[] are defined
// elsewhere, see the linked example.
void lock(int thread)
{
    // Before getting the ticket number
    // "choosing" variable is set to be true
    choosing[thread] = 1;
    MEMBAR;
    // Memory barrier applied
    int max_ticket = 0;
    // Finding maximum ticket value among current threads
    for (int i = 0; i < THREAD_COUNT; ++i) {
        int ticket = tickets[i];
        max_ticket = ticket > max_ticket ? ticket : max_ticket;
    }
    // Allotting a new ticket value as maximum + 1
    tickets[thread] = max_ticket + 1;
    MEMBAR;
    choosing[thread] = 0;
    MEMBAR;
    // The ENTRY Section starts from here
    for (int other = 0; other < THREAD_COUNT; ++other) {
        // Applying the bakery algorithm conditions
        while (choosing[other]) {
        }
        MEMBAR;
        while (tickets[other] != 0 && (tickets[other]
                                           < tickets[thread]
                                       || (tickets[other]
                                               == tickets[thread]
                                           && other < thread))) {
        }
    }
}
```



## Solution M4: Logical Execution Time (LET)

- Idea: reserve certain time slots on the time axis for sending/writing and for receiving/reading data.
  → Writing and reading are decoupled by design, hence no further protection is necessary.
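
The decoupling can be illustrated with a minimal single-threaded model. It is a sketch under assumptions rather than an LET implementation: `let_channel_t` and its functions are hypothetical, and `let_boundary()` represents the reserved time slot in which neither producer nor consumer code runs, so the publishing step itself needs no lock.

```c
#include <assert.h>

typedef struct {
    int pending;    /* written by the producer while it executes     */
    int published;  /* seen by consumers; changes only at a boundary */
} let_channel_t;

/* Producer side: may run at any time within its interval. */
void let_write(let_channel_t *ch, int value)
{
    ch->pending = value;
}

/* Consumer side: always sees the value published at the last boundary. */
int let_read(const let_channel_t *ch)
{
    return ch->published;
}

/* Reserved time slot: make the producer's result visible. */
void let_boundary(let_channel_t *ch)
{
    ch->published = ch->pending;
}

int let_demo(void)
{
    let_channel_t ch = { 0, 0 };

    let_write(&ch, 1);
    let_boundary(&ch);
    int before = let_read(&ch);   /* value of the first interval           */

    let_write(&ch, 2);            /* producer finishes early in interval 2 */
    int during = let_read(&ch);   /* consumer still sees the old value     */

    let_boundary(&ch);
    int after = let_read(&ch);    /* new value visible from here on        */

    return before == 1 && during == 1 && after == 2;
}
```

In this model `let_demo()` returns 1: the moment at which the producer actually finishes has no influence on what the consumer reads, only the boundaries do.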



#### Example *without* LET (data loss)



#### Same example with LET (no data loss)





## Solution M5: avoid the need for protection

```
unsigned int counterISR_low_prio = 0;
unsigned int counterISR_high_prio = 0;

void ISR_low_prio (void) __attribute__ ((signal,used));
void ISR_low_prio (void)
{
    __enable(); // globally enable interrupts
    counterISR_low_prio++; // written by this ISR only
    DoSomething();
}

void ISR_high_prio (void) __attribute__ ((signal,used));
void ISR_high_prio (void)
{
    __enable(); // globally enable interrupts
    counterISR_high_prio++; // written by this ISR only
    DoSomethingElse();
}

unsigned int GetCounterSum(void)
{
    return counterISR_low_prio + counterISR_high_prio;
}
```

"The best data protection mechanism is the one you do not need"



## Run-time costs of data protection mechanisms









## Book Embedded Software Timing



#### Contents

- Basics (Compilers, RTOSs, processors)
- Timing theory
- Timing analysis techniques
- Examples from automotive projects
- Timing optimization
- Multi-core, many-core
- AUTOSAR
- Safety, ISO 26262





## Summary

- Safe and reliable software only possible with consistent data
- General rules
  - Multiple readers are not critical
  - Nested reads are not critical
  - Multiple writers are typically a design flaw → do not do it!
  - Nested or simultaneous reading and writing are critical → need protection

My personal recommendation: use cooperative multitasking on single-core and LET on multi-core!

- All protection mechanisms come with advantages and disadvantages.
  - For making the right choice, you need to know them!
  - Knowing how your code generator (e.g. RTE) works carries great optimization potential.
- When using Spinlocks, apply the TryToGetSpinlock approach shown!
- The best data protection mechanism is the one you do not need







## Thank you

Peter Gliwa, Dipl.-Ing. (BA), Managing Director (CEO)

GLIWA GmbH embedded systems, Pollinger Str. 1, 82362 Weilheim i.OB., Germany

- phone: +49 - 881 - 13 85 22 - 10
- fax: +49 - 881 - 13 85 22 - 99
- mobile: +49 - 177 - 2 57 86 72
- peter.gliwa@gliwa.com
- www.gliwa.com