Short pulse generation with Arduino Uno Part 4: NOP for loops

In our previous post, we inserted NOP instructions between direct GPIO register writes to generate pulses whose width can be adjusted in steps of 62.5ns.

One could now use a for loop to create multiple NOP instructions:

for (int i = 0; i < 3; i++)
{
    _NOP();
}

In fact, this performs exactly like writing three NOPs manually (312.5ns pulse width), because the compiler unrolls the loop: since the number of iterations is known at compile time, it simply emits three separate NOPs.

If we instead use a variable numNOPs and declare it volatile, so that the compiler cannot assume it is constant:

volatile int numNOPs = 3;
for (int i = 0; i < numNOPs; i++)
{
    _NOP();
}

we end up with a pulse width of 3250ns rather than 312.5ns.

I measured the pulse width for different numNOPs settings:

From this table it is obvious that the formula for the pulse width is

[latex display="true"](\text{numNOPs} + 1) \cdot 0.75\,\mu\text{s}[/latex]
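To make the formula easier to use, here is a minimal sketch that evaluates it and also inverts it to pick a numNOPs value for a desired pulse width. The names kNsPerIteration, estimatedPulseWidthNs and numNOPsForPulseWidthNs are hypothetical helpers, not part of any library. The formula is only a fit to the measurements (the text above quotes 3250ns for numNOPs = 3, while the formula gives 3000ns), so treat the results as estimates.

```cpp
#include <stdint.h>

// Fitted constant from the formula above: roughly 0.75 us (750 ns) per loop iteration.
// Assumption: this only holds for the exact loop, compiler and settings used in this post.
constexpr uint32_t kNsPerIteration = 750;

// Estimated pulse width in nanoseconds: (numNOPs + 1) * 0.75 us.
constexpr uint32_t estimatedPulseWidthNs(uint32_t numNOPs) {
    return (numNOPs + 1) * kNsPerIteration;
}

// Largest numNOPs whose estimated pulse width does not exceed targetNs.
// Returns 0 if targetNs is shorter than one iteration (the shortest pulse the formula describes).
constexpr uint32_t numNOPsForPulseWidthNs(uint32_t targetNs) {
    return targetNs < kNsPerIteration ? 0 : targetNs / kNsPerIteration - 1;
}
```

For example, numNOPsForPulseWidthNs(3700) yields 3, because four iterations (3000ns) still fit into 3700ns but five (3750ns) do not.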

Why is it so much slower than manually pasting NOPs into the code? Because the for loop adds several machine instructions to every iteration in the compiled program, including a memory load (numNOPs is volatile, so it has to be re-read each time), a compare and a (conditional) jump. This is why one iteration of the loop does not take a single clock cycle of 62.5ns, but about 12 clock cycles, i.e. 750ns.
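The cycle arithmetic behind these numbers can be sketched as follows. This assumes the 16 MHz clock of a stock Arduino Uno, which matches the 62.5ns per cycle used throughout this series. The names kCpuHz, cyclesToPs and nsToCycles are hypothetical helpers introduced only for illustration. Picoseconds are used so that 62.5ns can be represented with integers.

```cpp
#include <stdint.h>

// Assumption: 16 MHz CPU clock, as on a stock Arduino Uno.
constexpr uint64_t kCpuHz = 16000000ULL;
constexpr uint64_t kPsPerSecond = 1000000000000ULL;

// Duration of the given number of CPU cycles, in picoseconds.
constexpr uint64_t cyclesToPs(uint64_t cycles) {
    return cycles * kPsPerSecond / kCpuHz;
}

// Number of whole CPU cycles that fit into a duration given in nanoseconds.
constexpr uint64_t nsToCycles(uint64_t ns) {
    return ns * 1000ULL * kCpuHz / kPsPerSecond;
}
```

With these helpers, cyclesToPs(1) is 62500 (62.5ns) and nsToCycles(750) is 12, which is the per-iteration cost of the loop stated above.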

Full example

#include <Arduino.h>
#include <avr/io.h>

// Arduino Uno digital pin 11 is bit 3 of port B (PB3)
#define PORT11 PORTB
#define PIN11 3
#define PIN11_MASK (1 << PIN11)

void setup() {
    pinMode(11, OUTPUT);
}

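// volatile: forces the compiler to re-read the value on every loop iteration
volatile int numNOPs = 1;

void loop() {
    cli(); // Disable interrupts so they cannot stretch the pulse
    PORT11 |= PIN11_MASK; // Pin high: start of pulse
    for (int i = 0; i < numNOPs; i++) {
        _NOP();
    }
    PORT11 &= ~PIN11_MASK; // Pin low: end of pulse
    sei(); // Re-enable interrupts

    delay(10); // Pause between pulses
}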