
I have created a simple program in Java and translated it to C++, but the C++ program runs relatively slowly. I think the problem lies in my method of controlling the time between operations. I don't have enough C++ experience - I usually develop in Java, but Java is too slow for my purpose: changing the state of a GPIO output on a Raspberry Pi 3B is 40-65 times slower in Java than in C++ (I have tested both languages with the same operation).

How can I make this code faster and control the timing between digitalWrite(int, bool) operations precisely?

My Java code (a single call of digitalWrite(int, int) through JNI takes 3500-6500 nanoseconds on a Raspberry Pi 4B):

public static void switchOnLeds(int firstLed, int lastLed){
    final int BIT_0_HIGH_VOLTAGE_TIME = 250;    //high-voltage time in nanoseconds to transfer a logical 0 to the WS2811 in fast mode
    final int BIT_0_LOW_VOLTAGE_TIME = 1000;    //low-voltage time in nanoseconds to transfer a logical 0 to the WS2811 in fast mode

    final int BIT_1_HIGH_VOLTAGE_TIME = 600;    //high-voltage time in nanoseconds to transfer a logical 1 to the WS2811 in fast mode
    final int BIT_1_LOW_VOLTAGE_TIME = 650;     //low-voltage time in nanoseconds to transfer a logical 1 to the WS2811 in fast mode

    final int BITS_FOR_SINGLE_LED = 24;         //how many bits the signal for a single LED contains
    final int PIN = 4;                          //output pin

    final int LEDS = 10;
    long start, end;

    for (int i = 0; i < LEDS; i++){
        if (i >= firstLed && i <= lastLed){
            for (int bit = 0; bit < BITS_FOR_SINGLE_LED; bit++){
                start = System.nanoTime();
                end = start+BIT_1_HIGH_VOLTAGE_TIME;
                digitalWrite(PIN, true);
                while (System.nanoTime()<end){
                    //wait
                }
                start = System.nanoTime();
                end = start+BIT_1_LOW_VOLTAGE_TIME;
                digitalWrite(PIN, false);
                while (System.nanoTime()<end){
                    //wait
                }                    
            }
        }
        else {
            for (int bit = 0; bit < BITS_FOR_SINGLE_LED; bit++){
                start = System.nanoTime();
                end = start+BIT_0_HIGH_VOLTAGE_TIME;
                digitalWrite(PIN, true);
                while (System.nanoTime()<end){
                    //wait
                }
                start = System.nanoTime();
                end = start+BIT_0_LOW_VOLTAGE_TIME;
                digitalWrite(PIN, false);
                while (System.nanoTime()<end){
                    //wait
                }                    
            }
        }
    }
}

My C++ code that needs to be improved (a single call of digitalWrite(int, int) takes 88 nanoseconds on a Raspberry Pi 4B):

void switchOnLeds(int firstLed, int lastLed) {
    const int BIT_0_HIGH_VOLTAGE_TIME = 250;    //high-voltage time in nanoseconds to transfer a logical 0 to the WS2811 in fast mode
    const int BIT_0_LOW_VOLTAGE_TIME = 1000;    //low-voltage time in nanoseconds to transfer a logical 0 to the WS2811 in fast mode

    const int BIT_1_HIGH_VOLTAGE_TIME = 600;    //high-voltage time in nanoseconds to transfer a logical 1 to the WS2811 in fast mode
    const int BIT_1_LOW_VOLTAGE_TIME = 650;     //low-voltage time in nanoseconds to transfer a logical 1 to the WS2811 in fast mode

    const int BITS_FOR_SINGLE_LED = 24;         //how many bits the signal for a single LED contains
    const int PIN = 4;                          //output pin
    const int LEDS = 10;

    std::chrono::high_resolution_clock::time_point start;
    for (int i = 0; i < LEDS; i++) {
        if (i >= firstLed && i <= lastLed) {
            for (int bit = 0; bit < BITS_FOR_SINGLE_LED; bit++) {
                start = std::chrono::high_resolution_clock::now();
                digitalWrite(PIN, true);
                while (std::chrono::high_resolution_clock::now() - start < std::chrono::nanoseconds(BIT_1_HIGH_VOLTAGE_TIME)) {
                    //wait                    
                }
                start = std::chrono::high_resolution_clock::now();
                digitalWrite(PIN, false);
                while (std::chrono::high_resolution_clock::now() - start < std::chrono::nanoseconds(BIT_1_LOW_VOLTAGE_TIME)) {
                    //wait                    
                }                
            }
        }
        else {
            for (int bit = 0; bit < BITS_FOR_SINGLE_LED; bit++) {
                start = std::chrono::high_resolution_clock::now();
                digitalWrite(PIN, true);
                while (std::chrono::high_resolution_clock::now() - start < std::chrono::nanoseconds(BIT_0_HIGH_VOLTAGE_TIME)) {
                    //wait                    
                }
                start = std::chrono::high_resolution_clock::now();
                digitalWrite(PIN, false);
                while (std::chrono::high_resolution_clock::now() - start < std::chrono::nanoseconds(BIT_0_LOW_VOLTAGE_TIME)) {
                    //wait                    
                }
            }
        }
    }    
}
  • The busy while loop is ...problematic. Simply use `std::this_thread::sleep_for`. That's its job. – Sam Varshavchik Apr 21 '23 at 12:34
  • do not use [`std::high_resolution_clock`](https://en.cppreference.com/w/cpp/chrono/high_resolution_clock): "The high_resolution_clock is not implemented consistently across different standard library implementations, and its use should be avoided." – 463035818_is_not_an_ai Apr 21 '23 at 12:35
  • You can use the std::chrono library and a high-resolution timer. Additionally, you can optimize the digitalWrite function by using the Raspberry Pi's GPIO registers directly. – Kozydot Apr 21 '23 at 12:41
  • @Kozydot should I use std::chrono::high_resolution_timer instead of std::chrono::high_resolution_clock? Isn't it too slow to set a new timer after every digitalWrite(int,bool)? I think I don't need to use the GPIO registers directly. I have written above that a single call of digitalWrite(int, bool) lasts 88 nanoseconds. I think that is fast enough and I don't need to make it faster - I need to control the timing of this statement to 250 nanoseconds +- 150 nanoseconds. – Alexander Gorodilov Apr 21 '23 at 13:01
  • @SamVarshavchik do you think I need to put my code to sleep after digitalWrite(int,bool), like Arduino's delay(int milliseconds)? Does this function work exactly with nanoseconds? – Alexander Gorodilov Apr 21 '23 at 13:08
  • The C++ standard does not place any specific requirements on what each C++ implementation supports or doesn't support, in this context. Your C++ implementation might support nanosecond resolution for `std::this_thread::sleep_for`, or it might not. But you'll never know until you try. – Sam Varshavchik Apr 21 '23 at 13:13
  • @SamVarshavchik dear Sam, could you please look at this code: https://gist.github.com/MGDSStudio/2f1516ca3ff87db321c149455c7483cd I have rewritten the code using std::this_thread::sleep_for(). Have I understood your advice correctly? – Alexander Gorodilov Apr 21 '23 at 13:28
  • @AlexanderGorodilov -- What are the compilation settings you used to build the program? Any question concerning the speed of C++ code must be accompanied by the settings used, especially the optimization settings, to build the application. If you are timing a "debug" or unoptimized build, the timings you are showing are meaningless. – PaulMcKenzie Apr 21 '23 at 13:55
  • @PaulMcKenzie you are right. I have written that I'm new to the C++ world and it is not my desire but a forced step (although I wanted to start learning C++). Java cannot give me enough performance. I have tested it with a simple experiment - it is described here: https://stackoverflow.com/questions/75782278/high-frequency-gpio-output-from-raspberry-pi-3-using-java . The same test in C++, launched on my Raspberry Pi 4B with Raspberry Pi OS 64-bit and compiled with g++ from the template console project in Code::Blocks, gave me 88 nanoseconds. – Alexander Gorodilov Apr 21 '23 at 15:10
  • @AlexanderGorodilov When using g++, if the compilation flags do not use `-O1` or `-O2` or `-O3`, or some other optimization flag, then again, your timings are meaningless. Make sure you are building your application using one of those settings. There have been hundreds (maybe thousands) of closed threads, or threads that have outright been deleted by the author, where the slow speed of a C++ program is due to not running an optimized build. – PaulMcKenzie Apr 21 '23 at 15:12
  • You can't guarantee _"...150 nanoseconds..."_ accuracy on any non-realtime OS that has a scheduler. When the OS decides it needs the program's CPU resource the process will get paused and rescheduled at a time of the OS's choosing. – Richard Critten Apr 21 '23 at 16:40
  • @RichardCritten Does that mean I need a microcontroller? Or could I install a lighter Linux version for ARM to get more accuracy - like Retropie, Lakka or something else? – Alexander Gorodilov Apr 21 '23 at 20:20
  • @SamVarshavchik I have tested with std::this_thread::sleep_for. The result is worse. Without delay periods the program sends all 24x10=240 bits in between 10 000 and 18 000 nanoseconds. With these delay periods all the operations last between 36 000 000 and 40 000 000 nanoseconds. But it should be 300 000 (1250 nanos for one signal * 240 signals). Maybe I am not using the sleep_for function correctly? – Alexander Gorodilov Apr 21 '23 at 20:41
  • Implement your LEDs via the LED framework (you just need the led-gpio driver enabled in the kernel and a properly written Device Tree). Then you can forget about all this. In other words, let the Linux kernel do its job. – 0andriy Apr 24 '23 at 15:34

0 Answers