0

I want to calculate access time for these two ways : Row major and Column major as we know C/C++ is Row major , so when we process in first way (Row major) we should be faster. but look at this code in C++ language

#include <iostream>
#include <time.h>
#include <cstdio>
clock_t RowMajor()
{
char* buf =new char [20000,20000];
clock_t start = clock();
for (int  i = 0; i < 20000; i++) 
    for (int j = 0; j <20000; j++) 
        {
         ++buf[i,j];
        } 
    clock_t elapsed = clock() - start;
   delete [] buf;
return elapsed  ;
} 
clock_t ColumnMajor()
 {
char* buf =new char[20000,20000];
clock_t start = clock();
for (int  i = 0; i < 20000; i++) 
    for (int  j = 0; j < 20000; j++) 
    {
        ++buf[j,i]; 
    }
clock_t elapsed = clock() - start;
 delete [] buf;
return elapsed ;

}

 int main()
  {     
        std::cout << "Process Started." << std::endl;
        printf( "ColumnMajor took  %lu microSeconds. \n",   ColumnMajor()*1000000/ (CLOCKS_PER_SEC)  );
        printf( "RowMajor took  %lu microSeconds. \n",  RowMajor() *1000000/ (CLOCKS_PER_SEC) );
        std::cout << "done" << std::endl; return 0; 
  }

but whenever i run this code i get diffrent answers , sometimes Rowmajor time is grater than column major time and sometimes is opposite, any help is apriciated.

Majid Ramzani
  • 358
  • 5
  • 12
  • 6
    `buf[i,j]` is not doing what you think it is doing. It's actually equivalent to `buf[j]`. See https://en.wikipedia.org/wiki/Comma_operator In your case both traversals do exactly the same thing. – Raxvan Apr 04 '14 at 14:58
  • 1
    What are your optimization settings? Since buf is never read, the compiler might completely optimize the code away – lethal-guitar Apr 04 '14 at 14:58
  • ++buf[i][j] make error – Majid Ramzani Apr 04 '14 at 15:02
  • The time is fluctuating because `clock()` is not precise enough for a measurement like this. Run your micro-benchmark 10,000 times to get some sense of what's going on. – Sergey Kalinichenko Apr 04 '14 at 15:04
  • How can i access optimazation settings? – Majid Ramzani Apr 04 '14 at 15:04
  • To make this actually show anything useful you need to size your rows properly to use the entire cache line in your processor. I'm guessing your row cache in your processor isn't 20000... Otherwise I don't get what you're trying to show, it's pointless because different accesses you think are across column and row borders aren't because a 2D array in C/C++ is actually a 1D array followed by other 1D arrays. – Jack Apr 04 '14 at 15:06
  • How do i run my micro benchmark? – Majid Ramzani Apr 04 '14 at 15:06
  • Before you start optimization, look at Raxvans comment. Your ColMajor and RowMajor code is the same – MatthiasB Apr 04 '14 at 15:06
  • `new char [20000,20000]` is a code obfuscation technique I wasn't aware of. Too bad that VC spoils the fun on /W4 and says "warning C4709: comma operator within array index expression" :) – Christian Hackl Apr 04 '14 at 15:30

2 Answers2

3

in c++ the coma operator can't be used create/access matrix thing. To make a matrix you need to keep track of with and height and allocate all the memory as an array. Basically you need to create a vector with the number or elements equivalent to number of elements in the matrix and you get each element by taking the x + y * width.

clock_t RowMajor()
{
    int width = 20000;
    int height = 20000;
    char* buf = new char[width * height];
    clock_t start = clock();
    for (int j = 0; j < height; j++)
        for (int i = 0; i < width; i++)
        {
            ++buf[i + width * j];
        }
    clock_t elapsed = clock() - start;
    delete[] buf;
    return elapsed;
}

for ColumnMajor the buf needs to be accessed with buf[j * width + i];

An alternative way to create a matrix (from comments, thanks to James Kanze) is to create the buffer like so: char (*buf)[20000] = new char[20000][200000]. In this case, accessing the buffer is like: buf[i][j]

The safest way to do this is to use std::vector or array, and avoid using new/delete. Use std::vector to prevent buffer write overflows:

clock_t RowMajor()
{
    int width = 20000;
    int height = 20000;
    std::vector<char> buf;
    buf.resize(width * height);
    clock_t start = clock();
    for (int j = 0; j <height; j++)
        for (int i = 0; i <width; i++)
        {
            ++buf[i + j * width];
        }
    clock_t elapsed = clock() - start;
    return elapsed;
}
Raxvan
  • 6,257
  • 2
  • 25
  • 46
  • Thanks , Result : Process Started. ColumnMajor took 4292953329 microSeconds. RowMajor took 4293525493 microSeconds. done it makes no sense – Majid Ramzani Apr 04 '14 at 15:11
  • 1
    He can write: `char (*buf)[20000] = new char[20000][200000]`, and then use it. (Personally, it's not something that I'd want in production code, but it could be justified for benchmarking low level operations.) – James Kanze Apr 04 '14 at 15:11
  • 2
    Size your rows to be your CPU cache row size for the results to be meaningful. – Jack Apr 04 '14 at 15:12
  • @user3365425 i made a small mistake, i corrected the code.`RowMajor` was actually `ColumnMajor` – Raxvan Apr 04 '14 at 15:13
  • Thank you , but why does it still give wrong answer? Process Started. ColumnMajor took 1745196 microSeconds. RowMajor took 4293016329 microSeconds. done – Majid Ramzani Apr 04 '14 at 15:25
  • every time i run the code , i give diffrent answers Process Started. Process Started. ColumnMajor took 4293807493 microSeconds. RowMajor took 4292860329 microSeconds. done – Majid Ramzani Apr 04 '14 at 15:27
  • @user3365425 something is not right with the time calculated: 4293016329 microSeconds is abous 1192 hours. – Raxvan Apr 04 '14 at 15:27
  • yes , i think so , but it seems be ok , Clocks_per_second give seconds , multiple by 1000000 giv micro seconds – Majid Ramzani Apr 04 '14 at 15:30
  • yes , i think so , but it seems be ok , Clocks_per_second give seconds , multiple by 1000000 giv micro seconds – Majid Ramzani Apr 04 '14 at 15:30
  • @user3365425 use `QueryPerformanceCount` it's a better solution for measuring small time intervals http://stackoverflow.com/questions/1739259/how-to-use-queryperformancecounter – Raxvan Apr 04 '14 at 15:34
  • The second fn you privided makes some errors about vector ... I'm a PHP man and not very good at C++ , Can you explain QueryPerformanceCount more? – Majid Ramzani Apr 04 '14 at 15:43
  • with QueryPerformanceCount : Process Started. ColumnMajor took 2578283148 . RowMajor took 3031780920 . done , still wrong answer :( – Majid Ramzani Apr 04 '14 at 15:48
  • @user3365425 something else must be broken in the code , On my small test i got `ColumnMajor` to be 10 time slowe. I used QueryPerformanceCount and the 5000x5000 elements initialized with 0; – Raxvan Apr 04 '14 at 16:01
  • 1
    Beware of CPU cache hits and misses. Search the web for "data cache hits misses". This may account for variances in your timings. – Thomas Matthews Apr 04 '14 at 17:29
  • i think the problem is about converting time , calc= ColumnMajor() /CLOCKS_PER_SEC ; works fine :) – Majid Ramzani Apr 04 '14 at 17:38
0

Thanks to Raxvan ,this is the final code works fine so far

#include <iostream>
#include <time.h>
#include <cstdio>
#include <windows.h>

int calc = 0;


clock_t RowMajor()
{
int width = 20000;
int height = 20000;
char* buf = new char[width * height];
clock_t start = clock();
for (int j = 0; j < height; j++)
    for (int i = 0; i < width; i++)
    {
        ++buf[i + width * j];
    }
clock_t elapsed = clock() - start;
delete[] buf;
return elapsed;
}
clock_t ColumnMajor()
{
 int width = 20000;
int height = 20000;
char* buf = new char[width * height];
clock_t start = clock();
for (int j = 0; j < height; j++)
    for (int i = 0; i < width; i++)
    {
        ++buf[j + width * i];
    }
clock_t elapsed = clock() - start;
delete[] buf;
return elapsed;
}




 int main()
  {     
        std::cout << "Process Started." << std::endl;
        calc= ColumnMajor() /CLOCKS_PER_SEC  ;
        printf( "ColumnMajor took  %lu  . \n",  calc );
        calc=RowMajor()/CLOCKS_PER_SEC ;
        printf( "RowMajor took  %lu  . \n", calc );
        std::cout << "done" << std::endl; return 0; 
  }
Majid Ramzani
  • 358
  • 5
  • 12