1

Suppose pp is a pointer to a dynamically allocated array of n structs. I want to create a copy of that array and a pointer to the copy, in the following way:

struct someStruct* pp2 = malloc(_appropriate_size_);
memcpy(pp2, pp, _appropriate_length_);

I could also write a loop and do pp2[i] = pp[i] for 0 <= i < n.
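To make the two approaches concrete, here is a minimal sketch of both. The fields of struct someStruct and the function names copy_with_memcpy and copy_with_loop are assumptions for illustration only; the question shows neither the struct definition nor the surrounding code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical element type; the question does not show its definition. */
struct someStruct {
    int id;
    double value;
};

/* Approach 1: copy all n elements with a single memcpy call.
 * Returns NULL if the allocation fails. */
struct someStruct *copy_with_memcpy(const struct someStruct *pp, size_t n)
{
    struct someStruct *pp2 = malloc(n * sizeof *pp2);
    if (pp2 != NULL && n > 0) {
        memcpy(pp2, pp, n * sizeof *pp2);
    }
    return pp2;
}

/* Approach 2: copy element by element with struct assignment.
 * Note the loop bound is i < n, not i <= n. */
struct someStruct *copy_with_loop(const struct someStruct *pp, size_t n)
{
    struct someStruct *pp2 = malloc(n * sizeof *pp2);
    if (pp2 != NULL) {
        for (size_t i = 0; i < n; i++) {
            pp2[i] = pp[i];
        }
    }
    return pp2;
}
```

Both versions make a shallow copy: if the struct held pointer members, the copies would point at the same objects as the originals. The caller is responsible for calling free on the returned pointer.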

What is the difference between these approaches, and which is better and why?

Mysticial
triple fault
  • It's arguable which would be more readable. As far as performance goes, try both and see what the assembly shows. – Mysticial Mar 28 '13 at 02:04
  • If I need to go back to assembly level, I guess you're hinting that these approaches are identical? – triple fault Mar 28 '13 at 02:06
  • I don't know whether they compile identically. There are plenty of reasons why it would and why it would not. So the best way is just to try it and see. – Mysticial Mar 28 '13 at 02:07
  • They *don't* compile identically, as I said in my answer below. In fact, neither of them are required to compile at all. "The best way to determine whether or not bleach is poisonous is to *try it and see*." Do you see what's wrong, there? – autistic Mar 28 '13 at 02:28

3 Answers

1

There is no definitive answer for all architectures. You need to do profiling to figure out what method is best.

However, IMHO memcpy would likely be faster, simply because somebody has taken the time to tune it for your particular architecture/platform and can exploit its particular nuances to speed things up.
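As a rough illustration of what such profiling could look like, here is a minimal timing sketch using the standard clock() function. The struct fields and the helper name time_copy are assumptions, not something from the question, and a result from a harness like this only holds for the specific compiler, flags, and machine it was run on.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical element type; the question does not show its definition. */
struct someStruct {
    int id;
    double value;
};

/* Time `reps` copies of an n-element array (n >= 1).
 * use_memcpy != 0 selects memcpy, 0 selects the element-wise loop.
 * Returns elapsed CPU seconds, or -1.0 on bad input or allocation failure. */
double time_copy(int use_memcpy, size_t n, int reps)
{
    if (n == 0) {
        return -1.0;
    }
    struct someStruct *src = calloc(n, sizeof *src);
    struct someStruct *dst = malloc(n * sizeof *dst);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1.0;
    }

    /* Reading the result through a volatile makes it harder for the
     * optimizer to discard the copies; it is not a guarantee. */
    volatile int sink = 0;

    clock_t start = clock();
    for (int r = 0; r < reps; r++) {
        src[0].id = r; /* change the input on every repetition */
        if (use_memcpy) {
            memcpy(dst, src, n * sizeof *dst);
        } else {
            for (size_t i = 0; i < n; i++) {
                dst[i] = src[i];
            }
        }
        sink = dst[0].id;
    }
    clock_t end = clock();

    free(src);
    free(dst);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_copy(1, n, reps) and time_copy(0, n, reps) with the same arguments gives two numbers to compare. An optimizing compiler may turn the loop into a memcpy call itself, in which case the two timings would be expected to be close.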

Ed Heal
1

The former uses identifiers that are forbidden in C: anything prefixed with _. The latter is invalid in C89, as struct assignment was rationalised in C99. Assuming neither of these factors causes issues, there is no functional difference.

"Better" isn't well defined. If you define "better" as "compiles in C89", then the former might be better. However, if you define "better" as "has no undefined behaviour in C99", then the latter is better. If you define "better" as "more optimal", then there is no definitive answer: one implementation may produce poor performance for both, while another may produce poor performance for one and perfectly optimal code for the other, or even the same code for both. This is pretty unlikely to be a bottleneck in your algorithm, anyway...

autistic
0

I would expect memcpy to be faster: it is usually tuned for the underlying platform and may possibly use DMA-initiated transfers (without L1/L2 cache copies). The for loop may involve extra transfers. However, it depends on how smart the compiler is: if it spots a statically defined value for n, it may replace the loop with memcpy. It is worth timing the routines or checking the assembly code, as Mysticial mentioned.

TJR