-1

I have a requirement where I have to create a new by processing the given dynamic array. I have no idea what my new array size would be. but the max size is 2x size of old array.

Background: Inside the function, it should iterate over each element in the array and based on the number it will add/delete either one or two elements into the new array.

Implementation: How I implemented the above requirement was, I create a linked list and add one or two new nodes according the check condition.

pseudocode:

node_t *createNode() {
  node_t *temp;
  temp = malloc(sizeof(node_t));
  if (temp == NULL) {
      fputs("Error: Failed to allocate memory for temp.\n", stderr);
    exit(1);
  }
  temp->next = NULL;
  return temp;
}

/*
 * Function: addNode
 * -----------------
 *   add node to a existing linked list at end
 *
 *   head: original linked list to add additional node at end
 *   newVal: value of the additional node
 *
 *   return: new linked list that have an additional node
 */
node_t *addNode(node_t *head, double newVal) {
  node_t *temp;
  node_t *p;
  // create additional node
  temp = createNode();
  temp->val = newVal;

  if (head == NULL) {
    head = temp;
  } else {
    p = head;
    while (p->next != NULL) {
      p = p->next;
    }
    p->next = temp;
  }
  return head;
}

/*
 * Function: lastNodeDeletion
 * --------------------------
 *   delete last node of the linked list
 *
 *   head: linked list
 */
void lastNodeDeletion(node_t *head) {
  node_t *toDelLast;
  node_t *preNode;
    if (head == NULL) {
        printf("There is no element in the list.");
    } else {
      toDelLast = head;
        preNode = head;
        /* Traverse to the last node of the list */
        while (toDelLast->next != NULL) {
            preNode = toDelLast;
            toDelLast = toDelLast->next;
        }
        if (toDelLast == head) {
          /* If there is only one item in the list, remove it */
            head = NULL;
        } else {
            /* Disconnects the link of second last node with last node */
            preNode->next = NULL;
        }
        /* Delete the last node */
        free(toDelLast);
    }
}

/*
 * Function: calculateLower
 * ------------------------
 *   find the data set of lower tube curve
 *
 *   reference: reference data curve
 *   tubeSize: data array specifying tube size that includes:
 *             tubeSize[0], x -- half width of rectangle
 *             tubeSize[1], y -- half height of rectangle
 *             tubeSize[2], baseX -- base of relative value is x direction
 *             tubeSize[3], baseY -- base of relative value is y direction
 *             tubeSize[4], ratio -- ratio y / x
 *
 *   return : data set defining lower curve of the tube
 */
struct data calculateLower(struct data reference, double *tubeSize) {
  int i;
  struct data ref_norm;
/*
struct data contains two pointers double *x, double *y, int n;
*/
  struct data lower;
  node_t *lx = NULL;
  node_t *ly = NULL;

  // ===== 1. add corner points of the rectangle =====
  double m0, m1; // slopes before and after point i of reference curve
  double s0, s1; // sign of slopes of reference curve: 1 - increasing, 0 - constant, -1 - decreasing
  double mx, my;
  int b;
  double xLen;
  double yLen;

  // Normalize data.
  mx = fabs(mean(reference.x, reference.n));
  my = fabs(mean(reference.y, reference.n));
  ref_norm = normalizeData(reference, mx, my);
  if equ(mx, 0.0) {
    xLen = tubeSize[0];
  } else {
    xLen = tubeSize[0] / mx;
  }
  if equ(my, 0.0) {
    yLen = tubeSize[1];
  } else {
    yLen = tubeSize[1] / my;
  }
  // ----- 1.1 Start: rectangle with center (x,y) = (reference.x[0], reference.y[0]) -----
  // ignore identical point at the beginning
  b = 0;
  while ((b+1 < ref_norm.n) && equ(ref_norm.x[b], ref_norm.x[b+1]) && (equ(ref_norm.y[b], ref_norm.y[b+1])))
    b = b+1;

  // add down left point
  lx = addNode(lx, (ref_norm.x[b] - xLen));
  ly = addNode(ly, (ref_norm.y[b] - yLen));

  if (b+1 < ref_norm.n) {
      // slopes of reference curve (initialization)
      s0 = sign(ref_norm.y[b+1] - ref_norm.y[b]);
      if (!equ(ref_norm.x[b+1], ref_norm.x[b])) {
          m0 = (ref_norm.y[b+1] - ref_norm.y[b]) / (ref_norm.x[b+1] - ref_norm.x[b]);
      } else {
          m0 = (s0 > 0) ? 1e+15 : -1e+15;
      }
      if equ(s0, 1) {
          // add down right point
          lx = addNode(lx, (ref_norm.x[b] + xLen));
          ly = addNode(ly, (ref_norm.y[b] - yLen));
      }

      // ----- 1.2 Iteration: rectangle with center (x,y) = (reference.x[i], reference.y[i]) -----
      for (i = b+1; i < ref_norm.n-1; i++) {
          // ignore identical points
          if (equ(ref_norm.x[i], ref_norm.x[i+1]) && equ(ref_norm.y[i], ref_norm.y[i+1]))
              continue;

          // slopes of reference curve
          s1 = sign(ref_norm.y[i+1] - ref_norm.y[i]);
          if (!equ(ref_norm.x[i+1], ref_norm.x[i])) {
              m1 = (ref_norm.y[i+1] - ref_norm.y[i]) / (ref_norm.x[i+1] - ref_norm.x[i]);
          } else {
              m1 = (s1 > 0) ? (1e+15) : (-1e+15);
          }

          // add no point for equal slopes of reference curve
          if (!equ(m0, m1)) {
              if (!equ(s0, -1) && !equ(s1, -1)) {
                  // add down right point
                  lx = addNode(lx, (ref_norm.x[i] + xLen));
                  ly = addNode(ly, (ref_norm.y[i] - yLen));
              } else if (!equ(s0, 1) && !equ(s1, 1)) {
                  // add down left point
                  lx = addNode(lx, (ref_norm.x[i] - xLen));
                  ly = addNode(ly, (ref_norm.y[i] - yLen));
              } else if (equ(s0, -1) && equ(s1, 1)) {
                  // add down left point
                  lx = addNode(lx, (ref_norm.x[i] - xLen));
                  ly = addNode(ly, (ref_norm.y[i] - yLen));
                  // add down right point
                  lx = addNode(lx, (ref_norm.x[i] + xLen));
                  ly = addNode(ly, (ref_norm.y[i] - yLen));
              } else if (equ(s0, 1) && equ(s1, -1)) {
                  // add down right point
                  lx = addNode(lx, (ref_norm.x[i] + xLen));
                  ly = addNode(ly, (ref_norm.y[i] - yLen));
                  // add down left point
                  lx = addNode(lx, (ref_norm.x[i] - xLen));
                  ly = addNode(ly, (ref_norm.y[i] - yLen));
              }

              int len = listLen(ly);
              double lastY = getNth(ly, len-1);
              // remove the last added points in case of zero slope of tube curve
              if equ((ref_norm.y[i+1] - yLen), lastY) {
                  if (equ(s0 * s1, -1) && equ(getNth(ly, len-3), lastY)) {
                      // remove two points, if two points were added at last
                      // ((len-1) - 2 >= 0, because start point + two added points)
                      lastNodeDeletion(lx);
                      lastNodeDeletion(ly);
                      lastNodeDeletion(lx);
                      lastNodeDeletion(ly);
                  } else if (!equ(s0 * s1, -1) && equ(getNth(ly, len-2), lastY)) {
                      // remove one point, if one point was added at last
                      // ((len-1) - 1 >= 0, because start point + one added point)
                      lastNodeDeletion(lx);
                      lastNodeDeletion(ly);
                  }
              }
          }
          s0 = s1;
          m0 = m1;
      }
      // ----- 1.3. End: Rectangle with center (x,y) = (reference.x[reference.n - 1], reference.y[reference.n - 1]) -----
      if equ(s0, -1) {
          // add down left point
          lx = addNode(lx, (ref_norm.x[ref_norm.n-1] - xLen));
          ly = addNode(ly, (ref_norm.y[ref_norm.n-1] - yLen));
      }
  }
  // add down right point
  lx = addNode(lx, (ref_norm.x[ref_norm.n-1] + xLen));
  ly = addNode(ly, (ref_norm.y[ref_norm.n-1] - yLen));

  // ===== 2. Remove points and add intersection points in case of backward order =====
  int lisLen = listLen(ly);
  double *tempLX = malloc(lisLen * sizeof(double));
  if (tempLX == NULL) {
      fputs("Error: Failed to allocate memory for tempLX.\n", stderr);
      exit(1);
  }
  double *tempLY = malloc(lisLen * sizeof(double));
  if (tempLY == NULL) {
      fputs("Error: Failed to allocate memory for tempLY.\n", stderr);
      exit(1);
  }

  tempLX = getListValues(lx);
  tempLY = getListValues(ly);

  lower = removeLoop(tempLX, tempLY, lisLen, -1);

  return denormalizeData(lower, mx, my);
}

It worked well for small examples. But when an array with 0.5 million values is passed as input to this function, the performance decreased drastically.

Perfomance decrease when the pointers belongs to an array of 0.5 million data points.

is there any alternative solution for the above problem or is there any other modification that needs to done to improve performance?

Thanks

chqrlie
  • 131,814
  • 10
  • 121
  • 189
rak
  • 61
  • 1
  • 9
  • The pseudocode does not shed any light on the question, it doesn't even call the malloc in that code. – Antti Haapala -- Слава Україні Jul 15 '20 at 10:22
  • Even if it does, what alternative means of dynamic memory allocation would you use? – Andrew Henle Jul 15 '20 at 10:28
  • You say "the function". Which one is "the" function? Your question has a pretty hard lack of focus and clarity. – RobertS supports Monica Cellio Jul 15 '20 at 10:35
  • calculateLower() is the funtion – rak Jul 15 '20 at 10:36
  • @rak The function is pretty extensive. Can't you just try to simplify your issue at an example? I don't see the need to show half of the program you have. I'm not be able to understand your problem anyway. I guess others have the same problem, too. Just create a minimal reproducible example and express your concerns with it. – RobertS supports Monica Cellio Jul 15 '20 at 10:42
  • @RobertSsupportsMonicaCellio, I would argue the code is not really needed, the question regards the use of allocated moemory and its impact on performance, that is the root of the problem, though some small optimization may be possible, it's the memory allocation that causes the performance issues. – anastaciu Jul 15 '20 at 10:47
  • is there any to improve performance? – rak Jul 15 '20 at 10:52
  • Your code is not a [minimal repoducible example](https://stackoverflow.com/help/minimal-reproducible-example), it's hard to optimize something than can't be ran, but my answer is basically it, you can't hope for a much faster program using memory allocation, even if you did all at once instead of in a loop, the performace would not be greatly improved, the only way you could make it much faster would be not to use memory allocation. – anastaciu Jul 15 '20 at 10:59
  • @rak How will one be able to answer your question to a specific example when s/he don't understand it? If you seek for general consideration regarding `malloc()` see anastaciu's answer. This question is by far too broad and unfocused if you want a specific improvement solution. – RobertS supports Monica Cellio Jul 15 '20 at 11:00

3 Answers3

2

I see nothing particularly inefficient about your linked-list code, but there are several reasons why a linked list approach in general might not be ideal performance-wise. Chief among those are

  1. Dynamic memory allocation is fairly expensive on a per-call basis. With a linked list, the number of allocations performed by your code scales linearly with the problem size.

  2. A linked list involves maintaining metadata, which takes up more memory and therefor reduces locality for accessing the data.

  3. Per-node (or per-node-group) allocation is likely to inherently reduce data locality, as different allocated blocks are not certain to be adjacent to each other.

If you have a good, not-too-outrageous upper bound on the amount of memory you will need -- as it sounds like you do -- then I would strongly consider allocating the maximum you will need as a single chunk (just one malloc() call instead of many), using it as an array, and, optionally, shrinking the allocation with realloc() once you know how much you really needed.

Saving hundreds or thousands of malloc() calls is a sure win, though it's unclear how much of one. Likewise, having direct access to the data, with improved locality of access, is a sure win (of similarly uncertain magnitude). Also, even if you don't shrink the allocation at the end, you may well find out that you use less memory that way.

HOWEVER, as with any performance problem, you should test with a profiler to determine what parts of the program are taking the most time. Compilers are pretty good at producing fast code, and humans are pretty bad at identifying bottlenecks by code inspection. Do not risk spending hours or days making improvements to areas that don't really matter performance-wise. Find out first what parts you need to improve, then work on those.

Update:

On second look, I do see something dreadfully inefficient about your linked-list code: each time you add a node, you traverse the whole list so as to add it at the end. This makes building the list scale as O(n2). My comments about linked list usage in general notwithstanding, you could eliminate all that by either

  1. adding new elements at the head instead of at the tail, or
  2. maintaining a pointer to the tail of each list, so as to be able to add to the end without traversing the list.

Similar applies to function lastNodeDeletion(). Traversing the whole list to find the last node gets costly as the the list gets long. It's not clear to me how many times your program ends up doing that, but if that turns out to be slowing you down (see previous comments about profiling), then you could consider not only tracking the list tail, but additionally making it a doubly-linked list, so as to be able to delete from the end without traversing the list. (But note that "deleting" from the and of an array just means updating your count of how many elements are in use, which takes constant time for deleting a tail of any length.) Alternatively, if by adding at the head of the list you can also change your requirement to deleting from the head, then that also would eliminate the need to traverse the whole list to perform deletions.

John Bollinger
  • 160,171
  • 8
  • 81
  • 157
0

It worked well for small examples. But when an array with 0.5 million values is passed as input to this function, the performance decreased drastically.

That makes sense, memory allocation is an expensive process, it certainly affects the performance of a program, that becomes apparent as you use it more often, i.e. in an extensive loop.

Is there any alternative solution for the above problem or is there any other modification that needs to done to improve performance?

If you want to use heap memory there is no faster way, you must use memory allocation, I don't see how you would significantly improve the performance while still using memory allocation, though you could somewhat improve on it by allocating larger blocks at a time thus avoiding allocating in every iteration, or even allocating all the needed memory in one go if you are aware of the total needed size.

Using stack memory is definitely faster and would be a good alternative, but in your case, since you are working with large amounts of data, that may not be possible, the stack size is very limited.

This said, the best way to assert the inefficiencies of your program is to test it yourself, surely memory management is not the only thing you can optimize. You can also use tools that allow you to profile the execution time of the program, for instance, in Linux systems, Callgrind.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
anastaciu
  • 23,467
  • 7
  • 28
  • 53
  • 1
    *I don't see how you would significantly improve the performance while still using memory allocation.* Not my DV but allocating nodes in pools is an obvious candidate to improve performance if individual allocation calls proves to be a bottleneck. – chqrlie Jul 15 '20 at 18:38
  • @chqrlie The DV is from P__J__ though he has the nasty habit of not leaving any comment. I agree that allocating larger blocks or even the whole block in one go would somewhat improve the performance, but not terribly. – anastaciu Jul 15 '20 at 19:14
  • @anastaciu: check out John Bollinger's latest update: repeatedly adding to the end and removing from the end of the linked list seems an even better explanation for the poor performance. – chqrlie Jul 15 '20 at 19:56
0

Extending on @John's answer: when you look at the mallocs (aka addNode())

calculateLower() {
    addNode();
    addNode();

    for (b+1 ... ref_norm.n-1) {
        addNode();
        addNode();
    }

    addNode();
    addNode();
}

There are two independent lists, each having at most two allocations at the beginning and the end, and at most two allocations per iteration. This makes at most 2 + (ref_norm.n - 1 - (b + 1)) + 2 allocations, per list.

Switching from a linked list to an array (plus an index/pointer to the current end) has two advantages

  • it reduces the number of allocations
  • it reduces the complexity of addNode() and lastNodeDeletion() from O(n) to O(1)
Olaf Dietsche
  • 72,253
  • 8
  • 102
  • 198