0

I have been currently working on an implementation in C++. I am new in C++ and want to be better. In initial step of the task, I need to read a file and store some part of the data in a dynamic array whose type is a class, "Sequence", as following;

class Sequence{
private:
    char *seq;
    int length;
public:
    /* Setter and getter declarations */
    void setSeq(char *);
    char *getSeq();
    void setLength(int);
    int getLength();

    /* Constructors and destructor */
    Sequence();
    Sequence(std::string);
    Sequence(std::string, unsigned);
    ~Sequence();

    /* Overloaded = operator */
    void operator = (Sequence *sequence)
    {
        seq = sequence->seq;
        length = sequence->length;
    };
 };

/* Constructor definitions */
Sequence::Sequence(){/* Default Constructor*/}
Sequence::Sequence(std::string str)
{
    seq = &str[0u];
    length = str.length();
}
Sequence::Sequence(std::string str, unsigned count)
{
    seq = &str[0u];
    length = count;
}

/* Destructor definition */
Sequence::~Sequence(){/* Default Destructor */}

/* Setter and getter definitions */
void Sequence::setSeq(char *seq){
    this->seq = seq;}
char * Sequence::getSeq(){
    return seq; }
void Sequence::setLength(int length){
    this->length = length; }
int Sequence::getLength(){
    return length; }

/* Forward declaration of file reading functions */
unsigned getCountOfSequences(char *);
Sequence *getSequences(char *, unsigned);

The following is the cpp file;

#include <fstream>
#include "file_operations.h"


unsigned getCountOfSequences(char *fileName)
{
    std::ifstream file(fileName);
    std::string str;
    unsigned count;

    /* Get total number of sequences*/
    file.unsetf(std::ios_base::skipws);
    count = std::count(
                       std::istream_iterator<char>(file),
                       std::istream_iterator<char>(), '\n')/4;
    file.close();

    return count;
}

/*
 *  Function : read_sequences
 *  Goal     : reading the file in FASTQ format to
 *  get all the sequences into a set in Sequence type
 *  Input    : name of the file
 *  Output   : a set of sequences in Sequence type
 */
Sequence *getSequences(char *fileName, unsigned count)
{

    /* Used variables while reading file */
    std::ifstream file(fileName);
    std::string str;

    /*Define a set of sequence dynamically*/
    Sequence *setOfSeqs = new Sequence[count];

    /* Put total number of seqs into first seq */
    setOfSeqs[0] = new Sequence("COUNT", count);


    /* Read each sequence and put it into set */
    int lineIndex = 1; int seqIndex = 0;
    while (getline(file, str))
    {
        /* This is the line of sequence */
        if ( lineIndex == 2 )
        {
            /* Define a temporary sequence */
            Sequence *newSeq = new Sequence(str);

            /* Add new seq to set of sequences */
            setOfSeqs[++seqIndex] = new Sequence(str);

            /* --- PART A --- */
            //delete newSeq;
            std::cout << setOfSeqs[seqIndex].getSeq() << "\n";

        }
        /* Loop updates */
        lineIndex = (lineIndex == 4) ? 0 : lineIndex;
        lineIndex ++;
    }
    file.close();

    /* --- PART B --- */
    for ( int i = seqIndex; i > 0; i -- )
    {
        std::cout << setOfSeqs[i].getSeq() << "\n";
    }*/
    return setOfSeqs;
} // read_sequences() function

Now, I have two main questions to understand the pointers and dynamic memory allocation in C++. As you got, I just want to read the file, put the data in an array(dynamically defined in "Sequence" type) and return it to a different .cpp file to be used.

My first question is about the data is changed after the loop ended. There are two parts are commented out as "PART A" and "PART B" to focus on. All the data I read in the "PART A" is correctly printed out line by line. However, if I try to print those lines(stored in setOfSeqs) in the "PART B", I encountered different data, which was incorrect, especially there were multiples of the same data mostly. Now, why I lost the data(or the data was diferent) I read in the while loop?

My second question is about the deallocation of memory in C++. As you see, in the while loop, I created a "Sequence" object and add it into array, setOfSeqs, after constructed it. Now, if I uncommented the line, delete newSeq;, in "PART B", after I copied the object(newSeq) into one of the setOfSeqs elements, why I got an empty string when I tried to printed setOfSeqs element in the next line? Indeed, I just deleted the newly created object, not the setOfSeqs element. From my perspective, I equalized the object to the element, so the data in the setOfSeqs element should not be lost.

In addition, even though, I was able to print the correct data in "PART B" (before returning the setOfSeqs) once, result of the print statement in the other .cpp file was again empty?

Thank you all in advance.

Weather Vane
  • 33,872
  • 7
  • 36
  • 56
  • 1
    Please do not spam tags. This is not C code. – Weather Vane Jan 15 '17 at 20:24
  • `Sequence` lacks a copy constructor. – PaulMcKenzie Jan 15 '17 at 20:25
  • If it lacks, how printed data can be correct after I copied it? Can you give me more details? Even if the solution is too easy, I want to understand and comprehend the logic of dynamic memory allocation in C++ completely. –  Jan 15 '17 at 20:30
  • It lacks a copy constructor. Do you deny this? What if a simple copy were done in your `main` program, i.e. `Sequence s1;...Sequence s2 = s1;`? Look at [the rule of 3](http://stackoverflow.com/questions/4172722/what-is-the-rule-of-three). Also, it is counter-intuitive to have `operator =` take a pointer. A C++ programmer expects `operator=` to take a `const Sequence&`, not a pointer. – PaulMcKenzie Jan 15 '17 at 20:35
  • I overloaded = operator. Is not enough to equalize these two objects? –  Jan 15 '17 at 20:41
  • You're class runs totally counter to how to properly write a class that is supposed to have correct copy semantics. It should have a copy constructor that takes a const ref, an assignment operator that takes a const ref, and a destructor that works properly. An `operator=` that takes a pointer is not an assignment operator. – PaulMcKenzie Jan 15 '17 at 20:42
  • When I try to take a const ref, it does not work with "no viable overloaded =" error. –  Jan 15 '17 at 20:45
  • For a fuller description of what @PaulMcKenzie is talking about Read up on [What is The Rule of Three?](http://stackoverflow.com/questions/4172722/what-is-the-rule-of-three) – user4581301 Jan 15 '17 at 20:46
  • `Sequence *newSeq = new Sequence(str);` -- This is a memory leak. As to your `no viable overload` error, maybe you are still trying to assign a pointer to an object? This is why your code to take a pointer in the `operator =` is counter to how C++ programmers write code. When you assign an object, it is simply `Object o1; Object o2; o2 = o1;`. No pointers, just values. If you write code where a programmer has to do `o2 = &o1;`, that is a nightmare. – PaulMcKenzie Jan 15 '17 at 20:48
  • I read it before posting it. I need to understand the main logic of this scenario to not struggle with same issues in the future. –  Jan 15 '17 at 20:48
  • I strongly recommend producing a [mcve]. If this does not expose the error, edit your question and supply the MCVE. – user4581301 Jan 15 '17 at 20:53
  • Paul : That is the point I really wonder. I confused while deciding to use a value or pointer ... I tried to use pointer as I thought from performance perspective, but not sure. I need to help to know which one I choose in these scenarios. –  Jan 15 '17 at 20:53
  • @user4581301: this is also really simple example to work on. –  Jan 15 '17 at 20:55
  • @PaulMcKenzie why Sequence *newSeq = new Sequence(str); is a memory leak? –  Jan 15 '17 at 20:56
  • What do you do with `newSeq`? It is local to that `{ }`, and you don't `delete` it anywhere or even store it somewhere for later deletion. Also, using pointers does not necessarily mean you're saving any time. Optimizing compilers in this day and age are smart, much smarter than you believe they are. Second, excessive usage of pointers can slow your code down due to aliasing, thus shutting off any way for the compiler to optimize the code. It looks like you're trying to beat the compiler at the optimization game -- in that game, you will more than likely lose. – PaulMcKenzie Jan 15 '17 at 20:58
  • @PaulMcKenzie I did what you offered me, but I am still getting empty strings in PART B. `if ( lineIndex == 2 ) { /* Add new seq to set of sequences */ setOfSeqs[++seqIndex].setSeq(&str[0u]); setOfSeqs[seqIndex].setLength(str.length()); /* PART A */ std::cout << setOfSeqs[seqIndex].getSeq() << "\n"; //delete newSeq; }` –  Jan 15 '17 at 21:06

0 Answers0