0

this is what I have done till now: I want to read words from file in C++ and I am allowed to use only cstring library. this is my piece of code

#include <cstring>
#include <fstream>
#include <stdio.h>

using namespace std;

int main(){

    ifstream file;
    char word[1];


    file.open("p.txt");



    while (!file.eof()){

        file >> word;
        cout << word << endl;

    }
    system("pause");
    return 0;
}

It is working fine and reading one word at a time. But I don't understand how this is working fine. How can char array of any size be it char word[1] or char word[50] read only one word at a time ignoring spaces.

And further I want to store these words in dynamic array. How can I achieve this? Any guidance would be appreciated?

R.. GitHub STOP HELPING ICE
  • 208,859
  • 35
  • 376
  • 711
qwas
  • 77
  • 8
  • 3
    See ["while( !feof( file ) )" is always wrong](http://stackoverflow.com/q/5431941) – Joseph Quinsey May 15 '14 at 21:30
  • Probably because the type of a C-string is `char *` and the type of an array of characters is `char *`. In one method or another, the array is decomposing to `char *`. – Thomas Matthews May 15 '14 at 21:31
  • 2
    This is a buffer overflow. Use `std::string`. – Marius Bancila May 15 '14 at 21:31
  • 4
    Its *not* "working fine". Don't confuse *observed* behavior with *defined* behavior. Send it through Valgrind. – WhozCraig May 15 '14 at 21:31
  • 1
    I suggest you use `std::string` to avoid problems like this. For a single characters, don't use an array. – Thomas Matthews May 15 '14 at 21:31
  • 1
    the only explanation is that you are getting lucky. For strings longer than a character, `file >> word` is going to write over memory it hasn't allocated, which is UB. Why aren't you using `std::string`? – Red Alert May 15 '14 at 21:32
  • 1
    possible duplicate of [Why does reading a struct record fields from std::istream fail, and how can I fix it?](http://stackoverflow.com/questions/23047052/why-does-reading-a-struct-record-fields-from-stdistream-fail-and-how-can-i-fi) – πάντα ῥεῖ May 15 '14 at 21:36
  • 1
    @RedAlert, not 'for strings longer than a character' but, actually, for *any* string. Remember that C/C++ string are NUL-terminated, one-char array can contain only an empty string, one-char string needs 2-char array. – CiaPan May 15 '14 at 21:45
  • 1
    Sorry, selected the wrong dupe link, this one I meant: http://stackoverflow.com/questions/5605125/why-is-iostreameof-inside-a-loop-condition-considered-wrong – πάντα ῥεῖ May 15 '14 at 21:49

7 Answers7

7

Your code has undefined behaviour. operator >> simply overwrites memory beyond the array.

Take into account that included by you header <stdio.h> is not used in the program. On the other hand you need to include header <cstdlib> that declares function system.

As for your second question then you should use for example standard container std::vector<std::string>

For example

#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <cstdlib>

int main()
{
    std::ifstream file("p.txt");
    std::string s;
    std::vector<std::string> v;
    v.reserve( 100 );



    while ( file >> s ) v.push_back( s );

    std::system( "pause" );

    return 0;
}

Or you can simply define the vector as

std::vector<std::string> v( ( std::istream_iterator<std::string>( file ) ),
                            std::istream_iterator<std::string>() );

provided that you will include header <iterator>

For example

#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <iterator>
#include <cstdlib>

int main()
{
    std::ifstream file("p.txt");
    std::vector<std::string> v( ( std::istream_iterator<std::string>( file ) ),
                                std::istream_iterator<std::string>() );

    for ( const std::string &s : v ) std::cout << s << std::endl;

    std::system( "pause" );

    return 0;
}
Vlad from Moscow
  • 301,070
  • 26
  • 186
  • 335
4

Your code is invoking undefined behavior. That it doesn't crash is a roll of the dice, but its execution is not deterministic precisely because that is the nature of being undefined.

The easiest way (I've found) to load a file of words with whitespace separation is by:

std::ifstream inp("p.txt");
std::istream_iterator<std::string> inp_it(inp), inp_eof;
std::vector<std::string> strs(inp_it, inp_eof);

strs will contain every whitespace delimited char sequence as a linear vector of std::string. Use std::string for dynamic string content and don't feel the least bit guilty about exploiting the hell out of the hard work those that came before you gave us all: The Standard Library.

WhozCraig
  • 65,258
  • 11
  • 75
  • 141
1

You're using C++, so you can avoid all that C stuff.

std::string word;
std::vector<std::string> words;
std::fstream stream("wordlist");

// this assumes one word (or phrase, with spaces, etc) per line...
while (std::getline(stream, word))    
    words.push_back(word);

or for multiple words (or phrases, with spaces, etc) per line separated by commas:

while (std::getline(stream, word, ','))    
    words.push_back(word);

or for multiple words per line separated by spaces:

while(stream >> word)
    words.push_back(word);

No need to worry about buffer sizes or memory allocation or anything like that.

Rook
  • 5,734
  • 3
  • 34
  • 43
  • checking if the stream is good is a bad programming practise. – Veritas May 15 '14 at 21:35
  • This code is broken, see: [Testing stream.good() or !stream.eof() reads last line twice](http://stackoverflow.com/questions/4324441/testing-stream-good-or-stream-eof-reads-last-line-twice) – Blastfurnace May 15 '14 at 22:57
  • @Veritas: you are of course entirely correct. Mea culpa. Fixed now, I hope. – Rook May 16 '14 at 16:44
1

It works because you are lucky and you don't overwrite some critical memory. You need to allocate enough bytes for char word array, say char word[64]. And use while(file>>word) as your test for EOF. In the loop you can push_back the word into a std::vector<string> if you are allowed to use C++ STL.

If you want a simple C++11 STL-like solution, use this

#include <algorithm>
#include <iterator>
#include <vector>
#include <string>
#include <fstream>
#include <iostream>

using namespace std;

int main()
{
   ifstream fin("./in.txt"); // input file
   vector<string> words; // store the words in a vector 
   copy(istream_iterator<string>(fin),{}, back_inserter(words)); // insert the words
   for(auto &elem: words)
       cout << elem << endl; // display them
}

Or, more compactly, construct the container directly from the stream iterator like

vector<string> words(istream_iterator<string>(fin),{});

and remove the copy statement.

If instead a vector<string> you use a multiset<string> (#include <set>) and change

copy(istream_iterator<string>(fin),{}, back_inserter(words)); // insert the words

to

copy(istream_iterator<string>(fin),{}, inserter(words, words.begin())); // insert the words

you get the words ordered. So using STL is the cleanest approach in my opinion.

vsoftco
  • 55,410
  • 12
  • 139
  • 252
  • There are words in the english language with more than 40 letters (there are others but here is an example: http://en.wikipedia.org/wiki/Pneumonoultramicroscopicsilicovolcanoconiosis ). If you include hyphens it can get much longer than that. There is no way to be sure that char[X] is a big enough array - you need something that will grow automatically like a std::string. – Jerry Jeremiah May 15 '14 at 21:41
  • Sure, I just pointed out that `char word[1]` is not enough (edited my answer now). In any case, one should use STL and not plain old C in C++, it makes no sense to use a language like C++ in C-style mode, unless you get a significant performance decrease (which is not the case for STL). – vsoftco May 15 '14 at 21:44
1

Your code is failing due to the overload of char * for operator>>.

An array of char, regardless the size, will decompose to the type char * where the value is the address of the start of the array.

For compatibility with the C language, the overloaded operator>>(char *) has been implemented to read one or more characters until a terminating whitespace character is reached, or there is an error with the stream.

If you declare an array of 1 character and read from a file containing "California", the function will put 'C' into the first location of the array and keep writing the remaining characters to the next locations in memory (regardless of what data type they are). This is known as a buffer overflow.

A much safer method is to read into a std::string or if you only want one character, use a char variable. Look in your favorite C++ reference for the getline methods. There is an overload for reading until a given delimiter is reached.

Thomas Matthews
  • 56,849
  • 17
  • 98
  • 154
1

You only need a couple changes:

#include <cstring>
#include <fstream>
#include <stdio.h>
#include <string>
int main(){

    ifstream file;
    string word;    

    file.open("p.txt");    

    while (file >> word){
        cout << word << endl;    
    }

    system("pause");
    return 0;
}
Red Alert
  • 3,786
  • 2
  • 17
  • 24
0
file>>char *

Will work with any char * and you are using

file >> word;

and it simply sees work variable as a char * but you are getting a segemntation fault somewhere and if your code grows you will see something is not working without any logical reason. GDB debugger will show you the seg fault

madz
  • 1,803
  • 18
  • 45