
Good day. I could really use your help on this one. I have a stats text file in the following format.

ID=1000000 
Name=Name1
Field1=Value1 
...(Fields 2 to 25)
Field26=Value26 

ID=1000001
Name=Name2
Field1=Value1 
...(Fields 2 to 25) 
Field26=Value26

ID=1000002
Name=Name2
Field1=Value1 
...(Fields 2 to 25) 
Field26=Value26 

...goes up to 15000

I also have an active people text file with one name per line.

Name2
Name5
Name11
Name12 
...goes up to 1400 Random Names

I need to delete every record (ID, Name, Field1 to Field26) from the stats text file whose name is not found in the active people text file. In the example above, the record for Name1 should be deleted because Name1 is not in the active people text file.

I've tried reformatting the stats file in Notepad++ using TextFX->Quick->Find/Replace to convert it to a comma-separated file with one record per line. I rearranged it to:

ID       Name    Field1  ...Fields2 to Fields 25... Field26
1000000  Name1   Value1  ...Value2 to Value 25...   Value26
1000001  Name2   Value1  ...Value2 to Value 25...   Value26
1000002  Name2   Value1  ...Value2 to Value 25...   Value26

I've opened it with Excel, and I've created two tables (a stats table and an active names table) in MySQL from the CSV file. I'm not sure how to process this automatically. Besides removing the inactive records, the other problem is writing the result back in its original format.

I've been trying my best to figure this out for hours on end. Is there a solution that won't require me to find, copy, paste and switch between the two files 1400 times? Unfortunately, I have to keep the stats file in this format.

Please help. Thank you.

Krispy K
  • It doesn't sound like you are asking how to write a program to do this so I think you will get a better response over at http://superuser.com/ – IronMensan Oct 18 '11 at 12:02
  • I've added more information as to how I tried to solve using programs. – Krispy K Oct 18 '11 at 17:23
  • That was my point, you are trying to solve this problem "using programs." Stackoverflow is geared towards people **writing** programs. – IronMensan Oct 18 '11 at 19:34

1 Answer


Here's a C++ program that will process the files for you:

#include <algorithm>
#include <cctype>
#include <fstream>
#include <iostream>
#include <set>
#include <string>
#include <vector>

//trim functions adapted from:
//http://stackoverflow.com/questions/216823/whats-the-best-way-to-trim-stdstring/217605#217605
//std::isspace is wrapped to avoid overload ambiguity; the cast to
//unsigned char avoids undefined behavior for negative char values
static bool myIsSpace(char test)
{
    return std::isspace(static_cast<unsigned char>(test)) != 0;
}
static bool myIsNotSpace(char test)
{
    return !myIsSpace(test);
}
static std::string &rtrim(std::string &s) {
    s.erase(std::find_if(s.rbegin(), s.rend(), myIsNotSpace).base(), s.end());
    return s;
}

static std::string &ltrim(std::string &s) {
    s.erase(s.begin(), std::find_if(s.begin(), s.end(), myIsNotSpace));
    return s;
}

static std::string &trim(std::string &s) {return ltrim(rtrim(s));}

int main(int argc,char * argv[])
{
    std::ifstream peopleFile;
    peopleFile.open("people.txt");

    if (!peopleFile.is_open()) {
        std::cout << "Could not open people.txt" << std::endl;
        return -1;
    }

    std::set<std::string> people;

    //loop on the result of getline rather than on eof()
    std::string somePerson;
    while (std::getline(peopleFile,somePerson)) {
        trim(somePerson);
        if (!somePerson.empty()) {
            people.insert(somePerson);
        }
    }

    peopleFile.close();

    std::ifstream statsFile;
    statsFile.open("stats.txt");

    if (!statsFile.is_open()) {
        std::cout << "could not open stats.txt" << std::endl;
        return -2;
    }

    std::ofstream newStats;
    newStats.open("new_stats.txt");

    if (!newStats.is_open()) {
        std::cout << "could not open new_stats.txt" << std::endl;
        statsFile.close();
        return -3;
    }

    size_t totalRecords=0;
    size_t includedRecords=0;

    bool firstRecord=true;
    bool included=false;
    std::vector<std::string> record;
    std::string recordLine;
    while (std::getline(statsFile,recordLine)) {
        std::string trimmedRecordLine(recordLine);
        trim(trimmedRecordLine);

        if (trimmedRecordLine.empty()) {
            if (!record.empty()) {
                ++totalRecords;

                if (included) {
                    ++includedRecords;

                    if (firstRecord) {
                        firstRecord=false;
                    } else {
                        newStats << std::endl;
                    }

                    for (std::vector<std::string>::iterator i=record.begin();i!=record.end();++i) {
                        newStats << *i << std::endl;
                    }
                    included=false;
                }

                record.clear();
            }
        } else {
            record.push_back(recordLine);
            if (!included) {
                if (0==trimmedRecordLine.compare(0,4,"Name")) {
                    trimmedRecordLine=trimmedRecordLine.substr(4);
                    ltrim(trimmedRecordLine);
                    if (!trimmedRecordLine.empty() && '='==trimmedRecordLine[0]) {
                        trimmedRecordLine=trimmedRecordLine.substr(1);
                        ltrim(trimmedRecordLine);
                        included=people.end()!=people.find(trimmedRecordLine);
                    }
                }
            }
        }
    }

    if (!record.empty()) {
        ++totalRecords;

        if (included) {
            ++includedRecords;

            if (firstRecord) {
                firstRecord=false;
            } else {
                newStats << std::endl;
            }

            for (std::vector<std::string>::iterator i=record.begin();i!=record.end();++i) {
                newStats << *i << std::endl;
            }
            included=false;
        }

        record.clear();
    }

    statsFile.close();
    newStats.close();

    std::cout << "Wrote new_stats.txt with " << includedRecords << " of the " << totalRecords
              << ((1==totalRecords)?" record":" records") << " found in stats.txt after filtering against the "
              << people.size() << ((1==people.size())?" person":" people") << " found in people.txt" << std::endl;

    return 0;
}
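
To check the filtering rule on a small sample without touching any files, the same idea can be sketched as a function that works on streams. This is only a sketch, not part of the program above: the names `trimmed` and `filterRecords` are made up for illustration, and it assumes the name line is written exactly as `Name=...` with no spaces around the `=`.

```cpp
#include <cctype>
#include <iostream>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns a copy of s without leading/trailing whitespace.
static std::string trimmed(const std::string &s) {
    std::size_t b = 0, e = s.size();
    while (b < e && std::isspace(static_cast<unsigned char>(s[b]))) ++b;
    while (e > b && std::isspace(static_cast<unsigned char>(s[e - 1]))) --e;
    return s.substr(b, e - b);
}

// Hypothetical helper: copies blank-line-separated records from in to out,
// keeping only records whose "Name=" line names someone in active.
// Kept records are written unchanged, separated by one blank line.
// Returns the number of records kept.
static std::size_t filterRecords(std::istream &in,
                                 const std::set<std::string> &active,
                                 std::ostream &out) {
    std::vector<std::string> record;
    bool keep = false;
    std::size_t kept = 0;
    std::string line;
    bool more = true;
    while (more) {
        more = static_cast<bool>(std::getline(in, line));
        std::string t = more ? trimmed(line) : std::string();
        if (!t.empty()) {
            // Still inside a record: remember the original line.
            record.push_back(line);
            if (t.compare(0, 5, "Name=") == 0 &&
                active.count(trimmed(t.substr(5))) > 0) {
                keep = true;
            }
            continue;
        }
        // A blank line or the end of input closes the current record.
        if (keep && !record.empty()) {
            if (kept++ > 0) out << '\n';
            for (std::size_t i = 0; i < record.size(); ++i) {
                out << record[i] << '\n';
            }
        }
        record.clear();
        keep = false;
    }
    return kept;
}
```

Because it takes `std::istream` and `std::ostream` references, the same function can be fed a `std::istringstream` for a quick check or a `std::ifstream`/`std::ofstream` pair for the real files.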
IronMensan