
In C/C++, how can I extract the substrings AAA and BBB from c:\Blabla - dsf\blup\AAA - BBB\blabla.bmp?

I.e., extract the parts before and after the - in the last folder of a file path.

Thanks in advance.

(PS: if possible, without the .NET Framework or similar things, in which I could easily get lost.)

Basj

5 Answers

#include <iostream>
#include <sstream>
#include <string>
using namespace std;

#include <windows.h>
#include <Shlwapi.h> // link with shlwapi.lib

int main()
{
    char buffer_1[] = "c:\\Blabla - dsf\\blup\\AAA - BBB\\blabla.bmp";
    char *lpStr1 = buffer_1;

    // Remove the file name from the string (ANSI variant, because the buffer is char)
    PathRemoveFileSpecA(lpStr1);
    string s(lpStr1);

    // Find the last directory name
    stringstream ss(s.substr(s.rfind('\\') + 1));

    // Split the last directory name into tokens separated by '-'
    while (getline(ss, s, '-'))
        cout << s << endl;
}

The explanation is in the comments.

This doesn't trim the spaces around the tokens in the output; if you also want to do that, check this.
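
The tokens produced by splitting on '-' keep the spaces that surround the dash ("AAA " and " BBB"). A minimal sketch of trimming them, assuming only spaces and tabs need to be removed; `trimSpaces` and `splitAndTrim` are hypothetical names chosen for illustration, not part of Shlwapi or the standard library:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: strips leading and trailing spaces and tabs.
std::string trimSpaces(const std::string& s)
{
    const char* const blanks = " \t";
    std::string::size_type first = s.find_first_not_of(blanks);
    if (first == std::string::npos)
        return std::string();              // nothing but blanks
    std::string::size_type last = s.find_last_not_of(blanks);
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: splits a directory name on '-' and trims each token.
std::vector<std::string> splitAndTrim(const std::string& dirName)
{
    std::vector<std::string> tokens;
    std::stringstream ss(dirName);
    std::string token;
    while (std::getline(ss, token, '-'))
        tokens.push_back(trimSpaces(token));
    return tokens;
}
```

Calling `splitAndTrim("AAA - BBB")` yields the two tokens `AAA` and `BBB`.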

user93353

This does all the work and validations in plain C:

#include <stdlib.h>
#include <string.h>

/* Returns 1 on success, -1 on failure. On success the caller must free()
   *firstOut and *secondOut. */
int FindParts(const char* source, char** firstOut, char** secondOut)
{
    const char* last;       /* last character of the last folder name */
    const char* previous;   /* first character of the last folder name */
    const char* middle;     /* the '-' inside the folder name */
    const char* middle1;    /* last character of the first part */
    const char* middle2;    /* first character of the second part */
    char* first;
    char* second;

    last = strrchr(source, '\\');
    if (!last || (last == source))
        return -1;
    --last;

    /* walk back to the previous backslash, or to the start of the string */
    for (previous = last; (previous != source) && (*previous != '\\'); --previous);
    if (*previous == '\\')
        ++previous;
    if (previous > last)
        return -1;  /* empty folder name */

    middle = strchr(previous, '-');
    if (!middle || (middle > last))
        return -1;
    if ((middle == previous) || (middle == last))
        return -1;  /* nothing before or after the '-' */
    middle1 = middle-1;
    middle2 = middle+1;

    /* now skip spaces; a part made only of spaces is an error */
    for (; (previous != middle1) && (*previous == ' '); ++previous);
    for (; (middle1 != previous) && (*middle1 == ' '); --middle1);
    if (*previous == ' ')
        return -1;
    for (; (middle2 != last) && (*middle2 == ' '); ++middle2);
    for (; (last != middle2) && (*last == ' '); --last);
    if (*middle2 == ' ')
        return -1;

    first   = (char*)malloc(middle1-previous+1 + 1);
    second  = (char*)malloc(last-middle2+1 + 1);
    if (!first || !second)
    {
        free(first);
        free(second);
        return -1;
    }

    strncpy(first, previous, middle1-previous+1);
    first[middle1-previous+1] = '\0';
    strncpy(second, middle2, last-middle2+1);
    second[last-middle2+1] = '\0';

    *firstOut   = first;
    *secondOut  = second;

    return 1;
}
Liviu
  • @James Kanze Before being smart (C++/C#), you have to know plain C. Maybe it's a homework. – Liviu May 17 '13 at 09:18
  • This is completely false. You should _not_ bother with plain C before learning C++; it will only be a source of confusion. – James Kanze May 17 '13 at 09:47
  • In my days, in technical schools, you first learned C simply to know about pointers. "Completely false"? Please, be careful with your language! – Liviu May 17 '13 at 09:55
  • In my days, when I was in school, you didn't learn C, because it hadn't been invented yet. Today, regardless of the school, they should be teaching C++ before teaching C (unless they don't count on teaching C++ at all; e.g. if the only reason they're teaching programming is to support embedded controllers or the like). Other than for embedded controllers, there's practically no reason to teach C at all. – James Kanze May 17 '13 at 10:06

This can relatively easily be done with regular expressions: std::regex if you have C++11; boost::regex if you don't:

//  requires <regex> and <string>; path is a std::string
static std::regex const pattern( R"(.*\\(\w+)\s*-\s*(\w+)\\[^\\]*$)" );
std::smatch results;
if ( std::regex_match( path, results, pattern ) ) {
    std::string firstMatch = results[1];
    std::string secondMatch = results[2];
    //  ...
}

Also, you should definitely have the functions split and trim in your toolkit:

#include <algorithm>
#include <locale>
#include <string>
#include <vector>

//  Predicate: true for characters which do NOT have the given classification
template <std::ctype_base::mask test>
class IsNot
{
    std::locale ensureLifetime;
    std::ctype<char> const* ctype;  //  Pointer to allow assignment
public:
    IsNot( std::locale const& loc = std::locale() )
        : ensureLifetime( loc )
        , ctype( &std::use_facet<std::ctype<char> >( loc ) )
    {
    }
    bool operator()( char ch ) const
    {
        return !ctype->is( test, ch );
    }
};
typedef IsNot<std::ctype_base::space> IsNotSpace;

std::vector<std::string>
split( std::string const& original, char separator )
{
    std::vector<std::string> results;
    std::string::const_iterator current = original.begin();
    std::string::const_iterator end = original.end();
    std::string::const_iterator next = std::find( current, end, separator );
    while ( next != end ) {
        results.push_back( std::string( current, next ) );
        current = next + 1;
        next = std::find( current, end, separator );
    }
    results.push_back( std::string( current, next ) );
    return results;
}

std::string
trim( std::string const& original )
{
    std::string::const_iterator end
        = std::find_if( original.rbegin(), original.rend(), IsNotSpace() ).base();
    std::string::const_iterator begin
        = std::find_if( original.begin(), end, IsNotSpace() );
    return std::string( begin, end );
}

(These are just the ones you need here. You'll obviously want the full complement of IsXxx and IsNotXxx predicates, a split which can split according to a regular expression, a trim which can be passed a predicate object specifying what is to be trimmed, etc.)

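Anyway, how to apply split and trim to get what you want should be obvious: split the path on '\\', take the next-to-last field, split that on '-', and trim each of the resulting fields.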
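
One possible way of combining split and trim for this path is sketched below. To keep the block self-contained it uses compact stand-ins for the two functions (this `trim` only handles plain spaces), and `lastFolderParts` is a hypothetical name. The choice to fail unless the folder name contains exactly one '-' is an assumption, not something the answer prescribes:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Compact stand-ins for split and trim (this trim handles ' ' only).
std::vector<std::string> split(const std::string& original, char separator)
{
    std::vector<std::string> results;
    std::string::const_iterator current = original.begin();
    std::string::const_iterator end = original.end();
    std::string::const_iterator next = std::find(current, end, separator);
    while (next != end) {
        results.push_back(std::string(current, next));
        current = next + 1;
        next = std::find(current, end, separator);
    }
    results.push_back(std::string(current, next));
    return results;
}

std::string trim(const std::string& original)
{
    std::string::size_type first = original.find_first_not_of(' ');
    if (first == std::string::npos)
        return std::string();
    return original.substr(first, original.find_last_not_of(' ') - first + 1);
}

// Hypothetical function: takes the last folder of a backslash-separated path
// and splits it on '-'. Fails unless that yields exactly two fields.
bool lastFolderParts(const std::string& path, std::string& first, std::string& second)
{
    std::vector<std::string> components = split(path, '\\');
    if (components.size() < 2)
        return false;                   // no folder in front of the file name
    std::vector<std::string> fields = split(components[components.size() - 2], '-');
    if (fields.size() != 2)
        return false;                   // zero or several '-' in the folder name
    first = trim(fields[0]);
    second = trim(fields[1]);
    return true;
}
```

For the path in the question, the path splits into five components, the next-to-last one is `AAA - BBB`, and the function stores `AAA` and `BBB`.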

James Kanze
  • Thanks for your answer! I use Visual C++ 2010 express, so I probably don't have C++11? What do I need to do then? I forgot to ask: what happens if: a) there are many occurrences of ` - ` in the last folder name? b) there are no occurrences of ` - ` in the last folder name? – Basj May 17 '13 at 15:14
  • If you don't have C++11, there's always boost::regex. And `split` and `trim` don't require C++11. As to what will happen if there are a different number of occurrences of `-`: it depends. With the regular expression I present, there won't be a match, so you won't go into the `if`. Using `split` and `trim`, it's up to you; when you do the second `split` (on `'-'`), you'll end up with one more field than there are `'-'`. – James Kanze May 20 '13 at 07:55
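
The behaviour described in the last comment can be checked with a small sketch, assuming a compiler with C++11 `std::regex` and raw string literals. `matchLastFolder` is a hypothetical wrapper name; the pattern is the one from the answer:

```cpp
#include <regex>
#include <string>

// Hypothetical wrapper: returns false when the last folder is not of the
// form "word - word", e.g. when it contains no '-' or more than one.
bool matchLastFolder(const std::string& path, std::string& first, std::string& second)
{
    static const std::regex pattern(R"(.*\\(\w+)\s*-\s*(\w+)\\[^\\]*$)");
    std::smatch results;
    if (!std::regex_match(path, results, pattern))
        return false;
    first = results[1].str();
    second = results[2].str();
    return true;
}
```

With a folder named `AAA - BBB - CCC` or `AAABBB` the function returns false, because the pattern requires the whole folder name to be two words around a single dash.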

Use std::string::rfind, i.e. size_t rfind(char c, size_t pos = npos) const:

  1. Find the last '\' in the path using rfind (pos1).
  2. Find the previous '\' using rfind, starting the search at pos1 - 1 (pos2).
  3. Get the substring between the positions pos2 and pos1, using substr.
  4. Find the character '-' in it (pos3).
  5. Extract the 2 substrings between pos2 and pos3, and between pos3 and pos1.
  6. Remove the spaces in the substrings.

Resulting substrings will be AAA and BBB
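
The steps above can be sketched as follows. `extractParts` is a hypothetical name, and step 6 is implemented here with the erase/remove idiom, which removes every space in each part (including inner ones); that is an assumption about what "remove the spaces" should mean:

```cpp
#include <algorithm>
#include <string>

// Hypothetical function following the six steps listed above.
bool extractParts(const std::string& path, std::string& first, std::string& second)
{
    const std::string::size_type npos = std::string::npos;

    std::string::size_type pos1 = path.rfind('\\');             // step 1
    if (pos1 == npos || pos1 == 0)
        return false;
    std::string::size_type pos2 = path.rfind('\\', pos1 - 1);   // step 2
    std::string::size_type start = (pos2 == npos) ? 0 : pos2 + 1;
    std::string folder = path.substr(start, pos1 - start);      // step 3

    std::string::size_type pos3 = folder.find('-');             // step 4
    if (pos3 == npos)
        return false;                                           // no '-' in the folder name
    first = folder.substr(0, pos3);                             // step 5
    second = folder.substr(pos3 + 1);

    // step 6: remove all spaces from both parts
    first.erase(std::remove(first.begin(), first.end(), ' '), first.end());
    second.erase(std::remove(second.begin(), second.end(), ' '), second.end());
    return true;
}
```

If the folder name contains several dashes, this sketch splits at the first one, so everything after it ends up in the second part.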

bjskishore123
  • Good solution. It would maybe be easier to extract the substrings without spaces. –  May 17 '13 at 08:30
  • @gkovacs: if we know the text format will always be word-word, then we can decrease/increase the position while extracting substrings, which means without spaces. – bjskishore123 May 17 '13 at 08:35
  • Yes, this is exactly what I'm saying, so there's no need for step 6 in your algorithm. It can be solved with one statement. But this is just a little correction; your solution works well. –  May 17 '13 at 08:41
  • This will work, but is rather specialized. If the original poster doesn't have the generic tools which will make this particular problem trivial, he should add them to his toolbox, rather than writing some specialized code which can't be used elsewhere. – James Kanze May 17 '13 at 09:15
  • Thanks! Your steps are nice, I think it will work perfect... But I probably won't be able to implement it easily ;) – Basj May 17 '13 at 15:19

Here is a plain C++ solution (without Boost or C++11); still, the regex solution of James Kanze (https://stackoverflow.com/a/16605408/1032277) is the most generic and elegant:

inline void Trim(std::string& source)
{
size_t position = source.find_first_not_of(" ");
if (std::string::npos != position)
    source = source.substr(position);
position = source.find_last_not_of(" ");
if (std::string::npos != position)
    source = source.substr(0, position+1);
}

inline bool FindParts(const std::string& source, std::string& first, std::string& second)
{
size_t last = source.find_last_of('\\');
if ((std::string::npos == last) || !last)
    return false;

size_t previous = source.find_last_of('\\', last-1);
// If there is no earlier backslash, previous is npos, i.e. size_t(-1),
// so 1+previous below wraps around to 0, the start of the string.

size_t middle = source.find_first_of('-',1+previous);
if ((std::string::npos == middle) || (middle > last))
    return false;

first   = source.substr(1+previous, (middle-1)-(1+previous)+1);
second  = source.substr(1+middle, (last-1)-(1+middle)+1);

Trim(first);
Trim(second);

return true;
}
Liviu
  • thanks! do I really need `trim` if I want to extract what is before and after ` - ` (<->) ? If we search for ` - `, is it really necessary to trim? – Basj May 17 '13 at 16:00
  • No, you don't need to. Therefore you can eliminate `Trim` (calls and processing). – Liviu May 17 '13 at 16:11
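
A minimal sketch of the variant discussed in these comments, which searches for the three-character separator " - " directly so that no trimming is needed. `FindPartsNoTrim` is a hypothetical name, and the sketch assumes the folder name always uses exactly one space on each side of the dash:

```cpp
#include <string>

// Hypothetical variant of FindParts that looks for " - " directly,
// so the extracted parts need no trimming afterwards.
bool FindPartsNoTrim(const std::string& source, std::string& first, std::string& second)
{
    const std::string separator = " - ";

    std::string::size_type last = source.find_last_of('\\');
    if (last == std::string::npos || last == 0)
        return false;

    // Start of the last folder name: just after the previous backslash, or 0.
    std::string::size_type previous = source.find_last_of('\\', last - 1);
    std::string::size_type begin = (previous == std::string::npos) ? 0 : previous + 1;

    // The separator must lie entirely inside the last folder name.
    std::string::size_type middle = source.find(separator, begin);
    if (middle == std::string::npos || middle + separator.size() > last)
        return false;

    first = source.substr(begin, middle - begin);
    second = source.substr(middle + separator.size(), last - (middle + separator.size()));
    return true;
}
```

A folder named `AAA-BBB` (no spaces around the dash) is rejected by this variant, which is the trade-off for dropping `Trim`.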