I just started using DLLs, but I haven't had this problem before, so it might not be DLL-related. I have a KMP string-matching algorithm implemented in C++, and I am calling it from C# through a DLL.

This is my export:

extern "C" __declspec (dllexport) const char* naive(const char* text, const char* str);
extern "C" __declspec (dllexport) const char* KMP(const char* text, const char* str);

My import:

 [DllImport(@"dll_path", CallingConvention = CallingConvention.Cdecl)]
 public static extern IntPtr KMP([MarshalAs(UnmanagedType.LPStr)] string text, [MarshalAs(UnmanagedType.LPStr)] string str);

Calling from c#

  string output = Marshal.PtrToStringAnsi(KMP(richTextBox1.Text, richTextBox2.Text));

And the c++ function:

const char* KMP(const char* text, const char* str)
{
    int tL = strlen(text);
    int sL = strlen(str);
    /* Algorithm */
}

The exception is thrown right after the function is called, so I figured the implementation is not to blame. The weird thing is that it is only thrown when there is a '\n' (new line) in the second parameter (str), no matter where exactly. If there are no new lines, it runs normally. What confuses me most is why only the second argument matters, since both are declared and used identically. I have also implemented the naive algorithm, and it behaves the same way.

All the answers I found covered cases where a negative number was given as an array size or an undeclared variable was used, but nothing about pointers. I doubt it's anything similar, because when my search string (the 2nd parameter, str) doesn't contain a new line, the code executes normally.

Any ideas ?

Thanks in advance.

EDIT (body of function):

const char* KMP(const char* text, const char* str)
{
    int tL = strlen(text);
    int sL = strlen(str);
    string match = "";

    if (sL == 0 || tL == 0)
        throw "both text and string must be larger than 0";
    else if (sL > tL)
        throw "the text must be longer than the string";

    int tI = 0;
    int col = 0, row = 0;

    while (tI <= tL - sL)
    {
        int i = 0;
        int tmpCol = -1;
        int next = 1;
        for (; i <= sL && text[i + tI] != '\0'; i++)
        {
            if (text[i + tI] == '\n')
            {
                row++;
                tmpCol++;
            }
            if (text[i + tI] == str[0] && next == 1 && i > 0)
                next = i;

            if (text[i + tI] != str[i])
                break;
        }
        if (i == sL)
        {
            match += to_string(row) + ',' + to_string(col) + ';';
        }

        tI += next;

        col = tmpCol > -1 ? tmpCol : col + next;
    }
    char* c = new char[match.length() - 1];
    c[match.length() - 1] = '\0';
    for (int i = 0; i < match.length() - 1; i++)
        c[i] = match[i];
    return c;
}
veili_13
  • `IntPtr` and the allocation were suggested by our professor, as we have just begun to use DLLs and were told that string arrays cause a lot of problems and that it is best to use this "template". Freeing... I totally forgot; I haven't used C++ in a while. The allocation is this: `char* c = new char[match.length() - 1];` – veili_13 Mar 21 '15 at 18:31
  • Why the `const char*`/`IntPtr` return type? Isn't that algorithm supposed to return an index in the string (`int`)? – Lucas Trzesniewski Mar 21 '15 at 18:31
  • KMP yes, but my function returns all the found matches with row and column indexes. – veili_13 Mar 21 '15 at 18:33
  • If it really depends on a `\n` in the input, then there is some sort of bug in the C++ code. You are probably writing outside a buffer due to a superfluous/missing `\r`. – H H Mar 21 '15 at 18:33
  • @veili_13 Have you tried it with completely stubbed methods (like `...{ return "adfadsf";}`)? Just to make absolutely sure that it is a marshaling issue. – Eugene Podskal Mar 21 '15 at 18:36
  • @Henk I thought so too, but when I tried to debug, it never went past the declaration of the function. – veili_13 Mar 21 '15 at 18:37
  • @veili_13 What do you mean? Does it return some stub value or it fails in the same way? – Eugene Podskal Mar 21 '15 at 18:38
  • @EugenePodskal yes, as I said if the 2nd parameter (str) does not contain a '\n' the whole code executes normally. I mean to me, this does not make one bit of sense. The first parameter (text), in every try, has multiple lines and all goes normally as long as str does not – veili_13 Mar 21 '15 at 18:39
  • @veili_13 It is a wild guess, but can you check whether string parameters have line breaks as `Environment.NewLine` or as `'\n'`,`\r`? – Eugene Podskal Mar 21 '15 at 18:45
  • @EugenePodskal They are '\n'; I was also interested in how they were represented. – veili_13 Mar 21 '15 at 18:47
  • @veili_13 Try to normalize them to `Environment.NewLine` - http://stackoverflow.com/questions/841396/what-is-a-quick-way-to-force-crlf-in-c-sharp-net or http://stackoverflow.com/questions/140926/normalize-newlines-in-c-sharp and check the marshaling. – Eugene Podskal Mar 21 '15 at 18:50
  • @EugenePodskal I did. Now they are represented as \r\n, but same story: the error was thrown after the call... – veili_13 Mar 21 '15 at 18:56
  • That _after the calling_ doesn't mean a whole lot, check your C++ code. It does not appear to be a Marshalling issue. – H H Mar 21 '15 at 18:57
  • @HenkHolterman The error is thrown at this point: `int tL = strlen(text);`. I mentioned my memory allocation in the first comment. If you think that a loop through both arrays can be the cause, I'll post the code as well; I didn't think it was relevant. – veili_13 Mar 21 '15 at 19:02
  • @veili_13 Well, I have specifically asked whether your problem shows itself with stub function that does not do anything - http://stackoverflow.com/questions/29186264/invalid-allocation-size-when-calling-a-function-from-dll?noredirect=1#comment46588107_29186264. – Eugene Podskal Mar 21 '15 at 19:05
  • @EugenePodskal No matter what it returns, the result is the same: with a new line, an error; without one, normal execution. – veili_13 Mar 21 '15 at 19:07
  • @HenkHolterman there is the rest of the code, if it's any help. – veili_13 Mar 21 '15 at 19:08

1 Answer

Just change your code to handle the no-matches case, because the runtime cannot allocate 0 - 1 = 0xFFFFFFFF bytes (`match.length()` is unsigned, so the subtraction wraps around). I have also changed your buffer allocation and copy loop to avoid writing past the end of the buffer (as pointed out by @HenkHolterman):

...
if (match.length() == 0)
    return "No matches";

// Allocate for all chars + \0 except the last semicolon
char* c = new char[match.length()];
c[match.length() - 1] = '\0';

// Copy all chars except the last semicolon
for (int i = 0; i < match.length() - 1; i++)
    c[i] = match[i];

return c;

Note: it still does not copy the last semicolon, so if you need it, you will have to add one more character to the buffer.
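
To make the wraparound concrete, here is a minimal sketch. The helper names `copy_without_trailing_semicolon` and `wrapped_size_of_empty` are hypothetical and not part of the question's code. It shows that `length() - 1` on an empty string yields the largest unsigned value, and one possible way to guard the copy against that.

```cpp
#include <cstddef>
#include <cstring>
#include <limits>
#include <string>

// Hypothetical helper: copy `match` without its last character (the trailing
// ';') into a new heap buffer. Returns nullptr when there is nothing to copy.
// The caller owns the buffer and must release it with delete[].
char* copy_without_trailing_semicolon(const std::string& match)
{
    if (match.empty())
        return nullptr;                        // guard: length() - 1 would wrap around

    const std::size_t n = match.length() - 1;  // number of characters to keep
    char* c = new char[n + 1];                 // +1 for the terminating '\0'
    std::memcpy(c, match.data(), n);
    c[n] = '\0';
    return c;
}

// length() returns an unsigned type, so 0 - 1 wraps to the maximum value
// instead of becoming -1. That huge value is what the allocator rejects.
std::size_t wrapped_size_of_empty()
{
    const std::string empty;
    return empty.length() - 1;
}
```

Returning `nullptr` for the empty case is only one option; the answer's `"No matches"` literal works too, but then the caller must not try to free that pointer.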


P.S.: Also I see a few issues with your code:

  1. You use C++ exceptions. While the CLR will catch them as SEH exceptions (because VC++ uses SEH), it is still not a good idea overall - Throwing C++ exceptions across DLL boundaries
  2. You store the length in a signed int (`int tL = strlen(text);`), whereas `strlen` returns an unsigned `size_t`. It may not be an actual problem, but it is not the right way either.
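
Both points can be addressed at the export boundary. The following is only a sketch under assumptions: `KMP_safe`, `free_result`, `find_matches`, and the `DLL_EXPORT` macro are hypothetical names, and `find_matches` is a trivial stand-in for the real search. The idea is to catch everything inside the DLL, report failure through a return code, and export a matching function so the buffer is freed by the same runtime that allocated it.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

#if defined(_WIN32)
#define DLL_EXPORT extern "C" __declspec(dllexport)
#else
#define DLL_EXPORT extern "C"
#endif

// Hypothetical stand-in for the real search. It throws on bad input,
// like the function in the question, and uses size_t for lengths.
static std::string find_matches(const char* text, const char* str)
{
    const std::size_t text_len = std::strlen(text);
    const std::size_t str_len = std::strlen(str);
    if (text_len == 0 || str_len == 0)
        throw "both text and string must be non-empty";
    if (str_len > text_len)
        throw "the text must be at least as long as the string";
    return std::strstr(text, str) ? "found" : "";
}

// Returns 0 on success, 1 on null arguments, 2 on any internal failure.
// No exception ever leaves this function.
DLL_EXPORT int KMP_safe(const char* text, const char* str, char** result)
{
    if (text == nullptr || str == nullptr || result == nullptr)
        return 1;
    *result = nullptr;
    try {
        const std::string match = find_matches(text, str);
        char* buffer = new char[match.size() + 1];              // room for '\0'
        std::memcpy(buffer, match.c_str(), match.size() + 1);   // copies the '\0' too
        *result = buffer;
        return 0;
    } catch (...) {
        return 2;                                               // includes std::bad_alloc
    }
}

// Releases a buffer produced by KMP_safe; passing nullptr is allowed.
DLL_EXPORT void free_result(char* result)
{
    delete[] result;
}
```

On the C# side, the caller would presumably read the string with `Marshal.PtrToStringAnsi` and then hand the pointer back to `free_result`; the exact import declarations are left out here because they depend on the project.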
Eugene Podskal
  • Damn... I must have overlooked that, because I thought my inputs matched. It seems they didn't. Thanks a lot, that solved it. – veili_13 Mar 21 '15 at 19:48
  • In my defense I was testing word\n and it wasn't a match (as I thought) because it was word \n xD. Anyway thank you very much! – veili_13 Mar 21 '15 at 19:53
  • thank you for the tips, I'll get on it. thanks again. – veili_13 Mar 21 '15 at 19:58
  • @veili_13 NEVER TRUST THE INPUT. Sorry for shouting. Also try to use more meaningful names for your variables - Intellisense in any case saves a lot of typing, so why not utilize such boon to create easier comprehensible programs. Empty lines can help too. Good luck. – Eugene Podskal Mar 21 '15 at 20:03
  • I fixed the overwrite right after @HenkHolterman pointed it out. I know my code is a bit messy and the variable names seem totally out of the blue; I'm a bit short on time, so they are just initials (tI = text iterator, tL = text length). But I meant to ask something else that has been bugging me (pun not intended): do you have any idea why my debugging wouldn't proceed if the breakpoint was on any line before `string match = ""`, yet the error would still be thrown? Shouldn't it be thrown at `char* c = new char[match.length() - 1];` IF there was no match, and not before? – veili_13 Mar 21 '15 at 20:19
  • To be honest, I debugged it in a standalone C++ project and used the DLL only to check the actual marshaling. In that project it showed an assert dialog saying that it cannot allocate 0xFFFFFFFF bytes. – Eugene Podskal Mar 21 '15 at 20:22
  • @veili_13 I have tried it with the DLL and unmanaged code debugging: stepping through, it fails exactly on the `char* c = new char[match.length() - 1]` allocation, when the assert window appears. And I can successfully set a working breakpoint on this line. So I have no idea why it behaved so strangely for you (perhaps some debug settings or pdb bugs?). – Eugene Podskal Mar 21 '15 at 20:32