I just started learning to code, starting with C++ yesterday. I need it for a project I'm doing, and I like to build generation tools as an "onboarding" process when I learn a new skill. So I thought I'd try building a regex generation tool.

I googled, I binged, and I looked through similar questions, but only saw answers pertaining to Ruby, Perl, or JS. Frankly, given the utility and prevalence of C++, I'm a bit surprised that more people haven't tried this.

I don't know how to go about the task, as I'm not a professional or really knowledgeable about what I'm doing. I'm not sure how to ask such questions, either. Please bear with me while I explain my current thoughts.

I am currently toying around with generating strings using byte arrays (I find the C++ type system and casting confusing at times). I wanted to see if any specific ranges of random values produce strings with Latin characters more often than others. I get a lot of different values and found a few ranges that looked like sweet spots, but I ultimately don't know which numbers correlate to which characters.
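
One way to see which numbers correspond to which characters is to print them side by side. The sketch below assumes an ASCII-compatible system, where the printable characters occupy 32 to 126; `charTable` is a hypothetical helper name used only for illustration.

```cpp
#include <string>

// Hypothetical helper: lists each value in [first, last] next to the
// character it encodes, one "value character" pair per line.
// Assumption: an ASCII-compatible character set, where 32..126 are printable.
std::string charTable(int first, int last) {
    std::string table;
    for (int value = first; value <= last; ++value) {
        table += std::to_string(value);     // the number, e.g. "65"
        table += ' ';
        table += static_cast<char>(value);  // the character that number encodes
        table += '\n';
    }
    return table;
}
```

Calling `std::cout << charTable(32, 126);` (with `<iostream>` included) prints the whole printable range. On an ASCII system the line for 65 shows `A` and the line for 97 shows `a`.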

I wanted to establish a pattern, then set the rand() ranges to correlate with the projected total byte value of what the pattern should generate as a string, then go fishing. I understand that I have to account for upper bounds for characters. So the generated values would be something like:

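// not implemented: getCeilingValue and getFloorValue are placeholders that do not exist yet
std::array<int, 2> getBoundary(string expression){
  srand(time(0));
  std::array<int, 2> boundaries = {0, 0}; // needs #include <array>
  boundaries[0] = getCeilingValue(expression);
  boundaries[1] = getFloorValue(expression);
  return boundaries;
}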

practice.cpp

    /*
     Method actually producing the byte strings
    */
    void practice::stuub(int boundaries[2]){
        srand(time(0)); //seed
        basic_string<char> byteArray = {}; //"byte array" instantiation
        for (int i = 0; i < 1000; i += 1) {
            if(i % 2 == 0){
                byteArray.push_back(rand() % boundaries[0]);//ceiling
            }else{
                byteArray.push_back(rand() % boundaries[1]);//floor
            }
        }
        std::string s(byteArray, sizeof(byteArray)); //convert to string
        cout << s << "\n";
    }

    /*
      just a copy pasta validation function that I don't know if I need yet
    */
    bool isNumeric(string str) {
        for (int i = 0; i < str.length(); i++)
            if (isdigit(str[i]) == false)
                return false; //when one non numeric value is found, return false
        return true;
    }

    /*
     current putzing around. It's just been real fun to play around with, 
     but I plan to replace the instantiation of values of the "mod" array with
     the upper/lower bounds of the string's projected values. This currently takes
     a value and just does random stuff to it on a fishing expedition to see
     if I can find any patterns.
    */
    void practice::randomStringGen() {
        try {
            srand(time(0));
            int mod[2] = {0};
            string choice;
            while (choice != "q") {
                cout << "\n enter an integer to generate random byte strings or press (q) to quit \n";
                cin >> choice;
                if(choice != "q") {// make sure it's not quit, otherwise it still carries out the tasks
                    if (isNumeric(choice)) {//make sure its numeric
                        mod[0] = stoi(choice);
                        if(mod[0] > 0) {//make sure its not 0
                            mod[0] = int(pow(mod[0], mod[0]));//do some weirdo math
                            mod[1] = rand() % mod[0]+1; //get another weirdo number
                            cout << "\n random string start:\n";
                            stuub(mod);//generate random string
                            cout << "\n :random string end\n";
                        }else{//user entered invalid integer
                            cout << "\n you did not enter a valid integer. Enter numbers greater than 0";
                        }
                    }else{
                        cout << "\n " << choice << " is not an integer";
                    }
                }
            }
        }catch(std::exception& e){
            cout << e.what();
        }
    }

I hope that provides enough explanation of what I am trying to accomplish. I'm not any sort of pro, and I have very little understanding of what I'm doing. I picked this up yesterday as an absolute beginner.

Talk to me like I'm 5 if you can.

Also, any recommendations on how to improve and "discretize" what I'm currently doing would be much appreciated. I think the nested "ifs" look wonky, but that's just a gut instinct.

Thanks!

yugely
  • `srand(time(0));` should be done once per _process_, and therefore probably not inside a function. – Mooing Duck Jun 27 '21 at 22:52
  • `basic_string` and `std::string` are two names for the same type – Mooing Duck Jun 27 '21 at 22:54
  • if you're trying to generate random numbers in a range, the equation is `(rand() % (max-min))+min` – Mooing Duck Jun 27 '21 at 22:56
  • `isNumeric(string str)` right now takes its parameter _by copy_. You should probably take the parameter by `const&`, to avoid the copy. – Mooing Duck Jun 27 '21 at 22:58
  • I think the answers to these suggest ways to create a regex given some text: https://stackoverflow.com/questions/6219790/need-a-regex-tool-that-suggests-expressions-based-on-selected-text https://stackoverflow.com/questions/776286/given-a-string-generate-a-regex-that-can-parse-similar-strings https://stackoverflow.com/questions/4880402/how-to-auto-generate-regex-from-given-list-of-strings https://stackoverflow.com/questions/31254100/detect-or-generate-regular-expression-from-string They may help... – Jerry Jeremiah Jun 27 '21 at 22:59
  • I've now read your thoughts twice, and your code twice, and still have absolutely no idea what your question is or what you're trying to do. – Mooing Duck Jun 27 '21 at 23:05
  • Currently, you take input, raise it to its own power, pick a random number between 1 and that value, and then generate a string made of either the first or second "weirdo" character. – Mooing Duck Jun 27 '21 at 23:06
  • `sizeof(byteArray)` is always the size of the string _metadata_, and does not include the length of the content it owns. You probably just want `std::string s(byteArray)` – Mooing Duck Jun 27 '21 at 23:07
  • @MooingDuck, thank you! I was basically on a fishing expedition. The goal, and the question, is this: these characters are basically bytes, numbers. So I wanted to take in a specific regex and total the limits with the numeric value of the string it could produce. I don't know what base-10 numbers correlate to what character value. I basically want to go `[(1) + \w(2) + \w(2) + \w(2) + ](1) = 8`. Given the type, I could then generate a non-numeric Latin character based on the numeric keying of `\w`. Should the total switch, then re-evaluate the numeric keys. – yugely Jun 29 '21 at 01:32
  • @MooingDuck I put in a lot of your feedback! I realized what I wanted; I just didn't quite know how to phrase it. I was looking at ASCII number ranges, like 1-127 for printable characters. So the numeric ranges of the Latin alphabet are like 60+26 = A-Z and 87+26 = a-z. So I wanted to use rand()%26+60 to generate a string character (A-Z). I was hoping there is some list of byte values and the characters they correlate with. Thank you again for your patience, I'm learning! – yugely Jul 17 '21 at 14:57
  • The general equation is `(rand() % (max-min))+min`, so if you want random latin characters then `(rand() % ('z'-'a'))+'a'` will give you a random character. Don't assume that `a` is 87, because it's only 87 _on some machines_, and it's also unreadable. Just use `'a'`. – Mooing Duck Jul 19 '21 at 15:21
  • (Sidenote, printable ascii is 32-126, not 1-127) – Mooing Duck Jul 19 '21 at 15:23
  • @MooingDuck that is exactly what I wanted to know. That's really good information, thank you. This helps me a ton. – yugely Jul 19 '21 at 18:52
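
The range advice in the comments can be sketched with the standard `<random>` header instead of `rand()`. This is a minimal illustration, not a definitive implementation: `randomCharInRange` and `randomString` are hypothetical names, and the code assumes the characters between the two bounds are contiguous, which holds for `'a'..'z'` and `'A'..'Z'` in ASCII (where `'A'` is 65 and `'a'` is 97) but is not guaranteed by the C++ standard.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>

// Hypothetical helper: one random character in the inclusive range [lo, hi].
// Assumption: the characters from lo to hi are contiguous (true in ASCII
// for 'a'..'z' and 'A'..'Z').
char randomCharInRange(char lo, char hi, std::mt19937& rng) {
    assert(lo <= hi);
    std::uniform_int_distribution<int> pick(lo, hi); // both ends included
    return static_cast<char>(pick(rng));
}

// Hypothetical helper: a string of `length` random characters from [lo, hi].
std::string randomString(std::size_t length, char lo, char hi, std::mt19937& rng) {
    std::string result;
    result.reserve(length);
    for (std::size_t i = 0; i < length; ++i) {
        result.push_back(randomCharInRange(lo, hi, rng));
    }
    return result;
}
```

A typical use would seed the generator once, for example `std::mt19937 rng(std::random_device{}());`, and then call `randomString(8, 'A', 'Z', rng)`. If `rand()` is kept instead, note that `rand() % (max - min) + min` never produces `max` itself; the inclusive form is `rand() % (max - min + 1) + min`.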

0 Answers