0

Hello guys i want to ask, why are my punctuations on my domain name generator not random enough? here is my code

import random

Example = open("dictionaries/examples.txt", 'w', encoding="utf-8")
with open("dictionaries/Combined.txt") as i:
    Test = [line.rstrip() for line in i]

This is a few words what inside the combined text file

 Example ['login', 'confirm', 'signup', 'confirmation', 'enroll', 'mobile', 'access',
'claim', 'service', 'group', 'recovery', 'support', 'find', 'confirmation']

delimeters = ['', '-', '.']
web = 'web'
subdomain_count2 = 2
subdomain_count3 = 3
subdomain_count4 = 4
suffix = ['co.id', 'info', 'id', 'com']
output = []
for i in range(100):
    outputs = []
    for j in range(subdomain_count2):
        outputs.append(random.choice(Test))
    data = outputs + [web]
    random.shuffle(data)
    for k in range(subdomain_count3):
        outputs.append(random.choice(Test))
    data2 = outputs + [web]
    random.shuffle(data2)
    for l in range(subdomain_count4):
        outputs.append(random.choice(Test))
    data3 = outputs + [web]
    random.shuffle(data3)
    prefix = random.choice(delimeters).join(data)
    prefix2 = random.choice(delimeters).join(data2)
    prefix3 = random.choice(delimeters).join(data3)
    out = random.choice(delimeters).join([prefix])
    out2 = random.choice(delimeters).join([prefix2])
    out3 = random.choice(delimeters).join([prefix3])
    tld = random.choice(suffix)
    addr1 = '.'.join([out, tld])
    addr2 = '.'.join([out2, tld])
    addr3 = '.'.join([out3, tld])
    addr = (addr1 + '\n' + addr2 + '\n' + addr3 + '\n')
    output.append(addr)
    Example.write(addr)

with open("dictionaries/examples.txt") as f:
    websamples = [line.rstrip() for line in f]

This is the output that i get.

Output

As you can see from the image that the domain punctuation is pretty straightforward and has no variation for the domain name. it should've looked like this web-loginconfirm.co.id or web.login-confirm.co.id since I have made the delimiters = ['', '-', .] so there would be a combination where there are 3 characters connected with the punctuation like this web-confirm.group.com or confirm.web-group.com and there are also 3 characters that connects without the punction like this webclaimservice.com.

And also as you can see on the 2nd domain and 3rd domain, it somehow does not follow the subdomain number that I want based on this code line for example if subdomain_count3 = 3 there should've only been 3 subdomains. On the 2nd domain (excluding the web variable since it's a fixed one and I want it to always show up on every domain) signup-login-web-confirmation-confirm-enroll.co.id, there are 5 subdomains instead of 3 subdomains (signup, login, confirmation, confirm, enroll). Is there a reason why the delimiters are not random enough and also the subdomain count on the 2nd and the 3rd one has too many variables than i wanted?

Thank you!

EDIT:

Okay for the subdomain range one I have actually found the solution.

I basically added another output name for each subdomain

for i in range(5):
    outputs = []
    outputs2 = []
    outputs3 = []

Correct Output

No need to answer the 2nd question, i still need help of the first one

thanks!

Tsukuru
  • 65
  • 7
  • You are asking many questions here, so its not clear what you want, but firstly I think that you use: `outputs.append(...)` without using `outputs=[]` before the loop, so you retain the previous contents. Secondly you have: `random.choice(delimeters).join(data)` which randomly picks just one of the delimiters and uses that alone to join the parts in `data`. – quamrana Oct 28 '22 at 08:59
  • For the 2nd question there's no need to answer it, I have found the solution. but is still need help on the first one @quamrana – Tsukuru Oct 28 '22 at 09:02

1 Answers1

1

The domain punctuation varies between the three domains generated in the main loop, for example the three domain names with find in them one has a dot and the other has nothing. You are only choosing the delimiter once, and using that one delimiter to join all the elements in data:

prefix = random.choice(delimeters).join(data)

You are adding to the same output list for the second and third subdomains, either don't do that (in which case you can get rid of 2/3 of the code and just loop over three different values for the number of domains to add) or only add the difference between subdomain_count3 and subdomain_count2 subdomains.

You can also use random.choices to get a list of k choices rather than looping and calling random.choice repeatedly. Do that twice - once for the domain parts and once for the delimiters, then zip the lists.

import random
Test = ['login', 'confirm', 'signup', 'confirmation', 'enroll', 'mobile', 'access',
        'claim', 'service', 'group', 'recovery', 'support', 'find', 'confirmation']

delimiters = ['', '-', '.']
web = 'web'
suffix = ['co.id', 'info', 'id', 'com']
output = []

for i in range(100):
    for subdomain_count in [2, 3, 4]:
        data = [web] + random.choices(Test, k=subdomain_count)
        random.shuffle(data)
        delims = (random.choices(delimiters, k=subdomain_count) +
            ['.' + random.choice(suffix)])
        address = ''.join([a+b for a, b in zip(data, delims)])
        output.append(address)

for o in output:
    print(o)
Pete Kirkham
  • 48,893
  • 5
  • 92
  • 171
  • Oh wow, this one is much shorter and much simpler than the code that I made, i want to ask with this code line `address = ''.join([a+b for a, b in zip(data, delims)])` can you explain on how this one works? do the a and b represent data and delims? @Pete_Kirkham – Tsukuru Oct 28 '22 at 09:45
  • I also want to ask again, if i want to add `{'http://', 'https://']` in front of the subdomains what should i do? i tried putting it in this line `address = ''.join([a+b+c for a, b, c in zip(http, data, delims)])` it only outputs the h word and no the the whole word – Tsukuru Oct 28 '22 at 10:49
  • @Tsukuru Yes, `a` and `b` unpack the zipped data and delims elements. See also https://stackoverflow.com/questions/19560044/how-to-concatenate-element-wise-two-lists-in-python . To add a protocol, the easiest way is to just prepend it to the string at the end in the `address = ...` line; adding another parameter to zip will treat that as a sequence to be zipped not an extra element in the sequences. – Pete Kirkham Oct 28 '22 at 12:12