
I have e-mails in a text file and I need to sort them: the valid ones should go into one text file and the invalid ones into another.

import re

regex = re.compile(r'([A-Za-z0-9]+[.-_])*[A-Za-z0-9]+@[A-Za-z0-9-]+(\.[A-Z|a-z]{1,4})+')
outfile = open("good_email.txt", "a+")
incorrect_emails = open("incorrect_emails.txt", "a+")
    
def isValid(email):
    with open(email, 'r+') as file1:
        lines = file1.readlines()
        for line in lines:
            if re.fullmatch(regex, email):
              outfile.write(email)
              print("good")
            else:
                incorrect_emails.write(email)
                print("no good")
        
isValid("email-pack-1.txt")

My code is not working: it doesn't sort the e-mails.

Basior

2 Answers

0

I sorted the lines of the file before anything else, like this:

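import re

# separators are listed literally as [._-]; a hyphen in the middle, as in [.-_], would define a range
regex = re.compile(r'([A-Za-z0-9]+[._-])*[A-Za-z0-9]+@[A-Za-z0-9-]+(\.[A-Za-z]{1,4})+')


def isValid(email):
    # 'email' is the path of the input file; both output files are closed automatically
    with open(email, 'r') as file1, \
            open("good_email.txt", "a+") as outfile, \
            open("incorrect_emails.txt", "a+") as incorrect_emails:
        # strip the trailing newline so fullmatch sees only the address; skip blank lines
        lines = sorted(line.strip() for line in file1 if line.strip())
        for line in lines:
            if regex.fullmatch(line):
                outfile.write(line + "\n")
                print("good")
            else:
                incorrect_emails.write(line + "\n")
                print("no good")


isValid("email-pack-1.txt")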

Note that this way the whole file is sorted in memory, which should be fine as long as the file is not too large.

  • In the txt. I have e-mails and they are still not sorted and bar.txtbat.txt is entered in the "incorrect_emails" notebook :/ – Basior Jun 04 '22 at 04:13
0

Your utility for email validation:

import re


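# separators are listed literally as [._-]; a hyphen in the middle, as in [.-_], would define a range
_email_validator = re.compile(r'([A-Za-z0-9]+[._-])*[A-Za-z0-9]+@[A-Za-z0-9-]+(\.[A-Za-z]{1,4})+')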


def validate_email(email: str):
    return _email_validator.fullmatch(email)
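
A note on character classes in patterns like this one: inside square brackets, a hyphen between two characters (as in `[.-_]`) denotes a range, and `|` is an ordinary character rather than "or" (as in `[A-Z|a-z]`). The sketch below compares two variants of the pattern to show the effect. The names `LOOSE`, `STRICT` and `accepts` are made up for this illustration.

```python
import re

# Variant with a range class [.-_] (every character from '.' to '_')
# and a literal '|' inside the last class.
LOOSE = re.compile(r'([A-Za-z0-9]+[.-_])*[A-Za-z0-9]+@[A-Za-z0-9-]+(\.[A-Z|a-z]{1,4})+')

# Variant with the separators listed literally and no '|' in the class.
STRICT = re.compile(r'([A-Za-z0-9]+[._-])*[A-Za-z0-9]+@[A-Za-z0-9-]+(\.[A-Za-z]{1,4})+')


def accepts(pattern: re.Pattern, text: str) -> bool:
    """Return True if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


# '@' lies between '.' and '_' in ASCII, so the range class lets a second '@' through.
print(accepts(LOOSE, 'a@b@c.com'))
print(accepts(STRICT, 'a@b@c.com'))

# '-' (0x2D) sits just below '.' (0x2E), so the range class does not cover it.
print(accepts(LOOSE, 'first-last@foo.com'))
print(accepts(STRICT, 'first-last@foo.com'))
```

Putting the hyphen last (or first) in the class keeps it a literal hyphen.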

Also, I would recommend checking out https://pypi.org/project/email-validate/. It is a nice package for validating e-mails, so there is no need to write your own validator.


Your "store" utility functions:

from os import PathLike
from pathlib import Path
from typing import List, Union


# default location of your storage of valid emails
STORE_PATH = 'emails.valid.txt'


def get_email_store(path: Union[str, PathLike]):
    """
    Loads your store of emails or gives a default empty list.

    :param path: a path to your store of valid emails
    :return: list of valid emails
    """

    path = Path(path)  # accepts a str or any os.PathLike object

    if path.exists():
        return path.read_text(encoding='utf-8').splitlines()
    return []


def update_store(data: List[str], store_path: Union[str, PathLike] = None):
    """
    Utility function to help with updating your emails.

    :param data: a list of emails to update your store with
    :param store_path: your store resource file
    :return: None
    """

    if store_path is None:
        store_path = STORE_PATH
    store_path = Path(store_path)  # accepts a str or any os.PathLike object

    store = get_email_store(store_path)
    store += list(filter(validate_email, data))
    store.sort()

    store_path.write_text(
        '\n'.join(store),
        encoding='utf-8'
    )

I assumed that your files are utf-8 encoded.

Also, please correct me if I guessed wrong: I assumed that you would like your whole output file (here: emails.valid.txt) to be sorted.

If this is the case, you should load your email-strings in memory, add the new email sources to them, and save your new store.


Usage:

# your toy example sources
# you could also read your inputs from a file or files
valid_emails_src = [
    'foo@foo.com',
    'bar@bar.it',
    'fooz@fooz.fr',
    'bazz@bazz.en'
]
invalid_emails_src = [
    'foo.foo.com',
    'bar@bar',
    '@fooz.fr',
    'bazz.en'
]
emails_src = valid_emails_src + invalid_emails_src

update_store(emails_src)
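
If the inputs come from a file and the invalid addresses should be kept as well (as in the question), a minimal sketch could look like the following. It is self-contained, so it repeats a pattern instead of importing the validator. The names `EMAIL_RE`, `partition_emails` and `sort_email_file` are assumptions made up for this example, and utf-8 files are assumed again.

```python
import re
from pathlib import Path
from typing import Iterable, List, Tuple

# Pattern assumed for this sketch; the separators are listed literally as [._-].
EMAIL_RE = re.compile(r'([A-Za-z0-9]+[._-])*[A-Za-z0-9]+@[A-Za-z0-9-]+(\.[A-Za-z]{1,4})+')


def partition_emails(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split lines into sorted (valid, invalid) lists, ignoring blank lines."""
    valid: List[str] = []
    invalid: List[str] = []
    for raw in lines:
        email = raw.strip()  # drop the trailing newline before matching
        if not email:
            continue
        if EMAIL_RE.fullmatch(email):
            valid.append(email)
        else:
            invalid.append(email)
    return sorted(valid), sorted(invalid)


def sort_email_file(src: str, valid_path: str, invalid_path: str) -> None:
    """Read src and write the two sorted groups to separate files (overwriting them)."""
    lines = Path(src).read_text(encoding='utf-8').splitlines()
    valid, invalid = partition_emails(lines)
    Path(valid_path).write_text(''.join(e + '\n' for e in valid), encoding='utf-8')
    Path(invalid_path).write_text(''.join(e + '\n' for e in invalid), encoding='utf-8')
```

It could be called as `sort_email_file('email-pack-1.txt', 'good_email.txt', 'incorrect_emails.txt')`. Unlike `update_store`, this sketch overwrites the output files instead of merging with their existing content.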

Note: this is a toy example. When working with files, depending on the use case, you may want to lock them so that others do not accidentally change them while you are working with them.

See: Locking a file in Python
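
As a rough illustration of the locking idea, here is a sketch that uses the standard library's `fcntl.flock`. It assumes a POSIX system (the `fcntl` module is not available on Windows), and the lock is advisory, so it only protects against other processes that also take the lock. The helper name `locked_file` is made up for this example.

```python
import fcntl  # POSIX only
from contextlib import contextmanager


@contextmanager
def locked_file(path, mode='a+'):
    """Open a file and hold an exclusive advisory lock while it is in use."""
    with open(path, mode, encoding='utf-8') as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)  # blocks until the lock is acquired
        try:
            yield handle
        finally:
            handle.flush()  # make sure buffered writes land before unlocking
            fcntl.flock(handle, fcntl.LOCK_UN)
```

Usage would be along the lines of `with locked_file('emails.valid.txt') as fh: fh.write('foo@foo.com\n')`.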

skyzip