
This is my first post here, and I am new to Python. My program should take a JSON file and convert it to CSV. I have to check each field for validity. Any record that does not have all valid fields needs to be written to a separate file. My question is: how do I take an invalid data entry and save it to a text file? Currently, the program can check for validity, but I do not know how to extract the rows that are invalid.

import numpy as np
import pandas as pd
import logging
import re as regex
from validate_email import validate_email



# Variables for characters
passRegex = r"^(?!.*\s)(?=.*[A-Z])(?=.*[a-z])(?=.*\d).{8,50}$"
nameRegex = r"^[a-zA-Z0-9\s\-]{2,80}$"




# Read in json file to dataframe df variable
# Read in data as a string
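# dtype expects a bool or a column-to-dtype dict, not a set such as {'string'}
df = pd.read_json('j2.json', dtype={'account': str, 'userName': str, 'email': str, 'phone': str})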




# Find nan values and replace it with string
#df = df.replace(np.nan, 'Error.log', regex=True)


# Data validation check for columns
# na=False makes missing values count as invalid instead of producing NaN
df['accountValid'] = df['account'].str.contains(nameRegex, regex=True, na=False)
df['userNameValid'] = df['userName'].str.contains(nameRegex, regex=True, na=False)
df['valid_email'] = df['email'].apply(validate_email)
df['valid_number'] = df['phone'].apply(lambda x: len(str(x)) == 11)




# Prepend 86 to phone number column
df['phone'] = ('86' + df['phone'])

# Convert dataframe to csv file

df.to_csv('test.csv', index=False)

The JSON file I am using has thousands of rows. Thank you in advance!

  • Your question is unclear. What is your current output, and what is your expected output? Please [edit] to include a [mcve] so that we can better understand. See [How to make good pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – G. Anderson Oct 18 '22 at 23:24
  • Ok, sorry I was not clear. I have updated the information. Thank you for the help, I hope it is a little more clear now. – sodaapopped Oct 19 '22 at 01:39
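
One possible way to pull out the invalid rows, sketched under some assumptions: combine the per-column checks into a single boolean mask, then use that mask to split the dataframe into valid and invalid parts. The names `split_valid_invalid`, `save_split`, `failed_checks`, and the output file name in the usage comment are hypothetical, not from the question. `EMAIL_REGEX` is a deliberately simplified stand-in for `validate_email` so the sketch has no third-party dependency beyond pandas; swap the real call back in if you use it. The sample rows are made up for illustration.

```python
import re

import pandas as pd

NAME_REGEX = r"^[a-zA-Z0-9\s\-]{2,80}$"
# Assumption: simplified stand-in for validate_email(), for illustration only
EMAIL_REGEX = r"^[^@\s]+@[^@\s]+\.[^@\s]+$"


def is_match(pattern, value):
    """True only for strings that match the pattern; NaN/None count as invalid."""
    return isinstance(value, str) and re.fullmatch(pattern, value) is not None


def split_valid_invalid(df):
    """Return (valid_rows, invalid_rows) using one boolean column per check."""
    checks = pd.DataFrame(
        {
            "account": df["account"].apply(lambda v: is_match(NAME_REGEX, v)),
            "userName": df["userName"].apply(lambda v: is_match(NAME_REGEX, v)),
            "email": df["email"].apply(lambda v: is_match(EMAIL_REGEX, v)),
            "phone": df["phone"].apply(lambda v: len(str(v)) == 11),
        }
    )
    all_valid = checks.all(axis=1)  # a row is valid only if every check passed
    valid = df[all_valid]
    invalid = df[~all_valid].copy()
    # Record which checks failed, so the error file explains each rejection
    invalid["failed_checks"] = [
        ";".join(name for name, ok in zip(checks.columns, flags) if not ok)
        for flags in checks[~all_valid].to_numpy()
    ]
    return valid, invalid


def save_split(df, valid_path, invalid_path):
    """Write valid rows and invalid rows to two separate files."""
    valid, invalid = split_valid_invalid(df)
    valid.to_csv(valid_path, index=False)
    invalid.to_csv(invalid_path, index=False)
    return valid, invalid


# Made-up sample data standing in for j2.json
records = pd.DataFrame(
    [
        {"account": "acme-01", "userName": "Ann Lee",
         "email": "ann@example.com", "phone": "13800138000"},
        {"account": "x", "userName": "Bob",
         "email": "bob@example.com", "phone": "13800138001"},
        {"account": "beta", "userName": "Cy",
         "email": "not-an-email", "phone": "12345"},
    ]
)
valid, invalid = split_valid_invalid(records)

# Usage with real files (paths are placeholders):
# save_split(df, "test.csv", "invalid_records.txt")
```

Keeping the checks in their own dataframe means the original columns stay untouched, and the `failed_checks` column tells you why each row was rejected. If you still want to prepend `86` to the phone numbers, that could be done on `valid` only, after the split.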
