0
keywords_cap = ['SpA', 'SPA', 'LIMITADA', 'LTDA', 'S.A.']
keywords_cap = map(re.escape, keywords_cap)
keywords_cap.sort(key=len, reverse=True)
obj = re.compile(r'[:,;.]\s*"?([^:,;.]*?(?<!\w)(?:{}))'.format('|'.join(keywords_cap)))
m = obj.search(mensaje)
if m:
    company_name = ("COMPANY NAME: {}".format(m.group(1)))

regex = r"\s*CVE\s+([^|]*)"
matches = re.search(regex, mensaje)
if matches:
    company_cve = ("CVE: %s" % matches.group(1).strip())

dic = [company_name, company_cve, "example", "another"]  #dictionary

create_csv = "exampleCsv.csv" 
csv = open(create_csv, "w") 

columnTitleRow = "name, cve\n"
csv.write(columnTitleRow)

for key in dic.keys():
 company_name = key
 company_cve = dic[key]
 row = company_name + "," + company_cve + "\n"
 csv.write(row)

I have a lot of regular expressions inside a for. Here an example with the variable CVE found in a text file. I would like to be able to write the result of all the variables found in the text in a csv separated by commas.

INPUT: Variables company_name and company_cve (and more extracted from regular expressions...)

OUTPUT: CVE with header the name of the variables and below another row with the result of variable company_name, company_cve, and others variables.

Anna Castan
  • 89
  • 1
  • 7

1 Answers1

1

It's not totally clear from the problem statement, but I think you are collecting data about a company (or many companies) and then you want to print that data out in a CSV file.

Firstly, a couple of issues with your existing code:

dic = [company_name, company_cve, "example", "another"]  #dictionary

This is a list, not a dictionary. A dictionary has a key-value structure, and would be declared like this:

dic = { 'name': company_name, 'cve': company_cve, 'example': 'example data' }

Secondly, it's better to use with when opening a file to forestall potential file opening errors:

with open('exampleCsv.csv', 'w') as csvfile:
    # write to the file

Thirdly, if you use the existing csv library, you don't have to worry about cases where (e.g.) company names or other file data has commas in it; csv will handle that for you. See the csv library documentation for full details. You will need to import the cvs library at the start of your script:

import csv
import re

and create a csv.writer object to output your data:

with open('exampleCsv.csv', 'w') as csvfile:
    writer = csv.writer(csvfile, delimiter=',', quotechar='"') # set your CSV options here

Depending on what data structure you are using to hold the company data, you can print it out in different ways.

# list of headers
headers = [ 'company_name', 'company_cve', 'prop_1', 'prop_2' ]

# list of variables holding company data
dic = [ company_name, company_cve, example_1, example_2 ]

with open('exampleCsv.csv', 'w') as csvfile:
    writer = csv.writer(csvfile, delimiter=',', quotechar='"')
    writer.writerow( headers )  # print the header row
    writer.writerow( dic )

Or if you created a dictionary containing your company data:

company = {
    'company_name': 'Company A',
    'company_cve': 'Some value',
    'prop_1': 'value 1',
    'prop_2': 'value 2'
}
# note that dictionary keys match the headers
headers = [ 'company_name', 'company_cve', 'prop_1', 'prop_2' ]

with open('exampleCsv.csv', 'w') as csvfile:
    writer = csv.writer(csvfile, delimiter=',', quotechar='"')
    writer.writerow( headers )
    writer.writerow( map( lambda x: company[x], headers ) )

or a list of dictionaries for many companies:

company_data = [{
    'company_name': 'Company A',
    'company_cve': 'Some value',
    'prop_1': 'value 1',
    'prop_2': 'value 2'
},{
    'company_name': 'Company B',
    'company_cve': 'A different value',
    'prop_1': 'value 3',
    'prop_2': 'value 4'

}]

headers = [ 'company_name', 'company_cve', 'prop_1', 'prop_2' ]

with open('exampleCsv.csv', 'w') as csvfile:
    writer = csv.writer(csvfile, delimiter=',', quotechar='"')
    writer.writerow( headers )
    for c in company_data:
        writer.writerow( map( lambda x: c[x], headers ) )
i alarmed alien
  • 9,412
  • 3
  • 27
  • 40