
I have multiple CSVs which I need to import into MangoDB. These CSVs have a dot in the header, which fails when I insert them into MangoDB; apparently dots are not allowed in keys. How can I remove the dots? I cannot modify the CSV files themselves, as they are loaded at runtime.

  import pandas as pd

  df = pd.read_csv(filepath)  # CSV file which you want to import
  records_ = df.to_dict(orient='records')
  print(records_)
  result = db.matchstats.insert_many(records_)
SuperStormer
gary rizzo
  • [Pandas has excellent documentation](http://pandas.pydata.org/pandas-docs/stable/text.html#splitting-and-replacing-strings) - and methods to replace/transform strings. If you search I imagine there are a number of SO Q&A's regarding removing or replacing substrings in Pandas DataFrames or Series. – wwii May 26 '18 at 15:31
  • *MangoDB* == MongoDB? – patrick May 26 '18 at 15:37

1 Answer


You are converting your data to a dictionary here:

records_ = df.to_dict(orient = 'records')

So you can apply a dictionary comprehension to rewrite the keys (I am assuming you want to operate on the dictionary keys, as per your question), using a regex to strip the dots from each key.

For example (note that this assumes Py 3):

import re
records_ = {re.sub(r"\.+", "", k):v for k,v in records_.items()} # replace one or more stops using re.sub

However, simply removing dots is risky: two distinct headers can collapse into the same key (for example `a.b` and `ab`), in which case one value silently overwrites the other. You might reconsider the approach based on your data.
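To make the duplicate-key risk concrete, here is a minimal sketch of a check you could run on the header names before inserting anything. The helper name `find_key_collisions` and the sample headers are made up for illustration; they are not from the question.

```python
def find_key_collisions(columns, replacement=""):
    """Group original names by their cleaned form and keep only the clashes.

    Returns {cleaned_name: [original names]} for every cleaned name that
    more than one original header maps to.
    """
    grouped = {}
    for name in columns:
        cleaned = str(name).replace(".", replacement)
        grouped.setdefault(cleaned, []).append(name)
    return {new: olds for new, olds in grouped.items() if len(olds) > 1}


# Hypothetical header names, for illustration only
headers = ["goals.home", "goalshome", "team.name"]
collisions = find_key_collisions(headers)
print(collisions)
```

If the returned dictionary is non-empty, replacing the dot with a visible separator such as `_` instead of an empty string may avoid the clash, though that depends on your data as well.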

patrick
  • Hey, thanks for your answer. I'm now getting the error `'list' object has no attribute 'items'`. – gary rizzo May 27 '18 at 08:20
  • Ah, indeed: `to_dict` returns a list when you use the 'records' setting. Can you do a `print(records_[0])` and post the result here, indicating which item you want to convert? – patrick May 28 '18 at 14:39
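
Following up on the comments above: since `to_dict(orient='records')` returns a list of dictionaries (one per row), the comprehension has to run over each dictionary in that list rather than over the list itself. A minimal sketch, assuming every key is a string or can be converted to one; the function name `strip_dots_from_keys`, the `_` replacement, and the sample rows are illustrative and not taken from the question:

```python
def strip_dots_from_keys(records, replacement="_"):
    """Return a new list of dicts with dots in the keys replaced.

    `records` is expected to be a list of dicts, as produced by
    DataFrame.to_dict(orient='records').
    """
    return [
        {str(key).replace(".", replacement): value for key, value in record.items()}
        for record in records
    ]


# Hypothetical rows standing in for the real CSV data
records_ = [
    {"home.goals": 2, "away.goals": 1},
    {"home.goals": 0, "away.goals": 3},
]
cleaned = strip_dots_from_keys(records_)
print(cleaned[0])
```

The cleaned list could then be passed to `insert_many` in place of `records_`. Another option, probably simpler, is to rename the columns on the DataFrame before converting it, for example with `df.columns = df.columns.str.replace(".", "_", regex=False)`, so that the records never contain dotted keys in the first place.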