1

For a petl table, how do I replace empty values with zeros?

I would expect something like the following:

tb_probii = etl.fromcsv("data.csv").fill("score", "", 0)

Looked for a similar function here: http://petl.readthedocs.io/en/latest/_modules/petl/transform/fills.html

But no luck :/

Selah
  • 7,728
  • 9
  • 48
  • 60

2 Answers2

1

I don't know if this is the best way or not. I'm actually grateful to you for bringing the existence of petl to my attention.

>>> import petl
>>> tb_probii = petl.fromcsv('trial.csv')
>>> tb_probii
+------+-------+
| team | score |
+======+=======+
| 'A'  | ''    |
+------+-------+
| 'B'  | '25'  |
+------+-------+
| 'C'  | '35'  |
+------+-------+

>>> from collections import OrderedDict
>>> mappings = OrderedDict()
>>> def f(s):
...     if s == '':
...         return '0'
...     else:
...         return s
...     
>>> mappings['team'] = 'team'
>>> mappings['score'] = 'score', lambda s: f(s)
>>> tb_probii = petl.fieldmap(tb_probii, mappings)
>>> tb_probii 
+-------+------+
| score | team |
+=======+======+
| '0'   | 'A'  |
+-------+------+
| '25'  | 'B'  |
+-------+------+
| '35'  | 'C'  |
+-------+------+

Some explanation: fieldmap executes the collection of mappings contained in an OrderedDict. When I tried this out I did the mapping to a new table. That's why team is mapped identically to itself. This may be unnecessary if you are keeping the same table, although I somehow doubt it. Each mapping is a tuple. The one for score indicates that score is to be mapped to itself with a transformation. It appears to be necessary to use a lambda; however, lambdas can't include if statements. For that reason I created the function f for the lambda to call. I think the columns are re-ordered because the container is an OrderedDict and it's ordered lexicographically on the names of the columns. Perhaps it doesn't have to be an OrderedDict but this is what I found in the doc.

Bill Bell
  • 21,021
  • 5
  • 43
  • 58
  • 1
    Thanks for your answe! I'm glad I got to share petl with you! I'm on the hunt for good data wrangling tools. Considering to try out Blaze as well. If you have any other suggestions I'd love to hear. – Selah May 04 '17 at 15:03
1

I emailed the help group python-etl@googlegroups.com and the creator himself responded with a function that worked perfectly:

tb_probii = etl.fromcsv("data.csv").replace("score", "", 0)
Selah
  • 7,728
  • 9
  • 48
  • 60