I am trying to use the petl library to build an ETL process that copies data between two tables. The destination table has a unique constraint on its slug field, so I wrote my script to identify duplicate slugs and make them unique by appending the row id to the slug value.
import petl as etl
from slugify import slugify_unicode  # from the awesome-slugify package

table = etl.fromdb(source_con, 'SELECT * FROM user')

# get the ids of rows whose slug occurs more than once
duplicates = etl.duplicates(table, 'slug')

for dup in duplicates.values('id'):
    table = etl.convert(
        table,
        'slug',
        lambda v, row: '{}-{}'.format(slugify_unicode(v), str(row.id).encode('hex')),
        where=lambda row: row.id == dup,
        pass_row=True
    )
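To make the goal concrete, here is a plain-Python sketch of the outcome I am after, using hypothetical sample rows (in the real script the slug also goes through slugify_unicode and the id is hex-encoded; both are omitted here for brevity):

```python
from collections import Counter

# Hypothetical sample data; the real rows come from the `user` table.
rows = [
    {'id': 1, 'slug': 'john'},
    {'id': 2, 'slug': 'john'},  # duplicate slug
    {'id': 3, 'slug': 'mary'},
]

def dedupe(rows):
    """Append the row id to any slug that occurs more than once."""
    counts = Counter(r['slug'] for r in rows)
    return [
        {**r, 'slug': '{}-{}'.format(r['slug'], r['id'])}
        if counts[r['slug']] > 1 else dict(r)
        for r in rows
    ]

# dedupe(rows) should give slugs: 'john-1', 'john-2', 'mary'
```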
The above did not work as expected: the table object still contains duplicate slug values after the loop. Can anyone advise? Thanks.