
I am trying to use the petl library to build an ETL process that copies data between two tables. The destination table has a unique slug field. To handle that, I wrote my script to identify duplicate slugs and make them unique by appending the ID to the slug value.

    import petl as etl

    table = etl.fromdb(source_con, 'SELECT * FROM user')
    # select the rows whose slug occurs more than once
    duplicates = etl.duplicates(table, 'slug')
    # rewrite the slug of each duplicate row, one id at a time
    for dup in list(duplicates.values('id')):
        table = etl.convert(
            table,
            'slug',
            lambda v, row: '{}-{}'.format(slugify_unicode(v), str(row.id).encode('hex')),
            where=lambda row: row.id == dup,
            pass_row=True
        )

The above did not work as expected: the table object still seems to contain duplicate slug values after the loop.
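One possible explanation, offered as a guess rather than a confirmed diagnosis: petl transformations are evaluated lazily, and the `where` lambda looks up `dup` only when the table is finally iterated. By then the loop has finished, so every stacked `convert` would compare against the last id. The sketch below reproduces that effect in plain Python without petl. `lazy_convert`, `add_suffix`, and the sample rows are hypothetical stand-ins, not petl APIs.

```python
def lazy_convert(rows, func, where):
    """Hypothetical stand-in for a lazy transformation: runs only on iteration."""
    return (func(row) if where(row) else row for row in rows)


def add_suffix(row):
    """Append the row id to the slug."""
    return {'id': row['id'], 'slug': '{}-{}'.format(row['slug'], row['id'])}


def convert_late_binding(rows, duplicate_ids):
    table = rows
    for dup in duplicate_ids:
        # `dup` is looked up when the lambda is called, i.e. after the loop ended.
        table = lazy_convert(table, add_suffix, where=lambda row: row['id'] == dup)
    return list(table)


def convert_early_binding(rows, duplicate_ids):
    table = rows
    for dup in duplicate_ids:
        # The default argument freezes the current value of `dup` per iteration.
        table = lazy_convert(table, add_suffix,
                             where=lambda row, dup=dup: row['id'] == dup)
    return list(table)


# Made-up sample data: three rows share the slug 'john'.
rows = [
    {'id': 1, 'slug': 'john'},
    {'id': 2, 'slug': 'john'},
    {'id': 3, 'slug': 'john'},
    {'id': 4, 'slug': 'mary'},
]
duplicate_ids = [1, 2, 3]

late_slugs = [r['slug'] for r in convert_late_binding(rows, duplicate_ids)]
early_slugs = [r['slug'] for r in convert_early_binding(rows, duplicate_ids)]

print(late_slugs)   # ['john', 'john', 'john-3-3-3', 'mary']
print(early_slugs)  # ['john-1', 'john-2', 'john-3', 'mary']
```

If this is what happens here, binding the value with `where=lambda row, dup=dup: row.id == dup`, or doing a single `convert` whose `where` tests membership in a set of duplicate ids, might avoid it.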

Can anyone advise? Thanks.

Mo J. Mughrabi
