I am working on a data set where the values in PLU column are scattered all over the place like: I have 4 columns out of more than 500:
Inventory_No | Description | Group | PLU
----------------------------------------------
93120007 | Coke |Drinks | 1000
93120008 | Diet Coke |Drinks | 1003
93120009 | Coke Zero |Drinks | 1104
93120010 | Fanta |Drinks | 1105
93120011 | White Bread |Bread | 93120011
93120012 | whole Meal |Bread | 93120012
93120013 | Whole Grains|Bread | 110011
93120014 | Flat white |Breads | 1115092
I want my output to be like below that if there is any value with the length more than 6 digits in PLU column, system checks the next available number in sequence of PLU with less than 4 digits in length and add an increment of 1 in it and assign the PLU value to that row and does not change any existing less than 6 digits PLU's:
Inventory_No | Description | Group | PLU
----------------------------------------------
93120007 | Coke |Drinks | 1000
93120011 | White Bread |Bread | 1001
93120012 | whole Meal |Bread | 1002
93120008 | Diet Coke |Drinks | 1003
93120014 | Flat white |Breads | 1004
. | . | . | .
. | . | . | .
. | . | . | .
93120009 | Coke Zero |Drinks | 1104
93120010 | Fanta |Drinks | 1105
93120013 | Whole Grains|Bread | 110011
I want the next available value in the sequence with less than 6 digits and increment it by 1 and if it finds a sequence for any number of incremental values then skip the sequence and start with next available value after the sequence as long as the sequence is less than 6 digits long:
I have checked the below links and they are towards filling the sequence with 0 or Nan values
fill-in-a-missing-values-in-range-with-pandas
missing-data-insert-rows-in-pandas-and-fill-with-nan
Thanks in advance for your answers. Regards,