I have a pandas data frame that consists of special/vanity numbers.

import pandas as pd

numbers = [539249751, 530246444, 539246655, 539209759, 538849098]

# Create the pandas DataFrame with an explicit column name
vanity_class = pd.DataFrame(numbers, columns=['MNM_MOBILE_NUMBER'])

I would like to add a column to classify each number based on its pattern using regex.

I have written a function that iterates through the column MNM_MOBILE_NUMBER, identifies the pattern of each number using regex, and then creates a new column MNC_New_Class with the relevant classification.

def vanity_def(vanity_class):
    if vanity_class.MNM_MOBILE_NUMBER.astype(str).str.match(r'^5(\d)\1{7}') | \
            vanity_class.MNM_MOBILE_NUMBER.astype(str).str.match(r'^5(?!(\d)\1)\d(\d)\2{6}$') | \
            vanity_class.MNM_MOBILE_NUMBER.astype(str).str.match(r'.{2}(?!(\d)\1)\d(\d)\2{5}$') | \
            vanity_class.MNM_MOBILE_NUMBER.astype(str).str.match(r'^\d*(\d)(\d)(?:\1\2){3}\d*$') | \
            vanity_class.MNM_MOBILE_NUMBER.astype(str).str.match(r'^5((\d)\2{3})((\d)\4{3})$') | \
            vanity_class.MNM_MOBILE_NUMBER.astype(str).str.match(r'.{3}(1234567$)'):
        vanity_class['MNC_New_Class'] = 'Diamond'
    elif vanity_class.MNM_MOBILE_NUMBER.astype(str).str.match(r'.{3}(?!(\d)\1)\d(\d)\2{4}$') | \
             vanity_class.MNM_MOBILE_NUMBER.astype(str).str.match(r'^(?!(\d)\1)\d((\d)\3{6})(?!\3)\d$') | \
             vanity_class.MNM_MOBILE_NUMBER.astype(str).str.match(r'\d(\d)\1(\d)\2(\d)\3(\d)\4'):     
        vanity_class['MNC_New_Class'] = 'Gold'
    else:
        vanity_class['MNC_New_Class'] = 'Non Classified'

Then, I wrote this line of code to apply the function to the column.

vanity_class['MNC_New_Class']  = vanity_class['MNM_MOBILE_NUMBER'].apply(vanity_def)

However, I keep getting this error:

AttributeError: 'int' object has no attribute 'MNM_MOBILE_NUMBER'

Any advice on how to avoid this error?

Thank you

Leena
  • `vanity_class` is an `int`, so this fails: `vanity_class.MNM_MOBILE_NUMBER`. Make sure it's of the correct type, you're passing the wrong object to the function. – Óscar López Oct 05 '22 at 12:50
  • I have tried different types (str, object and float), but I keep getting the same error – Leena Oct 05 '22 at 12:52
  • Then you should pass an object of type vanity_class :) (or a data frame that actually contains that attribute) no other type will have the MNM_MOBILE_NUMBER attribute. Perhaps you should review some basic object-oriented and/or pandas programming concepts, and make sure you're creating and passing an object of the correct type. – Óscar López Oct 05 '22 at 12:55
  • vanity_class is the dataframe – Leena Oct 05 '22 at 12:58
  • Pass the actual data frame, then. You seem to be passing something else, an int. How about using `vanity_def(vanity_class)` instead? – Óscar López Oct 05 '22 at 13:00
  • `vanity_def` is the function name – Leena Oct 05 '22 at 13:01

1 Answer

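When you call apply() on a single column (a pandas Series), pandas calls your function once per element, so the function receives one value from the selected column (here an int), not the data frame itself. That is why `vanity_class.MNM_MOBILE_NUMBER` fails inside the function. You should rewrite your code accordingly: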

def vanity_def(mnm_mobile_number):  # parameter is a single int, not the data frame
    # Classify the number here and return the new value;
    # do the column assignment outside of this function.
    ...
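
To make that concrete, here is a minimal sketch of how the function could look when it works on one value at a time. It reuses the regexes from the question unchanged; the names `DIAMOND_PATTERNS` and `GOLD_PATTERNS` are my own, and I am assuming `re.match` is an acceptable stand-in for `Series.str.match`, since both anchor at the start of the string.

```python
import re

import pandas as pd

# Patterns copied from the question; the list names are illustrative.
DIAMOND_PATTERNS = [re.compile(p) for p in (
    r'^5(\d)\1{7}',
    r'^5(?!(\d)\1)\d(\d)\2{6}$',
    r'.{2}(?!(\d)\1)\d(\d)\2{5}$',
    r'^\d*(\d)(\d)(?:\1\2){3}\d*$',
    r'^5((\d)\2{3})((\d)\4{3})$',
    r'.{3}(1234567$)',
)]
GOLD_PATTERNS = [re.compile(p) for p in (
    r'.{3}(?!(\d)\1)\d(\d)\2{4}$',
    r'^(?!(\d)\1)\d((\d)\3{6})(?!\3)\d$',
    r'\d(\d)\1(\d)\2(\d)\3(\d)\4',
)]


def vanity_def(mnm_mobile_number):
    """Return the class label for a single mobile number."""
    text = str(mnm_mobile_number)  # apply() hands over one int per call
    if any(p.match(text) for p in DIAMOND_PATTERNS):
        return 'Diamond'
    if any(p.match(text) for p in GOLD_PATTERNS):
        return 'Gold'
    return 'Non Classified'


numbers = [539249751, 530246444, 539246655, 539209759, 538849098]
vanity_class = pd.DataFrame(numbers, columns=['MNM_MOBILE_NUMBER'])

# The assignment happens out here, using the values the function returns.
vanity_class['MNC_New_Class'] = vanity_class['MNM_MOBILE_NUMBER'].apply(vanity_def)
```

Because each call now deals with a plain string and returns a plain label, the `if`/`elif` chain only ever sees ordinary Python booleans. Note that none of the five sample numbers matches any of the patterns, so they all come out as 'Non Classified'.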
Óscar López
  • Thank you for your help Oscar. I am now getting this error `ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().` I tried replacing `|` with `or` but the error still shows – Leena Oct 05 '22 at 13:15
  • @Leena it seems I was mistaken, `|` is the proper way to combine boolean conditions in Pandas, see [this question](https://stackoverflow.com/q/36921951/201359). It's not clear what you want to do, or how the code looks after the fixes. We should close this question (please mark it as resolved, click on the check mark to the left of my answer) and create a new question with the new problem; I believe the original problem was solved with this answer. – Óscar López Oct 05 '22 at 15:38
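
For anyone landing here with the same `ValueError`: it appears when an `if` is given a whole boolean Series, which is what `str.match` returns when it is called on a column, because `if` needs a single True/False. One way around it, sketched below under my own assumptions, is to keep the work column-wise and let `numpy.select` choose the label per row instead of using `if`/`elif`. The helper names `matches_any` and `classify` are hypothetical; the patterns are the ones from the question.

```python
import numpy as np
import pandas as pd

# Patterns copied from the question; the list names are illustrative.
DIAMOND = [
    r'^5(\d)\1{7}',
    r'^5(?!(\d)\1)\d(\d)\2{6}$',
    r'.{2}(?!(\d)\1)\d(\d)\2{5}$',
    r'^\d*(\d)(\d)(?:\1\2){3}\d*$',
    r'^5((\d)\2{3})((\d)\4{3})$',
    r'.{3}(1234567$)',
]
GOLD = [
    r'.{3}(?!(\d)\1)\d(\d)\2{4}$',
    r'^(?!(\d)\1)\d((\d)\3{6})(?!\3)\d$',
    r'\d(\d)\1(\d)\2(\d)\3(\d)\4',
]


def matches_any(text, patterns):
    """Boolean Series: True where the string matches at least one pattern."""
    mask = pd.Series(False, index=text.index)
    for pattern in patterns:
        mask = mask | text.str.match(pattern).astype(bool)
    return mask


def classify(numbers):
    """Label every number in a Series; earlier conditions take priority."""
    text = numbers.astype(str)
    conditions = [
        matches_any(text, DIAMOND).to_numpy(),
        matches_any(text, GOLD).to_numpy(),
    ]
    return np.select(conditions, ['Diamond', 'Gold'], default='Non Classified')


numbers = [539249751, 530246444, 539246655, 539209759, 538849098]
vanity_class = pd.DataFrame(numbers, columns=['MNM_MOBILE_NUMBER'])
vanity_class['MNC_New_Class'] = classify(vanity_class['MNM_MOBILE_NUMBER'])
```

`np.select` takes the first condition that is true for each row, so a number that matches both a Diamond and a Gold pattern is labelled 'Diamond', which mirrors the order of the original `if`/`elif`.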