
I'm getting data from a database, and I suspect every column in it is stored as a string rather than as float, int, etc. When I import the data into a pandas DataFrame, everything shows up as a string.

print(products.dtypes)
product_category_name         object
product_description_lenght    object
product_height_cm             object
product_id                    object
product_length_cm             object
product_name_lenght           object
product_photos_qty            object
product_weight_g              object
product_width_cm              object
dtype: object

or

print(products.applymap(type))

Results in:

product_category_name product_description_lenght product_height_cm  \
0             <class 'str'>              <class 'str'>     <class 'str'>   
1             <class 'str'>              <class 'str'>     <class 'str'>   
2             <class 'str'>              <class 'str'>     <class 'str'>   
3             <class 'str'>              <class 'str'>     <class 'str'>   
4             <class 'str'>              <class 'str'>     <class 'str'>   
...                     ...                        ...               ...   
32946         <class 'str'>              <class 'str'>     <class 'str'>   
32947         <class 'str'>              <class 'str'>     <class 'str'>   
32948         <class 'str'>              <class 'str'>     <class 'str'>   
32949         <class 'str'>              <class 'str'>     <class 'str'>   
32950         <class 'str'>              <class 'str'>     <class 'str'>   

          product_id product_length_cm product_name_lenght product_photos_qty  \
0      <class 'str'>     <class 'str'>       <class 'str'>      <class 'str'>   
1      <class 'str'>     <class 'str'>       <class 'str'>      <class 'str'>   
2      <class 'str'>     <class 'str'>       <class 'str'>      <class 'str'>   
3      <class 'str'>     <class 'str'>       <class 'str'>      <class 'str'>   
4      <class 'str'>     <class 'str'>       <class 'str'>      <class 'str'>   
...              ...               ...                 ...                ...   
32946  <class 'str'>     <class 'str'>       <class 'str'>      <class 'str'>   
32947  <class 'str'>     <class 'str'>       <class 'str'>      <class 'str'>   
32948  <class 'str'>     <class 'str'>       <class 'str'>      <class 'str'>   
32949  <class 'str'>     <class 'str'>       <class 'str'>      <class 'str'>   
32950  <class 'str'>     <class 'str'>       <class 'str'>      <class 'str'>   

      product_weight_g product_width_cm  
0        <class 'str'>    <class 'str'>  
1        <class 'str'>    <class 'str'>  
2        <class 'str'>    <class 'str'>  
3        <class 'str'>    <class 'str'>  
4        <class 'str'>    <class 'str'>  
...                ...              ...  
32946    <class 'str'>    <class 'str'>  
32947    <class 'str'>    <class 'str'>  
32948    <class 'str'>    <class 'str'>  
32949    <class 'str'>    <class 'str'>  
32950    <class 'str'>    <class 'str'>  

[32951 rows x 9 columns]

When I look at the data, there are definitely numeric values in it. I tried taking a column and adding 1 to it, to no avail:

products['test'] = products['product_description_lenght'] + 1
TypeError: can only concatenate str (not "int") to str

I've tried str.isnumeric, but everything shows up as non-numeric.

Is there anything I can do to detect numeric values?

Lostsoul

1 Answer


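Try the following:

import numbers
import pandas as pd
# Convert what can be converted, then add 1 only to values that are actually numbers.
products['test'] = pd.to_numeric(products['product_description_lenght'], errors='ignore').apply(lambda x: x + 1 if isinstance(x, numbers.Number) else x)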
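One caveat: with errors='ignore', pd.to_numeric hands back the input unchanged if any single value fails to parse, and that option is deprecated in recent pandas releases. If the goal is to detect which values are numeric, a sketch along the following lines with errors='coerce' may be closer to what is wanted. The sample rows below are made up for illustration and only stand in for the real products table; the variable names lengths, is_numeric and numeric_cols are likewise just placeholders.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real `products` table.
products = pd.DataFrame({
    "product_category_name": ["toys", "garden", "books"],
    "product_description_lenght": ["287", "1024.0", "unknown"],
    "product_weight_g": ["225", "1000", "154"],
})

# Find the columns in which every value parses as a number.
# errors="coerce" turns anything unparseable into NaN instead of raising.
numeric_cols = [
    col for col in products.columns
    if pd.to_numeric(products[col], errors="coerce").notna().all()
]

# Per-value detection on a single column.
lengths = pd.to_numeric(products["product_description_lenght"], errors="coerce")
is_numeric = lengths.notna()  # True where the string held a number

# Arithmetic now works; rows that were not numeric stay NaN.
products["test"] = lengths + 1

# Convert the fully numeric columns to a real numeric dtype.
for col in numeric_cols:
    products[col] = pd.to_numeric(products[col])
```

Note that this treats values that were already missing the same as values that failed to parse, so if the source has genuine nulls it may be worth checking products[col].isna() before converting.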
U13-Forward