0

I am quite new to Python, I tried to find an answer but nothing I tried seems to be working. And the most of the answers are provided when the whole data in JSON format

Through PYODBC I use the following code to retrieve data

formula = """select  id, type, custbody_attachment_1 from transaction  """
lineitem = pd.read_sql_query(formula, cnxn)

It gives me something like the following

+-------------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Internal_ID | Type | Formula_Text                                                                                                                                                                                           |
+-------------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 2895531     | Bill |                                                                                                                                                                                                        |
+-------------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 3492009     | Bill | [{"FL":"https://.app.netsuite.com/core/media/media.nl?id=someLinkToTheFile0","NM":"someFileName0"}]                                                                                                    |
+-------------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 3529162     | Bill | [{"FL":"5https://.app.netsuite.com/core/media/media.nl?id=someLinkToTheFile1","NM":"someFileName1"},{"FL":"https://.app.netsuite.com/core/media/media.nl?id=someLinkToTheFile2","NM":"someFileName2"}] |
+-------------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I need the output like this. (There might be more than 2 links in the cell.)

+-------------+------+---------------------------------------------------------------------+---------------+
| Internal_ID | Type | FL                                                                  | NM            |
+-------------+------+---------------------------------------------------------------------+---------------+
| 2895531     | Bill |                                                                     |               |
+-------------+------+---------------------------------------------------------------------+---------------+
| 3492009     | Bill | https://.app.netsuite.com/core/media/media.nl?id=someLinkToTheFile0 | someFileName0 |
+-------------+------+---------------------------------------------------------------------+---------------+
| 3529162     | Bill | https://.app.netsuite.com/core/media/media.nl?id=someLinkToTheFile1 | someFileName1 |
+-------------+------+---------------------------------------------------------------------+---------------+
| 3529162     | Bill | https://.app.netsuite.com/core/media/media.nl?id=someLinkToTheFile2 | someFileName2 |
+-------------+------+---------------------------------------------------------------------+---------------+

I tried to play with JSON but there were one problem after another(because it seemed like JSON data to me). In the end I run

print(lineitem['custbody_attachment_1'])

and got the following in Python console

999    [{"FL":"https://4811553.app.netsuite.com/core/...
Name: custbody_attachment_1, Length: 1000, dtype: object

So, I have no idea how to transform this so I could create new rows

Nimantha
  • 6,405
  • 6
  • 28
  • 69
Ilya.s
  • 13
  • 3
  • 1
    Does this answer your question? [How to unnest (explode) a column in a pandas DataFrame?](https://stackoverflow.com/questions/53218931/how-to-unnest-explode-a-column-in-a-pandas-dataframe) – buran Sep 21 '21 at 10:01
  • 1
    And then [Split / Explode a column of dictionaries into separate columns with pandas](https://stackoverflow.com/q/38231591/4046632) – buran Sep 21 '21 at 10:03

1 Answers1

1
df = df.explode('Formula_Text')
df = pd.concat([df.drop(['Formula_Text'], axis=1), df['Formula_Text'].apply(pd.Series)], axis=1)
print(df)
buran
  • 13,682
  • 10
  • 36
  • 61
  • that solved the problem `df = lineitem` `df['custbody_attachment_1']=df['custbody_attachment_1'].apply(json.loads)` `df = df.explode('custbody_attachment_1')` `df = pd.concat([df.drop(['custbody_attachment_1'], axis=1), df['custbody_attachment_1'].apply(pd.Series)], axis =1)` `print(df)` – Ilya.s Sep 21 '21 at 12:26
  • Oh, I overlooked it JSON and decided that it's already converted to list of dicts. And `df` is the common convention name for DataFrame, so just used it – buran Sep 21 '21 at 12:41