
I imported a dataset from Excel that has just one column and many rows. The rows repeat the same kinds of information (name, phone, and title) in order, like this:

table
---------
0 name1
1 phone1
2 title1
3 name2
4 phone2
5 title2
6 name3
7 phone3
8 title3

I want to build a new table from this with three columns (name, phone, title) and extract that information into it, such as:

name phone title
name1 phone1 title1
name2 phone2 title2

and so on. How should I approach this problem? I'm using Python in a Jupyter Notebook.

To clarify: the names, phones, and titles are all different values. For example, names can be Sarah Kim, Andrew m. white, or Mike yesman, and phones can be 111-222-3333, 333-444-5555, and so on. I have more than 500 rows, so my first idea is to use regex to separate name, phone, and title. I am using a pandas DataFrame, and I want to learn how to approach problems like this rather than just get the code.

Sarah
  • What have you tried so far based on your own research? And what was your result? – G. Anderson Feb 11 '20 at 21:34
  • Does this answer your question? [Printing Lists as Tabular Data](https://stackoverflow.com/questions/9535954/printing-lists-as-tabular-data) – Jongware Feb 11 '20 at 21:36
  • @G.Anderson I don't even have many ideas of how to approach a problem like this. I'm not asking for code answers but more like how to think through and what (?) to consider. – Sarah Feb 11 '20 at 21:38
  • I think this is too broad/vague. See: [ask], [help/on-topic]. – AMC Feb 11 '20 at 21:40
  • If you want lined-up columns, then it follows that all the values in a given column need to be the same width. Use Python formatting to make each value in a column take up the same number of characters, padded with spaces where needed. – BoarGules Feb 11 '20 at 21:46

2 Answers


You can use pandas to create a basic table. Define the values for each column, in this case as a tuple of strings. Then pass them to pd.DataFrame in a dict whose keys become the column titles.

import pandas as pd

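# One tuple per column; all tuples must have the same length
names = ('name1', 'name2')
phones = ('phone1', 'phone2')
titles = ('title1', 'title2')

# Dict keys become the column titles, dict values become the column data
example = pd.DataFrame({
    "Names": names,
    "Phones": phones,
    "Titles": titles,
})
example  # as the last line of a Jupyter cell, this displays the table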

Output:

    Names   Phones  Titles
0   name1   phone1  title1
1   name2   phone2  title2
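
If the data is already sitting in a single pandas column, as in the question, one possible approach is to reshape that column rather than type the values out by hand. This is only a sketch: the `raw` frame below is a hypothetical stand-in for the imported Excel data, the column name `table` is taken from the question's sample, and it assumes every record occupies exactly three consecutive rows in name, phone, title order.

```python
import pandas as pd

# Hypothetical stand-in for the single-column data imported from Excel
raw = pd.DataFrame({"table": ["name1", "phone1", "title1",
                              "name2", "phone2", "title2",
                              "name3", "phone3", "title3"]})

# Assumption: each record is exactly 3 consecutive rows (name, phone, title).
# reshape(-1, 3) turns the flat column into rows of 3 values each and
# raises ValueError if the number of rows is not a multiple of 3.
values = raw["table"].to_numpy().reshape(-1, 3)

result = pd.DataFrame(values, columns=["name", "phone", "title"])
print(result)
```

If some records are missing a field, the fixed-width reshape no longer lines up, and the rows would need to be classified first (for example by pattern) before grouping.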
Lauren A.
  • However, I will have actual names, phone numbers, and titles in the table rather than simplified placeholders like name1, name2. Also, the number of rows is over 500. Is there a better way? I'm thinking of using regex, but I'm not sure if that's a smart way. – Sarah Feb 11 '20 at 21:37
  • He said that he imported that from Excel, so your answer has nothing to do with his question. – Michael Feb 11 '20 at 21:37

Here's a solution without using pandas (although pandas will likely be faster/more efficient):

data = ['name1', 'phone1', 'title1', 'name2', 'phone2', 'title2']

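print("Name Phone Title")
# zip receives the same iterator three times, so each tuple it yields
# consumes three consecutive items from data
for name, phone, title in zip(*[iter(data)]*3):
    print(name, phone, title)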

Result

Name Phone Title
name1 phone1 title1
name2 phone2 title2
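
Since the question mentions regex, here is a hedged sketch of how the same grouping could be combined with a sanity check. The names and phone numbers are the examples given in the question; the titles are placeholders, and the phone pattern (`PHONE_RE`) is an assumption based on the 111-222-3333 format, not something confirmed for the whole dataset. The idea is to group by position first, then use the pattern only to flag groups where the middle value does not look like a phone number.

```python
import re

# Sample values from the question; titles are placeholders
data = ["Sarah Kim", "111-222-3333", "title1",
        "Andrew m. white", "333-444-5555", "title2"]

# Assumed phone format: three digits, three digits, four digits
PHONE_RE = re.compile(r"\d{3}-\d{3}-\d{4}")

# Same grouping as above, written out: one shared iterator passed three times.
# Note that zip silently drops a trailing group with fewer than 3 items.
it = iter(data)
rows = list(zip(it, it, it))

# Indices of groups whose second value does not match the phone pattern,
# which would suggest the rows are misaligned at that point
misaligned = [i for i, (_, phone, _) in enumerate(rows)
              if not PHONE_RE.fullmatch(phone)]

print(rows)
print(misaligned)
```

An empty `misaligned` list suggests the fixed groups of three line up with the data; any index in it is a place to inspect by hand.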
Jab