To answer your question: in pandas, you mostly see column-oriented examples because DataFrames are designed to represent tabular data where each column corresponds to a specific attribute or variable. This aligns with how data is typically organized in databases and spreadsheets, and it makes column-oriented examples more intuitive for beginners. Besides, in many real-world scenarios it's easier to collect and manage data with a fixed set of columns: each observation (row) has multiple attributes, and adding a new observation can be as simple as appending a new row.
But you can absolutely put data into an existing DataFrame row-wise.
Solution 1: Using the DataFrame.loc[] indexer
Here's an example I use often, with the .loc indexer assigning the data to specific rows in the DataFrame:
import pandas as pd
# Create an empty DataFrame with column names
columns = ['Name', 'Age', 'Country']
data_frame = pd.DataFrame(columns=columns)
# Data for each row
row1_data = ['Alice', 25, 'USA']
row2_data = ['Bob', 32, 'Canada']
row3_data = ['Eve', 28, 'UK']
# Inserting data into the DataFrame row-wise
data_frame.loc[0] = row1_data
data_frame.loc[1] = row2_data
data_frame.loc[2] = row3_data
# Append new data at the end
new_row = ['Rafi', 28, 'Dhaka']
data_frame.loc[len(data_frame)] = new_row
# Display the DataFrame
print(data_frame)
Output:
    Name  Age Country
0  Alice   25     USA
1    Bob   32  Canada
2    Eve   28      UK
3   Rafi   28   Dhaka
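This way you can iterate through a list of rows and always append each one as the last entry of the DataFrame. Note that len(data_frame) is only the next free label while the index is the default 0, 1, 2, ... range.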
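The loop described above could look something like the sketch below. The `rows` list and its contents are made up for illustration.

```python
import pandas as pd

# Hypothetical input: each inner list is one row, in column order
rows = [
    ['Alice', 25, 'USA'],
    ['Bob', 32, 'Canada'],
    ['Eve', 28, 'UK'],
]

data_frame = pd.DataFrame(columns=['Name', 'Age', 'Country'])

for row in rows:
    # On a default 0..n-1 index, len(data_frame) is the next unused label,
    # so this assignment adds the row at the end
    data_frame.loc[len(data_frame)] = row

print(data_frame)
```

Each assignment like this enlarges the DataFrame and copies data, so for a large number of rows it is usually faster to collect them in a list first and build the DataFrame once.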
Solution 2: Using DataFrame.from_records()
import pandas as pd
# Data for multiple rows as tuples
data = [('Alice', 25, 'USA'),
        ('Bob', 32, 'Canada'),
        ('Eve', 28, 'UK')]
# Column names
columns = ['Name', 'Age', 'Country']
# Create DataFrame from records
data_frame = pd.DataFrame.from_records(data, columns=columns)
# Display the DataFrame
print(data_frame)
Output:
    Name  Age Country
0  Alice   25     USA
1    Bob   32  Canada
2    Eve   28      UK
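Since from_records() builds a new DataFrame rather than modifying one, getting such rows into an existing DataFrame takes one more step. Here is a minimal sketch, assuming `pd.concat` is acceptable for that; the names `existing`, `new_rows` and `addition` are only illustrative.

```python
import pandas as pd

columns = ['Name', 'Age', 'Country']

# Existing DataFrame (hypothetical starting data)
existing = pd.DataFrame.from_records([('Alice', 25, 'USA')], columns=columns)

# New rows arriving as tuples
new_rows = [('Bob', 32, 'Canada'), ('Eve', 28, 'UK')]
addition = pd.DataFrame.from_records(new_rows, columns=columns)

# ignore_index=True renumbers the combined rows 0..n-1
combined = pd.concat([existing, addition], ignore_index=True)

print(combined)
```

Note that `pd.concat` returns a new DataFrame and leaves `existing` untouched, so you need to keep the result.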
Solution 3: Using DataFrame.from_dict()
You can pass multiple rows, or a single row, as dict() objects. This is especially useful if you have a .json file to pass as data:
import pandas as pd
# Data as a list of dictionaries, one dictionary per row
data = [
    {'Name': 'Alice', 'Age': 25, 'Country': 'USA'},
    {'Name': 'Bob', 'Age': 32, 'Country': 'Canada'},
    {'Name': 'Eve', 'Age': 28, 'Country': 'UK'}
]
# Create DataFrame from the list of dictionaries
# (pd.DataFrame(data) gives the same result for this input)
data_frame = pd.DataFrame.from_dict(data)
# Display the DataFrame
print(data_frame)
Output:
    Name  Age Country
0  Alice   25     USA
1    Bob   32  Canada
2    Eve   28      UK
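For the .json case mentioned above, one possible sketch is to parse the JSON with the standard `json` module and hand the resulting list of dicts to the plain `pd.DataFrame` constructor. The JSON string below is a made-up stand-in for the contents of a file; with a real file you would call `json.load()` on the open file object instead of `json.loads()`.

```python
import json

import pandas as pd

# Stand-in for the contents of a .json file (hypothetical data)
json_text = '''
[
    {"Name": "Alice", "Age": 25, "Country": "USA"},
    {"Name": "Bob", "Age": 32, "Country": "Canada"}
]
'''

# json.loads returns a list of dicts here, one dict per row
records = json.loads(json_text)

# The DataFrame constructor accepts a list of dicts directly
data_frame = pd.DataFrame(records)

print(data_frame)
```

This assumes the JSON is a flat array of objects with the same keys; nested JSON would need extra handling.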
Hope it helps!