Pandas is a powerful library that lets you perform analytical operations in very few lines of code.
It is an open-source Python package that provides numerous tools for data analysis. Some of its basic advantages and uses are listed below:
- It can present data in a way that is suitable for data analysis.
- The package contains multiple methods for convenient data filtering.
- It has a variety of utilities to perform Input/Output operations.
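To make the points above concrete, here is a minimal sketch. The `people` table and its values are made up for illustration; only `pd.DataFrame`, boolean indexing and `to_csv` are actual pandas features.

```python
import pandas as pd

# Hypothetical sample data, made up for illustration.
people = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cid"],
        "age": [34, 27, 45],
    }
)

# Filtering: keep only the rows where age is above 30.
over_30 = people[people["age"] > 30]

# Output: with no path given, to_csv returns the CSV text as a string.
csv_text = over_30.to_csv(index=False)
print(csv_text)
```

Passing a file name to `to_csv` instead would write the same text to disk.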
Implementation of the case you want to achieve, using pandas:
First, install pandas:
pip install pandas
Input: a text file with the input data in the given format
Output: a text file with the required output in CSV format
import pandas as pd

headers = ["id", "first_name", "last_name", "age", "address", "date"]

with open("input") as file:  # Read input
    # Read the "||"-separated data into a data frame
    data_frame = pd.read_csv(file, sep='[|][|]', names=headers, header=None,
                             parse_dates=['date'], engine="python")

# Sort the rows by date, newest first
data_frame.sort_values("date", ascending=False, inplace=True)
# Delete duplicate rows based on id, keeping the first (newest) row for each id
data_frame.drop_duplicates(subset="id", inplace=True)
# Generate output without a header row
data_frame.to_csv('output', sep=',', header=False)
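To check the sort-then-deduplicate step without touching any files, here is a small sketch run on in-memory data. The three sample rows are hypothetical and only assume the `||`-separated, six-column layout used above; your real input may differ.

```python
import io
import pandas as pd

# Hypothetical input rows; the values are made up for illustration.
sample = io.StringIO(
    "1||John||Doe||30||12 Oak St||2020-01-15\n"
    "2||Jane||Roe||25||9 Elm Rd||2020-03-01\n"
    "1||John||Doe||31||48 Pine Ave||2021-06-20\n"
)
headers = ["id", "first_name", "last_name", "age", "address", "date"]

df = pd.read_csv(sample, sep=r"\|\|", names=headers, header=None,
                 parse_dates=["date"], engine="python")

# Newest rows first, so drop_duplicates keeps the latest row for each id.
latest = (
    df.sort_values("date", ascending=False)
      .drop_duplicates(subset="id")
      .sort_values("id")
      .reset_index(drop=True)
)
print(latest)
```

Here id 1 appears twice, and only its 2021 row survives. The final `sort_values("id")` and `reset_index` are there just to make the result easy to read; they are not required for the deduplication itself.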