I'm trying to iterate over a number of CSV files and, within each dataframe, concatenate all the values in the 'Sequence' column into a single string on the first row. Here is what each file looks like:
ID Order Sequence
1773 1 'AAGG'
1773 2 'TTGG'
1773 3 'GGAA'
And I need it to look like this for each CSV:
ID Sequence
1773 'AAGGTTGGGGAA'
I don't need the 'Order' column after this. I've tried many different commands but can't seem to find the right one.
Right now I have:
import glob
import pandas as pd

path = r'C:\Users\CAAVR\Desktop\folder\*.csv'
for fname in glob.glob(path):
    df = pd.read_csv(fname)
    first = df['sequence'].iloc[:1]
    next = df['sequence'].iloc[2:]
    final = first.str.join(next)
    print(final)
I know .join() isn't right, but concat and merge don't seem to work either. I keep getting:
AttributeError: 'Series' object has no attribute 'join'
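For context, here is a small sketch of what seems to be behind that error. It uses a made-up three-element Series (an assumption standing in for one file's column) to show that a Series has no .join method, that Series.str.join works inside each value rather than across rows, and that Python's own str.join can concatenate the values:

```python
import pandas as pd

# Hypothetical stand-in for one file's sequence column.
seq = pd.Series(["AAGG", "TTGG", "GGAA"])

# A Series has no .join method, which is what triggers the AttributeError.
has_join = hasattr(seq, "join")

# Series.str.join(sep) operates within each value: for plain strings it
# puts the separator between the characters of every row.
per_row = seq.str.join("-")

# Python's built-in str.join iterates over the Series values and
# concatenates them into a single string.
combined = "".join(seq)
print(has_join, per_row.iloc[0], combined)
```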
Let me know if you need any other info and thanks for the help!
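One possible approach, sketched under some assumptions: each file has the columns ID, Order and Sequence exactly as in the sample above, the sequence values are plain strings without the surrounding quotes, and rows should be concatenated in Order within each ID. The helper names collapse_sequences and process_folder are hypothetical, and the in-memory CSV only stands in for a real file:

```python
import glob
import io

import pandas as pd


def collapse_sequences(df):
    """Return one row per ID with the Sequence values joined in Order."""
    ordered = df.sort_values(["ID", "Order"])
    # "".join receives each group's Sequence values and concatenates them.
    joined = ordered.groupby("ID")["Sequence"].agg("".join)
    return joined.reset_index()


def process_folder(pattern):
    """Apply collapse_sequences to every CSV matching the glob pattern."""
    return {fname: collapse_sequences(pd.read_csv(fname))
            for fname in glob.glob(pattern)}


# Small in-memory example, deliberately out of order to show the sort.
sample = io.StringIO(
    "ID,Order,Sequence\n"
    "1773,2,TTGG\n"
    "1773,1,AAGG\n"
    "1773,3,GGAA\n"
)
result = collapse_sequences(pd.read_csv(sample))
print(result)
```

With real files, process_folder(r'C:\Users\CAAVR\Desktop\folder\*.csv') would return a dict mapping each file name to its collapsed dataframe. The Order column is dropped because only ID and Sequence are kept after the groupby.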