I am trying to put specific columns of tab-separated files into a dictionary. I have tried several things, and none of them gives me the result I am looking for.
For example, I have this file:
Name Start End Size
del1 100 105 5
del2 150 160 10
del3 250 300 50
and this file (both are .csv files):
Name Qual StartB EndB Size
inv1 6 400 405 5
inv2 7 450 460 10
inv3 20 450 400 50
What I want is something like this, where Name is the key and the other columns are the values. An additional problem is that the header names and their column positions change between files, even though they mean the same thing (for example, StartB and Start):
    del_dict = {del1: {Start: 100, End: 105, Size: 5}, del2: {etc}}
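To make the target concrete, this is the structure I am after for the first file, written out as a valid Python literal. The values are taken from the sample file above, and I am assuming the numbers should end up as integers rather than strings.

```python
# Desired result for the first sample file: one inner dict per Name.
del_dict = {
    "del1": {"Start": 100, "End": 105, "Size": 5},
    "del2": {"Start": 150, "End": 160, "Size": 10},
    "del3": {"Start": 250, "End": 300, "Size": 50},
}

# A single value can then be looked up by name and column.
print(del_dict["del2"]["End"])  # 160
```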
I tried reading the files in several ways, based on other Stack Overflow answers:
    import glob
    import pandas as pd

    for file in glob.glob(directoryname + "/*.csv"):
        data = pd.read_csv(file, sep="\t").to_dict()
        print(data)
and
    for file in glob.glob(directoryname + "/*.csv"):
        df = pd.read_csv(file, header=0, sep="\t")
        name = df.Name
        # Some files name the position columns StartB/EndB instead of Start/End.
        if "StartB" in df.columns:
            start_pos = df.StartB
            end_pos = df.EndB
        else:
            start_pos = df.Start
            end_pos = df.End
But this gives me DataFrames (or columns of them) that I cannot seem to fit into a dictionary.
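A minimal sketch of one direction that might work with pandas, not a confirmed solution: rename the variant headers to one canonical set, then index by Name and convert row by row. `HEADER_ALIASES`, `WANTED`, and `frame_to_dict` are hypothetical names introduced for illustration, and the inline sample stands in for a real file.

```python
import io

import pandas as pd

# Hypothetical alias table: variant header -> canonical header.
HEADER_ALIASES = {"StartB": "Start", "EndB": "End"}
# Columns to keep as values (assumption: the same three for every file).
WANTED = ["Start", "End", "Size"]


def frame_to_dict(df):
    """Return {Name: {Start: ..., End: ..., Size: ...}} for one DataFrame."""
    # Headers that are not in the alias table are left unchanged.
    df = df.rename(columns=HEADER_ALIASES)
    # The default to_dict() is keyed by column first; orient="index"
    # keys the outer dict by the index (here: Name) instead.
    return df.set_index("Name")[WANTED].to_dict(orient="index")


# Inline copy of the second sample file, tab-separated.
sample = "\n".join([
    "Name\tQual\tStartB\tEndB\tSize",
    "inv1\t6\t400\t405\t5",
    "inv2\t7\t450\t460\t10",
    "inv3\t20\t450\t400\t50",
])

inv_dict = frame_to_dict(pd.read_csv(io.StringIO(sample), sep="\t"))
print(inv_dict["inv1"])  # {'Start': 400, 'End': 405, 'Size': 5}
```

For real files, the `io.StringIO(sample)` argument would be replaced by each path from `glob.glob`, and new header variants would only need another entry in the alias table.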
I also tried the code below, which I have used before. Back then it was for a single file with fixed headers; here it would need too many loops and too much hard-coding to handle everything I need, depending on which file I open.
    for file in glob.glob(directoryname + "/*.csv"):
        with open(file, 'r') as csvfile:
            csv_list = []
            for line in csvfile:
                # split each line into its tab-separated fields
                csv_list.append(line.rstrip("\n").split("\t"))
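The manual loop above could perhaps be replaced by the standard library's `csv.DictReader`, which pairs every field with its header so the column positions no longer matter. This is only a sketch under assumptions: `HEADER_ALIASES`, `WANTED`, and `read_records` are hypothetical names, and I assume every wanted column holds an integer.

```python
import csv
import io

# Hypothetical alias table: variant header -> canonical header.
HEADER_ALIASES = {"StartB": "Start", "EndB": "End"}
# Columns to keep as values (assumption: all of them hold integers).
WANTED = ("Start", "End", "Size")


def read_records(handle):
    """Build {Name: {Start: ..., End: ..., Size: ...}} from an open tab-separated file."""
    records = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        # Map variant headers onto the canonical ones; others pass through.
        row = {HEADER_ALIASES.get(key, key): value for key, value in row.items()}
        records[row["Name"]] = {col: int(row[col]) for col in WANTED}
    return records


# Inline copy of the second sample file, tab-separated.
sample = "\n".join([
    "Name\tQual\tStartB\tEndB\tSize",
    "inv1\t6\t400\t405\t5",
    "inv2\t7\t450\t460\t10",
    "inv3\t20\t450\t400\t50",
])

inv_dict = read_records(io.StringIO(sample))
print(inv_dict["inv2"])  # {'Start': 450, 'End': 460, 'Size': 10}
```

With files on disk, each one would be opened with `open(path, newline="")` and the handle passed to `read_records`.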
I am fairly new to Python, and I know a relatively simple answer must be available, but I cannot seem to find it. Sorry if the answer is already on Stack Overflow; I searched for hours for a similar, workable problem, and this is the point where I am really stuck.