EDIT: I am not asking for help with normalisation, which is what the reply to my question addressed. I am asking for help with the initial ER diagram design BEFORE normalising it, as I said in my original question. Does my question stay closed now? I was linked to a normalisation question, which again is not what I am asking about. I am asking for advice on where to begin with the entities and attributes in my initial ER diagram design.
I'm still pretty new to DBMS. I have to build a vaccination database for an assessment, and I'm having trouble making the entity relationship diagram. I've done plenty of course exercises, but the assessment is much more complex than what we were taught. The exercises and examples with tutors use a couple of tables with typical attributes like cust_id, cust_name, product_id, and that's all. In the assessment the entities and attributes aren't as clear cut as cust_id and the like, and the CSV files we are given contain far more columns.
Basically I need to make a database from what is in the CSV files: I can't alter the data itself, only decide how to structure it. Could anyone give some tips on how to approach it? I'll post the table/column names below. I'm not asking anyone to do it for me, just for advice on how to approach it: what I should be separating out into tables, what can be removed because it can be calculated in SQL later, and so on. Thanks.
LOCATIONS.csv: location,iso_code,vaccines,last_observation_date,source_name,source_website
(iso_code is assigned to a country, so I think this would be the primary key. vaccines is Pfizer, Moderna etc.)
VACCINATIONS.csv: location,iso_code,date,total_vaccinations,people_vaccinated,people_fully_vaccinated,total_boosters,daily_vaccinations_raw,daily_vaccinations,total_vaccinations_per_hundred,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,total_boosters_per_hundred,daily_vaccinations_per_million,daily_people_vaccinated,daily_people_vaccinated_per_hundred
(There's so much in this CSV!)
VACCINATIONS-BY-AGE-GROUP.csv: location,date,age_group,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,people_with_booster_per_hundred
VACCINATIONS_BY_MANUFACTURER: location,date,vaccine,total_vaccinations
US-STATE-VACCINATIONS: date,location,total_vaccinations,total_distributed,people_vaccinated,people_fully_vaccinated_per_hundred,total_vaccinations_per_hundred,people_fully_vaccinated,people_vaccinated_per_hundred,distributed_per_hundred,daily_vaccinations_raw,daily_vaccinations,daily_vaccinations_per_million,share_doses_used,total_boosters,total_boosters_per_hundred
ENGLAND (the same columns are used for 3 other files: POLAND, AUSTRALIA and CANADA): location,date,vaccine,source_url,total_vaccinations,people_vaccinated,people_fully_vaccinated,total_boosters
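To make the overlap between the files easier to see, here is a rough sketch in Python. It only uses the column names listed above; the function name and the idea of flagging `_per_hundred` / `_per_million` columns as possibly calculable later are my own assumptions, not something from the course material.

```python
# Column headers copied from the CSV files listed above.
HEADERS = {
    "LOCATIONS.csv": "location,iso_code,vaccines,last_observation_date,source_name,source_website",
    "VACCINATIONS.csv": "location,iso_code,date,total_vaccinations,people_vaccinated,people_fully_vaccinated,total_boosters,daily_vaccinations_raw,daily_vaccinations,total_vaccinations_per_hundred,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,total_boosters_per_hundred,daily_vaccinations_per_million,daily_people_vaccinated,daily_people_vaccinated_per_hundred",
    "VACCINATIONS-BY-AGE-GROUP.csv": "location,date,age_group,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,people_with_booster_per_hundred",
    "VACCINATIONS_BY_MANUFACTURER": "location,date,vaccine,total_vaccinations",
    "US-STATE-VACCINATIONS": "date,location,total_vaccinations,total_distributed,people_vaccinated,people_fully_vaccinated_per_hundred,total_vaccinations_per_hundred,people_fully_vaccinated,people_vaccinated_per_hundred,distributed_per_hundred,daily_vaccinations_raw,daily_vaccinations,daily_vaccinations_per_million,share_doses_used,total_boosters,total_boosters_per_hundred",
    "ENGLAND": "location,date,vaccine,source_url,total_vaccinations,people_vaccinated,people_fully_vaccinated,total_boosters",
}


def files_per_column(headers):
    """Map each column name to the list of files whose header contains it."""
    seen = {}
    for file_name, header in headers.items():
        for column in header.split(","):
            seen.setdefault(column, []).append(file_name)
    return seen


columns = files_per_column(HEADERS)

# Columns that appear in more than one file: possible shared entities or keys.
shared = {col: files for col, files in columns.items() if len(files) > 1}

# Assumption: rate-style columns might be calculable later instead of stored.
rate_columns = sorted(
    col for col in columns if col.endswith(("_per_hundred", "_per_million"))
)

for col, files in shared.items():
    print(f"{col}: {len(files)} files")
print("possibly derivable:", rate_columns)
```

My thinking is that columns appearing in every file (like location) point to an entity of their own, while columns unique to one file are probably just attributes, but I'm not sure that reasoning is right.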
Sorry if it's a lot, I just wanted to be clear. The amount of data and the number of possible entities and attributes is so much more than in the exercises I've done that I'm pretty lost on how to approach the entity diagram. Thank you!