
EDIT: I am not asking for help with normalisation, which is what the replies to my question addressed. I am asking for help with the initial ER diagram design BEFORE normalising it, as I said in my original question. Does my question stay closed now? I was linked to a normalisation question, which again is not what I am asking about. I am asking for advice on where to begin with the entities and attributes in my initial ER diagram design.

I'm still pretty new to DBMS. I have to build a vaccination database for an assessment, and I'm having trouble making the entity relationship diagram. I've done plenty of course exercises, but the assessment is much more complex than what we were taught, which seems to be a pattern. For example, the exercises and tutor examples use a couple of tables with typical columns like cust_id, cust_name, product_id and a few other attributes, and that's all. In the assessment the entities and attributes aren't as clear-cut as things like cust_id, and the CSV files we are given contain far more columns. Basically, I need to make a database from what is in the CSV files: I can't alter the data itself, only decide how to structure it. Could anyone give some tips on how to approach it? I'll post the file names and column names below. I'm not asking anyone to do it for me, just for advice on how to approach it: which things I should separate out into tables, what can be removed if it can be calculated in SQL later, and so on. Thanks.

LOCATIONS.csv
location,iso_code,vaccines,last_observation_date,source_name,source_website

(iso_code is assigned to a country, so I think this would be the primary key. vaccines lists the vaccines used, e.g. Pfizer, Moderna, etc.)
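As a minimal sketch of the primary key idea above, LOCATIONS.csv could be modelled with iso_code as the key of a location table and the vaccines column split into its own table. This assumes that vaccines holds a comma-separated list and that iso_code is unique per row, neither of which is confirmed here. The table names (location, vaccine, location_vaccine), the helper load_location_row and the sample row are all hypothetical, and the example uses Python's built-in sqlite3 only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: one row per location, one row per vaccine,
# and a junction table for the many-to-many link between them.
conn.executescript("""
CREATE TABLE location (
    iso_code              TEXT PRIMARY KEY,  -- assumed unique per location
    name                  TEXT NOT NULL,     -- the CSV 'location' column
    last_observation_date TEXT,
    source_name           TEXT,
    source_website        TEXT
);
CREATE TABLE vaccine (
    vaccine_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE
);
CREATE TABLE location_vaccine (
    iso_code   TEXT    NOT NULL REFERENCES location(iso_code),
    vaccine_id INTEGER NOT NULL REFERENCES vaccine(vaccine_id),
    PRIMARY KEY (iso_code, vaccine_id)
);
""")


def load_location_row(row):
    """Insert one LOCATIONS.csv row, splitting the vaccines list."""
    conn.execute(
        "INSERT INTO location VALUES (?, ?, ?, ?, ?)",
        (row["iso_code"], row["location"], row["last_observation_date"],
         row["source_name"], row["source_website"]),
    )
    # Assumption: 'vaccines' is a comma-separated list of vaccine names.
    for name in (part.strip() for part in row["vaccines"].split(",")):
        if not name:
            continue
        conn.execute("INSERT OR IGNORE INTO vaccine (name) VALUES (?)", (name,))
        (vaccine_id,) = conn.execute(
            "SELECT vaccine_id FROM vaccine WHERE name = ?", (name,)
        ).fetchone()
        conn.execute(
            "INSERT INTO location_vaccine VALUES (?, ?)",
            (row["iso_code"], vaccine_id),
        )


# Made-up sample row, for illustration only (not taken from the real file).
load_location_row({
    "location": "Exampleland",
    "iso_code": "EXL",
    "vaccines": "Pfizer, Moderna",
    "last_observation_date": "2022-01-03",
    "source_name": "Example Ministry of Health",
    "source_website": "https://example.org",
})
```

If some rows turn out not to have a unique iso_code, the primary key choice would need revisiting before going further.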

VACCINATIONS.csv
location,iso_code,date,total_vaccinations,people_vaccinated,people_fully_vaccinated,total_boosters,daily_vaccinations_raw,daily_vaccinations,total_vaccinations_per_hundred,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,total_boosters_per_hundred,daily_vaccinations_per_million,daily_people_vaccinated,daily_people_vaccinated_per_hundred

(There's so much in this CSV!)
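On the question of what could be left out and calculated in SQL later, here is a rough sketch for one column only. It assumes each VACCINATIONS.csv row is identified by (iso_code, date) and that daily_vaccinations_raw is simply the change in total_vaccinations since the previous recorded date. Both are guesses that would need checking against the real data. The table name vaccination, the view name vaccination_daily and the sample numbers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
-- Hypothetical table keyed by (iso_code, date); only one measure shown.
CREATE TABLE vaccination (
    iso_code           TEXT NOT NULL,
    date               TEXT NOT NULL,  -- ISO 8601 text, so it sorts by date
    total_vaccinations INTEGER,
    PRIMARY KEY (iso_code, date)
);

-- Assumption: the raw daily figure is the difference from the previous
-- recorded total for the same location. The first date has no previous
-- row, so its derived value is NULL.
CREATE VIEW vaccination_daily AS
SELECT v.iso_code,
       v.date,
       v.total_vaccinations,
       v.total_vaccinations - (
           SELECT p.total_vaccinations
           FROM vaccination AS p
           WHERE p.iso_code = v.iso_code AND p.date < v.date
           ORDER BY p.date DESC
           LIMIT 1
       ) AS daily_vaccinations_raw
FROM vaccination AS v;
""")

# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO vaccination VALUES (?, ?, ?)",
    [
        ("EXL", "2022-01-01", 100),
        ("EXL", "2022-01-02", 130),
        ("EXL", "2022-01-03", 190),
    ],
)

derived = conn.execute(
    "SELECT date, daily_vaccinations_raw FROM vaccination_daily ORDER BY date"
).fetchall()
```

Comparing the derived values with the daily_vaccinations_raw column in the CSV would show whether the assumption holds. The per_hundred and per_million columns would need a population figure, which does not appear in the column lists here.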

VACCINATIONS-BY-AGE-GROUP.csv
location,date,age_group,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,people_with_booster_per_hundred

VACCINATIONS_BY_MANUFACTURER
location,date,vaccine,total_vaccinations

US-STATE-VACCINATIONS
date,location,total_vaccinations,total_distributed,people_vaccinated,people_fully_vaccinated_per_hundred,total_vaccinations_per_hundred,people_fully_vaccinated,people_vaccinated_per_hundred,distributed_per_hundred,daily_vaccinations_raw,daily_vaccinations,daily_vaccinations_per_million,share_doses_used,total_boosters,total_boosters_per_hundred

ENGLAND (THIS ONE IS THE SAME FOR 3 OTHERS: POLAND, AUSTRALIA AND CANADA)
location,date,vaccine,source_url,total_vaccinations,people_vaccinated,people_fully_vaccinated,total_boosters

Sorry if it's a lot, I just wanted to be clear. The amount of data and the number of possible entities and attributes are so much greater than in the exercises I've done that I'm pretty lost on how to approach making the entity diagram. Thank you!

  • If you are not asking us to do this for you, then I'm not sure what help you are looking for here! The approach to design a relational database is called normalisation. If you follow the normalisation rules **and** understand your data, then you can come up with a valid data model. The emphasis is that you must understand your data. – Shadow Feb 11 '22 at 10:00
  • @Shadow I'm not up to the normalisation part yet. As I said in the question, I am asking how I should begin separating my entities and attributes, since there is so much, with normalisation coming afterwards. – Do_dee_dee Feb 11 '22 at 10:08
  • Normalisation is the approach you need to follow to identify your entities and their attributes. – Shadow Feb 11 '22 at 11:25

0 Answers