I was exposed to the world of tables and data structures in R before I encountered RDBMSs and other database systems. It is quite elegant in R/Python to create tables and lists from structured data (.csv or other formats) and then manipulate the data programmatically.
Last year, I attended a course in database management and learnt about structured and unstructured databases. I also noticed that the norm is to feed data from multiple sources into a database rather than using the sources directly in R (for convenience and discipline?).
For research purposes, R alone seems to suffice for joining, appending, and even complicated data manipulations.
The question that keeps arising is: when should I use R directly, with commands such as read.csv, and when should I first create a database and query its tables through an R-SQL interface?
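To make the contrast concrete, here is a minimal sketch of the two workflows I mean, using DBI/RSQLite for the database side; the file name, table name, and column are hypothetical:

```r
library(DBI)
library(RSQLite)

# Approach 1: read a flat file directly into R and work on the data frame
persons <- read.csv("persons.csv")  # hypothetical file

# Approach 2: stage the same data in a database and pull subsets via SQL
con <- dbConnect(RSQLite::SQLite(), "study.sqlite")   # hypothetical database file
dbWriteTable(con, "persons", persons, overwrite = TRUE)
adults <- dbGetQuery(con, "SELECT * FROM persons WHERE age >= 18")
dbDisconnect(con)
```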
For instance, suppose I have multi-source data such as: (a) person-level information (age, gender, smoking habits), (b) outcome variables (such as surveys taken by the participants in real time), (c) covariate information (environment characteristics), (d) treatment input (occurrence of an event that modifies the outcome, i.e. the survey response), and (e) time and space information about participants taking the survey.
How should one approach data collection and processing in this case? There may be standard industry procedures, but I put this question forward here to understand the feasible and optimal approaches that individuals and small groups of researchers can adopt. A plain-R sketch of what I have in mind follows.
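For concreteness, this is roughly how I would combine the sources above in base R alone; all file names and columns (person_id, location_id, timestamp, event_time) are assumptions for illustration:

```r
# Hypothetical flat files corresponding to sources (a)-(e)
persons    <- read.csv("persons.csv")     # (a) person-level info, keyed by person_id
surveys    <- read.csv("surveys.csv")     # (b) outcomes, with person_id + timestamp
covariates <- read.csv("covariates.csv")  # (c) environment data, keyed by location_id
events     <- read.csv("events.csv")      # (d) treatment events, person_id + event_time

# Build one analysis table by joining on the shared keys
analysis <- merge(surveys, persons, by = "person_id")
analysis <- merge(analysis, covariates, by = "location_id")
analysis <- merge(analysis, events, by = "person_id", all.x = TRUE)  # left join

# Flag survey responses recorded after that person's treatment event
analysis$post_treatment <- !is.na(analysis$event_time) &
  analysis$timestamp >= analysis$event_time
```

My question is whether this kind of in-memory merging scales and stays reproducible, or whether the same joins are better pushed into a database first.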