
I am completely new to R and have no real-world experience with it. I have learned a lot, but I still struggle whenever I get a new task. Broadly, my question is how to start working on a new task.

Sometimes the dataset is so big (surprisingly :)) that I cannot get an overview of it, and the usual functions such as str(), summarise(), head(), tail(), or sample_n() from the dplyr package are not enough to give me a satisfactory picture.
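For reference, a minimal sketch of this kind of first look, using base R only. The built-in `airquality` data stands in for a real dataset (an assumption for illustration), and the variable names are hypothetical:

```r
# Built-in example data; substitute your own data frame here.
df <- airquality

str(df)        # column types and the first few values
summary(df)    # per-column ranges, quartiles, and NA counts
head(df)       # first rows
tail(df)       # last rows

# Missing values and number of distinct values per column.
na_per_col       <- colSums(is.na(df))
distinct_per_col <- sapply(df, function(x) length(unique(x)))

# A few random rows; a base-R analogue of dplyr::sample_n(df, 5).
random_rows <- df[sample(nrow(df), 5), ]
```

On a large dataset these calls summarise each column, but they do not reveal individual malformed rows.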

Almost every example I found online used datasets that were almost perfect. If the data needs cleaning at all, the basic problems are relatively easy to identify because they are unambiguous and show up as soon as you check head() or something similar.

What about real-world data? What if the columns shift in the middle of the dataset, or some rows contain values with an inappropriate symbol, a space, or something similar (salary, price, phone number, etc.)?
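To make these two problems concrete, here is a hypothetical sketch with made-up data (the lines in `raw_lines`, the `salary` column, and all variable names are invented for illustration). It assumes a comma-separated file and uses only base R; it is one possible check, not a complete cleaning procedure:

```r
# Made-up raw lines: line 3 has an extra comma, so its columns are shifted.
raw_lines <- c(
  "id,name,salary",
  "1,Ann,52000",
  "2,Bob,,48500",
  "3,Cy,61000"
)

# Count fields per line and flag lines that disagree with the header.
con <- textConnection(raw_lines)
n_fields <- count.fields(con, sep = ",")
close(con)
shifted_lines <- which(n_fields != n_fields[1])

# Made-up column that was read as text because of stray symbols.
df <- data.frame(
  id     = 1:5,
  salary = c("52000", "48 500", "$61000", "n/a", "57000"),
  stringsAsFactors = FALSE
)

# Flag values that are not plain numbers (digits, optional decimal part).
bad <- !grepl("^[0-9]+(\\.[0-9]+)?$", df$salary)
df[bad, ]

# Keep only digits and the decimal point, then convert;
# entries with nothing numeric left become NA.
cleaned <- gsub("[^0-9.]", "", df$salary)
cleaned[cleaned == ""] <- NA
df$salary_num <- as.numeric(cleaned)
```

Inspecting the flagged rows before converting matters, because stripping symbols blindly can hide values that were wrong for a different reason.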

In summary:
- What is your general method for getting familiar with a dataset (let's assume we already know what the variables mean because we have a description of them)?
- Do you have a general examination method?

I know that no two projects are alike, but I am really interested in YOUR basic workflow (with some examples or explanations).

Thank you in advance

CsCs
  • SO is about specific coding issues/problems, involving clear problem statements that are answerable through explicit coding examples. Questions asking for ideas/references and that invite opinion-based discussions are [off-topic](https://stackoverflow.com/help/on-topic). I'm not sure if there is a different StackExchange forum at which your question would be a better fit. Perhaps take a look at the posting guidelines over at [Data Science](https://datascience.stackexchange.com/). – Maurits Evers Apr 17 '18 at 00:28
  • See also this previous question about R Workflow: https://stackoverflow.com/questions/1429907/workflow-for-statistical-analysis-and-report-writing?rq=1 – Jeromy Anglim Apr 17 '18 at 00:35

0 Answers