
I am new to R and ggplot.

I have a data set like this. Each mol-code is made of three components, and Copies is the number of times each mol-code appears. There are 8 unique components available, and they are represented as SMILES strings.

 full.mol.code2 Copies                Pair1.Acids            Pair2.Acids          Pair3.Acids
1   1.301241e+23     18       OC(C1=COC(CCl)=N1)=O OC(C1=CC=C(CCl)C=C1)=O   O=C(O)C1=C(C)OC=C1
2   1.303241e+23     18       OC(C1=CSC(CCl)=N1)=O   OC(C1=CSC(CCl)=N1)=O OC([C@H](C)Br)=O.[R]
3   1.301241e+23     17       OC(C1=COC(CCl)=N1)=O   OC(C1=COC(CCl)=N1)=O   O=C(O)C1=C(C)OC=C1
4   1.304241e+23     12 ClC/C(C)=C/[C@@H](C)C(O)=O   OC(C1=COC(CCl)=N1)=O OC([C@H](C)Cl)=O.[S]
5   1.309240e+23     12       OC(C1=CSC(CCl)=N1)=O OC(C1=CC=C(CCl)C=C1)=O   O=C(O)C1=C(C)OC=C1
6   1.301241e+23     11       OC(C1=COC(CCl)=N1)=O OC(C1=CC=C(CCl)C=C1)=O OC([C@H](C)Cl)=O.[S]

Edit: thanks, Allan, for formatting this properly. `full.mol.code2` is a number like this (130124051501260617102804); it should be treated as an identifier, not as a numeric value.

I want to show this data in a bar plot where the x-axis is the mol-code, the y-axis is the number of copies, and each bar shows the combination of its three components in different colors.

I hope that makes sense. I appreciate any help. Thanks.

  • Welcome to Stack Overflow! I agree with @EricFletcher's suggestions; it will help us to help you if you [edit] your question to include the output of the R command `dput(your_data)` (where you replace `your_data` with the actual name of your dataset). You can also find more tips about making a [mcve] for an R question here: [How to make a great R reproducible example](https://stackoverflow.com/q/5963269/8386140) – duckmayr Aug 14 '20 at 17:07
  • Sorry, I am not sure how to paste the data here. I am using RStudio and the dataset is too wide. I pasted an image. Will that work? – ggplotdels Aug 14 '20 at 17:15
  • Hi Allan, how did you do that? – ggplotdels Aug 14 '20 at 18:05
  • To paste your data (even if it is wide), you can use `dput(your_df)` in your console. It will output some code (starting with `structure(...`), which can be copied and pasted into your question, formatted as a code chunk, and then copied and pasted into an R console to recreate your dataset as is. If your dataset is too large, you can always take a representative piece of it via `dput(head(your_df, X))` to take the first `X` rows, or use `dput(your_df[sample(nrow(your_df), X), ])` to grab a random sample of `X` rows of your dataset. – chemdork123 Aug 16 '20 at 17:24
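
For reference, the `dput()` advice in the comments can be sketched roughly as follows, using base R only. The data frame here is a small hypothetical stand-in for `your_df`: the SMILES strings and copy counts are taken from the printed table in the question, while `"mol_b"` and `"mol_c"` are placeholder identifiers, since the full mol-codes are not shown. The mol-code is stored as a character string on the assumption that it is an identifier; a 24-digit number stored as a double would be rounded.

```r
# Hypothetical stand-in for your_df (a few values copied from the question).
# full.mol.code2 is kept as character so that its digits are not rounded;
# "mol_b" and "mol_c" are placeholder IDs, not real mol-codes.
your_df <- data.frame(
  full.mol.code2 = c("130124051501260617102804", "mol_b", "mol_c"),
  Copies         = c(18, 18, 17),
  Pair1.Acids    = c("OC(C1=COC(CCl)=N1)=O",
                     "OC(C1=CSC(CCl)=N1)=O",
                     "OC(C1=COC(CCl)=N1)=O"),
  stringsAsFactors = FALSE
)

X <- 2  # number of rows to share

# First X rows: note that X is an argument of head(), not of dput()
first_rows <- head(your_df, X)
dput(first_rows)

# Random sample of X rows: nrow() counts rows, whereas length() on a
# data frame would count columns
set.seed(1)
sampled_rows <- your_df[sample(nrow(your_df), X), ]
dput(sampled_rows)

# Round trip: the text printed by dput() can be parsed back into a data frame
dput_text <- paste(capture.output(dput(your_df)), collapse = "\n")
restored  <- eval(parse(text = dput_text))
```

Pasting the `structure(...)` text printed by `dput()` into the question lets others rebuild the data frame exactly, which is what `restored` illustrates here.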
