I need to merge multiple files into one data frame based on ID. The final data frame is a map keyed by ID, in the form below:
+-------+-----+-------------+-------------+---------------+
| Name | ID | CategoryID1 | CategoryID2 | CategoryID400 |
+-------+-----+-------------+-------------+---------------+
| name1 | ID1 | 0 | 1 | 0 |
| name2 | ID2 | 1 | 1 | 0 |
| name3 | ID3 | 0 | 0 | 0 |
| name4 | ID4 | 1 | 0 | 1 |
+-------+-----+-------------+-------------+---------------+
These are binary variables (categories), and I need to assign 1 if a category occurs, no matter how many times. I have an empty data frame (the map) with column names and need to fill it with data merged from multiple files.
The data files to be merged into the master data frame look like the example below. There can be repeats, so the same ID may appear in two files with different categories assigned. That does not matter: all that counts is whether a category appears at least once, in which case 1 is assigned in the master data frame.
+-------+-----+---------------------------------------------------------------+
| name1 | ID1 | CategoryID1 CategoryID4 |
| name2 | ID2 | CategoryID1 CategoryID2 CategoryID9 |
| name3 | ID4 | CategoryID150 CategoryID200 CategoryID400 |
| name4 | ID4 | CategoryID1 CategoryID4 CategoryID7 CategoryID15 CategoryID89 |
+-------+-----+---------------------------------------------------------------+
Creating an empty data frame is not a problem; I just wonder how to loop through the files. Importantly, the raw files are tab-separated (\t) into 3 columns, but the categories within the third column are separated by spaces.
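
To make the goal concrete, here is a rough sketch of the loop I have in mind, written in Python with only the standard library. The function names `collect_categories` and `build_indicator_rows` are my own placeholders, and it assumes every line has exactly the three tab-separated fields shown above and that the first name seen for an ID is the one to keep.

```python
from collections import defaultdict


def collect_categories(paths):
    """Read every file and gather, per ID, the set of categories seen anywhere."""
    names = {}               # ID -> first name seen for that ID (assumption)
    seen = defaultdict(set)  # ID -> set of category labels
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                fields = line.rstrip("\n").split("\t")
                if len(fields) < 3:
                    continue  # skip blank or malformed lines
                name, row_id, categories = fields[0], fields[1], fields[2]
                names.setdefault(row_id, name)
                # Categories are space-separated; a set ignores repeats.
                seen[row_id].update(categories.split())
    return names, seen


def build_indicator_rows(names, seen, columns):
    """Turn the collected sets into rows of 0/1 flags, one row per ID."""
    rows = []
    for row_id, name in names.items():
        flags = [1 if column in seen[row_id] else 0 for column in columns]
        rows.append([name, row_id, *flags])
    return rows
```

The list of paths could come from something like `glob.glob("data/*.txt")` (a made-up location), and `columns` would be the category column names of the master map. If pandas is available, the resulting rows can be handed to `pandas.DataFrame(rows, columns=["Name", "ID", *columns])`.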