I have a CSV file with three columns: Id, Main_user and Users. Id is the label and the other two columns are features. I want to load the two features (Main_user and Users) from the CSV, vectorize them, and assemble them into a single vector. After applying HashingTF to "Users" as described in the documentation, how do I add a second feature, "Main_user", alongside "Users"?
// Load the CSV, using the header row for column names.
DataFrame df = (new CsvParser()).withUseHeader(true).csvFile(sqlContext, csvFile);
// Split the "Users" column into tokens.
Tokenizer tokenizer = new Tokenizer().setInputCol("Users").setOutputCol("words");
DataFrame wordsData = tokenizer.transform(df);
// Hash the tokens into a fixed-length term-frequency vector.
int numFeatures = 20;
HashingTF hashingTF = new HashingTF().setInputCol("words")
    .setOutputCol("rawFeatures").setNumFeatures(numFeatures);
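To make the goal concrete, here is a minimal plain-Java sketch (no Spark) of the result I am after: each column is hashed into its own term-frequency vector, and the two vectors are concatenated into one. The class and method names (TwoColumnFeatures, hashingTf, assemble) and the sample values are made up for illustration, and String.hashCode is only a stand-in for whatever hash function Spark's HashingTF uses. In Spark itself I assume this would correspond to a second Tokenizer/HashingTF pair for "Main_user" followed by a VectorAssembler over both output columns, but I have not confirmed that.

```java
import java.util.Arrays;

// Hypothetical illustration only: not Spark code.
class TwoColumnFeatures {

    // Hashing-trick term frequencies: each whitespace-separated token
    // increments the bucket selected by its hash modulo numFeatures.
    // Assumption: String.hashCode stands in for Spark's hash function.
    static double[] hashingTf(String text, int numFeatures) {
        double[] tf = new double[numFeatures];
        for (String token : text.toLowerCase().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // skip the empty token produced by blank input
            }
            int bucket = Math.floorMod(token.hashCode(), numFeatures);
            tf[bucket] += 1.0;
        }
        return tf;
    }

    // Concatenate the per-column vectors, in order, into one feature vector.
    static double[] assemble(double[]... parts) {
        int total = 0;
        for (double[] p : parts) {
            total += p.length;
        }
        double[] out = new double[total];
        int offset = 0;
        for (double[] p : parts) {
            System.arraycopy(p, 0, out, offset, p.length);
            offset += p.length;
        }
        return out;
    }

    public static void main(String[] args) {
        int numFeatures = 20;
        // Made-up sample row: Main_user = "alice", Users = "bob carol bob".
        double[] mainUser = hashingTf("alice", numFeatures);
        double[] users = hashingTf("bob carol bob", numFeatures);
        double[] features = assemble(mainUser, users);
        System.out.println(Arrays.toString(features));
    }
}
```

With numFeatures = 20 per column, the combined vector has 40 entries: the first 20 come from "Main_user" and the last 20 from "Users", so the two columns never share a bucket.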