I'm trying to read a plain file from HDFS in a class that I run through spark-submit.
I have a method that performs some String operations, and I create an RDD from its String output.
These are the String operations I perform before creating the RDD.
Should I use a StringBuilder or a StringBuffer for the variable valueString?
while ((line = bf.readLine()) != null) {
    String trimmedLine = line.trim();
    if (trimmedLine.charAt((trimmedLine.length() - 1)) == ';') {
        if (extractionInProgress) {
            valueString = valueString.concat(trimmedLine.substring(0, trimmedLine.indexOf(";")));
            keyValues.put(searchKey, valueString);
            extractionInProgress = false;
            valueString = "";
        }
        else {
            int indexOfTab = trimmedLine.indexOf(" ");
            if (indexOfTab > -1) {
                String keyInLine = trimmedLine.substring(0, indexOfTab);
                valueString = trimmedLine.substring(indexOfTab + 1, trimmedLine.indexOf(";"));
                keyValues.put(keyInLine, valueString);
                valueString = "";
            }
        }
    }
    else {
        if (!extractionInProgress) {
            searchKey = trimmedLine;
            extractionInProgress = true;
        }
        else {
            valueString = valueString.concat(trimmedLine.concat("\n"));
        }
    }
}
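
For comparison, here is a minimal sketch of the same loop with valueString held in a StringBuilder. It assumes valueString is a method-local variable touched by a single thread, in which case the synchronization that StringBuffer adds on every call buys nothing. The class name KeyValueExtractor, the parse method, and the sample input are hypothetical and only there to make the sketch runnable. Two small deviations from the loop above are also assumptions: values are cut at the trailing semicolon rather than the first one, and endsWith replaces charAt so that a blank line does not throw.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

class KeyValueExtractor {

    // Hypothetical helper: same control flow as the loop in the question,
    // but multi-line values accumulate in a method-local StringBuilder.
    public static Map<String, String> parse(BufferedReader reader) throws IOException {
        Map<String, String> keyValues = new LinkedHashMap<>();
        StringBuilder valueString = new StringBuilder();
        String searchKey = null;
        boolean extractionInProgress = false;

        String line;
        while ((line = reader.readLine()) != null) {
            String trimmedLine = line.trim();
            // endsWith is false for an empty string, so blank lines do not throw.
            if (trimmedLine.endsWith(";")) {
                // Assumption: strip only the trailing semicolon.
                int end = trimmedLine.length() - 1;
                if (extractionInProgress) {
                    // Last line of a multi-line value: append it and store the result.
                    valueString.append(trimmedLine, 0, end);
                    keyValues.put(searchKey, valueString.toString());
                    extractionInProgress = false;
                    valueString.setLength(0); // reuse the same buffer for the next value
                } else {
                    // Single-line "key value;" entry: no buffer needed.
                    int indexOfSeparator = trimmedLine.indexOf(' ');
                    if (indexOfSeparator > -1) {
                        keyValues.put(trimmedLine.substring(0, indexOfSeparator),
                                      trimmedLine.substring(indexOfSeparator + 1, end));
                    }
                }
            } else if (!extractionInProgress) {
                // A line without a semicolon starts a multi-line value; it is the key.
                searchKey = trimmedLine;
                extractionInProgress = true;
            } else {
                // Intermediate line of a multi-line value.
                valueString.append(trimmedLine).append('\n');
            }
        }
        return keyValues;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample input: one single-line entry and one multi-line entry.
        String sample = "name alice;\nquery\nselect *\nfrom t;\n";
        Map<String, String> result = parse(new BufferedReader(new StringReader(sample)));
        System.out.println(result);
    }
}
```

The append calls modify one buffer in place, whereas each concat in the original allocates a new String. If valueString were instead a field shared between threads, StringBuffer (or external synchronization) would be the safer choice.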