I am a C programmer who recently moved to Java. I am trying to convert a C program into a Java program. The C program simply calculates term frequency and inverse document frequency (TF/IDF).
I created one data class:
public class Data {
    private String fileName, fileText;
    private int fileId;
    private float value;

    public void addData(String fileName, String fileText, float value) {
        this.fileName = fileName;
        this.fileText = fileText;
        this.value = value;
    }

    public int getFileId() {
        return this.fileId;
    }

    public String getFileName() {
        return this.fileName;
    }

    public String getFileText() {
        return this.fileText;
    }

    public float getValue() {
        return this.value;
    }
}
This class stores the file name, the file text, and a value (the TF value or the IDF value).
The following class puts the data into a map:
import java.util.HashMap;

public class main {
    public static void main(String[] args) {
        HashMap<String, Data> map = new HashMap<String, Data>();
        Data dt = new Data();
        dt.addData("abc.txt", "some contents", 2);
        map.put("1", dt);
        dt.addData("w", "some more contents in second file", 3);
        map.put("2", dt);
        System.out.println(map);
    }
}
When I print the map, I get some strange values. Do I have to declare an array of Data objects? I do not know in advance how many files there are, so I cannot use a fixed array size.
Also, how can I calculate TF and IDF based on this data structure?
In the C program, I simply read the files and count the words. To get TF, I divide a word's count by the total number of words in the file. To get IDF, I divide a word's count by the total number of occurrences of that word in all files. I do not know how to do this with the data structure above.
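To make the calculation concrete, here is a rough sketch of how the counting could look in Java. This is not the original C logic. The class name TfIdfSketch and both method names are made up for illustration, words are split on whitespace only, and the IDF part uses the common textbook definition log(N / df), which is an assumption and differs from the plain division described above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of the code in the question.
public class TfIdfSketch {

    // TF: occurrences of each word divided by the total word count of one document.
    public static Map<String, Double> termFrequency(String text) {
        Map<String, Double> tf = new HashMap<>();
        String cleaned = text.trim().toLowerCase();
        if (cleaned.isEmpty()) {
            return tf; // no words, no frequencies
        }
        String[] words = cleaned.split("\\s+");
        for (String w : words) {
            tf.merge(w, 1.0, Double::sum); // raw count first
        }
        for (Map.Entry<String, Double> e : tf.entrySet()) {
            e.setValue(e.getValue() / words.length); // count -> frequency
        }
        return tf;
    }

    // IDF (assumed definition): log(number of documents / documents containing the word).
    public static Map<String, Double> inverseDocumentFrequency(List<String> documents) {
        Map<String, Integer> docCount = new HashMap<>();
        for (String doc : documents) {
            // keySet() yields each distinct word of the document exactly once
            for (String w : termFrequency(doc).keySet()) {
                docCount.merge(w, 1, Integer::sum);
            }
        }
        Map<String, Double> idf = new HashMap<>();
        for (Map.Entry<String, Integer> e : docCount.entrySet()) {
            idf.put(e.getKey(), Math.log((double) documents.size() / e.getValue()));
        }
        return idf;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "some contents",
                "some more contents in second file");
        System.out.println(termFrequency(docs.get(0)));
        System.out.println(inverseDocumentFrequency(docs));
    }
}
```

Each per-file TF map could then be stored next to the file name, for example in a Map keyed by file name, instead of a single float value per file.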
This is the output I get. Maybe these are objects:
{2=test2.Data@19821f, 1=test2.Data@19821f}
Is there any way to get a specific value from the Data class using getFileName() and the other getter functions?
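For illustration, here is a minimal sketch of what reading single values back out of the map might look like. It assumes a trimmed-down copy of Data with a constructor and a toString() override. MapAccessSketch and buildMap are hypothetical names, not part of the code above. The two points it tries to show: each map entry gets its own object (reusing one instance makes both keys refer to the same data), and println shows ClassName@hashCode unless toString() is overridden.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; Data here is a simplified stand-in for the class in the question.
public class MapAccessSketch {

    public static class Data {
        private final String fileName;
        private final String fileText;
        private final float value;

        public Data(String fileName, String fileText, float value) {
            this.fileName = fileName;
            this.fileText = fileText;
            this.value = value;
        }

        public String getFileName() { return fileName; }
        public String getFileText() { return fileText; }
        public float getValue() { return value; }

        // Without this override, printing falls back to Object.toString().
        @Override
        public String toString() {
            return fileName + " (value=" + value + ")";
        }
    }

    public static Map<String, Data> buildMap() {
        Map<String, Data> map = new HashMap<>();
        // One new object per entry, so the entries do not share state.
        map.put("1", new Data("abc.txt", "some contents", 2));
        map.put("2", new Data("w", "some more contents in second file", 3));
        return map;
    }

    public static void main(String[] args) {
        Map<String, Data> map = buildMap();

        // Look up one entry by key, then call a getter on it.
        System.out.println(map.get("1").getFileName()); // prints abc.txt

        // Walk over all entries; the map grows as needed, so no fixed size is required.
        for (Map.Entry<String, Data> e : map.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

A HashMap (or an ArrayList) resizes itself as entries are added, so the number of files does not need to be known up front.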