I want to create a nested HashMap that stores the frequency of terms across multiple files, like this:

Map<String, Map<String, Integer>> wordToDocumentMap=new HashMap<>();

I have been able to return the number of times a term appears in a file.

    // Suppose str holds the content of a file a.java
    String str = "Wikipedia is a free online encyclopedia, created and edited by "
            + "volunteers around the world.";

    // The query string
    String query = "edited Wikipedia volunteers";

    // Split the given string and the query string on space
    String[] strArr = str.split("\\s+");
    String[] queryArr = query.split("\\s+");

    // Map to hold the frequency of each word of query in the string
    Map<String, Integer> map = new HashMap<>();

    for (String q : queryArr) {
        for (String s : strArr) {
            if (q.equals(s)) {
                map.put(q, map.getOrDefault(q, 0) + 1);
            }
        }
    }

    // Display the map
    System.out.println(map);

My code counts the frequency of each query term individually, but I want to map each query term and its frequency to the name of the file it came from. I have searched the web but could not find a solution that applies to my case. Any help would be appreciated!

  • What is the second hashmap supposed to contain? – beastlyCoder Aug 19 '20 at 20:21
  • @beastlyCoder The second HashMap counts the terms with their frequency: **Map<String, Integer> map = new HashMap<>(); // for frequency count** – Sanzida Sultana Aug 19 '20 at 20:24
  • `wordToDocumentMap.put(fileName, map);` – Louis Wasserman Aug 19 '20 at 20:34
  • so you want the word, and then how many times it shows up in the file – beastlyCoder Aug 19 '20 at 20:38
  • @beastlyCoder I want each query word individually with its filename. The query has 3 words: **edited Wikipedia volunteers**. I have been able to count each query word individually, like {edited=3, Wikipedia=2, volunteers=1}, and now I want to get the filename with its frequency. – Sanzida Sultana Aug 19 '20 at 20:46
  • @LouisWasserman I tried it. but cannot get my expected result when I print. – Sanzida Sultana Aug 19 '20 at 20:47
  • @Sanzida your question is just not clear. You have a Map<String, Map<String, Integer>>. Presumably, that 'Integer' is some sort of count, one of the Strings is a filename, and the other String is a word? Which one is which? Do you want to map filenames to a mapping of word frequencies, or do you want to map a word to a mapping of 'this file contains that word this many times'? – rzwitserloot Aug 19 '20 at 21:22
  • @rzwitserloot sorry for this. I want to map filenames to a mapping of words frequency.like **>**. – Sanzida Sultana Aug 19 '20 at 21:26
  • What is `queryArr` and `strArr`? Why should `q` be equal to `s`? What is the point of `s`? – Louis Wasserman Aug 19 '20 at 21:29
  • @LouisWasserman In my code I splited my query string in whitespace.and my expected output like :{filename: a.java , } then :{filename: a.java , } ,and then :{filename: a.java , } in nested map format . but I cannot do mapping with filename – Sanzida Sultana Aug 19 '20 at 21:31
  • @LouisWasserman in my query there are 3 words, I searched the three words individually from a String line .queryarray is for searching the every single word from a string line.the output I get from the program {edited=2, wikipedia=1,volunteers=4}. and here **String str = "Wikipedia is a free online encyclopedia, created and edited by volunteers around the world."** is supposed a file a.java – Sanzida Sultana Aug 19 '20 at 21:41

1 Answer

I hope I'm understanding you correctly.

What you want is to be able to read in a list of files and map the file name to the map you create in the code above. So let's start with your code and let's turn it into a function:

// Needs: import java.util.HashMap; import java.util.Map;
public Map<String, Integer> createFreqMap(String str, String query) {

    // Split the given string and the query string on whitespace
    String[] strArr = str.split("\\s+");
    String[] queryArr = query.split("\\s+");

    // Map to hold the frequency of each word of query in the string
    Map<String, Integer> map = new HashMap<>();

    for (String q : queryArr) {
        for (String s : strArr) {
            if (q.equals(s)) {
                map.put(q, map.getOrDefault(q, 0) + 1);
            }
        }
    }

    // Display the map
    System.out.println(map);
    return map;
}

OK, so now you have a nifty function that makes a frequency map from a string and a query.
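As a quick check, the function can be called on the sample sentence from the question. This is only a sketch: the class name `FreqMapDemo` is made up for illustration, and the method is declared `static` here so that it can be called from `main`. It also shows one limitation worth knowing about: splitting on whitespace keeps punctuation attached, so the token `world.` does not match the query word `world`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper class, used only for this illustration
class FreqMapDemo {

    // Same counting logic as the function above, made static for the demo
    static Map<String, Integer> createFreqMap(String str, String query) {
        String[] strArr = str.split("\\s+");
        String[] queryArr = query.split("\\s+");
        Map<String, Integer> map = new HashMap<>();
        for (String q : queryArr) {
            for (String s : strArr) {
                if (q.equals(s)) {
                    map.put(q, map.getOrDefault(q, 0) + 1);
                }
            }
        }
        return map;
    }

    public static void main(String[] args) {
        String str = "Wikipedia is a free online encyclopedia, created and edited by "
                + "volunteers around the world.";
        Map<String, Integer> freq = createFreqMap(str, "edited Wikipedia volunteers");

        // Each query word occurs exactly once in the sample sentence
        System.out.println("Wikipedia: " + freq.get("Wikipedia"));
        System.out.println("edited: " + freq.get("edited"));
        System.out.println("volunteers: " + freq.get("volunteers"));

        // The last token is "world." (with the period), so "world" is never counted
        System.out.println(createFreqMap(str, "world").containsKey("world"));
    }
}
```

Words that never occur are simply absent from the map, so `getOrDefault(word, 0)` is a convenient way to read a count.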

Now you're going to want to set up a system for reading in a file to a string.

There are a bunch of ways to do this. You can look here for some ways that work for different java versions: https://stackoverflow.com/a/326440/9789673

Let's go with this (assuming Java 11 or later):

String content = Files.readString(path, StandardCharsets.US_ASCII);

Where path is the path to the file you want.
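For reference, here is a minimal sketch of that step on its own. The class name `ReadFileDemo` and the helper `readFile` are hypothetical names chosen for this example, and UTF-8 is an assumption; the charset should match the file's real encoding, because `Files.readString` throws an `IOException` when the bytes cannot be decoded in the given charset. The demo writes a temporary file first so that it can run anywhere.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical wrapper class, used only for this illustration
class ReadFileDemo {

    // Reads a whole file into a String; UTF-8 is an assumption of this sketch
    static String readFile(String fileName) throws IOException {
        Path path = Path.of(fileName); // converts the file name into a Path
        return Files.readString(path, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Create a throwaway file so the example is self-contained
        Path tmp = Files.createTempFile("sample", ".txt");
        Files.writeString(tmp, "edited by volunteers", StandardCharsets.UTF_8);

        System.out.println(readFile(tmp.toString()));

        Files.delete(tmp);
    }
}
```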

Now we can put it all together:

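// Needs: import java.nio.charset.StandardCharsets; import java.nio.file.Files;
//        import java.nio.file.Path; import java.util.HashMap; import java.util.Map;
String[] paths = {"this.txt", "that.txt"};
Map<String, Map<String, Integer>> output = new HashMap<>();
String query = "edited Wikipedia volunteers"; //String query = "hello";
for (int i = 0; i < paths.length; i++) {
    // Files.readString takes a Path and throws IOException,
    // so the enclosing method must catch it or declare it
    String content = Files.readString(Path.of(paths[i]), StandardCharsets.US_ASCII);
    output.put(paths[i], createFreqMap(content, query));
}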
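Put together as one runnable program, the idea might look like the sketch below. The class name `NestedFreqDemo`, the helper `buildFileMap`, the temporary files `a.txt` and `b.txt`, and the UTF-8 charset are all assumptions made for illustration. The outer map is keyed by file name and the inner map by query word.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical wrapper class, used only for this illustration
class NestedFreqDemo {

    // Counts how often each query word occurs in str (whitespace-separated tokens)
    static Map<String, Integer> createFreqMap(String str, String query) {
        Map<String, Integer> map = new HashMap<>();
        for (String q : query.split("\\s+")) {
            for (String s : str.split("\\s+")) {
                if (q.equals(s)) {
                    map.put(q, map.getOrDefault(q, 0) + 1);
                }
            }
        }
        return map;
    }

    // Builds file name -> (query word -> count) for every given file
    static Map<String, Map<String, Integer>> buildFileMap(List<Path> paths, String query)
            throws IOException {
        Map<String, Map<String, Integer>> output = new HashMap<>();
        for (Path path : paths) {
            String content = Files.readString(path, StandardCharsets.UTF_8);
            output.put(path.getFileName().toString(), createFreqMap(content, query));
        }
        return output;
    }

    public static void main(String[] args) throws IOException {
        // Two small throwaway files so the example is self-contained
        Path dir = Files.createTempDirectory("freq");
        Path a = Files.writeString(dir.resolve("a.txt"), "Wikipedia is edited by volunteers");
        Path b = Files.writeString(dir.resolve("b.txt"), "edited and edited again");

        Map<String, Map<String, Integer>> result =
                buildFileMap(List.of(a, b), "edited Wikipedia volunteers");

        // Print one line per file; HashMap does not guarantee any order
        for (Map.Entry<String, Map<String, Integer>> entry : result.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }

        // Look up a single count: the word "edited" in the file b.txt
        System.out.println(result.get("b.txt").getOrDefault("edited", 0)); // prints 2
    }
}
```

Keying the outer map by the bare file name is a simplification: two files with the same name in different directories would overwrite each other, in which case the full path string is the safer key.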
randypaq13
  • here is my query there are three words **edited Wikipedia volunteers** . and I count its frequency individually.if my file name is a.java how can I map those three words with the filename a.java? I am a beginner.can u pls tell me the step by step procedure? thanks in advance – Sanzida Sultana Aug 19 '20 at 20:59
  • I'm not sure what you mean. You can set the variable query to anything you want. So instead of String query = "hello"; it can be String query = "edited Wikipedia volunteers"; – randypaq13 Aug 19 '20 at 21:04
  • thanks for your feedback.sorry for the last comment. I can understand your code now correctly. – Sanzida Sultana Aug 20 '20 at 17:24