Read and write multiple three-character Unicode words from a file

Question

The file read.txt contains multiple Unicode words, some of them with surrogate pairs. I am using k=3 in the for loop to pick out the three-letter words from read.txt. I am having trouble counting the surrogate pairs, i.e. getting a character count that also accounts for diacritic marks, and I would like some ideas on how to count them. When I run the code below, I get this error:

java.lang.ArrayIndexOutOfBoundsException: 13
    at pp.main(pp.java:36)
BUILD SUCCESSFUL (total time: 1 second)

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class pp {

    public static void main(String[] args) throws IOException {
        FileReader fr = null;
        BufferedReader br = null;
        FileWriter fw = null;
        BufferedWriter bw = null;

        String[] stringArray;
        int arrayLength;
        String s = "";
        String stringLine = "";
        try {
            fr = new FileReader("D:\\OFFICE\\read.txt");
            fw = new FileWriter("D:\\OFFICE\\write.txt");
            br = new BufferedReader(fr);
            bw = new BufferedWriter(fw);
            while ((s = br.readLine()) != null) {
                stringLine = stringLine + s;
                stringLine = stringLine + " ";
            }
            stringArray = stringLine.split(" ");
            arrayLength = stringLine.codePointCount(0, stringLine.length());
            for (int i = 0; i < arrayLength; i++) {
                // for getting all 3 char unicode lines
                for (int k = 3; k > stringArray[i].length(); i++) {
                    System.out.println(stringArray[i]);
                    bw.newLine();
                }
                int n = stringLine.offsetByCodePoints(0, i);
                arrayLength = stringLine.codePointAt(n);
            }
            fr.close();
            br.close();
            bw.flush();
            bw.close();
        } catch (ArrayIndexOutOfBoundsException e) {
            e.printStackTrace();
        }
    }
}
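
The inner loop above compares against `stringArray[i].length()`, which counts UTF-16 code units, so a character stored as a surrogate pair counts twice. The following is a minimal sketch of two other ways to count, not a definitive answer: `codePointCount` counts each surrogate pair once, and `java.text.BreakIterator` is one possible way to count a base letter plus its diacritic marks as a single character. The class name `CountSketch` and the helper names `codePoints` and `graphemes` are hypothetical, chosen only for illustration.

```java
import java.text.BreakIterator;

public class CountSketch {

    // Number of Unicode code points; a surrogate pair counts once.
    static int codePoints(String word) {
        return word.codePointCount(0, word.length());
    }

    // Number of user-perceived characters as segmented by BreakIterator;
    // a base letter followed by combining marks is normally one segment.
    static int graphemes(String word) {
        BreakIterator it = BreakIterator.getCharacterInstance();
        it.setText(word);
        int count = 0;
        while (it.next() != BreakIterator.DONE) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String pair = "\uD835\uDD38";  // U+1D538, stored as a surrogate pair
        String marked = "e\u0301";     // 'e' followed by a combining acute accent

        // length() sees two code units, codePoints() sees one code point.
        System.out.println(pair.length() + " vs " + codePoints(pair));
        // Two code points, but BreakIterator treats them as one character.
        System.out.println(codePoints(marked) + " vs " + graphemes(marked));
    }
}
```

Which count is the right one depends on what "three characters" should mean for the words in read.txt: three code points, or three letters regardless of how many diacritic marks are attached.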

Besides the [obvious](http://stackoverflow.com/questions/5554734/what-causes-a-java-lang-arrayindexoutofboundsexception-and-how-do-i-prevent-it), do you realize that you're using the platform encoding to read the file? Have you ensured that the data is read correctly before you attempt to do things with it? — Kayaman, Oct 21 '16 at 07:25
I tried this code with English files and it gives the correct output. The problem only appears when I use Unicode files. — surya, Oct 21 '16 at 08:02
Well, what's your platform default encoding and what encoding are the files? — Kayaman, Oct 21 '16 at 08:08
The default encoding is UTF-8, and the files are also in UTF-8. — surya, Oct 21 '16 at 08:11
I saw that. I didn't use any <= symbols. Moreover, I do have three-character words inside the read.txt file. I don't know why it arises here. — surya, Oct 21 '16 at 08:20
It's not just `<=` symbols. You have the exception and you have the line where it occurs. Now it's up to you to debug and fix it. — Kayaman, Oct 21 '16 at 08:39
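
Pulling the comments together, here is one possible sketch, not a confirmed fix. In the question's code the outer loop is bounded by the code point count of the whole text rather than by the number of words in `stringArray`, and the inner `for` increments `i` instead of `k`, so `i` can run past the end of the array. The sketch below assumes the intent is to write every word of exactly three code points to the output file. It iterates over the words themselves and names UTF-8 explicitly instead of relying on the platform encoding. The class name `ThreeCodePointWords` and the helpers `wordsOfSize` and `copyWords` are hypothetical.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class ThreeCodePointWords {

    // Keeps the whitespace-separated words of one line that have exactly
    // `size` code points. The loop is bounded by the number of words,
    // not by the character count of the text.
    static List<String> wordsOfSize(String line, int size) {
        List<String> result = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (word.codePointCount(0, word.length()) == size) {
                result.add(word);
            }
        }
        return result;
    }

    // Reads `in` and writes `out` as UTF-8, one matching word per line.
    static void copyWords(Path in, Path out, int size) throws IOException {
        try (BufferedReader br = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter bw = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = br.readLine()) != null) {
                for (String word : wordsOfSize(line, size)) {
                    bw.write(word);
                    bw.newLine();
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ThreeCodePointWords <input> <output>");
            return;
        }
        copyWords(Paths.get(args[0]), Paths.get(args[1]), 3);
    }
}
```

With the paths from the question, this would be run as `java ThreeCodePointWords D:\OFFICE\read.txt D:\OFFICE\write.txt`. Note that combining diacritic marks are separate code points, so a word made of three letters plus a combining mark has four code points and is skipped by this sketch.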
