The file read.txt contains multiple Unicode words, some of which include surrogate pairs. I am using k=3 in the for loop to pick out the three-letter words from read.txt. I am having trouble counting the surrogate pairs, i.e. getting character counts that also account for diacritic marks. I would like some ideas on how to count the surrogate pairs. When I run the code below, I get this error:

java.lang.ArrayIndexOutOfBoundsException: 13
    at pp.main(pp.java:36)
BUILD SUCCESSFUL (total time: 1 second)

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class pp {

    public static void main(String[] args) throws IOException {
        FileReader fr = null;
        BufferedReader br = null;
        FileWriter fw = null;
        BufferedWriter bw = null;

        String[] stringArray;
        int arrayLength;
        String s = "";
        String stringLine = "";
        try {
            fr = new FileReader("D:\\OFFICE\\read.txt");
            fw = new FileWriter("D:\\OFFICE\\write.txt");
            br = new BufferedReader(fr);
            bw = new BufferedWriter(fw);
            while ((s = br.readLine()) != null)
            {
                stringLine = stringLine + s;
                stringLine = stringLine + " ";
            }
            stringArray = stringLine.split(" ");
            arrayLength = stringLine.codePointCount(0, stringLine.length());
            for (int i = 0; i < arrayLength; i++)
            {
                for (int k = 3; k > stringArray[i].length(); i++)
                // for getting all 3 char unicode lines
                {
                    System.out.println(stringArray[i]);
                    bw.newLine();
                }
                int n = stringLine.offsetByCodePoints(0, i);
                arrayLength = stringLine.codePointAt(n);
            }
            fr.close();
            br.close();
            bw.flush();
            bw.close();
        } catch (ArrayIndexOutOfBoundsException e) {
            e.printStackTrace();
        }
    }
}
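
For reference, here is a minimal sketch of one way the counting could be done per word rather than over the whole text. It relies on two standard facts: `String.length()` counts UTF-16 `char` units, while `String.codePointCount` counts a valid surrogate pair once, so the difference between the two is the number of surrogate pairs. Combining diacritic marks are code points of their own, so if a base letter plus its mark should count as one character, `java.text.BreakIterator` may be closer to what is wanted. The class name `CodePointCountSketch` and its helper methods are hypothetical, not part of the code above.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
public class CodePointCountSketch {

    // Number of Unicode code points; a valid surrogate pair counts once.
    static int codePoints(String word) {
        return word.codePointCount(0, word.length());
    }

    // Each valid surrogate pair uses two chars for one code point,
    // so the difference is the number of pairs.
    static int surrogatePairs(String word) {
        return word.length() - codePoints(word);
    }

    // User-perceived characters: a base letter followed by combining
    // marks is treated as a single unit by the character BreakIterator.
    static int graphemes(String word) {
        BreakIterator it = BreakIterator.getCharacterInstance();
        it.setText(word);
        int count = 0;
        while (it.next() != BreakIterator.DONE) {
            count++;
        }
        return count;
    }

    // Words whose length, measured in code points, is exactly k.
    static List<String> wordsWithCodePoints(String text, int k) {
        List<String> result = new ArrayList<>();
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty() && codePoints(word) == k) {
                result.add(word);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // U+1D11E (musical G clef) needs a surrogate pair in UTF-16.
        String word = "a\uD834\uDD1Eb";
        System.out.println(word.length());         // 4 chars
        System.out.println(codePoints(word));      // 3 code points
        System.out.println(surrogatePairs(word));  // 1 pair
        // "e" followed by U+0301 (combining acute): 2 code points, no pairs.
        System.out.println(codePoints("e\u0301")); // 2
    }
}
```

Whether a "three-letter word" should mean three code points or three user-perceived characters depends on the script in read.txt; the sketch shows both counts so that either can be used as the filter.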
DanielBarbarian
surya
  • Besides the [obvious](http://stackoverflow.com/questions/5554734/what-causes-a-java-lang-arrayindexoutofboundsexception-and-how-do-i-prevent-it), do you realize that you're using the platform encoding to read the file? Have you ensured that the data is read correctly before you attempt to do things with it? – Kayaman Oct 21 '16 at 07:25
  • I tried this code with English files and it gives the correct output. The problem only appears when using Unicode files. – surya Oct 21 '16 at 08:02
  • Well, what's your platform default encoding and what encoding are the files? – Kayaman Oct 21 '16 at 08:08
  • The default encoding is UTF-8, and the files are also in UTF-8. – surya Oct 21 '16 at 08:11
  • What about the exception? Did you look at what I linked? – Kayaman Oct 21 '16 at 08:13
  • I saw that. I didn't use any `<=` symbols. Moreover, I do have three-character words inside the read.txt file. I don't know why the exception arises here. – surya Oct 21 '16 at 08:20
  • It's not just `<=` symbols. You have the exception and you have the line where it occurs. Now it's up to you to debug and fix it. – Kayaman Oct 21 '16 at 08:39
  • OK, thanks for your time. – surya Oct 21 '16 at 08:43
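
The two points raised in the comments can be sketched together. In the question's code the outer loop is bounded by the code point count of the whole text rather than by `stringArray.length`, and the inner loop increments `i` instead of `k`, so `stringArray[i]` can run past the end of the array. The sketch below, which is only one possible approach, reads the file with an explicit UTF-8 charset instead of the platform default and iterates over the words directly, so no index can go out of range. The class name `ReadWordsSketch` and the method `readWords` are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
public class ReadWordsSketch {

    // Reads the file as UTF-8 and returns its whitespace-separated words.
    static List<String> readWords(Path file) throws IOException {
        List<String> words = new ArrayList<>();
        // try-with-resources closes the reader even if an exception occurs.
        try (BufferedReader reader =
                 Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                for (String word : line.trim().split("\\s+")) {
                    if (!word.isEmpty()) {
                        words.add(word);
                    }
                }
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java ReadWordsSketch <path-to-text-file>");
            return;
        }
        // The for-each loop is bounded by the number of words,
        // not by the number of code points in the text.
        for (String word : readWords(Paths.get(args[0]))) {
            if (word.codePointCount(0, word.length()) == 3) {
                System.out.println(word);
            }
        }
    }
}
```

Running it as `java ReadWordsSketch D:\OFFICE\read.txt` would print the words that are exactly three code points long, assuming the file really is UTF-8 as stated in the comments.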

0 Answers