-1

I am trying to read the first word of each line from my 100k+ word dictionary. I am new to Java, so bear with me :D

The dictionary looks like this:

naklestite  naklestiti  Ggdvdm  0
naključiti  naključiti  Ggvn    1
naključit   naključiti  Ggvm    0
naključil   naključiti  Ggvd-em 0

I need to copy the first word of every line into a new .txt file, so that the output looks like this:

naklestite  
naključiti
naključit
naključil

So far I am getting whole lines as output instead of just the first words.

package test;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

public class moja {

    public static void main(String[] args) {
        try {
            File file = new File("SloveneLexicon.txt");
            FileReader fileReader = new FileReader(file);
            BufferedReader bufferedReader = new BufferedReader(fileReader);
            StringBuffer stringBuffer = new StringBuffer();
            String word;
            while ((word = bufferedReader.readLine()) != null) {

                String s = word;
                String[] fragments = s.split(" ");
                String firstColumn = fragments[0];
                System.out.println(firstColumn);
            }
            bufferedReader.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Ahmad Y. Saleh
Rok Ivartnik

2 Answers

0

Alright, we have the base of "Find PID of process that use a port on Windows", and you now have a first draft of the code.

If the code above really outputs an entire line of your document rather than its "first" column, the only reason I can imagine is that the gaps between your words are not ordinary blanks like the ones the space bar of a keyboard produces (but maybe an "invisible" character or something like that).

So does your file look like this:

naklestite naklestiti Ggdvdm 0
naključiti naključiti Ggvn 1

Which I would describe as:

<WordVariableLength><Blank><WordVariableLength><Blank><WordVariableLength><Blank><Number><EOL>

or more like this?

naklestitenaklestitiGgdvdm0naključitinaključitiGgvn1

Also, when you process the lines of your input file and print each one followed by a marker:

System.out.println(word);
System.out.println("check");

Does it produce output like this?

naklestite naklestiti Ggdvdm 0
check
naključiti naključiti Ggvn 1
check
...

As long as you can't even "select a column" of your raw input, I see little chance of helping you any further :(
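One way to check for such "invisible" separators is to print a line with every whitespace character made visible. The following is only a minimal sketch: the class name `WhitespaceProbe`, the helper `reveal`, and the sample line (which assumes tab separators) are made up for illustration and are not part of the original code.

```java
class WhitespaceProbe {

    // Returns the line with each whitespace character replaced by a visible tag.
    static String reveal(String line) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == ' ') {
                sb.append("<SP>");
            } else if (c == '\t') {
                sb.append("<TAB>");
            } else if (Character.isWhitespace(c) || Character.isSpaceChar(c)) {
                // Any other blank-looking character, e.g. a non-breaking space
                sb.append(String.format("<U+%04X>", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample line; in practice pass in a line read from the file.
        System.out.println(reveal("naklestite\tnaklestiti\tGgdvdm\t0"));
        // prints: naklestite<TAB>naklestiti<TAB>Ggdvdm<TAB>0
    }
}
```

If the real file shows `<TAB>` (or anything other than `<SP>`) between the columns, that would explain why splitting on a single space returns the whole line.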

Community
JBA
  • Actually it looks like this: something somethings daijf 20310, if that makes any difference? So more than one space apart? As I said, I am new to this. – Rok Ivartnik Oct 15 '14 at 10:06
  • Yes, it does matter whether one or two spaces separate the "columns". However, the code above should not print the entire line of that file unless it contains formatting other than what I expect from your description. – JBA Oct 15 '14 at 10:10
  • You may send the file to any service like trashmail.de and tell me the address so I can get the file (or upload it somewhere, whatever works), and I will have a look at it (after lunch). – JBA Oct 15 '14 at 10:12
  • http://sourceforge.net/projects/obeliks/files/Resources/SloveneLexicon.txt.zip/download Going to work now, so I will reply later or tomorrow. Thanks for the help. – Rok Ivartnik Oct 15 '14 at 10:19
  • What is the link with the "Pid process on Windows"? – Thomas Oct 02 '15 at 12:10
0

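Here is the problem:

String[] fragments = s.split(" ");

You are splitting on " ", a single space character, but the columns in your file are not separated by exactly one space. You get the entire line back because split(" ") finds no space character to split on, which suggests the gaps are tabs or some other whitespace rather than ordinary spaces.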

naklestite  naklestiti  Ggdvdm  0
naključiti  naključiti  Ggvn    1
naključit   naključiti  Ggvm    0
naključil   naključiti  Ggvd-em 0

The gap between the words looks like two spaces in most places, but elsewhere it is one or three, so there is no guarantee about its width. What you want is to split on whitespace rather than on a single space: the regular expression \s+ matches one or more consecutive whitespace characters, including both spaces and tabs.
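To see the difference between the two patterns, here is a small sketch. The class `SplitDemo`, the helper `first`, and the two sample lines are hypothetical; one sample uses runs of spaces and the other assumes tab separators.

```java
class SplitDemo {

    // Returns the first fragment produced by splitting the line on the given regex.
    static String first(String line, String regex) {
        return line.split(regex)[0];
    }

    public static void main(String[] args) {
        String spaces = "naklestite  naklestiti  Ggdvdm  0";   // two spaces between columns
        String tabs = "naklestite\tnaklestiti\tGgdvdm\t0";     // assumed tab separators

        // With real spaces, even split(" ") puts the first word at index 0.
        System.out.println(first(spaces, " "));     // naklestite

        // With tabs there is no space to split on, so index 0 is the whole line.
        System.out.println(first(tabs, " ").equals(tabs));   // true

        // \s+ matches any run of whitespace, so it works for both inputs.
        System.out.println(first(spaces, "\\s+"));  // naklestite
        System.out.println(first(tabs, "\\s+"));    // naklestite
    }
}
```

Note that the backslash has to be doubled inside a Java string literal, which is why the pattern is written as "\\s+" in code.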

So you have to change this line

String[] fragments = s.split(" "); 

to

String[] fragments = s.split("\\s+");

With this change you will get the correct output:

naklestite
naključiti
naključit
naključil
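Since the question also asks for the first words to be copied into a new .txt file, here is one possible sketch of the whole program. Several things in it are assumptions rather than facts from the question: the class name `FirstColumnExtractor`, the output file name `FirstWords.txt`, and the choice of UTF-8 for reading and writing (if the lexicon is stored in a different encoding, reading it as UTF-8 may fail or garble characters such as č, so adjust the charset accordingly).

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class FirstColumnExtractor {

    // Returns the first whitespace-delimited word of the line, or "" for a blank line.
    static String firstWord(String line) {
        String trimmed = line.trim();   // drop leading/trailing spaces and tabs
        if (trimmed.isEmpty()) {
            return "";
        }
        // A limit of 2 stops splitting after the first column.
        return trimmed.split("\\s+", 2)[0];
    }

    // Copies the first word of every non-blank line of 'in' to 'out', one per line.
    static void extract(Path in, Path out) throws IOException {
        // try-with-resources closes both files even if an exception is thrown
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = firstWord(line);
                if (!word.isEmpty()) {
                    writer.write(word);
                    writer.newLine();
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // "FirstWords.txt" is a made-up output name; the input name is from the question.
        extract(Paths.get("SloveneLexicon.txt"), Paths.get("FirstWords.txt"));
    }
}
```

Trimming before splitting avoids an empty first fragment when a line starts with whitespace, and blank lines are skipped instead of producing empty output lines.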
Madhawa Priyashantha