0

This is basically a dublicate of:

How to split a string into words. Ex: "stringintowords" -> "String Into Words"?

neverthelees, I have in my use a function like: public int Word(x) {code}, where for a string x,it will return an integer (+ve or -ve), and that integer will be an indication of how good or bad that partitioning is for the specific word. I should return the combination that gives the maximum number.

What I thought of doing for this is to create a table(i,j) , where i and j have length of the word, and fill out the table in tern like:

for i = 1 to n
   for j=i to n do 
      word(subset of x i to j)

and fill out the table, nevertheless, how on earth will I ever be able to retrieve the optimal solution (in a recursive way?)

any help appreciated.

EDIT: The optimal path is the one with the highest sum of word(x) function, i.e. if we have
a path(1,3)=10 , (3,6)=20, (6,7)=1 , and
a path (1,1)=0 , (2,5)=12, (5,7)=-1
then the sum of the 1st path > 2nd

EDIT2: I would like every one to know that this question has been answered by me after long hours of work , nevermind for not getting the solution, getting it yourself is always best i guess:P

cheers!:)

Community
  • 1
  • 1
sam
  • 15
  • 1
  • 6
  • I think, you need to get total string length and run a for loop for first letter 1st, first two letters as 2nd loop , first three letters for third loop..so on... by comparing with a database english dictionary...Until You reach the end of the loop or end of the string. – Karthik Ratnam Dec 04 '10 at 21:28
  • errr no because the i have to compute a Word(x) for every single partition(i think) so thetree is partitioned for the etree ,or t h e t r e e, or th et ree, or the tree , or thetree , every single entry in this table has to be filled out by calling the word() method. then i have to someway find the optimal path through that table.. EDIT: the optimal path is the one with the highest sum of word(x) function. i.e. if we have a path(1,3)=10 , (3,6)=20, (6,7)=1 , and a path (1,1)=0 , (2,5)=12, (5,7)=-1 then the sum of the 1st path > 2nd – sam Dec 04 '10 at 21:32

2 Answers2

0

Sam you are saying that:

you have a table filled with your position in the string, and booleans for each cell if there is a substring of length j .

example:

                    |       Positions of the example:
                    |            "xdeafcatmonologue"
 -------------------|-------------01234567890123476----------------
length of substring |                              
        1           |             11111111111111111     
        2           |             ------2----------
        3           |             -----3------3----
        4           |             -4--------------- 
        5           |             -----------------
        6           |             -----------------
        7           |             -----------------
        8           |             -----------------
        9           |             --------9--------
                    |              

      final         |             -4---3--9---3----

The dashes are technically zeros, and the numbers potentia and this is the table you will build.

You can build a much smaller table if you only store the maximum value for each position because your Word function tells you the maximum for each position. It doesn't matter if you iterate over substring size, your variable j, because Word does not use j.

Please correct me if I'm mistaken with any of these assumptions.

sova
  • 5,468
  • 10
  • 40
  • 48
  • well ok let me give you an example: – sam Dec 05 '10 at 01:12
  • aw actually word function does use J , it takes as a variable a word X , so using Word(x) means that the word x is a substring of our initial word A lets say.. soo the way we compute it is substring of A , i=1 , j=1, then j=2 etc.. – sam Dec 05 '10 at 01:24
  • Okay so if I understand what you mean it gives me a +1 for where I would have numbers in my table, and a -1 for where I have dashes or would have zeros? It is very difficult to understand what you are saying. – sova Dec 05 '10 at 05:22
0

ok let me give you an example:

word thetree


         1  2   3  4  5  6  7 ( ending index)

start 1  1  -1  2 -1 -1 -1 -1
      2  0  1  -1  -1 -1 -1 -1
      3  0  0   1  0  0  0  0
      4  0  0   0  1  0  0  4
      5  0  0   0  0  1  0  0
      6  0  0   0  0  0  1  0  
      7  0  0   0  0  0  0  1

well consider this solution, then if you store each colum individually then at colum number 3 you get 2, and at 7 you get 4, which gives a total of 6, nevertheless, if you take each letter seperately, then : 1+1+1+1+1+1+1= 7 (diagonally down i=j) which means that the talbe is incorrect...

this could work if i could only store the position of the previous entery that was best.(again i think)

sam
  • 15
  • 1
  • 6