
My code is in Java. I have a long text (at most 500 characters) and I want to split it into segments of 6 characters each. For example, given this text:

String fullText = "Syria officially known as the Syrian Arab Republic, is a country in Western Asia...";

and I want this result:

segment1: Syria

segment2: offici

segment3: ally k

segment n: ……

I tried a for loop, but I did not reach my goal, and I also get this error:

java.lang.StringIndexOutOfBoundsException: length=67; regionStart=65; regionLength=5

This is my code:

    String msg = fullText;

    for (int i = -1; i <= fullText.length() + 1; i++) {

        int len = msg.length();
        text = new StringBuilder().append(msgInfo).append(msg.substring(i, i + 6)).toString();

        msg = new StringBuilder().append(msg.substring(i + 5, len)).toString();

        LogHelper.d(TAG, "teeeeeeeeeeeeext:" + i + " .." + text);

    }

How can I do this segmentation correctly? Thank you!

Haya Akkad
  • when extracting the last segment, make sure you are extracting only the available length – Raj Jul 15 '18 at 04:16
  • I'm confused by the StringBuilder stuff, seems redundant and possibly breaking since you're taking a substring of msg and then changing msg later. Do you just want an array of the segments or what? – obermillerk Jul 15 '18 at 04:17
  • @obermillerk Yes, I just want to segment the text and add a value at the beginning of every segment – Haya Akkad Jul 15 '18 at 04:21
  • @emaillenin Yes, I have an if condition in my original code // if(fulltext.lenght >0)... – Haya Akkad Jul 15 '18 at 04:22
  • You are running out of bounds; loop from 0 up to the full length instead – Abdulwahid Jul 15 '18 at 04:24
  • So your end goal is to have the original string with some text inserted every x characters? – obermillerk Jul 15 '18 at 04:29
  • @obermillerk My end goal is to send the text in small parts, where every part starts with a value (the segment number), so that I can finally re-merge the segments in the correct order after removing the segment numbers. – Haya Akkad Jul 15 '18 at 04:35
  • My issue is not only the OutOfBoundsException. I also don't get a correct segmentation, because some characters are missing in every partition @obermillerk – Haya Akkad Jul 15 '18 at 04:37
  • A similar question: [How to split array list into equal parts?](https://stackoverflow.com/questions/13678387/how-to-split-array-list-into-equal-parts) – LuCio Jul 15 '18 at 17:06

4 Answers


You're on the right track, but you've overcomplicated this.

Try something like this:

int segmentSize = 6;
// Ceiling division: avoids an unused null slot when the length is an exact multiple of segmentSize
String[] segments = new String[(msg.length() + segmentSize - 1) / segmentSize];

for (int i = 0; i < msg.length(); i += segmentSize) {
    // ensure we don't try to access out of bounds indexes
    int lastIndex = Math.min(msg.length(), i+segmentSize);
    int segmentNumber = i/segmentSize;
    segments[segmentNumber] = msg.substring(i, lastIndex);
}

This stores each segment in the segments array. The Math.min(msg.length(), i+segmentSize) call ensures that you don't try to read characters beyond the end of the string, which is what caused the StringIndexOutOfBoundsException you mentioned.

You can do something else with the segments instead of putting them in an array. If your end goal is a longer string that incorporates these segments, create a single StringBuilder outside the for loop (where the segments array is declared), append to it as needed inside the loop, and read the result after the loop (i.e. sb.toString()). That avoids creating a new StringBuilder on every iteration.
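
Based on the goal stated in the comments (send numbered parts and re-merge them later), one possible sketch follows. The class name NumberedSegments, the methods split and merge, and the "index:" prefix format are illustrative assumptions, not part of the original code. The merge method also assumes that every part arrives exactly once.

```java
import java.util.ArrayList;
import java.util.List;

class NumberedSegments {

    // Splits text into chunks of at most segmentSize characters and
    // prefixes each chunk with "<index>:" (a hypothetical format).
    static List<String> split(String text, int segmentSize) {
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < text.length(); i += segmentSize) {
            // Clamp the end index so the last chunk never runs past the string
            int end = Math.min(text.length(), i + segmentSize);
            parts.add((i / segmentSize) + ":" + text.substring(i, end));
        }
        return parts;
    }

    // Rebuilds the original text from parts that may arrive in any order.
    static String merge(List<String> parts) {
        String[] ordered = new String[parts.size()];
        for (String part : parts) {
            // The prefix contains only digits, so the first ':' is the separator
            int sep = part.indexOf(':');
            int index = Integer.parseInt(part.substring(0, sep));
            ordered[index] = part.substring(sep + 1);
        }
        StringBuilder sb = new StringBuilder();
        for (String segment : ordered) {
            sb.append(segment);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String fullText = "Syria officially known as the Syrian Arab Republic";
        List<String> parts = split(fullText, 6);
        System.out.println(parts);
        System.out.println(merge(parts).equals(fullText));
    }
}
```

Putting the index in front of a fixed separator keeps the payload untouched, so the text itself may contain ':' without confusing the merge step.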

obermillerk

Here's a succinct implementation using Java 8 streams:

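import java.util.Collection;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

String fullText = "Syria officially known as the Syrian Arab Republic, is a country in Western Asia...";
// The counter assigns each character to a group of 6; this relies on the stream being sequential
final AtomicInteger counter = new AtomicInteger(0);
Collection<String> strings = fullText.chars()
                                    .mapToObj(i -> String.valueOf((char)i) )
                                    .collect(Collectors.groupingBy(it -> counter.getAndIncrement() / 6
                                                                ,TreeMap::new // sorted keys keep the segments in order
                                                                ,Collectors.joining()))
                                    .values();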

Output:

[Syria , offici, ally k, nown a, s the , Syrian,  Arab , Republ, ic, is,  a cou, ntry i, n West, ern As, ia...]
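
As an alternative sketch that avoids the shared mutable counter, the segment indexes can be streamed directly and each segment cut with substring. The class name StreamSegments and the method split are illustrative names, not part of the answer above.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class StreamSegments {

    // Returns consecutive chunks of at most size characters, in order.
    static List<String> split(String text, int size) {
        // Ceiling division gives the number of chunks, including a shorter last one
        int count = (text.length() + size - 1) / size;
        return IntStream.range(0, count)
                .mapToObj(n -> text.substring(n * size, Math.min(text.length(), (n + 1) * size)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String fullText = "Syria officially known as the Syrian Arab Republic, is a country in Western Asia...";
        System.out.println(split(fullText, 6));
    }
}
```

Because each chunk depends only on its own index, this version does not rely on the order in which the stream processes elements.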
Pankaj Singhal

You can also use a regular expression to split the string after every 6th character:

String s ="anldhhdhdhhdhdhhdhdhdhdhdhd";
String[] str = s.split("(?<=\\G.{6})");
System.out.println(Arrays.toString(str));

Output:

[anldhh, dhdhhd, hdhhdh, dhdhdh, dhd]
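
In this pattern, \G matches at the end of the previous match (or at the start of the input), so the lookbehind (?<=\G.{6}) matches the empty position exactly six characters after the previous split point. A minimal sketch with the segment size as a parameter follows; the class name RegexSegments and the method split are illustrative. Note that . does not match line terminators by default.

```java
import java.util.Arrays;

class RegexSegments {

    // Splits text at every position that lies exactly size characters
    // after the previous split point (or after the start of the input).
    static String[] split(String text, int size) {
        return text.split("(?<=\\G.{" + size + "})");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("anldhhdhdhhdhdhhdhdhdhdhdhd", 6)));
    }
}
```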
Ryuzaki L

Why not go with a while loop that iterates in increments of 6 until fewer than 6 characters are left?

I'm not sure how you're using the segments, so for now I've just left in print statements similar to the expected sample output you gave:

public class StringSegmenter {

    private static final int SEG_LENGTH = 6;
    private static final String PREFIX = "Segment%s: %s\n";

    public static void main(String[] args) {
        String fullText = "Syria officially known as the Syrian Arab Republic, is a country in Western Asia...";

        int position = 0;
        int length = fullText.length();
        int segmentationCount = 0;

        // While at least SEG_LENGTH characters remain, prints the next full segment.
        // If fewer than SEG_LENGTH characters remain, prints the remainder and exits the loop.
        while (position < length) {
            segmentationCount++;

            if ((length - position) < SEG_LENGTH) {

                // Replace this with logging, or StringBuilder appending, etc...
                // substring(position) runs to the end of the string; using length - 1 would drop the last character
                System.out.printf(PREFIX, segmentationCount, fullText.substring(position));
                break;
            }
            // Replace this with logging, or StringBuilder appending, etc...
            System.out.printf(PREFIX, segmentationCount, fullText.substring(position, position + SEG_LENGTH));
            position += SEG_LENGTH;
        }
    }
}
John Stark