Javascript, split a string in 4 pieces, and leave the rest as one big piece

Question

I'm building a Javascript chat bot for something, and I ran into an issue:
I use string.split() to tokenize my input like this:
tokens = message.split(" ");

Now my problem is that I need 4 tokens to make the command, and 1 token to hold the message. When I send this: !finbot msg testuser 12345 Hello sir, this is a test message

these are the tokens I get: ["!finbot", "msg", "testuser", "12345", "Hello", "sir,", "this", "is", "a", "test", "message"]

However, how can I make it come out like this: ["!finbot", "msg", "testuser", "12345", "Hello sir, this is a test message"]

The reason I want it like this is because the first token (tokens[0]) is the call, the second (tokens[1]) is the command, the third (tokens[2]) is the user, the fourth (tokens[3]) is the password (as it's a password protected message thing... just for fun) and the fifth (tokens[4]) is the actual message.
Right now, it would just send Hello because I only use the 5th token.
The reason why I can't just go like message = tokens[4] + tokens[5]; etc. is because messages are not always exactly 3 words, or exactly 4 words, etc.

I hope I gave enough information for you to help me. If you guys know the answer (or know a better way to do this) please tell me so.

Thanks!

score 3 · Answer 1 · edited May 23 '17 at 12:33

Use the limit parameter of String.split:

tokens = message.split(" ", 4);

From there, you just need to get the message from the string. Using the nthIndex() function from another answer (it is not built in, so you have to define it yourself), you can get the index of the 4th occurrence of the space character and take whatever comes after it. The + 1 skips the space itself.

var message = message.substring(nthIndex(message, ' ', 4) + 1);

Or if you need it in your tokens array:

tokens[4] = message.substring(nthIndex(message, ' ', 4) + 1);
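
Since nthIndex() is not part of JavaScript and the linked definition is not reproduced here, the following is only a minimal sketch of what such a helper might look like. It assumes nthIndex(str, pat, n) returns the index of the n-th occurrence of pat, or -1 if there are fewer than n occurrences; the body is an illustration, not the linked answer's code. Note the + 1, which skips the separator itself.

```javascript
// Hypothetical helper (an assumption, not the linked answer's code):
// returns the index of the n-th occurrence of `pat` in `str`, or -1.
function nthIndex(str, pat, n) {
    var i = -1;
    while (n-- > 0) {
        i = str.indexOf(pat, i + 1);
        if (i === -1) {
            break; // fewer than n occurrences
        }
    }
    return i;
}

var message = '!finbot msg testuser 12345 Hello sir, this is a test message';
var tokens = message.split(' ', 4);   // first four tokens only
var cut = nthIndex(message, ' ', 4);  // index of the 4th space
// + 1 skips the space itself; fall back to '' when there is no 4th space.
tokens[4] = cut === -1 ? '' : message.substring(cut + 1);

console.log(tokens);
```

With the sample message from the question, tokens[4] ends up holding the whole remainder of the line as one string.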

edited May 23 '17 at 12:33

Community

answered Aug 27 '16 at 19:20

nickb

Thanks for your reply, unfortunately the tokens I now get are: `["!finbot", "msg", "testuser", "12345"]` – Finlay Roelofs Aug 27 '16 at 19:23
I have now used `tokens[4] = message.substring(nthIndex(message, ' ', 4));` however, I get the error: nthIndex is not defined – Finlay Roelofs Aug 27 '16 at 19:35
@FinlayRoelofs - You need to use the function that's defined in the answer I linked to... – nickb Aug 28 '16 at 01:32

score 2 · Answer 2 · answered Aug 27 '16 at 19:31

I would probably start by taking the string like you did, and tokenizing it:

const myInput = string.split(" ");

If you're using JS ES6, you should be able to do something like:

const [call, command, userName, password, ...messageTokens] = myInput;
const message = messageTokens.join(" ");

However, if you don't have access to destructuring with a rest element (the ...messageTokens part), you can do the same like this (it's just much more verbose):

const call = myInput.shift();
const command = myInput.shift();
const userName = myInput.shift();
const password = myInput.shift();
const message = myInput.join(" ");

If you need them as an array again, now you can just join those parts:

const output = [call, command, userName, password, message];
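
As a rough end-to-end sketch of the shift() variant, the steps above could be wrapped in a function that also checks there are enough tokens. The name parseCommand and the null return for short input are assumptions made for this illustration, not part of the answer.

```javascript
// Hypothetical wrapper around the shift() approach; parseCommand is an
// illustrative name, and returning null for short input is an assumption.
function parseCommand(input) {
    var parts = input.split(' ');
    if (parts.length < 5) {
        return null; // fewer than 4 tokens plus a message
    }
    var call = parts.shift();
    var command = parts.shift();
    var userName = parts.shift();
    var password = parts.shift();
    var message = parts.join(' '); // whatever is left is the message
    return [call, command, userName, password, message];
}

var parsed = parseCommand('!finbot msg testuser 12345 Hello sir, this is a test message');
console.log(parsed);
```

Because the remainder is rebuilt with join(" ") after split(" "), runs of several spaces inside the message survive the round trip.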

answered Aug 27 '16 at 19:31

Alex LaFroscia

Does Firefox 45.3.0 support ES6? – Finlay Roelofs Aug 27 '16 at 19:39
That is a good question that you can Google for yourself. However, most people using ES6 in production are transpiling with a tool like Babel. – Alex LaFroscia Aug 27 '16 at 20:02
Thanks, I will look into it. never knew about ES6... always thought Javascript is just... well... Javascript... – Finlay Roelofs Aug 27 '16 at 20:08
Not lately! The language is evolving a lot, especially the last few years. Gaining new functionality all the time! More info here: http://babeljs.io/docs/learn-es2015/ – Alex LaFroscia Aug 28 '16 at 02:22
Very interesting... I will look into it! (and try to understand it) – Finlay Roelofs Aug 28 '16 at 11:27

score 2 · Answer 3 · answered Aug 27 '16 at 19:34

If you can use ES6 you can do:

let [c1, c2, c3, c4, ...rest] = input.split(" ");
let msg = rest.join(" ");

answered Aug 27 '16 at 19:34

Kevin

Does Firefox 45.3.0 support ES6? – Finlay Roelofs Aug 27 '16 at 19:39
Specifically you need support for destructuring. If you want that you should transpile with babel. – Kevin Aug 27 '16 at 20:18

Ilja Everilä · Accepted Answer · 2016-08-27T20:36:16.070

You could resort to a regexp, given that you defined your format as "4 tokens of non-space characters separated by spaces, followed by the message":

function tokenize(msg) {
    return (/^(\S+) (\S+) (\S+) (\S+) (.*)$/.exec(msg) || []).slice(1, 6);
}

This has the perhaps unwanted behaviour of returning an empty array if your msg does not actually match the spec. If that's not acceptable, remove the ... || [] fallback and handle the null that exec() returns for a failed match. The number of tokens is also fixed to 4 plus the required message. For a more generic approach you could use:

function tokenizer(msg, nTokens) {
    var token = /(\S+)\s*/g, tokens = [], match;

    while (nTokens && (match = token.exec(msg))) {
        tokens.push(match[1]);
        nTokens -= 1; // or nTokens--, whichever is your style
    }

    if (nTokens) {
        // exec() returned null, could not match enough tokens
        throw new Error('EOL when reading tokens');
    }

    tokens.push(msg.slice(token.lastIndex));
    return tokens;
}

This uses the global flag of regexp objects in JavaScript to match against the same string repeatedly: each exec() call resumes where the previous match ended. It then uses the lastIndex property to slice off everything after the last matched token as the rest.
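
To make the lastIndex behaviour concrete, here is a small standalone sketch using the same pattern on a made-up string. The variable names are chosen for this illustration only.

```javascript
var token = /(\S+)\s*/g;           // same pattern as in tokenizer()
var text = 'one two three four';   // made-up sample input

var first = token.exec(text);      // matches 'one ' (token plus trailing space)
var afterFirst = token.lastIndex;  // 4: the next search starts at 'two'
var second = token.exec(text);     // resumes at index 4, matches 'two '
var rest = text.slice(token.lastIndex); // everything not yet consumed

console.log(first[1], afterFirst, second[1], rest);
// prints: one 4 two three four
```

Because the pattern also consumes the whitespace after each token, the slice starts directly at the first character of the remainder.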

Given

var msg = '!finbot msg testuser 12345 Hello sir, this is a test message';

then

> tokenizer(msg, 4)
[ '!finbot',
  'msg',
  'testuser',
  '12345',
  'Hello sir, this is a test message' ]
> tokenizer(msg, 3)
[ '!finbot',
  'msg',
  'testuser',
  '12345 Hello sir, this is a test message' ]
> tokenizer(msg, 2)
[ '!finbot',
  'msg',
  'testuser 12345 Hello sir, this is a test message' ]

Note that a final "message" element is always appended to the returned array, and it will be an empty string if the given message string contains only tokens:

> tokenizer('asdf', 1)
[ 'asdf', '' ]  // An empty "message" at the end
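
If different commands need a different number of leading tokens, one possible approach is to look up the count per command before tokenizing. The sketch below repeats tokenizer() so that it runs on its own; the TOKEN_COUNTS table, the ping command and the parse() function are all hypothetical names invented for this illustration.

```javascript
// tokenizer() repeated from above so the sketch is self-contained.
function tokenizer(msg, nTokens) {
    var token = /(\S+)\s*/g, tokens = [], match;
    while (nTokens && (match = token.exec(msg))) {
        tokens.push(match[1]);
        nTokens -= 1;
    }
    if (nTokens) {
        throw new Error('EOL when reading tokens');
    }
    tokens.push(msg.slice(token.lastIndex));
    return tokens;
}

// Hypothetical table: how many leading tokens each command takes before its
// free-text remainder. Command names and counts are made up for illustration.
var TOKEN_COUNTS = { msg: 4, ping: 2 };

function parse(msg) {
    // Peek at the first two tokens (call and command) to find the command name.
    var command = tokenizer(msg, 2)[1];
    if (!Object.prototype.hasOwnProperty.call(TOKEN_COUNTS, command)) {
        throw new Error('Unknown command: ' + command);
    }
    return tokenizer(msg, TOKEN_COUNTS[command]);
}

console.log(parse('!finbot msg testuser 12345 Hello sir, this is a test message'));
console.log(parse('!finbot ping are you there?'));
```

Here the message is tokenized twice, which keeps the sketch short; a real bot might prefer to reuse the first pass.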

Got one more question: what if I want to use a lower number of tokens? Let's say 3 tokens total, so 1 command would be 5 tokens, and another command would be 3. – Finlay Roelofs, Aug 27 '16 at 19:58
Hi there, sorry for bothering you again, but I've been rewriting my bot to Node.js and updated it a bit; however, it killed the tokenizer... Right now, the array I get after running it through the tokenizer is empty. The only thing that has changed is the `!finbot` parameter (it gets stripped away earlier in the code), so the array is now just `["msg", "testuser", "12345", "Hello", "sir,", "this", "is", "a", "test", "message"]`. I've tried everything that came to mind, but I couldn't fix it by myself. – Finlay Roelofs, Sep 07 '16 at 15:48
It sounds like you're trying to pass an array as the argument to the function. If so, then it is somewhat obvious that it will not work: it was meant for splitting strings. On the other hand, it should not return an empty array, but fail. – Ilja Everilä, Sep 07 '16 at 18:56
Ah yes, I see :) I now passed the whole string to it and it works! Thanks! Now I can facedesk for other reasons :D – Finlay Roelofs, Sep 07 '16 at 18:58
