How can I get a character array from a string?

Question

How do you convert a string to a character array in JavaScript?

I'm thinking of turning a string like "Hello world!" into the array
['H','e','l','l','o',' ','w','o','r','l','d','!']

score 564 · Accepted Answer · edited Jun 17 '19 at 14:24

Note: This is not Unicode compliant. "I💖U".split('') results in the 4-element array ["I", "�", "�", "U"], which can lead to dangerous bugs. See the answers below for safe alternatives.

Just split it by an empty string.

var output = "Hello world!".split('');
console.log(output);

See the String.prototype.split() MDN docs.
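
To make the warning above concrete, here is a minimal sketch of a check for whether `split('')` would break a surrogate pair in a given string. `isSplitSafe` is a hypothetical helper name (not a built-in), the emoji is an arbitrary code point above U+FFFF, and an ES2015+ environment is assumed.

```javascript
// Hypothetical helper: true when split('') and code-point iteration
// yield the same number of elements, i.e. no surrogate pair gets split.
function isSplitSafe(str) {
  return str.split('').length === Array.from(str).length;
}

console.log(isSplitSafe('Hello world!')); // true
console.log(isSplitSafe('I\u{1F496}U'));  // false (4 code units, 3 code points)
```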

edited Jun 17 '19 at 14:24 by Ray Foss · answered Dec 28 '10 at 16:41 by meder omuraliev

This doesn't take into account surrogate pairs. `"".split('')` results in `["�", "�"]`. – hippietrail Feb 13 '15 at 18:15
See @hakatashi's answer elsewhere in this thread. Hopefully everyone sees this... ***DO NOT USE THIS METHOD, IT'S NOT UNICODE SAFE*** – i336_ Feb 05 '16 at 04:22
Bit late to the party. But why would someone ever want to make an array of a string? A string is already an array, or am I wrong? `"randomstring".length;` `//12` `"randomstring"[2];` `//"n"` – Luigi van der Pal Dec 08 '16 at 11:19
@LuigivanderPal A string is not an array, but it is very similar. However, it is not similar to an array of characters. A string is similar to an array of 16-bit numbers, some of which represent characters and some of which represent half of a surrogate pair. For example, `str.length` does not tell you the number of characters in the string, since some characters take more space than others; `str.length` tells you the number of 16-bit numbers. – Theodore Norvell Apr 05 '19 at 13:00

score 417 · Answer 2 · edited May 27 '23 at 13:42

As hippietrail suggests, meder's answer can break surrogate pairs and misinterpret “characters.” For example:

// DO NOT USE THIS!
const a = '𝟘𝟙𝟚𝟛'.split('');
console.log(a);
// Output: ["�","�","�","�","�","�","�","�"]

I suggest using one of the following ES2015 features to correctly handle these character sequences.

Spread syntax (already answered by insertusernamehere)

const a = [...'𝟘𝟙𝟚𝟛'];
console.log(a);

Array.from

const a = Array.from('𝟘𝟙𝟚𝟛');
console.log(a);

RegExp `u` flag

const a = '𝟘𝟙𝟚𝟛'.split(/(?=[\s\S])/u);
console.log(a);

Use /(?=[\s\S])/u instead of /(?=.)/u because . does not match newlines. If you are still in the ES5.1 era (or if your browser doesn't handle this regex correctly, like Edge), you can use the following alternative (transpiled by Babel). Note that Babel also tries to handle unmatched surrogates correctly. However, this doesn't seem to work for unmatched low surrogates.

const a = '𝟘𝟙𝟚𝟛'.split(/(?=(?:[\0-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF]))/);
console.log(a);
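
As a possible alternative to the transpiled regex for ES5-era environments, the surrogate ranges can also be checked by hand. The following is only a sketch, not something from the answer above: `splitByCodePoint` is a hypothetical name, and unmatched surrogates are simply emitted as single elements.

```javascript
// ES5-style sketch: walk the string by UTF-16 code unit and keep
// well-formed surrogate pairs together as one array element.
function splitByCodePoint(str) {
  var result = [];
  for (var i = 0; i < str.length; i++) {
    var unit = str.charCodeAt(i);
    // A high surrogate (0xD800-0xDBFF) followed by a low surrogate
    // (0xDC00-0xDFFF) encodes a single code point.
    if (unit >= 0xD800 && unit <= 0xDBFF && i + 1 < str.length) {
      var next = str.charCodeAt(i + 1);
      if (next >= 0xDC00 && next <= 0xDFFF) {
        result.push(str.charAt(i) + str.charAt(i + 1));
        i++; // skip the low surrogate that was just consumed
        continue;
      }
    }
    // BMP character or unmatched surrogate: emit the single code unit.
    result.push(str.charAt(i));
  }
  return result;
}

// '\uD835\uDFD8' is the UTF-16 encoding of U+1D7D8.
console.log(splitByCodePoint('a\uD835\uDFD8b').length); // 3
```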

A `for ... of ...` loop

const s = '𝟘𝟙𝟚𝟛';
const a = [];
for (const s2 of s) {
   a.push(s2);
}
console.log(a);

Note that this solution splits some emoji such as `️‍`, and splits combining diacritics mark from characters. If you want to split into grapheme clusters instead of characters, see https://stackoverflow.com/a/45238376. — user202729, Aug 30 '18 at 06:21
Note that while not breaking apart surrogate pairs is great, it isn't a general-purpose solution for keeping "characters" (or more accurately, *graphemes*) together. A grapheme can be made up of multiple code points; for instance, the name of the language Devanagari is "देवनागरी", which is read by a native speaker as five graphemes, but takes eight code points to produce... — T.J. Crowder, Sep 17 '18 at 12:08
This answer is being referred to by the official Mozilla documentation at https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/split — Zefiro, Jun 05 '21 at 22:48

insertusernamehere · Answer 3 · 2023-02-22T09:50:26.153

The spread Syntax

You can use the spread syntax, an array initializer introduced in the ECMAScript 2015 (ES6) standard:

var arr = [...str];

Examples

function a() {
    return arguments;
}

var str = 'Hello World';

var arr1 = [...str],
    arr2 = [...'Hello World'],
    arr3 = new Array(...str),
    arr4 = a(...str);

console.log(arr1, arr2, arr3, arr4);

The first three result in:

["H", "e", "l", "l", "o", " ", "W", "o", "r", "l", "d"]

The last one results in

{0: "H", 1: "e", 2: "l", 3: "l", 4: "o", 5: " ", 6: "W", 7: "o", 8: "r", 9: "l", 10: "d"}

Browser Support

Check the ECMAScript ES6 compatibility table.

Further reading

Spread is also referred to as "splat" (e.g. in PHP or Ruby) or as "scatter" (e.g. in Python).

If you use the spread operator in combination with a compiler to ES5, then this won't work in IE. Take that into consideration. It took me hours to figure out what the problem was. – Stef van den Berg, Jun 21 '17 at 12:06

score 21 · Answer 4 · edited May 11 '23 at 16:09

You can use Array.from.

var m = "Hello world!";
console.log(Array.from(m))

This method has been introduced in ES6.

Reference

Array.from

The Array.from() static method creates a new, shallow-copied Array instance from an iterable or array-like object.
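
Array.from also accepts an optional mapping function as its second argument, which can be handy when you want something derived from each character rather than the character itself. A small sketch, where `codePointLabels` is a hypothetical helper name and `padStart` assumes ES2017 or later:

```javascript
// Hypothetical helper: map each code point of a string to a "U+XXXX" label.
function codePointLabels(str) {
  return Array.from(str, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  );
}

console.log(codePointLabels('Hi\u{1F600}'));
// ['U+0048', 'U+0069', 'U+1F600']
```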

edited May 11 '23 at 16:09 by M. Justin · answered Oct 13 '16 at 12:48 by Rajesh

score 21 · Answer 5 · edited Mar 18 '23 at 09:14

There are (at least) three different things you might conceive of as a "character", and consequently, three different categories of approach you might want to use.

Splitting into UTF-16 code units

JavaScript strings were originally invented as sequences of UTF-16 code units, back at a point in history when there was a one-to-one relationship between UTF-16 code units and Unicode code points. The .length property of a string measures its length in UTF-16 code units, and when you do someString[i] you get the ith UTF-16 code unit of someString.

Consequently, you can get an array of UTF-16 code units from a string by using a C-style for-loop with an index variable...

const yourString = 'Hello, World!';
const charArray = [];
for (let i=0; i<yourString.length; i++) {
    charArray.push(yourString[i]);
}
console.log(charArray);

There are also various short ways to achieve the same thing, like using .split() with the empty string as a separator:

const charArray = 'Hello, World!'.split('');
console.log(charArray);

However, if your string contains code points that are made up of multiple UTF-16 code units, this will split them into individual code units, which may not be what you want. For instance, the string '𝟘𝟙𝟚𝟛' is made up of four Unicode code points (code points 0x1D7D8 through 0x1D7DB) which, in UTF-16, are each made up of two UTF-16 code units. If we split that string using the methods above, we'll get an array of eight code units:

const yourString = '𝟘𝟙𝟚𝟛';
console.log('First code unit:', yourString[0]);
const charArray = yourString.split('');
console.log('charArray:', charArray);
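
Because lone surrogates usually print as replacement characters, it can help to look at the numeric value of each code unit instead. A minimal sketch, with `codeUnitsHex` as a hypothetical helper name:

```javascript
// Hypothetical helper: list the UTF-16 code units of a string in hex.
function codeUnitsHex(str) {
  const units = [];
  for (let i = 0; i < str.length; i++) {
    units.push(str.charCodeAt(i).toString(16));
  }
  return units;
}

console.log(codeUnitsHex('A'));         // ['41']
console.log(codeUnitsHex('\u{1D7D8}')); // ['d835', 'dfd8'] (one code point, two units)
```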

Splitting into Unicode Code Points

So, perhaps we want to instead split our string into Unicode Code Points! That's been possible since ECMAScript 2015 added the concept of an iterable to the language. Strings are now iterables, and when you iterate over them (e.g. with a for...of loop), you get Unicode code points, not UTF-16 code units:

const yourString = '𝟘𝟙𝟚𝟛';
const charArray = [];
for (const char of yourString) {
  charArray.push(char);
}
console.log(charArray);

We can shorten this using Array.from, which iterates over the iterable it's passed implicitly:

const yourString = '𝟘𝟙𝟚𝟛';
const charArray = Array.from(yourString);
console.log(charArray);

However, Unicode code points are not the largest thing that could be considered a "character" either. Some examples of things that could reasonably be considered a single "character" but are made up of multiple code points include:

Accented characters, if the accent is applied with a combining code point
Flags
Some emojis

We can see below that if we try to convert a string with such characters into an array via the iteration mechanism above, the characters end up broken up in the resulting array. (In case any of the characters don't render on your system, yourString below consists of a capital A with an acute accent, followed by the flag of the United Kingdom, followed by a black woman.)

const yourString = 'Á🇬🇧👩🏿';
const charArray = Array.from(yourString);
console.log(charArray);

If we want to keep each of these as a single item in our final array, then we need an array of graphemes, not code points.

Splitting into graphemes

JavaScript has no built-in support for this - at least not yet. So we need a library that understands and implements the Unicode rules for what combination of code points constitute a grapheme. Fortunately, one exists: orling's grapheme-splitter. You'll want to install it with npm or, if you're not using npm, download the index.js file and serve it with a <script> tag. For this demo, I'll load it from jsDelivr.

grapheme-splitter gives us a GraphemeSplitter class with three methods: splitGraphemes, iterateGraphemes, and countGraphemes. Naturally, we want splitGraphemes:

const splitter = new GraphemeSplitter();
const yourString = 'Á🇬🇧👩🏿';
const charArray = splitter.splitGraphemes(yourString);
console.log(charArray);

<script src="https://cdn.jsdelivr.net/npm/grapheme-splitter@1.0.4/index.js"></script>

And there we are - an array of three graphemes, which is probably what you wanted.

This was so helpful. Really saved me on a project I am working on. Thanks!!! — raddevus, Jan 24 '21 at 22:20

score 11 · Answer 6 · edited Mar 18 '23 at 09:28

You can use the Object.assign function to get the desired output:

var output = Object.assign([], "Hello, world!");
console.log(output);
    // [ 'H', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd', '!' ]

It is not necessarily right or wrong, just another option.

Object.assign is described well at the MDN site.

edited Mar 18 '23 at 09:28 by Peter Mortensen · answered Apr 12 '16 at 05:53 by David Thomas

That's a long way around to get to `Array.from("Hello, world")`. – T.J. Crowder Sep 17 '18 at 11:53
@T.J.Crowder That's a long way around to get to `[..."Hello, world"]` – chharvey Jun 28 '19 at 00:23
`Object.assign([], "🦊")` is `[ "\ud83e", "\udd8a" ]`. This is objectively wrong. – Sebastian Simon Apr 06 '22 at 18:51
@SebastianSimon The original question was not that use case. – David Thomas Apr 19 '22 at 04:33
@DavidThomas There is no “use case” here. This is exactly the same string processing and encoding. This is about implementing a solution in actual code. Now there’s one solution that’ll work in 90 % of cases and one solution that’ll work in 100 % of cases, but at no additional cost — so it’s obvious which one to pick. `Array.from` etc. are the correct solutions, `Object.assign` etc. are the incorrect ones. The sentiment of providing solutions that just barely work in limited circumstances is harmful to the software industry. – Sebastian Simon Apr 19 '22 at 07:59

score 10 · Answer 7 · edited May 25 '19 at 11:29

It already is:

var mystring = 'foobar';
console.log(mystring[0]); // Outputs 'f'
console.log(mystring[3]); // Outputs 'b'

Or, for a version that is friendlier to older browsers, use:

var mystring = 'foobar';
console.log(mystring.charAt(3)); // Outputs 'b'
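
Index access makes a string array-like, but it does not make it an array. The sketch below illustrates the difference and shows how an array method can be borrowed without building a character array first; `mapChars` is a hypothetical helper name, and note that the borrowed method visits UTF-16 code units, not code points.

```javascript
// Hypothetical helper: apply a function to each code unit of a string
// by borrowing Array.prototype.map (strings are array-like).
function mapChars(str, fn) {
  return Array.prototype.map.call(str, fn);
}

const word = 'foobar';
console.log(Array.isArray(word)); // false
console.log(typeof word.map);     // 'undefined' (strings have no map method)
console.log(mapChars(word, (ch) => ch.toUpperCase()));
// ['F', 'O', 'O', 'B', 'A', 'R']
```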

edited May 25 '19 at 11:29 by hashed_name · answered Dec 28 '10 at 16:43 by dansimau

He is really looking to split with an empty string, not reference the original variable. – Schenz Dec 28 '10 at 16:47
-1: it isn't. Try it: `alert("Hello world!" == ['H','e','l','l','o',' ','w','o','r','l','d'])` – R. Martinho Fernandes Dec 28 '10 at 16:48
Sorry. I guess what I meant to say is: "you can access individual characters by index reference like this without creating a character array". – dansimau Dec 28 '10 at 16:50
Not reliably cross-browser you can't. It's an ECMAScript Fifth Edition feature. – bobince Dec 28 '10 at 17:25
The cross-browser version is `mystring.charAt(index)`. – psmay Dec 28 '10 at 18:04
Older versions of IE will trip up on this, at least IE7. Not sure about IE8. – Ally Sep 16 '13 at 12:35
+1 for `charAt()`--though I'd prefer to use the array-ish variant. Darn IE. – Zenexer Jul 04 '14 at 02:57
No, it isn't. The ability to select elements using brackets notation isn't the only feature of arrays in JavaScript. – Michał Perłakowski Apr 10 '16 at 03:45

score 7 · Answer 8 · edited Mar 18 '23 at 09:40

Four ways you can convert a string to a character array in JavaScript:

const string = 'word';

// Option 1
string.split('');  // ['w', 'o', 'r', 'd']

// Option 2
[...string];  // ['w', 'o', 'r', 'd']

// Option 3
Array.from(string);  // ['w', 'o', 'r', 'd']

// Option 4
Object.assign([], string);  // ['w', 'o', 'r', 'd']
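
The four options agree for 'word', but they are not interchangeable once the string contains a code point above U+FFFF. A rough sketch for comparing them, where `compareOptions` is a hypothetical helper and U+1F98A is just an example code point:

```javascript
// Hypothetical helper: report how many elements each option produces.
function compareOptions(str) {
  return {
    split: str.split('').length,
    spread: [...str].length,
    arrayFrom: Array.from(str).length,
    objectAssign: Object.assign([], str).length,
  };
}

console.log(compareOptions('word'));
// { split: 4, spread: 4, arrayFrom: 4, objectAssign: 4 }
console.log(compareOptions('\u{1F98A}'));
// { split: 2, spread: 1, arrayFrom: 1, objectAssign: 2 }
```

Options 1 and 4 work on UTF-16 code units, while options 2 and 3 use the string iterator and therefore keep surrogate pairs together.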

edited Mar 18 '23 at 09:40 by Peter Mortensen · answered Sep 17 '21 at 20:15 by Aamir Kalimi

This answer would be better if the test string contained Unicode code points greater than U+FFFF. `"🦊".split("")` and `Object.assign([], "🦊")` are `[ "\ud83e", "\udd8a" ]` (due to strings being UTF-16 encoded and therefore indexed by their surrogate byte pairs if necessary); `[ ..."🦊" ]` and `Array.from("🦊")` are equivalent and result in `[ "🦊" ]` (due to these two methods accessing the `Symbol.iterator` property which is Unicode aware). – Sebastian Simon Mar 15 '22 at 02:30

score 5 · Answer 9 · edited Mar 18 '23 at 08:47

The ES6 way to split a string into an array character-wise is by using the spread operator. It is simple and nice.

array = [...myString];

Example:

let myString = "Hello world!"
array = [...myString];
console.log(array);

// another example:

console.log([..."another splitted text"]);

edited Mar 18 '23 at 08:47 by Nick Parsons · answered Jul 04 '20 at 15:07 by Mohsen Alyafei

score 3 · Answer 10 · edited Mar 28 '19 at 21:03

You can iterate over the length of the string and push the character at each position:

const str = 'Hello World';

const stringToArray = (text) => {
  var chars = [];
  for (var i = 0; i < text.length; i++) {
    chars.push(text[i]);
  }
  return chars
}

console.log(stringToArray(str))

edited Mar 28 '19 at 21:03 by KyleMit · answered Jun 28 '16 at 05:51 by Mohit Rathore

While this approach is a little more imperative than declarative, it's the [most performant](https://jsperf.com/string-to-character-array) of any in this thread and deserves more love. *One limitation* to [retrieving a character on a string by position](https://stackoverflow.com/q/5943726/1366033) is when dealing with characters past the [Basic Multilingual Plane](https://en.wikipedia.org/wiki/Plane_(Unicode)#Basic_Multilingual_Plane) in Unicode such as emojis. `"".charAt(0)` will return an unusable character – KyleMit Mar 28 '19 at 20:55
@KyleMit this seems only true for a short input. [Using a longer input makes `.split("")` the fastest option again](https://jsperf.com/string-to-character-array/2) – Lux Mar 31 '19 at 12:03
Also `.split("")` seems to be heavily optimized in firefox. While the loop has similar performance in chrome and firefox split is significantly faster in firefox for small and large inputs. – Lux Mar 31 '19 at 12:13
This method does not work for multi-byte characters! Try it with `str = '' ` and it will break. – Arnis Juraga Oct 01 '20 at 07:17

score 3 · Answer 11 · edited Mar 18 '23 at 09:29

A simple answer:

let str = 'this is string, length is >26';

console.log([...str]);

edited Mar 18 '23 at 09:29 by Peter Mortensen · answered May 25 '19 at 09:55 by Ajit Kumar

-1; this adds nothing that wasn't already included in [hakatashi's answer](https://stackoverflow.com/a/34717402/1709587). – Mark Amery Jan 17 '20 at 21:17

score 3 · Answer 12 · edited May 27 '23 at 13:49

As Mark Amery points out in his great answer - splitting on just code points may not be enough, especially for particular emoji characters or composed characters (eg: ñ which is made up of two code points n and ̃ which make up the one grapheme). JavaScript has an in-built grapheme segmenter available via the internationalization API (Intl) called Intl.Segmenter. This can be used to segment a string by different granularities, one of them being the graphemes (ie: user-perceived characters of a string):

const graphemeSplit = str => {
  const segmenter = new Intl.Segmenter("en", {granularity: 'grapheme'});
  const segitr = segmenter.segment(str);
  return Array.from(segitr, ({segment}) => segment);
}
// See browser console for output
console.log("Composite pair test", graphemeSplit("foo  bar mañana mañana"));
console.log("Variation selector test", graphemeSplit("❤️"));
console.log("ZWJ Test:", graphemeSplit("‍❤️‍‍"));
console.log("Multiple Code Points:", graphemeSplit("देवनागरी"));
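
Since `Intl.Segmenter` may be missing in some environments, one option is to feature-detect it and degrade to code-point splitting. This is only a sketch under that assumption: `toGraphemes` is a hypothetical helper name, and the fallback will still split graphemes that span several code points.

```javascript
// Hypothetical helper: prefer Intl.Segmenter when it exists,
// otherwise fall back to splitting by code point.
function toGraphemes(str, locale = 'en') {
  if (typeof Intl !== 'undefined' && typeof Intl.Segmenter === 'function') {
    const segmenter = new Intl.Segmenter(locale, { granularity: 'grapheme' });
    return Array.from(segmenter.segment(str), ({ segment }) => segment);
  }
  // Fallback: keeps surrogate pairs together, but not combining
  // sequences or ZWJ emoji sequences.
  return Array.from(str);
}

// 'n' followed by U+0303 (combining tilde) is one grapheme when
// Intl.Segmenter is available, two elements in the fallback.
console.log(toGraphemes('man\u0303ana'));
```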

edited May 27 '23 at 13:49 by Mark Amery · answered Mar 18 '23 at 09:11 by Nick Parsons

Also consider using [`normalize`](//developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/String/normalize). – Sebastian Simon Mar 19 '23 at 20:36
@SebastianSimon thanks - yep normalization is a nice option for dealing with characters that can be precomposed (such as `ñ` above), unfortunately, it won't work on all types of characters made up of multiple code points (such as emojis), but it's a good option if the string being used consists of composable code points – Nick Parsons Mar 20 '23 at 09:45
Very interesting to learn about this! Alas, [_Can I use_](https://caniuse.com/mdn-javascript_builtins_intl_segmenter) suggests that there's no support at all for `Intl.Segmenter` in Firefox yet, so I wouldn't want to use this on a public-facing webpage unless I could find a good polyfill. I am left uncertain about a couple of things after reading this answer: firstly, _is_ there a good polyfill out there in case one wants to use this on the public web, and secondly, [why does `Intl.Segmenter` take a locale parameter and what effect does it have?](https://stackoverflow.com/q/75747868/1709587). – Mark Amery May 27 '23 at 14:16
@MarkAmery, `@formatjs Intl.Segmenter` polyfill: https://github.com/formatjs/formatjs/tree/main/packages/intl-segmenter see also https://stackoverflow.com/questions/1026069/how-do-i-make-the-first-letter-of-a-string-uppercase-in-javascript/76777557#76777557 – mrienstra Sep 03 '23 at 02:10

score -1 · Answer 13 · edited Mar 18 '23 at 09:29

Use this:

function stringToArray(string) {
  let length = string.length;
  let array = new Array(length);
  while (length--) {
    array[length] = string[length];
  }
  return array;
}

edited Mar 18 '23 at 09:29 by Peter Mortensen · answered Mar 31 '19 at 23:31 by msand

@KyleMit this seems faster than for i loop + push https://jsperf.com/string-to-character-array/3 – msand Mar 31 '19 at 23:33

score -1 · Answer 14 · answered Jan 07 '20 at 01:52

Array.prototype.slice will do the work as well.

const result = Array.prototype.slice.call("Hello world!");
console.log(result);

answered Jan 07 '20 at 01:52 by f3tknco

score -2 · Answer 15 · answered Feb 18 '20 at 18:19

One possibility is the following:

console.log([1, 2, 3].map(e => Math.random().toString(36).slice(2)).join('').split('').map(e => Math.random() > 0.5 ? e.toUpperCase() : e).join(''));

answered Feb 18 '20 at 18:19 by user2301515

How is this related to the question? – Sebastian Simon Mar 15 '22 at 02:25
