This question seeks support for a task composed of three separate procedures.
How to split a string on spaces to generate an array of words? (The OP has a suboptimal, yet working solution for this part.)
- Because the pattern only needs to match the "spaces" between words, it could be changed to / / (a single literal space). This eliminates the check for additional whitespace characters beyond just the space.
- Better and faster than a regex-based solution would be to split the string using string functions.
explode(' ',$descr) would be the most popular and intuitive function call.
str_word_count($descr,1), as Ravi Hirani pointed out, will also work, but is less intuitive.
A major benefit of str_word_count() is that it seamlessly omits punctuation: for instance, if the OP's sample string had a period at the end, this function would omit it from the array!
Furthermore, it is important to note what is considered a "word":
For the purpose of this function, 'word' is defined as a locale dependent string containing alphabetic characters, which also may contain, but not start with "'" and "-" characters.
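The three splitting options above can be compared side by side. This is a minimal sketch, assuming the OP's sample string with a trailing period added purely for illustration; the variable names $byRegex, $byExplode and $byWordCount are hypothetical.

```php
<?php
// Assumption: the OP's sample string, with a period appended for illustration.
$descr = "Hello this is a test string.";

// Regex split on a single literal space.
$byRegex = preg_split('/ /', $descr);

// Plain string split; same result as the regex above without the regex engine.
$byExplode = explode(' ', $descr);

// Format 1 returns an array of the words found; punctuation is dropped.
$byWordCount = str_word_count($descr, 1);

var_export($byExplode);   // the last element keeps the period: 'string.'
echo "\n";
var_export($byWordCount); // the last element is 'string'
```

Which result is preferable depends on whether punctuation should survive the split; str_word_count() also follows the "word" definition quoted above, so digits and other non-alphabetic characters are not treated as part of a word.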
How to generate an indexed array with keys starting from 1?
- Bind a generated "keys" array (from 1) to a "values" array:
$words=explode(' ',$descr); array_combine(range(1,count($words)),$words)
- Add a temporary value to the front of the indexed array (at key [0]), then remove that element with a function that preserves the array keys.
array_unshift($descr,''); unset($descr[0]);
array_unshift($descr,''); $descr=array_slice($descr,1,NULL,true);
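The re-keying techniques above should all produce the same array. The following is a small sketch, assuming the words have already been split into $words; the copies $unsetCopy and $sliceCopy are hypothetical names used only to keep the three results apart.

```php
<?php
$descr = "Hello this is a test string";
$words = explode(' ', $descr);

// Technique 1: pair a generated key list (1..n) with the values.
$combined = array_combine(range(1, count($words)), $words);

// Technique 2a: prepend a placeholder, then unset it.
// unset() leaves the remaining keys untouched.
$unsetCopy = $words;
array_unshift($unsetCopy, '');
unset($unsetCopy[0]);

// Technique 2b: prepend a placeholder, then slice from offset 1
// with preserve_keys set to true.
$sliceCopy = $words;
array_unshift($sliceCopy, '');
$sliceCopy = array_slice($sliceCopy, 1, null, true);

// All three arrays have keys 1..6 in the same order.
var_export($combined === $unsetCopy && $combined === $sliceCopy); // true
```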
How to convert a string to all lowercase? (It was hard to find a duplicate; this is an RTM question.)
- lcfirst($descr) will work in the OP's test case because only the first letter of the first word is capitalized.
- strtolower($descr) is a more reliable choice as it changes whole strings to lowercase.
- mb_strtolower($descr) is the one to use if character encoding is relevant.
- Note: ucwords() exists, but lcwords() does not.
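To make the difference between these functions visible, here is a brief sketch. It assumes a variation of the OP's string with extra capital letters, plus a hypothetical accented UTF-8 string for the multibyte case; neither input comes from the question.

```php
<?php
// Assumption: a variant of the sample string with capitals beyond the first letter.
$descr = "Hello this is a Test String";

// Only the very first character is lowercased.
echo lcfirst($descr), "\n";    // hello this is a Test String

// Every character in the string is lowercased.
echo strtolower($descr), "\n"; // hello this is a test string

// mb_strtolower() needs the mbstring extension, so guard the call.
if (function_exists('mb_strtolower')) {
    // Hypothetical multibyte input; the encoding is stated explicitly.
    echo mb_strtolower("ÉCOLE Ouverte", 'UTF-8'), "\n"; // école ouverte
}
```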
There are many paths to a correct result for this question. How do you determine which is the "best" one? Top priority should be Accuracy, followed by Efficiency/Directness, and then some consideration for Readability. Code Brevity is a matter of personal choice and can clash with Readability.
With these considerations in mind, I would recommend these two methods:
Method #1: (one-liner, 3-functions, no new variables)
$descr="Hello this is a test string";
var_export(array_slice(explode(' ',' '.strtolower($descr)),1,null,true));
Method #2: (two-liner, 3-functions, one new variable)
$descr="Hello this is a test string";
$array=explode(' ',' '.strtolower($descr));
unset($array[0]);
var_export($array);
Method #2 should perform faster than #1 because unset() is a "lighter" call than array_slice(): it removes one element in place, whereas array_slice() builds a new array.
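The speed claim can be checked locally with a rough timing loop. This is only a sketch: the iteration count is arbitrary, hrtime() assumes PHP 7.3 or later, and results will vary by PHP version and machine, so no particular outcome is implied.

```php
<?php
$descr = "Hello this is a test string";
$iterations = 100000; // arbitrary count, chosen only for illustration

// Method #1: array_slice() with preserve_keys set to true.
$start = hrtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $one = array_slice(explode(' ', ' ' . strtolower($descr)), 1, null, true);
}
$sliceNs = hrtime(true) - $start;

// Method #2: unset() the placeholder element.
$start = hrtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $two = explode(' ', ' ' . strtolower($descr));
    unset($two[0]);
}
$unsetNs = hrtime(true) - $start;

// hrtime(true) reports nanoseconds; convert to milliseconds for display.
printf("array_slice: %.1f ms\nunset: %.1f ms\n", $sliceNs / 1e6, $unsetNs / 1e6);
```

For a string this short the difference is likely to be tiny, so readability may reasonably decide between the two.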
Explanation for #1: Convert the full input string to lowercase and prepend a blank space to it. The blank space causes explode() to generate an extra empty element at the start of the output array (at key 0). array_slice(), called with an offset of 1 and preserve_keys set to true, then returns the array starting from key 1, omitting the unwanted empty element while keeping the original keys.
Explanation for #2: The same as #1, except that it purges the first element from the generated array using unset(). While this is faster, it must be written on its own line because unset() returns no value and so cannot be nested inside another call.
Output from either of my methods:
array (
1 => 'hello',
2 => 'this',
3 => 'is',
4 => 'a',
5 => 'test',
6 => 'string',
)