13

There are some discussions here, and utility functions, for splitting strings, but I need an ad-hoc one-liner for a very simple task.

I have the following string:

local s = "one;two;;four"

And I want to split it on ";". In the end, I want to get { "one", "two", "", "four" } in return.

So I tried to do:

local s = "one;two;;four"

local words = {}
for w in s:gmatch("([^;]*)") do table.insert(words, w) end

But the result (the words table) is { "one", "", "two", "", "", "four", "" }. That's certainly not what I want.

Now, as I remarked, there are some discussions here on splitting strings, but they have "lengthy" functions in them and I need something succinct. I need this code for a program where I show the merit of Lua, and if I add a lengthy function to do something so trivial it would go against me.

Yu Hao
Niccolo M.
  • `[^;]*` is perfectly happy matching zero non-semicolon characters. So Lua matches an empty string each time it gets to a delimiter. You can use "[^;]+" instead for a slightly better result, but there are reasons the http://lua-users.org/wiki/SplitJoin page of the lua-users wiki runs as long as it does when talking about splitting strings. – Etan Reisner Nov 11 '13 at 14:18

3 Answers

23
local s = "one;two;;four"
local words = {}
for w in (s .. ";"):gmatch("([^;]*);") do 
    table.insert(words, w) 
end

Adding one extra ; at the end turns the string into "one;two;;four;", so every field, including the empty one, is now followed by a delimiter. The pattern "([^;]*);" then captures each field: a greedy run of zero or more characters that are not ;, followed by a ;.

Test:

for n, w in ipairs(words) do
    print(n .. ": " .. w)
end

Output:

1: one
2: two
3:
4: four
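
The same trick can be wrapped in a small function. This is only a sketch: `split_keep_empty` is a hypothetical name, not part of the answer, and it assumes the delimiter is a single character with no special meaning in Lua patterns (such as ";" or ","). It also shows that leading and trailing empty fields are kept.

```lua
-- Hypothetical helper: split on a single, non-magic delimiter character,
-- keeping empty fields. Not suitable as-is for delimiters like "." or "%".
local function split_keep_empty(s, delim)
    local fields = {}
    -- Append one delimiter so that every field is followed by one.
    for field in (s .. delim):gmatch("([^" .. delim .. "]*)" .. delim) do
        fields[#fields + 1] = field
    end
    return fields
end

local a = split_keep_empty("one;two;;four", ";")
print(#a, table.concat(a, "|"))

local b = split_keep_empty(";a;", ";")
print(#b, table.concat(b, "|"))
```

The first call yields four fields with an empty string at index 3; the second yields three fields, the first and last of which are empty.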
Yu Hao
  • Wow, thanks. Your solution works perfectly! (I won't close this question yet: if somebody could explain to me why my original code returns spurious empty strings I'd be grateful.) – Niccolo M. Nov 11 '13 at 14:09
  • 4
    @NiccoloM. Remember that `*` matches zero or more, the empty string where I marked `$` in the string `one$;two$;$;four$` is also a match. – Yu Hao Nov 11 '13 at 14:15
  • But what about `one$;$two$;$;$fo$ur$`? Why is the zero match only before `;` ? Why isn't it also after the `;`, and between every two letters? – Niccolo M. Nov 11 '13 at 14:20
  • 2
    @NiccoloM. Because `*` is greedy, it will try to match as long as possible, the non-greedy version to match zero or more is `-`. – Yu Hao Nov 11 '13 at 14:24
  • After thinking very long about this, I now understand. I see that regexps in Ruby (and probably in other languages as well) behave in exactly the same way. Thanks. – Niccolo M. Nov 11 '13 at 14:44
  • 2
    It's worth noting that Lua patterns are not actually regular expressions; you will notice many differences between 'standard' regexp implementations and how Lua patterns operate. – Shaun Wilson Apr 15 '14 at 14:25
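
The scanning described in the comments above can be traced by hand. The loop below is a rough emulation, not gmatch's actual implementation: it assumes the matcher takes the longest run of non-; characters at the current position and, when that run is empty, steps over one character before trying again. Under that assumption it reproduces the seven results reported in the question.

```lua
local s = "one;two;;four"
local matches = {}
local pos = 1

while pos <= #s + 1 do
    -- "[^;]*" can match zero characters, so find always succeeds at pos.
    local from, to = s:find("[^;]*", pos)
    matches[#matches + 1] = s:sub(from, to)
    if to < from then
        pos = pos + 1   -- empty match: step over one character
    else
        pos = to + 1    -- non-empty match: resume right after it
    end
end

for i, m in ipairs(matches) do
    print(i, string.format("%q", m))
end
```

Each empty match sits right where the greedy run stopped: before a ; or at the end of the string. There is none in the middle of a word, because the greedy run has already consumed the whole word.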
0

Just changing * to + works.

local s = "one;two;;four"
local words = {}
for w in s:gmatch("([^;]+)") do 
    table.insert(words, w) 
    print(w)
end

The magic character * means 0 or more occurrences, so when the matcher meets ';', Lua treats it as an empty match: zero characters that satisfy [^;].

Sorry for my carelessness: words[3] should be an empty string, so this does not give what was asked. However, when I run the original code in the Lua 5.4 interpreter, everything works.

code here

running result here (I have to put links because of lack of reputation)

  • This does not give the OP's desired output, they want an empty string at index 3. `{ "one", "two", "", "four" }` – Nifim Dec 30 '20 at 18:18
  • @Nifim Sorry, I didn't read the question carefully. But when I use the Lua 5.4 interpreter, the original code suddenly works!? – Heywood Jason Dec 31 '20 at 08:58
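
One way to check the version difference reported here is to run the question's original loop and print the interpreter version next to the result. This is a sketch for experimenting, not an explanation taken from the thread. My understanding, which should be treated as an assumption, is that newer Lua releases skip an empty match that ends exactly where the previous match ended, which would account for the different result.

```lua
local s = "one;two;;four"
local words = {}

-- The original pattern from the question, unchanged.
for w in s:gmatch("([^;]*)") do
    words[#words + 1] = w
end

-- The number of results depends on the interpreter version.
print(_VERSION, #words, table.concat(words, "|"))
```

If portability across Lua versions matters, the append-a-delimiter approach from the first answer avoids relying on how empty matches are handled.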
-2
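-- Split str on any character in sep. Because the pattern uses "+",
-- empty fields are skipped rather than returned as "".
local function split(str, sep)
    local array = {}
    -- Note: sep is inserted into the character class unescaped.
    local reg = string.format("([^%s]+)", sep)
    for mem in string.gmatch(str, reg) do
        table.insert(array, mem)
    end
    return array
end

local s = "one;two;;four"
local array = split(s, ";")

for n, w in ipairs(array) do
    print(n .. ": " .. w)
end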

result:

1: one
2: two
3: four
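
Note that the split function above places sep into the character class as-is, so a separator such as "%" produces a malformed pattern. A possible variant, sketched below under the hypothetical name `split_escaped`, escapes every non-alphanumeric character in sep first. Like the original, it still drops empty fields.

```lua
-- Hypothetical variant of the answer's split: escapes the separator
-- before building the pattern. Empty fields are still skipped.
local function split_escaped(str, sep)
    local fields = {}
    -- Prefix every non-alphanumeric character with "%" so it is literal.
    local escaped = sep:gsub("%W", "%%%0")
    for field in str:gmatch("([^" .. escaped .. "]+)") do
        fields[#fields + 1] = field
    end
    return fields
end

local parts = split_escaped("10%20%30", "%")
print(table.concat(parts, ","))

local words = split_escaped("one;two;;four", ";")
print(#words)
```

The first call gives the three fields 10, 20 and 30; the second gives three words, because the empty field between the two semicolons is skipped.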

han xi