I want to replace UTF-8 HTML entities in HTML sources with the real characters. I have an "entities" replacement table, which is traversed by the code below. When I run this code, CPU usage goes up to 100%.
Could you please help me rewrite the first loop in a better way? I understand that strings in Lua are immutable, so I think many copies of the data variable are created, and this could be the reason.
local entities = {
  {["char"]="!", ["utf"]="&#33;"},
  {["char"]='"', ["utf"]="&#34;"},
  {["char"]="#", ["utf"]="&#35;"},
  {["char"]="$", ["utf"]="&#36;"},
  {["char"]="%", ["utf"]="&#37;"},
  {["char"]="&", ["utf"]="&#38;"},
  {["char"]="'", ["utf"]="&#39;"},
  -- +312 rows more
}
local function clear_text(data)
  for _, e in ipairs(entities) do
    data = string.gsub(data, e.utf, e.char)
  end
  return data
end
-- this is just for testing ... replacement in many html sources
for i=1,200 do
  local data = some_html_page_source()
  clear_text(data)
end
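
For comparison, here is a minimal sketch of one way the first loop might be restructured. It is not a measured fix. The loop above scans the whole page once per table row (over 300 times) and builds a new string on every pass that replaces something. The sketch turns the list into a keyed lookup table once, then lets a single string.gsub call find every "&...;" candidate and look it up. The name entity_map, the pattern "&#?%w+;", and the three sample rows in numeric entity form are assumptions made for illustration.

```lua
-- Illustrative subset of the entities list; the numeric entity form is assumed.
local entities = {
  { char = "!", utf = "&#33;" },
  { char = "%", utf = "&#37;" },
  { char = "&", utf = "&#38;" },
}

-- Build the lookup once: entity text -> replacement character.
-- entity_map is a hypothetical helper name, not part of the original code.
local entity_map = {}
for _, e in ipairs(entities) do
  entity_map[e.utf] = e.char
end

-- Single pass over the string. Each match of "&", an optional "#",
-- alphanumerics and ";" is used as a key into entity_map.
-- When the lookup yields nil, gsub keeps the original match unchanged.
local function clear_text(data)
  -- The extra parentheses drop gsub's second result (the match count).
  return (data:gsub("&#?%w+;", entity_map))
end

print(clear_text("100&#37; done&#33; &unknown; stays"))
-- prints: 100% done! &unknown; stays
```

With a table as the replacement argument, the values are inserted as-is, so a replacement such as "%" needs no escaping. In a replacement string, string.gsub treats "%" specially. A single pass also never rescans text it has already replaced, so "&#38;#33;" becomes "&#33;" rather than "!".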