
I am applying various regexp_filter rules in an index to modify the stored index text. The data is originally in tagged HTML format (multiple zones). After doing so, is it possible to make a second index, based on the first modified index, that uses only one of the zones in the original index?

index_zones = Title, Author, Description

Can I, after indexing this with a custom configuration, then duplicate this index in some way that says

Create IndexB based on IndexA using ZONE:(Title) only

Say for instance I did the following regexp:

regexp_filter=(<Title>.*?ipad.*?)(<\/Title>)<Description>.*?Used.*?<\/Description>=>\1 Used\2

in order to index "Used" into the Title zone.

Now I want to reindex or make a new index with just the newly indexed

<Title>Bla bla ipad bla bla Used</Title>

Is this possible? If not, can I then update my MySQL table with the newly indexed text?

user3649739

1 Answer


I don't think it's possible to create an index based on an existing Sphinx index. I also don't think it's possible to retrieve the regexp_filter result; I'm pretty sure it's only available to query against.

Why don't you apply your regexes before Sphinx indexing? For example, create a new DB column ipad_used_regex and populate it using whatever scripting language you choose. Or, using MariaDB with the PCRE regex enhancements, you could build the regex match into the SQL, something like this:

SELECT Title, REGEXP_REPLACE(Title, "(<Title>.*?ipad.*?)(<\/Title>)<Description>.*?Used.*?<\/Description>", '\\1 Used\\2') as ipad_used_regex
FROM `your_table`

You could then use this SQL as the sql_query in your Sphinx index source.
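If you go the scripting route instead, here is a minimal sketch of the pre-processing step in Python. It assumes each document is available as one tagged string with the Title zone immediately followed by the Description zone, as in the question's regexp_filter. The function name title_with_used is hypothetical, and the database read/write around it is left out.

```python
import re

# Python port of the pattern from the question's regexp_filter.
# re.S lets "." span newlines, on the assumption that the tagged
# HTML may contain line breaks inside a zone.
PATTERN = re.compile(
    r"(<Title>.*?ipad.*?)(</Title>)<Description>.*?Used.*?</Description>",
    re.S,
)


def title_with_used(doc: str) -> str:
    """Return the Title zone text, with ' Used' appended when the filter matches."""
    # Same substitution as "=>\1 Used\2" in the regexp_filter.
    filtered = PATTERN.sub(r"\1 Used\2", doc)
    # Pull out only the Title zone, which is what the second index would use.
    match = re.search(r"<Title>(.*?)</Title>", filtered, re.S)
    return match.group(1) if match else ""


if __name__ == "__main__":
    sample = "<Title>Bla bla ipad bla bla</Title><Description>Used once</Description>"
    print(title_with_used(sample))
```

The returned string is what you would store in a column such as ipad_used_regex, so a second Sphinx index can select just that column.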

joshweir
  • I think using the SQL in the source might get cumbersome, as my example was a simple, watered-down version of the actual one. But yeah, it seems that in this case I am going to have to pre-process the MySQL data, which will add unavoidable overhead, which is too bad. zone_weights would be *quite* a helpful addition! – user3649739 Mar 30 '17 at 13:06