My original goal was to map multiple columns from a table to one field in Solr without knowing the exact names of the columns. This would make it possible to map all result columns returned by the entity query to one field, for example all columns of type text (or varchar).
I'm using the DataImportHandler to import data from an SQL Server. The datasource configuration is not the problem, since a direct mapping from a column name to the corresponding field name works fine. For example:
<entity name="foo" dataSource="my_database" query="select * from foo">
<field column="db_column_name" name="solr_field_name"/>
</entity>
After many hours of trying and searching I ended up defining an extra MappingTable containing all column names and the fields to map them to, as explained in Solr - DataImportHandler: When attempting to use column values as field names, multivalued fields only retain the first result
The MappingTable looks like
SOLR_FIELD MY_FIELD
name column_name
id column_id
content column_desc
content column_second_desc
The SourceTable looks like
column_id column_name column_desc
123 goose this column doesnt quak
342 plain this is another description
There is no foreign key defined between these tables. The idea is to use the MappingTable to get the data out of every row of SourceTable and put it into Solr fields without explicitly naming the columns of SourceTable in the configuration (having to create an extra table still keeps the process dynamic enough).
This example shows how I tried to realize the mapping:
<entity name="source" dataSource="my_database" query="SELECT * FROM SourceTable">
<entity name="mapping" query="select * from MappingTable" transformer="ClobTransformer" >
<field name="${mapping.SOLR_FIELD}" column="MY_FIELD" clob="true" />
</entity>
</entity>
As you may see, this only returns four documents, each containing the literal words column_name, column_id, column_desc and column_second_desc repeated as often as Solr found rows (datasets) in my SourceTable.
After that I turned the terms around and used column="${mapping.MY_FIELD}" to get the column name dynamically from the MappingTable.
<entity name="mapping" query="select * from MappingTable" transformer="ClobTransformer" >
<entity name="source" dataSource="my_database" query="SELECT * FROM SourceTable">
<field name="${mapping.SOLR_FIELD}" column="${mapping.MY_FIELD}"/>
</entity>
</entity>
But this doesn't work for me. Solr doesn't execute /dataimport?command=full-import, nor does it give me a hint in the "Logging" section of the Solr admin.
Maybe there is a possibility to realize this using a foreign key between the tables. I also thought about approaching it with a ScriptTransformer and looked at the answer in Dynamic column names using DIH (DataImportHandler), but this only shows how to realize a dynamic custom Solr field, and it also assumes that the user knows the source column names. So I tried something like what is explained in In solr dih import two double in one location
<script><![CDATA[
function catchColumns(row) {
    // row is a java.util.Map of column name -> value
    var descriptions = new java.util.ArrayList();
    var it = row.keySet().iterator();
    while (it.hasNext()) {
        var value = row.get(it.next());
        // collect every string-valued column, whatever its name
        if (value instanceof java.lang.String) {
            descriptions.add(value);
        }
    }
    row.put("description", descriptions);
    return row;
} ]]></script>
Using a simple entity definition
<entity name="foo"
dataSource="my_database"
transformer="script:catchColumns"
query="select * from foo">
...
</entity>
But I guess the ScriptTransformer only works with columns explicitly named in field tags inside the entity.
I hope someone can spot a mistake or offer me a better way.
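To sanity-check the collection logic itself outside Solr, here is a plain-JavaScript simulation. Note that DIH actually passes a java.util.Map to the transformer function, so this plain-object version is only an approximation, and catchColumnsPlain is a name I made up for the sketch:

```javascript
// Simulates the transformer logic on a plain object instead of a java.util.Map:
// gather every string-valued column into one multivalued "description" field.
function catchColumnsPlain(row) {
  var descriptions = [];
  for (var key in row) {
    if (typeof row[key] === "string") {
      descriptions.push(row[key]);
    }
  }
  row.description = descriptions;
  return row;
}

var result = catchColumnsPlain({
  column_id: 123,
  column_name: "goose",
  column_desc: "this column doesnt quak"
});
console.log(result.description); // collects both string-valued columns, skips the number
```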
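For the narrower goal of getting all text columns into one field, a schema-side catch-all might also work, since DIH maps result columns to same-named Solr fields when no explicit field tag is given. This is only a sketch, assuming the default text_general field type and that the source columns share the column_ prefix from the tables above:

<!-- schema.xml sketch: accept any column matching the prefix, then copy them all into "content" -->
<dynamicField name="column_*" type="text_general" indexed="true" stored="true" multiValued="true"/>
<copyField source="column_*" dest="content"/>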
Edit: some quotation marks were missing; that happened while changing names and queries for the examples. I also tried to fix some blurrily worded paragraphs.