
Usage with Solr

Luke Lovett edited this page Feb 5, 2014 · 23 revisions

The Basics

Mongo Connector can replicate to the Solr search engine using the Solr DocManager. The most basic usage is the following:

mongo-connector -m localhost:27017 -t http://localhost:8983/solr -d <your-doc-manager-folder>/solr_doc_manager.py

This assumes that a MongoDB replica set is running on port 27017 and that Solr is running on port 8983, both on the local machine.

Mongo Connector and schema.xml

Mongo Connector comes with an example schema.xml file that can help you get started integrating MongoDB with Solr. Solr reads schema.xml to determine field types, the fields that documents may have, the primary key, and more. Mongo Connector tries to obtain the Solr schema through the LukeRequestHandler by appending the path admin/luke/?show=schema&wt=json to the base Solr URL. In the example above, it sends a GET request to http://localhost:8983/solr/admin/luke/?show=schema&wt=json.
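To check by hand what that request looks like, the URL construction described above can be sketched in Python. This is an illustration only, not Mongo Connector's own code: the helper names luke_schema_url and fetch_schema are hypothetical, and the only details taken from this page are the path and the example base URL.

```python
import json
from urllib.request import urlopen

# Path quoted on this page; it is appended to the base Solr URL.
LUKE_PATH = "admin/luke/?show=schema&wt=json"


def luke_schema_url(base_url):
    """Build the schema URL from a base Solr URL (hypothetical helper)."""
    # Strip any trailing slash so exactly one slash joins the two parts.
    return base_url.rstrip("/") + "/" + LUKE_PATH


def fetch_schema(base_url):
    """Send a GET request to the Luke endpoint and parse the JSON reply.

    Requires a reachable Solr instance; it is not called in this sketch.
    """
    with urlopen(luke_schema_url(base_url)) as response:
        return json.loads(response.read().decode("utf-8"))


print(luke_schema_url("http://localhost:8983/solr"))
# prints http://localhost:8983/solr/admin/luke/?show=schema&wt=json
```

Calling fetch_schema("http://localhost:8983/solr") against a running Solr instance should return the same JSON you would see by opening that URL in a browser.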

Mongo Connector drops any fields from MongoDB documents that aren't declared in your Solr core's schema, so that Solr does not throw exceptions and fail to insert those documents. Until you define the fields you want in schema.xml and reload the Solr core, Mongo Connector will merrily continue stripping those undeclared fields from your MongoDB documents. You can check what Solr considers the schema of your core to be by visiting the aforementioned endpoint in your browser.
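The effect of this field stripping can be pictured with a small Python sketch. It is a simplified illustration rather than the connector's actual implementation: the function name, the set of declared fields, and the sample document are all made up for the example, and it only considers top-level field names.

```python
def strip_undeclared_fields(doc, declared_fields):
    """Return a copy of doc that keeps only fields named in declared_fields."""
    return {key: value for key, value in doc.items() if key in declared_fields}


# Hypothetical field names, standing in for what schema.xml declares.
declared = {"_id", "title", "body"}

# Hypothetical MongoDB document; "views" is not declared in the schema.
mongo_doc = {"_id": "abc123", "title": "Hello", "body": "World", "views": 7}

print(strip_undeclared_fields(mongo_doc, declared))
# prints {'_id': 'abc123', 'title': 'Hello', 'body': 'World'}
```

In this picture, adding "views" to the declared set (that is, defining the field in schema.xml and reloading the core) is what stops the field from being dropped.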

Unique Keys between Solr and MongoDB

MongoDB generally uses a field called _id to store unique keys in documents. Solr by default uses id for the same purpose. Both systems require their unique key field to be present in every document, so submitting a document unchanged from MongoDB to Solr while Solr's unique key is still id results in an exception from Solr, and the document is not inserted. For Mongo Connector to replicate to Solr successfully, Solr needs to see the expected unique key in each document. There are two ways to do this:

  1. Mongo Connector can translate _id to id when operations are replicated to Solr if you pass the option --unique-key=id to mongo-connector. The new id field will hold a stringified version of what was stored in the _id field.

  2. You can switch Solr's unique key to _id instead of id. If you're working from the schema.xml provided as part of Mongo Connector, this is already done for you! Otherwise, you can accomplish this by editing the schema.xml file and replacing the line:

     <uniqueKey>id</uniqueKey>

    with the line:

     <uniqueKey>_id</uniqueKey>

    You'll also need to add a field definition for this key. Inside the <fields></fields> tags, you should insert:

     <field name="_id" type="string" indexed="true" stored="true" />

    Finally, you'll need to reload your Solr core.
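
For the first option, the translation from _id to id can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not Mongo Connector's code: translate_unique_key is a hypothetical name, and the sample document uses an integer _id to keep the example self-contained (real MongoDB documents often carry an ObjectId there).

```python
def translate_unique_key(doc, unique_key="id"):
    """Return a copy of doc with _id moved to unique_key as a string."""
    translated = dict(doc)  # copy so the caller's document is untouched
    # Remove _id and store its stringified value under the new key.
    translated[unique_key] = str(translated.pop("_id"))
    return translated


# Hypothetical MongoDB document.
mongo_doc = {"_id": 42, "title": "Hello"}

print(translate_unique_key(mongo_doc))
# prints {'title': 'Hello', 'id': '42'}
```

Because the value is converted to a string, the id that Solr stores is text even when the original _id was a number or another type.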
