Running your own copy
The ConceptNet 5 server comes in three pieces: the main index in Apache Solr, a REST API that's served from Python, and a Web interface on top of that API.
You'll need a Solr representation of the data in ConceptNet. You can get it either by running the build process on your computer or by downloading the built Solr data from http://conceptnet5.media.mit.edu/downloads/current/. (Look for the filename with "solr" in it.)
Our Solr environment is packaged up at:
http://conceptnet5.media.mit.edu/downloads/20120501/conceptnet5-solr-config.tar.gz
(That has an old version number in the URL, but the Solr configuration hasn't changed.)
You should be able to unpack that and run "java -jar start.jar" to get a server, and then use the included "import-solr-json.sh" to load the ConceptNet 5 data that you download separately from:
http://conceptnet5.media.mit.edu/downloads/current/
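Once the import finishes, you may want to confirm that Solr is answering queries. The following is a minimal sketch in Python 3 and is not part of the ConceptNet distribution. It assumes Solr is listening on its default Jetty port, 8983, and that the select handler lives at /solr/select; the packaged configuration may use a different path. SOLR_BASE_URL and the function names are made up for this example, so adjust them as needed.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: Solr's default address when started with "java -jar start.jar".
SOLR_BASE_URL = "http://localhost:8983/solr"


def build_select_url(base_url, query="*:*", rows=0):
    """Build a URL for Solr's select handler that asks for a JSON response."""
    params = urlencode({"q": query, "wt": "json", "rows": rows})
    return "{}/select?{}".format(base_url.rstrip("/"), params)


def num_found(solr_response):
    """Extract the number of matching documents from a decoded Solr response."""
    return solr_response["response"]["numFound"]


def count_documents(base_url=SOLR_BASE_URL):
    """Ask a running Solr server how many documents are in its index."""
    with urlopen(build_select_url(base_url)) as response:
        return num_found(json.loads(response.read().decode("utf-8")))
```

With the server running, count_documents() should return a number greater than zero if the import loaded any data.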
NOTE: Importing the full data set may give you an index that doesn't fit in memory and spends an unreasonable amount of time swapping to disk. The machines we run it on are dedicated servers with 64 GB of RAM. Before that, we ran it in two shards, each on an Amazon EC2 m1.large instance with 17 GB of RAM. (This was expensive and is not recommended.)
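As a rough way to anticipate the memory problem described in the note, you could compare the on-disk size of the Solr index with the machine's physical RAM. This is only a heuristic sketch, not something shipped with ConceptNet. The function names and the 0.75 headroom factor are assumptions chosen for illustration, you have to supply the path to your own Solr data directory, and the RAM lookup relies on os.sysconf names that are available on Linux.

```python
import os


def directory_size_bytes(path):
    """Return the total size of all regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full_path = os.path.join(root, name)
            # Skip symbolic links so linked files are not counted.
            if not os.path.islink(full_path):
                total += os.path.getsize(full_path)
    return total


def physical_memory_bytes():
    """Return the amount of physical RAM (works on Linux)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def index_fits_in_memory(index_dir, headroom=0.75):
    """Rough check: is the index smaller than `headroom` times the RAM?

    The headroom factor is an arbitrary illustrative value that leaves
    room for the Java process and the rest of the system.
    """
    return directory_size_bytes(index_dir) <= headroom * physical_memory_bytes()
```

If index_fits_in_memory returns False for your Solr data directory, expect the kind of heavy disk activity the note warns about.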
The REST API is where you need the conceptnet5 Python code, so begin by checking out the Git repository at https://github.com/commonsense/conceptnet5.
The REST API is in conceptnet5/conceptnet_api.wsgi. This can be run using a Python WSGI server or Apache's mod_wsgi. An example of running it in Gunicorn is included in gunicorn.sh.
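For a quick local test without Gunicorn or Apache, a .wsgi file can also be loaded and served with Python's standard library. The sketch below assumes the file defines a module-level callable named application, which is the convention mod_wsgi expects; check the file itself to confirm. The helper names and the port number 8084 are hypothetical and chosen only for this example. The wsgiref server handles one request at a time, so treat it as a development aid, not a replacement for a real WSGI server.

```python
import runpy
from wsgiref.simple_server import make_server


def load_wsgi_application(wsgi_path):
    """Execute a .wsgi file and return its `application` callable.

    Assumption: the file follows the mod_wsgi convention of exposing
    a module-level object named `application`.
    """
    namespace = runpy.run_path(wsgi_path)
    return namespace["application"]


def serve(wsgi_path, host="127.0.0.1", port=8084):
    """Serve the WSGI application on host:port until interrupted."""
    app = load_wsgi_application(wsgi_path)
    server = make_server(host, port, app)
    server.serve_forever()
```

For example, serve("conceptnet5/conceptnet_api.wsgi") would start the API on the hypothetical port above, provided the conceptnet5 package and its dependencies are importable.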
The Web interface is also a WSGI file, in conceptnet5/web_interface/conceptnet_web.wsgi. You run it the same way as the API server, just pointing it to that .wsgi file instead.