Apache Solr multicore (Drupal 7)

This article explains how to use Apache Solr as the search engine for a Drupal 7 site (it also applies to Drupal 6). It assumes that you are running Debian 6, but the description can also be used on any other Linux-based distribution.

Apache Solr is written in Java, so we need to start by installing Java. Note that Apache Solr can also be run inside a Tomcat installation; I prefer not to use Tomcat, mostly because I don't have much experience with it.

~$ apt-get install openjdk-6-jdk

Install Apache Solr

Apache Solr can be downloaded from http://ftp.download-by.net/apache//lucene/solr/ and as of this writing the version is 4.1.0. You should check which version is supported by the Drupal module you are using to connect to Solr.

~$ wget http://ftp.download-by.net/apache//lucene/solr/4.1.0/solr-4.1.0.tgz
~$ tar -zxvf solr-4.1.0.tgz
~$ cp -rf solr-4.1.0 /opt/apache-solr-4.1.0
~$ mv /opt/apache-solr-4.1.0/example /opt/apache-solr-4.1.0/drupal

We need to copy the schema definition and configuration found in the Apache Solr module from drupal.org. Start by renaming the files that came with Solr in the conf folder, adding .bak to their names.

~$ cd /opt/apache-solr-4.1.0/drupal/solr/conf
~$ mv schema.xml schema.xml.bak
~$ mv solrconfig.xml solrconfig.xml.bak
~$ cp <path to solr module>/solr-conf/schema.xml /opt/apache-solr-4.1.0/drupal/solr/conf
~$ cp <path to solr module>/solr-conf/solrconfig.xml /opt/apache-solr-4.1.0/drupal/solr/conf
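The rename-and-copy steps above can be wrapped in a small helper so they are easy to repeat after a module update. This is only a sketch: install_drupal_conf is a hypothetical function name, and both directory arguments are assumptions that you must adapt to your own paths.

```shell
#!/bin/sh
# Sketch: back up the stock Solr config files and install the Drupal ones.
# install_drupal_conf is a hypothetical helper, not part of Solr or Drupal.
install_drupal_conf() {
    module_conf="$1"   # folder holding the module's schema.xml and solrconfig.xml
    solr_conf="$2"     # Solr conf folder that should receive them

    # Check first, so nothing is renamed if the module files are missing.
    for f in schema.xml solrconfig.xml; do
        if [ ! -f "$module_conf/$f" ]; then
            echo "missing $module_conf/$f" >&2
            return 1
        fi
    done

    for f in schema.xml solrconfig.xml; do
        # Keep the original as .bak, but never overwrite an existing backup.
        if [ -f "$solr_conf/$f" ] && [ ! -f "$solr_conf/$f.bak" ]; then
            mv "$solr_conf/$f" "$solr_conf/$f.bak"
        fi
        cp "$module_conf/$f" "$solr_conf/$f"
    done
}
```

It would be called with the module's solr-conf folder as the first argument and the Solr conf folder as the second. Running it twice keeps the first backup, so the files shipped with Solr are not lost.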

I always have more than one Drupal site running on a single server, so I configure Solr to run multicore and use the same Solr instance to index all the sites on the server. The next step is to update the configuration and create a folder for each site that should be indexed.

~$ cd /opt/apache-solr-4.1.0/drupal
~$ cp multicore/solr.xml solr/
~$ mkdir solr/foo_com
~$ cp -rf solr/conf/ solr/foo_com
~$ nano -w solr/solr.xml

In the configuration you should add a new core line for each site that you want to index as shown below.

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="false">
  <cores adminPath="/admin/cores">
    <core name="foo_com" instanceDir="foo_com" />
  </cores>
</solr>
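With more than a couple of sites, creating the folders and core lines by hand gets repetitive. The sketch below generates the same solr.xml layout from a list of core names. write_solr_xml and add_cores are hypothetical helper names, and the script assumes the shared conf folder sits directly in the Solr home folder, as in the commands above.

```shell
#!/bin/sh
# Sketch: print a solr.xml with one <core> line per name given as argument.
# write_solr_xml is a hypothetical helper; the layout mirrors the file above.
write_solr_xml() {
    echo '<?xml version="1.0" encoding="UTF-8" ?>'
    echo '<solr persistent="false">'
    echo '  <cores adminPath="/admin/cores">'
    for core in "$@"; do
        printf '    <core name="%s" instanceDir="%s" />\n' "$core" "$core"
    done
    echo '  </cores>'
    echo '</solr>'
}

# Sketch: create one instance folder per core, copy the shared conf folder
# into it, and rewrite solr.xml so that it lists exactly these cores.
add_cores() {
    solr_home="$1"
    shift
    for core in "$@"; do
        mkdir -p "$solr_home/$core"
        cp -r "$solr_home/conf" "$solr_home/$core/"
    done
    write_solr_xml "$@" > "$solr_home/solr.xml"
}
```

For example, add_cores /opt/apache-solr-4.1.0/drupal/solr foo_com bar_org would set up two cores, where bar_org stands for a second, made-up site. Note that solr.xml is overwritten, so pass every core you want to keep.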

Testing Solr connection

You should now be able to connect to Solr by starting it at the command line. The first time you start Solr it will take some time to extract the Java archive file (jar). When it's running you should be able to connect to http://localhost:8983/solr using lynx or another browser. You should get a page with a "Solr" link and an "Admin" link to each core.

~$ cd /opt/apache-solr-4.1.0/drupal
~$ java -jar start.jar

Hit "Ctrl+c" to shut down Solr after you have checked the connection.
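If you prefer to check the connection from a script instead of a browser, something like the sketch below might help. It assumes curl is installed and that Solr listens on localhost:8983 with the adminPath shown in solr.xml. core_status_url and wait_for_solr are hypothetical helper names, and SOLR_BASE is an optional variable introduced here for illustration.

```shell
#!/bin/sh
# Build the CoreAdmin STATUS URL for one core.
# The default base URL is an assumption; override it with SOLR_BASE.
core_status_url() {
    base="${SOLR_BASE:-http://localhost:8983/solr}"
    echo "$base/admin/cores?action=STATUS&core=$1"
}

# Poll once per second until Solr answers or the attempts are used up.
# Returns 0 when Solr responded, 1 otherwise.
wait_for_solr() {
    core="$1"
    attempts="${2:-30}"
    while [ "$attempts" -gt 0 ]; do
        # -s: quiet, -f: treat HTTP errors as failure, -o: discard the body
        if curl -sf -o /dev/null "$(core_status_url "$core")"; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}
```

Calling wait_for_solr foo_com 60 waits up to about a minute. Bear in mind that this only shows that Solr is answering; to confirm that a particular core is loaded you would have to look at the returned status document itself.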

Automatically start Solr (run-script)

If the server is rebooted I want Solr to start automatically, which can be achieved by creating a run-script and inserting it into the right run-levels.

~$ nano -w /etc/init.d/apachesolr

Add the following lines to the file and remember to update the version number in the "SOLR_DIR" variable in the script.

#!/bin/sh
### BEGIN INIT INFO
# Provides:            apachesolr
# Required-Start:    $local_fs $remote_fs $network $syslog $named
# Required-Stop:    $local_fs $remote_fs $network $syslog $named
# Default-Start:      2 3 4 5
# Default-Stop:      0 1 6
# X-Interactive:     true
# Short-Description: Start/stop Apache Solr search framework
### END INIT INFO

SOLR_DIR="/opt/apache-solr-4.1.0/drupal/"
JAVA_OPTIONS="-Xmx1024m -DSTOP.PORT=8079 -DSTOP.KEY=stopkey -jar start.jar"
LOG_FILE="/var/log/solr.log"
JAVA="/usr/bin/java"

case $1 in
    start)
        echo "Starting Solr"
        cd "$SOLR_DIR" || exit 1
        $JAVA $JAVA_OPTIONS 2> $LOG_FILE &
        ;;
    stop)
        echo "Stopping Solr"
        cd "$SOLR_DIR" || exit 1
        $JAVA $JAVA_OPTIONS --stop
        ;;
    restart)
        $0 stop
        sleep 1
        $0 start
        ;;
    *)
        echo "Usage: $0 {start|stop|restart}" >&2
        exit 1
        ;;
esac

After saving the script above you have to change its permissions so it can be executed, and add it to the different run-levels. You may need to install the "sysv-rc-conf" program, which is used to edit the run-levels.

~$ chmod 755 /etc/init.d/apachesolr
~$ /etc/init.d/apachesolr start
~$ apt-get install sysv-rc-conf
~$ sysv-rc-conf --level 2345 apachesolr on
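To confirm that the script really ended up in the run-levels, you can look for its start links. The sketch below assumes the usual SysV layout, where each run-level has a folder /etc/rcN.d containing links named S, two digits, then the script name. enabled_runlevels is a hypothetical helper name, and its second argument exists only so that a different base folder can be used.

```shell
#!/bin/sh
# Sketch: print every run-level (one per line) that has a start link
# for the given init script. enabled_runlevels is a hypothetical helper.
enabled_runlevels() {
    name="$1"
    rc_base="${2:-/etc}"
    for level in 0 1 2 3 4 5 6; do
        # An unmatched pattern stays literal, so the -e test simply fails.
        for link in "$rc_base/rc$level.d"/S[0-9][0-9]"$name"; do
            if [ -e "$link" ]; then
                echo "$level"
            fi
        done
    done
}
```

After the sysv-rc-conf command above, enabled_runlevels apachesolr should list the levels 2, 3, 4 and 5 if the links were created as expected.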

Drupal

Go to your Drupal site (foo.com) and enable the Apache Solr framework, Apache Solr search and Drupal core search modules. You can do this with drush by running the command below.

~$ drush en apachesolr apachesolr_search search

Go to admin/config/search/apachesolr/settings on the site (foo.com), edit the default configuration, and update the URL as shown below.

    Solr server URL: http://localhost:8983/solr/foo_com

If Drupal can connect to Apache Solr you will get a green feedback message; if not, a big red one. Now run the site's cron to index the site, or run the commands below.

~$ drush solr-reindex
~$ drush solr-index

Comments

Thank you! The best article to make Solr start automatically.

~$ cd /opt/apache-solr-4.1.0/drupal/solr/conf

That directory doesn't exist in Solr 4.2.0. The real path is:

~$ cd /opt/apache-solr-4.1.0/drupal/solr/collection1/conf/
