Official website for Web Designer - defining the internet through beautiful design
Author: Steve Jenkins
25th June 2013

How to add search with Solr

Discover how to set up Solr to index data and return search results via a PHP-based application

Search is the lifeblood of the web, but on many sites it can be a decidedly underwhelming experience, lacking the richness and accuracy that users have come to expect from Google.

If you are developing sites using PHP and MySQL, search can be very difficult to do well – especially if you need it to be quick as well as comprehensive.
So, what is the solution that will give your sites a search facility that will work quickly and accurately?

Enter Apache Solr: an open source search tool providing fast and accurate results on a very scalable platform, with a whole host of advanced features: faceting, spellchecking, hit highlighting, result boosting and more. It runs as a Java servlet, so needs a container like Tomcat or Jetty – to get started, we’ll be using the build of Jetty supplied with the Solr download. It’ll run on OS X, Windows or Linux at the command line.
We’ll introduce the key Solr schema file, set one up for our sample application, then walk through using a PHP library to connect to our Solr server, index data from our database and return search results.

AUTHOR: Russell Hicks

Start with a table

Let’s start by creating the database, which has a single table to store products. Fire up a MySQL client (MySQL Workbench, phpMyAdmin or the command line are all good choices) and run these commands:

CREATE DATABASE solrtut;
USE solrtut;
CREATE TABLE products (
  product_id INT NOT NULL AUTO_INCREMENT,
  name VARCHAR(100),
  description TEXT,
  price DECIMAL(6,2), -- exact decimal type, so prices such as 129.99 are stored without rounding
  PRIMARY KEY (product_id)
);

Fill the table up

Now let’s add some data – you can either use the command line, or run the supplied product-data.sql file. This table will get more sophisticated in the next tutorial, but for now this will give us five vintage-computing-themed products to start indexing and searching, with a range of prices.

INSERT INTO products (product_id, name, description, price) VALUES (1, 'Amstrad CPC 464', 'Vintage computing, horrible design', 29.99);
INSERT INTO products (product_id, name, description, price) VALUES (2, 'Vic 20', 'Now you are talking – pure retro joy', 14.99);
INSERT INTO products (product_id, name, description, price) VALUES (3, 'ZX Spectrum', 'One for the rubber enthusiasts. 48 whole k.', 49.99);
INSERT INTO products (product_id, name, description, price) VALUES (4, 'Commodore 64', 'Games Central', 39.99);
INSERT INTO products (product_id, name, description, price) VALUES (5, 'BBC Micro Model B', 'State of the art turtle navigation.', 129.99);

Start Solr

Download the 4.x release from the Solr website and unzip it to a convenient location on your machine (we used /opt/solr-4.2.0/). Solr comes supplied with an example application, ready to run – in a subdirectory of the unzipped Solr build called ‘example’. In the terminal in OS X, a shell in Linux or the command line in Windows, go to the example directory and start Solr:

java -jar start.jar

The Solr admin panel

You can now browse to the Solr admin panel from any web browser at localhost:8983/solr/. You should see the Solr admin panel, which we’ll look at in more detail a little later. If you can’t browse to it, look at the terminal output to see if there are any error messages. If all is well, stop Solr running by hitting Ctrl+C in your terminal.

The Solr schema

In the example app we’re looking at, the schema.xml file lives at /example/solr/collection1/conf/schema.xml. Browse to this file and open it in your text editor of choice. Next you’re going to add some fields to your schema – these correspond directly to the fields you have already set up in the database. Solr is very flexible, allowing you to define data types with a huge variety of filters, but here we’re using some stock ones from the example schema.
Find the field definition for ‘id’ (<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />) and underneath it, add these lines:

<field name="product_name" type="text_general" indexed="true" stored="true"/>
<field name="product_description" type="text_general" indexed="true" stored="true"/>
<field name="product_price" type="float" indexed="true" stored="true"/>

Check Solr schema

After saving schema.xml, start the Solr instance again using the same command as before (java -jar start.jar). Browse to the admin panel at localhost:8983/solr/. If you get any error messages, stop Solr by hitting Ctrl+C, check the schema file has no typos and is properly formed, then try again.

A blank application

Now it’s time to set up your PHP application. You should have a web server set up on your development machine, and a directory that we can use for this tutorial (either the site root, a virtual host or a subdirectory). We’ll refer to this location as the website home directory. We’ll also assume you can browse to the site at http://localhost – if you are using a different hostname, or a subdirectory, just use that instead. Now go to your website home directory and create four empty files with the following names:

export.php
search.php
constants.php
results.php

Time for constants

There are some variables we’ll need to use repeatedly – database connection parameters, and the host, name and port of our Solr instance. To make life easy, we’ll define those in our constants file. Open constants.php in your text editor and add the following lines, making sure you add your own database username and password where appropriate:

<?php
define('DBUSER', 'root'); //change this value to your DB username
define('DBPASS', 'password'); //change this value to your DB password
define('DBHOST', 'localhost');
define('DBSCHEMA', 'solrtut');
define('SOLRNAME', '/solr');
define('SOLRPORT', '8983');
define('SOLRHOST', 'localhost');
?>

Solr PHP clients

There are a number of clients available for PHP that work well with Solr, including the excellent Solarium. The simplest to get up and running with is the Solr PHP Client, available from bit.ly/17JAdPp. Download the latest archive and unzip it into the root of your web app directory. Once you’ve extracted the archive into a subdirectory (call it SolrPHPClient), you can remove the zip file.

Fixing Solr PHP client

There is an issue with solr-php-client that prevents it from working with the current version of Solr (4.2), due to some Solr commands being deprecated. To make sure everything is working as it should, we need to apply a patch file, which is available from http://bit.ly/140TutI. Download the Service.php.patch file so it’s in the same directory as Service.php in the solr-php-client – /SolrPHPClient/Apache/Solr/Service.php.patch. Then you can apply it either from the command line in that directory (for example, with the standard patch utility: patch Service.php < Service.php.patch) or with any patching tool (NetBeans has one built in – just select Tools>Apply Diff Patch from the main menu).

Preparing data

Now let’s add data from the database to the Solr index. We’ll start by writing a skeleton file to connect to the database and grab the records. Open export.php in your text editor and add the following lines:

<?php
require('constants.php');
//the constant names must match those defined in constants.php
$mysqli = new mysqli(DBHOST, DBUSER, DBPASS, DBSCHEMA);
if ($mysqli->connect_errno) {
    echo "Failed to connect to MySQL: (" . $mysqli->connect_errno . ") " . $mysqli->connect_error;
}
/* Select queries return a resultset */
if ($result = $mysqli->query("SELECT * FROM products")) {
    printf("Select returned %d rows.\n", $result->num_rows);
    /*We’re going to add rows to Solr here*/
    /* free result set */
    $result->close();
}
?>

Indexing data

You should now be able to browse to localhost/export.php and see a result count. Now we know we can connect to the database and access our records, let’s loop through our results and send them to Solr. In export.php, replace the line ‘/*We’re going to add rows to Solr here*/’ with this code:

    //declare an empty array to hold our data to send to Solr
    $documents = array();
    require_once('SolrPHPClient/Apache/Solr/Service.php');
    $solr = new Apache_Solr_Service(SOLRHOST, SOLRPORT, SOLRNAME);
    //$result is the result set returned by the SELECT query
    while ($row = $result->fetch_object())
    {
        // For each row, create a new Solr doc
        $document = new Apache_Solr_Document();
        //the field names must match those defined in schema.xml
        $document->id = $row->product_id;
        $document->product_description = $row->description;
        $document->product_name = $row->name;
        $document->product_price = $row->price;
        //add document to array
        $documents[] = $document;
    }
    if(!empty($documents))
    {
        $solr->addDocuments($documents);
        $solr->commit();
        $solr->optimize();
    }

Pushing data to Solr

After making sure the Solr server is running (start it if it isn’t), refresh localhost/export.php in your browser. Solr will now be populated with the records from our database. There are many ways to get data into Solr, including simply sending it an appropriately formed XML file using cURL. Using the technique above, however, allows us to do some simple error checking and is very effective for replacing the contents of an entire Solr index.
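To give an idea of what an ‘appropriately formed XML file’ looks like, here is a minimal sketch that builds Solr’s XML add message from an array of rows. The function name buildSolrAddXml() is our own invention for illustration and is not part of the Solr PHP Client; the field names are the ones from our schema.

```php
<?php
// Hypothetical helper (not part of the Solr PHP Client): builds the XML
// "add" message that Solr's update handler accepts.
function buildSolrAddXml(array $rows)
{
    $xml = '<add>';
    foreach ($rows as $row) {
        $xml .= '<doc>';
        foreach ($row as $field => $value) {
            // Escape the field name and the value so the XML stays well formed
            $xml .= '<field name="' . htmlspecialchars($field, ENT_QUOTES, 'UTF-8') . '">'
                  . htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8')
                  . '</field>';
        }
        $xml .= '</doc>';
    }
    return $xml . '</add>';
}

// Sample row using the field names from our schema
echo buildSolrAddXml(array(
    array('id' => 2, 'product_name' => 'Vic 20', 'product_price' => 14.99),
));
```

The resulting string could be saved to a file and posted to Solr’s update handler with cURL, followed by a commit. The PHP client does the equivalent work for us when we call addDocuments().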

Solr query syntax

Now we can check if the data is actually present by running some simple queries against our Solr instance. The query syntax is rather different to SQL – you start with the field name you want to query, then a colon, then the data you want to match, like ‘product_name:Vic’ (a value containing spaces needs double quotes, as in ‘product_name:"Vic 20"’). To start with, we just want to run a wildcard query to make sure our data is present and correct: ‘*:*’ will do that. We do this by forming an appropriate HTTP GET request directed at our Solr instance and placing the query string in a parameter called ‘q’ – so when we access localhost:8983/solr/select?q=*:* in a browser, we should see our five records.
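If you would rather build these request URLs in code than type them by hand, the following sketch shows one way to do it. The helper buildSolrSelectUrl() is hypothetical, its defaults simply mirror the host, port and path used in this tutorial, and the wt=json parameter (asking Solr for JSON output) is our own addition.

```php
<?php
// Hypothetical helper: assembles a select URL for a Solr instance.
// The default host, port and path mirror the values used in this tutorial.
function buildSolrSelectUrl($query, $host = 'localhost', $port = 8983, $path = '/solr')
{
    // http_build_query() URL-encodes the query, so spaces, quotes and colons are safe
    $params = http_build_query(array('q' => $query, 'wt' => 'json'), '', '&');
    return 'http://' . $host . ':' . $port . $path . '/select?' . $params;
}

echo buildSolrSelectUrl('*:*'), "\n";
echo buildSolrSelectUrl('product_name:"Vic 20"'), "\n";
```

Encoding the query this way matters as soon as it contains spaces or quotes, which a browser address bar will often fix up for you but a script will not.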

Search page skeleton

So we have a Solr index with some data in; now we want to make a form and results page so we can search it from our PHP application. We’ll start by making the form, which in this case is about as basic as it’s possible to get – open search.php in your text editor and add the following code:

<html>
  <body>
    <form action="results.php" method="get">
      <label for="query">Search:</label>
      <input id="query" name="query" placeholder="Enter your search" />
      <input type="submit"/>
    </form>
  </body>
</html>

Search page details

Now you have a query form, let’s make a page to get some results. Open results.php in your text editor, then put the following comment skeleton in place:

<?php
//1. check that a query has been submitted, send user back to search page otherwise
//2. if we have a query term, connect to Solr, query and grab the result
//3. check the results – are there any? If not, display an appropriate message
//4. if there are results, iterate through them and display
?>

Get the query term

Some basic control here: check that the query string has been submitted, and if not, redirect back to our search page. Obviously all the usual advice about sanitising user input applies – in production, you should treat the user input that you’re passing to Solr with the same caution you’d use with anything being passed to a database server. So, in results.php, enter the following code under the comment that starts ‘//1. check that a query…’

if(!isset($_REQUEST['query']) || empty($_REQUEST['query']))
{
    header("Location: http://localhost/search.php");
    exit; //stop here so the rest of the page doesn't run after the redirect
}
else
{
    $query = $_REQUEST['query'];
}
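As an illustration of the sanitising advice above, here is a minimal sketch that backslash-escapes the characters with a special meaning in Solr’s query syntax. The function escapeSolrTerm() is a hypothetical helper of our own, and the character list is our assumption of what needs escaping, so check it against the query parser documentation for your Solr version.

```php
<?php
// Hypothetical helper: backslash-escapes the characters that have a special
// meaning in Solr's query syntax, so user input is treated as plain text.
function escapeSolrTerm($value)
{
    return addcslashes($value, '+-&|!(){}[]^"~*?:\\/');
}

echo escapeSolrTerm('Vic 20: (retro)'), "\n";
```

Note that escaping the whole input also disables fielded queries such as ‘product_name:Vic’ and the ‘*:*’ wildcard, because the colon and asterisk are escaped too. In practice you would apply it to the search term only and add the field name yourself.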

Query Solr

We have a query, so we are going to connect to Solr using the same mechanism as we used when populating the index. Then we’ll use the search method of the Solr PHP library, which accepts a query string, an offset and a limit as parameters – the offset and limit work exactly as they do with MySQL queries, determining the starting record and the number of results respectively. So, in results.php, enter the following code under the comment that starts ‘//2. if we have a query term…’

//our required includes
require_once('constants.php');
require_once('SolrPHPClient/Apache/Solr/Service.php');
//instantiate a Solr object
$solr = new Apache_Solr_Service(SOLRHOST, SOLRPORT, SOLRNAME);
//run the query
$results = $solr->search($query, 0, 10);

Checking results

Now you have a results object, which is stored in $results. First, check that the query ran successfully by testing that $results is not empty – $solr->search() should return false if it failed. If all is well, then get the number of results, which is stored in $results->response->numFound, and display it appropriately. So, in results.php, this check goes under the comment that starts ‘//3. check the results…’
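The check described above could be sketched along the following lines. This is not the original listing: describeResults() is a hypothetical helper and the messages are placeholders. The sample object is built from JSON purely as a stand-in for a Solr response; with the real response object, the same numFound property is assumed to be available as described above.

```php
<?php
// Hypothetical helper: turns a results object into a short status message.
// It only relies on the response->numFound property described above.
function describeResults($results)
{
    if (empty($results) || !isset($results->response->numFound)) {
        return 'Sorry, the search could not be run.';
    }
    $found = (int) $results->response->numFound;
    if ($found === 0) {
        return 'No results found.';
    }
    return $found . ' result(s) found.';
}

// Stand-in for a Solr response, built from JSON purely for illustration
$sample = json_decode('{"response": {"numFound": 5, "docs": []}}');
echo describeResults($sample), "\n";
```

In results.php you would echo the message and only go on to build the results table when numFound is greater than zero.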

Display results

By now you’ll know whether the query ran and if it has produced any results, so the next stage is to iterate through them and display them to the user. We are passing the results through htmlspecialchars() to make sure any special characters in the Solr output are converted to appropriate HTML entities. So, in results.php enter the following under the comment that starts ‘//4. if there are results…’

echo '<table>';
echo '<tr><th>ID</th>' .
     '<th>Name</th>' .
     '<th>Description</th>' .
     '<th>Price</th></tr>';
//one table row per document returned by Solr
foreach($results->response->docs as $doc)
{
    echo '<tr><td>' . htmlspecialchars($doc->id) . '</td>' .
         '<td>' . htmlspecialchars($doc->product_name) . '</td>' .
         '<td>' . htmlspecialchars($doc->product_description) . '</td>' .
         '<td>' . htmlspecialchars($doc->product_price) . '</td></tr>';
}
echo '</table>';

Error handling

Finally, you can add a little error handling around the Solr statement (this can also be applied to the export.php file). As we’re connecting to an external service (our Solr server), there is an obvious risk that the service may be unavailable, causing a fatal error. So here we can wrap the connection in a try/catch block, to handle the error and display an appropriate message.

try
{
    //instantiate a Solr object
    $solr = new Apache_Solr_Service(SOLRHOST, SOLRPORT, SOLRNAME);
    //run the query
    $results = $solr->search($query, 0, 10);
}
catch(Exception $e)
{
    //you would probably want to log this error and display an appropriate
    //(user friendly) message on a production site
    echo($e->__toString());
}

Running a search

You can now browse to localhost/search.php and try a search like ‘product_name:Vic’ or ‘*:*’. The matching products will be listed in a results table, much like the search results on any other site.

    • HJ

      The article is great, but full of mistakes! It took me hours to find them. Check the constants and see how they were used in export.php
      (for example, DBPASS was defined but DB-PASSWORD was used).

    • Utkarsh Agrawal

      It’s been almost a day, but I could not fix the error below. I get it when I finish the steps of “Indexing data”:

      Select returned 5 rows.

      Fatal error: Uncaught exception
      ‘Apache_Solr_HttpTransportException’ with message ”500′ Status:
      Internal Server Error’ in
      /opt/lampp/htdocs/xx/SolrPHPCient/Apache/Solr/Service.php:364
      Stack trace:
      #0 /opt/lampp/htdocs/xx/SolrPHPCient/Apache/Solr/Service.php(669):
      Apache_Solr_Service->_sendRawPost(‘http://localhos…’, ‘add(‘addDocuments(Array)
      #3 {main}
      thrown in /opt/lampp/htdocs/xx/SolrPHPCient/Apache/Solr/Service.php on line 364

    • mb

      I am stuck on the indexing data part.

      Every time I execute the code I get an error “Call to a member function fetch_object() on a non-object in ” for the code line below:

      while ($result = $results->fetch_object())

      Please help, as I have been stuck on this for a long time.
      thanks in advance