Categories
Uncategorized

Solr Resources

Where do you look for information on the Apache Solr project? There is good information, but it is a bit scattered.

The colorful front page introduces all the features.

For reference info, first look at the Reference Documents. This link is release specific; you may want a different version.

There is also the Reference Guide (a large pdf). It is release specific; you may want a different version.

The latest pre-release version of the above reference is in the Confluence Wiki.

See also the Solr Community Wiki.

The users mailing list archive is here, and so is the developer mailing list archive.

More approachable info can often be found on personal blogs:

The issue trackers are packed with good info:

StackOverflow has good questions, answers, and discussion.

Google Books can be useful. To find out about “solr shard” you can try this query, then select a book. For “solr nested”, try this query.

Solr Specific Search
I added a custom search widget to http://leirtech.com (in the footer, so scroll way down, on any tab). It is a site-specific search, and its results come only from the sites listed above. For example, type ‘nested’ and press enter: the best results are from the likes of Yonik, Lucidworks, and the Cwiki. Please excuse Google’s promoted links at the top; Google feels a need to make some money.


Solr heap size

Solr’s default heap size needs to be increased.

The following info is from Shawn Heisey’s post to the Solr mailing list. I copied it here as a note to myself, and in the hopes of helping Solr newcomers.

There are exactly two solutions to the OutOfMemoryError regarding heap
space:

1) Increase the heap size.
2) Decrease the memory requirements.

The default heap size that Solr 5.0 and later starts with is 512MB.
This is a very small heap size. We are aware that the default is very
small — this is intentional, so that the default install is runnable on
virtually any hardware.

Almost all production Solr installs will require increasing the heap
size. If you get a little bit of data in a Solr install and then make
complex query requests, it can easily require more than 512MB of total heap.
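
One way to raise the heap is the `-m` option of the `bin/solr` start script, which sets both the minimum and maximum JVM heap; the `SOLR_HEAP` setting in `solr.in.sh` does the same job permanently. Here is a minimal sketch that only builds the start command. The install path `/opt/solr` and the `2g` value are my assumptions for illustration, not recommendations from the post above.

```python
def solr_start_command(solr_home="/opt/solr", heap="2g"):
    """Build the argv for starting Solr with a larger heap.

    solr_home and the 2g default are assumptions for illustration.
    The -m option sets both -Xms and -Xmx for the Solr JVM.
    """
    return [f"{solr_home}/bin/solr", "start", "-m", heap]


cmd = solr_start_command(heap="2g")
print(" ".join(cmd))
# To actually start Solr, pass cmd to subprocess.run(cmd, check=True).
```

The right heap value depends on your index and queries, so treat `2g` as a starting point to measure against, not an answer.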


Solr Search

Look, the Apache Solr search server is installed on the Blinkmonitor.com site now!

You will be thinking “big whup” perhaps, because WordPress (WP) already has Search built into it.

But WP’s search speed is limited by MySQL’s text search speed. That is fine for a few thousand posts, but when there are millions you will find yourself waiting for search results. Solr keeps its own inverted index, which maps every word in the posts or pages to the documents that contain it.
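
The word-to-pages lookup that Solr builds is commonly called an inverted index. The toy sketch below shows the idea only; it is not how Lucene actually stores its index, and the sample posts are made up.

```python
from collections import defaultdict
import re


def build_inverted_index(docs):
    """Map each lowercased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, word):
    """Return the sorted ids of documents that contain the word."""
    return sorted(index.get(word.lower(), set()))


# Made-up sample posts, keyed by post id.
posts = {
    1: "Charlie wrote about Solr",
    2: "WordPress search uses MySQL",
    3: "Solr search is fast",
}
index = build_inverted_index(posts)
print(search(index, "solr"))    # [1, 3]
print(search(index, "search"))  # [2, 3]
```

A lookup is a single dictionary access no matter how many posts there are, which is why this structure stays quick as a site grows.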

Better still, Solr has faceted search (not so for WP). Looking at the search results, you can select a category like ‘books’, or confine your search to a tag such as ‘Heritage’. You can order the results by relevance, or list the newest ones first.

And highlighting is a big benefit. Looking at the search results, you will see snippets of the pages you were searching for, with the searched text highlighted.
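
Faceting, filtering, sorting and highlighting are all switched on through request parameters. The sketch below builds such a query string using Solr's standard `facet`, `facet.field`, `fq`, `sort`, `hl` and `hl.fl` parameters. The field names `categories`, `tags`, `content` and `date` are assumptions; the real names depend on the schema your plugin sets up.

```python
from urllib.parse import urlencode, parse_qs


def faceted_query_params(text, category=None, newest_first=False):
    """Build a Solr query string with facets and highlighting enabled.

    Field names (categories, tags, content, date) are hypothetical.
    """
    params = [
        ("q", text),
        ("facet", "true"),
        ("facet.field", "categories"),  # count matches per category
        ("facet.field", "tags"),        # count matches per tag
        ("hl", "true"),                 # ask for highlighted snippets
        ("hl.fl", "content"),           # field to take snippets from
    ]
    if category:
        # A filter query narrows results without changing relevance scores.
        params.append(("fq", f'categories:"{category}"'))
    if newest_first:
        # Without a sort parameter, Solr orders by relevance score.
        params.append(("sort", "date desc"))
    return urlencode(params)


query_string = faceted_query_params("heritage", category="books", newest_first=True)
print(parse_qs(query_string))
```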

How does this all work? It might seem unlikely that a complex PHP project like WP could be integrated with a complex Java project like Solr. This is all ‘easy’ because Solr has a RESTful interface: it responds to HTTP requests (such as GET and POST). When you type, say, “charlie” into the search box, WP does a GET to Solr. Solr accepts the search argument “charlie”, checks its index to find out which pages contain “charlie”, and returns a list of pages in the GET response. WP displays the list as links you can click on to see the pages.
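
The round trip can be sketched in a few lines of Python. This is not what the WP plugin actually runs (that is PHP); it only illustrates the GET to Solr's `/select` handler and the shape of the JSON that comes back. The host, the core name `wordpress` and the `title` field are assumptions.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed host and core name; adjust to your own install.
SOLR_BASE = "http://localhost:8983/solr/wordpress"


def select_url(text, rows=10):
    """Build the GET URL for a simple search."""
    return f"{SOLR_BASE}/select?" + urlencode({"q": text, "rows": rows, "wt": "json"})


def titles_from_response(body):
    """Pull the (hypothetical) title field out of a Solr JSON response."""
    data = json.loads(body)
    return [doc.get("title") for doc in data["response"]["docs"]]


def search(text):
    """Send the GET and return the matching titles (needs a running Solr)."""
    with urlopen(select_url(text)) as resp:
        return titles_from_response(resp.read().decode("utf-8"))


print(select_url("charlie"))
# http://localhost:8983/solr/wordpress/select?q=charlie&rows=10&wt=json
```

Calling `search("charlie")` against a live Solr would return the titles to show as links.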

When an author writes a WP page, WP sends it to Solr to be indexed. And when a page is updated, it gets sent to Solr again.
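
Indexing is also just an HTTP request: a POST of the document as JSON to Solr's `/update` handler. The sketch below builds that request without sending it. The core name `wordpress` and the `title` and `content` fields are assumptions, and the plugin's real document will carry more fields.

```python
import json
from urllib.request import Request, urlopen

# Assumed host and core name; commit=true makes the change searchable right away.
SOLR_UPDATE = "http://localhost:8983/solr/wordpress/update?commit=true"


def index_request(post_id, title, content):
    """Build the POST that adds or replaces one document in Solr.

    The title and content field names are hypothetical.
    """
    doc = {"id": str(post_id), "title": title, "content": content}
    return Request(
        SOLR_UPDATE,
        data=json.dumps([doc]).encode("utf-8"),  # /update accepts a JSON list of docs
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = index_request(42, "Hello", "First post")
print(req.get_method(), req.full_url)
# To send it to a running Solr: urlopen(req)
```

Assuming `id` is the schema's unique key, posting a document with an existing id replaces the old copy, which is why an updated page can simply be sent again.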

WP needs a plugin for all of this to work. There are several WP Solr plugins, and I chose the great WPSOLR plugin. It is free, but there are paid options that you might want to consider (disclosure: I am just a user, and am not paid for this mention). Paid installation support is available, but this will not be necessary if you are familiar with Solr.

Solr is quick enough to provide ‘autocomplete’ suggestions in the search box. I have that configured using the older spellchecker method. There is a suggester module, new as of last year, but I have not yet persuaded it to build its index. Soon..