Apache Solr


Do you think it would be possible to integrate Apache Solr into PW? It seems like all major CMSs have plugins for this. In my office I am the only one using ProcessWire (the others use eZ Publish), and whenever I try to convince them to use PW they object that they don't want to without Solr integration. Personally, I am happy with the speed and results of the PW API.

I have no idea how complicated it would be to build a SolrModule, but still I want to ask. It would be a good marketing keyword for sure ;)

Isn't Solr Java-based, and doesn't it have something to do with Lucene? I don't think I've actually come across a server with it installed, though I could be wrong. It seems perhaps a bit specialized for our case, but I'm willing to look into it.

You are right, Solr is based on the Lucene Java search library. I haven't seen a server myself that comes with Solr by default, but it's easy to install if you can run Java on the server. And you don't have to know any Java to use Solr: all configuration can be done with XML, and queries are made through a REST-like API.
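To give a rough idea of what "REST-like" means here, this is a small Python sketch that only builds the URL for a Solr search request. The host, port, core name (`catalog`) and field names are made up for illustration, and the exact path depends on how your Solr is set up; `q`, `rows`, `wt`, `facet` and `facet.field` are standard Solr query parameters.

```python
from urllib.parse import urlencode


def build_solr_query_url(base_url, core, query, facet_fields=(), rows=10):
    """Build a URL for Solr's /select request handler.

    base_url and core are assumptions about the local installation;
    nothing is sent over the network here.
    """
    params = [("q", query), ("rows", rows), ("wt", "json")]
    if facet_fields:
        # Faceting is switched on once, then one facet.field per field.
        params.append(("facet", "true"))
        params.extend(("facet.field", field) for field in facet_fields)
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


# Hypothetical installation and schema, for illustration only.
url = build_solr_query_url(
    "http://localhost:8983/solr",
    "catalog",
    "title:pool",
    facet_fields=["brand", "color"],
)
print(url)
```

Fetching that URL with any HTTP client would return the matching documents plus the facet counts as JSON, which is why no Java is needed on the client side.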

Solr is really great at returning weighted search results. And things like facets are surely a must-have for big shops or any big catalog-like site. I think your skyscraper site would be a nice example. There are many more features; have a look here: http://lucene.apache.org/solr/features.html
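In case the term is unclear: a facet is basically a count of how many results share each value of a field, so the visitor can narrow the list down by clicking a value. Here is a tiny Python sketch of the idea. It is not how Solr does it internally (Solr computes these counts from its index), and the product data is invented for the example.

```python
from collections import Counter


def facet_counts(documents, field):
    """Count how many documents carry each value of `field`.

    Multi-valued fields (lists, tuples, sets) contribute one count per value.
    """
    counts = Counter()
    for doc in documents:
        value = doc.get(field)
        if value is None:
            continue
        values = value if isinstance(value, (list, tuple, set)) else [value]
        counts.update(values)
    return counts


# Invented sample data, standing in for a search result set.
products = [
    {"name": "Trail shoe", "brand": "Acme", "colors": ["red", "blue"]},
    {"name": "Road shoe", "brand": "Acme", "colors": ["blue"]},
    {"name": "Sandal", "brand": "Birch", "colors": []},
]

print(facet_counts(products, "brand"))
print(facet_counts(products, "colors"))
```

A shop would show these counts next to each filter link, e.g. the number of products per brand.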

In this video you get a quick idea of what facets are about: (video embed not preserved)

Thanks for looking into it. If you need a better explanation, tell me and I can ask my co-workers to post here.

I played with Solr a little a few months ago for a project at my university. I was mostly interested in the facets feature. I'm sure Solr is a powerful search engine, and in combination with Tika you can extract the content from Office or PDF documents, but I found it hard to find real-life examples to learn from. Maybe I didn't look in the right places. I stopped trying out Solr because the data center at the university gave no support for Java-based solutions. I think they host the projects in a shared hosting environment, and it seems that's not the best fit for Java. So for low-budget projects it may be difficult to find the right hosting partner.

But I think the integration with ProcessWire wouldn't be the hardest part. There is a Solr extension for PHP (http://php.net/manual/en/book.solr.php), and there are libraries like Solarium (http://www.solarium-project.org).

For a feature like facets I found JavaScript solutions: http://eikes.github.com/facetedsearch/ or http://documentcloud.github.com/visualsearch/

I'm a little confused about the faceted search part, because [if I understand the example and terminology correctly] this is quite simple to do in ProcessWire at present. Here's an example from villarental.com that I coded in ProcessWire a couple years ago. I got there by clicking the terms in the left bar:

Location: St. James

Price: Up to $500

Bedrooms: 3

Features: Villas with Pool

Of course, it doesn't let you select multiple items in each category, but that was by design (for ease of use, per client), not a technical limitation, as it would be equally simple to implement. But there's nothing going on here except translating page references in URL segments to selectors. So when you guys mention facets as being a driver for something like Solr, what am I missing?
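To make the "URL segments to selectors" idea concrete, here is a rough sketch of the translation, written in Python rather than the site's actual PHP. It is not the villarental.com code: the URL layout (`/field/value/` pairs), the field names and the operator table are all assumptions made for the example. The output follows ProcessWire's comma-separated `field=value` selector style.

```python
# Hypothetical whitelist: which URL fields are allowed, and which
# comparison operator each one maps to in the selector.
FIELD_OPERATORS = {
    "location": "=",
    "price": "<=",
    "bedrooms": "=",
    "features": "=",
}


def segments_to_selector(path):
    """Turn '/field/value/field/value/' into a selector string."""
    segments = [s for s in path.strip("/").split("/") if s]
    if len(segments) % 2:
        raise ValueError("expected field/value pairs in the URL")
    clauses = []
    for field, value in zip(segments[0::2], segments[1::2]):
        if field not in FIELD_OPERATORS:
            # Never pass unknown URL input straight into a selector.
            raise ValueError(f"unknown filter field: {field}")
        clauses.append(f"{field}{FIELD_OPERATORS[field]}{value}")
    return ", ".join(clauses)


print(segments_to_selector("/location/st-james/price/500/bedrooms/3/"))
```

In a real template the values would also need sanitizing before they go into the selector, and page reference fields would typically be resolved to the referenced pages first.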

I have no doubt that Solr can achieve these things at larger scale and more quickly than something like MySQL, as it's dedicated to the task. Though I'm thinking the audience that would benefit from this in ProcessWire may be limited. I don't think anyone has yet reached a scalability ceiling with ProcessWire where this would matter. Still, I'm interested in anything that enables us to scale further and faster, and I do like what Lucene does. But because something like this may have a more limited audience, it's probably something we'd need to find a sponsor for in order to pursue in the shorter term. But count me as interested.

for a feature like facets i found javascript solutions: http://eikes.github.com/facetedsearch/ or http://documentcloud...m/visualsearch/

VisualSearch.js would tie in beautifully with ProcessWire as the data source. It might be a good one to implement in the skyscrapers demo site.

I am sure that for small page counts ProcessWire can handle faceted searches quite well with a few lines of custom code. I think the real advantage of Solr is mainly on high-traffic sites with a lot of pages. But I think you are right: at the moment the audience for something like this is quite limited. And there are more important features on the roadmap (like page revisions and draft/live versions) for getting into the enterprise CMS area where features like this are needed.

If winning a customer ever depended on PW supporting Solr, I would be happy to sponsor this feature. Let's see what 2013 has to bring ;)

Thanks for looking into this.
