• Friday
  • April 2
  • 2010

Pika and the Google Search Appliance make nice

For those who have followed The Findability Project, I am pleased to report we have surmounted the basic technical problems of targeting our Pika CMS with the Google Search Appliance.

The back story is one I have purposefully repeated whenever giving a presentation about the project, namely, that our Pika Plan A did not work. We encountered code anomalies in Pika that, among other things, cause it to auto-generate new case intakes and case records when it is crawled by the GSA. As a result, we were unable to use the GSA to crawl the Pika client case content dynamically generated as web pages. Plan A would have been the easiest, no-brainer way to go but we were not able to do so. So Plan B was to have the GSA target the Pika MySQL database directly. Status report: Mission accomplished.

There are GSA capacity issues for us, since our particular GSA’s one million “record” capacity means one million web pages or database records, inclusive, and these database records are not the same thing as the count for client case records. At any given time, we may have some 130,000 to nearly 200,000 client cases in our Pika system (and even more in archival data storage), but from a database perspective, these add up to multi-millions of “records,” e.g., various types of time records, case notes, contacts, and so on. Part of the challenge for us was to sort out which pieces of those millions of database records were the ones most needed and useful to our users.

The solution? Using a well-tailored query, we have the GSA do a selective crawl of the Pika MySQL database to return the most commonly sought and used Pika content: Case numbers, client names, office designations and case notes… tons of case notes. The basic technical explanation is the GSA performs a database query, returns it as an XML feed, indexes that feed, against which the user’s search terms are queried and ultimately returned as viewable HTML

What does the the search result look like? A Google search result. The clickable link displays the case number, client name, LSNC office and primary advocate name, e.g., “90-10-123456 ~ John Client ~ Sacramento ~ Jane Advocate.” Below that it displays in-context text with the search terms highlighted in bold, essentially like a regular Google search result. Clicking the link dynamically displays the actual Pika case note shown in context. Assuming there are multiple possible matches for a particular Pika case record, there is a link to display all the “omitted results,” akin to how regular Google searches work, so the users can see all possible, not just probable matches. Clicking through the GSA search result link also gives the user direct clickable access to the particular client case record since clicking through takes the user to the actual Pika client case record.

That’s the name of that tune.

Other posts of possible interest...

Comments are closed.