LiveSearch Bootstrap Example Page
The LiveSearch files only need to be included on the search results page.

Features

  • Easy to include and set up
  • Should work with any small or medium-sized website
  • No database needed
  • Pagination function
  • Caching of search results and content
  • Include unlinked files (new since 1.2)
  • Include external hosts (domains) (new since 3.3)
  • Define the logical combination of single words (AND, OR, as is) (new since 3.4)
  • Exclude paths and files from being searched for links (new since 1.2)
  • Exclude blocks of your site from indexing by using comment tags (new since 1.3)
  • Exclude <script> blocks (new since 2.0)
  • Performance switch, if necessary (new since 2.0)
  • Search for META description (new since 2.0)
  • Search for META keywords (new since 2.0)
  • Search for images (new since 2.0)
  • Possibility to hide images from being indexed (new since 2.0)
  • Thumbnail generation on the fly (new since 2.0)
  • Select the method of website grabbing (curl or url_fopen) (new since 3.0)
  • Four different automatic pagination styles
  • Possibility to search within PDF files (new since 3.0)
  • ...

Requirements

  • PHP 5.x
  • allow_url_fopen activated or cURL enabled; the web server must be allowed to access your website
  • Optional: GD library for thumbnail generation
  • Optional: for PDF indexing and searching, an installed and available pdftotext binary; the site has to be hosted on a Linux server

New since V3.4 2014-05-22

  • Improved: avoid following tel:, javascript: and anchor links to prevent 404 errors
  • Added: Search logic for words - AND, OR or false - can be defined in the config
  • Fixed: sprintf troubles for image results headline (more than one image found)
  • Fixed: array_pop needs a variable instead of a returned array; added a temporary array variable to satisfy Strict Standards

New since V3.3 2014-03-01

  • Added fourth pagination style (real Bootstrap 3.1.1)
  • Possibility to activate UTF8 decoding during search if needed
  • Added $cachetime 0 to disable auto-caching
  • IMG retrieving improvement (it is no longer necessary to use exactly <img src=...>; variations such as <img class='...' src=...> work now too)
  • Possibility to include external hosts-domains to the indexing process when linking to external single pages
  • Possibility to include external hosts-domains to the indexing process when embedding external images

New since V3.2 2013-07-09

  • Added third pagination style (real Bootstrap)
  • Possibility to define highlighting CSS class within the livesearch PHP class
  • Added a second highlight class to the css file
  • Possibility to define number of items in the search cloud
  • Possibility to define error messages for too short strings and no results
  • Pagination fix (each image was counted as a single search result, instead of all images combined into one)
  • URL retrieving improvement (it is no longer necessary to use exactly <a href=...>; variations such as <a class='...' href=...> work now too)
  • PDF files with spaces in their names are cached now too, but please don't use spaces or special characters in your web projects
  • Different error outputs for no results and too short query strings
  • Some smaller fixes and improvements

New since V3.01 2012-07-10

Just a small bugfix (allow_url_fopen and curl pre-checks)

New since V3.0 2012-06-29

  • Added possibility to search content within linked PDF files (Linux and pdftotext required)
  • Added possibility to use 2 different pagination styles (inspired by Dave Reejik)
  • Possibility to use Curl OR url_fopen (auto mode included)
  • One major bugfix in the image search
  • Some smaller fixes and improvements

New since V2.0 2011-08-30

  • <script> blocks are excluded now
  • Added possibility to search for META-description and META-keywords
  • Added possibility to search for images by filename, alt attribute or title attribute
    Image search can be turned on/off
  • Thumbnails of found images will be displayed and can be generated on the fly with the help of the GD library
    can be turned on (GD thumbs)/off (CSS thumbs)
  • Images can be hidden by using the LSHIDE image class
  • Added performance fix for larger websites running into a timeout or memory exhaustion

New since V1.3 2010-06-07

  • Added some setup error handlers
  • Possibility to exclude complete blocks from indexing, e.g. your menus, ...
  • Added fix for parse_url() on PHP versions prior to 5.1.2

New since V1.2 2010-02-14

  • Fixed double quotes and escaped characters in the collected search words (Cloud)
  • Possibility to exclude files and directories under the basedir
  • Possibility to include unlinked files and directories under the basedir
  • Some major bugfixes to prevent endless loops and to make caching faster

New since V1.1 2010-02-13

  • Some fixes
  • Possibility to add GET variables for pagination ($add2query)
  • Possibility to add GET variables for drawing search results ($add2query)
  • Possibility to add GET variables for the SearchCloud ($add2query)

Installation/Usage of LiveSearch

Searchform

Submit any form that contains your search field and set the form action to your search results page, e.g. search.php

Setup for your website

Upload the ls folder to your web project. The ls folder contains livesearch.class.php, livesearch.css, icon-pdf.png and the cache directory.

Include file and initialize Class

Include livesearch.class.php and initialize the class on every page where you want to use the LiveSearch functions, e.g.:
<?php
include("ls/livesearch.class.php");
$LiveSearch = new LiveSearch();
?>

Settings in the livesearch.class.php with sample settings and values

The base link from which the grabbing should start
var $baseurl = "http://www.homac.at/";

The absolute directory path on your web server for your $baseurl - ONLY needed for the GD thumbnail generation
var $basepath = "/users/mac/www/www.homac.at/htdocs/";

The URL to your ls directory - ONLY needed to display the GD generated thumbnails
var $lsurl = "http://www.homac.at/ls/";

The name of your search results page, needed to prevent endless loops and for the pagination (within the $basedir path)
var $searchresultspage = "search.php";

Exclude paths or individual files under the $basedir from being checked for links; works with PDF files too (since V3.0)
var $excl = array("dont_index",
"hideme.html",
"private/",
"docs/invoice.pdf",
);

Include individual files under the $basedir which aren't linked anywhere on your site.
For example, hallo.txt isn't linked anywhere on the demo page, but its content can be found; works with PDF files too (since V3.0)
var $incl = array("hallo.txt",
"data.pdf",
);

new since V3.3 - List of external hosts/domains
array() or array("List","of","domains")
Domains/hosts of externally linked pages or embedded external images
Just the host - no URLs, no protocols
Example
var $additionalHosts = array("www.anywhereelse.com","flickr.com");

new since V3.0 - Method of site grabbing
auto, curl or url_fopen - with auto, curl is tried first and url_fopen is used as the fallback
var $method = "auto";

Extensions for grabbing links
var $checkext = array("htm","html","php","txt");

new since V3.0 - If you like to search within PDF files, set this variable to true (mind the requirements)
var $collect_pdfs = true;

new since V3.0 - The extensions of PDF files (usually it's just pdf :) )
var $pdfext = array("pdf");

Hours between caching processes (-1 caches on every search process, which could be okay for smaller sites with dynamic content)
var $cachetime = 12; //-1 for caching every access

Results per page; if more results are found you can use the pagination function
var $srch_res_per_page = 15;

new since V3.4 - Logical combination when searching for multiple words at once
OR ... splits into words and shows results containing ANY of the words
AND ... splits into words and shows results containing ALL of the words
false ... doesn't split and shows results for the WHOLE string as it is
var $srch_logic = "OR";

new since V3.0 (bootstrap since 3.2, bootstrap3 since 3.3) - The style of the pagination
Currently there are four styles available (default, boxed, bootstrap, bootstrap3) - only used in the draw... methods
bootstrap uses Bootstrap 2.3.2 CSS styles
bootstrap3 uses Bootstrap 3.1.1 CSS styles
var $pagerstyle = "boxed";

Min and max font size for the SearchCloud (px)
var $cloud_min = 10;
var $cloud_max = 45;

new since V3.2 - Maximum number of items in the Search Cloud
var $maxCloudItems = 50;

new since V3.2 - Error messages for a too short query string or no results (only if you use the drawSearchResults method)
var $errorToShort = '<div class="alert alert-error">You have to enter at least %1$s characters.</div>';
var $errorNothingFound = '<div class="alert alert-info">No search results for %1$s.</div>';

If you're running into performance troubles on larger websites (timeout during caching, memory exhaustion ...) you should set this value to true, otherwise leave it false
var $performance_fix = false;

If you like to search for images too (filename, alt attribute, title attribute), set this variable to true
var $collect_images = true;

The headline for your image search results (more than one image)
var $img_results_headline = '%1$s Images for %2$s'; //Number of images, Searchstring

new since V3.0 - The headline for a single image search result
var $img_result_headline = '1 Image for %1$s'; //Searchstring

The height of GD generated images - !!! livesearch.css contains height definitions too, for the CSS thumbs !!!
var $thumb_height = 70;

If true, the thumbnails will be generated automatically with the help of the GD library, otherwise the images will be sized by CSS
var $create_thumbs = true;

UTF-8 decoding for searching, if needed (true/false) - enabled by default
var $utf8DecodeResults = true;

Cache directory - has to be writable. In the example below the path to the cache directory is calculated automatically, relative to livesearch.class.php
$this->cachedir = realpath(dirname(__FILE__)) . "/cache";
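
The three $srch_logic values can be easier to picture with a small stand-alone sketch. The function below is hypothetical and is not LiveSearch's own matching code; it only mirrors the documented meaning of "OR", "AND" and false on a single piece of text. Splitting on whitespace and case-insensitive comparison are assumptions.

```php
<?php
// Hypothetical helper, not part of LiveSearch: mirrors the documented
// meaning of $srch_logic ("OR", "AND" or false) for one piece of text.
function matchesQuery($text, $query, $logic)
{
    $query = trim($query);
    if ($query === '') {
        return false;
    }
    if ($logic === false) {
        // No splitting: the whole string has to occur as it is.
        return stripos($text, $query) !== false;
    }
    // Assumption: words are separated by whitespace.
    $words = preg_split('/\s+/', $query, -1, PREG_SPLIT_NO_EMPTY);
    $hits = 0;
    foreach ($words as $word) {
        if (stripos($text, $word) !== false) {
            $hits++;
        }
    }
    if ($logic === 'AND') {
        return $hits === count($words); // ALL words must be present
    }
    return $hits > 0; // "OR": ANY word is enough
}
```

For the text "Easter image gallery", the query "image firefox" matches with "OR" but not with "AND", and "image easter" does not match with false because the words appear in a different order.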

Live Search Functions

Cache Files

Caches the files without searching. This can take a while. It is called automatically during the search process if no files are cached or if the cached files are older than the defined $cachetime.
$LiveSearch->cacheFiles();
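
As a rough illustration of the $cachetime rule, the following sketch decides whether an existing cache would be refreshed. It is a hypothetical helper, not code from livesearch.class.php. Treating 0 as "never refresh automatically" is an assumption based on the V3.3 changelog entry; the case where no cache exists at all (which always triggers caching) is left out.

```php
<?php
// Hypothetical helper, not part of LiveSearch: decides whether an existing
// cache would be refreshed for a given $cachetime (in hours).
function cacheNeedsRefresh($cacheMtime, $cachetimeHours, $now)
{
    if ($cachetimeHours == -1) {
        return true;  // -1: cache on every search process
    }
    if ($cachetimeHours == 0) {
        return false; // 0: auto-caching disabled (assumed: never refresh)
    }
    // Otherwise refresh once the cache is older than $cachetime hours.
    return ($now - $cacheMtime) > $cachetimeHours * 3600;
}
```

A call could look like cacheNeedsRefresh(filemtime($file), 12, time()), where $file is a placeholder for any cached file.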

Search

Necessary to initiate the search. If no files are cached or if the cached files are older than the defined $cachetime, the search function calls the cacheFiles function too.
$LiveSearch->search($_REQUEST["q"],$_REQUEST["p"]);
or, if you like to design the results yourself (you will get an array)
$searchresults = $LiveSearch->search($_REQUEST["q"],$_REQUEST["p"]);

$_REQUEST["q"]
The value of the search, the searchstring
$_REQUEST["p"]
The current page, needed for the pagination
You will receive an array with the following keys if you assign the return value of the function to a variable (second call above)
url
The url
title
The page title
content
The snippet in which the search string is embedded
isPDF
0/1 - indicates if the result is a PDF (1) or not (0) (new since V3.0)
If image search is enabled, the first element of the result array also contains an array with the images; its url is set to "#", its title to the $img_results_headline, and its content contains the img tags of the found images
For example:
Array
(
  [0] => Array
    (
      [0] => Array
        (
            [src] => http://envato.homac.at/dev/LiveSearch-2.0/example01/images/gravatar.jpg
            [title] => 
            [alt] => image
            [parenturl] => http://envato.homac.at/dev/LiveSearch-2.0/example01/index.php
            [GDThumb] => d9b023be3750db3cfbdcc72f0e71cc65.jpg
        )

      [title] => 1 Images for avatar
      [url] => #
      [content] => <a href="http://envato.homac.at/dev/LiveSearch-2.0/example01/images/gravatar.jpg"><img src="http://envato.homac.at/dev/LiveSearch-2.0/example01/ls/cache/thumbs/d9b023be3750db3cfbdcc72f0e71cc65.jpg" alt="image" title=""></a>
  
    )
  
  [1] => Array
    (
      [url] => http://envato.homac.at/dev/LiveSearch-2.0/example01/help.php
      [title] => LiveSearch - How it works
      [content] => ... Links (i.e. &amp;action=search) - don't forget the leading &amp; Current cloud for this website: <strong class="highlight">avatar</strong> search easter image super firefox keywords duper <strong class="highlight">avatar</strong>search wise Excluding blocks from being indexed since V 1.3 you're able to exclude/hide blocks...
    )

)
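
If you design the results yourself, a loop over this array is all that is needed. The renderer below is a minimal sketch under stated assumptions: renderResults is a hypothetical function name, the CSS class names are made up, and it assumes that title and url are plain text while content already contains markup (highlight and img tags) and is therefore printed as is.

```php
<?php
// Hypothetical renderer, not part of LiveSearch. It expects the array layout
// documented above: url, title, content and (since V3.0) isPDF.
function renderResults($results)
{
    $html = '';
    foreach ($results as $row) {
        $title = htmlspecialchars($row['title'], ENT_QUOTES, 'UTF-8');
        $isPdf = !empty($row['isPDF']); // key is missing for the image block
        $html .= '<div class="result' . ($isPdf ? ' pdf' : '') . '">';
        if ($row['url'] === '#') {
            // Image block: there is no page to link to, only a headline.
            $html .= '<h3>' . $title . '</h3>';
        } else {
            $url = htmlspecialchars($row['url'], ENT_QUOTES, 'UTF-8');
            $html .= '<h3><a href="' . $url . '">' . $title . '</a></h3>';
        }
        // content already carries markup, so it is not escaped again.
        $html .= '<p>' . $row['content'] . '</p></div>';
    }
    return $html;
}
```

On the results page this could be used as echo renderResults($searchresults); after the search() call.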
Available variables

After a successful search you have access to some variables

$LiveSearch->searchcount
Total number of searchresults
$LiveSearch->p
Current Page
$LiveSearch->pages
Total pages
...
...
Clear Cache

Function to delete all cached files. Note: cached files are deleted automatically if they are too old ($cachetime exceeded) or, with $cachetime -1, on every caching process.
$LiveSearch->clearCache();

Clear Search Results

Function to remove stored search results
$LiveSearch->clearSrch();

Clear Stored Search Counts

Function to remove stored search strings (used by the Search Word Cloud)
$LiveSearch->clearSrchStr();

Pager

Returns paging information after a successful search with more results than the defined $srch_res_per_page. With this information you can build your own pagination.
$LiveSearch->pager();
Returns an array with the following keys:

current
Current page
total
Total pages
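
A custom pagination built from these two keys could look like the following sketch. buildPageLinks is a hypothetical helper and not a LiveSearch method; it assumes that pages are numbered from 1 to total and reuses the p, q and Add2Query conventions of the draw... methods.

```php
<?php
// Hypothetical helper, not part of LiveSearch: builds plain page links from
// the array returned by pager() (keys "current" and "total").
function buildPageLinks($pager, $query, $pageVar = 'p', $queryVar = 'q', $add2query = '')
{
    $links = array();
    // Assumption: pages are numbered from 1 to total.
    for ($page = 1; $page <= $pager['total']; $page++) {
        if ($page == $pager['current']) {
            $links[] = '<strong>' . $page . '</strong>'; // current page, no link
            continue;
        }
        $href = '?' . $queryVar . '=' . urlencode($query)
              . '&' . $pageVar . '=' . $page . $add2query;
        $links[] = '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '">'
                 . $page . '</a>';
    }
    return implode(' ', $links);
}
```

Usage could be echo buildPageLinks($LiveSearch->pager(), $_REQUEST["q"]); or, with extra GET variables, buildPageLinks($LiveSearch->pager(), $_REQUEST["q"], "p", "q", "&action=search").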
Pagination Example

Returns an example output for the pagination if there are more results than the defined $srch_res_per_page. It is also called by the $LiveSearch->drawSearchresults() method.
$LiveSearch->drawPagination();
or
$LiveSearch->drawPagination("p","q");
or
$LiveSearch->drawPagination("p","q","&action=search");
or, new since V3.0
$LiveSearch->drawPagination("p","q","&action=search","boxed");
Syntax
$LiveSearch->drawPagination([PageVarName], [SearchStringVarName], [Add2Query], [PagerStyle])

The Parameters
[PageVarName]
The name of the page variable (default p)
[SearchStringVarName]
The name of the search string (default q)
[Add2Query]
If needed you can add some variables to the pagination links (e.g. &action=search) - don't forget the leading &, default: false
[PagerStyle]
If you like to change the display style of the pagination, use default or boxed (since V3.0); the $pagerstyle setting also lists bootstrap (since V3.2) and bootstrap3 (since V3.3) - default: default
Searchresults Example

An example output for the search results, including the pagination from above
$LiveSearch->drawSearchresults();
or
$LiveSearch->drawSearchresults("p","q");
or
$LiveSearch->drawSearchresults("p","q","&action=search");
Syntax
$LiveSearch->drawSearchresults([PageVarName], [SearchStringVarName], [Add2Query])

The Parameters
[PageVarName]
The name of the page variable (default p)
[SearchStringVarName]
The name of the search string (default q)
[Add2Query]
If needed you can add some variables to the pagination links (e.g. &action=search) - don't forget the leading &
Show collected URLs

Shows you the collected and cached URLs
$LiveSearch->showUrls();
Output for this website:

Array
(
    [0] => http://www.itfx-dev2.co.uk/search-facility-test/index.php
    [1] => http://www.itfx-dev2.co.uk/search-facility-test/howitworks.php
    [2] => http://www.itfx-dev2.co.uk/search-facility-test/contact.php
    [3] => http://www.itfx-dev2.co.uk/search-facility-test/
    [4] => http://www.itfx-dev2.co.uk/search-facility-test/gallery.php
    [5] => http://www.itfx-dev2.co.uk/search-facility-test/downloads.php
    [6] => http://www.itfx-dev2.co.uk/search-facility-test/sample1.php
    [7] => http://www.itfx-dev2.co.uk/search-facility-test/sample2.php
    [8] => http://www.itfx-dev2.co.uk/search-facility-test/sample3.php
    [9] => http://www.itfx-dev2.co.uk/search-facility-test/hallo3.txt
    [10] => http://www.itfx-dev2.co.uk/search-facility-test/envato2000.pdf
    [11] => http://www.itfx-dev2.co.uk/search-facility-test/envato.pdf
    [12] => http://www.itfx-dev2.co.uk/search-facility-test/Issue 16 Voice 2013.pdf
    [13] => http://www.itfx-dev2.co.uk/search-facility-test/26. May 27th 2014. FOIL UPDATE FOR MEMBERS. Update on Mitchell in May.pdf
    [14] => http://www.itfx-dev2.co.uk/search-facility-test/hallo.txt
)
Search Cloud

Shows you the Search Word Cloud
$LiveSearch->printSrchCloud();
or
$LiveSearch->printSrchCloud("q");
or, new since V1.1
$LiveSearch->printSrchCloud("q","&action=search");
Syntax
$LiveSearch->printSrchCloud([SearchStringVarName], [Add2Query])

The Parameters
[SearchStringVarName]
The name of the search string (default q)
[Add2Query]
If needed you can add some variables to the Search Cloud links (e.g. &action=search) - don't forget the leading &
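
How a word's font size ends up between $cloud_min and $cloud_max is not documented here. One common approach is linear scaling by search count, sketched below. cloudFontSize is a hypothetical helper and the formula is an assumption, not necessarily what printSrchCloud() does internally.

```php
<?php
// Hypothetical helper, not part of LiveSearch: linear scaling of a search
// word's count into a font size between $cloudMin and $cloudMax (px).
function cloudFontSize($count, $minCount, $maxCount, $cloudMin = 10, $cloudMax = 45)
{
    if ($maxCount <= $minCount) {
        return $cloudMin; // all words were searched equally often
    }
    $ratio = ($count - $minCount) / ($maxCount - $minCount);
    return (int) round($cloudMin + $ratio * ($cloudMax - $cloudMin));
}
```

With the sample settings (10 and 45), the least searched word gets 10px and the most searched word gets 45px.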
Excluding blocks from being indexed

Since V1.3 you can exclude blocks of your website from LiveSearch by wrapping them in simple comment tags. This makes sense for menus that appear on every page.
Start hiding: <!--LSHIDE-->
Stop hiding: <!--/LSHIDE-->
Examples
On this example page the "Ciao Codecanyon" part on the index page can't be found (and neither can "ciao" :) ).
Other examples
#1
Some words, can be found but <!--LSHIDE-->this combination can't be<!--/LSHIDE--> found
#2
blabla
<!--LSHIDE-->
Mainmenue #1
Mainmenue #2
Mainmenue #3
<!--/LSHIDE-->
some text ...
<!--LSHIDE-->
Submenue #1
Submenue #2
<!--/LSHIDE-->
...
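
To picture the effect of the comment tags, the sketch below removes the hidden blocks from a piece of HTML before it would be indexed. stripHiddenBlocks is a hypothetical helper; the regular expression is an assumption and not taken from livesearch.class.php.

```php
<?php
// Hypothetical helper, not LiveSearch's own code: removes everything between
// <!--LSHIDE--> and <!--/LSHIDE--> before a page is indexed.
function stripHiddenBlocks($html)
{
    // Non-greedy match keeps several hidden blocks on one page separate;
    // the "s" modifier lets "." span line breaks.
    return preg_replace('/<!--LSHIDE-->.*?<!--\/LSHIDE-->/s', '', $html);
}
```

Applied to example #1, only "Some words, can be found but" and "found" remain searchable.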

Excluding images being indexed (such as icons ...)

In addition to the LSHIDE blocks, since V2.0 you can hide images from being indexed by giving them a class called LSHIDE.
Some sample usages:
<img src='images/icons/contact.gif' alt='contact' class='icon LSHIDE' /> <!--won't be indexed-->
<img src='images/space.png' alt='' class='lshide' /> <!--won't be indexed-->
<img src="images/portfolio/homac.jpg" alt="homac" title="Homac e.U." class="float-left p5" /> <!--will be indexed-->
<img src="images/portfolio/envazo.jpg" alt="" /> <!--will be indexed-->
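
The rule behind these samples can be sketched as a small check on a single img tag. imageIsHidden is a hypothetical helper, not the library's code; the case-insensitive comparison is inferred from the 'lshide' sample above.

```php
<?php
// Hypothetical helper, not LiveSearch's own code: tells whether an <img> tag
// carries the LSHIDE class (compared case-insensitively, as in the samples).
function imageIsHidden($imgTag)
{
    // Find the class attribute, quoted with either ' or ".
    if (!preg_match('/\bclass\s*=\s*(["\'])(.*?)\1/i', $imgTag, $m)) {
        return false; // no class attribute at all
    }
    foreach (preg_split('/\s+/', trim($m[2])) as $class) {
        if (strcasecmp($class, 'LSHIDE') === 0) {
            return true;
        }
    }
    return false;
}
```

The first two sample tags are reported as hidden; the last two are not.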

Examples

Just the Form

<form method="post" action="search.php">
<input type="text" name="q">
<input type="submit">
</form>

The Searchresults

<?php
include("ls/livesearch.class.php");
$LiveSearch = new LiveSearch();
?>
...
<?php
$LiveSearch->search($_REQUEST["q"],$_REQUEST["p"]);
echo "<p>" . $LiveSearch->drawSearchresults() . "</p>";
?>
...

or

<?php
include("ls/livesearch.class.php");
$LiveSearch = new LiveSearch();
?>
...
<?php
$search_results = $LiveSearch->search($_REQUEST["q"],$_REQUEST["p"]);
echo "Found: " . $LiveSearch->searchcount;
echo "Pages: " . $LiveSearch->pages;
echo "Current Page: " . $LiveSearch->p;
echo "<pre><b>Search Results</b><code>" .
   print_r($search_results,true) . "</code></pre>";
?>
...
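
Both examples read $_REQUEST["q"] and $_REQUEST["p"] directly, and either value may be missing on the first request. A small wrapper can supply defaults. readSearchParams is a hypothetical helper; defaulting to page 1 is an assumption about how LiveSearch numbers its pages.

```php
<?php
// Hypothetical helper, not part of LiveSearch: reads the search string and
// the page number from a request array without triggering notices.
function readSearchParams($request, $queryVar = 'q', $pageVar = 'p')
{
    $q = isset($request[$queryVar]) ? trim((string) $request[$queryVar]) : '';
    $p = isset($request[$pageVar]) ? (int) $request[$pageVar] : 1;
    if ($p < 1) {
        $p = 1; // assumption: the first page is page 1
    }
    return array($q, $p);
}
```

On the results page this could be used as list($q, $p) = readSearchParams($_REQUEST); followed by $LiveSearch->search($q, $p);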

Credits