siteResearch is a project that I’ve been working on for a long time. Its main goal is to give you insight into the setup of your site. It tries to give answers and explanations about:

  • the internal linking of your site,
  • SEO,
  • and how valid your HTML, CSS and JavaScript really are.


The tools can be launched from the command line and are written in PHP 5.3.
They use a MySQL database to store results.

The code is not optimized for security and should not be run in a production environment. Crawling and parsing HTML also takes too many resources for a production server. Besides that, not having to worry about security gives me more time to work on useful features. A setup with XAMPP, WAMP or MAMP is ideal. (I use MAMP and XAMPP for my own testing.)

At the time of this writing the project includes a crawler with:

  • a set of filters to remove unwanted results
  • options to export the crawl results to various formats (CSV/Excel)

A detailed list of features can be found on the milestone page.

