I’ve now implemented such a system as an extension to Catalyst, the open source Perl web framework. The system isn’t yet ready for general distribution, but I’d like to share my approach.
First, I gathered ten years of web access logs from WormBase, a generic model organism database where I work as the project manager.
Next, I correlated IP addresses with requests and tried to trace browsing patterns from one object to the next. This isn’t an exact science since we haven’t historically tried to uniquely identify users.
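To make the idea concrete, here is a minimal sketch of that correlation step in Python rather than the Perl I actually used. It assumes logs in Common Log Format, a hypothetical `/object/<class>/<name>` URL layout (the real site's paths differ), and a 30-minute cutoff between visits, which is an arbitrary choice of mine for illustration. All function and constant names are made up for this example.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Common Log Format prefix: IP, identity, user, [timestamp], "METHOD path ..."
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(?:GET|POST) (\S+)')
# Hypothetical URL layout for object pages; adjust to the real site's paths.
OBJECT_RE = re.compile(r'^/object/(\w+)/([\w.\-]+)')
# Assumed cutoff: hits further apart than this are treated as separate visits.
SESSION_GAP = timedelta(minutes=30)


def parse_line(line):
    """Return (ip, timestamp, object_key) for object pages, else None."""
    m = LOG_RE.match(line)
    if not m:
        return None
    ip, stamp, path = m.groups()
    obj = OBJECT_RE.match(path)
    if not obj:
        return None  # static files, search pages, and so on
    when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
    return ip, when, f"{obj.group(1)}:{obj.group(2)}"


def count_transitions(lines):
    """Count how often one object page is followed by another from the same IP."""
    by_ip = defaultdict(list)
    for line in lines:
        hit = parse_line(line)
        if hit:
            ip, when, obj = hit
            by_ip[ip].append((when, obj))

    pairs = Counter()
    for hits in by_ip.values():
        hits.sort()  # chronological order within each IP
        for (t1, a), (t2, b) in zip(hits, hits[1:]):
            if a != b and t2 - t1 <= SESSION_GAP:
                pairs[(a, b)] += 1
    return pairs
```

Because an IP address is only a rough stand-in for a user (shared proxies, dynamic addresses), the counts this produces are approximate, which matches the caveat above.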
The results are loaded into a simple MySQL schema with two tables, object and object2related. It’s expediently simple.
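For illustration, a two-table layout along these lines might look like the sketch below. Only the table names `object` and `object2related` come from my setup; the column names, the `hits` counter, and the helper functions are assumptions for this example. It also uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs without a server.

```python
import sqlite3

# Hypothetical columns; only the two table names come from the actual schema.
SCHEMA = """
CREATE TABLE object (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE object2related (
    object_id  INTEGER NOT NULL REFERENCES object(id),
    related_id INTEGER NOT NULL REFERENCES object(id),
    hits       INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (object_id, related_id)
);
"""


def open_db():
    """Create an in-memory database with the two tables."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def object_id(db, name):
    """Return the id for an object name, inserting the row if needed."""
    db.execute("INSERT OR IGNORE INTO object (name) VALUES (?)", (name,))
    row = db.execute("SELECT id FROM object WHERE name = ?", (name,)).fetchone()
    return row[0]


def load(db, pair_counts):
    """Store {(object, related): count} pairs, replacing any earlier counts."""
    for (a, b), n in pair_counts.items():
        db.execute(
            "INSERT OR REPLACE INTO object2related (object_id, related_id, hits) "
            "VALUES (?, ?, ?)",
            (object_id(db, a), object_id(db, b), n),
        )
    db.commit()


def related(db, name, limit=5):
    """Return the most frequently co-visited objects for one object."""
    rows = db.execute(
        "SELECT r.name, o2r.hits FROM object2related o2r "
        "JOIN object o ON o.id = o2r.object_id "
        "JOIN object r ON r.id = o2r.related_id "
        "WHERE o.name = ? ORDER BY o2r.hits DESC, r.name LIMIT ?",
        (name, limit),
    )
    return rows.fetchall()
```

With pair counts derived from the logs, something like `related(db, "gene:unc-26")` would then return the objects most often visited next, which is all a "related items" panel needs.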