Archive-It Full-Text Search: 200 Customers, 2000+ Collections, 1.3+ Billion Archival Web Pages.


11:45am - 12:30pm on Wednesday, October 19, 2011

This session describes Archive-It, the Internet Archive's subscription-based, self-serve web archiving service, with a focus on its full-text search system. With nearly 200 partners and over 2000 collections, the custom Lucene-based system handles 3+ million index updates per day across an index that totals over 1.3 billion documents. The session will give a detailed description of the architecture and implementation of the Archive-It search system, highlighting many of the challenges that arise from its scale and from complex use cases.


