com.norconex.collector.http.filter
Interface IHttpDocumentFilter

All Superinterfaces:
Serializable
All Known Implementing Classes:
ExtensionURLFilter, RegexURLFilter

public interface IHttpDocumentFilter
extends Serializable

Filters a document after its content has been downloaded.

It is highly recommended to override the toString() method to represent this filter properly in human-readable form (e.g. for logging). It is a good idea to include the specifics of this filter so crawler users can know exactly why documents were accepted or rejected, if need be.
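As a rough illustration of the toString() recommendation, the sketch below shows a filter whose string form includes its configuration. So that it compiles without the collector library, it uses stand-in types: SketchDocument and SketchDocumentFilter are hypothetical substitutes for HttpDocument and IHttpDocumentFilter, and MaxUrlLengthFilter is an invented example, not a class shipped with the collector.

```java
import java.io.Serializable;

// Hypothetical stand-in for HttpDocument; only a URL is modeled here.
class SketchDocument {
    private final String url;

    SketchDocument(String url) {
        this.url = url;
    }

    String getUrl() {
        return url;
    }
}

// Hypothetical stand-in mirroring the shape of IHttpDocumentFilter.
interface SketchDocumentFilter extends Serializable {
    boolean acceptDocument(SketchDocument document);
}

// Invented example filter: rejects documents whose URL is too long.
class MaxUrlLengthFilter implements SketchDocumentFilter {
    private static final long serialVersionUID = 1L;

    private final int maxLength;

    MaxUrlLengthFilter(int maxLength) {
        this.maxLength = maxLength;
    }

    @Override
    public boolean acceptDocument(SketchDocument document) {
        return document.getUrl().length() <= maxLength;
    }

    // Includes the filter's settings so a log line explains a rejection.
    @Override
    public String toString() {
        return "MaxUrlLengthFilter[maxLength=" + maxLength + "]";
    }
}
```

With a string form like this, a log entry naming the rejecting filter also tells the reader which threshold was in effect.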

Author:
Pascal Essiembre

Method Summary
 boolean acceptDocument(HttpDocument document)
          Whether to accept an HTTP document.
 

Method Detail

acceptDocument

boolean acceptDocument(HttpDocument document)
Whether to accept an HTTP document.

Parameters:
document - the document to validate
Returns:
true if accepted, false otherwise
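The following is a minimal sketch of how a caller might apply acceptDocument and report which filter rejected a document. It is not the collector's actual code: StubDocument and StubDocumentFilter are hypothetical stand-ins for HttpDocument and IHttpDocumentFilter, and ContentTypePrefixFilter and FilterRunner are invented for illustration. Applying several filters in sequence is also an assumption of this sketch.

```java
import java.io.Serializable;
import java.util.List;

// Hypothetical stand-in for HttpDocument, holding data known after download.
class StubDocument {
    private final String url;
    private final String contentType;

    StubDocument(String url, String contentType) {
        this.url = url;
        this.contentType = contentType;
    }

    String getUrl() {
        return url;
    }

    String getContentType() {
        return contentType;
    }
}

// Hypothetical stand-in mirroring the shape of IHttpDocumentFilter.
interface StubDocumentFilter extends Serializable {
    boolean acceptDocument(StubDocument document);
}

// Invented example: accepts documents whose content type has a given prefix.
class ContentTypePrefixFilter implements StubDocumentFilter {
    private static final long serialVersionUID = 1L;

    private final String prefix;

    ContentTypePrefixFilter(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public boolean acceptDocument(StubDocument document) {
        String type = document.getContentType();
        return type != null && type.startsWith(prefix);
    }

    @Override
    public String toString() {
        return "ContentTypePrefixFilter[prefix=" + prefix + "]";
    }
}

// Invented helper showing one way a caller could consult the filters.
class FilterRunner {
    // Returns the first filter that rejects the document, or null if all accept.
    static StubDocumentFilter firstRejecting(
            List<StubDocumentFilter> filters, StubDocument document) {
        for (StubDocumentFilter filter : filters) {
            if (!filter.acceptDocument(document)) {
                return filter;
            }
        }
        return null;
    }
}
```

A caller could log the returned filter's toString() value together with the document URL to record why the document was dropped.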


Copyright © 2009-2013 Norconex Inc. All Rights Reserved.