Here you can specify special rules to customize the Job Package download process. The available basic rules are:
Depth
Defines the maximum recursion depth that may be reached during a Job Package download job. With regard to page depth, a website structure can be viewed like this:
depth.png
Links Limit
Defines the maximum number of links that may be followed during the Job Package download process. When this limit is reached, the download process stops;
Time Limit
Defines the maximum time (in milliseconds) that a Job Package download process must not exceed. When this time limit is reached, the download process stops.
File Size Filter
These settings allow download decisions to be made based on the file size of web resources; they can be used, for example, to avoid downloading large files (a sketch of this logic follows the property list below).
The available file size filter properties are:
File Size (from)
Defines the lower bound of the file size interval considered by this filter;
File Size (to)
Defines the upper bound of the file size interval considered by this filter;
Reply "Content-Length" not available action
Defines the action to be taken when the server reply does not provide a "Content-Length" header and the file size therefore cannot be checked against File Size (from) and File Size (to). At the moment, the two possible values are:
Save To Disk: saves the file to disk;
Reject File: rejects that particular file and will not download it.
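As an illustration, here is a minimal Java sketch (not the ItSucks API; the class, enum and method names are made up for this example) of how a file size filter with a fallback action for a missing "Content-Length" header could be modelled:

import java.util.OptionalLong;

public class FileSizeFilterSketch {

    // Hypothetical actions, mirroring the two values described above.
    enum Action { SAVE_TO_DISK, REJECT_FILE }

    static Action decide(OptionalLong contentLength,
                         long fromBytes,
                         long toBytes,
                         Action noLengthAction) {
        if (contentLength.isEmpty()) {
            // The reply carries no "Content-Length" header: apply the fallback action.
            return noLengthAction;
        }
        long size = contentLength.getAsLong();
        // Save only files whose size falls inside [from, to]; reject the rest.
        return (size >= fromBytes && size <= toBytes)
                ? Action.SAVE_TO_DISK
                : Action.REJECT_FILE;
    }

    public static void main(String[] args) {
        // Accept files between 1 KB and 5 MB; reject files of unknown size.
        System.out.println(decide(OptionalLong.of(2_048), 1_024, 5_000_000, Action.REJECT_FILE));
        System.out.println(decide(OptionalLong.empty(), 1_024, 5_000_000, Action.REJECT_FILE));
    }
}

Running the sketch prints SAVE_TO_DISK for the 2 KB file and REJECT_FILE for the reply without a known size.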
The recursion depth defines how deep ItSucks should crawl through linked web sites. Think about a site structure like this:
index.html
 |-- site1.html
 |    |-- site2.html
 |    |    `-- site3.html
 |    `-- yellow.png
 `-- background.png
If you set the recursion depth to 0, you will only get the index.html.
With a value of 1, you will get index.html, site1.html and background.png.
With a value of 2, you will get index.html, site1.html, background.png, site2.html and yellow.png.
With a value of 3, you will get index.html, site1.html, background.png, site2.html, yellow.png and site3.html.
When set to -1, the recursion depth is unlimited; a sketch of this behavior follows below.
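The following is a minimal Java sketch (not the ItSucks implementation; all names are hypothetical) of how such a depth limit can be enforced during a crawl: each discovered link inherits its parent's depth plus one, links beyond the limit are not followed, and -1 disables the check. extractLinks() is a stand-in for real page downloading and HTML parsing, pre-filled with the example structure above:

import java.util.*;

public class DepthLimitSketch {

    record Page(String url, int depth) {}

    static List<String> crawl(String startUrl, int maxDepth) {
        List<String> downloaded = new ArrayList<>();
        Deque<Page> open = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        open.add(new Page(startUrl, 0));
        seen.add(startUrl);

        while (!open.isEmpty()) {
            Page page = open.poll();
            downloaded.add(page.url());            // depth 0 is always the start page
            if (maxDepth != -1 && page.depth() >= maxDepth) {
                continue;                          // do not follow links any deeper
            }
            for (String link : extractLinks(page.url())) {
                if (seen.add(link)) {
                    open.add(new Page(link, page.depth() + 1));
                }
            }
        }
        return downloaded;
    }

    // Placeholder: a real crawler would download the page and parse its links.
    static List<String> extractLinks(String url) {
        return switch (url) {
            case "index.html" -> List.of("site1.html", "background.png");
            case "site1.html" -> List.of("site2.html", "yellow.png");
            case "site2.html" -> List.of("site3.html");
            default -> List.of();
        };
    }

    public static void main(String[] args) {
        System.out.println(crawl("index.html", 2));
        // [index.html, site1.html, background.png, site2.html, yellow.png]
    }
}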
Defines a time limit. If the time limit is reached, no more links are added to the "open" list. After all links in the "open" list are finished, the download ends.
Defines a maximum number of links (URLs). When the limit is reached, no more links are added to the "open" list. After all links in the "open" list are finished, the download ends.
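A minimal Java sketch (again not the ItSucks implementation; the download() and extractLinks() helpers are placeholders) of how both limits described above could be applied: once either limit is exceeded, no new links are queued, but everything already in the "open" list is still processed:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

public class CrawlLimitsSketch {

    static void crawl(String startUrl, long timeLimitMillis, int linkLimit) {
        Deque<String> open = new ArrayDeque<>(List.of(startUrl));
        long started = System.currentTimeMillis();
        int linksAdded = 1;

        while (!open.isEmpty()) {
            String url = open.poll();
            download(url);                          // hypothetical download step

            boolean timeExceeded = System.currentTimeMillis() - started >= timeLimitMillis;
            boolean linksExceeded = linksAdded >= linkLimit;
            if (timeExceeded || linksExceeded) {
                continue;                           // drain the "open" list, add nothing new
            }
            for (String link : extractLinks(url)) { // hypothetical link extraction
                open.add(link);
                linksAdded++;
            }
        }
    }

    static void download(String url) { /* fetch and save the resource */ }

    static List<String> extractLinks(String url) { return List.of(); }
}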
Defines a prefix for the URL. When set, only URLs beginning with the prefix are accepted. This can be handy if only a specific directory should be downloaded. Only a plain string is allowed, no regular expressions.
Example: http://www.example.com/section1/
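A minimal Java sketch of this rule, assuming it is a plain startsWith() test rather than a regular expression match (the class and method names are made up for this example):

public class UrlPrefixSketch {

    static boolean accepted(String url, String prefix) {
        return url.startsWith(prefix);
    }

    public static void main(String[] args) {
        String prefix = "http://www.example.com/section1/";
        System.out.println(accepted("http://www.example.com/section1/page.html", prefix)); // true
        System.out.println(accepted("http://www.example.com/section2/page.html", prefix)); // false
    }
}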
A host filter can be set if ItSucks should only follow links whose hostname matches a regular expression. To do so, remove the ".*" entry from the "Allowed Hostname" box and add something like ".*google.de". In this case ItSucks will only retrieve files from hosts like "images.google.de", "google.de" or "www.google.de". Be careful not to remove all entries from the filter list; in that case no hostname is allowed.
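A minimal Java sketch (not the ItSucks API) of how such a hostname filter could work: the URL's host is extracted and accepted if it matches at least one of the configured regular expressions:

import java.net.URI;
import java.util.List;

public class HostnameFilterSketch {

    static boolean accepted(String url, List<String> allowedHostPatterns) {
        String host = URI.create(url).getHost();
        return host != null
                && allowedHostPatterns.stream().anyMatch(host::matches);
    }

    public static void main(String[] args) {
        List<String> allowed = List.of(".*google.de");   // entry from the example above
        System.out.println(accepted("http://images.google.de/img.png", allowed)); // true
        System.out.println(accepted("http://www.example.com/img.png", allowed));  // false
        // With an empty list no host matches, so nothing would be downloaded.
        System.out.println(accepted("http://images.google.de/img.png", List.of())); // false
    }
}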
This filter controls which file types are saved to disk. Only files matching one of the regular expressions are saved. For example, to accept only JPEGs, remove the ".*" entry from the list and add ".*jpg$". If all entries are removed from the filter list, no files will be saved to disk.
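A minimal Java sketch (not the ItSucks API) of this kind of filter: a file is saved only if its URL matches at least one of the configured regular expressions:

import java.util.List;

public class SaveToDiskFilterSketch {

    static boolean saveToDisk(String url, List<String> patterns) {
        return patterns.stream().anyMatch(url::matches);
    }

    public static void main(String[] args) {
        List<String> jpegOnly = List.of(".*jpg$");       // entry from the example above
        System.out.println(saveToDisk("http://example.com/photo.jpg", jpegOnly));  // true
        System.out.println(saveToDisk("http://example.com/index.html", jpegOnly)); // false
    }
}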