D URL Crawler Status Codes

The crawler uses a set of codes to indicate the result of crawling a URL. In addition to the standard HTTP status codes, it uses its own codes for situations that are not related to HTTP. Only URLs with status 200 are indexed.

The following table lists the URL status codes.

Code	Description
200	URL OK
400	Bad request
401	Authorization required
402	Payment required
403	Access forbidden
404	Not found
405	Method not allowed
406	Not acceptable
407	Proxy authentication required
408	Request timeout
409	Conflict
410	Gone
414	Request URI too large
500	Internal server error
502	Bad gateway
503	Service unavailable
504	Gateway timeout
505	HTTP version not supported
902	Timeout reading document
903	Filtering failed
904	Out of memory error
905	IOEXCEPTION in processing URL
906	Connection refused
907	Socket bind exception
908	Filter not available
909	Duplicate document detected
910	Duplicate document ignored
911	Empty document
951	URL not crawled (this can happen if robots.txt specifies that a certain document should not be indexed)
952	URL crawled
953	Metatag redirection
954	HTTP redirection
955	Black list URL
956	URL is not unique
957	Sentry URL (URL used as a placeholder)
958	Document read error
959	Form login failed
1001	Datatype is not TEXT/HTML
1002	Broken network data stream
1003	HTTP redirect location does not exist
1004	Bad relative URL
1005	HTTP error
1006	Error parsing HTTP header
1007	Invalid URL table column name
1008	JDBC driver missing
1009	Binary document reported as text document
1010	Invalid display URL
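
The table can be turned into a small lookup for reading crawler logs. The following Python sketch is illustrative only and is not part of the crawler. The names CRAWLER_CODES, is_indexed, and describe are hypothetical. The crawler-specific descriptions are copied from the table, and the assumption is that any code below 600 is a standard HTTP status code. Python's http.HTTPStatus supplies the standard reason phrases, which can differ slightly from the wording in the table.

```python
from http import HTTPStatus

# Crawler-specific codes and descriptions, copied from the table above.
# This mapping is a hypothetical helper, not an API exposed by the crawler.
CRAWLER_CODES = {
    902: "Timeout reading document",
    903: "Filtering failed",
    904: "Out of memory error",
    905: "IOEXCEPTION in processing URL",
    906: "Connection refused",
    907: "Socket bind exception",
    908: "Filter not available",
    909: "Duplicate document detected",
    910: "Duplicate document ignored",
    911: "Empty document",
    951: "URL not crawled",
    952: "URL crawled",
    953: "Metatag redirection",
    954: "HTTP redirection",
    955: "Black list URL",
    956: "URL is not unique",
    957: "Sentry URL (URL used as a placeholder)",
    958: "Document read error",
    959: "Form login failed",
    1001: "Datatype is not TEXT/HTML",
    1002: "Broken network data stream",
    1003: "HTTP redirect location does not exist",
    1004: "Bad relative URL",
    1005: "HTTP error",
    1006: "Error parsing HTTP header",
    1007: "Invalid URL table column name",
    1008: "JDBC driver missing",
    1009: "Binary document reported as text document",
    1010: "Invalid display URL",
}


def is_indexed(code: int) -> bool:
    """Return True only for status 200, the one status that is indexed."""
    return code == 200


def describe(code: int) -> str:
    """Return a readable label for a URL status code."""
    if code in CRAWLER_CODES:
        return f"crawler {code}: {CRAWLER_CODES[code]}"
    try:
        # Assumption: anything else that HTTPStatus recognizes is a
        # standard HTTP status code.
        return f"HTTP {code}: {HTTPStatus(code).phrase}"
    except ValueError:
        return f"unknown {code}"


if __name__ == "__main__":
    for code in (200, 404, 909, 1008):
        print(describe(code), "| indexed:", is_indexed(code))
```

A status such as 952 (URL crawled) still reports is_indexed as False in this sketch, because the text states that only status 200 leads to indexing.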