
boilerpipe JSON parameters

If boilerpipe was able to extract the main content, the JSON response will look like this:

{
  "response" : {
    "title" : "The page's title",
    "content" : "The extracted content (as a string)",
    "source" : "The page's URL"
  },
  "status" : "success"
}
		

If there was an error, the JSON response will look something like this:

{
  "error" : {
    "message" : "A short error description",
    "code" : Numeric error code
  },
  "status" : "error"
}
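
A client can tell the two response shapes apart by checking the "status" field. The following is a minimal Python sketch of that check, not part of the API itself; the names `BoilerpipeError` and `parse_boilerpipe_response` are hypothetical and chosen only for illustration. It assumes the response body has already been fetched as a string.

```python
import json


class BoilerpipeError(Exception):
    """Hypothetical exception carrying the API's error code and message."""

    def __init__(self, code, message):
        super().__init__(f"boilerpipe error {code}: {message}")
        self.code = code
        self.message = message


def parse_boilerpipe_response(raw):
    """Return the "response" object on success, raise BoilerpipeError otherwise."""
    data = json.loads(raw)
    if data.get("status") == "success":
        # Contains "title", "content" and "source" as documented above.
        return data["response"]
    # Assumption: any other status is treated as an error response.
    error = data.get("error", {})
    raise BoilerpipeError(error.get("code"), error.get("message"))


if __name__ == "__main__":
    sample = '{"response": {"title": "T", "content": "C", "source": "U"}, "status": "success"}'
    print(parse_boilerpipe_response(sample)["title"])
```

Raising an exception keeps the success path free of status checks; callers that prefer return values could return the parsed "error" object instead.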
		

The following errors may occur:

Code  Description
100   Malformed URL
101   Could not fetch the URL
102   Unknown extractor mode or output format
103   No URL requested
104   Extractor would produce low-quality result
105   Unsupported content type (e.g., a video)
106   No content could be found under the requested URL (i.e., the target server returned 404)
113   URL/Host is blacklisted
99    Unknown error
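
For client-side logging or error handling, the codes listed above can be kept in a lookup table. This is a sketch only: the names `ERROR_DESCRIPTIONS` and `describe_error` are hypothetical, and falling back to "Unknown error" for codes not in the list is an assumption, not documented API behavior.

```python
# Error codes and descriptions copied from the list above.
ERROR_DESCRIPTIONS = {
    100: "Malformed URL",
    101: "Could not fetch the URL",
    102: "Unknown extractor mode or output format",
    103: "No URL requested",
    104: "Extractor would produce low-quality result",
    105: "Unsupported content type (e.g., a video)",
    106: "No content could be found under the requested URL",
    113: "URL/Host is blacklisted",
    99: "Unknown error",
}


def describe_error(code):
    """Map a numeric error code to its description.

    Assumption: codes missing from the table are reported as code 99.
    """
    return ERROR_DESCRIPTIONS.get(code, ERROR_DESCRIPTIONS[99])


if __name__ == "__main__":
    print(describe_error(101))
```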

Credits

This JSON API was inspired by Tom Taylor's Extractomatic, which in turn was inspired by boilerpipe.