java.lang.Object
  com.norconex.collector.http.HttpCollector
public class HttpCollector
Main application class. To use it properly, you must first configure
it, either by providing a populated instance of HttpCollectorConfig,
or by XML configuration, loaded using HttpCollectorConfigLoader.

Instances of this class can hold several crawlers, running at once.
This is convenient when there are configuration settings to be shared among
crawlers. When you have many crawler jobs defined that have nothing
in common, it may be best to configure and run them separately, to facilitate
troubleshooting. There is no hard rule for this; experimenting with your
target sites will help you decide.
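
The "configure and run them separately" option can be sketched as one collector per configuration file, run one after another, so that a failure points at a single configuration. This is only an illustration under stated assumptions: `Crawlable`, `runSeparately`, and the class name are hypothetical, and `Crawlable` is a stand-in that mirrors only the documented `crawl(boolean)` signature so the sketch compiles without the Norconex jar. It also assumes that a failed crawl surfaces as a `RuntimeException`, which this page does not state.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

public class SeparateCollectorsSketch {

    /** Hypothetical stand-in mirroring HttpCollector.crawl(boolean). */
    interface Crawlable {
        void crawl(boolean resumeNonCompleted);
    }

    /**
     * Builds and runs one collector per configuration file, sequentially,
     * and returns the configuration files whose run threw an exception.
     */
    static List<File> runSeparately(
            List<File> configFiles,
            File variablesFile,
            BiFunction<File, File, Crawlable> factory) {
        List<File> failed = new ArrayList<>();
        for (File configFile : configFiles) {
            try {
                // false = start fresh instead of resuming an aborted run
                factory.apply(configFile, variablesFile).crawl(false);
            } catch (RuntimeException e) {
                // Assumption: a failed run is reported as a RuntimeException
                failed.add(configFile);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<File> configs = new ArrayList<>();
        for (String arg : args) {
            configs.add(new File(arg));
        }
        // Hypothetical variables file name, for illustration only.
        File variables = new File("collector.variables");
        // With the Norconex jar on the classpath, the factory could be:
        //   (cfg, vars) -> new HttpCollector(cfg, vars)::crawl
        List<File> failed = runSeparately(configs, variables,
                (cfg, vars) -> resume -> System.out.println("Would crawl with " + cfg));
        System.out.println("Failed configurations: " + failed);
    }
}
```

Passing the constructor as a factory keeps the loop independent of how each collector is built, so the same loop works with the file-based constructor or with one that takes an `HttpCollectorConfig`.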
Constructor Summary | |
---|---|
HttpCollector()
Creates a non-configured HTTP collector. |
|
HttpCollector(File configFile,
File variablesFile)
Creates an HTTP Collector configured using the provided configuration file and variables file. |
|
HttpCollector(HttpCollectorConfig collectorConfig)
Creates and configures an HTTP Collector with the provided configuration. |
Method Summary | |
---|---|
void |
crawl(boolean resumeNonCompleted)
Launches all crawlers defined in the configuration. |
JobSuite |
createJobSuite()
|
File |
getConfigurationFile()
|
HttpCrawler[] |
getCrawlers()
|
File |
getVariablesFile()
|
static void |
main(String[] args)
Invokes the HTTP Collector from the command line. |
void |
setConfigurationFile(File configurationFile)
|
void |
setCrawlers(HttpCrawler[] crawlers)
|
void |
setVariablesFile(File variablesFile)
|
void |
stop()
Stops a running instance of this HTTP Collector. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public HttpCollector()
public HttpCollector(File configFile, File variablesFile)
configFile - a configuration file
variablesFile - a variables file
public HttpCollector(HttpCollectorConfig collectorConfig)
collectorConfig - HTTP Collector configuration
Method Detail |
---|
public File getConfigurationFile()
public void setConfigurationFile(File configurationFile)
public File getVariablesFile()
public void setVariablesFile(File variablesFile)
public HttpCrawler[] getCrawlers()
public void setCrawlers(HttpCrawler[] crawlers)
public static void main(String[] args)
args - Invoke it once without any arguments to get a list of command-line options.
public void crawl(boolean resumeNonCompleted)
resumeNonCompleted - whether to resume where previous crawler aborted (if applicable)
public void stop()
public JobSuite createJobSuite()
createJobSuite in interface IJobSuiteFactory