
Resolving the java.io.IOException: Job failed error when running Nutch under Cygwin


Running nutch under Cygwin fails with the following error:

$ ./nutch crawl urls -dir crawl -depth 3 -topN 10
cygpath: can't convert empty path
solrUrl is not set, indexing will be skipped...
crawl started in: crawl
rootUrlDir = urls
threads = 10
depth = 3
solrUrl=null
topN = 10
Injector: starting at 2013-09-04 10:49:43
Injector: crawlDb: crawl/crawldb
Injector: urlDir: urls
Injector: Converting injected urls to crawl db entries.
Injector: total number of urls rejected by filters: 0
Injector: total number of urls injected after normalization and filtering: 1
Injector: Merging injected urls into crawl db.
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
        at org.apache.nutch.crawl.Injector.inject(Injector.java:296)
        at org.apache.nutch.crawl.Crawl.run(Crawl.java:132)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)

The details are in the log file nutch/logs/hadoop.log:

2013-09-04 10:24:39,818 WARN  fs.FileUtil - Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator1256492961\.staging to 0700
2013-09-04 10:24:39,818 WARN  fs.FileUtil - Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator1256492961\.staging to 0700
2013-09-04 10:24:39,818 WARN  fs.FileUtil - Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator1256492961\.staging to 0700
2013-09-04 10:24:39,818 WARN  fs.FileUtil - Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator1256492961\.staging to 0700
2013-09-04 10:24:39,818 WARN  fs.FileUtil - Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator1256492961\.staging to 0700
2013-09-04 10:24:39,818 WARN  fs.FileUtil - Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator1256492961\.staging to 0700
2013-09-04 10:24:39,820 WARN  fs.FileUtil - Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator1256492961\.staging\job_local1256492961_0002 to 0700
2013-09-04 10:24:39,820 WARN  fs.FileUtil - Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator1256492961\.staging\job_local1256492961_0002 to 0700
2013-09-04 10:24:39,820 WARN  fs.FileUtil - Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator1256492961\.staging\job_local1256492961_0002 to 0700
2013-09-04 10:24:39,820 WARN  fs.FileUtil - Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator1256492961\.staging\job_local1256492961_0002 to 0700
2013-09-04 10:24:39,820 WARN  fs.FileUtil - Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator1256492961\.staging\job_local1256492961_0002 to 0700
2013-09-04 10:24:39,820 WARN  fs.FileUtil - Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator1256492961\.staging\job_local1256492961_0002 to 0700
2013-09-04 10:24:39,837 WARN  fs.FileUtil - Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator1256492961\.staging\job_local1256492961_0002\job.jar to 0644
2013-09-04 10:24:39,843 WARN  fs.FileUtil - Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator1256492961\.staging\job_local1256492961_0002\job.split to 0644
2013-09-04 10:24:39,846 WARN  fs.FileUtil - Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator1256492961\.staging\job_local1256492961_0002\job.splitmetainfo to 0644
2013-09-04 10:24:39,849 WARN  fs.FileUtil - Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator1256492961\.staging\job_local1256492961_0002\job.xml to 0644
2013-09-04 10:24:39,883 ERROR mapred.FileOutputCommitter - Mkdirs failed to create file:/D:/nutch/bin/crawl/crawldb/791399875/_temporary
2013-09-04 10:24:40,005 WARN  mapred.LocalJobRunner - job_local1256492961_0002
java.io.IOException: The temporary job-output directory file:/D:/nutch/bin/crawl/crawldb/791399875/_temporary doesn't exist!
	at org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:250)
	at org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:244)
	at org.apache.hadoop.mapred.MapFileOutputFormat.getRecordWriter(MapFileOutputFormat.java:46)
	at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.<init>(ReduceTask.java:449)
	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:491)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
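
In a long hadoop.log, the repeated WARN lines can bury the entry that matters. A minimal sketch for filtering the log down to ERROR entries and exception lines follows. The function name show_log_errors is hypothetical, and the default path logs/hadoop.log assumes the command is run from the nutch directory:

```shell
#!/bin/sh
# Hypothetical helper: print only ERROR entries and exception lines from a log.
# $1 is the log path; it defaults to logs/hadoop.log (assumes the current
# directory is the nutch directory).
show_log_errors() {
    log_file="${1:-logs/hadoop.log}"
    # -n prefixes each match with its line number; -E enables extended regex.
    grep -nE ' ERROR |Exception' "$log_file"
}
```

Called as show_log_errors nutch/logs/hadoop.log, this would skip the "Failed to set permissions" warnings and leave the "Mkdirs failed to create" entry and the IOException that follows it.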

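The log shows that the job failed because the temporary output directory D:/nutch/bin/crawl/crawldb/791399875/_temporary does not exist (Hadoop could not create it).
The command that was run: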

./nutch crawl urls -dir crawl -depth 3 -topN 10

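Because the command was run from nutch/bin, the relative path crawl resolved to nutch/bin/crawl, which is clearly the wrong place. The urls directory was also under nutch/bin/, so I moved urls into the nutch directory.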

Then, from the nutch directory, run:

./bin/nutch crawl urls -dir crawl -depth 3 -topN 10

The exception above no longer occurs.
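
The fix described above can be sketched as a small script. The function name move_urls_to_home is hypothetical, and the example path /cygdrive/d/nutch is an assumption based on the D:/nutch location seen in the log. Adjust both to your own installation:

```shell
#!/bin/sh
# Hypothetical helper: move the seed directory out of bin/ into the nutch
# directory. $1 is the nutch directory (an assumption; adjust to your install).
move_urls_to_home() {
    nutch_home="$1"
    # Move only if bin/urls exists and would not overwrite an existing urls.
    if [ -d "$nutch_home/bin/urls" ] && [ ! -e "$nutch_home/urls" ]; then
        mv "$nutch_home/bin/urls" "$nutch_home/urls"
    fi
}

# Example usage (commented out so that sourcing this file has no side effects):
# move_urls_to_home /cygdrive/d/nutch
# cd /cygdrive/d/nutch && ./bin/nutch crawl urls -dir crawl -depth 3 -topN 10
```

Running the crawl from the nutch directory means the relative paths urls and crawl resolve against that directory instead of nutch/bin.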

Permalink: http://www.chepoo.com/cygwin-nutch-java-io-ioexception-job-failed-error-resolved.html | IT技术精华网
