Fixing the Cygwin/Nutch "Failed to set permissions of path" exception
The following error appeared while setting up Nutch 1.7 on Windows:
$ ./nutch crawl urls -dir crawl -depth 3 -topN 5
cygpath: can't convert empty path
solrUrl is not set, indexing will be skipped...
crawl started in: crawl
rootUrlDir = urls
threads = 10
depth = 3
solrUrl=null
topN = 5
Injector: starting at 2013-09-04 10:31:36
Injector: crawlDb: crawl/crawldb
Injector: urlDir: urls
Injector: Converting injected urls to crawl db entries.
Exception in thread "main" java.io.IOException: Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator2019975669\.staging to 0700
        at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:691)
        at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:664)
        at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:514)
        at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:349)
        at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:193)
        at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:126)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:942)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1353)
        at org.apache.nutch.crawl.Injector.inject(Injector.java:281)
        at org.apache.nutch.crawl.Crawl.run(Crawl.java:132)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)
This is a Windows file-permission problem; the same command runs fine on Linux.
The corresponding upstream bug is HADOOP-7682: https://issues.apache.org/jira/browse/HADOOP-7682
Since I run Nutch under Cygwin on Windows, the Hadoop source has to be patched and rebuilt. Nutch 1.7 bundles hadoop-core-1.2.0.jar.
1. Download the hadoop-1.2.0 source from http://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-1.2.0/hadoop-1.2.0.tar.gz.
2. Download Apache Ant and add its bin directory to the PATH environment variable.
3. Unpack hadoop-1.2.0.tar.gz, then edit hadoop-1.2.0\src\core\org\apache\hadoop\fs\FileUtil.java: search for "Failed to set permissions of path" (around line 689) and change the throw new IOException into a LOG.warn call:
private static void checkReturnValue(boolean rv, File p,
                                     FsPermission permission) throws IOException {
  if (!rv) {
    throw new IOException("Failed to set permissions of path: " + p +
                          " to " +
                          String.format("%04o", permission.toShort()));
  }
}
Change it to:
private static void checkReturnValue(boolean rv, File p,
                                     FsPermission permission) throws IOException {
  if (!rv) {
    /**
    throw new IOException("Failed to set permissions of path: " + p +
                          " to " +
                          String.format("%04o", permission.toShort()));
    */
    LOG.warn("Failed to set permissions of path: " + p +
             " to " +
             String.format("%04o", permission.toShort()));
  }
}
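To see why swapping the throw for a warning unblocks job submission, here is a minimal standalone sketch of the patched logic. The class name and the use of java.util.logging are my own stand-ins (real Hadoop uses commons-logging and an FsPermission argument); only the warn-instead-of-throw behavior mirrors the edit above.

```java
import java.io.File;
import java.io.IOException;
import java.util.logging.Logger;

public class CheckReturnValueDemo {
    private static final Logger LOG =
        Logger.getLogger(CheckReturnValueDemo.class.getName());

    // Patched variant: log a warning instead of throwing, mirroring the edit above.
    static void checkReturnValue(boolean rv, File p, String permission) {
        if (!rv) {
            LOG.warning("Failed to set permissions of path: " + p +
                        " to " + permission);
        }
    }

    public static void main(String[] args) throws IOException {
        File staging = File.createTempFile("staging", ".tmp");
        staging.deleteOnExit();
        // On Windows/NTFS, java.io.File permission setters frequently return
        // false; that false return value is what used to abort job submission
        // with the IOException in the stack trace above.
        boolean rv = staging.setExecutable(true, true);
        checkReturnValue(rv, staging, "0700");
        System.out.println("job submission continues past the permission check");
    }
}
```

With the patch, a failed chmod on the local staging directory is merely logged, which is harmless for a single-machine Cygwin setup where those POSIX permissions have no real effect anyway.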
4. Edit hadoop-1.2.0\build.xml: search for autoreconf and remove (or comment out) all six matching exec blocks with executable="autoreconf".
For example:
<!--
<exec executable="autoreconf" dir="${native.src.dir}"
      searchpath="yes" failonerror="yes">
  <arg value="-if"/>
</exec>
-->
5. In a Cygwin shell, change into the hadoop-1.2.0 directory and run ant. (The build must run under Cygwin; it fails under the plain Windows command prompt.)
6. Replace Nutch's hadoop-core-1.2.0.jar with the freshly built hadoop-1.2.0\build\hadoop-core-1.2.1-SNAPSHOT.jar.
Then, under Cygwin, run:
nutch crawl urls -dir crawl -depth 3 -topN 10
The exception above no longer occurs.
Permalink: http://www.chepoo.com/cygwin-nutch-failed-to-set-permissions-of-path-error-resolved.html | IT技术精华网
Comment (2015-10-02, 10:06 PM):
I changed the source code as you described, but ant reports errors when I rebuild hadoop-1.2.0, and I can't figure out how to fix them:

ivy-retrieve-common:
[ivy:cachepath] DEPRECATED: 'ivy.conf.file' is deprecated, use 'ivy.settings.file' instead
[ivy:cachepath] :: loading settings :: file = E:\Apache下的包汇总\hadoop\hadoop-1.2.0\ivy\ivysettings.xml
init:
[touch] Creating E:\cygwin64\tmp\null1954040975
[delete] Deleting: E:\cygwin64\tmp\null1954040975
[exec] src/saveVersion.sh: line 33: svn: command not found
[exec] src/saveVersion.sh: line 34: svn: command not found

... Could you send me your compiled hadoop-1.2.0? I've been struggling with this for ages ~~~~(>_<)~~~~
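For what it's worth, the "svn: command not found" lines come from src/saveVersion.sh, which shells out to svn only to stamp the build with revision information. Installing Cygwin's subversion package is the clean fix. As a quick workaround, a stub svn on the PATH also lets the script run, with the revision simply recorded as a placeholder. This is my own untested suggestion, not from the original post, and it may not cure whatever the build's actual fatal error is:

```shell
# Hypothetical workaround: put a stub svn on the PATH so saveVersion.sh
# no longer fails with "command not found".
mkdir -p "$HOME/bin"
printf '#!/bin/sh\necho Unknown\n' > "$HOME/bin/svn"
chmod +x "$HOME/bin/svn"
export PATH="$HOME/bin:$PATH"
svn info   # the stub prints "Unknown" for any invocation
```

Remove the stub (or install real subversion) once the build succeeds, so other tools are not confused by it.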