
Configuring Hadoop environment variables on Windows

Hadoop 3.1.0 basic environment setup on Windows 7

Preface: this guide assumes that a Java environment is already installed on your Windows machine.

1. Download Hadoop 3.1.0

2. Extract the archive to a directory of your choice

3. Configure HADOOP_HOME

Create a new HADOOP_HOME environment variable pointing to the Hadoop extraction directory, e.g. D:/hadoop, and append %HADOOP_HOME%\bin; to the Path environment variable.
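If you prefer the command line to the System Properties dialog, a minimal sketch using setx (assuming D:\hadoop as the extraction directory; substitute your own path):

rem write HADOOP_HOME to the user environment
setx HADOOP_HOME "D:\hadoop"
rem setx expands %PATH% immediately, so the bin path is given literally here
setx PATH "%PATH%;D:\hadoop\bin"

setx does not affect the current session, so open a new cmd window before the new values take effect.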

4. Configure the Hadoop configuration files

The Hadoop configuration files live under hadoop/etc/hadoop; the files to edit are:

hadoop-env.cmd / core-site.xml / hdfs-site.xml / mapred-site.xml / yarn-site.xml

hadoop-env.cmd: the only changes are the lines appended at the end of the file (highlighted in red in the original post; here they are the four set lines beginning with set HADOOP_PREFIX):

@echo off
@rem Licensed to the Apache Software Foundation (ASF) under one or more
@rem contributor license agreements.  See the NOTICE file distributed with
@rem this work for additional information regarding copyright ownership.
@rem The ASF licenses this file to You under the Apache License, Version 2.0
@rem (the "License"); you may not use this file except in compliance with
@rem the License.  You may obtain a copy of the License at
@rem
@rem     http://www.apache.org/licenses/LICENSE-2.0
@rem
@rem Unless required by applicable law or agreed to in writing, software
@rem distributed under the License is distributed on an "AS IS" BASIS,
@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
@rem See the License for the specific language governing permissions and
@rem limitations under the License.

@rem Set Hadoop-specific environment variables here.

@rem The only required environment variable is JAVA_HOME.  All others are
@rem optional.  When running a distributed configuration it is best to
@rem set JAVA_HOME in this file, so that it is correctly defined on
@rem remote nodes.

@rem The java implementation to use.  Required.
set JAVA_HOME=%JAVA_HOME%

@rem The jsvc implementation to use. Jsvc is required to run secure datanodes.
@rem set JSVC_HOME=%JSVC_HOME%

@rem set HADOOP_CONF_DIR=

@rem Extra Java CLASSPATH elements.  Automatically insert capacity-scheduler.
if exist %HADOOP_HOME%\contrib\capacity-scheduler (
  if not defined HADOOP_CLASSPATH (
    set HADOOP_CLASSPATH=%HADOOP_HOME%\contrib\capacity-scheduler\*.jar
  ) else (
    set HADOOP_CLASSPATH=%HADOOP_CLASSPATH%;%HADOOP_HOME%\contrib\capacity-scheduler\*.jar
  )
)

@rem The maximum amount of heap to use, in MB. Default is 1000.
@rem set HADOOP_HEAPSIZE=
@rem set HADOOP_NAMENODE_INIT_HEAPSIZE=""

@rem Extra Java runtime options.  Empty by default.
@rem set HADOOP_OPTS=%HADOOP_OPTS% -Djava.net.preferIPv4Stack=true

@rem Command specific options appended to HADOOP_OPTS when specified
if not defined HADOOP_SECURITY_LOGGER (
  set HADOOP_SECURITY_LOGGER=INFO,RFAS
)
if not defined HDFS_AUDIT_LOGGER (
  set HDFS_AUDIT_LOGGER=INFO,NullAppender
)

set HADOOP_NAMENODE_OPTS=-Dhadoop.security.logger=%HADOOP_SECURITY_LOGGER% -Dhdfs.audit.logger=%HDFS_AUDIT_LOGGER% %HADOOP_NAMENODE_OPTS%
set HADOOP_DATANODE_OPTS=-Dhadoop.security.logger=ERROR,RFAS %HADOOP_DATANODE_OPTS%
set HADOOP_SECONDARYNAMENODE_OPTS=-Dhadoop.security.logger=%HADOOP_SECURITY_LOGGER% -Dhdfs.audit.logger=%HDFS_AUDIT_LOGGER% %HADOOP_SECONDARYNAMENODE_OPTS%

@rem The following applies to multiple commands (fs, dfs, fsck, distcp etc)
set HADOOP_CLIENT_OPTS=-Xmx512m %HADOOP_CLIENT_OPTS%
@rem set HADOOP_JAVA_PLATFORM_OPTS="-XX:-UsePerfData %HADOOP_JAVA_PLATFORM_OPTS%"

@rem On secure datanodes, user to run the datanode as after dropping privileges
set HADOOP_SECURE_DN_USER=%HADOOP_SECURE_DN_USER%

@rem Where log files are stored.  %HADOOP_HOME%/logs by default.
@rem set HADOOP_LOG_DIR=%HADOOP_LOG_DIR%\%USERNAME%

@rem Where log files are stored in the secure data environment.
set HADOOP_SECURE_DN_LOG_DIR=%HADOOP_LOG_DIR%\%HADOOP_HDFS_USER%

@rem
@rem Router-based HDFS Federation specific parameters
@rem Specify the JVM options to be used when starting the RBF Routers.
@rem These options will be appended to the options specified as HADOOP_OPTS
@rem and therefore may override any similar flags set in HADOOP_OPTS
@rem
@rem set HADOOP_DFSROUTER_OPTS=""
@rem

@rem The directory where pid files are stored. /tmp by default.
@rem NOTE: this should be set to a directory that can only be written to by
@rem       the user that will run the hadoop daemons.  Otherwise there is the
@rem       potential for a symlink attack.
set HADOOP_PID_DIR=%HADOOP_PID_DIR%
set HADOOP_SECURE_DN_PID_DIR=%HADOOP_PID_DIR%

@rem A string representing this instance of hadoop. %USERNAME% by default.
set HADOOP_IDENT_STRING=%USERNAME%

@rem The four lines below are the ones added for this setup.
set HADOOP_PREFIX=D:\study\bigdata\hadoop\hadoop-3.1.0
set HADOOP_CONF_DIR=%HADOOP_PREFIX%\etc\hadoop
set YARN_CONF_DIR=%HADOOP_CONF_DIR%
set PATH=%PATH%;%HADOOP_PREFIX%\bin
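One hedged note on the JAVA_HOME line above: hadoop-env.cmd tends to break if JAVA_HOME contains spaces (e.g. C:\Program Files\Java\...). A common workaround is the 8.3 short path; the JDK folder name below is a placeholder, so adjust it to your install:

rem PROGRA~1 is the short name for "Program Files"; jdk1.8.0_xxx is a placeholder
set JAVA_HOME=C:\PROGRA~1\Java\jdk1.8.0_xxx

Running dir /x C:\ shows the actual short names on your machine.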

core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
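fs.defaultFS makes hdfs://localhost:9000 the default filesystem, so HDFS paths resolve against it. For example (a sketch, usable once the daemons from step 8 are running), these two commands are equivalent:

hdfs dfs -ls /
hdfs dfs -ls hdfs://localhost:9000/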

hdfs-site.xml

<configuration>
  <!-- Single-node setup, so replication is 1 -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <!-- Note the path format: a forward slash before the drive letter D -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/D:/study/bigdata/hadoop/hadoop-3.1.0/data/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/D:/study/bigdata/hadoop/hadoop-3.1.0/data/datanode</value>
  </property>
</configuration>
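To confirm a value is actually being picked up once the environment is initialized (step 6), hdfs getconf can echo any key back, e.g.:

hdfs getconf -confKey dfs.replication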

mapred-site.xml

The text inside the <description> tags can be deleted.

<configuration>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>/D:\study\bigdata\hadoop\hadoop-3.1.0/share/hadoop/mapreduce/*, /D:\study\bigdata\hadoop\hadoop-3.1.0/share/hadoop/mapreduce/lib/*</value>
    <description>
      CLASSPATH for MR applications. A comma-separated list of CLASSPATH
      entries. If mapreduce.application.framework is set then this must
      specify the appropriate classpath for that archive, and the name of
      the archive must be present in the classpath.
      If mapreduce.app-submission.cross-platform is false, platform-specific
      environment variable expansion syntax would be used to construct the
      default CLASSPATH entries.
      For Linux:
      $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,
      $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*.
      For Windows:
      %HADOOP_MAPRED_HOME%/share/hadoop/mapreduce/*,
      %HADOOP_MAPRED_HOME%/share/hadoop/mapreduce/lib/*.
      If mapreduce.app-submission.cross-platform is true, platform-agnostic
      default CLASSPATH for MR applications would be used:
      {{HADOOP_MAPRED_HOME}}/share/hadoop/mapreduce/*,
      {{HADOOP_MAPRED_HOME}}/share/hadoop/mapreduce/lib/*
      Parameter expansion marker will be replaced by NodeManager on container
      launch based on the underlying OS accordingly.
    </description>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

yarn-site.xml

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
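With mapreduce.framework.name set to yarn and the shuffle handler configured, MapReduce jobs run on YARN. A quick hedged check once everything is up (step 8) is the bundled examples jar (path per this post's layout; the jar name follows the 3.1.0 release):

hadoop jar D:\study\bigdata\hadoop\hadoop-3.1.0\share\hadoop\mapreduce\hadoop-mapreduce-examples-3.1.0.jar pi 2 5

The running job should then show up on the 8088 page from step 10.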

5. winutils: running Hadoop on Windows requires winutils and native files such as hadoop.dll

Notes:

a. hadoop.dll and the other native files must not conflict with your Hadoop version. To avoid dependency errors, also place a copy of hadoop.dll in C:\Windows\System32.
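From an administrator cmd window, that copy is one line (a sketch, assuming the winutils files are already in the Hadoop bin directory per step 5):

copy /y D:\study\bigdata\hadoop\hadoop-3.1.0\bin\hadoop.dll C:\Windows\System32\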

b. Get the winutils build that matches your Hadoop version exactly; otherwise you will run into all sorts of obscure problems.

Below is the download address for the 3.1.0 build:

After downloading, replace the contents of D:\study\bigdata\hadoop\hadoop-3.1.0\bin with the downloaded files.

6. Go to D:\study\bigdata\hadoop\hadoop-3.1.0\etc\hadoop and run hadoop-env.cmd as administrator to initialize the environment (substitute your own extraction path).

It can also be run from a regular cmd window.
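A quick way to confirm the environment was picked up is to run the following in the same window; if it prints the version banner instead of a JAVA_HOME error, the setup is working:

hadoop version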

7. Go to D:\study\bigdata\hadoop\hadoop-3.1.0\bin and run:

hdfs namenode -format

Note: run the format only once. Because my configuration was wrong the first time, I ran it several times, and afterwards the DataNode kept failing to start with all kinds of errors; I only got it working again by deleting the files, re-extracting, and configuring everything from scratch.
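If you do end up needing to re-format after fixing a bad config, a cleanup sketch that matches the data directories configured in hdfs-site.xml above: stop everything, delete both data directories so the NameNode and DataNode stay consistent, then format once:

stop-all.cmd
rmdir /s /q D:\study\bigdata\hadoop\hadoop-3.1.0\data\namenode
rmdir /s /q D:\study\bigdata\hadoop\hadoop-3.1.0\data\datanode
hdfs namenode -format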

8. Go to D:\study\bigdata\hadoop\hadoop-3.1.0\sbin and run:

start-dfs.cmd starts HDFS only.

start-all.cmd starts all of the daemons (recommended).

9. If startup produces no errors, it succeeded.

Verify with jps; output like the following indicates success:

D:\study\bigdata\hadoop\hadoop-3.1.0\sbin>jps
16032 NameNode
15956 ResourceManager
16996 NodeManager
17268 DataNode
19160 Jps

10. Visit http://127.0.0.1:8088/ to see the status of all cluster nodes.

Visit http://localhost:9870/ to open the HDFS file-management page.
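As a final smoke test (a sketch using the standard HDFS shell; D:\somefile.txt is a placeholder for any local file), create a directory, upload a file, and list it; the file should then also appear in the file-management page above:

hdfs dfs -mkdir /test
hdfs dfs -put D:\somefile.txt /test
hdfs dfs -ls /test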

11. For anything beyond this, search Baidu yourself.
