Many people who are just starting to learn Hadoop begin with an example like the one below:
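package com.xkq.hadoop.counter;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MyHadoopCounter {
    // Emits every input line under the single key "Info".
    public static class MyHadoopMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            System.out.println(" ===== " + value);
            context.write(new Text("Info"), value);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.addResource("core-site.xml");
        Job myJob = new Job(conf, "MyJob");
        myJob.setJarByClass(MyHadoopCounter.class);
        myJob.setMapperClass(MyHadoopMapper.class);
        // The mapper writes Text/Text, so declare the output types to match.
        myJob.setOutputKeyClass(Text.class);
        myJob.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(myJob, new Path("data"));
        FileOutputFormat.setOutputPath(myJob, new Path("out1"));
        System.exit(myJob.waitForCompletion(true) ? 0 : 1);
    }
}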
(Note: I copied this example from another user.)
When the job runs, the system throws an exception:
2011-12-17 17:17:37,912 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201112171704_0004_m_000000_0: java.lang.RuntimeException: java.lang.ClassNotFoundException: com.xkq.hadoop.counter.MyHadoopCounter$MyHadoopMapper
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:866)
at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:199)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:719)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:261)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
at org.apache.hadoop.mapred.Child.main(Child.java:255)
Caused by: java.lang.ClassNotFoundException: com.xkq.hadoop.counter.MyHadoopCounter$MyHadoopMapper
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:819)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:864)
... 8 more
2011-12-17 17:17:40,921 INFO org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201112171704_0004_m_000000_0'
2011-12-17 17:17:41,122 INFO org.apache.hadoop.mapred.JobInProgress: Choosing a failed task task_201112171704_0004_m_000000
I am fairly sure most beginners have run into this exception. It is an awkward problem, and the cause is hard to see unless you have read the Hadoop source code, so let me walk through it.
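First, a word about how configuration files are loaded. Calling conf.addResource("core-site.xml") right after Configuration conf = new Configuration() is redundant, because the static initializer of the Configuration class already calls addDefaultResource("core-default.xml") and addDefaultResource("core-site.xml").
In addition, the static initializer of the JobConf class calls Configuration.addDefaultResource("mapred-default.xml") and Configuration.addDefaultResource("mapred-site.xml").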
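In other words, a Job's configuration already includes core-default.xml, core-site.xml, mapred-default.xml, and mapred-site.xml by default. Note that if you do not want these default configuration files, you should create the Configuration like this:
// false skips the default configuration files; true loads them (true is the default)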
Configuration conf = new Configuration(false);
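Now for the cause of the exception above. The problem lies in the statement myJob.setJarByClass(MyHadoopCounter.class). What this call really does is look up the absolute path of the jar that contains MyHadoopCounter and store that path in the job's mapred.jar property. If the class was not loaded from a jar in the current project (for example, when it is run directly from compiled .class files), the job configuration ends up with no mapred.jar entry. When a TaskTracker then runs a task of this job, it has no jar to load the job's classes from and cannot find MyHadoopCounter$MyHadoopMapper, which produces the exception shown above.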
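To make the lookup concrete, here is a minimal sketch of the kind of search setJarByClass relies on, written with the JDK alone. It is my own approximation rather than Hadoop's actual code: the class name JarLocator and the method findContainingJar are hypothetical names for illustration, and treating bootstrap classes as "no jar" is a simplification I chose.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLDecoder;
import java.util.Enumeration;

public class JarLocator {
    /**
     * Returns the path of the jar that holds cls, or null when cls was
     * loaded from somewhere else (for example a plain classes directory).
     */
    static String findContainingJar(Class<?> cls) throws IOException {
        ClassLoader loader = cls.getClassLoader();
        if (loader == null) {
            // Bootstrap classes have no loader; this sketch reports "no jar".
            return null;
        }
        // com.example.Outer$Inner -> com/example/Outer$Inner.class
        String classFile = cls.getName().replace('.', '/') + ".class";
        Enumeration<URL> urls = loader.getResources(classFile);
        while (urls.hasMoreElements()) {
            URL url = urls.nextElement();
            if ("jar".equals(url.getProtocol())) {
                // The URL path looks like file:/path/app.jar!/com/example/Outer.class
                String path = url.getPath();
                if (path.startsWith("file:")) {
                    path = path.substring("file:".length());
                }
                // Keep literal '+' characters intact while decoding %xx escapes.
                path = URLDecoder.decode(path.replace("+", "%2B"), "UTF-8");
                int bang = path.indexOf('!');
                return bang >= 0 ? path.substring(0, bang) : path;
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        String jar = findContainingJar(JarLocator.class);
        System.out.println(jar == null
                ? "class is not in a jar, so mapred.jar would stay unset"
                : "class comes from jar: " + jar);
    }
}
```

Running this from an IDE or a classes directory takes the "not in a jar" branch, which is the situation that leaves mapred.jar unset. Running it from a packaged jar reports the jar path instead.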
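Solutions:
1. Package the code above into a jar and add that jar to the current project.
2. In the client-side configuration file mapred-site.xml, add:
<property>
<name>mapred.jar</name>
<value>absolute path of the jar that contains MyHadoopCounter</value>
</property>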