Thursday, February 7, 2013

Common Newbie Pitfalls in Hadoop Java Applications

These are common issues I ran into while writing Java applications for Hadoop, along with how I resolved them.

Class not Found Exceptions


java.lang.RuntimeException: java.lang.ClassNotFoundException: net.victa.hadoop.example.WordCount$MapClass
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:865)
at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:195)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:593)

It turns out that in your code, you have to add the line
job.setJarByClass(WordCount.class);
Otherwise Hadoop has no job jar to ship to its nodes, so they have no idea how to find your classes. I found this from this site (just uncomment the line there and the classes will load)

This log message appeared just before the Class Not Found exceptions in my Hadoop log:
 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
In hindsight the warning makes it pretty clear that no job jar had been set, and the solution was pretty simple as well.
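To get a feel for why setJarByClass wants a class from your own jar, here is a small plain-Java sketch that asks the class loader where a class's bytecode came from. It is only an illustration: ClassLocator and locate are names I made up, and I am assuming Hadoop's own lookup works along broadly similar lines rather than showing its actual code.

```java
import java.net.URL;

final class ClassLocator {
    /**
     * Returns the URL the class file was loaded from, or null if it cannot be found.
     * For a class packaged in a jar this is typically a "jar:file:...!/..." URL,
     * which is enough to work out which jar contains the class.
     */
    static URL locate(Class<?> cls) {
        // net.victa.hadoop.example.WordCount$MapClass -> net/victa/hadoop/example/WordCount$MapClass.class
        String resource = cls.getName().replace('.', '/') + ".class";
        ClassLoader loader = cls.getClassLoader();
        // Core JDK classes report a null class loader, so fall back to the system lookup.
        return loader != null
                ? loader.getResource(resource)
                : ClassLoader.getSystemResource(resource);
    }
}
```

Calling ClassLocator.locate(WordCount.class) from a job launched out of a jar should point back into that jar, which is the piece of information the job needs before it can ship your classes anywhere.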

Mismatch Key from Map

java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.LongWritable
This was a bit more annoying. It turns out that the Hadoop in Action book was somewhat outdated, so its code examples didn't quite work: much of the API they used had since been deprecated. This site explained it better. Afterwards I copied the files from this site to make it work.
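One common way to hit this error when mixing old-API examples with the new API (I can't say for sure it is what bit me) is a map method whose parameter list doesn't match the base class. Java then treats it as an overload rather than an override, and the framework keeps calling the inherited default map, which passes the input key straight through. With text input that key is the LongWritable byte offset, not the Text you promised. The sketch below reproduces the mechanism in plain Java; BaseMapper, BrokenMapper and FixedMapper are simplified stand-ins I made up, not Hadoop classes.

```java
import java.util.List;

final class OverrideDemo {
    /** Stand-in for a framework mapper: the default map() emits the input key unchanged. */
    static class BaseMapper {
        Object map(Long offset, String line) {
            return offset;
        }
    }

    /** The extra parameter makes this an overload, so the default map() above is still used. */
    static class BrokenMapper extends BaseMapper {
        Object map(Long offset, String line, List<String> collector) {
            return line;
        }
    }

    /** Same parameter list as the base class; @Override makes the compiler verify that. */
    static class FixedMapper extends BaseMapper {
        @Override
        Object map(Long offset, String line) {
            return line;
        }
    }

    /** What the "framework" does: it only ever calls the two-argument map(). */
    static String emittedKeyType(BaseMapper mapper) {
        return mapper.map(0L, "hello world").getClass().getSimpleName();
    }
}
```

Here emittedKeyType(new BrokenMapper()) gives "Long" while emittedKeyType(new FixedMapper()) gives "String". Putting @Override on your real map and reduce methods turns this kind of silent mismatch into a compile error.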

File Does Not Exist and Error Reading Task, Connection Refused

java.io.FileNotFoundException: File does not exist: /experiments/bob
WARN mapred.JobClient: Error reading task outputConnection refused
These issues were just me being silly. I supplied the path /experiments as the word count input directory, but it contained another directory called bob. WordCount treated everything it saw in the input directory as a file and failed when it tried to read bob. So that's another gotcha to be careful of.
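A quick sanity check before submitting a job is to list the input directory and see whether anything in it is not a plain file (on HDFS, hadoop fs -ls /experiments does the job). The sketch below shows the same idea against the local filesystem only, using java.nio; InputFileCheck, regularFiles, demo and input.txt are all hypothetical names for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class InputFileCheck {
    /** Returns the sorted names of regular files directly inside dir; subdirectories are skipped. */
    static List<String> regularFiles(Path dir) {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .map(p -> p.getFileName().toString())
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a throwaway directory shaped like my /experiments case: one file plus a subdirectory "bob". */
    static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("experiments");
            Files.createFile(dir.resolve("input.txt"));
            Files.createDirectory(dir.resolve("bob"));
            return regularFiles(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If the list is shorter than what the directory actually contains, something like bob is lurking in there, and it is worth moving it out or pointing the job at a cleaner input path.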
