Stage 1
A user or application submits a job to Hadoop through the Hadoop job client, specifying the following items (a driver sketch follows this list):
- The location of the input and output files in the distributed file system.
- The Java classes, packaged in a jar file, containing the implementations of the map and reduce functions.
- The job configuration, specified by setting parameters particular to the job.
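A minimal driver sketch covering these three items, using the classic MRv1 API (org.apache.hadoop.mapred), which matches the JobTracker/TaskTracker architecture described here. The class name, paths, and parameter values are hypothetical; the mapper and reducer classes are sketched under Stage 3.

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCountDriver.class);
        conf.setJobName("wordcount");

        // Item 1: input/output locations in the distributed file system (HDFS).
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        // Item 2: map and reduce implementations, packaged in the job jar
        // (hypothetical classes; see the sketch under Stage 3).
        conf.setMapperClass(WordCount.WordCountMapper.class);
        conf.setReducerClass(WordCount.WordCountReducer.class);

        // Item 3: job-specific configuration parameters.
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        conf.setNumReduceTasks(2);

        // Hand the job off to the framework (Stage 2).
        JobClient.runJob(conf);
    }
}
```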
Stage 2
The Hadoop job client then submits the job (jar, executable, etc.) and its
configuration to the JobTracker, which assumes responsibility for
distributing the software and configuration to the slave nodes, scheduling
tasks and monitoring them, and providing status and diagnostic information
back to the job client.
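To make the submit/monitor split visible, here is a sketch that submits the job and then polls the JobTracker for status, roughly what JobClient.runJob(conf) does in one call. The class and method names are hypothetical; the JobClient, RunningJob, submitJob, isComplete, mapProgress, reduceProgress, and isSuccessful APIs are real MRv1 ones.

```java
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;

public class SubmitAndMonitor {
    // Submits the configured job to the JobTracker, then polls for
    // progress until completion.
    static void submitAndWait(JobConf conf) throws Exception {
        JobClient client = new JobClient(conf);
        RunningJob job = client.submitJob(conf); // hands jar + config to the JobTracker

        while (!job.isComplete()) {
            // Status and diagnostics flow back from the JobTracker.
            System.out.printf("map %3.0f%%  reduce %3.0f%%%n",
                    job.mapProgress() * 100, job.reduceProgress() * 100);
            Thread.sleep(5000); // poll interval (hypothetical choice)
        }
        System.out.println(job.isSuccessful() ? "Job succeeded" : "Job failed");
    }
}
```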
Stage 3
The TaskTrackers on the various nodes execute the tasks per the MapReduce
implementation, and the output of the reduce function is written to the
output files on the file system.
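A sketch of what the TaskTrackers actually run: a word-count map and reduce pair in the MRv1 API. The class names are hypothetical (they are the ones referenced by the Stage 1 driver above); the framework itself writes the reducer's collected pairs to part files in the job's output directory.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class WordCount {

    // Map task, run by a TaskTracker: emits (word, 1) for every token
    // in its input split.
    public static class WordCountMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        public void map(LongWritable offset, Text line,
                        OutputCollector<Text, IntWritable> out, Reporter reporter)
                throws IOException {
            StringTokenizer tokens = new StringTokenizer(line.toString());
            while (tokens.hasMoreTokens()) {
                word.set(tokens.nextToken());
                out.collect(word, ONE);
            }
        }
    }

    // Reduce task: sums the counts for each word; the framework stores
    // the collected pairs in the output files on the file system.
    public static class WordCountReducer extends MapReduceBase
            implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text word, Iterator<IntWritable> counts,
                           OutputCollector<Text, IntWritable> out, Reporter reporter)
                throws IOException {
            int sum = 0;
            while (counts.hasNext()) {
                sum += counts.next().get();
            }
            out.collect(word, new IntWritable(sum));
        }
    }
}
```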