public class TeraInputFormat extends FileInputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>

Fields inherited from class FileInputFormat: LOG

| Constructor and Description |
|---|
| TeraInputFormat() |

| Modifier and Type | Method and Description |
|---|---|
| RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text> | getRecordReader(InputSplit split, JobConf job, Reporter reporter): Get the RecordReader for the given InputSplit. |
| InputSplit[] | getSplits(JobConf conf, int splits): Splits files returned by FileInputFormat.listStatus(JobConf) when they're too big. |
| static void | writePartitionFile(JobConf conf, org.apache.hadoop.fs.Path partFile): Use the input splits to take samples of the input and generate sample keys. |
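
The writePartitionFile summary says only that the input is sampled and sample keys are generated. The sketch below is one plausible reading of that step, not TeraInputFormat's actual code: it assumes the sampled keys are sorted and that evenly spaced elements are kept as partition cut points. The class PartitionSampleSketch and the methods chooseBoundaries and partitionFor are hypothetical names, and plain String keys stand in for org.apache.hadoop.io.Text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PartitionSampleSketch {

    /**
     * Picks (partitions - 1) cut points from a list of sampled keys.
     * Assumption: evenly spaced elements of the sorted sample are used;
     * the real sampling strategy is not described in this page.
     */
    public static List<String> chooseBoundaries(List<String> sampledKeys, int partitions) {
        if (partitions < 1) {
            throw new IllegalArgumentException("partitions must be >= 1");
        }
        if (partitions > 1 && sampledKeys.isEmpty()) {
            throw new IllegalArgumentException("need at least one sampled key");
        }
        List<String> sorted = new ArrayList<>(sampledKeys);
        Collections.sort(sorted);
        List<String> boundaries = new ArrayList<>();
        for (int i = 1; i < partitions; i++) {
            // Index of the i-th evenly spaced element in the sorted sample.
            int index = (int) ((long) i * sorted.size() / partitions);
            boundaries.add(sorted.get(index));
        }
        return boundaries;
    }

    /** Returns the index of the partition whose key range contains the key. */
    public static int partitionFor(String key, List<String> boundaries) {
        int p = 0;
        while (p < boundaries.size() && key.compareTo(boundaries.get(p)) >= 0) {
            p++;
        }
        return p;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("m", "c", "x", "a", "q", "g", "t", "e");
        List<String> boundaries = chooseBoundaries(sample, 4);
        System.out.println(boundaries);                    // [e, m, t]
        System.out.println(partitionFor("b", boundaries)); // 0
        System.out.println(partitionFor("z", boundaries)); // 3
    }
}
```

With cut points chosen this way, each partition receives roughly the same share of the sampled keys, which is the usual reason for sampling before a total-order sort.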

Methods inherited from class FileInputFormat: addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, isSplitable, listStatus, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize

public static void writePartitionFile(JobConf conf, org.apache.hadoop.fs.Path partFile) throws java.io.IOException

Parameters:
conf - the job to sample
partFile - where to write the output file to

Throws:
java.io.IOException - if something goes wrong

public RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text> getRecordReader(InputSplit split, JobConf job, Reporter reporter) throws java.io.IOException

Description copied from interface: InputFormat

Get the RecordReader for the given InputSplit.
It is the responsibility of the RecordReader to respect
record boundaries while processing the logical split to present a
record-oriented view to the individual task.

Specified by:
getRecordReader in interface InputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>
getRecordReader in class FileInputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>

Parameters:
split - the InputSplit
job - the job that this split belongs to

Returns:
a RecordReader

Throws:
java.io.IOException

public InputSplit[] getSplits(JobConf conf, int splits) throws java.io.IOException

Description copied from class: FileInputFormat

Splits files returned by FileInputFormat.listStatus(JobConf) when they're too big.

Specified by:
getSplits in interface InputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>

Overrides:
getSplits in class FileInputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>

Parameters:
conf - job configuration.
splits - the desired number of splits, a hint.

Returns:
an array of InputSplits for the job.

Throws:
java.io.IOException

Copyright © 2009 The Apache Software Foundation
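
To illustrate why the splits argument of getSplits is only a hint, here is a minimal, Hadoop-free sketch of cutting files into byte ranges. It assumes the split size is chosen as max(minSize, min(goalSize, blockSize)), where goalSize is the total input size divided by the desired number of splits; that formula is an assumption about the inherited computeSplitSize, not something this page states. SplitSketch, Range, splitSize, and split are hypothetical names, and the sketch ignores details such as block locations.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {

    /** A byte range of one file; a stand-in for a file-based InputSplit. */
    public static final class Range {
        public final String file;
        public final long start;
        public final long length;

        Range(String file, long start, long length) {
            this.file = file;
            this.start = start;
            this.length = length;
        }

        @Override
        public String toString() {
            return file + ":" + start + "+" + length;
        }
    }

    /** Assumed split-size rule: clamp the goal size between minSize and blockSize. */
    public static long splitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    /** Cuts every file into consecutive ranges of at most the computed split size. */
    public static List<Range> split(String[] files, long[] lengths,
                                    int desiredSplits, long minSize, long blockSize) {
        long total = 0;
        for (long len : lengths) {
            total += len;
        }
        long goalSize = total / Math.max(1, desiredSplits);
        // Guard against a zero size so the loop below always makes progress.
        long size = Math.max(1, splitSize(goalSize, minSize, blockSize));

        List<Range> ranges = new ArrayList<>();
        for (int i = 0; i < files.length; i++) {
            long offset = 0;
            while (offset < lengths[i]) {
                long len = Math.min(size, lengths[i] - offset);
                ranges.add(new Range(files[i], offset, len));
                offset += len;
            }
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Two files of 250 and 50 bytes, three splits requested.
        List<Range> ranges = split(new String[] {"a", "b"}, new long[] {250, 50}, 3, 1, 1000);
        System.out.println(ranges); // [a:0+100, a:100+100, a:200+50, b:0+50]
    }
}
```

Three splits were requested but four ranges come back, because ranges never span files and the last piece of each file may be shorter than the target size.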