public class FileSorter
extends Object
Modifier and Type | Class and Description |
---|---|
private static class | `FileSorter.BlockReader` Reads a block of compressed lines from the temporary disk file, and feeds them out one at a time. |
static class | `FileSorter.FileOutput` Advanced API class: write output to a file |
static interface | `FileSorter.Output` Advanced API interface for writing lines from the sorter |
Modifier and Type | Field and Description |
---|---|
private ArrayList | `blockOffsets` Offsets of blocks already written to the temp file |
private ArrayList | `curBlockLines` Buffer of lines in the current block |
private int | `curBlockMem` Approximate amount of memory consumed by the current block of lines |
static int | `DEFAULT_MEM_LIMIT` Default memory limit if none specified |
private int | `memLimit` Approximate limit on the amount of memory to consume during sort |
private int | `nLinesAdded` Count of how many lines were read in |
private static String | `SENTINEL` Sentinel string used to mark end of blocks |
private File | `tmpFile` File to use for temporary disk storage (automatically deleted) |
Modifier | Constructor and Description |
---|---|
protected | `FileSorter()` Protected constructor -- do not construct directly; rather, use one of the simple, intermediate, or advanced API methods below. |
Modifier and Type | Method and Description |
---|---|
void | `addLine(String line)` Add a line to be sorted. |
private static void | `clearFile(File f)` Delete, or at least truncate, the given file (if it exists) |
void | `finish(FileSorter.Output out)` Perform the main work of sorting, sending the results to the specified output. |
private void | `flushBlock()` Flush currently buffered lines to the temporary file. |
static void | `main(String[] args)` Simple command-line interface |
private static int | `memSize(String s)` Give a rough estimate of how much memory a given string takes |
int | `nLinesAdded()` Find out how many lines were added |
static void | `sort(File inFile, File outFile)` Simple API: Sort from an input file to an output file |
static void | `sort(File inFile, File outFile, File tmpDir, int memLimit)` Intermediate API: sort from a file, to a file, using a specified temporary directory and memory limit. |
static FileSorter | `start(File tmpDir, int memLimit)` Advanced API, independent of input and output format. |
public static final int DEFAULT_MEM_LIMIT
private File tmpFile
private int memLimit
private int nLinesAdded
private int curBlockMem
private ArrayList curBlockLines
private ArrayList blockOffsets
private static String SENTINEL
protected FileSorter()
public static void main(String[] args)
public static void sort(File inFile, File outFile) throws IOException
Throws:
IOException
public static void sort(File inFile, File outFile, File tmpDir, int memLimit) throws IOException
Parameters:
inFile - source of input lines, in UTF-8 encoding
outFile - destination of output lines
tmpDir - filesystem directory for temporary storage during sort. If null, then the system default temp directory will be used.
memLimit - approximate max amount of RAM to use during sort
Throws:
IOException
public static FileSorter start(File tmpDir, int memLimit) throws IOException
Parameters:
tmpDir - a filesystem directory to store temporary data during sort.
memLimit - approximate limit on the amount of RAM to use during sort.
Throws:
IOException
public void addLine(String line) throws IOException
Parameters:
line - one line of data to be sorted
Throws:
IOException
public int nLinesAdded()
public void finish(FileSorter.Output out) throws IOException
Throws:
IOException
private void flushBlock() throws IOException
Throws:
IOException
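The members documented above (curBlockLines, curBlockMem, memLimit, blockOffsets, SENTINEL, flushBlock, finish) suggest a block-based external sort: buffer lines until the memory estimate reaches the limit, write each sorted block to the temporary file, then merge the blocks. The following is a minimal sketch of that general technique, not the FileSorter source. The class name ExternalSortSketch, the sentinel value, the size estimate of 40 + 2 * length bytes, the use of RandomAccessFile.writeUTF, and the Consumer<String> output are all assumptions made for illustration, and block compression is left out.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.function.Consumer;

/** Hypothetical sketch of a block-based external sort; not the FileSorter source. */
final class ExternalSortSketch {
    // Assumed end-of-block marker; the real SENTINEL value is not documented.
    // An input line equal to this marker would end its block early.
    private static final String SENTINEL = "\0END\0";

    /** Sorts input, buffering roughly memLimit bytes at a time, and streams results to out. */
    static void sort(Iterable<String> input, int memLimit, Consumer<String> out) throws IOException {
        File tmpFile = File.createTempFile("sortsketch", ".tmp");
        try {
            List<Long> blockOffsets = new ArrayList<>();
            List<String> curBlockLines = new ArrayList<>();
            int curBlockMem = 0;
            try (RandomAccessFile raf = new RandomAccessFile(tmpFile, "rw")) {
                for (String line : input) {
                    curBlockLines.add(line);
                    curBlockMem += 40 + 2 * line.length(); // crude size estimate (assumption)
                    if (curBlockMem >= memLimit) {
                        flushBlock(raf, curBlockLines, blockOffsets);
                        curBlockMem = 0;
                    }
                }
                if (!curBlockLines.isEmpty()) flushBlock(raf, curBlockLines, blockOffsets);
            }
            merge(tmpFile, blockOffsets, out);
        } finally {
            Files.deleteIfExists(tmpFile.toPath()); // temp file is always removed
        }
    }

    /** Sorts the buffered lines, appends them plus a sentinel, and records the block's offset. */
    private static void flushBlock(RandomAccessFile raf, List<String> block, List<Long> offsets)
            throws IOException {
        Collections.sort(block);
        offsets.add(raf.getFilePointer());
        for (String s : block) raf.writeUTF(s); // writeUTF caps each line at 65535 encoded bytes
        raf.writeUTF(SENTINEL);
        block.clear();
    }

    /** K-way merge: a heap holds the current head line of every block. */
    private static void merge(File tmpFile, List<Long> offsets, Consumer<String> out)
            throws IOException {
        List<RandomAccessFile> readers = new ArrayList<>();
        PriorityQueue<Map.Entry<String, Integer>> heap =
            new PriorityQueue<Map.Entry<String, Integer>>((a, b) -> a.getKey().compareTo(b.getKey()));
        try {
            for (int i = 0; i < offsets.size(); i++) {
                RandomAccessFile r = new RandomAccessFile(tmpFile, "r");
                readers.add(r);
                r.seek(offsets.get(i));
                advance(r, i, heap);
            }
            while (!heap.isEmpty()) {
                Map.Entry<String, Integer> head = heap.poll();
                out.accept(head.getKey());
                advance(readers.get(head.getValue()), head.getValue(), heap);
            }
        } finally {
            for (RandomAccessFile r : readers) r.close();
        }
    }

    /** Reads the next line of block idx and queues it, unless the block is exhausted. */
    private static void advance(RandomAccessFile r, int idx,
            PriorityQueue<Map.Entry<String, Integer>> heap) throws IOException {
        String line = r.readUTF();
        if (!line.equals(SENTINEL)) {
            heap.add(new AbstractMap.SimpleEntry<String, Integer>(line, idx));
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = Arrays.asList("pear", "apple", "fig", "banana", "cherry");
        sort(lines, 100, System.out::println); // tiny limit forces two blocks
    }
}
```

With the 100-byte limit used in main, the five lines are split into two blocks on disk and then printed in sorted order. A larger limit trades memory for fewer blocks and fewer open readers during the merge.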
private static void clearFile(File f) throws IOException
Throws:
IOException
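The summary describes clearFile as "Delete, or at least truncate, the given file (if it exists)". One plausible way to get that behavior is sketched below; the class name ClearFileSketch and the choice of FileOutputStream for truncation are assumptions, and the actual implementation may differ.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

/** Hypothetical sketch of "delete, or at least truncate"; not the FileSorter source. */
final class ClearFileSketch {
    /** Removes f if possible; otherwise empties it. Does nothing if f does not exist. */
    static void clearFile(File f) throws IOException {
        if (!f.exists()) return;
        if (f.delete()) return;
        // Deletion failed (for example, the file is held open elsewhere), so
        // fall back to truncation: opening a FileOutputStream without append
        // mode resets the file length to zero.
        new FileOutputStream(f).close();
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("clearsketch", ".tmp");
        try (FileOutputStream out = new FileOutputStream(f)) {
            out.write(new byte[] {1, 2, 3});
        }
        clearFile(f);
        System.out.println(f.exists() ? "truncated to " + f.length() + " bytes" : "deleted");
    }
}
```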
private static int memSize(String s)
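The page does not say how memSize arrives at its rough estimate. As an illustration only, a common approximation charges a fixed per-object overhead plus two bytes per character, rounded up to an 8-byte boundary. The class name MemSizeSketch and the 40-byte overhead below are assumptions; real figures depend on the JVM, pointer compression, and string encoding.

```java
/** Hypothetical sketch of a rough String memory estimate; not the FileSorter source. */
final class MemSizeSketch {
    // Assumed fixed cost of the String object plus its backing array header.
    private static final int OVERHEAD_BYTES = 40;

    /** Returns an approximate byte count for s, assuming 2 bytes per char. */
    static int memSize(String s) {
        long raw = OVERHEAD_BYTES + 2L * s.length();
        long aligned = (raw + 7) / 8 * 8; // round up to 8-byte alignment
        return (int) Math.min(aligned, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println(memSize(""));    // prints 40
        System.out.println(memSize("abc")); // prints 48
    }
}
```

An estimate like this only needs to be consistent, since it is compared against an approximate memory limit rather than used for exact accounting.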