Class CsvExternalSort
- java.lang.Object
-
- com.google.code.externalsorting.csv.CsvExternalSort
-
public class CsvExternalSort extends java.lang.Object
-
-
Field Summary
Fields
Modifier and Type | Field | Description
static int | DEFAULTMAXTEMPFILES | Default maximal number of temporary files allowed.
-
Method Summary
Modifier and Type | Method | Description
static long | estimateAvailableMemory() | This method calls the garbage collector and then returns the free memory.
static long | estimateBestSizeOfBlocks(long sizeoffile, int maxtmpfiles, long maxMemory) | We divide the file into small blocks.
static int | mergeSortedFiles(java.io.BufferedWriter fbw, CsvSortOptions sortOptions, java.util.List<CSVRecordBuffer> bfbs, java.util.List<org.apache.commons.csv.CSVRecord> header) |
static int | mergeSortedFiles(java.util.List<java.io.File> files, java.io.File outputfile, CsvSortOptions sortOptions, boolean append, java.util.List<org.apache.commons.csv.CSVRecord> header) |
static java.io.File | sortAndSave(java.util.List<org.apache.commons.csv.CSVRecord> tmplist, java.io.File tmpdirectory, CsvSortOptions sortOptions) |
static java.util.List<java.io.File> | sortInBatch(long size_in_byte, java.io.BufferedReader fbr, java.io.File tmpdirectory, CsvSortOptions sortOptions, java.util.List<org.apache.commons.csv.CSVRecord> header) |
static java.util.List<java.io.File> | sortInBatch(java.io.File file, java.io.File tmpdirectory, CsvSortOptions sortOptions, java.util.List<org.apache.commons.csv.CSVRecord> header) |
-
-
-
Field Detail
-
DEFAULTMAXTEMPFILES
public static final int DEFAULTMAXTEMPFILES
Default maximal number of temporary files allowed.
- See Also:
- Constant Field Values
-
-
Method Detail
-
estimateAvailableMemory
public static long estimateAvailableMemory()
This method calls the garbage collector and then returns the free memory. This avoids problems with applications where the GC hasn't reclaimed memory and reports no available memory.
- Returns:
- available memory
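The behaviour described above can be sketched with the standard `java.lang.Runtime` API. This is a minimal illustration, not the library's source: the class name `MemoryEstimateSketch` is hypothetical, and reading "free memory" as the headroom left below the JVM's maximum heap is an assumption.

```java
final class MemoryEstimateSketch {

    /** Hypothetical stand-in for estimateAvailableMemory(); not the library's code. */
    static long estimateAvailableMemory() {
        // Request a garbage collection first. This is only a hint; the JVM may ignore it.
        System.gc();
        Runtime runtime = Runtime.getRuntime();
        // Heap currently in use = heap reserved so far minus the unused part of it.
        long used = runtime.totalMemory() - runtime.freeMemory();
        // Assumption: "available" means the maximum heap the JVM may grow to, minus what is in use.
        return runtime.maxMemory() - used;
    }

    public static void main(String[] args) {
        System.out.println("Estimated available memory (bytes): " + estimateAvailableMemory());
    }
}
```

Calling `System.gc()` before measuring reduces the chance that garbage not yet collected is counted as used memory, which is the problem the description mentions.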
-
estimateBestSizeOfBlocks
public static long estimateBestSizeOfBlocks(long sizeoffile, int maxtmpfiles, long maxMemory)
We divide the file into small blocks. If the blocks are too small, we shall create too many temporary files. If they are too big, we shall be using too much memory.
- Parameters:
sizeoffile - how much data (in bytes) we can expect
maxtmpfiles - how many temporary files we can create (e.g., 1024)
maxMemory - maximum memory to use (in bytes)
- Returns:
- the estimated block size
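The trade-off described above (too many temporary files versus too much memory) can be made concrete with a small sketch. The heuristic here is an assumption, not something this page confirms: take the smallest block size that keeps the number of blocks within `maxtmpfiles`, then raise it to half of the memory budget if it would otherwise be smaller. The class name `BlockSizeSketch` is hypothetical.

```java
final class BlockSizeSketch {

    /** Hypothetical heuristic; the library's actual formula may differ. */
    static long estimateBestSizeOfBlocks(long sizeOfFile, int maxTmpFiles, long maxMemory) {
        // Ceiling division: the smallest block size that needs at most maxTmpFiles blocks.
        long blockSize = sizeOfFile / maxTmpFiles + (sizeOfFile % maxTmpFiles == 0 ? 0 : 1);
        // Assumption: blocks smaller than half the memory budget only create extra
        // temporary files, so use at least maxMemory / 2.
        if (blockSize < maxMemory / 2) {
            blockSize = maxMemory / 2;
        }
        return blockSize;
    }

    public static void main(String[] args) {
        long tenGiB = 10L * 1024 * 1024 * 1024;
        long halfGiB = 512L * 1024 * 1024;
        // A 10 GiB file, up to 1024 temporary files, 512 MiB of memory.
        System.out.println(estimateBestSizeOfBlocks(tenGiB, 1024, halfGiB));
    }
}
```

Under this heuristic the memory floor dominates for small files, and the temporary-file limit dominates once the file is large relative to the memory budget.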
-
mergeSortedFiles
public static int mergeSortedFiles(java.io.BufferedWriter fbw, CsvSortOptions sortOptions, java.util.List<CSVRecordBuffer> bfbs, java.util.List<org.apache.commons.csv.CSVRecord> header) throws java.io.IOException, java.lang.ClassNotFoundException
- Throws:
java.io.IOException
java.lang.ClassNotFoundException
-
mergeSortedFiles
public static int mergeSortedFiles(java.util.List<java.io.File> files, java.io.File outputfile, CsvSortOptions sortOptions, boolean append, java.util.List<org.apache.commons.csv.CSVRecord> header) throws java.io.IOException, java.lang.ClassNotFoundException
- Throws:
java.io.IOException
java.lang.ClassNotFoundException
-
sortInBatch
public static java.util.List<java.io.File> sortInBatch(long size_in_byte, java.io.BufferedReader fbr, java.io.File tmpdirectory, CsvSortOptions sortOptions, java.util.List<org.apache.commons.csv.CSVRecord> header) throws java.io.IOException
- Throws:
java.io.IOException
-
sortAndSave
public static java.io.File sortAndSave(java.util.List<org.apache.commons.csv.CSVRecord> tmplist, java.io.File tmpdirectory, CsvSortOptions sortOptions) throws java.io.IOException
- Throws:
java.io.IOException
-
sortInBatch
public static java.util.List<java.io.File> sortInBatch(java.io.File file, java.io.File tmpdirectory, CsvSortOptions sortOptions, java.util.List<org.apache.commons.csv.CSVRecord> header) throws java.io.IOException
- Throws:
java.io.IOException
-
-
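The method names and signatures above suggest a two-phase external sort: sort the input in batches that fit in memory and save each batch to a temporary file, then merge the sorted files into one output. The sketch below illustrates that general technique on plain text lines using only the JDK. It does not use `CsvSortOptions` or `CSVRecord`, and it is not the library's implementation; the class `ExternalSortSketch`, the methods `splitIntoSortedRuns`, `sortAndSaveLines`, and `mergeRuns`, and the line-count batch limit are all hypothetical. This page does not say what the `int` returned by `mergeSortedFiles` means, so returning the number of lines written is an assumption.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

final class ExternalSortSketch {

    /** Phase 1: read the input in batches, sort each batch, save it as a temporary "run". */
    static List<Path> splitIntoSortedRuns(Path input, Path tmpDir, int maxLinesPerBatch) throws IOException {
        List<Path> runs = new ArrayList<>();
        List<String> batch = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                batch.add(line);
                if (batch.size() >= maxLinesPerBatch) {
                    runs.add(sortAndSaveLines(batch, tmpDir));
                    batch.clear();
                }
            }
        }
        if (!batch.isEmpty()) {
            runs.add(sortAndSaveLines(batch, tmpDir));
        }
        return runs;
    }

    /** Sorts one in-memory batch and writes it to a new temporary file. */
    static Path sortAndSaveLines(List<String> batch, Path tmpDir) throws IOException {
        Collections.sort(batch);
        Path run = Files.createTempFile(tmpDir, "run", ".tmp");
        Files.write(run, batch, StandardCharsets.UTF_8);
        return run;
    }

    /** One open run together with the line it is currently positioned on. */
    private static final class Run {
        final BufferedReader reader;
        String head;

        Run(BufferedReader reader) throws IOException {
            this.reader = reader;
            this.head = reader.readLine();
        }
    }

    /** Phase 2: k-way merge of the sorted runs; returns the number of lines written. */
    static int mergeRuns(List<Path> runs, Path output) throws IOException {
        // The queue always yields the run whose current line is smallest.
        PriorityQueue<Run> queue = new PriorityQueue<>((a, b) -> a.head.compareTo(b.head));
        int written = 0;
        try (BufferedWriter writer = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            for (Path path : runs) {
                Run run = new Run(Files.newBufferedReader(path, StandardCharsets.UTF_8));
                if (run.head != null) queue.add(run); else run.reader.close();
            }
            while (!queue.isEmpty()) {
                Run run = queue.poll();
                writer.write(run.head);
                writer.newLine();
                written++;
                run.head = run.reader.readLine();
                if (run.head != null) queue.add(run); else run.reader.close();
            }
        } finally {
            for (Run run : queue) run.reader.close();            // only non-empty after a failure
            for (Path path : runs) Files.deleteIfExists(path);   // temporary runs are no longer needed
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("external-sort-sketch");
        Path input = dir.resolve("input.txt");
        Files.write(input, Arrays.asList("pear", "apple", "fig", "kiwi", "banana"), StandardCharsets.UTF_8);
        List<Path> runs = splitIntoSortedRuns(input, dir, 2);
        Path output = dir.resolve("sorted.txt");
        int written = mergeRuns(runs, output);
        System.out.println(written + " lines: " + Files.readAllLines(output, StandardCharsets.UTF_8));
    }
}
```

The batch limit plays the role that the block size plays in `estimateBestSizeOfBlocks`: smaller batches use less memory but produce more temporary files to merge.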