The main problem this approach does not attempt to solve is that DB2 unload files store numeric fields as their raw binary value rather than as digit characters (e.g., the number 84 is unloaded as the single byte that ASCII renders as "T", not as the text "84"). The code does not read the DB2 "punch" (i.e., parse instruction) files, so it makes no attempt to split the data into fields; that is a separate exercise in my case. BTW, if there is a good way to import these files into Oracle automatically, please let me know, as I have not been able to find a better solution.
This code is fairly generic and can be used for purposes other than converting DB2 unload files, so if you need to replace non-printable characters in text files, you can start with this code base.
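To make the numeric-field problem concrete, here is a minimal sketch of the same byte read as text and as a number. The class and method names are hypothetical, and it assumes a one-byte value purely for illustration; decoding real unload fields needs the layout described in the punch file.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical demo: one raw byte read two ways. */
final class RawByteDemo {

    /** Interprets the byte as an unsigned number (0-255). */
    static int asNumber(final byte raw) {
        return raw & 0xFF;
    }

    /** Interprets the same byte as a single ASCII character. */
    static String asText(final byte raw) {
        return new String(new byte[] {raw}, StandardCharsets.US_ASCII);
    }

    public static void main(final String[] args) {
        final byte raw = 84; // the value as it sits in the unload file
        System.out.println(asText(raw) + " vs " + asNumber(raw)); // prints "T vs 84"
    }
}
```

The converter itself follows.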
package com.threeleaf.bin2txt;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/**
 * Purpose is to read a file and replace non-printable characters with a given character.
 * Specifically, I want to use this to make DB2 unload files parsable with other applications so
 * that the data can be imported into Oracle.
 *
 * @author John A. Marsh
 * @since 2011-10-27
 */
public final class Bin2Txt {
    /**
     * Run this class from the command line with:
     * java com.threeleaf.bin2txt.Bin2Txt <pathAndFilename>
     *
     * @param args
     *            the filename to convert
     * @throws IOException
     *             Signals that an I/O exception (e.g., file not found) has occurred.
     */
    public static void main(final String[] args) throws IOException {
        final byte ASCII_SPACE = 32;
        final byte ASCII_CR = 13;
        final byte ASCII_LF = 10;
        final byte ASCII_TILDE = 126;
        try {
            final File file = new File(args[0]);
            final InputStream inputStream = new FileInputStream(file);
            final long fileLength = file.length();
            /*
             * Array needs to be created with an int type, so need to check to ensure that file is
             * not larger than Integer.MAX_VALUE.
             */
            if (fileLength > Integer.MAX_VALUE) {
                throw new IOException("File is too big");
            }
            /* Create the byte array to hold the data */
            final byte[] bytes = new byte[(int) fileLength];
            /* Read in the bytes */
            int offset = 0;
            int numRead = 0;
            while (offset < bytes.length
                    && (numRead = inputStream.read(bytes, offset, bytes.length - offset)) >= 0) {
                offset += numRead;
            }
            /* Ensure all the bytes have been read in */
            if (offset < bytes.length) {
                throw new IOException("Could not completely read file " + file.getName());
            }
            inputStream.close();
            for (int i = 0; i < bytes.length; i++) {
                if (bytes[i] == ASCII_CR && i + 1 < bytes.length && bytes[i + 1] == ASCII_LF) {
                    /*
                     * Preserve line breaks (carriage return + line feed) by skipping over them.
                     * The bounds check keeps a trailing carriage return from reading past the end
                     * of the array, and "continue" ensures the byte after the line break gets
                     * the same CRLF test (so blank lines survive).
                     */
                    i++;
                    continue;
                }
                if (bytes[i] < ASCII_SPACE || bytes[i] > ASCII_TILDE) {
                    /* Replace all non-printable characters. */
                    bytes[i] = ASCII_TILDE;
                }
            }
            /* Output file name will be the same as the input, with ".out.txt" added to the end. */
            final OutputStream outputStream = new FileOutputStream(args[0] + ".out.txt");
            outputStream.write(bytes);
            outputStream.close();
        } catch (final ArrayIndexOutOfBoundsException e) {
            /*
             * If no file was passed on the command line, this exception is generated. A message
             * indicating how the class should be called is displayed.
             */
            System.out.println("Usage: java com.threeleaf.bin2txt.Bin2Txt filename\n");
        }
    }
}
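For reuse outside DB2 work, the replacement loop in the listing above can be pulled out into a method. What follows is only a sketch under my own assumptions: the class name `NonPrintableFilter` and the method `replaceNonPrintable` are hypothetical, the replacement byte is passed in as a parameter, and the array bounds are checked before looking ahead for the line feed.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: the replacement loop as a reusable method. */
final class NonPrintableFilter {

    private static final byte CR = 13;
    private static final byte LF = 10;
    private static final byte SPACE = 32;
    private static final byte TILDE = 126;

    /**
     * Returns a copy of data in which every byte outside the printable
     * ASCII range (32-126) is replaced, leaving CR+LF pairs untouched.
     */
    static byte[] replaceNonPrintable(final byte[] data, final byte replacement) {
        final byte[] result = data.clone();
        for (int i = 0; i < result.length; i++) {
            final boolean crlf = result[i] == CR
                    && i + 1 < result.length && result[i + 1] == LF;
            if (crlf) {
                i++; // step onto the LF; the loop increment then moves past the pair
                continue;
            }
            if (result[i] < SPACE || result[i] > TILDE) {
                result[i] = replacement;
            }
        }
        return result;
    }

    public static void main(final String[] args) {
        final byte[] sample = {'A', 0, 'B', CR, LF, (byte) 200, 'C'};
        final byte[] cleaned = replaceNonPrintable(sample, (byte) '?');
        System.out.println(new String(cleaned, StandardCharsets.US_ASCII));
    }
}
```

Because Java bytes are signed, values 128-255 show up as negative numbers and are caught by the "less than space" comparison, so no separate test for them is needed.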
Here is a batch file that will convert all the files in a given directory:
:: Directory where Bin2Txt.class is located ::
cd C:\projects\workspace\bin2txt\bin\
:: Put in directory where unload files are ::
for %%f in ("C:\projects\Database\Unloads\*.txt") do call java com.threeleaf.bin2txt.Bin2Txt "%%f"
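If a Windows shell is not available, the same directory loop can be sketched in Java. This is an illustration rather than part of the original tool: the class `Bin2TxtDirectory` and its methods are hypothetical, the cleaning step is simplified to keep every CR and LF byte (not only CR+LF pairs), and files already ending in ".out.txt" are skipped so that a second run does not convert its own output.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical alternative to the batch file: convert every *.txt file in a directory. */
final class Bin2TxtDirectory {

    /** Simplified cleaning: bytes outside 32-126 become '~', except CR (13) and LF (10). */
    static byte[] clean(final byte[] data) {
        final byte[] result = data.clone();
        for (int i = 0; i < result.length; i++) {
            final byte b = result[i];
            if (b != 13 && b != 10 && (b < 32 || b > 126)) {
                result[i] = 126;
            }
        }
        return result;
    }

    /** Writes name.out.txt next to each matching file and returns how many were converted. */
    static int convertAll(final Path directory) throws IOException {
        int converted = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(directory, "*.txt")) {
            for (final Path file : files) {
                if (file.getFileName().toString().endsWith(".out.txt")) {
                    continue; // skip output left over from an earlier run
                }
                final Path target = file.resolveSibling(file.getFileName() + ".out.txt");
                Files.write(target, clean(Files.readAllBytes(file)));
                converted++;
            }
        }
        return converted;
    }

    public static void main(final String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("Usage: java Bin2TxtDirectory <directory>");
            return;
        }
        System.out.println(convertAll(Paths.get(args[0])) + " file(s) converted");
    }
}
```

Like the original class, this reads each file fully into memory, so it suits unload files that fit comfortably in the heap.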