Writing Using Our Previous Example
Let's revisit the sample data format we used when we explored how to parse and read flat files with Flatworm.
| Name | Start | End | Length | Type |
| Type | 1 | 2 | 2 | Char |
| First | 3 | 27 | 25 | Char |
| Middle | 28 | 52 | 25 | Char |
| Last | 53 | 77 | 25 | Char |
| Acct. ID | 78 | 92 | 15 | Char |
Below is the sample flat file we will be writing.
CDJOHN                     MARK                     DOE                      111111111111111
CDPAUL                     RICHARD                  STEPHENS                 222222222222222
CDRINGO                    JACK                     ERICSON                  333333333333333
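As a quick sanity check on the layout, here is a minimal sketch in plain Java (no Flatworm involved) that builds one record by hand from the field lengths in the table. The class and method names are hypothetical and exist only for this illustration. It assumes values are left-justified and space-padded.

```java
/** Sketch: build one fixed-width client record by hand, following the layout table. */
final class FixedWidthRecordSketch {

    /** Left-justifies value in a field of the given width, padding with spaces and truncating if too long. */
    static String pad(String value, int width) {
        String v = (value == null) ? "" : value;
        if (v.length() > width) {
            return v.substring(0, width);
        }
        return String.format("%-" + width + "s", v);
    }

    /** Type (2) + First (25) + Middle (25) + Last (25) + Acct. ID (15). */
    static String clientRecord(String first, String middle, String last, String accountId) {
        return "CD" + pad(first, 25) + pad(middle, 25) + pad(last, 25) + pad(accountId, 15);
    }

    public static void main(String[] args) {
        String line = clientRecord("JOHN", "MARK", "DOE", "111111111111111");
        System.out.println(line);
        System.out.println("Length: " + line.length());
    }
}
```

Every record built this way is 2 + 25 + 25 + 25 + 15 = 92 characters long, which matches the End column of the last field in the table.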
Now that we know the format of our flat file and have some sample data to write, we need to create an XML document describing that format for Flatworm.
<?xml version="1.0" encoding="ISO-8859-1"?>
<!--<!DOCTYPE file-format SYSTEM "http://www.blackbear.com/dtds/flatworm-data-description_1_0.dtd">-->
<file-format>
    <converter name="char" class="com.blackbear.flatworm.converters.CoreConverters" method="convertChar" return-type="java.lang.String"/>
    <record name="clientData">
        <record-ident>
            <field-ident field-start="0" field-length="2">
                <match-string>CD</match-string>
            </field-ident>
        </record-ident>
        <record-definition>
            <bean name="client" class="org.javaconfessions.sample.Client"/>
            <line>
                <record-element length="2"/>
                <record-element length="25" beanref="client.firstName" type="char">
                    <conversion-option name="justify" value="left"/>
                    <conversion-option name="pad-character" value=" "/>
                </record-element>
                <record-element length="25" beanref="client.middleName" type="char">
                    <conversion-option name="justify" value="left"/>
                </record-element>
                <record-element length="25" beanref="client.lastName" type="char">
                    <conversion-option name="justify" value="left"/>
                </record-element>
                <record-element length="15" beanref="client.accountId" type="char">
                    <conversion-option name="justify" value="left"/>
                </record-element>
            </line>
        </record-definition>
    </record>
</file-format>
A Closer Look at the Descriptor File
Looking further at the XML descriptor file, you can see that describing our file format for Flatworm is fairly simple. The record tag begins the description of our client data records. Within the record tag is the record-ident tag, which tells Flatworm how to identify the types of records in a flat file. Most flat file formats contain several record types, such as headers, footers, detail records, batch headers, and batch footers. This mechanism allows Flatworm to tell these record types apart within the same file. The field-ident tag gives the specifics: field-start and field-length identify which part of the line to test, and the match-string tag holds the text that marks a line as a clientData record. In the descriptor above, we have described clientData records as starting with the characters CD.
The record-definition tag is where we map each record element to a bean property for Flatworm. This section of the document starts with a bean definition that tells Flatworm which Java class to use for this record type. The record-element tags set up where each field in the record is located, its data type, and which bean property it corresponds to.
Here is the source code for my Client bean.
package org.javaconfessions.sample;

// Simple bean holding the fields of one clientData record.
public class Client {
    private String firstName;
    private String middleName;
    private String lastName;
    private String accountId;

    public String getFirstName() {
        return firstName;
    }

    public void setFirstName(String pFirstName) {
        firstName = pFirstName;
    }

    public String getMiddleName() {
        return middleName;
    }

    public void setMiddleName(String pMiddleName) {
        middleName = pMiddleName;
    }

    public String getLastName() {
        return lastName;
    }

    public void setLastName(String pLastName) {
        lastName = pLastName;
    }

    public String getAccountId() {
        return accountId;
    }

    public void setAccountId(String pAccountId) {
        accountId = pAccountId;
    }

    @Override
    public String toString() {
        return "First Name: " + firstName + "\nMiddleName: " + middleName
                + "\nLastName: " + lastName + "\nAccount ID: " + accountId
                + "\n";
    }
}
Now that we have described the data model for our project, below is a sample class that populates a Client bean and then has Flatworm write out the data in our specified file format.
package org.javaconfessions.sample;

import com.blackbear.flatworm.FileCreator;

public class ClientDataWriter {
    public static void main(String[] args) {
        FileCreator fileCreator = null;
        try {
            // args[0]: path to the descriptor XML, args[1]: path of the data file to write
            fileCreator = new FileCreator(args[0], args[1]);
            fileCreator.open();
            // End each record with a newline
            fileCreator.setRecordSeperator("\n");

            // Register the bean under the name used in the descriptor file
            Client client = new Client();
            fileCreator.setBean("client", client);

            // First record
            client.setFirstName("JOHN");
            client.setMiddleName("MARK");
            client.setLastName("DOE");
            client.setAccountId("111111111111111");
            fileCreator.write("clientData");

            // Second record
            client.setFirstName("PAUL");
            client.setMiddleName("RICHARD");
            client.setLastName("STEPHENS");
            client.setAccountId("222222222222222");
            fileCreator.write("clientData");

            // Third record
            client.setFirstName("RINGO");
            client.setMiddleName("JACK");
            client.setLastName("ERICSON");
            client.setAccountId("333333333333333");
            fileCreator.write("clientData");
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                // fileCreator is still null if the constructor failed
                if (fileCreator != null) {
                    fileCreator.close();
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}
A Closer Look at ClientDataWriter.java
Looking at the sample code above, we first create a FileCreator object using the parameters passed in. The first parameter is the path to our descriptor XML document, and the second is the location where the data file should be written. Next, we call the open() method on the FileCreator to open our file. The next call sets the record separator for our data file, which in this case is a newline.
With the FileCreator set up, we instantiate a Client bean and call setBean() on the FileCreator to tell it about our Client bean and which bean it is. Notice that "client" is the name of our bean in the descriptor file. The next part is fairly self-explanatory: we set up our first client record, then call fileCreator.write(), passing in the type of record we want to write, in this case a "clientData" record. We then repeat this for the second and third records.
After writing out the records, we call the close() method on the FileCreator to close the data file.
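Once the file has been written, it can be worth checking that every line follows the layout. The sketch below is plain Java and independent of Flatworm. It assumes each record comes out at the full 92 characters and starts with CD, as in the sample file at the top of this post. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Sketch: check that each line of the written file matches the clientData layout. */
final class ClientFileCheckSketch {
    static final int RECORD_LENGTH = 92; // 2 + 25 + 25 + 25 + 15

    /** Returns the 1-based numbers of lines that are not 92 characters long or do not start with "CD". */
    static List<Integer> badLines(List<String> lines) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            if (line.length() != RECORD_LENGTH || !line.startsWith("CD")) {
                bad.add(i + 1);
            }
        }
        return bad;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.out.println("Usage: ClientFileCheckSketch <data file>");
            return;
        }
        // args[0]: path of the data file written by ClientDataWriter
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println("Lines failing the layout check: " + badLines(lines));
    }
}
```

An empty list means every line has the expected length and record identifier.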
Gotcha
The Flatworm FileCreator always puts the record identifiers at the beginning of the record in the flat file. This doesn't always work, because sometimes the record identifier is in the middle of the record. I have a patch for the FileCreator below if you want to change this in the source. If you apply this patch, you will need to set up the record identifier portion of the bean description with the default values, in our case "CD". You will have to do the same for every constant value in your file format that isn't supplied by your bean, or your file format will be incorrect.
Index: src/com/blackbear/flatworm/FileCreator.java
--- src/com/blackbear/flatworm/FileCreator.java Base (1.3)
+++ src/com/blackbear/flatworm/FileCreator.java Locally Modified (Based On 1.3)
@@ -212,12 +212,12 @@
// record-ident contain what is considered hard-coded data
// for the output line, these can be used to uniquely identify
// lines for parsers. We need to write them out.
- Vector recIdents = record.getFieldIdentMatchStrings();
+ /*Vector recIdents = record.getFieldIdentMatchStrings();
for (Iterator itRecIdents = recIdents.iterator(); itRecIdents.hasNext();)
{
String id = (String) itRecIdents.next();
bufOut.write(id + delimit);
- }
+ }*/
// Iterate over record-element items
Vector recElements = line.getRecordElements();
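To picture the situation the patch addresses, here is a small plain-Java sketch of a record whose identifier sits in the middle. The layout is hypothetical (account ID, then the constant CD, then last name) and is not taken from a real file format. It shows why the constant has to be supplied as an ordinary element once the FileCreator stops writing it up front.

```java
/** Sketch: a record whose identifier is in the middle rather than at the start. */
final class MidRecordIdentSketch {

    /** Left-justifies value in a field of the given width, padding with spaces and truncating if too long. */
    static String pad(String value, int width) {
        String v = (value == null) ? "" : value;
        if (v.length() > width) {
            return v.substring(0, width);
        }
        return String.format("%-" + width + "s", v);
    }

    /** Hypothetical layout: account ID (15), constant record identifier "CD" (2), last name (25). */
    static String record(String accountId, String lastName) {
        // The constant is written in field order, like any other element.
        return pad(accountId, 15) + "CD" + pad(lastName, 25);
    }

    public static void main(String[] args) {
        System.out.println(record("111111111111111", "DOE"));
    }
}
```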
17 comments:
Nice work. Is it possible to not create a record identifier in the flat file?
Sorry, I need a little help with FlatWorm. Where is the dtd file? The url is a resort... And without dtd I have a NoClassDefFoundError..
Thank you,
Matteo
Just comment out the line that declares the DTD as I did in my sample configuration file above.
@Anonymous
You can do this by declaring a default record type. I don't believe I have posted anything on this, but I can have something up today or tomorrow.
I am using Flatworm to make a fixed-width file.
Each line in the file is like:
ID(12)... Name(12)... Company(8)... Dept(10)
And the word "ABC" is to be put in Company (for each ID). I don't have it in the bean.
The rest I am getting them from the bean.
How do I do it??? Please suggest.
Thanks.
Can you provide a few sample lines of your file?
Hi Michael,
This is a neat tool. Thanks for developing it. I am planning to use this in one of my projects . The one main problem I have with this implementation is the mandatory value for record-ident/match-string element. I would like to populate this through the program. Since I have a variable record ID I prefer not to specify this in the XML. I would like to specify only the record definition. I tried field-start=0 and field-length=0 and no value for match-string element but got a "null" prefix in the output. Do you have any suggestions how to get around this problem.
Thanks
Santhosh
Santosh,
Thanks, but I didn't develop Flatworm. However I have used it extensively and decided to write how to perform some tasks that weren't readily available already.
Are you going to have multiple record types within your file? That is what this identification is for. It is not to identify the data so much as the type of data. Generally within flat files you'll have headers, maybe some batch headers, detail records, etc.
So in order for a record type to be identified, it generally has a static value somewhere within the line. For example, all detail record lines may start with DR.
For a simple example, suppose we're writing out a flat file of employees with two record types: a file header and detail records. The file header gives the date the file was created and the number of detail records. The file header always starts with FH, so a sample file header may look like this:
FH2010021700002
This file header was created on Feb 17, 2010 and this file contains two detail records.
The detail record contains the employee's first name, last name and employee id. All detail records start with DT.
DTJohn Doe 1234567890
DTBob Smith 0987654321
In this instance, you are assigning the employee IDs on the fly, but you are using DT as the match string so Flatworm knows what type of record you are trying to write.
Does this help with what you are doing?
Also, I forgot to mention, that by specifying this value as the default-value in the conversion options, you won't have to put the match string in your XML.
Thanks for the reply Michael. Soon after I posted my question, through trial and error I was able to generate my records in the required format by leaving out the element from the definition.
Santhosh
Good to hear Santosh. ;-)
I'm currently using Flatworm for a project. It's working great. However, I am wondering whether there is a clean way to handle nulls when writing to a file. I am iterating over a large list of customers and may or may not have some information, like phone numbers. Is there a way in the XML to set it to handle nulls? For now I have added null checks to my getters (i.e. if (null == city) return new String();), but I am wondering if there is a better way. Thanks!
Seems like a very useful library.
But I was wondering. What is the advantage of this record identifier?
If it should accept the following as first field on the record
Isn't that easier and readable?
I am able to write the result set into a flat file using the code and config XML given, but the results come out on one single line instead of multiple lines.
Here is my code:
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery(query);
while(rs.next()){
client.setFirstName(rs.getString("first_name").toString());
client.setLastName(rs.getString("last_name").toString());
client.setAddress(rs.getString("address").toString());
fileCreator.write("clientData");
}
I am not able to write the result set into multiple lines. It comes out on a single line. Please suggest.
I've created a definition of a line/record which contains elements which are not contiguous (ie where a "end" finishes, the following "start" does not follow). The ConfigurationReader blows up with a null in a NumberFormatException. I assume from the examples that all fields should be contiguous?
Also is Flatworm threadsafe? Can I have multiple threads using the same object or do I need to either synchronize access or have separate FileFormat objects in Thread Specific data (ThreadLocal)?
Ok I found the prob, it was a misnaming of an element.
Error messages are pretty cryptic