Integrating XSL-FO into Web-based applications
JavaMail and PDF report creation
This article demonstrates how we can integrate XSL-FO, XSLT, and JavaMail into our existing web-based applications. I show you how we can generate PDF reports for an application through the use of XSLT and XSL-FO embedded within the Java application. I also illustrate how the generated PDF file can be sent as an e-mail attachment using JavaMail.
Although a variety of Web-based technologies such as servlets or Web services are available, I chose the JSP approach. A simple JSP-based test harness was written to demonstrate the integration of all these technologies. An HTML table report generation example is also included to show how the XSL-FO table elements correspond to the HTML table elements. Although using XSL-FO and XSLT may be overkill, the greater degree of formatting control and flexibility may prove advantageous.
The entire system was written in Java running on Windows 2000 Professional. The tools used to build this system include:
* Java 2 Platform, Standard Edition (J2SE): http://java.sun.com/j2se/
* JavaMail 1.3: http://java.sun.com/products/javamail/
* JavaBeans Activation Framework 1.0.2: http://java.sun.com/products/javabeans/glasgow/jaf.html
* JDOM Beta 9: www.jdom.org
* Log4j v 1.2.8 from Apache: http://logging.apache.org/log4j
* FOP-0.20.5: http://xml.apache.org/fop/index.html
* Apache Tomcat 5.0.16 (JSP and Servlet Engine): http://jakarta.apache.org/tomcat/index.html
Formatting Objects Processor (FOP)
XSL-FO is part of the Extensible Stylesheet Language (XSL) family of recommendations from the W3C. XSL is used to define XML document transformations and presentations and is made up of:
* XSLT is a language for transforming XML documents (mainly from XML to XML or XML to HTML).
* XPath allows you to identify specific parts of your XML document and to write expressions to refer to, for example, the nth child element of the specified XML file. It is used extensively by XSLT for referencing specified elements within the input document for further processing.
* XSL-FO is an XML language that defines page formatting and layout.
FOP is an implementation of the XSL-FO specification defined by the W3C. It is both an open source library and an application used to convert your XML documents into paginated output. FOP supports a number of different output formats, such as PDF, PostScript, PCL, and text.
XSL-FO, in particular Apache FOP, was used because FOP allows for easy conversion from XML to PDF. The formatting commands are not embedded in the Java application but are stored in a separate file, so I can easily change the “look” of the resulting document. Also, XSL-FO is defined by a W3C Recommendation (XSL 1.0, published in October 2001).
Although there are a number of free PDF libraries (non-XSL-FO) available, such as PJ by Etymon (www.etymon.com/epub.html) and retepPDF (www.retep.org.uk/retep/home.do), I have chosen to compare iText (www.lowagie.com/iText) to Apache FOP because iText has been around much longer and is a popular package. Table 1 highlights some of the differences between iText and Apache FOP.
I have adopted an n-tier architecture for this system (see Figure 1). Clients communicate with the Web server, which serves up the HTML and JSP pages. The relationship between these pages is described in detail below. The JSP page instantiates helper objects (GenStatistics, Pairs) that live on the Web server. These in turn connect to the application server, retrieving the results and storing them in the helper objects. The results are then presented to the clients.
[FIGURE 1 OMITTED]
Figure 2 shows the relationship between the various HTML and JSP pages in the application. The start.html page is the entry point into the system. These pages are not elaborate, as their purpose is to serve as a test framework. When the “Submit” button on the start page is pressed, the genReport.jsp page is called. This checks the user selection. Depending on the type of report requested, either the genHTMLReport.jsp or genPDFReport.jsp page is invoked.
[FIGURE 2 OMITTED]
Both pages invoke the main class, GenStatistics, which connects to the application server, retrieves the data from the database, and stores the results in an array of the Pairs Bean. However, for illustration, I have removed this unnecessary complexity from the code examples and replaced it with a simple “read the data from file” illustration.
If the report is requested in HTML format, the genHTMLReport.jsp page is invoked.
If a PDF report is requested, the genPDFReport.jsp page is invoked. The results are stored in the Pairs array. This array is traversed and the data is “XML-ized” using JDOM. An XSLT file containing the embedded XSL-FO commands is defined and applied to the XML-ized data, creating an FO file, which can then be passed to the Apache FOP driver to be rendered.
When the PDF report has been successfully generated, it can either be mailed out as an attachment to the specified recipient(s) or saved and displayed.
* Supporting Data Structures, Pairs Bean: I have defined a “Pairs” JavaBean to store the information retrieved from the application. It contains two attributes: a name (type String) and its value (type int). It contains methods for setting and retrieving attributes.
* The Main Class, GenStatistics: The application reads the data from the database and stores it in the “Pairs” array. When the results are returned, the data is “XML-ized.” I used JDOM for this purpose. JDOM is a Java representation of an XML document and is much easier to use than DOM.
–To use JDOM, I first create the root node:
Element root = new Element("statistics");
–Next, I create a Document object, passing in the root node:
Document myDoc = new Document(root);
–To add more nodes to the root element, I iterate through the “Pairs” array. A new Element is created for each “Pairs” element in the array:
Element e = new Element(pairs[i].getName());
–To set the text value associated with the node, I call the setText method on the new element.
–This element is added to the root node via root.addContent.
–When the XML Document object has been populated, it will have the structure shown in Figure 3.
[FIGURE 3 OMITTED]
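The steps above can be sketched in a self-contained form. This is a minimal sketch under stated assumptions, not the article's actual code: it uses the DOM API bundled with the JDK as a stand-in for JDOM so that it runs without extra JAR files, and the class name StatisticsXml, the Pairs constructor, the getValue accessor, and the sample date are all hypothetical. With JDOM, the corresponding calls are new Element, setText, and addContent, as described above.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class StatisticsXml {

    /** Stand-in for the article's Pairs bean: a name (String) and its value (int). */
    static class Pairs {
        private String name;
        private int value;

        Pairs(String name, int value) {   // constructor is an assumption
            this.name = name;
            this.value = value;
        }

        String getName() { return name; }
        void setName(String name) { this.name = name; }
        int getValue() { return value; }  // accessor name is an assumption
        void setValue(int value) { this.value = value; }
    }

    /**
     * Builds <statistics date="..."> with one child element per Pairs entry.
     * Each Pairs name must be a valid XML element name (no spaces, for example).
     */
    static Document build(Pairs[] pairs, String date) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException ex) {
            throw new IllegalStateException("No XML parser available", ex);
        }
        // Root node, equivalent to new Element("statistics") in JDOM
        Element root = doc.createElement("statistics");
        root.setAttribute("date", date);
        doc.appendChild(root);

        // One element per Pairs entry; its text is the value
        for (Pairs p : pairs) {
            Element e = doc.createElement(p.getName());
            e.setTextContent(String.valueOf(p.getValue()));
            root.appendChild(e);
        }
        return doc;
    }
}
```

Calling StatisticsXml.build with two entries, for example Users and Sessions, yields a statistics root with two children, which is the shape the stylesheets in the following sections expect.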
Finally, the transform method is called to produce the XSL-FO commands. This takes as input the Document object, an XSLT stylesheet containing the embedded XSL-FO commands, and an FO output filename. An FO file containing the XSL-FO commands is created. The PDF is then generated by invoking the FOP Driver. Alternatively, the XSLT stylesheet can be set up to transform the XML document into HTML, which is then displayed in your browser (see Figure 4).
[FIGURE 4 OMITTED]
Using the XSLT Style Sheet with XSL-FO
Although it is possible to create the XSL-FO commands by hand, it is troublesome to edit and modify them each time the XML contents change. Instead, it is easier to use an XSLT stylesheet to transform the XML data into an XSL-FO file (see Listing 1).
LISTING 1 * Transform the XML data into an XSL-FO file
2 <xsl:stylesheet version="1.0"
8 <fo:simple-page-master master-name="simple"
22 Server Statistics
25 Server Statistics at <xsl:value-of
38 <fo:table-cell border-
42 Type of
44 <fo:table-cell border-style="solid"
45 border-color="black" border-
53 <fo:table-cell border-style="solid"
54 border-color="black" border
61 <fo:table-cell border-style="solid"
62 border-color="black" border
63 padding-before="3pt" padding
64 padding-start="3pt" padding
1. The XSLT file, dsstats.xsl, starts with the XML and namespace declaration (see lines 1-4).
2. A template rule is declared on line 5. This rule searches for and matches the root tag, replacing it with the content following it.
3. Line 6 is the root element for the XSL-FO document, fo:root. This typically contains the fo:layout-master-set element followed by one or more fo:page-sequence elements.
4. The fo:layout-master-set element defines the page for our document (line 7). The fo:simple-page-master element embedded within the fo:layout-master-set tag defines the page layout required for this application (lines 8-19).
5. The page-height and page-width attributes (lines 9-10) define the size of the physical page. The master-name attribute (line 8) declares a name for this master page. It is referenced by the master-reference attribute in the fo:page-sequence element (line 21). The meaning of the remaining attributes is shown in Figure 5 (lines 11-18).
[FIGURE 5 OMITTED]
6. The fo:static-content element (lines 23 and 28) allows the data occurring within these tags to appear on every page. Lines 23-30 show how the title and page number can be made to appear on every page.
7. The fo:block element is used to format paragraphs, titles, figure captions, and table titles. It can also contain raw text. In the code snippets presented here, I show an example containing raw text plus an XSL command, xsl:value-of (line 25). The XSL command matches the attribute “date” associated with the tag “statistics”, extracting its value. The second example shows raw text plus an embedded XSL-FO command (line 29).
8. The fo:flow element (line 31) contains the actual content. This is made up of sequences of fo:block, fo:block-container, fo:table-and-caption, fo:table and fo:list-block. The flow-name attribute specifies where the flow’s content will be placed.
9. The XSL-FO fo:table command is used to generate tables. The fo:table command (line 33) is embedded within its parent element (line 32). The table-layout attribute is set to “fixed”. This is the only option currently supported. Next, I specify the fo:table-column element (lines 34-35), setting the column-width attribute to a specified value.
10. The fo:table-body element (line 36) contains all the fo:table-row elements (line 37). Each fo:table-row element contains the fo:table-cell elements (line 38). This is where all the work for the tables is done. A number of properties are associated with the fo:table-cell element. These control the table look and feel.
11. Finally, to populate the cells of the table, I use the xsl:for-each command to iterate through all the child elements contained in the XML data (line 51). For each child element of the statistics tag, the stylesheet gets the name of the node (line 58) and its value (line 66) and populates the table cells.
12. To ensure that the XSLT and XSL-FO commands are error free (before attempting to integrate all the components), I verified that the XSLT file produced the correct PDF output by running the files through the fop.bat utility:
fop -xml myTest.xml -xsl dsstats.xsl -pdf myTest.pdf
Transforming the Document
Having created the XSLT stylesheet containing the embedded XSL-FO commands, the next step is to invoke this stylesheet programmatically within the application. This transformation step produces a .FO file, which contains the XSL-FO commands and the data for populating the table rows. The data is extracted from the XML document created previously. The sequence of steps is outlined below:
1. A StreamSource is first created from the given XSLT style sheet File object:
StreamSource strmSource = new StreamSource(xsltFile); // xsltFile: the stylesheet File object (name assumed)
2. Next, create the Transformer object from the TransformerFactory, passing in an instance of the StreamSource object:
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer(strmSource);
3. Invoke the transform method, passing in the JDOMSource and JDOMResult as arguments. Note that the JDOMSource constructor takes a Document object as input.
4. Finally, output the resulting .FO file using the XMLOutputter object:
XMLOutputter xmlOutputter = new XMLOutputter(" ", true);
This creates an XMLOutputter object with the specified indent (usually a number of spaces). If the second argument is true, new lines will be printed. The output call then writes the transformed document to the .FO file:
xmlOutputter.output(jdomResult.getDocument(), new FileOutputStream(fopFileName)); // jdomResult: the JDOMResult from step 3 (name assumed)
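The same sequence can be tried with only the JAXP classes that ship with the JDK. The following is a sketch under assumptions, not the article's code: it reads the XML and the stylesheet from strings and returns the result as a string, in place of the JDOMSource, JDOMResult, and XMLOutputter used above. The class name FoTransform and the method name transform are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class FoTransform {

    /** Applies the given XSLT stylesheet to the given XML and returns the output text. */
    static String transform(String xml, String xslt) {
        try {
            // Step 1: wrap the stylesheet in a StreamSource
            StreamSource strmSource = new StreamSource(new StringReader(xslt));

            // Step 2: create the Transformer from the TransformerFactory
            TransformerFactory transformerFactory = TransformerFactory.newInstance();
            Transformer transformer = transformerFactory.newTransformer(strmSource);

            // Step 3: run the transformation (string in, string out in this sketch)
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));

            // Step 4: the caller can write this text to a .FO file
            return out.toString();
        } catch (TransformerException ex) {
            throw new IllegalStateException("XSLT transformation failed", ex);
        }
    }
}
```

Returning a string keeps the sketch easy to inspect; in an application that writes the .FO file directly, a StreamResult wrapping a FileOutputStream would serve the same purpose.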
PDF Report Creation
Once the .FO file is created, I call the FOP Driver run() method to render the document.
1. First, set up logging. The MessageHandler object handles the global logging of all FOP processes. The FOP Driver handles per-instance logging. Both of these have to be set using an implementation of org.apache.avalon.framework.logger.Logger. I used the Log4JLogger implementation because existing code on the application server currently utilizes Log4J.
2. Create a Log4JLogger object and associate this with the static logger object created at startup in the GenStatistics class, i.e.
Log4JLogger Alogger = new Log4JLogger(logger); // logger: the static Log4J logger in GenStatistics (name assumed)
3. Next, instantiate the FOP Driver.
Driver driver = new Driver();
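4. Set the logger associated with the FOP Driver to point to Alogger, using the Driver's setLogger method.
5. Make the screen logger point to Alogger, using MessageHandler.setScreenLogger.
6. Set the type of rendering desired via the Driver's setRenderer method, passing Driver.RENDER_PDF for PDF output.
7. Set the input source to the .FO file created in the previous section.
8. Set the output source to the PDF file to be generated.
9. Finally, render it by calling the Driver's run() method.
If you are specifying XSLT and XML files as input, change steps 7 and 9 to use an input handler, and pass its parser and input source to the Driver's render method instead of calling run():
InputHandler inputHandler = new XSLTInputHandler(xmlFile, xsltFile); // File objects; names assumed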
Mailing the Generated PDF Report as an Attachment
After the PDF report has been generated, I use the JavaMail API to send the document out as an e-mail attachment.
1. Obtain a default Session.
Session s = Session.getDefaultInstance(myProperties, null);
myProperties is the properties object. I supply the mail.smtp.host property defined in the Mail.properties file. The Authenticator object (the second argument) is used to indirectly check access permissions. If the Authenticator object is set to null, this means anyone can get the default session.
2. Create a new Message.
Message msg = new MimeMessage(s);
3. Set the To, From, Subject, and Date Sent fields.
4. Create the Message Body Parts.
MimeBodyPart mbp2 = new MimeBodyPart();
5. Attach the file to the message.
FileDataSource fds = new FileDataSource(pdfFileName); // pdfFileName: path of the generated PDF (name assumed)
6. Create a multipart object and add the message body parts from step 4 to it.
Multipart mp = new MimeMultipart();
7. Add the multipart object to the message.
8. Send the message.
HTML Report Generation
As an alternative to generating a PDF report, I also show how XSLT can be used to transform the XML document created previously into an HTML report. The complete XSLT commands for this are shown in Listing 2.
LISTING 2 * XSLT commands to translate XML to HTML
2 <xsl:stylesheet version="1.0"
11 Server Statistics at <xsl:value-of select="
|Type of Object||Total Number|
1. Lines 1-3 show the XML and namespace declarations.
2. The template rule is shown on line 4. This searches for and matches the root tag, replacing it with the content following it.
3. Table creation begins on line 13. The table headings are specified on lines 14-17.
4. Iterate through all the child nodes of the statistics element, populating each row of the table with the object name and number of objects (lines 18-23).
5. From Listing 2, we see that the XSL-FO table and HTML table commands are very similar. Table 2 shows the relationship between the two.
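To make the iteration in step 4 concrete, here is a minimal sketch using only the JDK. The stylesheet embedded in the code is a hypothetical, much-reduced stand-in for Listing 2, not the listing itself, and the class name HtmlReport and method name toHtml are assumptions. It emits one table row per child of the statistics element, using name() for the object name and "." for its value.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class HtmlReport {

    // Hypothetical reduced stylesheet: a heading row, then one row per child element.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\"/>"
      + "<xsl:template match=\"/statistics\">"
      + "<table border=\"1\">"
      + "<tr><th>Type of Object</th><th>Total Number</th></tr>"
      + "<xsl:for-each select=\"*\">"
      + "<tr>"
      + "<td><xsl:value-of select=\"name()\"/></td>"
      + "<td><xsl:value-of select=\".\"/></td>"
      + "</tr>"
      + "</xsl:for-each>"
      + "</table>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Transforms a statistics XML document (as text) into an HTML table (as text). */
    static String toHtml(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException ex) {
            throw new IllegalStateException("XSLT transformation failed", ex);
        }
    }
}
```

The for-each loop here has the same shape as the one that fills fo:table-row and fo:table-cell elements in the XSL-FO stylesheet; only the element names in the loop body differ.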
Running the Test Application
1. Install Tomcat 5.X on your system
2. Drop your JSP files into the C:\jakarta-tomcat-5.0.16\webapps\MyApplication\jsp folder
3. Make sure that the associated JAR files are in the C:\jakarta-tomcat-5.0.16\webapps\MyApplication\WEB-INF\lib directory
4. Start up Tomcat, C:\jakarta-tomcat-5.0.16\bin\startup.bat
5. Start up your browser and point it at the start.html page
This article has shown you how the various technologies such as XSL-FO, XSLT, and JavaMail can be integrated into existing Web-based applications. A simple front end using JSP is provided as a test harness. I also explained how XSLT can be used to generate the XSL-FO commands, illustrated how the PDF or HTML reports are generated, and showed how JavaMail is used to e-mail the PDF reports to a specified recipient list.
Table 1 * iText vs. FOP
iText | Apache FOP
Uses its own input format, iText-xml | Uses XSL-FO
Used for postprocessing FOP-generated PDF files (merging, updating, and encrypting) |
Document generation is much faster for long documents | Slower for long documents
Experimental XML2PDF functionality | Fully supports XML to PDF conversion
Table 2 * Mapping between XSL-FO and HTML table elements
XSL-FO Element HTML Element