‘Jazz’ up your applications with open source Java: add spell-check functionality – ColdFusion and Java
Darron Schall
By now it should be no surprise to hear ColdFusion and Java mentioned in the same sentence. You’ve probably seen the examples, read the tutorials, and pored over lines of Java code in hope of enhancing your ColdFusion applications. Either that, or you’ve filed the information in a “nice to know” vault and continued on with your ordinary development habits. No matter which category you fall into, this article is for you.
One of the greatest benefits of ColdFusion and Java integration is being able to rely on Java’s immense open source community to enhance your ColdFusion applications. So, chances are, whatever you’re looking for has already been built in Java–just waiting to be leveraged in your ColdFusion code. No idea how to integrate the two? Not a problem–this article is here to help. With a little bit of coding, I’ll show you how to leverage an open source Java spell-checking engine in your ColdFusion applications. Let’s see … Java. Open source. Spell checking. Free. Do I have your attention yet?
Scoping out the Playing Field
The first step to solving any problem is describing what the problem is. In my case, I needed a way to add spell-check functionality to a ColdFusion application on a very low budget. After doing a bit of searching I found some ColdFusion custom tags that would probably get the job done, but I didn’t have the bones to pony up for any of them. With some Java experience under my belt, I decided to try a different approach to the problem. Enter the open source community previously mentioned.
One search on Google for “open source Java spell check” yielded approximately 65,000 results, with the very first result showing promise. I followed the link and discovered “Jazzy,” hosted on SourceForge (sourceforge.net). The first two sentences on the project page caught my attention: “There are currently no Java open source spell checkers. This is a project that seeks to remedy that.” Sweet. I snagged the source and documentation and got down to business.
Preparing Your Development Environment
Before I describe how I got Jazzy to play nice with ColdFusion, here’s the usual list of everything you’ll need to follow along. If you’ve done Java development before, then your computer is probably already prepared. If not, it’s download time.
* The Java 2 SDK. You can find J2SE downloads at http://java.sun.com/j2se/downloads.html, under the “Download” heading. After clicking through on the latest version (1.4.2 at the time of this writing), you will be presented with a myriad of download options. Under a heading such as “Download J2SE v 1.4.2_02,” find the right version for your operating system and click the “download” link under the SDK column. You’ll have to accept a license before downloading and installing the software. If you have trouble installing the Java 1.4.2 SDK you can visit http://java.sun.com/j2se/1.4.2/install.html for installation help.
* Once the Java 2 SDK is installed, Eclipse is next on the download list. Eclipse is a fantastic IDE available as a free download at www.eclipse.org/downloads/index.php. The latest release is version 2.1.2. After clicking through on a download site, click on the version number, then look for the Eclipse SDK download that best fits your system. On Windows, there is no setup program to run; just extract the zip file to your preferred installation location. I extracted mine to c:\ to have the program installed in c:\eclipse. You’ll need to run c:\eclipse\eclipse.exe once extracted to finalize the installation.
(Note that you don’t need Eclipse to develop in Java. You can use a simple text editor and the command-line Java compiler if you like. However, I’ll be using Eclipse for all of the Java coding required, so you’ll need to download it if you want to follow along. For a listing of Java IDEs, their popularity, and a brief description about each one, check out the voting for Java Developer’s Journal’s Best Java IDE of 2003 at www.sys-con.com/java/readerschoice2003/liveupdate.cfm?BType=9. Eclipse is my personal favorite because it’s open source, has an active development community, is backed by 30 major software vendors, is relatively easy to use, and is very feature rich.)
* You’ll need access to a ColdFusion MX 6.1 server. I have the developer version installed locally, using IIS 5 as the Web server. Although the Web server isn’t important in this article, it’s very important that you’re using ColdFusion MX 6.1. The 6.1 release fixes a bug where java.lang.IllegalAccessException would be thrown when trying to access certain public methods in Java classes. Samuel Neff mentioned this at www.rewindlife.com/archives/000049.cfm. You can find ColdFusion MX 6.1 at www.macromedia.com/software/coldfusion/.
* You’ll also need the Jazzy source and documentation. The binary release isn’t necessary since we’ll be building from the source code. The latest version at the time of this writing is .5, and you can find the downloads on the Jazzy SourceForge project page at http://sourceforge.net/projects/jazzy. For now, just save the .zip files somewhere; we’ll worry about extracting them later.
* Finally, you’ll need a dictionary file. A dictionary file is a one word per line, case-sensitive alphabetical listing of correctly spelled words that you want the spell checker to validate against. In case-sensitive alphabetical order, all words beginning with a capital letter come before those beginning with a lowercase letter (Zimbabwe would come before aardvark). The reason is that the ASCII values for uppercase letters are numerically lower than the ASCII values for lowercase letters. Again, the Jazzy project page is where you can download a sample dictionary. There are actually two dictionaries listed, but you’ll only need to download english.0.zip. Just save the .zip archive to disk since we’ll be extracting it later.
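The ordering rule in the last item is easy to check for yourself. Here is a minimal sketch, written for a current JDK rather than the 1.4.2 SDK listed above; the class name and the word list are made up for illustration. It relies only on the fact that Java compares strings character by character using their numeric values, so uppercase letters sort ahead of lowercase ones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class: shows the case-sensitive ordering a dictionary file is expected to use.
class DictionaryOrderDemo {

    // Returns a sorted copy using String's natural (case-sensitive) ordering.
    static List<String> caseSensitiveSort(List<String> words) {
        List<String> sorted = new ArrayList<String>(words);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("aardvark", "Zimbabwe", "apple", "Boston");
        // 'B' (66) and 'Z' (90) are numerically lower than 'a' (97),
        // so the capitalized words come first.
        System.out.println(caseSensitiveSort(words));
    }
}
```

If you build your own word list, sorting it this way before saving should give you the ordering described above.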
Running the Example
Now that we have everything we need and the development environment is set up, it’s time to figure out how Jazzy works so we can integrate it in our ColdFusion code.
Fire up the Eclipse IDE, and create a new project by selecting File -> New -> Project. Select “Java” and press the next button. Give the project a name of “CFSpellCheck” and make note of where the project directory is created. On Windows, the default location will be a directory with the same name as the project, off of the “workspace” directory under wherever you chose to unzip Eclipse to. In this example, that directory is “C:\eclipse\workspace\CFSpellCheck”. Press the finish button to create the project. After pressing finish, you may get prompted by Eclipse to switch to the Java Perspective. If so, click yes.
Now that the project is created we can get to work with the Jazzy source code. Extract the jazzy-doc.zip and jazzy-src.zip archives to a temporary directory. Copy everything under the “src” directory (a .java file and the “com” directory) to the project directory that we just created. Don’t be surprised to see a “.project” and a “.classpath” file in the project directory–those are created by default when Eclipse creates the project.
Next, make a “dict” directory under the project directory, and extract english.0.zip to it. Right-click on the project name in the “Package Explorer” in Eclipse, and select “Refresh” from the menu to update the project.
With the source code in place, we can build the Jazzy examples. Expand “com.swabunga.spell.examples” and double-click the SpellCheckExample.java file. From the menubar, select Run -> Run As -> Java Application. When you run the example, you’ll notice that the “Console” panel in Eclipse contains an error that occurred while trying to run the program. The very first line of the error (“java.io.FileNotFoundException: dict\phonet.en”) indicates that the program is looking for “phonet.en” in the “dict” directory. Doh!
Now, before you get mad at me for showing you an example that doesn’t work, I did this on purpose. Whenever you’re trying something like this, there’s no guarantee that it’s going to work the first time. I wanted you to experience a problem right away so that when something does go wrong for you, you don’t get frustrated. Open source is great, but it does have its pitfalls. If you find yourself running into trouble, ask around on mailing lists, forums, newsgroups, or try search engines.
Speaking of search engines … Google to the rescue again! When we search for “phonet.en” only a few links come up, but thankfully they’re all associated with Jazzy. Visit http://cvs.sourceforge.net/viewcvs.py/jazzy/jazzy/dict/phonet.en and click on the download link next to the “Revision 1.1” heading. Make a new text document named “phonet.en” in the “dict” subdirectory under your project directory, copy and paste the text from the previous link into that file, then run the example again.
With the program now running successfully, we’re prompted to enter some text to spell check. I purposely entered text with spelling errors and was pleasantly surprised when the program found a misspelled word and offered the correct spelling as a suggestion. Try it for yourself! Everyone loves the “Hello world” example, so go ahead and spell that wrong to see if the program will correct it for you.
Awesome, it works … now, to dig in and figure out how to get this to work with ColdFusion.
Understanding Jazzy’s Innards
The best place to look for help is the documentation and source code, and we have access to both in this case. At the very least, you’ll always have access to source code when dealing with open source projects. Since we have an example program running successfully and we’re still looking at the code, let’s start there.
But wait … before we start, what are we even looking for? Because we need to use a Java object in ColdFusion, we’re looking for a class with methods like setText, setDictionary, and runSpellCheck or checkSpelling. We’ll also need a way to get the spelling error information from Jazzy to ColdFusion. This information would include the misspelled word, the list of suggested spellings, and the string position where the spelling error was detected, so we need a method along the lines of getErrors or getSpellingMistakes.
Before going any further, make sure that line numbers are being displayed inside of Eclipse. This can be accomplished by selecting “Window” from the menu bar and clicking on “Preferences.” Expand “Java” on the left, then click “Editor.” Find the “Show line numbers” check box, then hit “Apply.” Click “OK” to close the dialog.
Looking at SpellCheckExample.java we don’t see any of the methods we’re looking for. However, at line 30 we can see the dictionary being created with the dictionary and phonet files. Line 42 is the checkSpelling call that invokes the spell-checking engine with the text to check as a parameter. At line 49 we see a possible bottleneck. Whenever a spelling error occurs, an event is raised and the spellingError method handles it. This is great for allowing the user to fix mistakes as they arise in a Java application, but that level of interactivity won’t work in a ColdFusion application since all of the processing completes on the server before the client gets a chance to interact with the application. We’ll have to create a workaround for that.
With a little insight into how Jazzy works, we can take a look at the documentation provided to see if any other classes have the desired methods. Ideally, we just want to create a Java object and use it without having to do any additional coding. Find the index.html in the temporary directory you extracted the jazzy-doc.zip to, and open it up in a Web browser.
What you’re looking at is documentation generated automatically by JavaDoc. If you’re not sure what JavaDoc is, check it out on Sun’s Web site at http://java.sun.com/j2se/javadoc/. The left-hand column is a listing of all of the classes associated with Jazzy. When you click on a class name you’ll see a listing of all of the methods available in that class. Now is a good time to click around and see what you find, and at the same time, familiarize yourself with JavaDoc style documentation if you’ve never seen it before.
What you’re looking for specifically are public methods. ColdFusion is not allowed to call private or protected methods in a Java class–these methods are internal to the class that defines them and are not intended to be used by developers using the class. If you see the static keyword, it means that you don’t need an instance of the class to call the method. You can create the Java object in ColdFusion and use the method right away, without initializing it.
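To make the difference between these keywords concrete, here is a small sketch in plain Java. TextUtil and its methods are invented for this illustration and have nothing to do with Jazzy. The class itself is declared without “public” only so the sketch compiles from any file name; a class you intend to call from ColdFusion would normally be public.

```java
// Hypothetical class, used only to contrast static, instance, and private methods.
class TextUtil {

    private final String prefix;

    // Constructor: an instance must be created before instance methods can be used.
    public TextUtil(String prefix) {
        this.prefix = prefix;
    }

    // static: callable on the class itself, no instance required.
    public static String shout(String text) {
        return text.toUpperCase() + "!";
    }

    // public instance method: part of the class's usable surface.
    public String label(String text) {
        return decorate(text);
    }

    // private: an internal helper that outside callers cannot reach.
    private String decorate(String text) {
        return prefix + ": " + text;
    }

    public static void main(String[] args) {
        System.out.println(TextUtil.shout("hello"));
        System.out.println(new TextUtil("word").label("speel"));
    }
}
```

When scanning the JavaDoc, methods like shout and label are the kind to look for; decorate is the kind that won’t help us from ColdFusion.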
After looking around some, it doesn’t look like there’s a class that will do what we want, so we’ll have to just go ahead and make one!
Writing a Java Wrapper for Jazzy
The first thing we need to do is make a new class that we can instantiate via ColdFusion. In Eclipse, right-click on the project name (CFSpellCheck) and select New -> Class. In the name field, enter “CFSpellCheck,” and enter “com.sys_con.ColdFusion” in the package field. We use an underscore in place of a dash in “sys-con” because the dash in Java has special meaning (in this case, it would be the subtraction operator). A package, simply put, is a group of related classes that has the added benefit of eliminating name collisions. For more information on packages check out http://java.sun.com/docs/books/tutorial/java/interpack/packages.html. I always use the “com.darronschall” prefix when creating packages since that is my domain name, which is unique to me and identifies me as the author.
Click “Finish” to create the class and accept the default values Eclipse provides. Copy the code in Listing 1 into the file that was just created. (Code examples for this article can be downloaded from www.sys-con.com/coldfusion/sourcec/cfm.) Select Run -> Run As -> Java Application to see the wrapper in action. Inside the main method, you can see that we first create a new CFSpellCheck object, set the dictionary, set the text, run the spell check, and then get the errors. This is the same flow that we’ll have when we use CFSpellCheck in our ColdFusion code.
Listing 1
package com.sys_con.ColdFusion;

import java.io.File;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import com.swabunga.spell.engine.SpellDictionary;
import com.swabunga.spell.engine.SpellDictionaryHashMap;
import com.swabunga.spell.event.SpellCheckEvent;
import com.swabunga.spell.event.SpellCheckListener;
import com.swabunga.spell.event.SpellChecker;
import com.swabunga.spell.event.StringWordTokenizer;

/**
 * @author Darron Schall (darron@darronschall.com)
 *
 */
public class CFSpellCheck {

    private SpellDictionary dictionary;
    private SpellChecker spellChecker;
    private ArrayList errors;
    private String textToCheck;

    public CFSpellCheck() {
        errors = new ArrayList();
    }

    public ArrayList getErrors() {
        return errors;
    }

    public String getText() {
        return textToCheck;
    }

    public void checkSpelling() {
        spellChecker = new SpellChecker(dictionary);

        // Collect every spelling error as it is raised so the caller
        // can retrieve them all at once through getErrors()
        spellChecker.addSpellCheckListener(new SpellCheckListener() {
            public void spellingError(SpellCheckEvent event) {
                SpellingError s = new SpellingError(event.getInvalidWord(),
                        event.getWordContextPosition());
                System.out.println("SpellingError - " + event.getInvalidWord() + " at position " + event.getWordContextPosition());

                // Copy the suggested spellings into the error record
                List suggestions = event.getSuggestions();
                Iterator suggestedWord = suggestions.iterator();
                while (suggestedWord.hasNext()) {
                    String suggestion = suggestedWord.next().toString();
                    s.addSuggestion(suggestion);
                    System.out.println(suggestion);
                }
                errors.add(s);
            }
        });

        spellChecker.checkSpelling(new StringWordTokenizer(textToCheck));
    }

    public void setDictionary(String dictFile) {
        try {
            dictionary = new SpellDictionaryHashMap(new File(dictFile));
        } catch (Exception e) {
            // Any problem loading the dictionary is just dumped to the screen
            e.printStackTrace();
        }
    }

    public void setText(String txt) {
        textToCheck = txt;
    }

    public static void main(String args[]) {
        CFSpellCheck spellcheck = new CFSpellCheck();
        spellcheck.setDictionary("dict/english.0");
        spellcheck.setText("This is some text that needs speel checkin");
        System.out.println(spellcheck.getText());
        spellcheck.checkSpelling();
        System.out.println("... done!");
        System.out.println(spellcheck.getErrors());
    }

    // Container for one spelling error: the word, its position,
    // and the list of suggested spellings
    private class SpellingError {

        private String word;
        private int position;
        private ArrayList suggestions;

        public SpellingError(String word, int position) {
            init(word, position);
        }

        public void init(String word, int position) {
            this.word = word;
            this.position = position;
            suggestions = new ArrayList();
        }

        public void addSuggestion(String suggestedWord) {
            suggestions.add(suggestedWord);
        }

        public String getWord() {
            return word;
        }

        public int getPosition() {
            return position;
        }

        public ArrayList getSuggestions() {
            return suggestions;
        }

        public String toString() {
            return "word:" + word + " | position:" + position +
                    " | suggestions: " + suggestions;
        }
    }
}
It is not my intent to explain every single line of code in the Java wrapper. However, I do want to highlight some of the more interesting aspects.
I’ve omitted the phonet file for the sake of simplicity and brevity. There is only an option to specify a dictionary file, through the setDictionary method. The file name is passed into the method as a string, and a dictionary is created from the file. If any error occurs during the dictionary creation process, the error is just dumped to the screen.
The checkSpelling method is where most of the magic happens. I’ve created an automated event handler for spelling errors. Whenever a spelling mistake is encountered, the word, position, and all of its suggestions are saved into an array. In order to save that information, I needed to create a data type to hold that data.
There is no “StructNew” command in Java that I could leverage to create a container for the data, so I had to create a container on my own. This called for another class to be created with private variables to store the data and methods to manipulate those variables. Because I only want to use this class as a data type inside of the CFSpellCheck class, I defined it as an “inner class” by using the class keyword inside of a class and marked it as “private” to restrict access so that only CFSpellCheck can use it. The private inner class SpellingError is defined at the bottom of CFSpellCheck and will save the word, the position, and the list of suggestions for the word. The SpellingError class also defines a toString method that returns a string containing the values inside of a SpellingError variable, useful for debugging purposes.
As you can see, there really isn’t a lot to the wrapper class, and it should be fairly straightforward and easy to understand. The next step is invoking the wrapper from our ColdFusion application. Before doing this, we can go ahead and comment out the main method since it’s no longer needed. This can be accomplished by using Java’s multi-line comment operators. Place a “/*” before the “public” keyword that begins the main method. After the method’s closing curly brace, place a “*/” to mark the end of the comment. We can also comment out the informational messages generated by the two System.out.println calls inside the spellingError handler. To comment out a single line, use Java’s single-line comment operator, two forward slashes in a row. Before the word “System” place a “//” there to mark the line as a comment so that the Java compiler ignores the line. By default, you should see the line turn green inside Eclipse. Finally, we’ll need to export our project.
There are two ways to export the project. One is to bundle everything in a .jar file (Java Archive). The other is the manual process of copying all of the .class files to the directory we want to deploy from. Creating a .jar file is the easier and more elegant of the two approaches, so I’ll be taking that approach. The latter approach makes missing a file easy, and requires multiple files to be shuffled around. The .jar file is a single file containing all of the required .class files for project deployment.
To create a .jar in Eclipse, first make sure the CFSpellCheck.java file has been saved. Then, right-click on the project name in the Package Explorer. Select “Export” from the menu. Select “JAR File” from the dialog and then click the “Next” button. Uncheck all of the files on the right-hand side (.project, .classpath, and possibly CFSpellCheck.jar) as they will not be needed for deployment. Click “Browse” to choose a location where the .jar file will be created, and then click “Finish” to create the .jar file.
We’re now ready for the ColdFusion side of things … finally!
Calling the Java Wrapper from ColdFusion
In order to invoke the CFSpellCheck class from ColdFusion, the ColdFusion server itself needs to know where to look for it. This information can be controlled in the ColdFusion Administrator. Log in to the ColdFusion Administrator and select “Java and JVM” from the menu. In the classpath field we can specify where ColdFusion will look for classes. I usually make a directory called “JavaClasses” on the Web server, and add “C:\JavaClasses” to the classpath.
Once the classpath has been saved, you’ll need to restart the ColdFusion service for the changes to take effect. After doing this, copy the .jar file to the “JavaClasses” directory. Additionally, you’ll need to copy the “english.0” dictionary file to a directory on your Web server. I just copied the “dict” directory under the Eclipse project directory to “C:\JavaClasses” as well.
If you’re using the J2EE version of ColdFusion MX, the easiest way to make the .jar file accessible is to copy it to a “lib” directory: either the one for your application server or, in a multiple-instance environment, the one for the particular instance you want to make it available to. To make the .jar available to all instances in JRun, this would be the {JRun install directory}/lib directory. To make the .jar available to one particular instance, create a “lib” directory off of the {JRun install directory}/servers/{instance name}/SERVER-INF directory and put the .jar file there. Either approach requires that you stop and restart any application server instance that will use the spell checker. The “english.0” file can be copied along with the .jar or can be put in any other directory on the server, since you specify the path to this file when you call setDictionary.
Now that the classpath is set up and the .jar file and dictionary file are in place, we can start using the CFSpellCheck class in our applications. The ColdFusion code is shown in Listing 2.
Listing 2
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />

<cfscript>
if (StructKeyExists(form, "checkText")) {
    CFSpellCheck = CreateObject("java", "com.sys_con.ColdFusion.CFSpellCheck");
    CFSpellCheck.init();
    CFSpellCheck.setDictionary("C:/JavaClasses/dict/english.0");
    CFSpellCheck.setText(form.checkText);
    CFSpellCheck.checkSpelling();
    errors = CFSpellCheck.getErrors();

    if (ArrayLen(errors)) {
        WriteOutput("Errors:<br />");
        for (i = 1; i lte ArrayLen(errors); i = i + 1) {
            WriteOutput("word: " & errors[i].getWord() & "<br />");
            WriteOutput("position: " & errors[i].getPosition() & "<br />");
            WriteOutput("suggestions: ");
            suggestions = errors[i].getSuggestions();
            for (j = 1; j lte ArrayLen(suggestions); j = j + 1) {
                WriteOutput(suggestions[j] & "<br />");
            }
            WriteOutput("<br />");
        }
    } else {
        WriteOutput("No errors found!<br />");
    }
} else {
    form.checkText = "";
}
</cfscript>

<!--- Minimal form wrapper; everything except the textarea attributes is assumed --->
<cfoutput>
<form method="post">
    <textarea name="checkText" rows="4" cols="40">#form.checkText#</textarea>
    <input type="submit" value="Check Spelling" />
</form>
</cfoutput>
This example is about as simple as they come. A form is presented to the user that, when submitted, runs the text in the textarea through the spell checker. If any errors are encountered they’re just dumped to the screen.
In the example, take note of the path to the dictionary. I’m using an absolute path from the C: drive and using forward slashes to separate the directories. This is standard Java syntax for defining directory paths. If you get a “dictionary must be non-null” error it means that the path to the dictionary file is not correct and the specified dictionary file could not be found. Also, note how similar the ColdFusion code looks to the main method we commented out in the CFSpellCheck class. Interesting, no?
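Because setDictionary in the wrapper only prints the exception when loading fails, a bad path is easy to miss until the spell check itself complains. One way to catch it earlier is to check the path before handing it over. The following is a sketch only: DictionaryPathCheck and describeProblem are names I made up, and the default path is simply the one used in the example above. It uses nothing beyond java.io.File.

```java
import java.io.File;

// Hypothetical helper: checks a dictionary path before it is passed to the wrapper.
class DictionaryPathCheck {

    // Returns null when the path looks usable, otherwise a short description of the problem.
    static String describeProblem(String dictPath) {
        if (dictPath == null || dictPath.length() == 0) {
            return "no dictionary path was supplied";
        }
        // Forward slashes are accepted by java.io.File on Windows as well
        File file = new File(dictPath);
        if (!file.exists()) {
            return "file not found: " + file.getAbsolutePath();
        }
        if (!file.isFile()) {
            return "not a regular file: " + file.getAbsolutePath();
        }
        if (!file.canRead()) {
            return "file is not readable: " + file.getAbsolutePath();
        }
        return null;
    }

    public static void main(String[] args) {
        String path = args.length > 0 ? args[0] : "C:/JavaClasses/dict/english.0";
        String problem = describeProblem(path);
        System.out.println(problem == null
                ? "Dictionary path looks usable: " + path
                : "Problem: " + problem);
    }
}
```

A check like this could just as easily live inside setDictionary; keeping it separate here avoids changing the wrapper shown in Listing 1.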
Now that we’ve got the spell checker working successfully … what’s next?
Where We Go from Here
There are only a few things that I have in mind to enhance the CFSpellCheck class. I’d like to add in the ability to specify a phonet file for more accurate spell checking, and I’d like to be able to let users add words to the dictionary. Multilingual support would be a nice addition, as well as being able to specify some spell-checking flags to ignore uppercase words, words with numbers, and Internet addresses.
Other than that, the only feature I have in mind can be implemented in client-side code via JavaScript and doesn’t involve modification to the Java class.
This feature would be popping up a “spelling error” dialog with options for replace, ignore, replace all, and ignore all, which would give users the level of interactivity that they have probably come to expect when spell checking text.
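The ignore flags mentioned above (uppercase words, words with numbers, and Internet addresses) could be prototyped as a simple filter applied to each reported word before it is added to the error list. This is a rough sketch under my own assumptions: IgnoreRules and its methods are hypothetical, the address test is deliberately crude, and none of it uses Jazzy’s own configuration options.

```java
import java.util.Locale;

// Hypothetical pre-filter: decides whether a reported word should be skipped.
class IgnoreRules {

    // True when the word has at least one letter and every letter is uppercase.
    static boolean isAllUppercase(String word) {
        boolean sawLetter = false;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (Character.isLetter(c)) {
                if (!Character.isUpperCase(c)) {
                    return false;
                }
                sawLetter = true;
            }
        }
        return sawLetter;
    }

    // True when the word contains any digit.
    static boolean containsDigit(String word) {
        for (int i = 0; i < word.length(); i++) {
            if (Character.isDigit(word.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    // Crude test for URLs and e-mail addresses; an assumption, not a full parser.
    static boolean looksLikeInternetAddress(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        return w.startsWith("http://") || w.startsWith("https://")
                || w.startsWith("www.") || w.indexOf('@') > 0;
    }

    static boolean shouldIgnore(String word) {
        return isAllUppercase(word) || containsDigit(word) || looksLikeInternetAddress(word);
    }

    public static void main(String[] args) {
        String[] samples = { "NASA", "mp3", "www.sys-con.com", "speel" };
        for (int i = 0; i < samples.length; i++) {
            System.out.println(samples[i] + " -> ignore? " + shouldIgnore(samples[i]));
        }
    }
}
```

In the wrapper, a call to something like shouldIgnore could sit at the top of the spellingError handler, returning early for words that match.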
If you’re wondering what else you can do with Java, here are three different project ideas:
* Leverage JasperReports, a report generating library found at http://jasperreports.sourceforge.net/ for your reporting needs.
* Use Struts-Menu to create some slick menus for your ColdFusion applications. Struts-Menu can be found at http://struts-menu.sourceforge.net/. Their Web site contains some very cool demos.
* Build .swf files dynamically on the server with JavaSWF. You can find JavaSWF at www.anotherbigidea.com/javaswf/.
Conclusion
The open source Java community presents many possibilities for ColdFusion developers. Integrating Java and ColdFusion may be easier and require less coding than you expect. In this article, I showed you how to leverage an open source Java spell checker named Jazzy by writing a small wrapper class that exposed the key methods necessary for spell checking.
I also showed you how to call this wrapper from ColdFusion to enable spell checking in your applications. I hope that I’ve gotten you excited at the possibilities of leveraging Java in ColdFusion and that the information presented in the article shows you the necessary steps that lead to success.
Resources
Here are some links with useful information related to Java, ColdFusion, or both.
* The Java Tutorial: http://java.sun.com/docs/books/tutorial/index.html
* Calling Java Objects from ColdFusion: www.intermedia.net/support/coldfusion/cfdocs/Developing_ColdFusion_Applications/cfobject8.html
* Java projects on SourceForge: http://sourceforge.net/search/?words=java
Download the Code …
Go to www.coldfusionjournal.com
Darron Schall is an application developer interested in all things programming, from ActionScript to XML and everything in between. He is a recent computer science graduate from Lehigh University, and maintains a Flash-related weblog at www.darronschall.com.
darron@darronschall.com
COPYRIGHT 2004 Sys-Con Publications, Inc.
COPYRIGHT 2004 Gale Group