Tackling documents, forms and fields with Aspose

The challenge

One subsystem of our Java-based mortgage processing system must generate and deliver about 50 forms for each loan application, populating them with data from our database and delivering them as PDF files.

Some of the forms are industry-standard, but most are custom Word and PDF forms implemented by our customers. We needed to allow them to create those custom forms, marked up with fields that our application would then populate and generate, and, for the Word forms, convert to high-fidelity PDF documents.

In addition to the forms, the subsystem also has to convert image files in a variety of formats to PDF.

To do this processing, we recently evaluated a number of tools to manipulate Microsoft Word and PDF documents. In this post, we’ll explain how we explored several options and ultimately arrived at our choice, Aspose.Words for Java and Aspose.PDF for Java.

First steps

The first forms we implemented were the industry-standard ones. We decided to have these forms created by engineers, not customers, and made available in a “standard form” library.

As we were more comfortable coding than using Word, these forms were implemented using Apache Velocity templates that were populated and then converted to PDF using Prince XML. Prince allows you to define a form in HTML and generate a PDF from it with high fidelity.

Both are excellent tools, and this approach worked for our industry-standard forms. But it requires programming: you have to write HTML/CSS to create a pixel-perfect facsimile of each form, which is not simple. It’s also time-consuming, and the forms are not easily modified, so the approach does not scale well. Still, we have quite a few stable forms built and working well with this approach.

Removing the engineers

Our users’ custom forms, however, presented a different challenge. The Velocity/Prince approach required programming. That wasn’t a viable solution for the custom forms, numbering in the hundreds, and often revised.

We needed to allow our users to create forms with embedded fields and upload them to a repository in our application. When needed for a loan, our application would parse the forms for the merge fields, set the fields from database values, and convert the merged document to PDF for delivery. The Word-to-PDF conversion had to be high fidelity and retain all formatting.

The users were most comfortable working with Microsoft Word documents and fillable PDFs, so we sought a solution that would allow them to create Word documents and PDFs with embedded fields.

For Word, the solution was Word’s merge field feature. This allows a field to be inserted in the Word document, labeled with a field ID (known to our application), and then located using a Word inspection tool and set to database values.
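
To make the merge step concrete, here is a minimal, library-independent sketch in plain Java. It is not the Aspose.Words API (or any Word library): it only illustrates the idea of locating placeholders by field ID and setting them from a map of values. The class name, the field IDs, and the use of plain text instead of a real .docx file are all assumptions made for illustration. The « and » characters are how Word displays merge fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: merge-field substitution on plain text, not on a real Word file.
final class MergeFieldSketch {

    // Matches placeholders such as «BorrowerName».
    // \u00AB and \u00BB are the « and » characters, escaped to avoid encoding issues.
    private static final Pattern FIELD = Pattern.compile("\u00AB(\\w+)\u00BB");

    /** Replaces each «fieldId» with its value; fields with no value are left untouched. */
    static String merge(String template, Map<String, String> values) {
        Matcher m = FIELD.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement keeps characters like '$' in values from being
            // interpreted as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // In a real application these values would come from the database.
        Map<String, String> loan = new LinkedHashMap<>();
        loan.put("BorrowerName", "Jane Doe");
        loan.put("LoanAmount", "$250,000");

        String template = "Dear \u00ABBorrowerName\u00BB, your loan amount is \u00ABLoanAmount\u00BB.";
        System.out.println(merge(template, loan));
    }
}
```

In a real pipeline, the document library does the locating inside the Word file, and leaving unknown fields untouched (as this sketch does) makes missing data easy to spot in the generated output.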


Inserting a mail merge field in Microsoft Word

Word document with merge field placeholders


Exploring our options

So we set off to evaluate several tools by writing prototype code and generating documents.

We had some experience with the Apache POI open source project. We use it for some simple exports of data to Excel files. So we examined its ability to manipulate Word documents. While it worked, we found it not robust or mission-critical-ready enough yet for our application and needs, especially for mail merge work (critical for us). Nor could it convert the Word documents to PDF.

We also considered docx4j, which is open source, pretty robust (including merge field support), and has an active community. But it does not directly convert Word to PDF either. Some people use iText for the conversion, but we wanted a one-tool solution and saw fidelity loss using iText.

We decided to explore the Aspose suite of products. Aspose.Words for Java allows manipulation of Word documents, and Aspose.PDF offers PDF manipulation functionality. The products are not open-source.

Aspose offers a no-cost 30-day evaluation license. Given the complexity of what we were doing, we ended up needing more than 30 days and Aspose graciously extended our evaluation licenses.

The Aspose.Words API is intuitive and permitted all the granularity and control we needed with a minimum of coding. It even had the ability to use some advanced Word mail merge features we thought would be beyond a non-Microsoft tool. One such feature is the nested mail merge region feature. This feature allowed several forms to be designed in a way that would have required a lot of custom code if we didn’t have the feature in our tool.

We were pleased with the speed of Aspose.Words in merging the fields and generating the PDFs. We were surprised to find Aspose.Words generates, populates and converts to PDF faster than our Velocity/Prince approach. We assumed the latter was more lightweight.

But wait, there’s more

While Aspose.Words provided the tools we needed to process documents with fields and convert them to PDF, we had other PDF needs as well:

  • Setting PDF fields directly
  • Converting files in various formats (bmp, gif, jpeg, png, svg, tiff, and txt) to PDF
  • Concatenating and splitting PDFs
  • Inspecting PDF documents to find and extract specific text, in order to locate e-signature fields

The tools that do these kinds of things are Aspose.PDF, iText, PDFBox, and JPedal. Aspose.PDF was the only one that provided all of the features we required. Using one tool versus several was a major addition to the “pro” column for Aspose, especially given we were using Aspose.Words as well.

We had experience using iText before and gave it a shot. While you could find and extract page and x/y coordinates of fields, it wasn’t out of the box. You had to do some of the work yourself. You also could not use regular expressions. Only Aspose provided us the ability to use regular expressions, which was very helpful.
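
The regular-expression search can be sketched in plain Java, assuming the text of each page has already been extracted by whatever PDF library is in use. The [[SIGN:role]] marker convention and all class and method names here are hypothetical, not anything from Aspose.PDF or iText. A PDF library would also report the page coordinates of each match, which this sketch approximates with a character offset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: find e-signature markers in already-extracted page text.
final class SignatureMarkerSketch {

    // Assumed marker convention: [[SIGN:role]], for example [[SIGN:borrower]].
    private static final Pattern MARKER = Pattern.compile("\\[\\[SIGN:(\\w+)\\]\\]");

    /** One marker found in the text: signer role, 1-based page number, character offset. */
    static final class Hit {
        final String role;
        final int page;
        final int offset;

        Hit(String role, int page, int offset) {
            this.role = role;
            this.page = page;
            this.offset = offset;
        }
    }

    /** Scans each page's text (one list entry per page) and collects every marker. */
    static List<Hit> find(List<String> pageTexts) {
        List<Hit> hits = new ArrayList<>();
        for (int i = 0; i < pageTexts.size(); i++) {
            Matcher m = MARKER.matcher(pageTexts.get(i));
            while (m.find()) {
                hits.add(new Hit(m.group(1), i + 1, m.start()));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> pages = Arrays.asList(
                "Loan summary, nothing to sign on this page.",
                "Sign here [[SIGN:borrower]] and here [[SIGN:coborrower]]");
        for (Hit h : find(pages)) {
            System.out.println(h.role + " on page " + h.page + " at offset " + h.offset);
        }
    }
}
```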

One of our requirements was the ability to merge a large number of PDFs (80-100) into one. Using iText, it took an unacceptable (for us) amount of time. We thought this might simply be the nature of the task, but we were surprised to find that Aspose did it significantly (and acceptably) faster.

We found Aspose.PDF was the most capable for converting image files to PDF with high fidelity. Below is one such conversion, from JPG to PDF.

JPG (top) converted to PDF (bottom) with Aspose.PDF


Working with Aspose

We completed the prototyping for all our features using Aspose. As we worked more and more with it, we became convinced that it offered the most robust API, required the least amount of custom coding (very little), and was the most stable and reliable.

Some portions of the Aspose Java API are somewhat non-standard, with method names containing underscores and some that could be more intuitive (e.g. getRectangel_Rename_NameStake Form.get_xfa()). But Aspose appears to be cleaning up several of these issues each release.

Because you don’t have access to the source code, for real nitty-gritty questions or debugging you must rely on the detail level of the documentation (which is quite good), and support (also good). Some of the Java documentation isn’t as complete as the .NET documentation, but the APIs are so similar, you can reference the .NET version even though you’re using Java. And the documentation includes plenty of (.NET and Java) examples for most operations. These are most helpful.

In addition to the API and functionality itself, we were impressed with other factors of the product as well.

The Aspose development and release cycle indicates an active project, with monthly updates generally consisting of dozens of improvements or fixes. Until recently, Aspose did not have a Maven repo, but support was open to the idea and said they’d do that soon. (Maven users will know that without it, you must download the jar after each monthly release or create a Maven artifact in your own repository.) In the two months or so since we brought it up, they have already created a Maven repo. A good sign that they both listen to customer input and are actively improving their product.

The Aspose support staff has been very responsive to our queries, even during our evaluation phase. We typically receive a response with a helpful answer in under 24 hours. Support often asks for code and the artifacts so that they can replicate the problem. The staff has shown themselves to be very technically competent in understanding and resolving our issues, and pleasant to work with.

Aspose it is

Considering all these factors, we decided to go with Aspose.Words and Aspose.PDF. They integrated well into our application, our engineers found it easy to use, and the users are able to work with their familiar Microsoft Office tools. They’re robust, solid, and fast and Aspose provides good documentation and support. They’re not inexpensive, but you pay for quality and they do the job we need very well.

As much as we use and support open source libraries and tools, sometimes the commercial product is the better choice.

Given our good experience with Aspose.Words and Aspose.PDF, we are planning on evaluating Aspose.Cells in the future, to handle our application’s need to process Excel spreadsheets.


Driving Development Without A Governor

In agile development, you hear a lot of words like “lean”, “velocity”, “extreme”, and well… “agile”. They evoke an image of speed – nimble and fast, like a cat. With agile we’re supposed to be moving quickly and responding to change easily. These are both good things for sure, but is there such a thing as moving too quickly?

Speed limiters in automobiles are termed governors. One use of the governor is to limit the rotational speed of the engine to protect the engine from damage due to excessive speed. So is there such a thing in Agile development? Do we need such a thing? Agile development’s major benefit is to get continuous feedback from your customers as you build. You are continuously planning, testing, integrating, and refining as the requirements change and grow. The faster you build and deploy, the sooner you’ll get new feedback, but the rush to completion can cause its own problems.

The same way that the governor in a car prevents you from wrecking your engine, and limits how fast you could recklessly drive, an agile governor should help prevent making major errors in your code and accruing technical debt.

Technical debt can occur in several forms: pieces that were meant to be refactored never are, not enough brain cycles are put into the initial design, code gets copied and pasted more than it should. Then, when the customer gives you an approval of a feature, there is always the temptation to close the feature and move on to the next without having one last look to fix things you promised yourself you would at the end. You may end up with a feature that on the surface appears to work, but internally is hard to test, hard to read, inefficient and impossible to enhance without the fear of introducing bugs.

So we still want to deploy features fast to get feedback, but without hurting the integrity of the feature. What is the governor in agile development?

As you develop you should be writing unit tests. (You are writing unit tests, right?) This is one kind of governor. The benefits of unit tests are well known, but it’s more than testing correctness and finding regressions. It’s when you stop and think about the use cases and how they will be handled. In essence, it forces you to think through the problems you will encounter and design solutions to those problems. It forces you to slow down and think a little instead of just head-down code.


For a long time, we thought unit testing was enough. In the past 10 years, testing has really become mainstream, so it should be assumed at this point, right? But is there more?

We’re a small team, with a lot to do. Revolutionizing lending takes a lot of work. This isn’t a project management tool or the next wannabe social network. We can be as agile as possible, but our minimum viable product is still not very minimal. We feel the constant pressure to get the next feature checked off before we can hit our 1.0.

We have really good quality code and we were testing, but a few months ago, we put our need for speed aside, and wanted to make sure our quality was even better. We devoted even more time to our deployment process and quality assurance. What we found is that we could improve even more. The unit tests and testing were not enough. A more robust quality assurance process was a speed limiter for us, but again, that might just be a good thing.

With the increased QA, we found that it would probably be worthwhile to do more code reviews. Not just sometimes. Not just when someone asks. Always. We held off on this one longest, worried about the cost in terms of productivity. What we found is that code reviews are just as important as unit testing and even more important in many respects. Code reviews create accountability for the feature. There is something about knowing someone else will be looking at your code that encourages you to design and implement a feature well. This is the time where someone with prejudice attempts to find all flaws in the design or implementation. So it behooves the requester to look at things one more time before the code reviewer starts. It is not meant to embarrass, but is an opportunity to catch issues upfront, exchange ideas, and learn the feature. It also gives the developer and reviewer an opportunity to learn from each other’s strengths and weaknesses. The code reviewers must not rush and just skim the code, as this bypasses all the benefits of the code review.

It is tempting to develop without tests, code review, and thorough QA, the same way some gear heads love to get rid of their speed limiters to see if they can make a car go faster than it was ever meant to. What we found, though, is that it’s better to get there in one piece. We can continue writing code at a fast pace, and as long as we don’t rush unit testing, code reviews, and quality assurance, we have an agile development environment that limits how much we can rush, which keeps us at a safe speed.

What about you? Do you drive development without a governor?

Web Platform

Bootstrapped up with nowhere to go

or: How I Learned to Stop Using Twitter Bootstrap and Love LESS

Octane (our product) has been around for a while – we’ve watched the trends in UI design, frameworks, and developer mindshare changing along the way. We’ve been from jQuery to YUI back to jQuery, with a brief consideration of Dojo along the way. At this point our codebase is mostly custom css and mostly custom javascript widgets, with some great third party libraries where they met our standards and fit our needs exactly. I remember when Bootstrap first came out – it was fresh, and had a lot of good ideas. We were doing some new development work at the time in a separate part of the app – something which would be more public facing and could use a bit of a fresh start with new choices. We took a leap of faith on Bootstrap and followed it through 2.0, investing in LESS and integrating it into our build.

Recently, we’ve started a pretty large UI overhaul, with fresh design work heading in a slightly new direction. We were going to try to build something more cohesive across our projects. It was a flat design, going along with the trends, and we found the Flat UI theme built on Bootstrap, which matched a lot of our design. It seemed like a clear win.

When work began, I devoted some time to getting up to speed on what the current best practices were in the CSS world. I knew we had built up some cruft, and I wanted to clean it up along the way. I found OOCSS and SMACSS and decided to give it a deeper look. I watched some presentations and read the SMACSS book, and it all clicked pretty well. Many of the ideas I was familiar with, such as the technique of using multiple classes to refine styles, but I hadn’t really gone through the level of structural rigor that OOCSS and SMACSS prescribe. It made sense to me, though, so I wanted to use it during our overhaul in conjunction with Bootstrap, which didn’t seem to conflict too much…

Well, long story short, after really digging in and trying to make it all work – to make the app work the way I needed it to, and look the way I needed it to – I found that the number of overrides I was making became much more than the value proposition of Bootstrap to begin with. Combined with the Flat UI theme, I was writing overrides of overrides. I had two sets of variables and mixins, plus my own. I was dropping entire chunks of functionality like grids and forms because they were doing more harm than good. In the end, I ultimately used some code from Bootstrap and the Flat UI theme, but I condensed it all without overrides, and really just used it as a scaffolding for the final design.

So what’s the conclusion here? Should people avoid Bootstrap? Would I do it over again differently? This is my final takeaway:

  • I’m glad Bootstrap got me into using LESS (judiciously).
  • Bootstrap helped me figure out a good way of organizing my LESS files.
  • Bootstrap has some good ideas about breaking down some of the fundamental components of a page and figuring out what those pieces are. What the vocabulary is. A lot of that I kept even if the css, classnames, and even html changed.
  • I’m a snob and I think the dropdown and tooltip widgets (the only two I even tried using) are kind of crap. They work fine in prototyping, but have fundamental flaws.
  • Bootstrap makes tradeoffs to have nicer looking html. We take a different approach to get nice view code, and don’t need to make those tradeoffs.
  • Finally, I realize that not all apps or websites are like Octane. Other people may have no time or talent for design or implementation of custom CSS, in which case, Bootstrap is a pretty good foundation. It also works pretty well for just prototyping or scaffolding.

Before I made the final decision to remove our dependency on Bootstrap, I looked around to see what others were saying. There’s a lot of praise for Bootstrap out there. Given that my ideals had shifted to OOCSS/SMACSS, I was curious about that intersection. I got my answer here. To pull some choice quotes:

“Yeah, my initial impression has been that design is too tightly coupled to structure. You can’t easily reskin w out ripping out or overwriting their styles.”

That was from Nicole Sullivan, creator of OOCSS and a well-respected authority on good CSS.

“I’ve used it for 3 projects now. At first I loved it, but now I think that I live only a few aspects of it – the mixins, icons and buttons.

It is an amazing project, but I actually prefer the lightness of the OOCSS framework.

While you can customize things in bootstrap, you ant take it too far before things start to break.

In the end, what it’s taught me about LESS is what I appreciate the most. I’m building a LOOCSS project :-p”

That was from another person in the thread, and it really echoed my own experience.

I’ve got more thoughts and experiences on OOCSS and SMACSS, and where we’re going with it, but I’ll just leave that for another post.

Web Platform

You’ve got to think for yourselves.

You are all individuals.

I hate when technical debates turn into religious wars. Today’s war on tap? Progressive enhancement.

For anyone who hasn’t heard the ruckus, here’s a quick summary:

  • Progressive Enhancement (PE) became a term in 2003 and by 2008 became a best practice. It was a mantra chanted repeatedly, the same way CSS over tables was pounded into the minds of every web developer. Here’s a nice history/overview.
  • Within the past couple of years, as browsers have gotten better, single-page applications have gained a lot of traction, completely abandoning the page request model and rendering all html client-side.
  • Many proponents of PE, noticing the trend, have written/presented about why it is still important.
  • On August 27th 2013, a tumblr called Sigh, JavaScript was launched “reminding” people that PE is still important by posting images of sites that don’t work without JavaScript.
  • On September 2, 2013, Tom Dale (from ember.js) posted an article declaring PE “dead”. Despite the article’s controversial and hyperbolic title, it was (IMHO) actually fairly balanced and well reasoned throughout.
  • Chaos ensued on twitter and the rest of the web.

There have been some other summary posts trying to provide their own insight. Here’s mine:

I think the information presented by both sides is far more valuable than the conclusions they reach. They both have compelling arguments, and they both have good examples to support their case, but as I hope we’ve all learned time and again, you should never use dogma in place of actual thought. If you look at the data and arguments presented, I think you can actually come away with valuable guidelines when making decisions, and despite the opposing conclusions, I think the guidelines complement each other well as long as you keep context in mind.

If you have content that you want deep linkable, especially something as small as a tweet, you damn well better be able to load the page fast. What is the fastest way to do that? Serving the html for initial page load is pretty damn good at it. This is probably not just tweets, but news articles, blogs, recipes, etc. However, even if you go that route, remember that you’re making an argument from a position of efficiency/good user experience. And guess what, there’s a million other things that go along with that. What are you doing about ads? What about number of http requests, etc. Serving html and enhancing is just one tool in the arsenal for faster web apps. I will also go out on a limb and say that it’s not really the original point of PE.

The efficiency argument has another flaw I haven’t seen get much attention either – if you’re going to delay rendering until after JavaScript runs to prevent the dreaded FOUC, then in terms of efficiency, the argument is basically dead right there, at least for the cases where the progressive enhancement actually has to update the DOM before rendering. In both cases the browser has to wait until all of the JavaScript is loaded and executed before rendering anything. Sure it will work without JS turned on, but that is a different argument than the efficiency one.

So what about the other argument for PE? What about no JavaScript? PE started as a movement because of the potential lack of JavaScript by the client – which (as they always push home) also means google! This was the PE argument that Tom Dale was mostly railing against in his post. He posits that the userbase with JavaScript turned on is high enough that if you want to use 100% JavaScript, that’s ok, especially if you don’t care about google. I found his argument logical and sound. The web platform has grown up a little. JavaScript can be expected for a wide enough audience, that you can now make the case for ignoring the few with it turned off. It may not be for everyone, but it is an extremely valid choice – one which can be made more often. It is the choice that we made for Octane.

The way I see it, one of the most powerful features of the web platform is how flexible it is. Some kind of html rendering is available on almost every electronic device or platform with a screen. That is a virtue of the web, not a contract. The logic that all software available online has to be usable from all possible devices is clearly false. Not all software is made to work that way, and not all software that can be made to work that way is. So where is the line? And what exactly is the bogeyman we are attempting to fight off with PE, that it should be “ingrained in the DNA of anyone who works on the web”? Is it a morality argument or simply a business one? As “Sigh, JavaScript” demonstrates (unintentionally), there are clearly many successful businesses with websites that don’t work without JS. As for morality, with the exception of resources such as government websites, I have trouble finding the moral argument.

Is PE dead? No. Is it necessary for all projects that can do it? No. In some projects, is it a requirement? Yes. For the rest, it is an optimization. It is like choosing to create a native app for BlackBerry or Windows 8. It is a competitive advantage to be able to reach users in more places. In some cases, it could be hugely important to some of your users. Ultimately, though, the goal is to make the best product you can, and you will always have to make hard choices about the best way to do that. It will always be a balance. Add a new feature or make the product faster? Fix a bug or improve documentation? Improve the UI or support IE7? These aren’t dichotomies, they are priorities. The amazing thing about the web is that with enough time, you might be able to do it all, but how you get there is up to you.