The JDK’s XSD 1.0 Grip and the Elusive Xerces Upgrade

Ever found yourself staring down a legacy application, a dusty relic from a bygone era of software development? I certainly did this past week. My mission: to breathe new life into an old Java application that, among other things, analyzes proprietary XML files. Now, I know what some of you junior developers might be thinking: “XML? Seriously? That’s not exactly cutting-edge!” And you’d be right, it’s not the blockchain or AI. But here’s the kicker: XML has an often-underestimated superpower – the ability to validate a file against a defined grammar. This grammar, my friends, is called an XSD, or XML Schema Definition. Fun fact for the curious: you write XSDs in XML itself.
The application I was tasked with already leveraged XSDs for validation, which was a good starting point. However, like many things in older systems, it was stuck in a past version: XSD 1.0. Fast forward to today, and XSD 1.1 has been around for a while, bringing with it some genuinely significant enhancements. Specifically, I was eyeing the new capabilities for assertions and identity constraints, features that could finally replace the numerous `//TODO validate` comments scattered throughout the Java codebase with actual validation.
I thought this upgrade would be a straightforward affair. A quick dependency bump, perhaps a minor code change, and boom – XSD 1.1 power unleashed! Oh, how wrong I was. What followed was a rabbit hole of discovery, frustration, and ultimately, a hard-won victory. Join me as I recount the journey, the roadblocks, and the surprising solution to getting XSD 1.1 validation working in Java.
The JDK’s XSD 1.0 Grip and the Elusive Xerces Upgrade
My first instinct, and likely yours too, was to assume that if XSD 1.1 was a W3C Recommendation, Java’s underlying XML processing would have embraced it. It has not. The JDK, under the hood, uses a bundled and slightly customized version of Xerces for parsing. You can peek into this world yourself by exploring the `com.sun.org.apache.xerces.internal.jaxp` package of your installed JDK. The catch? This internal implementation only supports XSD 1.0 validation.
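To see this for yourself, here is a minimal sketch that asks JAXP which `SchemaFactory` implementation it hands out for a given schema language. The class name `SchemaFactoryProbe` is made up for illustration, and the XSD 1.1 URI literal is an assumption on my part: it is the value I understand the XSD 1.1 builds of Xerces to register, not something the JDK defines.

```java
import javax.xml.XMLConstants;
import javax.xml.validation.SchemaFactory;

// Hypothetical helper, not part of the application described in this post.
final class SchemaFactoryProbe {

    // Assumption: schema language URI used by XSD 1.1-capable Xerces builds.
    static final String XSD_11 = "http://www.w3.org/XML/XMLSchema/v1.1";

    /** Returns the implementation class name, or null if no factory supports the language. */
    static String implementationFor(String schemaLanguage) {
        try {
            return SchemaFactory.newInstance(schemaLanguage).getClass().getName();
        } catch (IllegalArgumentException e) {
            // JAXP signals "no implementation available" with this exception
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("XSD 1.0 factory: "
                + implementationFor(XMLConstants.W3C_XML_SCHEMA_NS_URI));
        System.out.println("XSD 1.1 factory: " + implementationFor(XSD_11));
    }
}
```

On a stock JDK with nothing extra on the classpath, I would expect the first line to name a class under `com.sun.org.apache.xerces.internal` and the second to report no factory at all.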
So, the next logical step was to bypass the JDK’s internal components and pull in the “real” Apache Xerces project. Surely, the official, standalone project would have progressed and implemented XSD 1.1 by now? I added the latest available Xerces version to my build, recompiled, and… nothing. The validation still behaved like XSD 1.0. This sent me digging through the Xerces JARs themselves.
A Constant Puzzle: SCHEMA_1_1_SUPPORT = false
My investigation quickly led me to a file named `Constants.java` within the Xerces library. And there it was, staring back at me:
```java
public final class Constants {

    /** XML 1.1 feature ("xml-1.1"). */
    public static final String XML_11_FEATURE = "xml-1.1";

    // Constant to enable Schema 1.1 support
    public final static boolean SCHEMA_1_1_SUPPORT = false;

    public final static short SCHEMA_VERSION_1_0 = 1;
    public final static short SCHEMA_VERSION_1_0_EXTENDED = 2;
}
```
If you’re as puzzled as I was about `SCHEMA_1_1_SUPPORT` being a hardcoded `false` constant, welcome to my world of head-scratching. This wasn’t a configuration I could simply flip. It was deeply embedded. My curiosity, or perhaps my stubbornness, pushed me further down the rabbit hole.
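Why can’t a flag like this simply be flipped at runtime? A minimal sketch, using a hypothetical `FeatureFlags` class as a stand-in (this is not the Xerces source, and I am not claiming anything about how Xerces reads its flag): reflection refuses to write to a `static final` field, and on top of that the compiler is free to inline such compile-time constants into every class that reads them.

```java
import java.lang.reflect.Field;

// Hypothetical demo class: tries to flip a hardcoded flag through reflection.
final class FlipAttempt {

    /** Returns true only if the reflective write succeeded. */
    static boolean tryFlip() {
        try {
            Field flag = FeatureFlags.class.getDeclaredField("SCHEMA_1_1_SUPPORT");
            flag.setAccessible(true);
            // Writing to a static final field is rejected with IllegalAccessException
            flag.setBoolean(null, true);
            return true;
        } catch (ReflectiveOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Flag flipped: " + tryFlip());
        System.out.println("Flag value:   " + FeatureFlags.SCHEMA_1_1_SUPPORT);
    }
}

// Hypothetical stand-in for a library class with a hardcoded feature flag.
final class FeatureFlags {
    static final boolean SCHEMA_1_1_SUPPORT = false;
}
```

In other words, the only reliable way to change such a flag is to change the source and rebuild, which is exactly what a separate development branch is for.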
I discovered two potentially promising branches within the Xerces-J project’s source control: `xml-schema-1.1-dev` and `xml-schema-1.1-tests`. Naturally, I checked the `Constants` class in the `xml-schema-1.1-dev` branch, hoping for a different outcome:
```java
public final class Constants {

    /** XML 1.1 feature ("xml-1.1"). */
    public static final String XML_11_FEATURE = "xml-1.1";

    // Constant to enable Schema 1.1 support
    public final static boolean SCHEMA_1_1_SUPPORT = false;

    public final static short SCHEMA_VERSION_1_0 = 1;
    public final static short SCHEMA_VERSION_1_0_EXTENDED = 2;
    public final static short SCHEMA_VERSION_1_1 = 4;
}
```
Still `false`! Though the inclusion of `SCHEMA_VERSION_1_1 = 4` felt like a tantalizing hint of future support, it wasn’t production-ready. I scoured Maven Central for a dedicated Xerces artifact specifically for XSD 1.1, but my search yielded nothing. Later, while writing this very post, I remembered checking the official Xerces downloads page. There *is* indeed an XSD 1.1 distribution available for download. But that would mean manually downloading the JAR, crafting a dummy POM, and then publishing it to an internal Artifactory – a task made infinitely harder by my lack of write access. Possible, yes, but far too time-consuming for the immediate need.
The Alternative Detour: Saxon’s Strengths and Stumbling Blocks
With Xerces proving to be a dead end for a straightforward upgrade, I started looking for alternatives. My search repeatedly pointed to one name: Saxon. For those unfamiliar, Saxon is a robust and comprehensive collection of tools for processing XML documents. It’s renowned for its XSLT, XPath, and XQuery processors, and crucially, it also includes an XML Schema Processor.
The documentation for Saxon clearly stated its support for both XSD 1.0 and XSD 1.1. This sounded like the perfect solution! I dove into its API, imagining my problems dissolving. However, a deeper investigation quickly revealed two significant drawbacks that put a damper on my enthusiasm:
Proprietary APIs and Pricing Tiers
First, while Saxon *can* be used with the standard JAXP (Java API for XML Processing) API, to truly unlock its full power – including its XSD 1.1 validation capabilities – you’re encouraged, if not required, to switch to Saxon’s proprietary API. This meant a more substantial refactoring effort than I had initially budgeted for, moving away from standard Java platform APIs.
Second, and perhaps more critically, Saxon comes in several editions, notably the Home Edition (HE) and the Enterprise Edition (EE). The Home Edition is free and open source, which was ideal for my budget-constrained project. The catch? The Home Edition *does not* offer schema validation at all, XSD 1.1 included. That feature is reserved exclusively for the paid Enterprise Edition. With no budget for new software licenses and a tight deadline, Saxon, despite its technical prowess, became another path I couldn’t take.
The Unexpected Lifesaver: AI and a Forgotten Maven Gem
At this point, I felt like I was back at square one. Publishing the official Xerces XSD 1.1 distribution by hand was not feasible. Saxon was off the table due to cost and API lock-in. My last hope, a somewhat desperate one, was to consult an AI assistant. And I must say, for once, it was a lifesaver.
It turns out that two Xerces builds with XSD 1.1 features *do* exist on Maven Central! They were published years ago, between 2015 and 2016, by OpenGIS, at a time when the rules for publishing artifacts to Maven Central were perhaps less stringent. Their POM links directly to the `xml-schema-1.1-dev` branch I had found earlier. This meant a fully functional, albeit somewhat “wild,” version of Xerces with XSD 1.1 support was out there, available as a dependency!
With the correct dependency in hand, the rest was mercifully straightforward. I was just a few lines of code away from achieving my goal:
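```java
import javax.xml.parsers.SAXParserFactory;
import javax.xml.validation.SchemaFactory;
import org.apache.xerces.impl.Constants;

// Ask for an XSD 1.1 factory instead of the default XSD 1.0 one
var schemaFactory = SchemaFactory.newInstance(Constants.W3C_XML_SCHEMA11_NS_URI);
var schema = schemaFactory.newSchema(schemaFile);

var saxParserFactory = SAXParserFactory.newInstance();
saxParserFactory.setNamespaceAware(true);
saxParserFactory.setSchema(schema);

// handler is the application's existing SAX handler
var reader = saxParserFactory.newSAXParser().getXMLReader();
reader.setContentHandler(handler);
reader.setErrorHandler(handler);
reader.setEntityResolver(handler);
```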
The key here is using `Constants.W3C_XML_SCHEMA11_NS_URI` when instantiating the `SchemaFactory`, explicitly telling it to use the XSD 1.1 schema version. Then, it’s a matter of setting that schema on your `SAXParserFactory`, and you’re good to go. Of course, I still need to properly vet this “wild” build for any potential security implications, given its age and origin. But functionally, it works: I can now leverage the powerful assertion and identity constraint features from XSD 1.1!
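To give a feel for what an assertion buys you, here is a self-contained sketch rather than code from the actual application. The `range` schema, the class name `Xsd11AssertDemo` and the sample documents are all invented for illustration. To keep it compilable without the Xerces JAR, it uses the URI literal that I understand `Constants.W3C_XML_SCHEMA11_NS_URI` to hold (an assumption), and it returns an empty `Optional` when no XSD 1.1 factory is on the classpath.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo, not taken from the legacy application.
final class Xsd11AssertDemo {

    // Assumption: the value behind Constants.W3C_XML_SCHEMA11_NS_URI
    static final String XSD_11 = "http://www.w3.org/XML/XMLSchema/v1.1";

    // Invented schema: a range whose min must not exceed its max (xs:assert is XSD 1.1 only)
    static final String SCHEMA =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "  <xs:element name='range'>"
        + "    <xs:complexType>"
        + "      <xs:attribute name='min' type='xs:int' use='required'/>"
        + "      <xs:attribute name='max' type='xs:int' use='required'/>"
        + "      <xs:assert test='@min le @max'/>"
        + "    </xs:complexType>"
        + "  </xs:element>"
        + "</xs:schema>";

    /** Returns validation errors, or Optional.empty() if no XSD 1.1 factory is available. */
    static Optional<List<String>> validate(String xml) {
        SchemaFactory schemaFactory;
        try {
            schemaFactory = SchemaFactory.newInstance(XSD_11);
        } catch (IllegalArgumentException e) {
            return Optional.empty(); // no XSD 1.1 implementation on the classpath
        }
        try {
            Schema schema = schemaFactory.newSchema(new StreamSource(new StringReader(SCHEMA)));
            SAXParserFactory parserFactory = SAXParserFactory.newInstance();
            parserFactory.setNamespaceAware(true);
            parserFactory.setSchema(schema);

            List<String> errors = new ArrayList<>();
            XMLReader reader = parserFactory.newSAXParser().getXMLReader();
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage()); // collect validation errors instead of failing fast
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
            return Optional.of(errors);
        } catch (Exception e) {
            throw new IllegalStateException("Validation could not be performed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(validate("<range min='1' max='10'/>"));
        System.out.println(validate("<range min='10' max='1'/>"));
    }
}
```

With an XSD 1.1-capable Xerces on the classpath, the second document should be the one that reports an error, because its `min` is greater than its `max`. That is precisely the kind of cross-field rule that previously lived in a `//TODO validate` comment.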
Conclusion
My journey to implement XSD 1.1 validation in a legacy Java application was far more convoluted than I initially anticipated. What seemed like a simple version upgrade turned into a deep dive into the intricacies of Java’s XML processing, the evolving landscape of XML parsers like Xerces and Saxon, and ultimately, a reliance on an unexpected discovery on Maven Central.
I hope that by sharing my struggles and eventual success, this post will serve as a beacon for other developers who find themselves in the same predicament. Upgrading to XSD 1.1, with its richer validation capabilities, can significantly improve the robustness and data integrity of applications that rely on XML. Sometimes, the path to modernization isn’t straight; it’s a winding road filled with hidden constants, proprietary licensing, and the occasional helpful AI prompt. Happy coding!
Originally published at A Java Geek




