Monday, January 21, 2008

SAXParseException: -1:-1: Premature End Of File - Misleading error

Today I had to look at a piece of code a colleague had written using my XPathAccessor class. She used it in a servlet that receives XML-formatted requests. As those are generated by an external third-party tool, we agreed on some XML schema definitions. Everything they send us needs to conform to its corresponding schema, and each reply we send is validated against a different set.

In order to allow independent testing on either side, we provided a little test kit that allows testing our system without having to set up a servlet engine. Basically it just takes a file, reads it into a String and hands that to the handler.

First the request gets parsed without validation. This is necessary to find out which type of request we were sent (the address is the same for all of them). Once the root element is known, the request is read again, this time using the right schema to verify it.

Once that is done, some reply is put together and sent back to the client. So far, so good.

When I looked at the code I could see nothing wrong. Nevertheless each time a request was sent, we got a

2008-01-21 10:02:12,889 INFO  [STDOUT] [Fatal Error] :-1:-1: Premature end of file.
org.xml.sax.SAXParseException: Premature end of file.
    at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)

I had a look at the source of the XML input and first suspected a superfluous CR-LF after the closing tag of the root element. Some people on the net claimed that this might cause the error above, but removing it did not help.

This is the relevant code that handles the request:

public String readXmlFromStream(InputSource anInputSource) {
    String tResult = null;
    try {
        XPathAccessor reader = new XPathAccessor(anInputSource);
        String type = reader.xp("rootElem", "reqType");
        if (type.startsWith("K")) {
            Schema schemaK = XElement.getSchema(this.getClass()
                .getResourceAsStream("/schema/K.xsd"));
            XPathAccessor validatingReader = new XPathAccessor(anInputSource, schemaK);
        ...

The last line throws the "Premature end of file" SAXParseException. The constructors of XPathAccessor look like this:

public XPathAccessor(InputSource aSource) throws SAXException, IOException, ParserConfigurationException {
    this(aSource, null);
} 

public XPathAccessor(InputSource aSource, Schema aSchema) 
    throws SAXException, IOException, ParserConfigurationException {

    Validator validator = null;
    builder = factory.newDocumentBuilder();
    document = builder.parse(aSource);
    if (aSchema != null) {
        validator = aSchema.newValidator();
        validator.validate(new DOMSource(document));
    }
} 
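
The parse-then-validate sequence of the second constructor can be tried in isolation. What follows is only a minimal sketch, not the real XPathAccessor: the class name ParseThenValidateSketch, the inline schema, and the assumption that reqType is an attribute of rootElem are all made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class ParseThenValidateSketch {

    // Hypothetical schema (assumption): one <rootElem> with a required reqType attribute.
    static final String XSD =
          "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"rootElem\">"
        + "<xs:complexType>"
        + "<xs:attribute name=\"reqType\" type=\"xs:string\" use=\"required\"/>"
        + "</xs:complexType>"
        + "</xs:element>"
        + "</xs:schema>";

    static Schema loadSchema() throws SAXException {
        SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        return schemaFactory.newSchema(new StreamSource(new StringReader(XSD)));
    }

    /** Parses the XML into a DOM, then validates that DOM; returns false on any SAX error. */
    static boolean isValid(String xml, Schema schema) throws ParserConfigurationException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Namespace-aware parsing is the usual setting when a DOM will be schema-validated.
        factory.setNamespaceAware(true);
        try {
            Document document = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            schema.newValidator().validate(new DOMSource(document));
            return true;
        } catch (SAXException e) {
            // Covers both malformed XML and schema violations.
            return false;
        }
    }
}
```

With this hypothetical schema, a request that carries the reqType attribute passes and one that lacks it is rejected. Note that validation works on the already parsed DOM, so the input itself is read only once per call.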

Curiously, in the case of the servlet no files are involved at all. Everything is in memory, so "Premature end of file" is not too helpful anyway. The solution to this mess can be found (and sometimes this does turn out to be helpful) in the API documentation for InputSource:

An InputSource object belongs to the application: the SAX parser shall never modify it in any way (it may modify a copy if necessary). However, standard processing of both byte and character streams is to close them on as part of end-of-parse cleanup, so applications should not attempt to re-use such streams after they have been handed to a parser.

This last sentence is the clue: because the InputSource has already been used to find out the type of request, it cannot be used again for the validating XPathAccessor. In that light the error message at least makes a little sense: the underlying stream has been read to its end and closed, so one might call that the "end of file"; and because the second XPathAccessor has just tried to read it from the start, "premature" might be a valid qualifier...
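
One way around the problem, sketched here under a few assumptions, is to buffer the request bytes once and wrap them in a fresh InputSource for every parse. The class BufferedXmlInput and its helper methods are hypothetical names and not part of the servlet described above; reqType is again assumed to be an attribute of rootElem. The sketch also reproduces the failure by handing the same InputSource to the parser twice.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class BufferedXmlInput {

    /** Copies the whole stream into memory so that it can be parsed more than once. */
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    /** Every parse gets its own InputSource over its own stream. */
    static InputSource freshSource(byte[] xml) {
        return new InputSource(new ByteArrayInputStream(xml));
    }

    /** Reproduces the problem: the same InputSource is handed to the parser twice. */
    static boolean secondParseFails(byte[] xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        InputSource source = freshSource(xml);
        builder.parse(source); // consumes the underlying stream
        try {
            builder.parse(source); // nothing left to read
            return false;
        } catch (SAXException | IOException e) {
            return true;
        }
    }

    /** First pass reads the request type, second pass re-reads from a fresh source. */
    static String typeThenReparse(byte[] xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document first = builder.parse(freshSource(xml));
        String type = first.getDocumentElement().getAttribute("reqType");
        Document second = builder.parse(freshSource(xml));
        return type + ":" + second.getDocumentElement().getNodeName();
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = "<rootElem reqType=\"K1\"/>".getBytes(StandardCharsets.UTF_8);
        System.out.println("second parse of the same source fails: " + secondParseFails(xml));
        System.out.println("fresh source per parse: " + typeThenReparse(xml));
    }
}
```

In a servlet, readFully would be called once on the request's input stream. Whether buffering the whole request in memory is acceptable depends on the expected request sizes.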

Knowing that also explained why the test kit worked fine: it read the XML contents into a String, which another set of overloaded constructors for XPathAccessor can accept. Of course a String can be read as often as you like, so no problems there.
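
The String-based route can be sketched as follows. This is an assumption about how such String-accepting constructors might work, not the actual XPathAccessor code; StringBackedParsing and its methods are hypothetical names. The point is that each parse gets a new StringReader over the same String.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class StringBackedParsing {

    /** A String is immutable, so a new reader over it always starts at the beginning. */
    static InputSource freshSource(String xml) {
        return new InputSource(new StringReader(xml));
    }

    /** Parses the same String several times; returns how many parses succeeded. */
    static int parseRepeatedly(String xml, int times)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        int succeeded = 0;
        for (int i = 0; i < times; i++) {
            builder.parse(freshSource(xml)); // throws if a parse fails
            succeeded++;
        }
        return succeeded;
    }
}
```

Calling parseRepeatedly with a well-formed document and a count of 3 completes all three parses, because no reader is ever handed to the parser twice.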

As the docs do not give an immediate hint, I hope this post saves someone some time.

55 comments:

Dave said...

Great explanation! Took me a while to find it, but that's exactly what I was running into. I didn't realize I had to make copies of my input stream for each call to the SAX parser.

meawww said...

Thanks for a clear explanation! I was caught by this exception yesterday and have spent all day struggling with it. This morning google took me to your blog (lucky me), and now I understand what's going on in my code :)

katis said...

Thx for this solution - really saved me some time :)

Roberts said...

jep it saved me some time! :)

tolya said...

Thanks a lot! It really helped.

Ganesh said...

Thanks a lot, its really helpful.

Mike said...

great! Thanks a lot.

Anonymous said...

poso malakes eiste re?
arxidia mas eipes re papara

Anonymous said...

thx Great explanation!

Anonymous said...

Thanks a lot! Great explanation!

Anonymous said...

Thanks a lot, that's exactly the problem we were facing as we allowed a concurrent modification of the underlying xml.

Anonymous said...

Thanks a lot.. Thats exactly the problem I was facing for my DTD which I read in a static block as a static variable and then even though the first XML parsing went fine, the second XMl import using the same DTD stream failed due to the stream being closed by first parse.

sudipta said...

Great explanation. Many thanks to you, and to Google too for taking me to your blog.

Anand said...

Thanks! It Helped me!

Just created two streams from the same source, and used at different places...and magic!

Joost said...

Thanks for this explanation. It will save me loads of time figuring out myself.

KaJun said...

Thanks! This helped me with my troubleshooting, although I was getting the same exception on the xml prolog.

sudhin philip said...

Thanks a lot!!!! This is some thing really great! My kudos to you..

Anonymous said...

Thank you for taking the time to share what this cryptic error message means. You probably just saved me days of trouble!

Anonymous said...

Hi,
I'm facing same error. I have two methods one invokes webservice with HttpClient and gets XML in response and returns response as inputStream. Second method takes InputStream as an parameter and parses xml. In the first method I'm not able to see XML as a response. Any pointers to resolve this exception?

Anonymous said...

cool tip, helped a lot, thanks!

Solsticiu said...

Thanks a lot Daniel. It was really helpful

Anonymous said...

Good explanation!

Thanks a lot!

Koteswar said...

Good Explanation

Wang Geng's space said...

Cool. very helpful

Frank T. said...

Thank you for that details.

Tried to (re)parse an InputStream and received a similar exception.

Based on your hints I solved this issue by parsing the InputStream for the top level node (/*). Further parsing may be done relative to top level node, e.g. ./Data/@value.

Thanks - Frank

Daniel said...

TA mate, saved me some time!

Anonymous said...

Many thanks :)

Anonymous said...

THANKS!

Anonymous said...

Daniel
That was really informative.
But i m kinda stuck in a same situation and not able to resolve.
If you could provide me with some pointers on it.
http://www.coderanch.com/t/546316/Web-Services/java/Getting-Exception-Premature-End-file

Thanks a lot

Satish said...

wow! very nice explanation!
Thanks.

Anonymous said...

Thanks!

You just saved me some debugging time. Receive some well-deserved karma points from here.

Gerben said...

Already wasted alot of time fixing my problem, but you defo saved me from more wasted time :)

karthik said...

thnx, it really was very useful

Anonymous said...

Thanks. Really worked and saved time. Hope others find their way to this explanation.

Justin Johnson said...

Four years later and you're still saving asses. Thanks a lot!

raghu varma said...

I agree with above comment. Thanks for sharing.

Anonymous said...

Great explanation! Thanks!

Anonymous said...

Saved me and my company some time!

Anonymous said...

Another one to add to the list of thank you's :)

Anonymous said...

Sorry but I did not get it, What could be the solution ??? thanks in advance.

Anonymous said...

Thanks a lot

Clement Levallois said...

still useful in Oct 2012....

Ribhu Bhaskar said...

This was really useful !!

Ribhu Bhaskar

radha said...

The exact problem I was facing. This post helped me fix it fast. Thanks

codesurfer said...

Very useful post indeed, thanks!

kiran said...

Thanks, I was able to resolve an issue due to this post.

kiran said...

Thanks, I was able to resolve a similar issue.

Anonymous said...

I never forget your name Daniel :) good article .

Robert Walker said...

i had code like so
out = response.getWriter(); Then i did my transform with
transformer.transform(xmlStreamSource, new StreamResult(out)); followed by what i though was harmless. [out.flush() and out.close()]. this article was a lifesaver. oddly though, it was working for over year as is then started going belly up and didn't know how to fix, removing those out.flush/close did the trick.

Vipin said...

Thanks Daniel. It saved my lot much time. My code was like -
return (xPath.evaluate(xPathExpression, new InputSource(new StringReader(body))));

I just made the like -
InputSource inputSource = new InputSource(new StringReader(body));
(xPath.evaluate(xPathExpression, inputSource));

Kartik said...

This was quite useful.. Thanks!

Naat said...

I’m impressed, I must say. Rarely do I encounter a blog that’s both educative and interesting, and without a doubt, you've hit the nail on the head. The problem is an issue that too few folks are speaking intelligently about. I'm very happy that I came across this in my search for something concerning this.

Anonymous said...

really helpful. Thanks to your blog I won't bang my head against the wall with this problem.