Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ArrayIndexOutOfBoundsException in com.fasterxml.jackson.core.sym.ByteQuadsCanonicalizer #207

Closed
fschopp opened this issue Jul 30, 2015 · 12 comments
Milestone

Comments

@fschopp
Copy link

fschopp commented Jul 30, 2015

The following code demonstrates a bug in jackson-core, version 2.6.0, in the hash table implementation of com.fasterxml.jackson.core.sym.ByteQuadsCanonicalizer. From a quick glance, it looks to me as if the "primary hash information area" _hashArea has a spillover area that is not accounted for properly in the String array _names.

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.sym.ByteQuadsCanonicalizer;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.testng.annotations.Test;

import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.nio.charset.StandardCharsets;
import java.util.Map;

/**
 * Simple test case for demonstrating bug in class {@link ByteQuadsCanonicalizer}.
 *
 * <p>In some cases, it is possible to work around this bug by disabling the
 * {@link JsonFactory.Feature#CANONICALIZE_FIELD_NAMES} feature. In that case
 * {@link com.fasterxml.jackson.core.json.ByteSourceJsonBootstrapper#constructParser(int, com.fasterxml.jackson.core.ObjectCodec, ByteQuadsCanonicalizer, com.fasterxml.jackson.core.sym.CharsToNameCanonicalizer, int)}
 * creates a {@link com.fasterxml.jackson.core.json.ReaderBasedJsonParser} instead of a
 * {@link com.fasterxml.jackson.core.json.UTF8StreamJsonParser}.
 */
public class UTF8ByteStreamTest {
    private static final int SEED = -523743345;

    private static void injectReproducibleSeed(ObjectMapper objectMapper) throws Exception {
        JsonFactory jsonFactory = objectMapper.getFactory();
        // As a workaround, uncomment the following line.
        // jsonFactory.disable(JsonFactory.Feature.CANONICALIZE_FIELD_NAMES);
        Field byteSymbolCanonicalizerField = JsonFactory.class.getDeclaredField("_byteSymbolCanonicalizer");
        byteSymbolCanonicalizerField.setAccessible(true);

        Method factoryMethod = ByteQuadsCanonicalizer.class.getDeclaredMethod("createRoot", int.class);
        factoryMethod.setAccessible(true);
        byteSymbolCanonicalizerField.set(jsonFactory, factoryMethod.invoke(null, SEED));
    }

    @Test
    public void testRead() throws Exception {
        ObjectMapper objectMapper = new ObjectMapper();
        injectReproducibleSeed(objectMapper);
        StringBuilder stringBuilder = new StringBuilder();
        stringBuilder.append("{\n");
        stringBuilder.append("    \"expectedGCperPosition\": null");
        for (int i = 0; i < 60; ++i) {
            stringBuilder.append(",\n    \"").append(i + 1).append("\": null");
        }
        stringBuilder.append("\n}");
        objectMapper.readValue(stringBuilder.toString().getBytes(StandardCharsets.UTF_8), Map.class);
    }
}
@cowtowncoder
Copy link
Member

Thank you for reporting this. Surprising that the problem can be reproduced this easily.

cowtowncoder added a commit that referenced this issue Jul 31, 2015
@cowtowncoder
Copy link
Member

Looks like this is a degenerate case with very unusual number of collisions; but that also seems strange given that it is a straight-sequence of number strings.

@cowtowncoder
Copy link
Member

Ok. The real bug was in check to see where is the end of spillover area: was assuming hash array end is it, but this is not correct -- optional area exists for long names.

But I also changed short-name hash shuffling slightly to produce better distribution; test case had 50% spill rate which is way too high. With changes much more modest spillover rate, and also slightly improves hashing wrt existing tests.

@cowtowncoder cowtowncoder added this to the 2.2.1 milestone Aug 1, 2015
@tlrx
Copy link
Contributor

tlrx commented Sep 4, 2015

I also hit this exception with 2.6.1:

Caused by: java.lang.ArrayIndexOutOfBoundsException: 512
    at com.fasterxml.jackson.core.sym.ByteQuadsCanonicalizer.addName(ByteQuadsCanonicalizer.java:822)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.addName(UTF8StreamJsonParser.java:2340)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.findName(UTF8StreamJsonParser.java:2185)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._parseName(UTF8StreamJsonParser.java:1691)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:740)
    at org.elasticsearch.common.xcontent.json.JsonXContentParser.nextToken(JsonXContentParser.java:53)

I have a test that basically creates strings like this:

{ "1" : "test" }
{ "2" : "test" }
{ "3" : "test" }
...
{ "200" : "test" }

and the test just parses each string using jsonFactory.createParser(is). The test always fail, but if I set jsonFactory.disable(JsonFactory.Feature.CANONICALIZE_FIELD_NAMES); the test succeed.

@pwebbitrs
Copy link

We have the same issue. For us, this is quite a critical problem.

I’ve done a bit of analysis on this ( it may not be complete but here it is anyway).

The ByteQuadCanonicalizer has a flag called _needsRehash which determines when they need to size up the internal _names array.

  • Prior to adding the offending string via addName, _needsRehash = false. Size of the names is 128 (full) (clearly it does require a rehash)
  • In addName() at this point the 2.6.1 ‘fix’ kicks in and sets needRehash = true
  • However at the end of this method we have:
this._names[offset >> 2] = name;
++this._count;
this._verifyNeedForRehash();
return name;

_verifyNeedForRehash only serves to flip the rehash flag. And even if it did, it is too late because the array accesor here blows up (first line of code above). This all occurs before an actual rehash can be performed (hence the blow up), so nothing prevents the exception from occuring (certainly not on this code path anyway).

I hope this helps. :)

@tlrx
Copy link
Contributor

tlrx commented Sep 4, 2015

@pwebbitrs I came to a similar conclusion: I also tested with 2.7.0-SNAPSHOT with forcing rehash() call in the ̀_verifySharing()` and it also works.

@cowtowncoder
Copy link
Member

(apologies for slow response -- I was on vacation for past 2 weeks, just came back now)

@tlrx Thank you for reporting this. Would it be possible to share your test code? Seems like my fix was incomplete and I would like to verify the (new) fix.

@cowtowncoder
Copy link
Member

@tlrx I don't doubt the problem exists, but unfortunately I can not reproduce this with test described above. There is probably just some minor variation on usage. Will try to modify the test on my end, but help would be appreciated.

@tlrx
Copy link
Contributor

tlrx commented Sep 9, 2015

@cowtowncoder thanks for your response!

I apologize, I wasn't able to provide a test to reproduce the issue... The failing test TestMergeMapperTests.testConcurrentMergeTest() is a bit complex: it uses concurrency, compression and custom wrappers around core Jackson parsers, making it difficult to reproduce it in a simple test.

I'll try again to remove custom code and reproduce it using Jackson classes only. In the meanwhile, I'll be happy to test a new fix if that can help.

@tlrx
Copy link
Contributor

tlrx commented Sep 9, 2015

@cowtowncoder OK, with a little more love I managed to reproduce the issue. I created a dedicated Java project with jackson-core-2.6.1.jar as lib and the class:

package com.company;

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonGenerator;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;

import java.io.ByteArrayInputStream;
import java.io.IOException;

public class Main {

    public static void main(String[] args) throws IOException {
        JsonFactory jsonFactory = new JsonFactory();
        jsonFactory.configure(JsonParser.Feature.ALLOW_UNQUOTED_FIELD_NAMES, true);
        jsonFactory.configure(JsonGenerator.Feature.QUOTE_FIELD_NAMES, true);
        jsonFactory.configure(JsonParser.Feature.ALLOW_COMMENTS, true);
        jsonFactory.configure(JsonFactory.Feature.FAIL_ON_SYMBOL_HASH_OVERFLOW, false); // this trips on many mappings now...
        // Field names canonicalization must be disabled with jackson 2.6.1
        // see https://github.com/FasterXML/jackson-core/issues/207
        //jsonFactory.configure(JsonFactory.Feature.CANONICALIZE_FIELD_NAMES, false);



        System.out.println("--> using Jackson version " + jsonFactory.version());

        for (int i = 0; i < 200; i++) {
            final String content = "{ \"" + Integer.toString(i) + "\" : \"test\" }";

            ByteArrayInputStream is = new ByteArrayInputStream(content.getBytes("UTF-8"));
            try {
                JsonParser parser = jsonFactory.createParser(is);
                try {
                    JsonToken token = parser.nextToken();
                    if (token != JsonToken.START_OBJECT) {
                        throw new IllegalStateException("Malformed content");
                    }

                    token = parser.nextToken();
                    if (token != JsonToken.FIELD_NAME) {
                        throw new IllegalStateException("Malformed content");
                    }
                } finally {
                    parser.close();
                }
            } finally {
                is.close();
            }
        }
        System.out.println("Test OK");
    }
}

It fails with Oracle jdk1.8.0_20 with the stacktrace:

--> using Jackson version 2.6.1
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 512
    at com.fasterxml.jackson.core.sym.ByteQuadsCanonicalizer.addName(ByteQuadsCanonicalizer.java:822)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.addName(UTF8StreamJsonParser.java:2340)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.findName(UTF8StreamJsonParser.java:2185)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._parseName(UTF8StreamJsonParser.java:1686)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:740)
    at com.company.Main.main(Main.java:39)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)

When the same code is included as a unit test in jackson-core project (master & 2.6 branch) it works...

@cowtowncoder
Copy link
Member

Thanks!

I suspect #216 is a dup; if so, will mark the fix under it.

@cowtowncoder
Copy link
Member

@tlrx I am able to reproduce this with given test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants