Sunday 16 November 2014

Java: equals and hashCode performance

I wanted to benchmark KλudJe's equals/hashCode implementation against common alternatives. I used JMH 1.2 to measure performance.

POJO

A POJO with four properties was used as a test source:

public abstract class Pojo {
    protected final long id;
    protected final String data;
    protected final java.time.Instant time;
    protected final int count;

    // constructors etc.
}

Subclasses were then created with overridden equals and hashCode methods using a variety of approaches.

Equality and Hashcode APIs

API                  Equals                          Hashcode
None                 in-line code generated by IDEA  in-line code generated by IDEA
JDK 8 java.util      Objects.equals                  Objects.hash
Apache Commons Lang  EqualsBuilder                   HashCodeBuilder
KλudJe               Meta.equals                     Meta.hashCode

Guava's Objects type was also considered but there were no measurable differences between it and the java.util.Objects equivalents.

Code style preferences and tolerance of boilerplate vary from developer to developer and with how critical performance is. Sample implementation code for each of the measured APIs follows.

IDE-Generated Code
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;

        ControlPojo pojo = (ControlPojo) o;

        if (count != pojo.count) return false;
        if (id != pojo.id) return false;
        if (data != null ? !data.equals(pojo.data) : pojo.data != null) return false;
        if (time != null ? !time.equals(pojo.time) : pojo.time != null) return false;

        return true;
    }

    @Override
    public int hashCode() {
        int result = (int) (id ^ (id >>> 32));
        result = 31 * result + (data != null ? data.hashCode() : 0);
        result = 31 * result + (time != null ? time.hashCode() : 0);
        result = 31 * result + count;
        return result;
    }

JDK
    @Override
    public boolean equals(Object o) {
        return (o == this)
                || (o instanceof UtilPojo && equals((UtilPojo) o));
    }

    private boolean equals(UtilPojo pojo) {
        return (id == pojo.id)
                && count == pojo.count
                && Objects.equals(data, pojo.data)
                && Objects.equals(time, pojo.time);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, data, time, count);
    }

Apache Commons Lang
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ApacheBuilderPojo)) return false;

        ApacheBuilderPojo pojo = (ApacheBuilderPojo) o;

        return new EqualsBuilder()
                .append(id, pojo.id)
                .append(data, pojo.data)
                .append(time, pojo.time)
                .append(count, pojo.count)
                .isEquals();
    }

    @Override
    public int hashCode() {
        return new HashCodeBuilder()
                .append(id)
                .append(data)
                .append(time)
                .append(count)
                .toHashCode();
    }

KλudJe
    private static final Meta<KludjePojo> META = Meta.<KludjePojo>meta().longs($ -> $.id)
            .ints($ -> $.count)
            .objects($ -> $.data, $ -> $.time);

    @Override
    public boolean equals(Object o) {
        return META.equals(this, o);
    }

    @Override
    public int hashCode() {
        return META.hashCode(this);
    }

Benchmarking

The benchmark tested the equals/hashCode contract of the POJO implementations.
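For readers unfamiliar with the contract being exercised, here is a minimal sketch of what it requires of any two instances. It is not the benchmark code (that is in the repository linked at the end); the ContractSketch class, its two-field Point type and the honoursContract helper are hypothetical names made up for this illustration.

```java
import java.util.Objects;

public class ContractSketch {
    // Hypothetical two-field value type standing in for the POJO subclasses.
    public static final class Point {
        final long id;
        final String data;

        public Point(long id, String data) {
            this.id = id;
            this.data = data;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return id == p.id && Objects.equals(data, p.data);
        }

        @Override
        public int hashCode() {
            return 31 * Long.hashCode(id) + Objects.hashCode(data);
        }
    }

    // Returns true when the pair satisfies the parts of the Object contract
    // that can be checked on two instances: reflexivity, symmetry,
    // inequality with null, and equal objects having equal hash codes.
    public static boolean honoursContract(Object a, Object b) {
        boolean reflexive = a.equals(a) && b.equals(b);
        boolean symmetric = a.equals(b) == b.equals(a);
        boolean notEqualToNull = !a.equals(null) && !b.equals(null);
        boolean hashConsistent = !a.equals(b) || a.hashCode() == b.hashCode();
        return reflexive && symmetric && notEqualToNull && hashConsistent;
    }
}
```

Calling honoursContract on two equal instances and on two unequal ones covers both branches of equals, which is roughly the work a throughput benchmark of these methods has to perform.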

I was primarily interested in measuring throughput and potential garbage collection overheads, so I ran JMH with its GC profiler enabled:

java -jar target/benchmarks.jar -prof gc

Output (abridged; the detailed result shown first is for the util benchmark):

Result: 125862.208 ±(99.9%) 1826.486 ops/s [Average]
  Statistics: (min, avg, max) = (90944.150, 125862.208, 135807.393), stdev = 7733.456
  Confidence interval (99.9%): [124035.722, 127688.694]

Result           "@gc.count.profiled": 295.000 counts [Sum]
Result              "@gc.count.total": 74.000 counts [Maximum]
Result            "@gc.time.profiled": 3632.000 ms [Sum]
Result               "@gc.time.total": 952.000 ms [Maximum]

# Run complete. Total time: 00:32:47

Benchmark                                        Mode  Samples       Score      Error   Units
d.EqualsBenchmark.control                       thrpt      200  171684.016 ± 2094.186   ops/s
d.EqualsBenchmark.control:@gc.count.profiled    thrpt      200       0.000 ±      NaN  counts
d.EqualsBenchmark.control:@gc.count.total       thrpt      200       0.000 ±      NaN  counts
d.EqualsBenchmark.control:@gc.time.profiled     thrpt      200       0.000 ±      NaN      ms
d.EqualsBenchmark.control:@gc.time.total        thrpt      200       0.000 ±      NaN      ms
d.EqualsBenchmark.util                          thrpt      200  125862.208 ± 1826.486   ops/s
d.EqualsBenchmark.util:@gc.count.profiled       thrpt      200     295.000 ±      NaN  counts
d.EqualsBenchmark.util:@gc.count.total          thrpt      200      74.000 ±      NaN  counts
d.EqualsBenchmark.util:@gc.time.profiled        thrpt      200    3632.000 ±      NaN      ms
d.EqualsBenchmark.util:@gc.time.total           thrpt      200     952.000 ±      NaN      ms
d.EqualsBenchmark.apache                        thrpt      200   46369.334 ±  449.429   ops/s
d.EqualsBenchmark.apache:@gc.count.profiled     thrpt      200     704.000 ±      NaN  counts
d.EqualsBenchmark.apache:@gc.count.total        thrpt      200     154.000 ±      NaN  counts
d.EqualsBenchmark.apache:@gc.time.profiled      thrpt      200    6939.000 ±      NaN      ms
d.EqualsBenchmark.apache:@gc.time.total         thrpt      200    1865.000 ±      NaN      ms
d.EqualsBenchmark.kludje                        thrpt      200   65240.597 ±  756.239   ops/s
d.EqualsBenchmark.kludje:@gc.count.profiled     thrpt      200       0.000 ±      NaN  counts
d.EqualsBenchmark.kludje:@gc.count.total        thrpt      200       0.000 ±      NaN  counts
d.EqualsBenchmark.kludje:@gc.time.profiled      thrpt      200       0.000 ±      NaN      ms
d.EqualsBenchmark.kludje:@gc.time.total         thrpt      200       0.000 ±      NaN      ms
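
To make the scores easier to compare, the throughput of each API can be expressed as a percentage of the control. The sketch below only does arithmetic on the four ops/s figures from the summary above; the RelativeThroughput class and its member names are made up for the example.

```java
public class RelativeThroughput {
    // Scores in ops/s, copied from the JMH summary above.
    public static final double CONTROL = 171684.016;
    public static final double UTIL = 125862.208;
    public static final double APACHE = 46369.334;
    public static final double KLUDJE = 65240.597;

    // Throughput as a percentage of the control, rounded to one decimal place.
    public static double percentOfControl(double score) {
        return Math.round(score / CONTROL * 1000.0) / 10.0;
    }

    public static void main(String[] args) {
        System.out.printf("util   %.1f%%%n", percentOfControl(UTIL));
        System.out.printf("kludje %.1f%%%n", percentOfControl(KLUDJE));
        System.out.printf("apache %.1f%%%n", percentOfControl(APACHE));
    }
}
```

On these figures java.util.Objects reaches about 73% of the control's throughput, KλudJe about 38% and Apache Commons Lang about 27%.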

Conclusion

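To get the best performance, let your IDE generate the code: in this run it had the highest throughput and triggered no garbage collection. If you're willing to suffer some performance degradation to get cleaner code, consider both the throughput and the garbage collection overheads of your chosen API. java.util.Objects was the fastest of the library options but did cause collections, KλudJe was slower but caused none, and Apache Commons Lang was the slowest and caused the most.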
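
On the garbage collection side, one plausible source of the allocations in the util benchmark is Objects.hash itself. Its signature is hash(Object... values), so each call boxes the long and int fields and builds an Object[] unless the JIT manages to eliminate the allocation. This was not measured separately here, so treat it as a hypothesis. The sketch below shows what the varargs call amounts to and an equivalent that avoids the array; HashSketch and its constant field values are invented for the illustration.

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.Objects;

public class HashSketch {
    // Hypothetical stand-ins for the POJO's four fields.
    static final long ID = 42L;
    static final String DATA = "data";
    static final Instant TIME = Instant.ofEpochSecond(1_416_096_000L);
    static final int COUNT = 7;

    // Varargs form: ID and COUNT are boxed and an Object[] is created per call.
    public static int varargsHash() {
        return Objects.hash(ID, DATA, TIME, COUNT);
    }

    // The same call with the boxing and the array written out explicitly.
    public static int explicitArrayHash() {
        return Arrays.hashCode(new Object[] {ID, DATA, TIME, COUNT});
    }

    // The same arithmetic with no array and no boxing.
    public static int manualHash() {
        int result = 1;
        result = 31 * result + Long.hashCode(ID);
        result = 31 * result + Objects.hashCode(DATA);
        result = 31 * result + Objects.hashCode(TIME);
        result = 31 * result + Integer.hashCode(COUNT);
        return result;
    }
}
```

All three methods return the same value. Note that this value differs from the IDE-generated hashCode shown earlier, which seeds the result with the hash of id rather than with 1.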

This isn't a particularly comprehensive benchmark so take the results with a grain of salt.

Code

The benchmark code is on GitHub:

git clone https://github.com/mcdiae/allequals.git
