JPA entities contain an id field (or property) annotated with @Id. For this post I will assume a single field, but compound ids are possible as well.
Entity objects are business objects, so it is generally accepted that we have to supply equals and hashCode for entity classes. Note that our business code is very unlikely to call these methods directly, because we don't normally need explicit comparisons of entity objects there; instead, we use collections, which in turn may call equals and hashCode when looking up or adding entries.
We have several expectations regarding the semantics of equals and collection operations, especially when looking at entity objects:
1. Objects based on the same database entry should be the same regarding equals, and objects from different db rows should be different,
   a) including objects from different entity managers,
   b) including detached objects,
   c) including new (transient) objects,
   d) even if some non-id attribute has been changed (they will end up in 'their' db records!).
2. Hashcode-based collections (e.g. HashSet) should behave friendly, i.e.
   a) adding multiple objects should be possible, even for new (transient) objects,
   b) the collection should remain intact even if contained objects get persisted into the db.
If the entity has a business id, i.e. the id attribute has some business meaning and is set by the constructor, the expectations can easily be met by using exactly the id attribute in equals and hashCode.
If you choose to compare all fields instead of just the id, your implementation breaks expectation 1.d). If you choose not to implement equals and hashCode at all and rather use the methods inherited from Object, your implementation breaks expectations 1.a) and 1.b).
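To make the second case concrete, here is a minimal sketch in plain Java. The class Foo and the id value "4711" are made up for illustration, and the JPA annotations are left out so that the snippet runs without a persistence provider. Two instances standing for the same database row are not equal when equals and hashCode are inherited from Object:

```java
import java.util.HashSet;
import java.util.Set;

class ObjectIdentityDemo {

    // Hypothetical entity stand-in that does NOT override equals/hashCode.
    static class Foo {
        final String id; // would carry @Id in a real entity

        Foo(String id) {
            this.id = id;
        }
    }

    // Two instances for the same row, e.g. loaded by two different
    // entity managers, or one of them detached.
    static boolean sameRowIsEqual() {
        Foo first = new Foo("4711");
        Foo second = new Foo("4711");
        return first.equals(second);
    }

    // A HashSet treats them as two distinct entries.
    static int setSizeForSameRow() {
        Set<Foo> set = new HashSet<>();
        set.add(new Foo("4711"));
        set.add(new Foo("4711"));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sameRowIsEqual());    // false
        System.out.println(setSizeForSameRow()); // 2
    }
}
```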
So, as an intermediate conclusion, implement equals and hashCode in your entity classes and base them on just the id attribute(s):
    @Entity
    public class Foo {

        @Id
        private String id;
        ...

        public Foo(String id, ...) {
            this.id = id;
            ...
        }

        @Override
        public int hashCode() {
            return (this.id == null) ? 0 : this.id.hashCode();
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            if (obj == null) {
                return false;
            }
            if (getClass() != obj.getClass()) {
                return false;
            }
            Foo other = (Foo) obj;
            if (this.id == null) {
                return other.id == null;
            }
            return this.id.equals(other.id);
        }
    }
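The following sketch shows how such an id-based implementation behaves with respect to expectation 1.d) and inside a HashSet. It is plain Java without JPA annotations so that it runs standalone; the description attribute and the sample values are assumptions made for the example, and java.util.Objects is used as a shorthand for the null checks shown above.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class BusinessIdDemo {

    // Hypothetical entity stand-in with a business id set by the constructor.
    static class Foo {
        private final String id;    // would carry @Id in a real entity
        private String description; // non-id attribute, ignored by equals

        Foo(String id, String description) {
            this.id = id;
            this.description = description;
        }

        void setDescription(String description) {
            this.description = description;
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(id); // 0 for a null id
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            if (obj == null || getClass() != obj.getClass()) {
                return false;
            }
            return Objects.equals(id, ((Foo) obj).id);
        }
    }

    // Same id, different non-id attribute: still equal (expectation 1.d).
    static boolean equalDespiteChangedAttribute() {
        Foo a = new Foo("H", "Hydrogen");
        Foo b = new Foo("H", "Hydrogen");
        b.setDescription("changed");
        return a.equals(b) && a.hashCode() == b.hashCode();
    }

    // Two objects for id "H" and one for "He" end up as two set entries.
    static int setSize() {
        Set<Foo> set = new HashSet<>();
        set.add(new Foo("H", "Hydrogen"));
        set.add(new Foo("H", "changed"));
        set.add(new Foo("He", "Helium"));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(equalDespiteChangedAttribute()); // true
        System.out.println(setSize());                      // 2
    }
}
```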
Things get more complicated if you want to use generated ids. JPA supports this with the annotation @GeneratedValue, which has to be placed on the id attribute in addition to @Id. The generator works for integral number fields, and it's best to avoid primitive types in order to have the additional value null for unset ids. So you would use Integer, Long or BigInteger, depending on the number of data entries expected:
    @Entity
    public class Foo {

        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        private Integer id;
        ...
    }
The problems with generated ids regarding identity and equality arise from the fact that the entity manager sets the ids late: they may not be populated until the inserting transaction commits.
To begin with, you will use the id field for comparison in equals and in hashCode as you did for business ids before. But if you do that in exactly the way shown above, you break expectation 2.a), because equals will evaluate all new (transient) objects as equal, as their ids are all null.
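A small plain-Java sketch of this effect follows. JPA is left out, the id simply stays null as it would for a transient object, and the description attribute is an assumption for the example. Two new objects collapse into a single HashSet entry:

```java
import java.util.HashSet;
import java.util.Set;

class NullIdEqualityDemo {

    // Hypothetical entity stand-in; the id would be @Id @GeneratedValue
    // in a real entity and is still null for a transient object.
    static class Foo {
        private Integer id;
        private final String description;

        Foo(String description) {
            this.description = description;
        }

        @Override
        public int hashCode() {
            return (id == null) ? 0 : id.hashCode();
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            if (obj == null || getClass() != obj.getClass()) {
                return false;
            }
            Foo other = (Foo) obj;
            if (this.id == null) {
                return other.id == null; // all transient objects are "equal"
            }
            return this.id.equals(other.id);
        }
    }

    // The second add is rejected because both ids are null.
    static int setSizeAfterAddingTwoTransientObjects() {
        Set<Foo> set = new HashSet<>();
        set.add(new Foo("first"));
        set.add(new Foo("second"));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(setSizeAfterAddingTwoTransientObjects()); // 1
    }
}
```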
You can fix this by modifying equals such that it returns false if any of the compared ids are still unset:
    public boolean equals(Object obj) {
        ...
        if (this.id == null) {
            return false;
        }
        return this.id.equals(other.id);
    }
While this now meets expectation 2.a), it still breaks expectation 2.b): if you add some new entities to a HashSet and persist them afterwards, the collection will be corrupt due to the changed hash codes.
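The corruption can be reproduced without a database. In the sketch below, the method assignGeneratedId is a hypothetical stand-in for what the entity manager does when it populates the id; everything else follows the modified equals shown above.

```java
import java.util.HashSet;
import java.util.Set;

class GeneratedIdHashSetDemo {

    static class Foo {
        private Integer id; // @Id @GeneratedValue in a real entity

        // Hypothetical helper: simulates the entity manager setting the id.
        void assignGeneratedId(int generatedId) {
            this.id = generatedId;
        }

        @Override
        public int hashCode() {
            return (id == null) ? 0 : id.hashCode();
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            if (obj == null || getClass() != obj.getClass()) {
                return false;
            }
            Foo other = (Foo) obj;
            if (this.id == null) {
                return false; // unset ids never compare equal
            }
            return this.id.equals(other.id);
        }
    }

    // Expectation 2.a) holds: two transient objects are both accepted.
    static boolean bothTransientObjectsAdded() {
        Set<Foo> set = new HashSet<>();
        set.add(new Foo());
        set.add(new Foo());
        return set.size() == 2;
    }

    // Expectation 2.b) fails: the object was stored under hash code 0,
    // but is looked up under the hash code of its new id.
    static boolean stillFoundAfterIdAssignment() {
        Set<Foo> set = new HashSet<>();
        Foo foo = new Foo();
        set.add(foo);
        foo.assignGeneratedId(1);
        return set.contains(foo);
    }

    public static void main(String[] args) {
        System.out.println(bothTransientObjectsAdded());   // true
        System.out.println(stillFoundAfterIdAssignment()); // false
    }
}
```

The set still holds the object and iterating over it would return it, but lookups such as contains or remove no longer find it.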
And the bad news: there is no way around this when using generated ids! In many applications this is still not an issue, because the program sequence "create multiple objects", "add them to a hash-based collection", "persist the objects", "use the collection" is not very common and can be avoided by persisting (and flushing) the objects before they are added to the collection.
Please keep in mind that the use of a hash-based collection may not be directly visible; it may be hidden as an association collection in some other entity class.
What are your options if you want to use generated ids and your business logic suffers from the hash-based collection problem discussed above? Well, the solution is to use an id that is populated early, i.e. in the constructor, and that can be generated easily and, most advisable for performance, without hitting the database. java.util.UUID offers the desired functionality. It supplies 128-bit integers commonly expressed as 36-character strings (32 hex digits and 4 dashes). UUID.randomUUID() uses random numbers as its base and offers a distribution which makes duplicates very unlikely. Other implementations exist that use MAC addresses and random numbers.
    @Entity
    public class Foo {

        @Id
        private String id;

        private String description;

        public Foo(String description) {
            this.id = UUID.randomUUID().toString();
            ...
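As a rough illustration of why this resolves the hash-based collection problem, the following plain-Java sketch (no JPA annotations, sample descriptions made up) shows that new objects are distinguishable from the moment they are constructed and that their hash code never has to change later:

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

class UuidIdDemo {

    // Hypothetical entity stand-in; id would carry @Id in a real entity.
    static class Foo {
        private final String id;
        private final String description;

        Foo(String description) {
            this.id = UUID.randomUUID().toString(); // set early, no db access
            this.description = description;
        }

        String getId() {
            return id;
        }

        @Override
        public int hashCode() {
            return id.hashCode();
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            if (obj == null || getClass() != obj.getClass()) {
                return false;
            }
            return id.equals(((Foo) obj).id);
        }
    }

    // Two new objects with identical attributes remain distinct entries.
    static int setSizeForTwoNewObjects() {
        Set<Foo> set = new HashSet<>();
        set.add(new Foo("same description"));
        set.add(new Foo("same description"));
        return set.size();
    }

    // The string form has 32 hex digits plus 4 dashes.
    static int idLength() {
        return new Foo("any").getId().length();
    }

    public static void main(String[] args) {
        System.out.println(setSizeForTwoNewObjects()); // 2
        System.out.println(idLength());                // 36
    }
}
```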
By using uuids you cherry-pick the advantages of early-set ids while still having generated values.
The size of uuids may be painful. In binary form they are four times the size of a 32-bit integer (and larger still when stored as 36-character strings), but actual storage requirements depend on your database product. Remember that they are ids, so every foreign key in the db has the same size. If that is a problem, you can resort to an early-set uuid that supports equals and hashCode but is not annotated with @Id. Instead you add another integer field as JPA-generated id, i.e. annotated with @Id @GeneratedValue. Now all tables contain a number as primary key as well as a non-primary uuid column, but all foreign keys are just numbers.
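A minimal sketch of this combination, again in plain Java so that it runs without JPA: the field names and the helper assignGeneratedId (which simulates the entity manager populating the primary key) are assumptions for the example. Because equals and hashCode only look at the uuid, a late-arriving numeric id does not disturb a HashSet.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

class SeparateUuidDemo {

    static class Foo {
        // @Id @GeneratedValue in a real entity; null until persisted.
        private Integer id;

        // Ordinary column, not the primary key; set at construction time.
        private String uuid = UUID.randomUUID().toString();

        // Hypothetical helper: simulates the entity manager setting the id.
        void assignGeneratedId(int generatedId) {
            this.id = generatedId;
        }

        Integer getId() {
            return id;
        }

        @Override
        public int hashCode() {
            return uuid.hashCode();
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            if (obj == null || getClass() != obj.getClass()) {
                return false;
            }
            return uuid.equals(((Foo) obj).uuid);
        }
    }

    // The hash code depends only on the uuid, so the lookup still works
    // after the numeric id has been assigned.
    static boolean stillFoundAfterIdAssignment() {
        Set<Foo> set = new HashSet<>();
        Foo foo = new Foo();
        set.add(foo);
        foo.assignGeneratedId(1);
        return set.contains(foo) && foo.getId() == 1;
    }

    public static void main(String[] args) {
        System.out.println(stillFoundAfterIdAssignment()); // true
    }
}
```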
So, the subject of JPA ids, which seems easy at first glance, is far from that in reality. You may use the following recipe when designing entity classes:
- If your entity contains some identifying business attribute, take this as id. Let equals and hashCode use exactly this attribute. All expectations expressed above will be met (-> ChemicalElement in showcase).
- If you don't find a suitable business id, and …
  - if the hash-based collection problem discussed above is no problem for your application, use an @Id @GeneratedValue annotated integer id. Let equals and hashCode use exactly this attribute, but modify equals such that unset ids render a false return value. All expectations expressed above will be met with the exception of 2.b) (-> LabExperiment3NonNullIdEquality in showcase).
  - you want / have to use early set ids, use uuids.
    - If db size does not really matter or your model contains just a few associations, use the uuids directly as JPA ids. Let equals and hashCode use exactly this attribute. All expectations expressed above will be met (-> LabExperiment4UUIdEquality in showcase).
    - If you care about the storage consumption of foreign keys in your database, use the uuid just for equals and hashCode and add a separate @Id @GeneratedValue id to your class. Let equals and hashCode use exactly the uuid attribute. All expectations expressed above will be met (-> LabExperiment5AddIdEquality in showcase).
There is a showcase on https://github.com/GEDOPLAN/jpa-equals-demo demonstrating the various options. You can find the referenced classes there.