Internals of HashMap
Default size of hashmap is 16
How is HashMap implemented?
PUT
When we do a put operation, the key is converted to hashcode. This helps in picking the index in which the key-value pair should be inserted in hashmap.
The hashcode is modded with size maximum capacity of the hashmap to get the bucket index in which it should be inserted.
When we try to insert another key-value pair, the hashcode is generated for the key. If the hashcode generated is pointing to same index in the HashMap, the new key-value pair will be added as a next node to the existing value in the index. It is like a LinkedList.
If same key is given with new value, then the it will generate the hashcode and gind the bucket index, then check if that key is already present in the bucket. If the key is already present, then it is replaced.
GET
When the key is given, it calculates the hashcode and uses that hashcode to find the index in hashmap bucket. Once it finds the bucket, it will check along the list and gives the value
CONTAINS_KEY
Why Hashmap maximum size is 1<<30
Thats cause the maximum size is expressed in integer. The range of integer is [β231to231β1]. If the maximum size allowed is 231, then it could cause integer overflow. So it is restricted to 230
How does HashMap allocate size when it is given by the user?
For example if the user has initialized the hashmap as:
Then it won't exactly allocate size of 1000. It will check what is the next highest power of 2 which is just greater than 1000. The next highest power of 2 which is greater than 1000 is 210, i.e. 1024. So when you ask for 1000 buckets space, it actually allocates you 1024 buckets space.
How is collision handled in HashMap?
This can be answered based on contract between hashcode and equals.
Equals Means Same Hash Code: If two objects are considered equal using the
equals()method (i.e.,obj1.equals(obj2) == true), then their hash codes returned byhashCode()must also be equal (i.e.,obj1.hashCode() == obj2.hashCode()).Unequal Objects Can Collide: Objects that are not equal according to
equals()can still have the same hash code. This is called a hash collision and while not ideal, it can happen due to hash function limitations.Keep Hash Code Consistent with Equality: When an object's state changes in a way that affects how it compares for equality with other objects, the object's hash code should ideally be updated to reflect this change. This ensures consistency when using the object in hash-based collections. (Java itself doesn't enforce this, so it's the programmer's responsibility to maintain consistency).
When collision occurs, concept of chaining is used. In the bin, new key-value map is chained i.e. appended as next node to the key-value pair already present.
How hashing happens in HashMap?
The above code represents how a hash is calculated for the key.
If key is null, the hash value is always zero
If the
keyis not null, it callskey.hashCode()to retrieve the object's default hash code. This value is stored in thehvariable.The code then performs a bitwise XOR operation between
handh >>> 16. This operation does the following:h >>> 16: Shifts the bits ofh16 positions to the right, effectively removing the lower 16 bits and keeping the upper 16 bits. - Gets the 16 MOST RIGHT SIGNIFICANT BITS^: XOR operation combines corresponding bits ofhand the shiftedh, resulting in a value where bits from both the upper and lower parts of the original hash code are mixed.This technique, known as hash spreading, helps to distribute hash codes more evenly, reducing the likelihood of collisions, especially in smaller hash tables.
What is load factor?
Load factor is a threshold value which determines when the hashmap size needs to be increased.
The default load factor value is 0.75
When the load factor threshold is reached, it constructs a new map with same mappings with increased capacity generally twice the size of original hashmap.
What is TREEIFY_THRESHOLD?
The bin count threshold for using a tree rather than list for a bin. Bins are converted to trees when adding an element to a bin with at least TREEIFY_THRESHOLD nodes. The value must be greater than 2 and should be at least 8 to mesh with assumptions in tree removal about conversion back to plain bins upon shrinkage.
By default, if the bin has 8 elements and a new node should be added, it is converted into a tree bin.
It creates a balanced binary tree, so that the lookup is at almost constant time.
References:
Last updated