How Does HashMap Work in Java

Okay Atalay
6 min readOct 17, 2020

Hi everyone. Today, I’ m going to talk about what hashmap is and how it works. I will use Java lang to examine.

All Maps store data in key →value pair. Key must be unique, but value does not have to be. If key exists and we are putting different value using the same key then existing value will be removed and new value will be stored.

Putting data into map

map.put(key, value); 
if key already exists, stored value will be replaced with new value. Then old value will be returned. Otherwise, new value will be added and null value will be returned. Because there was no old value.

Remove data

map.remove(key);
given key will be removed and value will be returned. If key does not exist, method returns null value.

Map is an interface and there are various implementation of it. Let’ s see them

Map Hierarchy

We will investigate HashMap implementation in this article. When we create HashMap, Bucket array will be created automatically. Its size is 16. We call them as buckets. It is almost the same thing with known array. Let’ s see below image to dream its appearance.

Buckets

Each Bucket has individual LinkedList to store elements. Let’ s see it.

LinkedList in Bucket

Let’ s put some elements into the hashmap to see how it is added into linked list. Lets see and run following code.

/*1*/ Map<Integer,Integer> map = new HashMap<>();
/*2*/ map.put(1,1);
/*3*/ map.put(2,2);
/*4*/ map.put(17,17);
/*5*/ map.put(18,18);

Line-1: Empty HashMap will be created.

Line-2: put method implementation will call hashCode() method of given key. Key is Integer in here and Integer hashCode returns the value. Value is “1” here. Bucket size is 16. HashMap has to find correct bucket to add given key → value pair. strategy is simple. bucketIndex = key.hashCode() % bucketSize. In this line, bucketIndex = 1 % 16 = 1. So buckets[1] will be used to add given key →value.

put(1,1)

Line-3: Bucket index is 2(2%16). buckets[2] is null and it will be instantiated a linked list and given data will be added.

put(2,2)

Line-4: Bucket index is 17%16 = 1. So, given key →value pair will be added into buckets[1]. It already has a linked list and there is only 1 element in the linked list. To make key unique, put method will iterate over the linked list to check whether key is stored or not.

put(17,17)

line-5: Bucket index is 18%16 = 2. So, given key →value pair will be added into buckets[2]. It already has a linked list and there is only 1 element in the linked list. To make key unique, put method will iterate over the linked list to check whether key is stored or not.

put(18,18)

I think map structure is almost clear now. However, you think that what happened when linked list size is getting increased. If we add 1600 key →value pairs into map, each bucket will have 100 elements(i=0 to 1599). Map will behave like a list. This is not acceptable. So map works differently and manages this situation. Are you wondering how? let’ s have a look. There is load-factor and its default value is 0.75f. Means that map bucket size will be doubled when stored element count reachs to load-factor*bucket-size. Threshold is 0.75*16 = 12. Means that when element count is 12, Bucket size will be 32 and threshold will be 24. As you think, All elements must be reindexed. Bucket size variable was used in formula to find bucketId and bucket size variable has been changed. Let’ s imagine we added 8 additional elements to above example and see the last output.

Element count is over 12

How to use load-factor and initial size

HashMap has 4 different constructor. You need to choose one of them according to your parameters. When you use default constructor, initialCapacity is set to 12 and loadFactor is set to 0.75f automatically.

new HashMap<>() // initialCapacity=12, loadFactor=0.75f
new HashMap<>(int initialCapacity) // loadFactor=0.75f
new HashMap<>(int initialCapacity, float loadFactor)
new HashMap<>(map) // do not need to focus this one

How to use any Class as a key

You may want to use your class as a key. It is not possible to use your class as a key without overriding hashCode method and equals method. hashCode method must return unique value for the object to make it unique. Let’ s do an example

public class User {
private String firstname;
private String lastname;

// setter-getter

@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ((firstname == null) ? 0
: firstname.hashCode());
result = prime * result + ((lastname == null) ? 0
: lastname.hashCode());
return result;
}

@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
User other = (User) obj;
if (firstname == null) {
if (other.firstname != null)
return false;
}
else if (!firstname.equals(other.firstname))
return false;
if (lastname == null) {
if (other.lastname != null)
return false;
}
else if (!lastname.equals(other.lastname))
return false;
return true;
}
}

All IDEs have capability to generate hashCode and equals methods. I used Eclipse IDE to generate them. If you believe that just firstname is unique then return firstname.hashCode() in hashCode method to increase performance.

Summary

HashMap retrieves data almost constant time. we can say T(n) = O(1) in homogeneous distribution. At the worst case T(n)=O(n). However, Java detects this situation and converts your hashmap to treemap to increase throughput. TreeMap uses red-black tree implementation and T(n) = O(lgn). 2 base n. This capability was introduced by Java-8.

Initial size is important and it should be selected according to your projection. if your map size is getting increased fastly, you need to set initial size value to cover all your elements. Because, reindexing cost is so high.

load-factor should be selected according to the your application data set. If keys are distributed homogeneously, it may be 1.0f. If your keys are randomly generated and you know that it may hit most likely the same bucket then it would be good to set 0.5f. if you do not know keys, leave it as default(0.75f).

--

--