How HashMap Works Internally

By Ritesh Sahu - October 17, 2024

How HashMap Works Internally

A HashMap is a part of Java’s collection framework and implements the Map interface. It stores key-value pairs and allows for efficient data retrieval based on keys. The underlying structure of a HashMap is fascinating and involves several key concepts. Here’s an overview of how it works internally.

1. Hashing Mechanism

The core functionality of a HashMap revolves around hashing. When you insert a key-value pair into a HashMap, the following steps are executed:

Hash Function: The HashMap uses a hash function to compute an integer hash code from the key. This is done by invoking the hashCode() method of the key object.
Index Calculation: The hash code is then transformed into an index within the internal array. This transformation is done using the modulo operation:
$\text{index} = \text{hashCode} \% \text{capacity}$
where capacity is the current size of the internal array.

2. Internal Structure

A HashMap is backed by an array of buckets. Each bucket can hold multiple entries in case of hash collisions (when different keys hash to the same index). The two primary data structures that HashMap uses to manage these entries are:

Array: This is the main structure holding the references to the buckets.
Linked Lists: Each bucket can be a linked list (or a tree structure in Java 8 and above) of Node objects containing key-value pairs. The Node class typically looks like this:
```
java
static class Node<K,V> {
    final int hash;
    final K key;
    V value;
    Node<K,V> next;
    ...
}
```

3. Handling Collisions

When two keys produce the same hash index, a collision occurs. There are two main strategies for handling collisions:

Chaining: This is the traditional approach where each bucket contains a linked list of entries that hash to the same index. When a collision occurs, the new entry is simply added to the linked list in that bucket.
Tree Structure: Starting from Java 8, if a bucket contains more than a certain number of entries (default is 8), it converts the linked list into a balanced tree (specifically a red-black tree) to improve the efficiency of retrieval operations. This allows for faster access time in cases of high collision rates.

4. Load Factor and Rehashing

The HashMap uses a load factor to determine when to increase its capacity. The load factor is the ratio of the number of elements to the size of the array. The default load factor is 0.75, meaning that when 75% of the HashMap is filled, it will resize.

When resizing occurs, the following happens:

A new, larger array is created (typically double the size).
Each existing entry is rehashed to determine its new position in the new array.
This process is time-consuming (O(n)), which is why it’s important to choose an appropriate initial capacity.

5. Performance Characteristics

Time Complexity:
- Average case for get() and put() operations: O(1)
- Worst case (high collision, linked list): O(n)
- Worst case (high collision, tree structure): O(log n)
Space Complexity: O(n), where n is the number of entries in the HashMap.

6. Java HashMap Example

Here’s a simple example demonstrating how to create and use a HashMap in Java:

java
import java.util.HashMap;

public class HashMapExample {
    public static void main(String[] args) {
        // Create a HashMap
        HashMap<String, Integer> map = new HashMap<>();

        // Add key-value pairs
        map.put("Alice", 30);
        map.put("Bob", 25);
        map.put("Charlie", 35);

        // Retrieve values
        System.out.println("Alice's age: " + map.get("Alice")); // Outputs: Alice's age: 30

        // Check if a key exists
        if (map.containsKey("Bob")) {
            System.out.println("Bob's age: " + map.get("Bob")); // Outputs: Bob's age: 25
        }

        // Remove a key-value pair
        map.remove("Charlie");
        System.out.println("Charlie removed. Current size: " + map.size()); // Outputs: Charlie removed. Current size: 2
    }
}

Conclusion

Understanding the internal workings of a HashMap is crucial for efficient programming in Java. The combination of hashing, dynamic resizing, and collision handling mechanisms makes HashMap a powerful data structure for storing and retrieving key-value pairs efficiently. By grasping these concepts, developers can better utilize HashMap and make informed decisions about its use in their applications.

Search This Blog

Theprogrammersfirst | A technical portal.