Time to review some fundamentals in Java. This post talks about linked list in Java.
Let's start with the linked list. This wiki page has a nice description of linked list. At its core, a linked list is a collection of nodes and each node may point to a different node. It's called a list because we can start from the first node and walk through the whole collection. Java has a built-in implementation of linked list, which is LinkedList
.
In theory, add
and remove
operation should have O(1) time complexity because we just need to update the links. Compared to an array, when we add or remove an element in the middle, we don't need to move the remaining part of the list. However, there is a nuance here. In order to obtain the O(1) time complexity, we need to have the reference to the node in the middle of the list. This is not the case for LinkedList
. For example, when we want to remove an element in the middle of a LinkedList
, we need to scan the list and find the element first, this process has O(N) complexity.
Of course, we can implement a linked list ourselves and what we need is a hash table that tracks the reference of the nodes in the list. This is actually what I did when I was trying to have a linked list implementation that supports O(1) remove
. The funny thing is I did try to search online but couldn't find an immediate solution and I thought it would take less time if I implemented it myself.
Usually, when we write code at this low level, it means something is wrong. How is it possible that such data structure is unavailable in Java?
I was eventually told that the data structure I was looking for is LinkedHashSet
.
Here is the description from the official document:
Hash table and linked list implementation of the Set interface, with predictable iteration order. This implementation differs from HashSet in that it maintains a doubly-linked list running through all of its entries. This linked list defines the iteration ordering, which is the order in which elements were inserted into the set (insertion-order). Note that insertion order is not affected if an element is re-inserted into the set. (An element e is reinserted into a set s if s.add(e) is invoked when s.contains(e) would return true immediately prior to the invocation.)
Now things get more interesting. Where is the list flavor of this data structure? Is it a set or a list? Apparently, it's a set, because it implements the Set
interface. However, Set
implements Collection
and it turns out Collection
is mapped to the list concept in a data-structure-course-context.
If I need to name one thing that is confusing, it would be the fact that "collection" does not imply an order (on the contrary, in the normal context, list does imply an order) but in this particular case LinkedHashSet
does maintain an order.
To be fair, Java documentation is very clear about the ordering aspect of Collection
The root interface in the collection hierarchy. A collection represents a group of objects, known as its elements. Some collections allow duplicate elements and others do not. Some are ordered and others unordered.
and it's also very clear about list:
An ordered collection (also known as a sequence). The user of this interface has precise control over where in the list each element is inserted. The user can access elements by their integer index (position in the list), and search for elements in the list.
If we check the interface, one of the differences between Collection
and List
is that the add
method can have an index
parameter in the list interface. So LinkedHashSet
is not a list even if it's ordered because we cannot control where to insert a new element.
As I mentioned earlier, Collection
is closer to the concept of "list of things" in the daily context. It does not require its elements to have a specific order but at the same time, all collections have an implicit order when we iterate through the collection.
I don't remember how exactly I searched for a linked list implementation that supports O(1) remove operation. In retrospect, it seems obvious that
- I need "links" because, without links, it's very likely that we need to deal with array type of data structure and when we remove an element we need to move the rest of the array.
- I need a hash because I need to track the nodes
So, I should have searched for "link hash" and it turns out that I don't need a list!
Next time, when searching for data structure in Java, we should start with the following pages:
They should provide us with all the necessary building blocks.
----- END -----
©2019 - 2023 all rights reserved