How to Do an Upsert Operation in OpenSearch

In this post we will learn how to do an upsert operation in OpenSearch, that is, update a document if it already exists and insert it if it does not. There are a couple of ways to do it: one uses the Index operation and the other uses the Update operation. They work differently, so we have to choose carefully based on our use case.

Upsert using Index operation:

The Index operation does an upsert by default. If the document id is not present in OpenSearch, the index operation inserts a new document. If OpenSearch already has a document with the same id, the index operation replaces that document with the new one and increments the document version.

But there is no partial update option here. The Index operation replaces the whole document with the new document payload. Let’s say you indexed a document with 10 fields. Then you update the same document with an index operation, and in the request payload you send only the 2 fields that actually changed. OpenSearch will completely replace the existing document with whatever was sent in the latest index operation. So you will end up with a document that has only 2 fields instead of 10.
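
To make the replacement behavior concrete, here is a minimal in-memory sketch in plain Java. It is only a toy model, not the OpenSearch client: the class IndexReplaceSketch, its index method and the HashMap-based store are hypothetical names made up for illustration, and it assumes that versions start at 1.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of index semantics (hypothetical; not the OpenSearch API).
class IndexReplaceSketch {

    // Stored documents and their versions, keyed by document id.
    static final Map<String, Map<String, Object>> docs = new HashMap<>();
    static final Map<String, Long> versions = new HashMap<>();

    // Index = insert if the id is new, otherwise replace the WHOLE document.
    static long index(String id, Map<String, Object> payload) {
        docs.put(id, new HashMap<>(payload));     // old fields are discarded
        return versions.merge(id, 1L, Long::sum); // 1 on insert, +1 on replace
    }

    public static void main(String[] args) {
        index("1", Map.of("title", "old title", "author", "abc", "year", 2023));
        long version = index("1", Map.of("title", "new title"));
        // prints "{title=new title} version=2": author and year are gone
        System.out.println(docs.get("1") + " version=" + version);
    }
}
```

The second call sends only the title field, so the stored document ends up with only that field, which is the pitfall described above.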

Upsert using Update operation:

The Update operation doesn’t do an upsert by default. If we use the update operation on a document that doesn’t already exist in OpenSearch, we get a 404 error response with the message “document missing”. This is a normal update operation HTTP request:

POST index1/_update/1
{
  "doc": {
    "title": "new title"
  }
}

To enable upsert on the update operation, we need to add doc_as_upsert to the above HTTP request:

POST index1/_update/1
{
  "doc": {
    "title": "new title"
  },
  "doc_as_upsert": true
}

The corresponding Java code for the upsert operation looks like this:

import java.io.IOException;
import java.util.Map;

import org.opensearch.client.opensearch.OpenSearchClient;
import org.opensearch.client.opensearch._types.OpenSearchException;
import org.opensearch.client.opensearch.core.UpdateRequest;
import org.opensearch.client.opensearch.core.UpdateResponse;

public class OpenSearchServiceImpl {

	private OpenSearchClient client = OpenSearchClientFactory.getInstance();

	public void doUpsert() throws OpenSearchException, IOException {

		String index = "index1";
		String docId = "1";

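		// Partial document: only the fields that should change.
		Map<String, Object> updatedField = Map.of("title", "new title");
		// docAsUpsert(true): if the document is missing, the partial document is inserted as a new document.
		UpdateRequest<Map, Map> updateRequest = new UpdateRequest.Builder<Map, Map>().index(index).id(docId)
				.doc(updatedField).docAsUpsert(true).build();
		UpdateResponse<Map> updateResponse = client.update(updateRequest, Map.class);
		// result() tells whether the document was created or updated; version() is the resulting version.
		System.out.println("upsert: " + updateResponse.result().name() + " " + updateResponse.version());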

	}
}

By the way, I have already written a blog post on how to do Index and normal Update operations in OpenSearch using Java. You can check it out if needed.

Upsert with the Update operation supports partial updates. Let’s say we indexed a document with 10 fields. If we pass 2 fields in the next update request, only those 2 fields get updated in the existing document. The other 8 fields remain the same.
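
The merge behavior can be sketched in plain Java as well. Again this is a toy in-memory model and not the OpenSearch client: UpdateMergeSketch and its update method are hypothetical names, the documents are assumed to have flat fields only (so a shallow merge is enough here), and a NoSuchElementException stands in for the 404 "document missing" response.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Toy model of update semantics (hypothetical; not the OpenSearch API).
class UpdateMergeSketch {

    // Stored documents, keyed by document id.
    static final Map<String, Map<String, Object>> docs = new HashMap<>();

    // Update = merge the partial document into the stored one.
    // A missing document is an error unless docAsUpsert is true.
    static String update(String id, Map<String, Object> partial, boolean docAsUpsert) {
        Map<String, Object> existing = docs.get(id);
        if (existing == null) {
            if (!docAsUpsert) {
                throw new NoSuchElementException("document missing"); // the 404 case
            }
            docs.put(id, new HashMap<>(partial)); // the partial document becomes the new document
            return "created";
        }
        existing.putAll(partial); // only the supplied fields change (shallow merge)
        return "updated";
    }

    public static void main(String[] args) {
        // prints "created"
        System.out.println(update("1", Map.of("title", "old title", "author", "abc"), true));
        // prints "updated"
        System.out.println(update("1", Map.of("title", "new title"), true));
        // prints "new title, abc": the author field was left untouched
        System.out.println(docs.get("1").get("title") + ", " + docs.get("1").get("author"));
    }
}
```

Calling update with docAsUpsert set to false on an id that is not in the store throws, which mirrors the plain update request failing on a missing document.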

Another difference is that the Index operation supports external versioning of documents. Basically, you can provide the document version yourself in the index request (the version parameter together with version_type=external). The Update operation doesn’t support external versioning as of now. There is a workaround, which I might explain in a future post.
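
As a rough illustration of what external versioning means, here is a small plain-Java sketch of the acceptance rule. It is a toy model under one assumption: with the external version type, a write is accepted only when the supplied version is strictly higher than the stored one (or when no document exists yet). ExternalVersionSketch and indexWithExternalVersion are hypothetical names, and the false return value stands in for a version conflict error.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the external versioning rule (hypothetical; not the OpenSearch API).
class ExternalVersionSketch {

    // Last accepted version per document id.
    static final Map<String, Long> versions = new HashMap<>();

    // Accept the write only if the caller's version is strictly higher than the stored one.
    static boolean indexWithExternalVersion(String id, long externalVersion) {
        Long stored = versions.get(id);
        if (stored != null && externalVersion <= stored) {
            return false; // stands in for a version conflict
        }
        versions.put(id, externalVersion);
        return true;
    }

    public static void main(String[] args) {
        System.out.println(indexWithExternalVersion("1", 5)); // prints "true": no document yet
        System.out.println(indexWithExternalVersion("1", 5)); // prints "false": not higher than 5
        System.out.println(indexWithExternalVersion("1", 8)); // prints "true": 8 > 5
    }
}
```

This is useful when another system, such as a primary database, owns the version numbers and stale writes should be rejected.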

So these are the main functional differences between the Index operation and the Update operation. You can choose based on the use case you need to support.
