How to Write an OpenSearch Multi-Match Query in Java

Here we will see how to write an OpenSearch multi-match query using the Java SDK. The OpenSearch multi-match query works with String type fields only, and we will write Java code for that case first. We will also write some Java code that emulates a multi-match query for non-String fields such as Integer, Long, Double or Boolean.

First of all, we need the dependencies below in our build.gradle file. I have used the latest versions of the libraries available at the time of writing. You can use other versions too.

implementation 'org.opensearch.client:opensearch-rest-client:2.3.0'
implementation 'org.opensearch.client:opensearch-java:2.0.0'

If you are using Maven instead of Gradle, you can add the same libraries to your pom.xml file.

Multi-match query for String data type:

When we use any kind of String field in OpenSearch (e.g. a field with type text, keyword, search-as-you-type etc.), we can directly use the multi-match query provided by the OpenSearch API. This multi-match query only works with String type fields. Let’s say we have two fields, “title” and “message”, in an OpenSearch document, and we want to check whether the text “cat and dog” matches either of them. Written as a REST request, it would look similar to below:

GET /_search
{
  "query": {
    "multi_match" : {
      "query":    "cat and dog", 
      "fields": [ "title", "message" ] 
    }
  }
}

We can write the same query in Java as below:

import java.io.IOException;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import org.opensearch.client.opensearch.OpenSearchClient;
import org.opensearch.client.opensearch._types.OpenSearchException;
import org.opensearch.client.opensearch._types.query_dsl.MultiMatchQuery;
import org.opensearch.client.opensearch._types.query_dsl.Query;
import org.opensearch.client.opensearch.core.SearchRequest;
import org.opensearch.client.opensearch.core.SearchResponse;
import org.opensearch.client.opensearch.core.search.Hit;

public class OpenSearchServiceImpl {

	// OpenSearchClientFactory is not shown here; it is assumed to return a configured client.
	private OpenSearchClient client = OpenSearchClientFactory.getInstance();

	public void multiMatchSearchForStringFields() throws OpenSearchException, IOException {
		List<String> fieldsToBeSearched = List.of("title", "message");
		// Search the same text in every listed field.
		MultiMatchQuery multiMatchQuery = new MultiMatchQuery.Builder().fields(fieldsToBeSearched).query("cat and dog")
				.build();
		SearchRequest searchRequest = new SearchRequest.Builder().index("index-name").from(0).size(10)
				.query(new Query.Builder().multiMatch(multiMatchQuery).build()).build();
		// Each hit's _source is deserialized into a Map.
		SearchResponse<Map> searchResponse = client.search(searchRequest, Map.class);
		List<Map> results = searchResponse.hits().hits().stream().map(Hit::source).collect(Collectors.toList());
	}

}

By the way, the multi-match query supports several types, such as “best_fields” and “most_fields”. “best_fields” is the default, and I think it is the most useful one for common use cases. It matches the same text against each field and uses the score of the best-matching field. For example, let’s say document1 has both “cat” and “dog” in the “title” field, while document2 has “cat” in the “title” field and “dog” in the “message” field. A “best_fields” match gives document1 the higher score because it has a better match within a single field. I think that makes sense in most scenarios. The “most_fields” type, on the other hand, favors documents in which more fields match the text. You can find more about these options in the official Elasticsearch/OpenSearch documentation; here we focus on the Java implementation. If you want to use a type other than the default, you can write the Java code as below:

		// Requires: import org.opensearch.client.opensearch._types.query_dsl.TextQueryType;
		MultiMatchQuery multiMatchQuery = new MultiMatchQuery.Builder().fields(fieldsToBeSearched).query("cat and dog")
				.type(TextQueryType.MostFields).build();
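
To make the difference between “best_fields” and “most_fields” more concrete, here is a toy model in plain Java that needs no cluster. It is only a sketch: the class MultiMatchTypeSketch and its methods are hypothetical names, and the per-field score is assumed to be a simple count of matching query terms, whereas real OpenSearch scores come from BM25. Only the way the per-field scores are combined (maximum versus sum) mirrors the two types.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of multi-match scoring; real OpenSearch scoring uses BM25, not term counts. */
final class MultiMatchTypeSketch {

    /** Assumed per-field score: number of distinct query terms found in the field text. */
    static int fieldScore(String fieldText, String query) {
        Set<String> fieldTerms = new HashSet<>(Arrays.asList(fieldText.toLowerCase().split("\\s+")));
        Set<String> queryTerms = new HashSet<>(Arrays.asList(query.toLowerCase().split("\\s+")));
        queryTerms.retainAll(fieldTerms);
        return queryTerms.size();
    }

    /** best_fields (without tie breaker): the single best field decides the score. */
    static int bestFields(Map<String, String> doc, List<String> fields, String query) {
        return fields.stream()
                .mapToInt(field -> fieldScore(doc.getOrDefault(field, ""), query))
                .max()
                .orElse(0);
    }

    /** most_fields: the scores of all fields are added together. */
    static int mostFields(Map<String, String> doc, List<String> fields, String query) {
        return fields.stream()
                .mapToInt(field -> fieldScore(doc.getOrDefault(field, ""), query))
                .sum();
    }

    public static void main(String[] args) {
        List<String> fields = List.of("title", "message");
        String query = "cat and dog";
        Map<String, String> document1 = Map.of("title", "cat and dog");
        Map<String, String> document2 = Map.of("title", "cat", "message", "dog");

        System.out.println("best_fields: document1=" + bestFields(document1, fields, query)
                + ", document2=" + bestFields(document2, fields, query));
        System.out.println("most_fields: document1=" + mostFields(document1, fields, query)
                + ", document2=" + mostFields(document2, fields, query));
    }
}
```

In this toy model document1 scores 3 under both types, while document2 scores 1 under “best_fields” and 2 under “most_fields”, so spreading the terms over several fields only helps with “most_fields”.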

Multi-match query for non-string data type:

OpenSearch provides no ready-made multi-match query for non-string fields. So if you want to search multiple Integer, Long, Double or Boolean fields in an OpenSearch document, you can’t use the multi-match query shown above. As a workaround, we can write a match query for each individual field and then combine all of them in a boolean query using the should option. That way, a document becomes part of the overall results as soon as a single field matches. It is basically an “or” over the individual match queries. Let’s say we have a document with “min_amount” and “max_amount”, which are double fields, and we want to check whether either of these fields equals 100. The search request would look similar to below:

GET _search
{
  "query": {
    "bool" : {
      "should" : [
        {"match": {"min_amount": 100}},
        {"match": {"max_amount": 100}}
      ]
    }
  }
}

For non-string fields there is no partial match. A field is either an exact match or no match at all. If we convert this request to Java code, it would look as below:

import java.io.IOException;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import org.opensearch.client.opensearch.OpenSearchClient;
import org.opensearch.client.opensearch._types.FieldValue;
import org.opensearch.client.opensearch._types.OpenSearchException;
import org.opensearch.client.opensearch._types.query_dsl.BoolQuery;
import org.opensearch.client.opensearch._types.query_dsl.MatchQuery;
import org.opensearch.client.opensearch._types.query_dsl.Query;
import org.opensearch.client.opensearch.core.SearchRequest;
import org.opensearch.client.opensearch.core.SearchResponse;
import org.opensearch.client.opensearch.core.search.Hit;

public class OpenSearchServiceImpl {

	private OpenSearchClient client = OpenSearchClientFactory.getInstance();

	public void multiMatchSearchForNonStringFields() throws OpenSearchException, IOException {
		List<String> fieldsToBeSearched = List.of("min_amount", "max_amount");
		BoolQuery.Builder multiMatchQueryBuilder = new BoolQuery.Builder();
		for (String fieldToBeSearched : fieldsToBeSearched) {
			// The fields are doubles, so wrap the searched number as a double value.
			FieldValue fieldValue = new FieldValue.Builder().doubleValue(100).build();
			MatchQuery matchQuery = new MatchQuery.Builder().field(fieldToBeSearched).query(fieldValue).build();
			Query query = new Query.Builder().match(matchQuery).build();
			multiMatchQueryBuilder.should(query);
		}
		SearchRequest searchRequest = new SearchRequest.Builder().index("index-name").from(0).size(10)
				.query(new Query.Builder().bool(multiMatchQueryBuilder.build()).build()).build();
		SearchResponse<Map> searchResponse = client.search(searchRequest, Map.class);
		List<Map> matchedDocuments = searchResponse.hits().hits().stream().map(Hit::source)
				.collect(Collectors.toList());
	}

}
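
The “or” behaviour of the should clauses, together with exact matching on numbers, can be illustrated with a small in-memory sketch. It does not call OpenSearch at all; the class ShouldClauseSketch, its methods and the sample documents are made up for illustration, and documents are assumed to be simple maps from field name to double value.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** In-memory illustration of "any field equals the value"; not an OpenSearch call. */
final class ShouldClauseSketch {

    /** True if at least one of the listed fields holds exactly the searched number. */
    static boolean matchesAny(Map<String, Double> doc, List<String> fields, double value) {
        return fields.stream().anyMatch(field -> {
            Double fieldValue = doc.get(field);
            return fieldValue != null && fieldValue.doubleValue() == value;
        });
    }

    /** Keeps only the documents for which at least one field matches. */
    static List<Map<String, Double>> search(List<Map<String, Double>> docs, List<String> fields, double value) {
        return docs.stream()
                .filter(doc -> matchesAny(doc, fields, value))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> fields = List.of("min_amount", "max_amount");
        List<Map<String, Double>> docs = List.of(
                Map.of("min_amount", 100.0, "max_amount", 500.0),  // matches on min_amount
                Map.of("min_amount", 50.0, "max_amount", 100.0),   // matches on max_amount
                Map.of("min_amount", 10.0, "max_amount", 1000.0)); // 1000 is not a partial match for 100
        System.out.println(search(docs, fields, 100));
    }
}
```

Only the first two sample documents are returned: one matching field is enough, and 1000 does not count as a match for 100.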

This boolean query has no “best_fields” or “most_fields” options. It is a plain “or” condition. If we want to give a higher priority to the “min_amount” field, we can boost that specific match query. Then a “min_amount” match will have a higher score than a “max_amount” match. You can do that using the Java code below:

		for (String fieldToBeSearched : fieldsToBeSearched) {
			// The fields are doubles, so wrap the searched number as a double value.
			FieldValue fieldValue = new FieldValue.Builder().doubleValue(100).build();
			MatchQuery.Builder matchQueryBuilder = new MatchQuery.Builder().field(fieldToBeSearched).query(fieldValue);
			if (fieldToBeSearched.equals("min_amount")) {
				matchQueryBuilder.boost(2f);
			}
			Query query = new Query.Builder().match(matchQueryBuilder.build()).build();
			multiMatchQueryBuilder.should(query);
		}
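
The effect of the boost on ranking can be sketched with another small toy model. It rests on an assumption that is not stated in this post: every matching numeric clause is taken to contribute a constant score of 1 multiplied by its boost, and the should clauses are summed. The class BoostedShouldSketch and its method are hypothetical names, and actual scores returned by OpenSearch may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy scoring model: each matching clause adds (1 x boost); assumed, not taken from OpenSearch. */
final class BoostedShouldSketch {

    /** Sums the boosts of all clauses whose field holds exactly the searched number. */
    static double score(Map<String, Double> doc, Map<String, Float> boostByField, double value) {
        double total = 0.0;
        for (Map.Entry<String, Float> clause : boostByField.entrySet()) {
            Double fieldValue = doc.get(clause.getKey());
            if (fieldValue != null && fieldValue.doubleValue() == value) {
                total += clause.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // min_amount is boosted by 2, max_amount keeps the default boost of 1.
        Map<String, Float> boostByField = new LinkedHashMap<>();
        boostByField.put("min_amount", 2f);
        boostByField.put("max_amount", 1f);

        Map<String, Double> matchesMin = Map.of("min_amount", 100.0, "max_amount", 500.0);
        Map<String, Double> matchesMax = Map.of("min_amount", 50.0, "max_amount", 100.0);

        System.out.println("min_amount match: " + score(matchesMin, boostByField, 100));
        System.out.println("max_amount match: " + score(matchesMax, boostByField, 100));
    }
}
```

Under this model the document matching on “min_amount” scores 2.0 and the one matching on “max_amount” scores 1.0, so the boosted field ranks first.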
