Query Builder

Search is the backbone of many features in an AEM application, so in business scenarios it is critical to implement an optimized query that fetches the best possible results. To perform searches in AEM, Query Builder is highly recommended over hand-written SQL or XPath query statements. Used correctly, Query Builder covers most query implementations and is a handy way to optimize your queries for better page performance. In this blog post I explain the basics of Query Builder and then move on to advanced concepts, showing at each point how to express a search scenario in Query Builder predicate form. I hope this post helps you with your search performance hurdles in AEM.

What is Query Builder?

Query Builder is an API for creating search queries against the Java Content Repository (JCR). It is an extensible tool: you can add or remove predicates in a query through this API. The best way to create predicates is with the Query Builder Debugger tool: /libs/cq/search/content/querydebug.html. Try to express your business use case in predicate form using this debugger, optimize the query, and then implement it in code.

Anatomy of a Query:

The query description is a set of predicates that evaluates to an XPath (JCR) query in the backend. To understand more, check the screenshot below:

Every predicate is evaluated by a Predicate Evaluator. AEM ships with a number of built-in predicates, and you can always write custom predicates and use them as your business needs require. I go into more detail on creating new predicates later in this post.

Implementation:

You may refer to the Adobe documentation or the API documentation to implement your queries using Query Builder.
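As a rough illustration of the predicate form, the sketch below collects predicates in a plain Java map and renders them as a query string for the Query Builder JSON servlet at /bin/querybuilder.json. The class name, the helper method and the /content/geometrixx path are assumptions made for this example. Inside AEM the same map would normally be handed to the Query Builder API instead of being turned into a URL.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the AEM API.
class PredicateQueryString {

    // Joins the predicates as key=value pairs, URL-encoding keys and values.
    static String toQueryString(Map<String, String> predicates) {
        return predicates.entrySet().stream()
                .map(e -> encode(e.getKey()) + "=" + encode(e.getValue()))
                .collect(Collectors.joining("&"));
    }

    private static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the predicates in insertion order.
        Map<String, String> predicates = new LinkedHashMap<>();
        predicates.put("path", "/content/geometrixx"); // example path, an assumption
        predicates.put("type", "cq:Page");
        predicates.put("p.limit", "10");
        System.out.println("/bin/querybuilder.json?" + toQueryString(predicates));
    }
}
```

Keeping the predicates in a map makes it easy to copy a query from the debugger into code one line at a time.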

Standard Predicates: A deep understanding of predicates is necessary if you want to optimize any of your search queries.

path: Restricts the search to a particular hierarchy.
- path.self=true : If true, searches the subtree including the node given in path; if false, searches the subtree only.
- path.exact=true : If true, only the exact path is matched; if false, all descendants are included.
- path.flat=true : If true, searches only the direct children.
type: Restricts the search to a particular node type.
property: Searches for a specific property.
- property.value : The property value to search for. Multiple values of a property can be given using property.N_value=X, where N is a number starting from 1.
- property.depth : The number of additional levels to search under a node. For example, with property.depth=2 the property is matched as follows:
```
(@jcr:title = 'foo' or */@jcr:title = 'foo' or */*/@jcr:title = 'foo' )
```
- property.and : If multiple values are given for the property, an OR operator is applied by default. If you want an AND, use property.and=true.
- property.operation : "equals" for an exact match (default), "unequals" for an inequality comparison, "like" for the jcr:like XPath function, "not" for no match (the property must not exist; the value parameter is ignored) or "exists" for an existence check (value true means the property must exist, value false behaves like "not").
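To make the effect of property.depth concrete, the following sketch generates the kind of XPath expression shown above for a given depth. It only illustrates the expansion described in this post and is not the code AEM uses internally. The class and method names are made up for the example, and the value is not escaped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of how property.depth widens the match.
class PropertyDepthExpansion {

    // Builds "(@p = 'v' or */@p = 'v' or ...)" with one term per level, 0..depth.
    static String expand(String property, String value, int depth) {
        List<String> terms = new ArrayList<>();
        StringBuilder prefix = new StringBuilder();
        for (int level = 0; level <= depth; level++) {
            // Note: no escaping of quotes in value, this is only a sketch.
            terms.add(prefix + "@" + property + " = '" + value + "'");
            prefix.append("*/"); // one more child level for the next term
        }
        return "(" + String.join(" or ", terms) + ")";
    }

    public static void main(String[] args) {
        System.out.println(expand("jcr:title", "foo", 2));
    }
}
```

Each extra level adds another term to the expression, so a large depth makes the query broader and usually more expensive.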
fulltext: Searches for terms using the fulltext index.
- fulltext.relPath : The relative path to search in (a property or subnode), e.g. fulltext.relPath=jcr:content or fulltext.relPath=jcr:content/@cq:tags
daterange: Searches a date property against a date range.
- daterange.property : The date property to search.
- daterange.lowerBound : The lower bound, e.g. 2010-07-25
- daterange.lowerOperation : ">" (default) or ">="
- daterange.upperBound : The upper bound, e.g. 2013-07-26
- daterange.upperOperation : "<" (default) or "<="
relativedaterange: An extension of daterange that uses offsets relative to the server time. Offsets can be written with units such as 1s 2m 3h 4d 5w 6M 7y (seconds, minutes, hours, days, weeks, months, years).
- relativedaterange.lowerBound : Lower bound offset, default=0
- relativedaterange.upperBound : Upper bound offset.
nodename: Matches node names in the result set. It allows a few wildcards: nodename=text* matches any number of characters (including none) after text, and nodename=text? matches exactly one character after text.
tagid: Searches for a particular tag on a page. You specify the exact tag ID in this predicate.
- tagid.property : The property (or relative path to the property) where the tags are stored. The default is cq:tags.
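The difference between the two nodename wildcards can be checked with a small stand-alone sketch that translates the pattern into a regular expression. This is only a model of the wildcard semantics described above, not AEM's implementation, and the class and method names are assumptions.

```java
import java.util.regex.Pattern;

// Hypothetical model of the nodename wildcards: * = any or no characters, ? = one character.
class NodeNameWildcard {

    // Translates the wildcard pattern into an equivalent regular expression.
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                // Quote literal characters so names like "a.b" are matched literally.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    static boolean matches(String wildcard, String nodeName) {
        return toRegex(wildcard).matcher(nodeName).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("text*", "text"));  // true
        System.out.println(matches("text?", "text"));  // false
        System.out.println(matches("text?", "text1")); // true
    }
}
```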
group: This predicate is used to create logical conditions in your query. You can create complex conditions using OR & AND operators in different groups. e.g:

path=/home/users
type=rep:User
group.1_daterange.property=jcr:created
group.1_daterange.lowerBound=2014-08-18
group.1_daterange.upperBound=2014-08-19
group.2_daterange.property=cq:lastModified
group.2_daterange.lowerBound=2014-08-18
group.2_daterange.upperBound=2014-08-19
group.p.or=true
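The grouped query above can also be assembled in code. The sketch below builds the same predicates as a plain map, with a small helper that adds one numbered daterange to the group. The class and helper names are assumptions for this example. In AEM such a map is typically passed to PredicateGroup.create(map) before creating the query.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that assembles the grouped daterange query as a map.
class GroupedDateRanges {

    // Adds one daterange predicate to the group under the given index.
    static void addDateRange(Map<String, String> predicates, int index,
                             String property, String lowerBound, String upperBound) {
        String prefix = "group." + index + "_daterange.";
        predicates.put(prefix + "property", property);
        predicates.put(prefix + "lowerBound", lowerBound);
        predicates.put(prefix + "upperBound", upperBound);
    }

    static Map<String, String> build() {
        Map<String, String> predicates = new LinkedHashMap<>();
        predicates.put("path", "/home/users");
        predicates.put("type", "rep:User");
        addDateRange(predicates, 1, "jcr:created", "2014-08-18", "2014-08-19");
        addDateRange(predicates, 2, "cq:lastModified", "2014-08-18", "2014-08-19");
        // OR the two dateranges inside the group instead of the default AND.
        predicates.put("group.p.or", "true");
        return predicates;
    }

    public static void main(String[] args) {
        build().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The path and type predicates stay outside the group, so they are still combined with the group using AND.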

orderby: Sorts the result set of the query, e.g. orderby=@jcr:score or orderby=@jcr:content/cq:lastModified
- orderby.sort : The sort direction for the search results: desc for descending, asc for ascending (default).
- orderby=path : Sorts the results by path.
Refining the Results: To refine the results, there are some parameters you can use:
- p.hits=full : Use this when you want to return all the properties of a node.
- p.hits=selective : Use this if you want to return only selected properties in the search result. Use this with
  p.properties=sling:resourceType jcr:primaryType
- p.nodedepth : Use this when you need the properties of a node and its child nodes in the same search result. Use this with p.hits=full.
- p.facets=true : Enables facet extraction for the query. If you want to count the tags present in your search result, or find out how many pages use a particular template, you can use a facet-based search. Example:

type=cq:Page
orderby=@jcr:score
orderby.sort=desc
1_property=jcr:content/cq:tags
2_property=jcr:content/cq:template
2_property.value=/apps/geometrixx/templates/contentpage
p.facets=true

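Use this Java code to extract facets from your search result (out is the JSP writer):

import java.util.Map;
import com.day.cq.search.facets.Bucket;
import com.day.cq.search.facets.Facet;

// result is the SearchResult returned by query.getResult()
Map<String, Facet> facets = result.getFacets();
for (String key : facets.keySet()) {
    Facet facet = facets.get(key);
    // Skip facets that matched nothing in this result set
    if (facet.getContainsHit()) {
        for (Bucket bucket : facet.getBuckets()) {
            // Number of hits that fall into this bucket
            long count = bucket.getCount();
            // Predicate parameters that would narrow the query to this bucket
            Map<String, String> params = bucket.getPredicate().getParameters();
            for (String k : params.keySet()) {
                out.println("<br>k:" + k);
            }
        }
    }
}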

p.limit : Limits the number of search results fetched.
p.offset : Sets the offset (the index of the first result returned) for the search results.
p.guessTotal : Avoids calculating the exact total number of results, which can be costly. With p.guessTotal=true the results are counted only up to the minimum needed for the given p.offset and p.limit. A number can be given instead to set the maximum total to count up to.
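A common use of these three parameters is paging. The sketch below computes them for a 1-based page number. The class and method names are assumptions, and whether p.guessTotal=true suits a given search depends on whether an exact total is needed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical paging helper for Query Builder parameters.
class Paging {

    // Returns p.offset, p.limit and p.guessTotal for a 1-based page number.
    static Map<String, String> pageParams(int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        Map<String, String> params = new LinkedHashMap<>();
        // Page 1 starts at offset 0, page 2 at pageSize, and so on.
        params.put("p.offset", String.valueOf((long) (pageNumber - 1) * pageSize));
        params.put("p.limit", String.valueOf(pageSize));
        // Skip the exact total; only enough results for this page are counted.
        params.put("p.guessTotal", "true");
        return params;
    }

    public static void main(String[] args) {
        System.out.println(pageParams(3, 10)); // {p.offset=20, p.limit=10, p.guessTotal=true}
    }
}
```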

You may find more such predicates here.

In most cases the standard predicates are enough to build queries for any business scenario. However, sometimes we need to create custom predicates, which the next section covers.

Custom Predicate Evaluators:

Broadly, there are two kinds of Predicate Evaluators that can be used to create new predicates as your business needs require.

XPath Predicate: This kind of evaluator contributes an expression to the backend XPath query, based on a new custom predicate that you define as needed. Many of the built-in CQ predicates are XPath predicates. Note that in an XPath Predicate Evaluator the overridden method canXpath() should return true while canFilter() should return false. Use the code snippet below to create custom predicates:


import org.apache.felix.scr.annotations.Component;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.day.cq.search.Predicate;
import com.day.cq.search.eval.AbstractPredicateEvaluator;
import com.day.cq.search.eval.EvaluationContext;

/**
 * <code>OriginPredicateEvaluator</code> queries the Livecopy status of a page.
 *
 * This property is used to find the Livecopy status of the page.
 * origin.value=disconnected gives the XPATH query as jcr:content/@jcr:mixinTypes='cq:LiveSyncCancelled'
 * origin.value=locally gives the XPATH query as jcr:content/@jcr:mixinTypes!='cq:LiveSync'
 * origin.value=inheritted gives the XPATH query as jcr:content/@jcr:mixinTypes='cq:LiveSync'
 *
 * @author hakhan
 */
@Component(metatype = false, factory = "com.day.cq.search.eval.PredicateEvaluator/origin")
public class OriginPredicateEvaluator extends AbstractPredicateEvaluator {
    static final String PE_NAME = "origin";
    static final String JCRCONTENT_JCRMIXIN = "jcr:content/@jcr:mixinTypes";

    static final String PREDICATE_VALUE = "value";
    static final String PREDICATE_LIVESYNCCANC = "'cq:LiveSyncCancelled'";
    static final String PREDICATE_LIVESYNC = "'cq:LiveSync'";

    static final String OP_EQUALS = "=";
    static final String OP_NOT_EQUALS = "!=";

    private static final Logger logger = LoggerFactory
            .getLogger(OriginPredicateEvaluator.class);

    @Override
    public String getXPathExpression(Predicate predicate,
            EvaluationContext context) {

        String value = predicate.get(PREDICATE_VALUE);

        StringBuilder sb = new StringBuilder();

        if (value != null) {
            if (value.equalsIgnoreCase("inheritted")) {
                sb.append(JCRCONTENT_JCRMIXIN).append(OP_EQUALS);
                sb.append(PREDICATE_LIVESYNC);
            }
            if (value.equalsIgnoreCase("disconnected")) {
                sb.append(JCRCONTENT_JCRMIXIN).append(OP_EQUALS);
                sb.append(PREDICATE_LIVESYNCCANC);
            }
            if (value.equalsIgnoreCase("locally")) {
                sb.append(JCRCONTENT_JCRMIXIN).append(OP_NOT_EQUALS);
                sb.append(PREDICATE_LIVESYNC);
            }
        }

        String xpath = sb.toString();

        logger.debug("**********XPATH::**********" + xpath);

        return xpath;
    }

    @Override
    public boolean canXpath(Predicate predicate, EvaluationContext context) {
        return true;
    }

    @Override
    public boolean canFilter(Predicate predicate, EvaluationContext context) {
        return false;
    }
}

Filter Predicate: This kind of predicate is used whenever you want to filter out results that are not needed in the final search result. Note that in a Filter Predicate Evaluator the overridden method canXpath() should return false while canFilter() should return true.


import javax.jcr.query.Row;

import org.apache.felix.scr.annotations.Component;
import org.apache.sling.api.resource.Resource;
import org.apache.sling.resource.collection.ResourceCollection;

import com.day.cq.search.Predicate;
import com.day.cq.search.eval.AbstractPredicateEvaluator;
import com.day.cq.search.eval.EvaluationContext;

@Component(metatype = false, factory = "com.day.cq.search.eval.PredicateEvaluator/samplepredicate")
public class SampleFilterPredicateEvaluator extends AbstractPredicateEvaluator {
    public static final String SAMPLE = "samplepredicate";

    @Override
    public boolean includes(Predicate p, Row row, EvaluationContext context) {
        // No value given for the predicate: keep every row.
        if (!p.hasNonEmptyValue(SAMPLE)) {
            return true;
        }
        /* Write some code logic here as per the condition:
        Return true for a favourable condition, to keep the entity in the search results.
        Return false for an unfavourable condition, to remove the entity from the search results.
        */

        return false;
    }

    @Override
    public boolean canXpath(Predicate predicate, EvaluationContext context) {
        return false;
    }

    @Override
    public boolean canFilter(Predicate predicate, EvaluationContext context) {
        return true;
    }

}
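The contract of includes() can be tried out without an AEM instance. The sketch below models each result row as a simple map of property values and keeps only the rows where two properties hold the same value. Everything here (the class name, the row model, the property names) is an assumption for illustration. A real evaluator would read the values from the Row or its resource instead.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical stand-alone model of a filter predicate.
class FilterSketch {

    // Same contract as includes(): true keeps the row, false removes it.
    static boolean includes(Map<String, String> row, String first, String second) {
        String value = row.get(first);
        return value != null && value.equals(row.get(second));
    }

    // Applies the filter to every row, as happens after the query has run.
    static List<Map<String, String>> filter(List<Map<String, String>> rows,
                                            String first, String second) {
        return rows.stream()
                .filter(row -> includes(row, first, second))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, String>> rows = List.of(
                Map.of("propA", "x", "propB", "x"),
                Map.of("propA", "x", "propB", "y"),
                Map.of("propA", "x"));
        System.out.println(filter(rows, "propA", "propB").size()); // 1
    }
}
```

Because the check runs once per row after the query has executed, the cost grows with the size of the unfiltered result set. This is why it helps to narrow the query with XPath-capable predicates first.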

Improving Search Performance

By far this is the most important question in any project, and it is not that difficult. Follow a few steps, stay aware of a few things, and you will be able to tune a query close to its best performance.

- Tune AEM indexing for the appropriate nodes.
- Use the AEM diagnosis tools to monitor all queries.
- Build the query with as many predicates as apply to the node, as long as each one reduces the search pool. For example, if you are searching for a component node with property=sling:resourceType, add a nodename predicate too to make the search quicker.
- Keep in mind what you need in the search results. If you need cq:Page nodes, searching with type=nt:unstructured is a bad idea.
- Always check that the results still meet the business need after the grouping and logic you apply in your predicate-based search.
- Reduce the post-processing of the search results as much as possible. For example, it is better to use facets than to process the results again.
- Go for custom Predicate Evaluators if you cannot express your complex query with the existing ones, or if you think custom predicates would simplify the query considerably.
- Depending on your application logic, if the result set is large, do not load all the results into the DOM. Load them partially using the p.limit and p.offset parameters.
- If the search is for anonymous users and no permission-sensitive search is needed, use p.guessTotal=true. With this parameter Query Builder counts results only up to the minimum needed for the current p.offset and p.limit instead of computing the exact total. This avoids reading and permission-checking every node of the result set for that session, which improves query performance.


17 thoughts on “Query Builder”

shakeeb (@saeyaz) says:

December 25, 2015 at 9:22 pm

Hi Hashim

thanks for sharing and it is really nice to see such good thing at one place. :)

Sreenath (@sree_nk) says:

April 11, 2017 at 4:34 am

Hi Hashim
i want to write a query which will query the root path but skip some sub paths

eg: i want assets which are under /content/dam but not under /content/dam/a/ and /content/dam/b/

- Hashim Khan says:
  
  April 11, 2017 at 6:30 am
  
  Hi ,
  Can you try to build up your query using these links –
  http://help-forums.adobe.com/content/adobeforums/en/experience-manager-forum/adobe-experience-manager.topic.html/forum__vquz-hi_sir_madamih.html
  http://stackoverflow.com/questions/22510025/how-do-i-add-a-where-not-to-a-querybuilder-query
  
  - Sreenath (@sree_nk) says:
    
    May 29, 2017 at 10:02 am
    
    this option is ther in SQL2 query. made use of that. thank you
    
Shameer Tarigonda says:

July 31, 2017 at 8:14 am

Hi Hashim,

I want to retrieve all the pages where two properties have same value. Since the value is not known here.Is it possible in xpath or sql2

For ex: SELECT p.* FROM [nt:unstructured] AS p WHERE ISDESCENDANTNODE(p,'/content/test/page') AND p.[Property_name1] = p.[Property_name2]

- Hashim Khan says:
  
  August 8, 2017 at 4:18 pm
  
  Hi Shameer,
  That's a good question. I think you can do it with XPath and Query Builder too, but it is a roundabout way to do so. For such complex scenarios it is better to keep it simple and go with SQL.
  
  If you still want to do it, you can create a Filter Predicate that works on the results of a search for Property1 and Property2 with any value, and then keeps only the results where the two values match. Filter predicates are generally expensive, which is why it is better to go with simple SQL.
  
King Kohli (@iamkingkohli) says:

October 13, 2017 at 9:06 pm

Does query builder traverse more than 10,000 nodes. If so does it have high impact on the servers?

- Hashim Khan says:
  
  October 16, 2017 at 3:04 pm
  
  Yes it does have an impact. There should be proper indexing to fix it up. I will soon be publishing a blog for such cases.
  
WebFuse says:

March 12, 2018 at 7:30 am

Hi @Hashim,
Is there a way we can see the jcr:score value? I am facing an issue where I am sorting the query result based on jcr:score descending but the first two results are coming fine(expected) but the same kind of result is coming in the last of the result. Hence wanted to see what is the jcr:score of each of the results? Is there a way to retrieve this information?

- Hashim Khan says:
  
  March 12, 2018 at 3:18 pm
  
  Hi,
  jcr:score is an internal property of Lucene which is calculated using a complex mathematical formula. You can get more insight from the Similarity Class http://lucene.apache.org/core/3_0_3/api/core/org/apache/lucene/search/Similarity.html
  
  Although there isnt a direct way to obtain the jcr:score directly , but I think you can explore more by overriding the Class methods to retrieve a value –
  http://www.lucenetutorial.com/advanced-topics/scoring.html
  
Swathi Chowdhari says:

March 15, 2018 at 3:48 am

hi, I want to fetch assets whose metadata node doesnt have node child node say

asset > jcr:content> metadata>xnode

I want all those assets which doesnt have xnode, can i do it using any of the query techniques? or it can be done only via code?

- Hashim Khan says:
  
  March 15, 2018 at 11:58 am
  
  Hi,
  I don't think there is a direct way to do this out of the box using the existing predicates. If it is only a one-time task, you can use Groovy to filter and check the results.
  Another good approach is to use Filter Predicates. Search for all dam:Asset nodes that have a metadata node, and then use a filter predicate to keep only those results that do not have the xnode child node. This is clean but expensive, as the result set is processed after the search has been performed.
  
  Check the post above for “samplepredicate” or use this
  https://github.com/Adobe-Consulting-Services/acs-aem-samples/blob/master/bundle/src/main/java/com/adobe/acs/samples/search/querybuilder/impl/SampleFilteringPredicateEvaluator.java
  
Ajay Paul says:

June 21, 2018 at 1:49 pm

Hi,
I am writing a fullText query in a way that for each locale node(example: /content/consumer/en-us), it should search for every node and subnode and return the path String which does not have example “en-us” (example: /content/consumer/de-de) .
and it should do this for every locale:

Till now I could only do till this much:

/jcr:root/content/consumer/en-us//element(*, nt:base)
[
(jcr:contains(., '/content/consumer/'))
]

Alternatively tried with query builder:
however, Strangely, Search Result is giving me comparatively far less number of results than direct xpath query in crxde ?

- Hashim Khan says:
  
  June 28, 2018 at 5:44 pm
  
  What is the Query builder query you are trying to use ?
  
  - Ajay Paul says:
    
    June 29, 2018 at 1:16 pm
    
    Hi Hashim,
    
    Here is my full requirement:
    I am writing a query to search all pathfield for each locale, which does not belong to that locale.
    For example: under /content/consumer/ja-jp/, I should be able to get all the authored pathfields which has a non ja-jp in it. Like each pathfeld with: /content/consumer/en-us/…/.. and this I should be able to do for every locale.
    
    To do that:
    The query builder code is for a fullText search.
    I am almost there, except of the method hit.getPath() I am only able to fetch the current Node of the authored text and not the text itself. Which makes the condition always true.
    
    Whereas, if I could get the actual authored text, it would point to the node, where searched text could be found.
    
    following is my code:
    
    @Override
    protected void doPost(final SlingHttpServletRequest request, final SlingHttpServletResponse response) {
    String fullTextPath = StringUtils.EMPTY;
    String subNodeName = StringUtils.EMPTY;
    String locale= StringUtils.EMPTY;
    Node node = null;
    Resource resource = request.getResourceResolver().getResource(PATH);
    if (resource != null) {
    node = resource.adaptTo(Node.class);
    }
    ResourceResolver resourceResolver = request.getResourceResolver();
    
    try {
    NodeIterator list = node.getNodes();
    
    while (list.hasNext()) {
    Node currentSubNode = list.nextNode();
    subNodeName = currentSubNode.getPath();
    locale= extractLocaleNodeName(subNodeName);
    fullTextPath = PATH + locale;
    
    Map map = new HashMap();
    map.put(TYPE_PREDICATE, "nt:base");
    map.put(PATH_PREDICATE, subNodeName);
    map.put(FULLTEXT_SEARCH_PREDICATE, fullTextPath);
    map.put("p.excerpt", "true");
    map.put(SEARCH_LIMIT_PREDICATE, "-1");
    
    if (StringUtils.isNotBlank(locale)) {
    Query query = queryBuilder.createQuery(PredicateGroup.create(map),
    resourceResolver.adaptTo(Session.class));
    SearchResult result = query.getResult();
    
    for (Hit hit : result.getHits()) {
    
    locale= locale.replaceAll(PATH_REPLACEMENT, StringUtils.EMPTY);
    if (checkNegativeLookAhead(hit.getPath(), locale))
    // or (checkNegativeLookAhead(hit.getExcerpt(), locale))
    {
    continue;
    } else {
    LOG.info("Negative HITS for locale" + locale+ ": " + hit.getPath());
    }
    
    }
    }
    
    }
    
    } catch (RepositoryException e) {
    e.printStackTrace();
    }
    }
    
    private static boolean checkNegativeLookAhead(String resultPath, String locale) {
    String differentLocale = StringUtils.EMPTY;
    
    Pattern pattern = Pattern.compile("\\/(?:[a-z]+)(-)(?:[a-z]+)\\/", Pattern.CASE_INSENSITIVE);
    Matcher matcher = pattern.matcher(resultPath);
    if (matcher.find()) {
    differentLocale = matcher.group();
    }
    locale= locale.replaceAll(PATH_REPLACEMENT, StringUtils.EMPTY);
    differentLocale = differentLocale.replaceAll(PATH_REPLACEMENT, StringUtils.EMPTY);
    if (differentLocale.equalsIgnoreCase(locale)) {
    return true;
    }
    return false;
    
    }
    
    private static String extractLocaleNodeName(String subNodeName) {
    String locale = StringUtils.EMPTY;
    Pattern pattern = Pattern.compile("\\/(?:[a-z]+)(-)((?:[a-z]+))", Pattern.CASE_INSENSITIVE);
    Matcher matcher = pattern.matcher(subNodeName);
    if (matcher.find()) {
    locale = matcher.group();
    }
    return locale;
    }
    
    }
    
    - Hashim Khan says:
      
      July 2, 2018 at 11:28 am
      
      Hi Ajay,
      Your logic and approach to the problem statement are correct. But I believe the more efficient would be to use Filter predicate. Step 1 should be to find all the nodes with the specified property where the path is defined. The second step should be to write a Filter Predicate to choose only those results which don’t match the current locale. You can write your pattern search there.
      https://hashimkhan.in/2015/12/02/query-builder/ Read this for Filter Predicate.
      
Karthik P says:

October 25, 2018 at 12:31 pm

Hi Hashim,

There is an image called “car.jpeg” in both publish instances, but in one of the publish instances when I execute a query in querydebug console which is “fulltext=car.jpeg” , i’m not getting results eventhough the image exists in publish instance.. I’m unable to find the cause for it.
