the result was computed repeat when in a search with two aggregations in one field #90

hengfeiyang · 2022-01-19T11:37:50Z

the search query like this:

query := bluge.NewBooleanQuery().AddMust(bluge.NewMatchAllQuery())
searchRequest := bluge.NewTopNSearch(1, query).WithStandardAggregations()
searchRequest.AddAggregation("my-agg-min", aggregations.Min(search.Field("rating")))
searchRequest.AddAggregation("my-agg-avg", aggregations.Avg(search.Field("rating")))
searchRequest.AddAggregation("my-agg-sum", aggregations.Sum(search.Field("rating")))

in this case, my-agg-sum result value large than the real value 3 multiple. equal to rating field appear times. it seems computed repeat.

The text was updated successfully, but these errors were encountered:

mschoch · 2022-01-19T13:33:17Z

Yes, you are correct, that if you compute multiple aggregations on the same field, the code we have to try and accumulate the list of field we care about will contain duplicates.

Do you know if this actually causes problematic behavior? Like, do we do extra work 3 times because it appears multiple times in that list?

I'm fine with the fix, but I think we should try to add a test case to illustrate the problematic behavior.

mschoch · 2022-01-19T20:09:17Z

I did remember one more detail related to this.

One concern I had at the time I originally wrote this part, was that because aggregations can be nested, enumerating the fields required involves enumerating the fields required for all child aggregations as well. I didn't like the idea that we de-duplicate this list at every level, especially when we immediately merge it with another list, and the have de-duplicate that again. Based on that thinking, I figured, so what if this method returns duplicates, if any caller is going to use this list to do real work, they should do one de-duplication pass at the very end.

So, I'm pretty sure that's why this method does not do any de-duplication today. It would help to know more precisely how this current behavior is problematic, as maybe there is some other solution...

hengfeiyang · 2022-01-20T01:17:50Z

i saw the flow on aggregation, one documentMatch will deeply flow in every nested aggregation, so one field should be compute one time, this is still a problem.
i fixed in my pull cetainly will de-duplicate at every level, i will try to find other better solution.

hengfeiyang mentioned this issue Jan 19, 2022

fix: repeat compute in case one field aggregation multiple #91

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

the result was computed repeat when in a search with two aggregations in one field #90

the result was computed repeat when in a search with two aggregations in one field #90

hengfeiyang commented Jan 19, 2022

mschoch commented Jan 19, 2022

mschoch commented Jan 19, 2022

hengfeiyang commented Jan 20, 2022

the result was computed repeat when in a search with two aggregations in one field #90

the result was computed repeat when in a search with two aggregations in one field #90

Comments

hengfeiyang commented Jan 19, 2022

mschoch commented Jan 19, 2022

mschoch commented Jan 19, 2022

hengfeiyang commented Jan 20, 2022