Multi row range filters? #344

alpuy · 2021-05-04T19:54:15Z

I have the following data table:
columns = {
"meas_point_key": {"cf": "rowkey", "col": "key1", "type": "string", "length":"8"},
"date_key": {"cf": "rowkey", "col": "key2", "type": "string", "length":"14"},
"magnitude_key": {"cf": "rowkey", "col": "key3", "type": "string", "length":"2"},
"meas_int_key": {"cf": "rowkey", "col": "key4", "type": "string", "length":"1"},
"source_key": {"cf": "rowkey", "col": "key5", "type": "string"},
"date": {"cf": "IV", "col": "D", "type": "bigint"},
"file": {"cf": "IV", "col": "F", "type": "string"},
"last_update_date": {"cf": "IV", "col": "L", "type": "bigint"},
"magnitude": {"cf": "IV", "col": "M", "type": "bigint"},
"meas_int": {"cf": "IV", "col": "MI", "type": "bigint"},
"meas_point": {"cf": "IV", "col": "MP", "type": "bigint"},
"source": {"cf": "IV", "col": "S", "type": "bigint"},
"value": {"cf": "IV", "col": "V", "type": "double"},
"last_update_val": {"cf": "IV", "col": "LAV", "type": "bigint"},
"val_det": {"cf": "IV", "col": "VD", "type": "string"},
"val_res": {"cf": "IV", "col": "VR", "type": "string"}
}

I want to scan based on the row key with the following filters:

df = df1.where(
    (df1.meas_point_key.isin(meter_list_B.value))
    & (df1.magnitude_key == "13")
    & (df1.date_key >= '01588302000000')
    & (df1.date_key <= '01593572400000')
    & (df1.meas_int_key == '1')
)

Here meter_list_B is a broadcast list of string values containing about 15,000 entries.

Is this query optimal? Given how long it takes, I suspect the scan is not.

Does SHC use MultiRowRangeFilter for queries like this?
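
To make the question concrete, here is a minimal sketch of the per-meter row-key ranges that a multi-range scan would ideally cover. It assumes the row key is the plain concatenation of the fixed-length UTF-8 fields in catalog order (meas_point_key, then date_key, and so on) and that every meter id is exactly 8 characters; neither assumption is confirmed for SHC. The helper names `next_prefix` and `row_ranges` are hypothetical and only illustrate the idea. Since magnitude_key and meas_int_key come after date_key in the key, they cannot narrow these ranges and would still need to be applied as filters.

```python
from typing import Iterable, List, Tuple


def next_prefix(prefix: bytes) -> bytes:
    """Return the smallest byte string greater than every key starting with prefix."""
    buf = bytearray(prefix)
    while buf:
        if buf[-1] != 0xFF:
            buf[-1] += 1
            return bytes(buf)
        buf.pop()  # trailing 0xFF cannot be incremented, so carry to the left
    return b""  # prefix was empty or all 0xFF: no finite upper bound


def row_ranges(meters: Iterable[str], date_from: str, date_to: str) -> List[Tuple[bytes, bytes]]:
    """Build one (inclusive start, exclusive stop) row-key range per meter.

    Assumption: row key = meas_point_key + date_key + remaining key fields.
    """
    ranges = []
    for meter in sorted(set(meters)):
        start = (meter + date_from).encode("utf-8")
        stop = next_prefix((meter + date_to).encode("utf-8"))
        ranges.append((start, stop))
    return ranges


if __name__ == "__main__":
    # Two made-up meter ids stand in for the ~15,000 values in meter_list_B.
    for start, stop in row_ranges(["00000002", "00000001"],
                                  "01588302000000", "01593572400000"):
        print(start, stop)
```

With about 15,000 meters this yields about 15,000 disjoint ranges, which is the case a multi-row-range filter is meant to handle in one scan instead of reading everything between the smallest and largest meter id.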
