Describe the bug
When no window frame is specified in the Python implementation, we default to unbounded preceding to current row. If we are to follow the PostgreSQL implementation, we should use this default only when order_by is specified, and otherwise default to unbounded preceding to unbounded following.
To Reproduce
from datafusion import SessionContext, WindowFrame, col, lit, functions as F
import pyarrow as pa
ctx = SessionContext()
# create a RecordBatch and a new DataFrame from it
batch = pa.RecordBatch.from_arrays(
    [pa.array([1.0, 10.0, 20.0])],
    names=["a"],
)
df = ctx.create_dataframe([[batch]])
# explicit frame: unbounded preceding to unbounded following
window_frame = WindowFrame("rows", None, None)
df = df.select(
    col("a"),
    F.window("avg", [col("a")]).alias("no_frame"),
    F.window("avg", [col("a")], window_frame=window_frame).alias("with_frame"),
)
df.show()
Produces:
DataFrame()
+------+--------------------+--------------------+
| a    | no_frame           | with_frame         |
+------+--------------------+--------------------+
| 1.0  | 1.0                | 10.333333333333334 |
| 10.0 | 5.5                | 10.333333333333334 |
| 20.0 | 10.333333333333334 | 10.333333333333334 |
+------+--------------------+--------------------+
Expected behavior
When order_by is not specified, the frame should default to unbounded preceding to unbounded following. In the example above, no_frame would then match with_frame.
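To make the expected rule concrete, here is a minimal pure-Python sketch of how the default frame could be chosen and what each choice yields for the data above. It does not use datafusion; the names default_frame and windowed_avg and the (start, end) tuple representation are hypothetical, and it models row-based frames only (it ignores peer rows, which PostgreSQL's range-based default would include).

```python
from typing import List, Optional, Tuple

# Hypothetical frame representation: (start, end) as row offsets relative
# to the current row; None means unbounded in that direction.
Frame = Tuple[Optional[int], Optional[int]]


def default_frame(has_order_by: bool) -> Frame:
    """Pick the default frame described in this issue (illustrative only)."""
    if has_order_by:
        # unbounded preceding to current row
        return (None, 0)
    # unbounded preceding to unbounded following
    return (None, None)


def windowed_avg(values: List[float], frame: Frame) -> List[float]:
    """Average of each row's frame over a single partition."""
    start, end = frame
    n = len(values)
    out = []
    for i in range(n):
        lo = 0 if start is None else max(0, i + start)
        hi = n if end is None else min(n, i + end + 1)
        window = values[lo:hi]
        out.append(sum(window) / len(window))
    return out


values = [1.0, 10.0, 20.0]
# Running average, which is what no_frame currently shows
running = windowed_avg(values, default_frame(has_order_by=True))
# Whole-partition average, which is what no_frame is expected to show
whole = windowed_avg(values, default_frame(has_order_by=False))
print(running)
print(whole)
```

With no order_by, every row sees the whole partition, so all three rows get the same average, as in the with_frame column.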
Additional context
The offending line of code appears to be here:
https://github.com/apache/datafusion-python/blob/main/src/functions.rs#L230