Rolling join #378
This will require considerable work, as we currently only support equi-joins. It might be best to do this on top of lazy cross joins.
I like the syntax proposed in #557 (comment). How about something similar for rolling joins:
This would select, for each "event" record, just that one "day" record right before the event. This seems to be implemented efficiently for data.table, and my feeling is that a data.frame implementation is easier for this operation than for generic inequality constraints. I'm not sure which SQL engines implement this, but I can imagine a generic solution that creates a temporary column so that a "normal" inequality join can be used.
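To make the selection rule concrete, here is a minimal, language-neutral sketch in plain Python. It is not the dplyr syntax proposed above and not data.table's implementation; the function name `rolling_join` and the sample "day" and "event" records are hypothetical. It also assumes that a "day" record with exactly the same time as the event counts as a match.

```python
from bisect import bisect_right

def rolling_join(events, days):
    """For each (time, value) event, attach the value of the latest
    "day" record whose time is at or before the event time.
    Events with no earlier "day" record get None."""
    # Sort the lookup table once so each event needs only a binary search.
    days = sorted(days, key=lambda d: d[0])
    day_times = [t for t, _ in days]
    out = []
    for ev_time, ev_value in events:
        # Count of "day" records with time <= ev_time.
        i = bisect_right(day_times, ev_time)
        matched = days[i - 1][1] if i > 0 else None
        out.append((ev_time, ev_value, matched))
    return out

# Hypothetical data: (time, label) pairs.
days = [(1, "mon"), (3, "wed"), (5, "fri")]
events = [(0, "a"), (2, "b"), (3, "c"), (9, "d")]
result = rolling_join(events, days)
```

Swapping `bisect_right` for `bisect_left` would require the "day" record to be strictly before the event, which is one way the two readings of "right before" differ.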
The term LOCF (last observation carried forward) is also used for this. The result of the operation differs depending on whether you allow fuzz in the "left" or in the "right" table. How about:
(This is different from |
Moving all future discussion to #2240
Oops, sorry, script ran mildly amok
Would it be possible to add the rolling join feature to dplyr?
A rolling join, also known as last observation carried forward (LOCF), is an inequality join of two tables.
The example below contains two data frames:
What we need is to match the most recent price to every record of the sales log Ds. This is where data.table comes to the rescue with a neat one-liner.
Session info: