Spark Time Series Data Analysis.pdf
Time                 Observation
4/10/1990 23:54:12   4.5
4/10/1990 23:54:13   5.5
4/10/1990 23:54:14   6.6
4/10/1990 23:54:15   7.8
4/10/1990 23:54:16   3.3

Time                 Something   Something Else
4/10/1990 23:54:12   4.5         100.4
4/10/1990 23:54:13   5.5         101.3
4/10/1990 23:54:14   6.6         450.2
4/10/1990 23:54:15   7.8         600
4/10/1990 23:54:16   3.3         2000
Time
vec = datestr(busdays('1/2/01', '1/9/01', 'weekly'))
vec =
05-Jan-2001
12-Jan-2001
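
For comparison, the same two dates can be reproduced with a minimal Python sketch. It assumes a Monday-to-Friday business week, a start date that falls on a weekday, and no holiday calendar (MATLAB's busdays does take holidays into account). The helper name weekly_business_days is hypothetical, not a library function.

```python
from datetime import date, timedelta

def weekly_business_days(start, end):
    """Return the Friday of every week that overlaps [start, end].

    Assumption: Friday is the last business day of each week; holidays are ignored.
    """
    # Step back to the Monday of the week that contains `start`.
    monday = start - timedelta(days=start.weekday())
    fridays = []
    while monday <= end:
        fridays.append(monday + timedelta(days=4))  # Monday + 4 days = Friday
        monday += timedelta(weeks=1)
    return fridays

vec = weekly_business_days(date(2001, 1, 2), date(2001, 1, 9))
print([d.strftime("%d-%b-%Y") for d in vec])
```

Both 2 January and 9 January 2001 are Tuesdays, so the sketch returns the Fridays that close those two weeks.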
SELECT buyerid, saletime, qtysold,
LAG(qtysold,1) OVER (order by buyerid, saletime) AS prev_qtysold
FROM sales WHERE buyerid = 3 ORDER BY buyerid, saletime;
 buyerid |      saletime       | qtysold | prev_qtysold
---------+---------------------+---------+--------------
       3 | 2008-01-16 01:06:09 |       1 |
       3 | 2008-01-28 02:10:01 |       1 |            1
       3 | 2008-03-12 10:39:53 |       1 |            1
       3 | 2008-03-13 02:56:07 |       1 |            1
       3 | 2008-03-29 08:21:39 |       2 |            1
       3 | 2008-04-27 02:39:01 |       1 |            2
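
What LAG(qtysold, 1) computes can be illustrated with a small plain-Python sketch. The rows are copied from the result above; the lag helper is a hypothetical name written for this illustration and is not part of any SQL engine or Spark API.

```python
# Rows for buyerid = 3, already ordered by saletime (copied from the query result).
sales = [
    ("2008-01-16 01:06:09", 1),
    ("2008-01-28 02:10:01", 1),
    ("2008-03-12 10:39:53", 1),
    ("2008-03-13 02:56:07", 1),
    ("2008-03-29 08:21:39", 2),
    ("2008-04-27 02:39:01", 1),
]

def lag(values, offset=1, default=None):
    """Return, for each position, the value `offset` rows earlier, or `default`."""
    return [values[i - offset] if i >= offset else default
            for i in range(len(values))]

qtysold = [qty for _, qty in sales]
prev_qtysold = lag(qtysold, 1)

for (saletime, qty), prev in zip(sales, prev_qtysold):
    print(3, saletime, qty, prev)
```

The first row has no predecessor, so its lagged value is None, which corresponds to the empty prev_qtysold cell in the first row of the SQL output.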
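from pyspark.sql.window import Window

# Partition rows by category and order each partition by revenue, highest first.
# df is assumed to be an existing DataFrame with category and revenue columns.
windowSpec = \
    Window \
        .partitionBy(df['category']) \
        .orderBy(df['revenue'].desc())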
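
To make the effect of partitionBy and orderBy concrete, here is a plain-Python sketch of what such a window specification does conceptually: group rows by category, sort each group by revenue in descending order, then number the rows within each group. The sample rows and the helper name row_number_by_category are hypothetical, and this is not how Spark executes a window internally.

```python
from collections import defaultdict

# Hypothetical rows: (product, category, revenue)
rows = [
    ("A", "phone", 6000),
    ("B", "phone", 3000),
    ("C", "tablet", 2500),
    ("D", "tablet", 5500),
]

def row_number_by_category(rows):
    """Number rows within each category, ordered by revenue (highest first)."""
    partitions = defaultdict(list)
    for product, category, revenue in rows:
        partitions[category].append((product, revenue))  # the partitionBy step

    numbered = []
    for category, items in partitions.items():
        items.sort(key=lambda item: item[1], reverse=True)  # the orderBy(desc) step
        for position, (product, revenue) in enumerate(items, start=1):
            numbered.append((category, product, revenue, position))
    return numbered

result = row_number_by_category(rows)
for row in result:
    print(row)
```

Each category is processed independently, which is the point of a partitioned window: the numbering restarts at 1 for every category.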