现在是基于spark streaming 窗口的操作,10s 第一个批次传入数据
zhu01,bei01,20180516144035
zhu02,bei02,20180516144130
zhu03,bei03,20180516144235
20s 第二个批次
zhu01,bei01,20180516144035
zhu04,bei04,20180516144240
zhu05,bei05,20180516144255
zhu02,bei02,20180516144130
现在是基于spark streaming 窗口的操作,10s 第一个批次传入数据
zhu01,bei01,20180516144035
zhu02,bei02,20180516144130
zhu03,bei03,20180516144235
20s 第二个批次
zhu01,bei01,20180516144035
zhu04,bei04,20180516144240
zhu05,bei05,20180516144255
zhu02,bei02,20180516144130
使用updateState记录对象状态,以用来判断新老数据,参看这里的http://www.hainiubl.com/topics/197例子 updateStateByKey_last_update