Post on 01-Aug-2020

The AWS China (Beijing) Region is operated by Sinnet.

How to Derive Business Value from IoT Applications with AWS IoT

Alfred Zhang (张羽), Solutions Architect, AWS China

June 20th, 2017

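Agenda

• Understand the different types of IoT-related data services on the AWS platform

• How to use the AWS platform to turn data into effective decisions and actions

• How to choose among the data services to integrate with AWS IoT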

Data-Driven IoT Applications

From 1 billion to 50 billion to 1 trillion: the Internet of Everything is unstoppable

By 2025, IoT has the potential to create 4 trillion to 11 trillion in economic value

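Following the Internet of Everything into the era of big data

People, things, and services continuously generate data, including:

• Status information

• Sensor readings

• User interaction data

• Actuator control data

• Operational events

• …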

The biggest challenge facing IoT

Collect -> Store -> Process/Analyze -> Use/Present -> Act

• Collect data from devices, derive actionable decisions, and use them to optimize the business.

Greengrass

AWS IoT

Three modes of data analysis

Here-and-now real-time processing and dashboards (looking at the present)

• Amazon Kinesis • AWS Lambda • Amazon DynamoDB • Amazon EC2

Retrospective analysis and reporting (looking at the past)

• Amazon Redshift • Amazon RDS • Amazon S3 • Amazon EMR

Predictions to enable smart applications (looking at the future)

• Amazon Machine Learning • Apache MXNet on AWS

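Key characteristics of IoT applications

IoT needs fast, real-time processing and response

Here-and-now real-time processing and dashboards (looking at the present)

• Detect patterns in real-time sensor data
• Correlate multiple events
• Enrich real-time data with related information

Uses
• Trigger fast responses
• Optimize business operations
• Improve the user experience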

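IoT needs historical data as a basis

Retrospective analysis and reporting (looking at the past)

• Provide historical analysis as context for current events
• Analyze trends based on historical data

Uses
• Discover knowledge and experience in historical data
• Support efficient analysis and reporting for an accurate view of business operations
• Support high-precision metering and billing over massive data volumes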

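IoT needs to predict the future

Predictions to enable smart applications (looking at the future)

• Detect and match patterns based on events
• Use the distribution of the data to discover latent regularities and help build rules automatically

Uses:
• Predict events that may happen in the future
  • Discover potential problems
  • Anticipate the need for user intervention
• Make actionable decisions: what to do next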

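Example

Temperature / environment sensors

• Several air-conditioning control units, each with 3 sensors

• Continuously send temperature, humidity, and pressure updates every second

• Seamless connection to the cloud

• Moderate requirements for message delivery reliability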

Sample message from a device

{
  "temperature": "100",
  "humidity": "92",
  "pressure": "8"
}

• Certificate-based authentication enabled for all devices
• Messages sent over MQTT
• One topic per device:

rooms/ac/${deviceID}
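The topic convention and message shape above can be sketched in a few lines of Python. This is only an illustration: the helper names `topic_for` and `build_payload` are hypothetical, and actually publishing the message over MQTT with the device certificate is not shown.

```python
import json

TOPIC_PREFIX = "rooms/ac"  # topic prefix taken from the slide


def topic_for(device_id: str) -> str:
    """Return the per-device MQTT topic, e.g. rooms/ac/<deviceID>."""
    # Reject characters that would change the topic structure or act as wildcards.
    if not device_id or any(c in device_id for c in "/+#"):
        raise ValueError("device ID must be non-empty and free of '/', '+', '#'")
    return f"{TOPIC_PREFIX}/{device_id}"


def build_payload(temperature, humidity, pressure) -> str:
    """Serialize one reading in the sample message's shape (values as strings)."""
    reading = {
        "temperature": str(temperature),
        "humidity": str(humidity),
        "pressure": str(pressure),
    }
    return json.dumps(reading)
```

For a hypothetical device ID `unit-7`, `topic_for("unit-7")` yields `rooms/ac/unit-7`, and `build_payload(100, 92, 8)` reproduces the sample message.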

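With sensor data, what can we do?

- How many sensors are currently active?

- Does the current temperature roughly match the temperature at the same time yesterday or last year?

- How does temperature change over time?

- How are pressure and temperature related?

- Is the current sensor reading suspicious?

- Is a sensor currently damaged?

- Do we need to dress differently today?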

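AWS IoT

AWS IoT: secure device connectivity

• Message gateway with multi-protocol support

Billions of devices and applications can connect at the same time over MQTT, HTTP, or WebSockets, with more mainstream interfaces to be added.

• Highly elastic Pub/Sub broker

Scales elastically, with zero configuration, from connecting 1 device to 1 billion devices.

• Security first

Devices and the AWS IoT platform establish a mutually authenticated secure channel using X.509 certificates and TLS v1.2.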

AWS IoT: a gateway to AWS services

• Device Registry: Cloud alter-ego of a physical device. Persists metadata about the device.
• Device Shadows: Apps and devices can access a "RESTful" Shadow (state) that is in sync with the device.
• Rules and Actions: Match patterns and take actions to send data to other AWS services or republish.

AWS IoT Rules Engine

• The Rules Engine evaluates inbound messages published into AWS IoT, then transforms and delivers them to the appropriate endpoint based on business rules.
• External endpoints can be reached via AWS Lambda, Amazon SNS, etc.

Flexibility of the Rules Engine

• SQL-like syntax
• WHERE operators
• Inline functions
• Actions
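To make the four ingredients concrete, here is a hedged sketch that assembles a rule definition as a plain Python dictionary. The SQL uses the AWS IoT SQL functions `topic()`, `timestamp()` and `cast()`; the delivery stream name, role ARN and alert threshold are hypothetical placeholders, and the choice of a Firehose action is only one possibility. The dictionary has the shape that boto3's `create_topic_rule` accepts as `topicRulePayload`, but no AWS call is made here.

```python
def build_capture_rule(stream_name: str, role_arn: str) -> dict:
    """Rule that captures every reading from all per-device topics."""
    sql = (
        "SELECT temperature, humidity, pressure, "
        "topic(3) AS deviceID, timestamp() AS ts "  # inline functions
        "FROM 'rooms/ac/+'"                          # '+' matches one topic level
    )
    return {
        "sql": sql,
        "ruleDisabled": False,
        "awsIotSqlVersion": "2016-03-23",
        "actions": [
            {
                "firehose": {
                    "roleArn": role_arn,                # hypothetical IAM role
                    "deliveryStreamName": stream_name,  # hypothetical stream
                    "separator": "\n",
                }
            }
        ],
    }


def build_alert_sql(threshold) -> str:
    """SQL with a WHERE clause; the sample payload stores numbers as strings, hence cast()."""
    return (
        "SELECT *, topic(3) AS deviceID FROM 'rooms/ac/+' "
        f"WHERE cast(temperature AS Decimal) > {threshold}"
    )
```

The first rule forwards everything for later analysis; the second shows how a WHERE clause narrows a rule to the events that should trigger an action.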

Data processing for retrospective analysis

Data collection

• Devices set up as Things in Device Registry
• Each device sends data as JSON via MQTT
• One MQTT topic per device: rooms/ac/{deviceID}
• Each device has a certificate and access rights to use its topic

Data storage

• Move all (?) incoming data into permanent storage
• Make data available for later analysis:
  • Reporting
  • Billing / Metering
  • Exploratory Analysis
  • Machine Learning

Approach:

• Set up a Rule to Capture & Transform Incoming Data
• Define an Action to Store the Data
• Query & Analyze the Stored Data

1) Set up a Rule to capture all sensor readings

2) Define an Action to Store the Data

Which storage option should we choose?

Storage Options

Storage Options: Amazon S3

• Actions can directly write into (JSON) files on S3
• Very simple to configure, just provide bucket name
• Results in 1 file per event
  • Lots of small files can be hard to handle
  • Inefficient when processing with Hadoop / Amazon EMR or when importing into Redshift
• Useful when you have a very low frequency of events, e.g. when you only want to log outliers to S3

Storage Options: Amazon S3

• Buffer data using Amazon Kinesis or Amazon Kinesis Firehose to get fewer, larger files
• Buffering, compression & output to S3 is built into Firehose, so no other infrastructure is needed!
• Kinesis Connector Library can be extended to perform transformation, filter or serialize data
  • Additional control over buffering & output formats
  • Added complexity: Requires Amazon EC2 workers running Kinesis Connector Library
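The idea of trading many small files for fewer, larger ones can be illustrated with a small in-memory buffer. This is a conceptual sketch only, not how Firehose is implemented: the `EventBuffer` class and its `max_events` threshold are hypothetical, and a managed service would also flush on elapsed time and accumulated size.

```python
import gzip
import json


class EventBuffer:
    """Collect events and flush them as one gzip-compressed, newline-delimited blob."""

    def __init__(self, max_events: int = 500):
        self.max_events = max_events
        self._events = []

    def add(self, event: dict):
        """Add an event; return a compressed batch once the buffer is full, else None."""
        self._events.append(event)
        if len(self._events) >= self.max_events:
            return self.flush()
        return None

    def flush(self):
        """Return the buffered events as gzip bytes and reset; None if the buffer is empty."""
        if not self._events:
            return None
        body = "\n".join(json.dumps(e) for e in self._events) + "\n"
        self._events = []
        return gzip.compress(body.encode("utf-8"))
```

Each returned blob would become a single object in S3, so 500 events produce one file instead of 500.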

Storage Options: Amazon Redshift

• Actions can forward data to Amazon Kinesis Firehose
  • Buffering & output to Redshift is built into Firehose
  • Very easy to set up
  • Fully managed
• Use Amazon Kinesis as an alternative
  • More control: Use Kinesis Connector Library to perform transformation, filter or serialize data
  • Added complexity: Requires Kinesis Connector Library etc. to execute on Amazon EC2

Storage Options: Amazon DynamoDB

• Actions can directly write into Amazon DynamoDB
• Creates one row per event, can define:
  • Hash Key, Range Key and attributes to store
  • Hash Key = deviceID, Range Key = timestamp…
• Very simple to configure, just provide table & field names
• Adding GSIs and LSIs provides additional flexibility and enables different queries
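As a rough picture of the "one row per event" layout, the sketch below turns a topic and a payload into a dictionary keyed by deviceID (hash key) and timestamp (range key). The function name `to_item` and the millisecond timestamp are assumptions for illustration; writing the item to a table is not shown.

```python
import json


def to_item(topic: str, payload: str, timestamp_ms: int) -> dict:
    """Build one row per event: hash key deviceID, range key timestamp, plus attributes."""
    device_id = topic.rsplit("/", 1)[-1]  # last topic segment is the device ID
    item = dict(json.loads(payload))      # sensor readings become attributes
    item["deviceID"] = device_id          # hash key
    item["timestamp"] = timestamp_ms      # range key
    return item
```

With this key design, all readings of one device sit together and can be queried by time range.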

Storage Options: Amazon DynamoDB

• AWS Lambda function provides additional flexibility:
  • Transform data
  • Write into different/multiple tables
  • Enrich data with contextual information pulled in from other sources
• Only able to process one event at a time! (i.e., AWS Lambda, when called from AWS IoT, cannot aggregate events before writing to DynamoDB)
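A minimal sketch of such a function follows, assuming the rule has already added a `deviceID` field to each event. The `ROOM_BY_DEVICE` lookup table is a hypothetical stand-in for the "other sources"; the write to DynamoDB is omitted.

```python
# Hypothetical lookup table standing in for contextual data from another source.
ROOM_BY_DEVICE = {"unit-7": "conference-room-a"}


def handler(event, context):
    """Transform and enrich a single event; one invocation sees exactly one event."""
    enriched = dict(event)
    # Transform: the sample payload carries numbers as strings.
    for field in ("temperature", "humidity", "pressure"):
        if field in enriched:
            enriched[field] = float(enriched[field])
    # Enrich: attach contextual information keyed by device.
    enriched["room"] = ROOM_BY_DEVICE.get(enriched.get("deviceID"), "unknown")
    return enriched
```

Because each invocation receives one event, anything that needs several events (averages, trends) has to come from a store or a stream rather than from the function itself.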

3) Query & Analyze the Stored Data

How should we process and analyze the data?

Recommendations

• Want to run a lot of queries constantly? Use Kinesis Firehose to write into Amazon Redshift

• Need fast lookups, e.g., in Rules or Lambda functions? Write into DynamoDB, add indices if necessary

• Have a need for heavy queries but not always-on? Use Kinesis Firehose & S3, process with Amazon EMR.

Example


1) Set up a Rule to capture all sensor readings

2) Pump Data through Firehose into Redshift

3) Analyze Data using Amazon QuickSight

Real-time monitoring and real-time response

Goals

• Send an alert when the temperature changes sharply

• Capture current sensor readings and visualize them

1) Set up a rule that responds to sensor readings

• AWS IoT Rules
  • only have access to the current event
  • cannot take contextual information into account

Consider passing all the data to the Action for evaluation.

2) Process the Data

What are the best practices for processing the data?

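Processing Options

Processing Options: AWS Lambda

• Each invocation handles a single event (no batching)

• Enrich message data with context from other data sources

• Perform all kinds of transformations

• Run arbitrary Node.js or Java code

• No underlying infrastructure to manage!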

Processing Options: Amazon SNS

• Great for alerts: Sends push notifications, emails and SMS

• Call other systems via HTTP POST or webhooks (AWS or on-premises)

• SNS Topics support multiple subscribers, incl. AWS Lambda and Amazon SQS

Processing Options: Amazon SQS

• Great when events arrive with varying frequency
• Buffer data for asynchronous processing
• Ensure that no event data is lost
• SNS Topics support multiple subscribers, incl. AWS Lambda and Amazon SQS
• Easily deploy SQS workers on AWS Elastic Beanstalk (or Amazon EC2)

Processing Options: Amazon Kinesis

• Provides access to a "rolling window" of event data
• Scalable, can consume events from a multitude of different rules / topics / devices
• Supports many independent, concurrent readers (& writers)
• Multiple processing options:

Processing Options: Amazon Kinesis

• Scalable way to connect many different systems to the stream of events, e.g., custom KCL code, Complex Event Processing (CEP) products
• Amazon Kinesis is a hub for all stream processing needs

Example:

1. Read last N events from stream
2. Determine maximum and rate of increase since beginning
3. Decide if alert should be sent
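The three steps can be sketched as a single function over a window of readings. This is an illustration under assumptions: the readings are already available as `(timestamp_seconds, temperature)` pairs sorted by time, and the thresholds `max_allowed` and `max_rate` are hypothetical values, not ones given in the talk. Reading from the stream itself is not shown.

```python
def should_alert(readings, n=10, max_allowed=90.0, max_rate=0.5):
    """Decide whether to alert based on the last n (timestamp_seconds, temperature) pairs."""
    # 1. Take the last N events.
    window = readings[-n:]
    if len(window) < 2:
        return False  # not enough data to estimate a rate
    temps = [temp for _, temp in window]
    # 2. Determine the maximum and the rate of increase since the window began.
    peak = max(temps)
    elapsed = window[-1][0] - window[0][0]
    rate = (temps[-1] - temps[0]) / elapsed if elapsed > 0 else 0.0
    # 3. Alert on either an absolute peak or a fast rise (degrees per second).
    return peak > max_allowed or rate > max_rate
```

A steady series produces no alert, while a series rising by one degree per second exceeds the assumed rate threshold and triggers one.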

Recommendations

• Only care about individual events? Invoke an AWS Lambda function via Rule / Action

• For sliding window analysis and more flexibility, stream into Kinesis and run an AWS Lambda function

• Use Amazon Kinesis as a hub for all incoming events.

3) Visualize the Current Metrics

• Managed Amazon Elasticsearch as a service

• Easy & fast indexing of data, well suited for lookups on streaming data

• Easy to use visualization / dashboards using Kibana

Smart predictive applications

Machine learning and smart devices

• Machine learning is the technology that automatically finds patterns in your data and uses them to make predictions for new data points as they become available

• Your devices + machine learning = smart devices

IoT Use Cases for Machine Learning

• Find potential problems by looking for patterns
  • Identify engines that are about to break down
  • Predict when supplies will run out
  • Spot sensors that report implausible data
• Predict next movement / direction of a connected vehicle
  • Based on driving parameters & observations from other cars
  • Predict traffic jams before they occur

Amazon Machine Learning

• Real-time predictions (and batch)
• Training & evaluation of machine learning models
• Picks the right model & parameters, helps build training data

Basic Approach

1. Collect / build training data
  • Determine input variables & target
  • Take past data for sensor readings (temperature, humidity, pressure), not the deviceID or timestamp, as input
  • Target: we define which readings are 'correct' or 'incorrect' and add the target variable's value to the training data
  • Evaluate the data to pick the target value for each set of inputs in the data
2. Train a Machine Learning Model
  • Builds a model based on the information in the training data
3. Create a real-time prediction endpoint for the model
  • Outputs a prediction based on the input variables provided
4. Get predictions for events as they come in
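Step 1 can be illustrated with a small helper that writes labelled readings to CSV, using only the three sensor values as inputs and leaving out deviceID and timestamp. The function name `training_csv`, the column name `target` and the label strings are assumptions for this sketch; the required file layout depends on the training service used.

```python
import csv
import io


def training_csv(labelled) -> str:
    """Render (reading, is_correct) pairs as CSV with a header row and a target column."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["temperature", "humidity", "pressure", "target"])
    for reading, is_correct in labelled:
        writer.writerow([
            reading["temperature"],
            reading["humidity"],
            reading["pressure"],
            "correct" if is_correct else "incorrect",  # human-assigned label
        ])
    return out.getvalue()
```

Each row pairs one set of inputs with the target value a reviewer assigned to it, which is the form a supervised model is trained on.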

Example Use Case: Filter out bad readings

1. Create a training data set based on past data & human evaluation of the data
  • i.e., manually review the data and mark incorrect values
2. Train an Amazon ML model on this data to predict which combinations are (in)correct
3. Invoke ML model on incoming data to predict correctness
4. Alert staff via Amazon SNS push notification
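Steps 3 and 4 might look roughly like the sketch below. To keep it self-contained, the model call and the notification are passed in as plain callables: `predict` and `notify` are hypothetical stand-ins that, in a deployment, could wrap the Amazon ML real-time prediction call and an Amazon SNS publish call. The label strings are also assumptions.

```python
def check_reading(reading, predict, notify) -> bool:
    """Return True if the reading is predicted correct; otherwise alert and return False."""
    # Use only the sensor values as model inputs, passed as strings.
    record = {k: str(reading[k]) for k in ("temperature", "humidity", "pressure")}
    label = predict(record)  # step 3: ask the model about this event
    if label == "incorrect":
        notify(f"Suspicious sensor reading: {record}")  # step 4: alert staff
        return False
    return True
```

Injecting the two callables keeps the decision logic testable without any cloud access; a stub model and a list's `append` method are enough to exercise it.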

Lambda Function

Recommendations

• Rely on past data / context rather than defining 'rules'

• Use Amazon Machine Learning for an easy start

• Let real-time predictions drive reaction to patterns in events

Summary and Outlook

Summary: AWS IoT + Big Data + Machine Learning

Outlook: future technology directions?

• Automated event handling and response: combine analysis of historical data with the characteristics of real-time data to make predictive analyses, and intervene intelligently based on them

• Semi-supervised learning, unsupervised learning, reinforcement learning, deep learning…

Q&A

Thank You!