Kinesis
Real-time data streaming and analytics as a service
Essentials
Benefits
- Real-time processing - Continuously collect data and build applications that process it as it's generated 
 
- Parallel Processing - Multiple Kinesis consumer applications can process the same incoming data stream concurrently 
 
- Durable - Kinesis synchronously replicates the streaming data across three Availability Zones within a single AWS Region and retains it for 24 hours by default; retention can be extended to 7 days, and up to 365 days with long-term retention 
 
- Scales - Can stream from as little as a few megabytes to several terabytes per hour 
 
When to Use
- Gaming - Collect gaming data such as player actions and feed it back into the gaming platform, for example to drive an environment that reacts to the player's actions in real time 
 
- Real-time analytics - Collect IoT data from many sources at high frequency and process it with Kinesis to gain insights as the data arrives in your environment 
 
- Application alerts - Build a Kinesis application that monitors application logs in real time and triggers events based on the data 
 
- Log/Event Data collection - Log data from any number of devices and use Kinesis applications to continuously process the incoming data, power real-time dashboards and store the data in S3 when completed 
 
- Mobile data capture - Mobile apps on a large number of devices can push data to Kinesis, which makes the data available as soon as it is produced 
 
Kinesis Video Streams
- Stream video to AWS 
- Real-time or batch video processing and analytics 
Kinesis Data Streams
- Ingest data from many sources 
- Real-time data processing applications 
- Kinesis connector - EMR 
- Data processed in sequence 
- Server side encryption 
Kinesis Firehose
- Load streaming data to S3, Redshift, Elasticsearch, Splunk 
Kinesis Data Analytics
- Run SQL queries against Data Streams or Firehose 
- Send results to output Data Stream or Firehose 
Data Streams Components
- Stream - contains one or more Shards 
- Shards (processing power) - each shard provides 1 MB/sec of data input and 2 MB/sec of data output 
- Distribute data to shards using a Partition Key 
 
- Producers (data creators) 
- Consumers (data consumers) 
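
The shard limits and the partition-key bullet above can be illustrated with a small Python sketch. It assumes the hash space is divided into equal, contiguous ranges (as in a freshly created stream; resharding can change this), and the helper names `shards_needed` and `shard_for_key` are hypothetical, not part of any AWS SDK. Kinesis hashes each partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash-key range contains that value.

```python
import hashlib
import math

# Per-shard limits quoted in the notes above.
SHARD_WRITE_MB_PER_SEC = 1.0
SHARD_READ_MB_PER_SEC = 2.0
HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit integer


def shards_needed(write_mb_per_sec: float, read_mb_per_sec: float) -> int:
    """Smallest shard count that covers both the write and the read rate."""
    for_writes = math.ceil(write_mb_per_sec / SHARD_WRITE_MB_PER_SEC)
    for_reads = math.ceil(read_mb_per_sec / SHARD_READ_MB_PER_SEC)
    return max(1, for_writes, for_reads)


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Index of the shard whose hash-key range contains the key's MD5 hash.

    Assumption: shard_count equal ranges over the 128-bit hash space.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    return hash_value * shard_count // HASH_SPACE
```

For example, a workload writing 5 MB/sec and reading 12 MB/sec gives `shards_needed(5, 12) == 6`, because reads are the bottleneck. Records that share a partition key always land on the same shard, which is what keeps them in sequence.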
Producers
- Devices that produce and send data to Kinesis 
- You build producers to continuously input data into a Kinesis stream 
- Can include (but are not limited to) IoT sensors and mobile devices (cell phones) 
 
- You can have thousands of different producers and scale based on need - The more data you want to process, the more shards you add to your Kinesis stream 
- Each shard can handle 1 MB of write data per second and 2 MB of read data per second 
 
- Kinesis Data Streams API - PutRecord, PutRecords 
 
- Kinesis Producer Library (KPL) - Java library that sends data to a stream 
 
- Kinesis Agent - Stream files from Linux Servers 
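
As a rough illustration of the PutRecords route listed above, the sketch below batches events into chunks of at most 500 records (the per-request limit for PutRecords) and collects any entries the service rejects. It assumes `client` is a boto3 Kinesis client or any object with a compatible `put_records` method; the helper names, the stream name, and the event fields are hypothetical.

```python
import json

MAX_RECORDS_PER_CALL = 500  # PutRecords accepts at most 500 records per request


def to_entries(events, key_field):
    """Turn dict events into PutRecords entries keyed by one of their fields."""
    return [
        {
            "Data": json.dumps(event).encode("utf-8"),
            "PartitionKey": str(event[key_field]),
        }
        for event in events
    ]


def put_in_batches(client, stream_name, entries):
    """Send entries in chunks of up to 500; return the entries that failed."""
    failed = []
    for start in range(0, len(entries), MAX_RECORDS_PER_CALL):
        batch = entries[start:start + MAX_RECORDS_PER_CALL]
        response = client.put_records(StreamName=stream_name, Records=batch)
        # Result entries line up with the request; failures carry an ErrorCode.
        for entry, result in zip(batch, response["Records"]):
            if "ErrorCode" in result:
                failed.append(entry)
    return failed


# Usage (needs AWS credentials and an existing stream, so left as a comment):
#   import boto3
#   client = boto3.client("kinesis")
#   entries = to_entries([{"player_id": 1, "action": "jump"}], "player_id")
#   retry_later = put_in_batches(client, "game-events", entries)
```

Returning the failed entries rather than raising leaves the retry policy to the caller, since a single PutRecords call can partially succeed.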
 
Consumers
- Consume the stream's data 
- This is done concurrently (multiple consumers can consume the same data at the same time) 
- Kinesis Client Library (KCL) - Java library (runs on EC2) with a wrapper for other languages 
- Launches a record processor for each shard 
- Autoscaling 
 
- Lambda can read stream data 
- Kinesis Connector for EMR 
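
For a sense of what a consumer does under the hood, here is a minimal polling sketch built on the low-level GetShardIterator and GetRecords calls. It is not how the KCL is implemented (the KCL adds checkpointing, shard discovery, and load balancing); it only assumes `client` is a boto3 Kinesis client or a compatible object, and `read_shard` and its parameters are hypothetical names.

```python
import time


def read_shard(client, stream_name, shard_id, max_polls=10, pause_seconds=1.0):
    """Yield raw record payloads from one shard, oldest first."""
    iterator = client.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",  # start at the oldest retained record
    )["ShardIterator"]
    for _ in range(max_polls):
        if iterator is None:  # the shard was closed and has been fully read
            break
        response = client.get_records(ShardIterator=iterator, Limit=100)
        for record in response["Records"]:
            yield record["Data"]
        iterator = response.get("NextShardIterator")
        time.sleep(pause_seconds)  # stay under the per-shard read-call limit
```

Each caller gets its own shard iterator, so two consumers running this against the same shard both see every record, which is the concurrent consumption described above.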
