Kinesis

Real-time data streaming and analytics as a service

Essentials

Benefits

  • Real-time processing

    • Continuously collect data and build applications that process it as it's generated

  • Parallel Processing

    • Multiple Kinesis consumer applications can process the same incoming data stream concurrently

  • Durable

    • Kinesis synchronously replicates the streaming data across three Availability Zones within a single AWS region and retains it for 24 hours by default (retention can be extended, up to 365 days)

  • Scales

    • Can stream from as little as a few megabytes to several terabytes per hour

When to Use

  • Gaming

    • Collect gaming data such as player actions and feed it into the gaming platform, for example to drive an environment that reacts to the player's actions in real time

  • Real-time analytics

    • Collect IoT data from many sources at high frequency and process it with Kinesis to gain insights as the data arrives in your environment

  • Application alerts

    • Build a Kinesis application that monitors application logs in real time and triggers events based on the data

  • Log/Event Data collection

    • Log data from any number of devices and use Kinesis applications to continuously process the incoming data, power real-time dashboards, and store the data in S3 once it has been processed

  • Mobile data capture

    • Mobile apps can push data to Kinesis from a very large number of devices, making the data available as soon as it is produced

Kinesis Video Streams

  • Stream video to AWS

  • Real-time or batch video processing and analytics

Kinesis Data Streams

  • Ingest data from many sources

  • Real-time data processing applications

  • Kinesis connector - EMR

  • Data processed in sequence (ordering is preserved within each shard)

  • Server side encryption

Kinesis Firehose

  • Load streaming data into S3, Redshift, Elasticsearch, or Splunk

Kinesis Data Analytics

  • Run SQL queries against Data Streams or Firehose

  • Send results to output Data Stream or Firehose

Data Streams Components

  • Stream - contains one or more Shards

  • Shards (processing power)

    • Each shard supports 1 MB/sec (or 1,000 records/sec) of data input and 2 MB/sec of data output

    • Distribute data to shards using a Partition Key

  • Producers (data creators)

  • Consumers (data consumers)
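The shard limits and the partition key routing above can be sketched in a few lines. This is an illustrative sketch, not AWS code: the function names `shards_needed` and `shard_index_for_key` are made up for these notes, and the second function assumes the stream's shards split the hash key space evenly, which may not hold after shards have been split or merged. Kinesis does hash the partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains that value.

```python
import hashlib
import math

# Per-shard throughput limits from the notes above.
WRITE_MB_PER_SHARD = 1.0  # data input
READ_MB_PER_SHARD = 2.0   # data output


def shards_needed(write_mb_per_sec: float, read_mb_per_sec: float) -> int:
    """Smallest shard count that covers both the write rate and the read rate."""
    by_write = math.ceil(write_mb_per_sec / WRITE_MB_PER_SHARD)
    by_read = math.ceil(read_mb_per_sec / READ_MB_PER_SHARD)
    return max(1, by_write, by_read)


def shard_index_for_key(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard index (0-based).

    Assumption: the 128-bit hash key space is divided evenly among
    `shard_count` shards, with any remainder going to the last shard.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")  # 0 .. 2**128 - 1
    range_size = 2**128 // shard_count
    return min(hash_value // range_size, shard_count - 1)
```

For example, `shards_needed(5, 4)` returns 5, because the 5 MB/sec write rate is the binding limit. Records that share a partition key always map to the same shard, which is what keeps them in order.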

Producers

  • Devices that produce and send data to Kinesis

  • You build producers to continuously input data into a Kinesis stream

  • Can include (but not limited to)

    • IoT Sensors

    • Mobile devices (cell phones)

  • You can have thousands of different producers and scale based on need

    • The more data you want to process, the more shards you add to your Kinesis stream

    • Each shard can handle 2 MB of read data per second and 1 MB of write data per second

  • Kinesis Data Streams API

    • PutRecord, PutRecords

  • Kinesis Producer Library

    • Java library that batches records and sends them to a stream

  • Kinesis Agent

    • Standalone application that streams files (such as logs) from Linux servers
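A minimal producer sketch using the PutRecords call listed above. It assumes `client` behaves like `boto3.client("kinesis")`. The helper names (`to_record`, `batches`, `send_events`) and the `key_field` parameter are hypothetical. PutRecords accepts at most 500 records per request, so the events are sent in chunks.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List

MAX_RECORDS_PER_CALL = 500  # PutRecords limit per request


def to_record(event: Dict[str, Any], key_field: str) -> Dict[str, Any]:
    """Shape one event as a PutRecords entry: bytes payload plus partition key."""
    return {
        "Data": json.dumps(event).encode("utf-8"),
        "PartitionKey": str(event[key_field]),
    }


def batches(records: List[Dict[str, Any]],
            size: int = MAX_RECORDS_PER_CALL) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def send_events(client: Any, stream_name: str,
                events: Iterable[Dict[str, Any]], key_field: str) -> int:
    """Send events with PutRecords and return how many records failed.

    `client` is assumed to expose put_records(StreamName=..., Records=...)
    as the boto3 Kinesis client does. Failed records are only counted here;
    a real producer would retry them.
    """
    records = [to_record(event, key_field) for event in events]
    failed = 0
    for batch in batches(records):
        response = client.put_records(StreamName=stream_name, Records=batch)
        failed += response.get("FailedRecordCount", 0)
    return failed
```

With boto3 installed and credentials configured, this could be called as `send_events(boto3.client("kinesis"), "my-stream", events, "device_id")`, where the stream name and the field name are placeholders.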

Consumers

  • Consume the stream's data

  • This is done concurrently (multiple consumers can read the same data at the same time)

  • Kinesis Client Library (KCL)

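    • Java library (typically run on EC2) with a wrapper, the MultiLangDaemon, for other languages

    • Launches a record processor for each shard

    • Works with Auto Scaling: shards are rebalanced across workers as instances are added or removed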

  • Lambda can read stream data

  • Kinesis Connector for EMR
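For comparison with the library-based consumers above, reading a single shard directly through the Data Streams API might look like the sketch below. It assumes `client` behaves like `boto3.client("kinesis")`. The function name `read_shard` and its `handle`, `max_calls`, and `pause_seconds` parameters are hypothetical. The Kinesis Client Library does this per shard and adds checkpointing and load balancing, which this sketch leaves out.

```python
import time
from typing import Any, Callable


def read_shard(client: Any, stream_name: str, shard_id: str,
               handle: Callable[[bytes], None],
               max_calls: int = 10, pause_seconds: float = 0.2) -> int:
    """Read one shard from its oldest record and pass each payload to `handle`.

    Returns the number of records processed. Stops after `max_calls`
    GetRecords calls or when the shard has no further iterator.
    """
    iterator = client.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",  # start at the oldest retained record
    )["ShardIterator"]

    processed = 0
    for _ in range(max_calls):
        if iterator is None:
            break
        response = client.get_records(ShardIterator=iterator, Limit=100)
        for record in response["Records"]:
            handle(record["Data"])
            processed += 1
        iterator = response.get("NextShardIterator")
        time.sleep(pause_seconds)  # stay under the per-shard read call limit
    return processed
```

Because each consumer keeps its own shard iterator, several applications can run this loop against the same shard at the same time without interfering with each other.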
