Amazon Kinesis | Load Data Streams to Data Stores & Analytics

Prepare and load real-time data streams into data stores and analytics tools

Amazon Kinesis Data Firehose is the easiest way to reliably load streaming data into data lakes, data stores, and analytics tools. It can capture, transform, and load streaming data into Amazon S3, Amazon Redshift, Amazon Elasticsearch Service, and Splunk, enabling near real-time analytics with the business intelligence tools and dashboards you already use. It is a fully managed service that automatically scales to match the throughput of your data and requires no ongoing administration. It can also batch, compress, transform, and encrypt the data before loading it, minimizing the amount of storage used at the destination and increasing security.

Setup:

  1. Sign up for an AWS account and set up the AWS Kinesis Data Firehose delivery stream(s).

  2. Sign up for a PubNub account, if you don't already have one.

  3. Import the Integration Block into your PubNub account. You can add it to an existing application or create a new one.

  4. Configure the Integration Block module code, connecting it to your AWS account (a sketch of how the module might read these secrets appears after this list):

    1. Create a Secret in PubNub Vault named AWS_ACCESS_KEY_ID, with the value of your AWS Access Key ID for this service.

    2. Create a Secret in PubNub Vault named AWS_SECRET_KEY, with the value of your AWS Secret Access Key for this service.

    3. Create a Secret in PubNub Vault named AWS_REGION, with the value of the AWS Region of the target delivery stream.

    4. Create a Secret in PubNub Vault named AWS_ACCOUNT_NUMBER, with the value of your AWS Account ID for this service.

  5. Start the Integration Block module.

  6. Try a test message (you can click Publish to do this).
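For reference, here is a minimal sketch of how the Integration Block module code might read these secrets at startup. It uses the vault module that PubNub Functions provides; the secret names match the ones created above, but the handler body is an illustrative placeholder, not the actual integration code.

// Minimal sketch: reading the PubNub Vault secrets inside a Function.
// The secret names match the setup steps above; everything else is illustrative.
const vault = require('vault');

export default (request) => {
  return Promise.all([
    vault.get('AWS_ACCESS_KEY_ID'),
    vault.get('AWS_SECRET_KEY'),
    vault.get('AWS_REGION'),
    vault.get('AWS_ACCOUNT_NUMBER')
  ])
    .then(([accessKeyId, secretKey, region, accountNumber]) => {
      // The real module would use these values to sign and send a
      // Firehose PutRecord request; this placeholder just lets the
      // message through once the secrets resolve.
      return request.ok();
    })
    .catch((err) => {
      console.log('Could not read AWS secrets from Vault:', err);
      return request.abort();
    });
};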

Walkthrough:

This Integration Block is set up to listen on a specific PubNub channel (CHANNEL-to-send-to-FIREHOSE), but you can configure it to listen on all channels or a subset using wildcards, e.g. chat.*. This lets you route message data sent through PubNub to an Amazon Kinesis Data Firehose delivery stream for large-scale data ingest and analytics use cases. The target delivery stream name can be set in the function's code or controlled through message metadata. In publish metadata, the customization properties are set as part of a JSON object keyed by the function name, aws-kinesis-firehose-v1.

  1. TARGET_STREAM_NAME: name of the delivery stream that the message data should be sent to

Example publish metadata:

{"functions": {"aws-kinesis-firehose-v1": {"input": {"TARGET_STREAM_NAME": "pn_test"}}}}
