Working with Firehoses

Firehoses provide a queue-like mechanism to collate event data from multiple devices and make it available from a single access point. Event data can describe when:

This event data can potentially come from thousands or millions of devices. Devices authenticated with the BlackBerry IoT Platform can consume event data from a firehose. Because the amount of event data can be enormous, devices typically store the data locally (e.g., in a database) and process it later. The processing can be for business analytics, report generation, data backups, and so on. Event data in the firehose is usually available for two days, but the actual retention depends on the size and amount of data being streamed to the firehose.

A firehose's subscriptions determine the event data that flows through it. Each firehose is identified by its universally unique identifier (UUID), so not only do other devices require the correct capability to consume event data from a firehose, but they must know the UUID of the firehose to consume from. For more information about how to create firehoses and subscriptions, see Manage firehoses.

The following illustration shows how firehoses get event data from subscriptions, and how event data can be consumed by other devices.

Firehoses, subscriptions, and consumer

Before you work with firehoses, the device must have (or inherit) the required firehose capabilities. Here are some examples of the firehose capabilities that you require to perform certain tasks:

For more information about the available capabilities for using firehoses, see System-defined capabilities.

Create subscriptions

After you create a firehose, you can add one or more subscriptions to it. Subscriptions specify what event data is published or streamed to the firehose. For each subscription, you must configure the scope and the type of information that's streamed to the firehose. If you don't add at least one subscription to a firehose, you won't see any event data from it. To add a subscription to a firehose, the firehose.subscription.create capability is required on the application or device that you want to publish or stream information from, and the firehose.update capability is required on the firehose.

The scope of a subscription can be one of the following:

The types of event data streamed from devices can include:

If you want all event data streamed for a given scope, you can use a wildcard (*). The default behavior is to stream all event data when you specify data, file, or the life cycle of the specified device or application. For data types, you can create filters that stream only event data that matches a specified regular expression (RegEx). For example, if you had the following data:

You could use a regular expression of car.dash* to have all event data that matches car.dashboard streamed to the firehose. If you wanted all changes to car data, you could use .*car.
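
As a rough illustration of how such filters behave, the following sketch applies the two expressions above to a few data names using Python's re module. All data names other than car.dashboard are hypothetical, and the use of unanchored matching (re.search) is an assumption; the platform's actual matching rules may differ.

```python
import re

# Hypothetical data names; only "car.dashboard" comes from the text above.
NAMES = [
    "car.dashboard.speed",
    "car.dashboard.gas",
    "car.engine.rpm",
    "home.thermostat",
]


def matching(pattern, names):
    """Return the names that the regular expression matches anywhere."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.search(name)]


# Note: in a regular expression, "." matches any character and "h*" means
# zero or more "h", so this pattern is broader than it may look.
dashboard_only = matching(r"car.dash*", NAMES)
all_car = matching(r".*car", NAMES)
```

Under these assumptions, dashboard_only holds the two car.dashboard names and all_car holds every name that contains car.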

If you have multiple subscriptions that stream the exact same event data to the firehose, it's only streamed once to the firehose. For example, if you added subscription A, which streams event data for car.dash.speed and car.dash.gas, and subscription B streams car.dash.gas, you would see car.dash.speed and car.dash.gas once for that event datum (i.e., car.dash.gas isn't repeated).

The firehose.attach capability controls only whether a device can consume event data from the firehose. No additional controls determine whether the consumer has permission to read the individual data that flows through a firehose. As a result, devices that are allowed to consume from a firehose can see event data for devices that they haven't been granted read capabilities for. When you create a subscription, you are responsible for ensuring that sensitive data isn't inadvertently made visible.

For more information about how to create subscriptions, see Create a subscription.

Consume from a firehose

A device can consume information from a firehose when it is:

To attach to and consume data from a firehose, devices can issue either a single hanging GET request or multiple hanging GET requests. Multiple hanging GET requests are useful when you want the event data split between responses for scalability. A hanging GET request remains open until the device closes the network connection (or it's closed by some other means). Each time you receive data, you get an HTTP 200 response. If no data is available, the platform sends keep-alive messages approximately every five minutes to prevent the hanging GET from closing. The keep-alive messages are simply \r\n characters. As long as the hanging GET remains open, your device receives event data as it becomes available in the firehose.
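
The following is a minimal sketch of how a consumer might split the text received on a hanging GET into individual events while ignoring keep-alive messages. It assumes that events arrive as JSON objects written one after another in the response body, which this document doesn't state explicitly; the function name extract_events is hypothetical.

```python
import json

_decoder = json.JSONDecoder()


def extract_events(buffer):
    """Split a text buffer into complete JSON objects and the unparsed rest.

    Keep-alive CR/LF sequences are plain whitespace, so they are skipped.
    """
    events = []
    pos = 0
    while True:
        # Skip whitespace between objects, including keep-alive messages.
        while pos < len(buffer) and buffer[pos].isspace():
            pos += 1
        if pos == len(buffer):
            break
        try:
            event, pos = _decoder.raw_decode(buffer, pos)
        except json.JSONDecodeError:
            break  # Incomplete object; keep it until more data arrives.
        events.append(event)
    return events, buffer[pos:]
```

A caller would append each newly received chunk to the returned remainder and call the function again. A sketch like this can't tell a truncated object from a malformed one, so a real consumer would also need a limit on how much unparsed text it keeps.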

The BlackBerry IoT Platform ensures that the event data streamed to one device is independent of the event data streamed to any other device. This design means that each device (as identified by its UUID) has its own complete copy of event data from the same firehose.

Payload of a firehose

The schema for event data from a firehose is as follows:

{
   token: "String",
   data:
   {
       orgId:        "UUID",
       appId:        "UUID",
       deviceId:     "UUID",
       eventType:    "String",
       eventSubType: "String",
       eventMeta:    "JSON object"
   }
}
where:

Acknowledge event data

In addition to using a hanging GET to consume event data from a firehose, you can acknowledge event data so that you don't need to re-consume event data that you previously received from the firehose. If you don't acknowledge any of the event data that's received on a device, the next time you reconnect with a hanging GET request, you start from the beginning of the firehose queue (oldest event data), which means you may receive event data that you already received.

Devices use a PUT request to acknowledge event data. The acknowledgment lets the platform know a new starting point of where to start sending event data if the device later reconnects. The starting point on a firehose is always the oldest data that's available on it (or beginning of the queue) under these circumstances:

When you use a PUT request, you specify the UUID of the firehose and a JSON object that specifies the token (which is a base64-encoded string) of the event data to acknowledge. Each event has a unique token value. The token identifies the event in the firehose that the platform uses as the starting point for sending data if the device needs to reconnect.
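
As a minimal sketch, the acknowledgment request could be built in Python as shown here. The URL path and headers mirror the curl PUT example that appears later in this document; the function name build_ack_request is hypothetical, and the sketch only constructs the request without sending it.

```python
import json
import urllib.request

# Base URL taken from the curl examples in this document.
BASE_URL = "https://bbryiot.com/api/1/firehoses"


def build_ack_request(firehose_uuid, token, access_token):
    """Build (but do not send) the PUT request that acknowledges a token."""
    body = json.dumps({"token": token}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{firehose_uuid}/token",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send the request; error handling and retries are left out of this sketch.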

This starting point is maintained by the BlackBerry IoT Platform. Whether you choose to use one or multiple GET requests, the BlackBerry IoT Platform tracks what event data a device has acknowledged. The platform resends event data that hasn't been acknowledged the next time the device reconnects.

The event data that's consumed from the firehose is guaranteed to be kept in the same order in which it was streamed to the firehose. This guarantee is important because it means that when you acknowledge an event, you are also acknowledging all event data that was received before it on the same GET request. For this reason, you must send separate PUT requests for each hanging GET request you have open.

Note: When you acknowledge event data from one device, it doesn't affect the event data being received by another device.
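
The acknowledgment behavior described above can be pictured with a small in-memory model. This is only an illustration of the idea that acknowledging one token also acknowledges everything before it; the class name FirehoseCursor is hypothetical and doesn't reflect how the platform is implemented.

```python
class FirehoseCursor:
    """Toy model of the starting point for one hanging GET request."""

    def __init__(self, tokens):
        self._tokens = list(tokens)  # Ordered event tokens, oldest first.
        self._start = 0              # Index of the next event to send.

    def acknowledge(self, token):
        """Move the starting point just past the acknowledged token."""
        index = self._tokens.index(token)  # Raises ValueError if unknown.
        # Acknowledging one event implicitly acknowledges all earlier ones.
        # Assumption: the starting point never moves backward.
        self._start = max(self._start, index + 1)

    def pending(self):
        """Return the tokens that would be sent after a reconnect."""
        return self._tokens[self._start:]


cursor = FirehoseCursor(["t1", "t2", "t3", "t4", "t5", "t6", "t7"])
cursor.acknowledge("t5")
resent = cursor.pending()  # Only the unacknowledged events remain.
```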

Example sequence for consuming event data from a firehose

Here's an example sequence of a device using a hanging GET request to consume event data from a firehose. For simplicity, the example describes a single hanging GET request. You can issue multiple hanging GET requests to distribute the event data that's received on your device. For more information about how to consume event data in parallel, see Consume from a firehose using multiple GET requests.

The subscription is set up to stream life-cycle events for an application entity. The example below shows the event data that's sent whenever an application is created. The illustration below shows how to send an acknowledgment using a token to specify a new starting point on the firehose.

In this example, five events are sent and the fifth event is acknowledged. Which event data to acknowledge is application-specific (that is, when you acknowledge the data is determined by what makes sense for your requirements, such as after a certain number of events or after a time interval).

Note: For illustrative purposes, the examples below use simple strings to represent the tokens for event data. For example, you see [base64-encoded token 1] instead of an actual base64-encoded value, which looks like this:

eyJhZGRyZXNzIjoiZWEzZWQ4MzAtZjI5YS0xMWU0LThhNjQtMGQ5ZmU4YTRhNTE3IiwiZGF0YSI6eyJSQlNlcSI6MSwicGFydGl0aW9ucyI6eyIxIjowLCI1IjowfX19

Consuming from the firehose

  1. The device issues a hanging GET request to consume data from the firehose.
    curl -k -X GET "https://bbryiot.com/api/1/firehoses/[firehose UUID]/data" \
         -H "Authorization: Bearer [access token]"
       
  2. When the device starts receiving data, every fifth event data is acknowledged. So let's say that the device consumes the following event data from the firehose:
    {
        "token": "[base64-encoded token 1]",
        "data": {
            "orgId": "428cc410-6110-11e4-a94b-373082eea594",
            "appId": "92bea246-f369-11e4-8d65-6f6f5a7ffa2f",
            "deviceId": "4bd8fa8a-f378-11e4-8d65-6f6f5a7ffa2b",
            "eventType": "lifecycle",
            "eventSubType": "create",
            "eventMeta": {
                "id": "7bd8fa8a-f378-11e4-8d65-6f6f5a7ffa2f",
                "app_id": "92bea246-f369-11e4-8d65-6f6f5a7ffa2f",
                "created_on": 1430865952740,
                "description": "Device 1 description",
                "identifier": "Device1",
                "name": "Device1_firehoseapp",
                "org_id": "428cc410-6110-11e4-a94b-373082eea594"
            }
        }
    }
    {
        "token": "[base64-encoded token 2]",
        "data": {
            "orgId": "428cc410-6110-11e4-a94b-373082eea594",
            "appId": "92bea246-f369-11e4-8d65-6f6f5a7ffa2f",
            "deviceId": "5bd8fa8a-f378-11e4-8d65-6f6f5a7ffa2f",
            "eventType": "lifecycle",
            "eventSubType": "create",
            "eventMeta": {
                "id": "8bd8fa8a-f378-11e4-8d65-6f6f5a7ffa2f",
                "app_id": "92bea246-f369-11e4-8d65-6f6f5a7ffa2c",
                "created_on": 1430865952940,
                "description": "Device 2 description",
                "identifier": "Device2",
                "name": "Device2_firehoseapp",
                "org_id": "428cc410-6110-11e4-a94b-373082eea594"
            }
        }
    }
    {
        "token": "[base64-encoded token 3]",
        "data": {
            "orgId": "428cc410-6110-11e4-a94b-373082eea594",
            "appId": "92bea246-f369-11e4-8d65-6f6f5a7ffa2f",
            "deviceId": "6bd8fa8a-f378-11e4-8d65-6f6f5a7ffa2d",
            "eventType": "lifecycle",
            "eventSubType": "create",
            "eventMeta": {
                "id": "9bd8fa8a-f378-11e4-8d65-6f6f5a7ffa2f",
                "app_id": "92bea246-f369-11e4-8d65-6f6f5a7ffa2f",
                "created_on": 1430865953140,
                "description": "Device 3 description",
                "identifier": "Device3",
                "name": "Device3_firehoseapp",
                "org_id": "428cc410-6110-11e4-a94b-373082eea594"
            }
        }
    }
    {
        "token": "[base64-encoded token 4]",
        "data": {
            "orgId": "428cc410-6110-11e4-a94b-373082eea594",
            "appId": "92bea246-f369-11e4-8d65-6f6f5a7ffa2f",
            "deviceId": "7cd8fa8a-f378-11e4-8d65-6f6f5a7ffa2e",
            "eventType": "lifecycle",
            "eventSubType": "create",
            "eventMeta": {
                "id": "ffd8fa8a-f378-11e4-8d65-6f6f5a7ffa2f",
                "app_id": "92bea246-f369-11e4-8d65-6f6f5a7ffa2f",
                "created_on": 1430865953340,
                "description": "Device 4 description",
                "identifier": "Device4",
                "name": "Device4_firehoseapp",
                "org_id": "428cc410-6110-11e4-a94b-373082eea594"
            }
        }
    }
    {
        "token": "[base64-encoded token 5]",
        "data": {
            "orgId": "428cc410-6110-11e4-a94b-373082eea594",
            "appId": "92bea246-f369-11e4-8d65-6f6f5a7ffa2f",
            "deviceId": "89d8fa8a-f378-11e4-8d65-6f6f5a7ffa2f",
            "eventType": "lifecycle",
            "eventSubType": "create",
            "eventMeta": {
                "id": "ffd8fa8a-f378-11e4-8d65-6f6f5a7ffa2f",
                "app_id": "92bea246-f369-11e4-8d65-6f6f5a7ffa2f",
                "created_on": 1430865953540,
                "description": "Device 5 description",
                "identifier": "Device5",
                "name": "Device5_firehoseapp",
                "org_id": "428cc410-6110-11e4-a94b-373082eea594"
            }
        }
    }
    
  3. After receiving the fifth event, the device acknowledges the event data that it has received using a PUT request. The PUT request specifies the firehose UUID and the base64-encoded token as a JSON value. The token acknowledges that a specific event was received, and also indicates that everything before that event was received. Here's what the PUT request would look like:
    curl -k -X PUT "https://bbryiot.com/api/1/firehoses/[firehose UUID]/token" \
         -d '{"token":"[base64-encoded token 5]"}' \
         -H "Authorization: Bearer [access token]" \
         -H "Content-Type: application/json"
        
  4. When the BlackBerry IoT Platform receives the PUT request, the starting point is moved based on the specified token in the previous step.
  5. After more data is received, but before the device can issue another PUT request to acknowledge it, the device restarts and the GET request closes. In this situation, the BlackBerry IoT Platform doesn't receive an acknowledgment for the following event data that was sent:
    {
        "token": "[base64-encoded token 6]",
        "data": {
            "orgId": "428cc410-6110-11e4-a94b-373082eea594",
            "appId": "92bea246-f369-11e4-8d65-6f6f5a7ffa2f",
            "deviceId": "8cd8fa8a-f378-11e4-8d65-6f6f5a7ffa2e",
            "eventType": "lifecycle",
            "eventSubType": "create",
            "eventMeta": {
                "id": "ffd8fa8a-f378-11e4-8d65-6f6f5a7ffa2f",
                "app_id": "92bea246-f369-11e4-8d65-6f6f5a7ffa2f",
                "created_on": 1430865953740,
                "description": "Device 4 description",
                "identifier": "Device4",
                "name": "Device4_firehoseapp",
                "org_id": "428cc410-6110-11e4-a94b-373082eea594"
            }
        }
    }
    {
        "token": "[base64-encoded token 7]",
        "data": {
            "orgId": "428cc410-6110-11e4-a94b-373082eea594",
            "appId": "92bea246-f369-11e4-8d65-6f6f5a7ffa2f",
            "deviceId": "99d8fa8a-f378-11e4-8d65-6f6f5a7ffa2f",
            "eventType": "lifecycle",
            "eventSubType": "create",
            "eventMeta": {
                "id": "ffd8fa8a-f378-11e4-8d65-6f6f5a7ffa2f",
                "app_id": "92bea246-f369-11e4-8d65-6f6f5a7ffa2f",
                "created_on": 1430865953940,
                "description": "Device 5 description",
                "identifier": "Device5",
                "name": "Device5_firehoseapp",
                "org_id": "428cc410-6110-11e4-a94b-373082eea594"
            }
        }
    }
    
  6. To consume from the firehose again, the device issues a hanging GET request again as follows:
    curl -k -X GET "https://bbryiot.com/api/1/firehoses/[firehose UUID]/data" \
         -H "Authorization: Bearer [access token]"
       
  7. The server starts sending event data from just after the last acknowledged event, which means it resends data that the device previously received but didn't acknowledge. The device should have application logic to handle any duplicate data that's resent. Here's what the resent data would look like:
    {
        "token": "[base64-encoded token 6]",
        "data": {
            "orgId": "428cc410-6110-11e4-a94b-373082eea594",
            "appId": "92bea246-f369-11e4-8d65-6f6f5a7ffa2f",
            "deviceId": "8cd8fa8a-f378-11e4-8d65-6f6f5a7ffa2e",
            "eventType": "lifecycle",
            "eventSubType": "create",
            "eventMeta": {
                "id": "ffd8fa8a-f378-11e4-8d65-6f6f5a7ffa2f",
                "app_id": "92bea246-f369-11e4-8d65-6f6f5a7ffa2f",
                "created_on": 1430865953740,
                "description": "Device 4 description",
                "identifier": "Device4",
                "name": "Device4_firehoseapp",
                "org_id": "428cc410-6110-11e4-a94b-373082eea594"
            }
        }
    }
    {
        "token": "[base64-encoded token 7]",
        "data": {
            "orgId": "428cc410-6110-11e4-a94b-373082eea594",
            "appId": "92bea246-f369-11e4-8d65-6f6f5a7ffa2f",
            "deviceId": "99d8fa8a-f378-11e4-8d65-6f6f5a7ffa2f",
            "eventType": "lifecycle",
            "eventSubType": "create",
            "eventMeta": {
                "id": "ffd8fa8a-f378-11e4-8d65-6f6f5a7ffa2f",
                "app_id": "92bea246-f369-11e4-8d65-6f6f5a7ffa2f",
                "created_on": 1430865953940,
                "description": "Device 5 description",
                "identifier": "Device5",
                "name": "Device5_firehoseapp",
                "org_id": "428cc410-6110-11e4-a94b-373082eea594"
            }
        }
    }
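
The sequence above can be sketched on the device side as follows. This is one possible approach rather than a prescribed one: events are stored in a local SQLite table keyed by token so that resent events are ignored, and every fifth newly stored event is acknowledged. The function names and the acknowledge callback (which would send the PUT request) are hypothetical.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Create a local store keyed by token so resent events are ignored."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS events (token TEXT PRIMARY KEY, data TEXT)"
    )
    return db


def consume(events, db, acknowledge, every=5):
    """Store each new event and acknowledge every `every`-th new one.

    `acknowledge` is a hypothetical callback that would send the PUT request.
    Returns the number of newly stored events.
    """
    stored = 0
    for event in events:
        cursor = db.execute(
            "INSERT OR IGNORE INTO events (token, data) VALUES (?, ?)",
            (event["token"], json.dumps(event["data"])),
        )
        if cursor.rowcount == 0:
            continue  # Duplicate resent after a reconnect; skip it.
        stored += 1
        if stored % every == 0:
            db.commit()  # Persist locally before acknowledging.
            acknowledge(event["token"])
    db.commit()
    return stored
```

Committing before acknowledging means that a crash between the two steps leads to a resend rather than to lost data, which the primary key on the token then absorbs.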
    

Consume from a firehose using multiple GET requests

The BlackBerry IoT Platform can distribute event data between multiple hanging GET requests from the same device (identified by its UUID). The platform supports up to eight hanging GET requests from the same device. A device doesn't have to be a single physical piece of hardware; it can be a cluster of machines where each machine issues its own hanging GET request using the same device UUID. Depending on the number of subscriptions and the amount of data that subscribed devices feed to the firehose, you may need multiple GET requests to handle the volume of data and to meet your performance requirements. For example, you might get so much data from a firehose that you require multiple physical machines (using the same device UUID to connect to the BlackBerry IoT Platform) to read from the network and write that data to a database in a timely manner.

Note: The access token, which you use in the Authorization header of the GET request, identifies the device to the BlackBerry IoT Platform.

The event data that you receive is the same whether you issue a single hanging GET request or multiple hanging GET requests; you still get all the data. The advantage of using multiple GET requests simultaneously is that it allows your device to scale as the volume of event data increases. In rare situations (e.g., the connection closes because of network connectivity problems), you might get duplicate event data.

The flow for handling multiple GET requests is identical to the flow for a single GET request, as shown in the illustration below.

Handling simultaneous GET requests

  1. The device sends three parallel hanging GET requests.
  2. The BlackBerry IoT Platform determines that the hanging GET requests come from the same device and distributes the event data accordingly.
  3. The distributed event data is sent to the device as multiple responses. The device receives the event data and completes the required processing of the event data from the GET responses.
  4. For each hanging GET request, separate PUT requests are sent to acknowledge the event data that was received.
  5. When the platform receives the PUT request, the specified token is used to update the starting point for the event data on the platform.
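
The steps above can be sketched with one worker per hanging GET request, where each worker sends its own acknowledgment. In this sketch, each stream is a plain list standing in for the events received on one connection, and the acknowledge callback is a hypothetical stand-in for the PUT request; how the platform associates a PUT request with a particular GET request isn't covered here.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CONNECTIONS = 8  # Limit on hanging GET requests stated in this section.


def run_connection(events, acknowledge):
    """Process one connection's events, then acknowledge its last token."""
    last_token = None
    for event in events:
        last_token = event["token"]  # Real processing would happen here.
    if last_token is not None:
        acknowledge(last_token)
    return last_token


def consume_in_parallel(streams, acknowledge):
    """Run one worker per stream; each sends its own acknowledgment."""
    if not 1 <= len(streams) <= MAX_CONNECTIONS:
        raise ValueError(f"expected 1 to {MAX_CONNECTIONS} connections")
    with ThreadPoolExecutor(max_workers=len(streams)) as pool:
        futures = [pool.submit(run_connection, s, acknowledge) for s in streams]
        return [f.result() for f in futures]
```

Acknowledging once per connection at the end keeps the sketch short; a long-running consumer would acknowledge periodically on each connection instead.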

Considerations for using multiple GET requests

Here are a few considerations when you issue multiple, simultaneous GET requests from a single device: