@evnm
Created November 11, 2015 21:19
README for a hypothetical stream-consumption command-line tool

scat

A stream consumption tool.

Status: Vaporware

Why?

In 2015, stream-processing is the it-girl of software architecture. This sub-field is driven forward by a growing number of large-scale, analysis-oriented storage systems developed by open source communities and SaaS companies alike. Tooling, on the other hand, feels as though it's fallen by the wayside.

A "stream-processing engine" seems to comprise a clustered set of servers running a complicated application in a datacenter somewhere. Treating the engine as a black box, programs write data to it and are able to read values back, presumably via a vendor-provided client library.

But what if we want to inspect values passing through the stream? An SDK provides programmatic access, but that is a long way from out-of-the-box observability into the operation of a system. Genuinely useful command-line programs for debugging and consuming streams are rarely provided.

This developer feels as if we're constructing a pristine edifice with limited regard for how occupants are supposed to interact with the new environment day-to-day.

Enter scat, a proposed command-line tool for interacting with stream-processing systems. In the same vein as existing tools such as kafkacat and aws kinesis get-records, scat lets a developer read stream contents directly from a terminal. In this way, it can be thought of as cat for stream-processing systems.

Plan

scat will initially target Amazon Kinesis.

A lofty goal of this project is for the tool to be adaptable to multiple stream-processing systems (e.g. Kafka, Flume, WhateverMQ).
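
One way to keep the tool adaptable is to hide each backend behind a small interface. The sketch below is an illustration only, since scat has no implementation yet: the names StreamSource, InMemorySource, and cat are hypothetical, and writing one newline after each record is an assumption about output framing.

```python
import io
from typing import BinaryIO, Iterable, Iterator, Protocol


class StreamSource(Protocol):
    """Hypothetical interface that each backend (Kinesis, Kafka, ...) would implement."""

    def records(self) -> Iterator[bytes]:
        """Yield raw record payloads in stream order."""
        ...


class InMemorySource:
    """Stand-in backend used here so the sketch runs without any network access."""

    def __init__(self, payloads: Iterable[bytes]) -> None:
        self._payloads = list(payloads)

    def records(self) -> Iterator[bytes]:
        yield from self._payloads


def cat(source: StreamSource, out: BinaryIO) -> int:
    """Write every record from the source to out, one per line; return the count."""
    count = 0
    for payload in source.records():
        out.write(payload)
        out.write(b"\n")  # assumed framing: newline-delimited records
        count += 1
    return count


# Demo: the same cat() would work for any object with a records() method.
demo_out = io.BytesIO()
demo_count = cat(InMemorySource([b"first", b"second"]), demo_out)
```

With this shape, supporting another system would mean adding one more class with a records() method, leaving the printing loop untouched.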

Examples

$ scat -h
usage: scat -s <stream-name> [-r <aws-region>] [-q <seqno>] [-n <interval>]
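
As a rough sketch of how the usage line above could be parsed, here is a Python argparse version. Nothing here is scat's actual code. The destination names are made up for illustration, and reading -n as a polling interval in seconds is an assumption, since the README does not define it.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring: scat -s <stream-name> [-r <aws-region>] [-q <seqno>] [-n <interval>]."""
    parser = argparse.ArgumentParser(prog="scat", description="A stream consumption tool.")
    parser.add_argument("-s", dest="stream_name", metavar="<stream-name>", required=True,
                        help="name of the stream to read")
    parser.add_argument("-r", dest="region", metavar="<aws-region>", default=None,
                        help="AWS region of the stream")
    # Sequence numbers are kept as strings; they are too large to treat as machine integers.
    parser.add_argument("-q", dest="seqno", metavar="<seqno>", default=None,
                        help="sequence number to start reading from")
    # Assumption: <interval> is a polling interval in seconds.
    parser.add_argument("-n", dest="interval", metavar="<interval>", type=float, default=None,
                        help="polling interval (assumed to be seconds)")
    return parser
```

Calling build_parser().parse_args(["-s", "user-events", "-r", "us-west-2"]) would yield a namespace with stream_name and region set and the optional fields left as None.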

Print messages from a user-events Kinesis stream in the us-west-2 region to stdout in real time:

$ scat -s user-events -r us-west-2

Print messages from a Kinesis stream starting at a given sequence number to stdout:

$ scat -s user-events -r us-west-2 -q 21267647932558653966460912964485513216
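
Under the hood, starting from a sequence number would presumably map onto the Kinesis GetShardIterator call. The helper below is a hypothetical sketch that only builds the request parameters. It assumes a single known shard ID, and it assumes that reading "in real time" means the LATEST iterator type. The resulting dict could be passed to a Kinesis client's get_shard_iterator call.

```python
from typing import Dict, Optional


def shard_iterator_params(stream_name: str, shard_id: str,
                          seqno: Optional[str] = None) -> Dict[str, str]:
    """Build GetShardIterator parameters for one shard (hypothetical helper)."""
    params = {"StreamName": stream_name, "ShardId": shard_id}
    if seqno is None:
        # No starting point given: begin with records added after the iterator is created.
        params["ShardIteratorType"] = "LATEST"
    else:
        # Start exactly at the record with the given sequence number.
        params["ShardIteratorType"] = "AT_SEQUENCE_NUMBER"
        params["StartingSequenceNumber"] = seqno
    return params
```

A real tool would also need to list the stream's shards and hold an iterator for each one, which this sketch leaves out.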

Filter GZIP'd messages from a Kinesis stream and write them to a file:

$ scat -s user-events -r us-west-2 | gunzip | grep 'user_id:1583331' > /var/log/my-user-events.log
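
For comparison, the same filtering can be sketched programmatically. This assumes each record payload is its own gzip-compressed, UTF-8 encoded blob. The function name filter_gzipped is made up for illustration.

```python
import gzip
from typing import Iterable, Iterator


def filter_gzipped(payloads: Iterable[bytes], needle: str) -> Iterator[str]:
    """Decompress each gzip'd payload and yield the text lines containing needle."""
    for payload in payloads:
        text = gzip.decompress(payload).decode("utf-8")
        for line in text.splitlines():
            if needle in line:
                yield line


# Demo with two locally compressed payloads standing in for stream records.
sample_payloads = [
    gzip.compress(b"user_id:1583331 login\nuser_id:42 logout\n"),
    gzip.compress(b"user_id:1583331 click\n"),
]
matches = list(filter_gzipped(sample_payloads, "user_id:1583331"))
```

The shell pipeline remains the more convenient form at a terminal. The function shows what each stage of the pipe is doing.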