Ship a streaming pipeline before lunch.
One binary. YAML config. Run it against a print sink locally, swap in Postgres in production.
A performant, extensible streaming runtime written in Rust using Apache Arrow and Apache DataFusion. Define pipelines in YAML. Hit production in seconds.
sources:
raw.transactions:
type: kafka
topic: raw.event.transaction
transforms:
large_transactions:
type: sql
primary_key: id
sql: |
SELECT *
FROM raw.transactions
WHERE amount > 1000
sinks:
pg.large_transactions:
from: large_transactions
type: postgres
schema: public
table: large_transactions
primary_key: idStreamling skips distributed shuffles, complex joins, and stateful aggregations. Instead it offers a streaming engine for application developers connecting datastores, enriching events with HTTP, and powering user-facing products.
WebAssembly-sandboxed JS/TS transforms run a process(input) function per record. Or use full DataFusion SQL with hundreds of built-in functions.
Most app data lives behind APIs. Streamling uses HTTP for enrichment transforms and webhook sinks, not just Kafka.
Postgres, ClickHouse, and Kafka are core sinks. Schema and tables are created automatically; upsert semantics are tracked through the engine.
No coordinator or ZooKeeper. Tens of thousands of messages per second on 0.5 of a CPU core.
Streamling makes opinionated trade-offs so application teams don't have to babysit infrastructure. Less to learn. Less to operate.
No distributed shuffle or coordinator. Scale out horizontally with Kafka consumer groups. Each instance picks up its own partitions, naturally.
Operators are stateless by default. A pluggable state backend stores small bits of metadata, like Kafka offsets, in SQLite or Postgres.
Markers flow source → transforms → sinks. Sinks flush before sources commit. Combined with sink-level upserts, you get effectively-once in most pipelines.
Tens of thousands of messages per second on half a CPU core. Each operator is a pull-based DataFusion ExecutionPlan, passing Arrow RecordBatch data over Tokio streams.
Need to filter against a list of IDs? Point a SQL transform at a Postgres-backed dynamic table. Update the lookup data without restarting the pipeline.
Live inspection of in-flight data. Print and blackhole sinks for debugging. Instant startup. OpenTelemetry built in. validate mode for CI and agentic tools.
Kafka, Postgres, ClickHouse, HTTP, SQL, TypeScript. These are just the built-in connectors, more are available as plugins.
sources:
ethereum.raw_blocks:
type: kafka
topic: mainnet.raw.blocks
# Offsets committed only after sinks flush.process(input) function.U256/I256 support. Upserts.ReplacingMergeTree upserts. Gzip compression.sources:
ethereum.raw_blocks:
type: kafka
topic: mainnet.raw.blocks
# Offsets committed only after sinks flush.Sources, transforms, sinks, UDFs and topology preprocessors all share a single, simple trait. If the system you want to talk to has a Rust SDK, a plugin is a few dozen lines.
abi_stableRecordBatch over FFIuse streamling_plugin::{register_plugin_sink, SinkPlugin};
use streamling_core::{Result, PluginError, CheckpointEpoch};
use arrow::record_batch::RecordBatch;
pub struct SqsSink { /* ... */ }
#[async_trait]
impl SinkPlugin for SqsSink {
async fn initialize(&self) -> Result<(), PluginError> {
// open connection, register schema, etc.
Ok(())
}
async fn process_batch(&self, data: RecordBatch) -> Result<(), PluginError> {
// your sink logic — react to the incoming RecordBatch
self.client.send_batch(data).await?;
Ok(())
}
async fn process_checkpoint_marker(&self, epoch: CheckpointEpoch)
-> Result<(), PluginError> { /* prepare */ Ok(()) }
async fn process_checkpoint_finalizer(&self, epoch: CheckpointEpoch)
-> Result<(), PluginError> { /* commit */ Ok(()) }
}
register_plugin_sink!("sqs", SqsSink);Streamling has run in production for months as the engine for Goldsky's flagship product. It powers thousands of pipelines moving blockchain data for some of the largest teams in crypto.
Streamling powers Turbo, our newest product, which in turn powers products at Polymarket, Stripe, Phantom, and many more. Since launch, teams have built thousands of pipelines on it.
One binary. YAML config. Run it against a print sink locally, swap in Postgres in production.