Theodolite’s Stream Processing Benchmarks
Theodolite comes with four application benchmarks, which are based on typical use cases for stream processing within microservices. For each benchmark, a corresponding load generator is provided. Currently, Theodolite provides benchmark implementations for Apache Kafka Streams, Apache Flink, Hazelcast Jet, and Apache Beam (with Samza and Flink).
Theodolite’s benchmarks (labeled UC1–UC4) each represent an event-driven microservice that performs Industrial Internet of Things data analytics. Specifically, they are derived from microservice-based research software for analyzing industrial power consumption data streams (the Titan Control Center).
Stream processing engine | UC1 | UC2 | UC3 | UC4 |
---|---|---|---|---|
Apache Kafka Streams | ✓ | ✓ | ✓ | ✓ |
Apache Flink | ✓ | ✓ | ✓ | ✓ |
Hazelcast Jet | ✓ | ✓ | ✓ | ✓ |
Apache Beam (Samza/Flink) | ✓ | ✓ | ✓ | ✓ |
Installation
When Theodolite is installed with Helm using the default configuration, our stream processing benchmarks are installed automatically.
This can be verified by running `kubectl get benchmarks`, which should yield something like:
NAME AGE STATUS
uc1-beam-flink 2d20h Ready
uc1-beam-samza 2d20h Ready
uc1-flink 2d20h Ready
uc1-hazelcastjet 2d16h Ready
uc1-kstreams 2d20h Ready
uc2-beam-flink 2d20h Ready
uc2-beam-samza 2d20h Ready
uc2-flink 2d20h Ready
uc2-hazelcastjet 2d16h Ready
uc2-kstreams 2d20h Ready
uc3-beam-flink 2d20h Ready
uc3-beam-samza 2d20h Ready
uc3-flink 2d20h Ready
uc3-hazelcastjet 2d16h Ready
uc3-kstreams 2d20h Ready
uc4-beam-flink 2d20h Ready
uc4-beam-samza 2d20h Ready
uc4-flink 2d20h Ready
uc4-hazelcastjet 2d16h Ready
uc4-kstreams 2d20h Ready
Alternatively, all benchmarks can be found on GitHub and installed manually with `kubectl apply -f <benchmark-yaml-file>`. Additionally, you need to package the benchmarks’ Kubernetes resources into a ConfigMap by running:
kubectl create configmap <configmap-name-required-by-benchmark> --from-file <directory-with-benchmark-resources>
See the install-configmaps.sh script for examples.
Running Benchmarks
To run a benchmark, you need to create and apply an `Execution` YAML file as described in the running benchmarks documentation.
Some preliminary results of our benchmarks can be found in our publication:
- S. Henning and W. Hasselbring. “Theodolite: Scalability Benchmarking of Distributed Stream Processing Engines in Microservice Architectures”. In: Big Data Research 25. 2021. DOI: 10.1016/j.bdr.2021.100209.
Control the Number of Load Generator Instances
Depending on the load to be generated, the Theodolite benchmarks create multiple load generator instances.
By default, a single instance generates up to 150 000 messages per second.
If higher loads are to be generated, correspondingly more instances are deployed.
However, the actual load that can be generated by a single load generator instance depends on the cluster configuration and might be lower.
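As a rough illustration of how the instance count scales with the load, the following sketch rounds the target load up to the next multiple of the per-instance maximum. It assumes a plain ceiling division; the exact rule Theodolite applies is not stated here, and the function name `instances_for_load` is a hypothetical helper, not part of Theodolite.

```shell
#!/bin/sh
# Hypothetical helper (not part of Theodolite): estimate the number of
# load generator instances for a target load, ASSUMING a ceiling division.
instances_for_load() {
  load=$1   # target load in messages per second
  max=$2    # maximum messages per second a single instance can generate
  # Integer ceiling division: add (max - 1) before dividing.
  echo $(( (load + max - 1) / max ))
}

instances_for_load 150000 150000   # fits into a single instance
instances_for_load 400000 150000   # needs more than two instances, so rounds up
```

Under this assumption, lowering the per-instance maximum increases the number of instances deployed for the same load.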
To change the maximum number of messages per instance, run the following commands.
Set the `MAX_RECORDS_PER_INSTANCE` variable to the number of messages a single instance can generate in your cluster (use our Grafana dashboard to determine that value).
export MAX_RECORDS_PER_INSTANCE=150000 # Change to your desired value
kubectl patch benchmarks uc1-beam-flink --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc1-beam-samza --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc1-flink --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc1-hazelcastjet --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc1-kstreams --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc2-beam-flink --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc2-beam-samza --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc2-flink --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc2-hazelcastjet --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc2-kstreams --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc3-beam-flink --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc3-beam-samza --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc3-flink --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc3-hazelcastjet --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc3-kstreams --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc4-beam-flink --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc4-beam-samza --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc4-flink --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc4-hazelcastjet --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
kubectl patch benchmarks uc4-kstreams --type json --patch "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
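The commands above differ only in the benchmark name, so they can also be generated with a nested loop. The following is a minimal sketch rather than an official Theodolite script: the function name `patch_all_benchmarks` and the `RUN` variable are hypothetical, and the benchmark names and patch path are copied from the commands above. By default it only prints the commands (a dry run); setting `RUN=kubectl` applies them.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of Theodolite) around the patch commands above.
MAX_RECORDS_PER_INSTANCE=${MAX_RECORDS_PER_INSTANCE:-150000}
# Dry run by default: commands are printed, not executed.
# Set RUN=kubectl to actually apply the patches.
RUN=${RUN:-echo kubectl}

patch_all_benchmarks() {
  for uc in uc1 uc2 uc3 uc4; do
    for engine in beam-flink beam-samza flink hazelcastjet kstreams; do
      # $RUN is intentionally unquoted so that "echo kubectl" splits into two words.
      $RUN patch benchmarks "$uc-$engine" --type json --patch \
        "[{op: replace, path: /spec/loadTypes/0/patchers/1/properties/loadGenMaxRecords, value: $MAX_RECORDS_PER_INSTANCE}]"
    done
  done
}

patch_all_benchmarks
```

Reviewing the printed commands before running with `RUN=kubectl` helps confirm that the benchmark names match the output of `kubectl get benchmarks` in your cluster.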