Beam

Apache Beam originated from Google’s Dataflow Model, introduced in 2014. In 2016, Google donated the Dataflow Model to the Apache Software Foundation. With contributions and improvements from other community members, it later became Apache Beam.

Logo

../_images/apache_beam-small.png

Website

https://beam.apache.org/

Repository

https://github.com/apache/beam

Byline

Apache Beam is a unified model for defining both batch and streaming data-parallel processing pipelines, as well as a set of language-specific SDKs for constructing pipelines and Runners for executing them on distributed processing backends, including Apache Flink, Apache Spark, Google Cloud Dataflow and Hazelcast Jet.

License

Apache 2.0

Project age

8 years 5 months

Backers

Apache (Governed by), Google (Creator)

Latest News (2022-08-23)

2.41.0: We are happy to present the new 2.41.0 release of Beam. This release includes both improvements and new functionality. For more …

Size score (1 to 10, higher is better)

9.0

Trend score (1 to 10, higher is better)

8.25

Education Resources

| URL | Resource Type | Description |
| --- | --- | --- |
| https://beam.apache.org/documentation/ | Documentation | Official project documentation. |

Git Commit Statistics

Statistics computed using Git data through November 30, 2022.

| Statistic | Lifetime | Last 12 Months |
| --- | --- | --- |
| Commits | 90,444 | 27,503 |
| Lines committed | 40,810,039 | 8,936,097 |
| Unique committers | 1,410 | 300 |
| Core committers | 23 | 22 |

../_images/apache_beam-monthly-commits.png

Similar Projects

| Project | Size Score | Trend Score | Byline |
| --- | --- | --- | --- |
| Flink | 9.25 | 7.0 | Apache Flink is an open source stream processing framework with powerful stream- and batch-processing capabilities. |
| Hudi | 6.75 | 8.0 | Hudi is a rich platform to build streaming data lakes with incremental data pipelines on a self-managing database layer, while being optimized for lake engines and regular batch processing. |
| NiFi | 8.5 | 5.0 | Apache NiFi supports highly configurable directed graphs of data routing, transformation, and system mediation logic. |
| Storm | 6.5 | 3.0 | Storm is a distributed realtime computation system. |