RayDP

Website

N/A

Repository

https://github.com/Intel-bigdata/oap-raydp

Byline

A distributed data processing library that runs popular big data frameworks such as Apache Spark on Ray. RayDP integrates with other Ray libraries to make it simple to build end-to-end data analytics and AI pipelines.

License

Apache 2.0

Project age

2 years 7 months

Backers

Intel (Creator)

Latest News (2022-12-02)

RayDP-0.6.0 Highlights: Support Ray 1.9.0 - 2.1.0; Support Spark 3.1 - 3.3; Spark master node affinity; Updated PyTorch and TensorFlow …

Size score (1 to 10, higher is better)

2.0

Trend score (1 to 10, higher is better)

5.75

Education Resources

| URL | Resource Type | Description |
| --- | --- | --- |
| https://github.com/oap-project/raydp/blob/master/README.md | ReadMe | Documentation in ReadMe file |
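
The byline's idea of running Spark on Ray can be illustrated with a minimal sketch. It assumes `ray`, `raydp`, `pyspark`, and a Java runtime are installed, and it uses only `raydp.init_spark` and `raydp.stop_spark`. The helper names `spark_on_ray_config` and `run_word_count`, and the resource values, are hypothetical choices for illustration, not part of RayDP.

```python
def spark_on_ray_config(app_name, num_executors=2, executor_cores=1,
                        executor_memory="1GB"):
    """Build keyword arguments for raydp.init_spark (hypothetical helper)."""
    if num_executors < 1 or executor_cores < 1:
        raise ValueError("num_executors and executor_cores must be >= 1")
    return {
        "app_name": app_name,
        "num_executors": num_executors,
        "executor_cores": executor_cores,
        "executor_memory": executor_memory,
    }


def run_word_count(lines):
    """Start a Spark session on Ray, count words, then shut both down."""
    # Imported lazily so this module loads even without Ray/RayDP installed.
    import ray
    import raydp

    ray.init()
    # init_spark returns a regular SparkSession whose executors run on Ray.
    spark = raydp.init_spark(**spark_on_ray_config("raydp_sketch"))
    try:
        words = [(w,) for line in lines for w in line.split()]
        df = spark.createDataFrame(words, ["word"])
        counts = df.groupBy("word").count().collect()
        return {row["word"]: row["count"] for row in counts}
    finally:
        # Release the Spark executors first, then the Ray runtime.
        raydp.stop_spark()
        ray.shutdown()
```

Calling `run_word_count(["spark on ray", "ray data"])` on a machine with the assumed dependencies would start a local Ray instance, run the aggregation through Spark, and return a word-to-count dictionary. The `try`/`finally` is a design choice to make sure executors are released even if the Spark job fails.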

Git Commit Statistics

Statistics computed using Git data through November 30, 2022.

| Statistic | Lifetime | Last 12 Months |
| --- | --- | --- |
| Commits | 3,606 | 285 |
| Lines committed | 1,102,681 | 66,709 |
| Unique committers | 25 | 14 |
| Core committers | 6 | 7 |

[Figure: monthly commits chart]

Similar Projects

| Project | Size Score | Trend Score | Byline |
| --- | --- | --- | --- |
| Analytics Zoo | 4.25 | 1.75 | Distributed Tensorflow, Keras and PyTorch on Apache Spark/Flink & Ray |
| Dask | 6.75 | 4.75 | Parallel computing with task scheduling. |
| HPCC | 6.0 | 6.25 | HPCC Systems (High Performance Computing Cluster) is an open source, massive parallel-processing computing platform for big data processing and analytics. |
| Modin | 5.0 | 7.5 | Speed up your Pandas workflows by changing a single line of code |
| Pig | 4.0 | 5.0 | Apache Pig is a platform to create programs on top of Apache Hadoop. |