Data Hut™ — Open Source Project Directory
Curated insights on the most popular data science and data engineering projects. By combining machine learning techniques with expert knowledge, we help you understand the open source landscape and pick the best software for your needs.
Data Hut News (July 18, 2022): The July update is here! In addition to the latest project statistics, we have added a new project, SHAP, a key library for explainability in machine learning.
For site and project updates, follow us on Twitter: @datahutai
Projects by Category
| Category | Description | Projects |
|---|---|---|
| | Tools for transforming and analyzing the largest data sets. | 23 |
| | Tools for statistical analysis and machine learning. | 77 |
| | Data repositories | 39 |
| | Processing data as networks of interconnected nodes. | 25 |
To jump directly to a project or search by keyword, use the search page or the search box above.
Popular Communities and Project Backers
| Community | Website | Description |
|---|---|---|
| | | Apache is the world’s largest open source foundation with over 300 top-level projects. |
| | | As the world’s largest social network, Facebook has created and sponsored a wide range of open source projects. |
| | | As a multinational technology company, Google has created and sponsored over 2,000 open source projects in a wide range of areas, from programming languages to UI frameworks to machine learning. |
| | | NumFOCUS is a 501(c)(3) public charity founded in 2012 to provide a fiscal umbrella for many open source software projects that have become essential for science and research. NumFOCUS-sponsored projects benefit from a range of services, including fiscal, legal, and operational support. |
Latest News
Date |
Topic |
Description |
---|---|---|
2022-07-15 |
Cortex 1.13.0 This release contains 112 contributions from 51 contributors. Thank you! Some notable new features in … |
|
2022-07-14 |
v2.10.0 Added metric FBetaVerboseMeasure which extends FBetaMeasure to ensure compatibility with logging plugins and … |
|
2022-07-12 |
v3.4.0 spaCy v3.4 brings typing and speed improvements along with new vectors for English CNN pipelines and new … |
|
2022-07-04 |
v0.2.0 Added 3 popular model architectures from literature (GATv2, GraphSAGE, and GCN) and a commonly-used MPNN … |
|
2022-07-01 |
v2.3 Add the --instance-id flag to influxd runtime to add the _instance_id tag to remote replications metrics. Helps … |
|
2022-06-28 |
PyTorch 1.12: TorchArrow, Functional API for Modules and nvFuser, are now available. We are excited to announce the … |
|
2022-06-24 |
v1.1 Minimum required version of Python is now 3.7; Removed dependency on pystan==2.19.1.1, which is no longer … |
|
2022-06-23 |
1.23.0 The NumPy 1.23.0 release continues the ongoing work to improve the handling and promotion of dtypes, increase … |
|
2022-06-22 |
0.20.0 Minor release - improvements to README |
|
2022-06-21 |
v0.25.0 Reducescatter for NCCL, MPI and Gloo, AMD GPU XLA Op implementation, Spark Estimator improvements, … |
|
2022-06-20 |
3.1.0 Over forty enhancements and bug fixes in the 3.1 release. |
|
2022-06-13 |
v0.12.0 Time series classification: deep learning based algorithms, port of sktime-dl into sktime; forecasting data … |