Useful packages written in any framework / language. For Python-specific packages, see Python.
forklift — Toolkit for working with Hadoop Sequence files.
- Podman — A daemonless container engine for developing, managing, and running OCI containers on a Linux system. Containers can be run either as root or in rootless mode.
- Singularity — Singularity is a container solution with a focus on building reproducible software stacks and running them most efficiently on existing HPC, scientific, compute farm and even enterprise architectures.
- Docker — Packages an application together with its dependencies into a container image, which makes deploying your entire development environment easier and more portable.
- Pachyderm — A Containerized, Version-Controlled Data Lake.
- Apache Beam
- AWS Glue
- Google Cloud Dataflow
ag — A code-searching tool similar to ack, but faster.
fwupdmgr — CLI tool for getting firmware updates.
mitmproxy — An interactive TLS-capable intercepting HTTP proxy for penetration testers and software developers.
sshuttle — Transparent proxy server that works as a poor man's VPN.
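xpra — "Screen for X11": a persistent remote display server that lets you detach from and reattach to running X11 applications, much as GNU screen does for terminal sessions.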
argo — Argo Workflows is an open source container-native workflow engine for orchestrating parallel jobs on Kubernetes. Argo Workflows is implemented as a Kubernetes CRD.
GitLab CI — The Kubernetes executor currently ignores the Dockerfile entrypoint (see
kubeflow — Kubeflow is a free and open-source machine learning platform designed to orchestrate machine learning pipelines on Kubernetes. Sponsored by Google. Kubeflow Pipelines builds on top of Argo.
MLflow — An open source platform for the machine learning lifecycle. Sponsored by Databricks and Microsoft. MLflow pipelines build on top of the Kubernetes Jobs API. Has more mature experiment tracking than
Tecton — Feature store for enterprise data science workloads.
Tekton — Tekton is a powerful and flexible open-source framework for creating CI/CD systems, allowing developers to build, test, and deploy across cloud providers and on-premises systems.