Awesome Software

Useful packages written in any framework / language. For Python-specific packages, see Python.

Big data

  • forklift — Toolkit for working with Hadoop Sequence files.

Containers

  • Podman — Podman is a daemonless container engine for developing, managing, and running OCI Containers on your Linux System. Containers can either be run as root or in rootless mode. Simply put: alias docker=podman.
  • Singularity — Singularity is a container solution with a focus on building reproducible software stacks and running them most efficiently on existing HPC, scientific, compute farm and even enterprise architectures.
  • Docker — Docker makes deploying your entire development environment easier and more portable than many other container tools do.

Data storage

Data pipelines

  • Pachyderm — A Containerized, Version-Controlled Data Lake.
  • Apache Beam
  • AWS Glue
  • Google Cloud Dataflow

Developer tools

  • PagerDuty
  • Sentry

Shell

  • ag — A code-searching tool similar to ack, but faster.
  • fwupdmgr — CLI tool for getting firmware updates.
  • mitmproxy — An interactive TLS-capable intercepting HTTP proxy for penetration testers and software developers.
  • sshuttle — Transparent proxy server that works as a poor man's VPN.
  • xpra - Screen for X11.

MLOps

  • argo — Argo Workflows is an open source container-native workflow engine for orchestrating parallel jobs on Kubernetes. Argo Workflows is implemented as a Kubernetes CRD.
  • GitLab CI — The Kubernetes executor currently ignores the Dockerfile entrypoint (see gitlab-org/gitlab-runner#4125).
  • kubeflow — Kubeflow is a free and open-source machine learning platform designed to orchestrate machine learning pipelines on Kubernetes. Sponsored by Google. Kubeflow Pipelines builds on top of Argo.
  • MLflow — An open source platform for the machine learning lifecycle. Sponsored by Databricks and Microsoft. MLflow pipelines build on top of the Kubernetes Jobs API. Has more mature experiment tracking than Kubeflow.
  • Tecton — Feature store for enterprise data science workloads.
  • Tekton — Tekton is a powerful and flexible open-source framework for creating CI/CD systems, allowing developers to build, test, and deploy across cloud providers and on-premise systems.

Web

  • h5ai - Modern HTTP web server index.
  • monolith - Save an entire web page as a single HTML document.
  • http-prompt
  • httpie
  • mycli — MySQL CLI.
  • ngrok