1. Machine Learning - Part I

Reproduced from GitHub https://github.com/

A curated list of awesome machine learning frameworks, libraries and software (by language). Inspired by awesome-php.

Table of Contents

Frameworks and Libraries

Tools

Credits

APL

General-Purpose Machine Learning

C

General-Purpose Machine Learning

Computer Vision

C++

Computer Vision

General-Purpose Machine Learning

Natural Language Processing

Speech Recognition

Sequence Analysis

Gesture Detection

Common Lisp

General-Purpose Machine Learning

Clojure

Natural Language Processing

General-Purpose Machine Learning

Deep Learning

Data Analysis

Data Visualization

Interop

Misc

Extra

Crystal

General-Purpose Machine Learning

Elixir

General-Purpose Machine Learning

Natural Language Processing

Erlang

General-Purpose Machine Learning

Fortran

General-Purpose Machine Learning

Data Analysis / Data Visualization

Go

Natural Language Processing

General-Purpose Machine Learning

Spatial analysis and geometry

Data Analysis / Data Visualization

Computer vision

Reinforcement learning

Haskell

General-Purpose Machine Learning

Java

Natural Language Processing

General-Purpose Machine Learning

Speech Recognition

Data Analysis / Data Visualization

Deep Learning

Javascript

Natural Language Processing

Data Analysis / Data Visualization

General-Purpose Machine Learning

Misc

Demos and Scripts

Julia

General-Purpose Machine Learning

Natural Language Processing

Data Analysis / Data Visualization

Misc Stuff / Presentations

Lua

General-Purpose Machine Learning

Demos and Scripts

Matlab

Computer Vision

Natural Language Processing

General-Purpose Machine Learning

Data Analysis / Data Visualization

.NET

Computer Vision

Natural Language Processing

General-Purpose Machine Learning

Data Analysis / Data Visualization

Objective C

General-Purpose Machine Learning

OCaml

General-Purpose Machine Learning

Perl

Data Analysis / Data Visualization

General-Purpose Machine Learning

Perl 6

Data Analysis / Data Visualization

General-Purpose Machine Learning

PHP

Natural Language Processing

General-Purpose Machine Learning

Python

Computer Vision

Natural Language Processing

General-Purpose Machine Learning

Data Analysis / Data Visualization

Misc Scripts / iPython Notebooks / Codebases

Neural Networks

Kaggle Competition Source Code

Reinforcement Learning

Ruby

Natural Language Processing

General-Purpose Machine Learning

Data Analysis / Data Visualization

Misc

Rust

General-Purpose Machine Learning

R

General-Purpose Machine Learning

Data Manipulation | Data Analysis | Data Visualization

SAS

General-Purpose Machine Learning

Data Analysis / Data Visualization

Natural Language Processing

Demos and Scripts

Scala

Natural Language Processing

Data Analysis / Data Visualization

General-Purpose Machine Learning

Scheme

Neural Networks

Swift

General-Purpose Machine Learning

TensorFlow

General-Purpose Machine Learning

Tools

Neural Networks

Misc

Credits

 


2. Machine Learning with Python - Part II

This curated list contains 840 awesome open-source projects with a total of 2.8M stars, grouped into 32 categories. All projects are ranked by a project-quality score, which is calculated from various metrics automatically collected from GitHub and different package managers. If you would like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Contributions are very welcome!

Machine Learning Frameworks

General-purpose machine learning and deep learning frameworks.

Tensorflow (🥇44 · ⭐ 160K) - An Open Source Machine Learning Framework for Everyone. Apache-2
PyTorch (🥇39 · ⭐ 47K) - Tensors and Dynamic neural networks in Python with strong GPU.. BSD-3
PySpark (🥇38 · ⭐ 29K) - Apache Spark Python API. Apache-2
scikit-learn (🥇37 · ⭐ 45K) - scikit-learn: machine learning in Python. BSD-3
StatsModels (🥇36 · ⭐ 6.1K) - Statsmodels: statistical modeling and econometrics in Python. BSD-3
Keras (🥇35 · ⭐ 51K) - Deep Learning for humans. MIT
XGBoost (🥇35 · ⭐ 21K) - Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or.. Apache-2
LightGBM (🥇35 · ⭐ 12K) - A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT,.. MIT
MXNet (🥈34 · ⭐ 19K) - Lightweight, Portable, Flexible Distributed/Mobile Deep Learning.. Apache-2
Theano (🥈34 · ⭐ 9.4K) - Theano is a Python library that allows you to define, optimize, and.. BSD-3
PyFlink (🥈33 · ⭐ 16K) - Apache Flink Python API. Apache-2
pytorch-lightning (🥈33 · ⭐ 12K) - The lightweight PyTorch wrapper for high-performance.. Apache-2
Fastai (🥈32 · ⭐ 21K) - The fastai deep learning library. Apache-2
jax (🥈32 · ⭐ 12K) - Composable transformations of Python+NumPy programs: differentiate,.. Apache-2
Thinc (🥈32 · ⭐ 2.2K) - A refreshing functional take on deep learning, compatible with your favorite.. MIT
Catboost (🥈31 · ⭐ 5.8K) - A fast, scalable, high performance Gradient Boosting on Decision.. Apache-2
Chainer (🥈31 · ⭐ 5.5K) - A flexible framework of neural networks for deep learning. MIT
PaddlePaddle (🥈30 · ⭐ 15K) - PArallel Distributed Deep LEarning: Machine Learning.. Apache-2
TFlearn (🥈30 · ⭐ 9.5K) - Deep learning library featuring a higher-level API for TensorFlow. MIT
Vowpal Wabbit (🥈30 · ⭐ 7.5K) - Vowpal Wabbit is a machine learning system which pushes the.. BSD-3
Turi Create (🥈28 · ⭐ 10K) - Turi Create simplifies the development of custom machine learning.. BSD-3
Sonnet (🥈28 · ⭐ 8.8K) - TensorFlow-based neural network library. Apache-2
dyNET (🥈28 · ⭐ 3.2K) - DyNet: The Dynamic Neural Network Toolkit. Apache-2
tensorpack (🥉27 · ⭐ 6K · 📉) - A Neural Net Training Interface on TensorFlow, with focus.. Apache-2
Ignite (🥉27 · ⭐ 3.5K) - High-level library to help with training and evaluating neural.. BSD-3
Jina (🥉27 · ⭐ 2.5K) - An easier way to build neural search on the cloud. Apache-2
Flax (🥉27 · ⭐ 1.5K) - Flax is a neural network ecosystem for JAX that is designed for.. Apache-2 jax
CNTK (🥉26 · ⭐ 17K · 💤) - Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit. MIT
skorch (🥉26 · ⭐ 3.8K) - A scikit-learn compatible neural network library that wraps.. BSD-3
mlpack (🥉26 · ⭐ 3.6K) - mlpack: a scalable C++ machine learning library --. BSD-3
Ludwig (🥉25 · ⭐ 7.6K) - Ludwig is a toolbox that allows to train and evaluate deep.. Apache-2
xLearn (🥉25 · ⭐ 2.9K · 💤) - High performance, easy-to-use, and scalable machine learning (ML).. Apache-2
Neural Network Libraries (🥉24 · ⭐ 2.4K) - Neural Network Libraries. Apache-2
ktrain (🥉24 · ⭐ 760) - ktrain is a Python library that makes deep learning and AI more.. Apache-2
tensorflow-upstream (🥉24 · ⭐ 550) - TensorFlow ROCm port. Apache-2
SHOGUN (🥉23 · ⭐ 2.8K) - Unified and efficient Machine Learning. BSD-3
einops (🥉23 · ⭐ 2.6K) - Deep learning operations reinvented (for pytorch, tensorflow, jax and.. MIT
fklearn (🥉23 · ⭐ 1.3K) - fklearn: Functional Machine Learning. Apache-2
mace (🥉21 · ⭐ 4.3K) - MACE is a deep learning inference framework optimized for mobile.. Apache-2
Neural Tangents (🥉21 · ⭐ 1.3K) - Fast and Easy Infinite Neural Networks in Python. Apache-2
ThunderSVM (🥉20 · ⭐ 1.3K) - ThunderSVM: A Fast SVM Library on GPUs and CPUs. Apache-2
Haiku (🥉20 · ⭐ 1K) - JAX-based neural network library. Apache-2
Torchbearer (🥉20 · ⭐ 590) - torchbearer: A model fitting library for PyTorch. MIT
Objax (🥉19 · ⭐ 580) - Objax is a machine learning framework that provides an Object.. Apache-2 jax
elegy (🥉17 · ⭐ 180) - Elegy is a framework-agnostic Trainer interface for the Jax.. Apache-2 jax
ThunderGBM (🥉16 · ⭐ 580) - ThunderGBM: Fast GBDTs and Random Forests on GPUs. Apache-2
NeoML (🥉13 · ⭐ 570) - Machine learning framework for both deep learning and traditional.. Apache-2

Data Visualization

General-purpose and task-specific data visualization libraries.

Matplotlib (🥇41 · ⭐ 13K) - matplotlib: plotting with Python. Python-2.0
Seaborn (🥇37 · ⭐ 8.2K) - Statistical data visualization using matplotlib. BSD-3
Plotly (🥇35 · ⭐ 9.1K) - The interactive graphing library for Python (includes Plotly Express). MIT
dash (🥇34 · ⭐ 14K) - Analytical Web Apps for Python, R, Julia, and Jupyter. No JavaScript Required. MIT
Bokeh (🥇33 · ⭐ 15K) - Interactive Data Visualization in the browser, from Python. BSD-3
pyecharts (🥈31 · ⭐ 11K) - Python Echarts Plotting Library. MIT
wordcloud (🥈31 · ⭐ 7.9K) - A little word cloud generator in Python. MIT
Altair (🥈31 · ⭐ 6.5K) - Declarative statistical visualization library for Python. BSD-3
UMAP (🥈30 · ⭐ 4.6K) - Uniform Manifold Approximation and Projection. BSD-3
bqplot (🥈30 · ⭐ 3K) - Plotting library for IPython/Jupyter notebooks. Apache-2
PyQtGraph (🥈30 · ⭐ 2.3K) - Fast data visualization and GUI tools for scientific / engineering.. MIT
pandas-profiling (🥈29 · ⭐ 6.9K) - Create HTML profiling reports from pandas DataFrame.. MIT
VisPy (🥈29 · ⭐ 2.6K) - High-performance interactive 2D/3D data visualization library. BSD-3
Graphviz (🥈29 · ⭐ 940) - Simple Python interface for Graphviz. MIT
datashader (🥈28 · ⭐ 2.4K) - Quickly and accurately render even the largest data. BSD-3
HoloViews (🥈28 · ⭐ 1.8K) - With Holoviews, your data visualizes itself. BSD-3
Cufflinks (🥈27 · ⭐ 2.1K) - Productivity Tools for Plotly + Pandas. MIT
PyVista (🥈27 · ⭐ 720) - 3D plotting and mesh analysis through a streamlined interface for the.. MIT
data-validation (🥈27 · ⭐ 530) - Library for exploring and validating machine learning.. Apache-2
Perspective (🥉26 · ⭐ 3.3K) - Streaming pivot visualization via WebAssembly. Apache-2
missingno (🥉26 · ⭐ 2.7K) - Missing data visualization module for Python. MIT
pythreejs (🥉26 · ⭐ 710) - A Jupyter - Three.js bridge. BSD-3
Facets Overview (🥉25 · ⭐ 6.5K) - Visualizations for machine learning datasets. Apache-2
Chartify (🥉25 · ⭐ 2.8K) - Python library that makes it easy for data scientists to create.. Apache-2
HyperTools (🥉25 · ⭐ 1.6K) - A Python toolbox for gaining geometric insights into high-dimensional.. MIT
hvPlot (🥉25 · ⭐ 360) - A high-level plotting API for pandas, dask, xarray, and networkx built on.. BSD-3
openTSNE (🥉24 · ⭐ 760) - Extensible, parallel implementations of t-SNE. BSD-3
PandasGUI (🥉23 · ⭐ 2.1K) - A GUI for Pandas DataFrames. MIT
python-ternary (🥉23 · ⭐ 400) - Ternary plotting library for python with matplotlib. MIT
D-Tale (🥉22 · ⭐ 2.1K) - Visualizer for pandas data structures. ❗️LGPL-2.1
Multicore-TSNE (🥉22 · ⭐ 1.5K · 💤) - Parallel t-SNE implementation with Python and Torch.. BSD-3
Pandas-Bokeh (🥉22 · ⭐ 630) - Bokeh Plotting Backend for Pandas and GeoPandas. MIT
vega (🥉22 · ⭐ 300) - IPython/Jupyter notebook module for Vega and Vega-Lite. BSD-3
Sweetviz (🥉20 · ⭐ 1.4K) - Visualize and compare datasets, target values and associations, with one.. MIT
lets-plot (🥉20 · ⭐ 520) - An open-source plotting library for statistical data. MIT
joypy (🥉20 · ⭐ 320) - Joyplots in Python with matplotlib & pandas. MIT
HiPlot (🥉19 · ⭐ 2K) - HiPlot makes understanding high dimensional data easy. MIT
animatplot (🥉19 · ⭐ 360) - A python package for animating plots build on matplotlib. MIT
PyWaffle (🥉18 · ⭐ 400 · 💤) - Make Waffle Charts in Python. MIT
AutoViz (🥉18 · ⭐ 310) - Automatically Visualize any dataset, any size with a single line of.. Apache-2
FiftyOne (🥉18 · ⭐ 220) - Visualize, create, and debug image and video datasets.. Apache-2
data-describe (🥉14 · ⭐ 270) - datadescribe: Pythonic EDA Accelerator for Data Science. Apache-2
nx-altair (🥉14 · ⭐ 160 · 💤) - Draw interactive NetworkX graphs with Altair. MIT

Text Data & NLP

Libraries for processing, cleaning, manipulating, and analyzing text data as well as libraries for NLP tasks such as language detection, fuzzy matching, classification, seq2seq learning, conversational AI, keyword extraction, and translation.

spaCy (🥇37 · ⭐ 20K) - Industrial-strength Natural Language Processing (NLP) in Python. MIT
transformers (🥇36 · ⭐ 42K) - Transformers: State-of-the-art Natural Language.. Apache-2
gensim (🥇35 · ⭐ 12K) - Topic Modelling for Humans. ❗️LGPL-2.1
nltk (🥇34 · ⭐ 9.7K) - Suite of libraries and programs for symbolic and statistical natural.. Apache-2
AllenNLP (🥇32 · ⭐ 9.8K) - An open-source NLP research library, built on PyTorch. Apache-2
fairseq (🥇31 · ⭐ 11K) - Facebook AI Research Sequence-to-Sequence Toolkit written in Python. MIT
ChatterBot (🥇31 · ⭐ 11K · 💤) - ChatterBot is a machine learning, conversational dialog engine.. BSD-3
sentencepiece (🥇31 · ⭐ 4.9K) - Unsupervised text tokenizer for Neural Network-based text.. Apache-2
fastText (🥇30 · ⭐ 22K · 💤) - Library for fast text representation and classification. MIT
flair (🥇30 · ⭐ 10K) - A very simple framework for state-of-the-art Natural Language Processing.. MIT
snowballstemmer (🥇30 · ⭐ 480) - Snowball compiler and stemming algorithms. BSD-3
TextBlob (🥈29 · ⭐ 7.6K) - Simple, Pythonic, text processing--Sentiment analysis, part-of-speech.. MIT
torchtext (🥈29 · ⭐ 2.7K · 📉) - Data loaders and abstractions for text and NLP. BSD-3
Rasa (🥈28 · ⭐ 11K) - Open source machine learning framework to automate text- and voice-.. Apache-2
OpenNMT (🥈28 · ⭐ 4.9K) - Open Source Neural Machine Translation in PyTorch. MIT
sentence-transformers (🥈28 · ⭐ 4.4K) - Sentence Embeddings with BERT & XLNet. Apache-2
Tokenizers (🥈28 · ⭐ 4.3K) - Fast State-of-the-Art Tokenizers optimized for Research and.. Apache-2
Dedupe (🥈28 · ⭐ 2.9K) - A python library for accurate and scalable fuzzy matching, record.. MIT
phonenumbers (🥈28 · ⭐ 2.6K) - Python port of Google's libphonenumber. Apache-2
DeepPavlov (🥈26 · ⭐ 5.1K) - An open source library for deep learning end-to-end dialog.. Apache-2
ftfy (🥈26 · ⭐ 2.9K) - Fixes mojibake and other glitches in Unicode text, after the fact. MIT
GluonNLP (🥈26 · ⭐ 2.2K) - Toolkit that enables easy text preprocessing, datasets loading.. Apache-2
TextDistance (🥈26 · ⭐ 1.9K) - Compute distance between sequences. 30+ algorithms, pure python.. MIT
textacy (🥈26 · ⭐ 1.6K) - NLP, before and after spaCy. Apache-2
jellyfish (🥈26 · ⭐ 1.4K) - a python library for doing approximate and phonetic matching of.. BSD-2
TensorFlow Text (🥈26 · ⭐ 700) - Making text a first-class citizen in TensorFlow. Apache-2
CLTK (🥈26 · ⭐ 650) - The Classical Language Toolkit. MIT
inflect (🥈26 · ⭐ 490) - Correctly generate plurals, ordinals, indefinite articles; convert numbers.. MIT
ParlAI (🥈25 · ⭐ 7K) - A framework for training and evaluating AI models on a variety of.. MIT
PyText (🥈25 · ⭐ 6.1K) - A natural language modeling framework based on PyTorch. BSD-3
stanza (🥈25 · ⭐ 5.3K · 📉) - Official Stanford NLP Python Library for Many Human Languages. Apache-2
vaderSentiment (🥈25 · ⭐ 2.9K · 💤) - VADER Sentiment Analysis. VADER (Valence Aware Dictionary.. MIT
spark-nlp (🥈25 · ⭐ 2K) - State of the Art Natural Language Processing. Apache-2
haystack (🥈25 · ⭐ 1.5K) - End-to-end Python framework for building natural language search.. Apache-2
pyahocorasick (🥈25 · ⭐ 590) - Python module (C extension and plain python) implementing Aho-.. BSD-3
T5 (🥉24 · ⭐ 3.2K) - Code for the paper Exploring the Limits of Transfer Learning with a.. Apache-2
Sumy (🥉24 · ⭐ 2.5K) - Module for automatic summarization of text documents and HTML pages. Apache-2
fastNLP (🥉24 · ⭐ 2K) - fastNLP: A Modularized and Extensible NLP Framework. Currently still.. Apache-2
pytorch-nlp (🥉24 · ⭐ 1.9K) - Basic Utilities for PyTorch Natural Language Processing (NLP). BSD-3
scattertext (🥉24 · ⭐ 1.5K · 📈) - Beautiful visualizations of how language differs among.. Apache-2
sense2vec (🥉24 · ⭐ 1.2K) - Contextually-keyed word vectors. MIT
spacy-transformers (🥉24 · ⭐ 920) - Use pretrained transformers like BERT, XLNet and GPT-2.. MIT spacy
SciSpacy (🥉24 · ⭐ 850) - A full spaCy pipeline and models for scientific/biomedical documents. Apache-2
Ciphey (🥉23 · ⭐ 6.5K) - Automatically decrypt encryptions without knowing the key or cipher,.. MIT
flashtext (🥉23 · ⭐ 4.7K · 💤) - Extract Keywords from sentence or Replace keywords in sentences. MIT
neuralcoref (🥉23 · ⭐ 2.2K) - Fast Coreference Resolution in spaCy with Neural Networks. MIT
pySBD (🥉23 · ⭐ 290) - pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence.. MIT
textgenrnn (🥉22 · ⭐ 4.3K · 💤) - Easily train your own text-generating neural network of any.. MIT
fast-bert (🥉22 · ⭐ 1.5K) - Super easy library for BERT based NLP models. Apache-2
PyTextRank (🥉22 · ⭐ 1.5K · 📉) - Python implementation of TextRank for phrase extraction and.. MIT
FARM (🥉22 · ⭐ 1.1K) - Fast & easy transfer learning for NLP. Harvesting language models.. Apache-2
DeepMatcher (🥉21 · ⭐ 3.5K · 💤) - Python package for performing Entity and Text Matching using.. BSD-3
gpt-2-simple (🥉21 · ⭐ 2.5K) - Python package to easily retrain OpenAI's GPT-2 text-.. MIT
Texar (🥉21 · ⭐ 2.1K · 💤) - Toolkit for Machine Learning, Natural Language Processing, and.. Apache-2
NLP Architect (🥉20 · ⭐ 2.6K) - A model library for exploring state-of-the-art deep learning.. Apache-2
NeMo (🥉20 · ⭐ 2.5K) - NeMo: a toolkit for conversational AI. Apache-2
DELTA (🥉20 · ⭐ 1.4K) - DELTA is a deep learning based natural language and speech.. Apache-2
Sockeye (🥉20 · ⭐ 990) - Sequence-to-sequence framework with a focus on Neural Machine.. Apache-2
YouTokenToMe (🥉20 · ⭐ 720) - Unsupervised text tokenizer focused on computational efficiency. MIT
finetune (🥉20 · ⭐ 630) - Scikit-learn style model finetuning for NLP. MPL-2.0
Texthero (🥉19 · ⭐ 2.1K) - Text preprocessing, representation and visualization from zero to hero. MIT
textpipe (🥉19 · ⭐ 280) - Textpipe: clean and extract metadata from text. MIT
Kashgari (🥉18 · ⭐ 2K) - Kashgari is a production-level NLP Transfer learning framework.. Apache-2
Camphr (🥉18 · ⭐ 330) - spaCy plugin for Transformers, Udify, ELmo, etc. Apache-2 spacy
skift (🥉18 · ⭐ 210) - scikit-learn wrappers for Python fastText. MIT
Translate (🥉15 · ⭐ 680) - Translate - a PyTorch Language Library. BSD-3
VizSeq (🥉15 · ⭐ 310) - An Analysis Toolkit for Natural Language Generation (Translation,.. MIT
OpenNRE (🥉14 · ⭐ 3K) - An Open-Source Package for Neural Relation Extraction (NRE). MIT
TransferNLP (🥉14 · ⭐ 290 · 💤) - NLP library designed for reproducible experimentation.. MIT
NeuralQA (🥉14 · ⭐ 180) - NeuralQA: A Usable Library for Question Answering on Large Datasets with.. MIT
textvec (🥉13 · ⭐ 170) - Text vectorization tool to outperform TFIDF for classification tasks. MIT

Image Data

Libraries for image & video processing, manipulation, and augmentation as well as libraries for computer vision tasks such as facial recognition, object detection, and classification.

Pillow (🥇39 · ⭐ 8.3K) - The friendly PIL fork (Python Imaging Library). ❗️PIL
torchvision (🥇36 · ⭐ 8.6K) - Datasets, Transforms and Models specific to Computer Vision. BSD-3
scikit-image (🥇33 · ⭐ 4.2K) - Image processing in Python. BSD-2
imgaug (🥇31 · ⭐ 11K · 💤) - Image augmentation for machine learning experiments. MIT
imageio (🥇31 · ⭐ 840) - Python library for reading and writing image data. BSD-2
opencv-python (🥈30 · ⭐ 1.8K) - Automated CI toolchain to produce precompiled opencv-python,.. MIT
Wand (🥈30 · ⭐ 1.1K) - The ctypes-based simple ImageMagick binding for Python. MIT
Face Recognition (🥈29 · ⭐ 39K) - The world's simplest facial recognition api for Python.. MIT
MoviePy (🥈29 · ⭐ 7.3K) - Video editing with Python. MIT
PyTorch Image Models (🥈28 · ⭐ 7.9K · 📈) - PyTorch image models, scripts, pretrained weights --.. Apache-2
Albumentations (🥈28 · ⭐ 7.5K) - Fast image augmentation library and easy to use wrapper.. MIT
Kornia (🥈28 · ⭐ 3.7K) - Open Source Differentiable Computer Vision Library for PyTorch. Apache-2
imutils (🥈28 · ⭐ 3.6K) - A series of convenience functions to make basic image processing.. MIT
ImageHash (🥈28 · ⭐ 1.9K) - A Python Perceptual Image Hashing Module. BSD-2
imageai (🥈27 · ⭐ 6K) - A python library built to empower developers to build applications and.. MIT
GluonCV (🥈27 · ⭐ 4.6K) - Gluon CV Toolkit. Apache-2
detectron2 (🥈26 · ⭐ 15K) - Detectron2 is FAIR's next-generation platform for object.. Apache-2
InsightFace (🥈26 · ⭐ 8.7K) - Face Analysis Project on MXNet. MIT
MMDetection (🥈25 · ⭐ 14K) - OpenMMLab Detection Toolbox and Benchmark. Apache-2
PyTorch3D (🥈25 · ⭐ 4.6K) - PyTorch3D is FAIR's library of reusable components for deep.. MIT
facenet-pytorch (🥈25 · ⭐ 1.9K) - Pretrained Pytorch face detection (MTCNN) and recognition.. MIT
mahotas (🥈25 · ⭐ 670) - Computer Vision in Python. MIT
Augmentor (🥉24 · ⭐ 4.3K · 💤) - Image augmentation library in Python for machine learning. MIT
mtcnn (🥉24 · ⭐ 1.4K) - MTCNN face detection implementation for TensorFlow, as a PIP package. MIT
Face Alignment (🥉23 · ⭐ 4.7K) - 2D and 3D Face alignment library build using pytorch. BSD-3
CellProfiler (🥉23 · ⭐ 550) - An open-source application for biological image analysis. BSD-3
segmentation_models (🥉22 · ⭐ 3K · 💤) - Segmentation models with pretrained backbones. Keras.. MIT
vidgear (🥉22 · ⭐ 1.7K) - High-performance cross-platform Video Processing Python framework.. Apache-2
pyvips (🥉22 · ⭐ 300) - python binding for libvips using cffi. MIT
Image Deduplicator (🥉21 · ⭐ 3.4K) - Finding duplicate images made easy!. Apache-2
Image Super-Resolution (🥉21 · ⭐ 2.6K) - Super-scale your images and run experiments with.. Apache-2
tensorflow-graphics (🥉21 · ⭐ 2.4K) - TensorFlow Graphics: Differentiable Graphics Layers.. Apache-2
Classy Vision (🥉21 · ⭐ 1.2K) - An end-to-end PyTorch framework for image and video.. MIT
Torch Points 3D (🥉21 · ⭐ 1.1K) - Pytorch framework for doing deep learning on point clouds. BSD-3
MMF (🥉20 · ⭐ 4.2K) - A modular framework for vision & language multimodal research from.. BSD-3
image-match (🥉20 · ⭐ 2.5K) - Quickly search over billions of images. Apache-2
nude.py (🥉20 · ⭐ 790) - Nudity detection with Python. MIT
Caer (🥉20 · ⭐ 450) - A lightweight Computer Vision library. Scale your models, not boilerplate. MIT
vit-pytorch (🥉18 · ⭐ 2.9K · 🐣) - Implementation of Vision Transformer, a simple way to.. MIT
Norfair (🥉18 · ⭐ 920) - Lightweight Python library for adding real-time 2D object tracking to.. BSD-3
PaddleDetection (🥉17 · ⭐ 2.3K) - Object detection and instance segmentation toolkit.. Apache-2
lightly (🥉17 · ⭐ 430 · 🐣) - A python library for self-supervised learning on images. MIT
pycls (🥉15 · ⭐ 1.5K) - Codebase for Image Classification Research, written in PyTorch. MIT
DE⫶TR (🥉14 · ⭐ 6.4K) - End-to-End Object Detection with Transformers. Apache-2
PySlowFast (🥉14 · ⭐ 3.4K) - PySlowFast: video understanding codebase from FAIR for.. Apache-2

Graph Data

Libraries for graph processing, clustering, embedding, and machine learning tasks.

networkx (🥇33 · ⭐ 8.8K · 📉) - Network Analysis in Python. BSD-3
PyTorch Geometric (🥇29 · ⭐ 10K · 📈) - Geometric Deep Learning Extension Library for PyTorch. MIT
dgl (🥈26 · ⭐ 6.8K) - Python package built to ease deep learning on graph, on top of existing.. Apache-2
StellarGraph (🥈25 · ⭐ 1.8K) - StellarGraph - Machine Learning on Graphs. Apache-2
Spektral (🥈23 · ⭐ 1.7K) - Graph Neural Networks with Keras and Tensorflow 2. MIT
ogb (🥈22 · ⭐ 770) - Benchmark datasets, data loaders, and evaluators for graph machine learning. MIT
Node2Vec (🥈22 · ⭐ 650) - Implementation of the node2vec algorithm. MIT
torch-cluster (🥈21 · ⭐ 340) - PyTorch Extension Library of Optimized Graph Cluster.. MIT
AmpliGraph (🥈20 · ⭐ 1.4K · 💤) - Python library for Representation Learning on Knowledge.. Apache-2
PyTorch-BigGraph (🥉19 · ⭐ 2.7K) - Generate embeddings from large-scale graph-structured.. BSD-3
PyKEEN (🥉19 · ⭐ 330) - A Python library for learning and evaluating knowledge graph embeddings. MIT
graph-nets (🥉18 · ⭐ 4.8K) - Build Graph Nets in Tensorflow. Apache-2
DeepGraph (🥉18 · ⭐ 230) - Analyze Data with Pandas-based Networks. Documentation:. BSD-3
Paddle Graph Learning (🥉17 · ⭐ 920) - Paddle Graph Learning (PGL) is an efficient and.. Apache-2
kglib (🥉16 · ⭐ 400) - Grakn Knowledge Graph Library (ML R&D). Apache-2
pytorch_geometric_temporal (🥉16 · ⭐ 370) - A Temporal Extension Library for PyTorch Geometric. MIT
GraphEmbedding (🥉15 · ⭐ 1.8K) - Implementation and experiments of graph embedding algorithms. MIT
Euler (🥉14 · ⭐ 2.5K · 💤) - A distributed graph deep learning framework. Apache-2
AutoGL (🥉14 · ⭐ 590 · 🐣) - An autoML framework & toolkit for machine learning on graphs. MIT
OpenKE (🥉13 · ⭐ 2.4K · 💤) - An Open-Source Package for Knowledge Embedding (KE). MIT
GraphVite (🥉13 · ⭐ 860) - GraphVite: A General and High-performance Graph Embedding System. Apache-2

Audio Data

Libraries for audio analysis, manipulation, transformation, and extraction, as well as speech recognition and music generation tasks.

DeepSpeech (🥇31 · ⭐ 17K) - DeepSpeech is an open source embedded (offline, on-device).. MPL-2.0
Pydub (🥇30 · ⭐ 5.2K · 📈) - Manipulate audio with a simple and easy high level interface. MIT
Magenta (🥈29 · ⭐ 16K) - Magenta: Music and Art Generation with Machine Intelligence. Apache-2
torchaudio (🥈29 · ⭐ 1.3K · 📈) - Data manipulation and transformation for audio signal.. BSD-2
librosa (🥈27 · ⭐ 4.3K) - Python library for audio and music analysis. ISC
audioread (🥈26 · ⭐ 360) - cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding.. MIT
spleeter (🥈25 · ⭐ 16K) - Deezer source separation library including pretrained models. MIT
pyAudioAnalysis (🥈25 · ⭐ 3.8K) - Python Audio Analysis Library: Feature Extraction,.. Apache-2
python-soundfile (🥈25 · ⭐ 370) - SoundFile is an audio library based on libsndfile, CFFI, and.. BSD-3
espnet (🥉24 · ⭐ 3.5K) - End-to-End Speech Processing Toolkit. Apache-2
python_speech_features (🥉23 · ⭐ 1.9K) - This library provides common speech features for ASR.. MIT
tinytag (🥉23 · ⭐ 440) - Read music meta data and length of MP3, OGG, OPUS, MP4, M4A, FLAC, WMA and.. MIT
Porcupine (🥉22 · ⭐ 2.4K) - On-device wake word detection powered by deep learning. Apache-2
DDSP (🥉22 · ⭐ 1.8K) - DDSP: Differentiable Digital Signal Processing. Apache-2
kapre (🥉21 · ⭐ 720) - kapre: Keras Audio Preprocessors. MIT
Dejavu (🥉20 · ⭐ 5.3K · 💤) - Audio fingerprinting and recognition in Python. MIT
TTS (🥉20 · ⭐ 3.3K) - Deep learning for Text to Speech (Discussion forum:.. MPL-2.0
Muda (🥉17 · ⭐ 180 · 💤) - A library for augmenting annotated audio data. ISC
Julius (🥉14 · ⭐ 180 · 🐣) - Fast PyTorch based DSP for audio and 1D signals. MIT

Geospatial Data

Libraries to load, process, analyze, and write geographic data as well as libraries for spatial analysis, map visualization, and geocoding.

pydeck (🥇33 · ⭐ 8.5K) - WebGL2 powered geospatial visualization layers. MIT
folium (🥇32 · ⭐ 5.2K) - Python Data. Leaflet.js Maps. MIT
geopy (🥇32 · ⭐ 3.2K) - Geocoding library for Python. MIT
Shapely (🥇32 · ⭐ 2.2K) - Manipulation and analysis of geometric objects. BSD-3
GeoPandas (🥈31 · ⭐ 2.5K) - Python tools for geographic data. BSD-3
pyproj (🥈31 · ⭐ 580 · 📈) - Python interface to PROJ (cartographic projections and coordinate.. MIT
Rasterio (🥈30 · ⭐ 1.4K) - Rasterio reads and writes geospatial raster datasets. BSD-3
Fiona (🥈30 · ⭐ 780) - Fiona reads and writes geographic data files. BSD-3
ipyleaflet (🥉28 · ⭐ 1.1K · 📈) - A Jupyter - Leaflet.js bridge. MIT
geojson (🥉26 · ⭐ 600) - Python bindings and utilities for GeoJSON. BSD-3
ArcGIS API (🥉25 · ⭐ 980) - Documentation and samples for ArcGIS API for Python. Apache-2
PySAL (🥉25 · ⭐ 830) - PySAL: Python Spatial Analysis Library Meta-Package. BSD-3
GeoViews (🥉22 · ⭐ 330) - Simple, concise geographical visualization in Python. BSD-3
EarthPy (🥉20 · ⭐ 230) - A package built to support working with spatial data using open source.. BSD-3
pymap3d (🥉19 · ⭐ 180) - pure-Python (Numpy optional) 3D coordinate conversions for geospace ecef.. BSD-2

Financial Data

Libraries for algorithmic stock/crypto trading, risk analytics, backtesting, technical analysis, and other tasks on financial data.

zipline (🥇30 · ⭐ 14K) - Zipline, a Pythonic Algorithmic Trading Library. Apache-2
yfinance (🥇30 · ⭐ 4.5K) - Yahoo! Finance market data downloader (+faster Pandas Datareader). Apache-2
Alpha Vantage (🥇27 · ⭐ 3.2K) - A python wrapper for Alpha Vantage API for financial data. MIT
ta (🥇27 · ⭐ 1.9K) - Technical Analysis Library using Pandas and Numpy. MIT
pyfolio (🥈26 · ⭐ 3.6K · 💤) - Portfolio and risk analytics in Python. Apache-2
empyrical (🥈25 · ⭐ 740) - Common financial risk and performance metrics. Used by zipline and.. Apache-2
Alphalens (🥈24 · ⭐ 1.8K · 💤) - Performance analysis of predictive (alpha) stock factors. Apache-2
IB-insync (🥈24 · ⭐ 1.3K) - Python sync/async framework for Interactive Brokers API. BSD-2
bt (🥈24 · ⭐ 980) - bt - flexible backtesting for Python. MIT
ffn (🥈24 · ⭐ 800) - ffn - a financial function library for Python. MIT
Enigma Catalyst (🥉23 · ⭐ 2K) - An Algorithmic Trading Library for Crypto-Assets in Python. Apache-2
stockstats (🥉23 · ⭐ 730) - Supply a wrapper ``StockDataFrame`` based on the.. BSD-3
TensorTrade (🥉21 · ⭐ 3K) - An open source reinforcement learning framework for training,.. Apache-2
finmarketpy (🥉20 · ⭐ 2.5K) - Python library for backtesting trading strategies & analyzing.. Apache-2
Qlib (🥉19 · ⭐ 4.6K) - Qlib is an AI-oriented quantitative investment platform, which aims to.. MIT
tf-quant-finance (🥉19 · ⭐ 2.5K) - High-performance TensorFlow library for quantitative.. Apache-2
Crypto Signals (🥉18 · ⭐ 2.7K) - Github.com/CryptoSignal - #1 Quant Trading & Technical Analysis.. MIT

Time Series Data

Libraries for forecasting, anomaly detection, feature extraction, and machine learning on time-series and sequential data.

Prophet (🥇28 · ⭐ 12K) - Tool for producing high quality forecasts for time series data that has.. MIT
tsfresh (🥇27 · ⭐ 5.5K) - Automatic extraction of relevant features from time series:. MIT
sktime (🥇27 · ⭐ 3.7K) - A unified framework for machine learning with time series. BSD-3
pmdarima (🥈26 · ⭐ 830) - A statistical library designed to fill the void in Python's time series.. MIT
tslearn (🥈25 · ⭐ 1.5K) - A machine learning toolkit dedicated to time-series data. BSD-2
Streamz (🥈24 · ⭐ 920) - Real-time stream processing for python. BSD-3
GluonTS (🥈23 · ⭐ 1.8K) - Probabilistic time series modeling in Python. Apache-2
Darts (🥈22 · ⭐ 750) - A python library for easy manipulation and forecasting of time series. Apache-2
STUMPY (🥉20 · ⭐ 1.7K) - STUMPY is a powerful and scalable Python library for computing a Matrix.. BSD-3
pyts (🥉20 · ⭐ 890 · 💤) - A Python package for time series classification. BSD-3
pytorch-forecasting (🥉19 · ⭐ 830) - Time series forecasting with PyTorch. MIT
seglearn (🥉19 · ⭐ 430) - Python module for machine learning time series:. BSD-3
matrixprofile-ts (🥉18 · ⭐ 620 · 💤) - A Python library for detecting patterns and anomalies.. Apache-2
Auto TS (🥉18 · ⭐ 190) - Automatically build ARIMA, SARIMAX, VAR, FB Prophet and XGBoost.. Apache-2
ADTK (🥉17 · ⭐ 610 · 💤) - A Python toolkit for rule-based/unsupervised anomaly detection in time.. MPL-2.0
tick (🥉17 · ⭐ 320 · 💤) - Module for statistical learning, with a particular emphasis on time-.. BSD-3
atspy (🥉16 · ⭐ 340) - AtsPy: Automated Time Series Models in Python (by @firmai). MIT

Medical Data

Libraries for processing and analyzing medical data such as MRIs, EEGs, genomic data, and other medical imaging formats.

Lifelines (🥇29 · ⭐ 1.6K) - Survival analysis in Python. MIT
Nilearn (🥇29 · ⭐ 710) - Machine learning for NeuroImaging in Python. BSD-3
NIPYPE (🥇29 · ⭐ 560) - Workflows and interfaces for neuroimaging packages. Apache-2
NiBabel (🥇29 · ⭐ 390) - Python package to access a cacophony of neuro-imaging file formats. MIT
MNE (🥈27 · ⭐ 1.5K) - MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python. BSD-3
DIPY (🥈27 · ⭐ 390) - DIPY is the paragon 3D/4D+ imaging library in Python. Contains generic.. BSD-3
Hail (🥈24 · ⭐ 700) - Scalable genomic data analysis. MIT
NIPY (🥈23 · ⭐ 290) - Neuroimaging in Python FMRI analysis package. BSD-3
MONAI (🥉22 · ⭐ 1.8K) - AI Toolkit for Healthcare Imaging. Apache-2
DeepVariant (🥉21 · ⭐ 2.2K) - DeepVariant is an analysis pipeline that uses a deep neural.. BSD-3
NiftyNet (🥉21 · ⭐ 1.3K · 💤) - [unmaintained] An open-source convolutional neural.. Apache-2
Brainiak (🥉19 · ⭐ 230) - Brain Imaging Analysis Kit. Apache-2
Glow (🥉19 · ⭐ 160) - An open-source toolkit for large-scale genomic analysis. Apache-2
Medical Detection Toolkit (🥉12 · ⭐ 910 · 💤) - The Medical Detection Toolkit contains 2D + 3D.. Apache-2
MedicalNet (🥉11 · ⭐ 1.1K · 💤) - Many studies have shown that the performance on deep learning is.. MIT

Optical Character Recognition

Libraries for optical character recognition (OCR) and text extraction from images or videos.

Tesseract (🥇30 · ⭐ 3.5K) - Python-tesseract is an optical character recognition (OCR) tool.. Apache-2
EasyOCR (🥇28 · ⭐ 11K) - Ready-to-use OCR with 80+ supported languages and all popular writing.. Apache-2
OCRmyPDF (🥈27 · ⭐ 4K) - OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to.. MPL-2.0
tesserocr (🥈26 · ⭐ 1.4K) - A Python wrapper for the tesseract-ocr API. MIT
PaddleOCR (🥈24 · ⭐ 11K) - Awesome multilingual OCR toolkits based on PaddlePaddle.. Apache-2
attention-ocr (🥉21 · ⭐ 840) - A Tensorflow model for text recognition (CNN + seq2seq with.. MIT
keras-ocr (🥉20 · ⭐ 780) - A packaged and flexible version of the CRAFT text detector and.. MIT
calamari (🥉19 · ⭐ 790) - Line based ATR Engine based on OCRopy. Apache-2
doc2text (🥉18 · ⭐ 1.2K) - Detect text blocks and OCR poorly scanned PDFs in bulk. Python module.. MIT
Mozart (🥉10 · ⭐ 240 · 🐣) - An optical music recognition (OMR) system. Converts sheet.. Apache-2

Data Containers & Structures

General-purpose data containers & structures as well as utilities & extensions for pandas.

pandas (🥇40 · ⭐ 29K) - Flexible and powerful data analysis / manipulation library for.. BSD-3
numpy (🥇38 · ⭐ 17K) - The fundamental package for scientific computing with Python. BSD-3
h5py (🥇36 · ⭐ 1.5K) - HDF5 for Python -- The h5py package is a Pythonic interface to the HDF5.. BSD-3
Arrow (🥈35 · ⭐ 7.5K) - Apache Arrow is a cross-language development platform for in-memory.. Apache-2
xarray (🥈32 · ⭐ 2K) - N-D labeled arrays and datasets in Python. Apache-2
numexpr (🥈31 · ⭐ 1.6K) - Fast numerical array expression evaluator for Python, NumPy, PyTables,.. MIT
TinyDB (🥈29 · ⭐ 4.1K) - TinyDB is a lightweight document oriented database optimized for your.. MIT
Koalas (🥈29 · ⭐ 2.7K) - Koalas: pandas API on Apache Spark. Apache-2
Bottleneck (🥈29 · ⭐ 580) - Fast NumPy array functions written in C. BSD-2
Modin (🥈28 · ⭐ 5.8K) - Modin: Speed up your Pandas workflows by changing a single line of.. Apache-2
PyTables (🥈28 · ⭐ 1K) - A Python package to manage extremely large amounts of data. BSD-3
datasketch (🥉27 · ⭐ 1.4K) - MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog,.. MIT
zarr (🥉26 · ⭐ 660) - An implementation of chunked, compressed, N-dimensional arrays for Python. MIT
bcolz (🥉25 · ⭐ 910) - A columnar data container that can be compressed. BSD-3
Arctic (🥉24 · ⭐ 2.2K) - Arctic is a high performance datastore for numeric data. ❗️LGPL-2.1
swifter (🥉24 · ⭐ 1.6K) - A package which efficiently applies any function to a pandas.. MIT
Pandaral·lel (🥉24 · ⭐ 1.4K) - A simple and efficient tool to parallelize Pandas.. BSD-3
Vaex (🥉23 · ⭐ 5.9K) - Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualize and.. MIT
datatable (🥉21 · ⭐ 1.2K) - A Python package for manipulating 2-dimensional tabular data.. MPL-2.0
StaticFrame (🥉21 · ⭐ 220) - Immutable and grow-only Pandas-like DataFrames with a more explicit.. MIT
fletcher (🥉20 · ⭐ 210) - Pandas ExtensionDType/Array backed by Apache Arrow. MIT
Bounter (🥉17 · ⭐ 900 · 💤) - Efficient Counter that uses a limited (bounded) amount of memory.. MIT
PandaPy (🥉14 · ⭐ 470) - PandaPy has the speed of NumPy and the usability of Pandas 10x to 50x.. MIT

Data Loading & Extraction

Libraries for loading, collecting, and extracting data from a variety of data sources and formats.

Faker (🥇36 · ⭐ 12K) - Faker is a Python package that generates fake data for you. MIT
xlrd (🥇34 · ⭐ 1.9K) - Please use openpyxl where you can... BSD-3
xmltodict (🥇32 · ⭐ 4.3K · 💤) - Python module that makes working with XML feel like you are.. MIT
TensorFlow Datasets (🥇32 · ⭐ 2.7K) - TFDS is a collection of datasets ready to use with.. Apache-2
python-magic (🥇32 · ⭐ 1.8K) - A python wrapper for libmagic. MIT
Tablib (🥈31 · ⭐ 3.9K) - Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c. MIT
smart-open (🥈30 · ⭐ 2K) - Utils for streaming large files (S3, HDFS, gzip, bz2...). MIT
Datasets (🥈29 · ⭐ 6.9K) - The largest hub of ready-to-use NLP datasets for ML models with.. Apache-2
pandas-datareader (🥈29 · ⭐ 1.9K) - Extract data from a wide range of Internet sources.. BSD-3
snorkel (🥉28 · ⭐ 4.5K · 📈) - A system for quickly generating training data with weak.. Apache-2
csvkit (🥉28 · ⭐ 4.5K) - A suite of utilities for converting to and working with CSV, the king of.. MIT
tabulator-py (🥉26 · ⭐ 200) - Python library for reading and writing tabular data via streams. MIT
Intake (🥉25 · ⭐ 530) - Intake is a lightweight package for finding, investigating, loading and.. BSD-2
SDV (🥉21 · ⭐ 360) - Synthetic Data Generation for tabular, relational and time series data. MIT
datatest (🥉21 · ⭐ 240) - Tools for test driven data-wrangling and data validation. Apache-2

Web Scraping & Crawling

Libraries for web scraping, crawling, downloading, and mining.

🔗 best-of-web-python - Web Scraping (⭐ 1.1K · 🐣) - Collection of web-scraping and crawling libraries.

Data Pipelines & Streaming

Libraries for data batch- and stream-processing, workflow automation, job scheduling, and other data pipeline tasks.

Celery (🥇39 · ⭐ 17K · 📈) - Asynchronous task queue/job queue based on distributed message passing. BSD-3
Airflow (🥇36 · ⭐ 21K · 📈) - Platform to programmatically author, schedule, and monitor.. Apache-2
joblib (🥇35 · ⭐ 2.4K) - Computing with Python functions. BSD-3
rq (🥇33 · ⭐ 7.6K) - Simple job queues for Python. BSD-3
luigi (🥈32 · ⭐ 14K) - Luigi is a Python module that helps you build complex pipelines of batch.. Apache-2
Beam (🥈32 · ⭐ 4.6K) - Unified programming model to define and execute data processing.. Apache-2
Prefect (🥈30 · ⭐ 6K) - The easiest way to automate your data. Apache-2
dbt (🥈29 · ⭐ 2.7K) - dbt (data build tool) enables data analysts and engineers to transform.. Apache-2
faust (🥈28 · ⭐ 5.4K) - Python Stream Processing. BSD-3
Kedro (🥈28 · ⭐ 3.6K) - A Python framework for creating reproducible, maintainable and modular.. Apache-2
Dagster (🥈27 · ⭐ 3K) - A data orchestrator for machine learning, analytics, and ETL. Apache-2
mrjob (🥈27 · ⭐ 2.5K) - Run MapReduce jobs on Hadoop or Amazon Web Services. Apache-2
petl (🥈27 · ⭐ 860) - Python Extract Transform and Load Tables of Data. MIT
PyFunctional (🥈26 · ⭐ 1.8K) - Python library for creating data pipelines with chain functional.. MIT
Hub (🥉25 · ⭐ 2.7K) - Fastest unstructured dataset management for TensorFlow/PyTorch... MPL-2.0
TFX (🥉25 · ⭐ 1.4K) - TFX is an end-to-end platform for deploying production ML pipelines. Apache-2
Great Expectations (🥉24 · ⭐ 3.9K) - Always know what to expect from your data. Apache-2
streamparse (🥉23 · ⭐ 1.4K) - Run Python in Apache Storm topologies. Pythonic API, CLI.. Apache-2
bonobo (🥉23 · ⭐ 1.4K) - Extract Transform Load for Python 3.5+. Apache-2
Optimus (🥉23 · ⭐ 980) - Agile Data Preparation Workflows made easy with dask, cudf,.. Apache-2
pysparkling (🥉23 · ⭐ 230) - A pure Python implementation of Apache Spark's RDD and DStream.. MIT
Pypeline (🥉22 · ⭐ 1.2K) - Concurrent data pipelines in Python. MIT
dpark (🥉20 · ⭐ 2.6K) - Python clone of Spark, a MapReduce alike framework in Python. BSD-3
mrq (🥉20 · ⭐ 840) - Mr. Queue - A distributed worker task queue in Python using Redis & gevent. MIT
pdpipe (🥉20 · ⭐ 590) - Easy pipelines for pandas DataFrames. MIT
ploomber (🥉20 · ⭐ 210) - A convention over configuration workflow orchestrator. Develop.. Apache-2
spark-deep-learning (🥉18 · ⭐ 1.8K) - Deep Learning Pipelines for Apache Spark. Apache-2
Mara Pipelines (🥉18 · ⭐ 1.6K) - A lightweight opinionated ETL framework, halfway between plain.. MIT
TaskTiger (🥉18 · ⭐ 1K) - Python task queue using Redis. MIT
Databolt Flow (🥉18 · ⭐ 900) - Python library for building highly effective data science workflows. MIT
BatchFlow (🥉18 · ⭐ 160) - BatchFlow helps you conveniently work with random or sequential.. Apache-2
flupy (🥉18 · ⭐ 150) - Fluent data pipelines for python and your shell. MIT
riko (🥉17 · ⭐ 1.6K · 💤) - A Python stream processing engine modeled after Yahoo! Pipes. MIT
zenml (🥉14 · ⭐ 900 · 🐣) - ZenML: Bring Zen to your ML with reproducible pipelines. Apache-2

Distributed Machine Learning

Libraries that provide capabilities to distribute and parallelize machine learning tasks across large-scale compute infrastructure.

Ray (🥇32 · ⭐ 15K) - An open source framework that provides a simple, universal API for.. Apache-2
dask (🥇32 · ⭐ 8K · 📉) - Parallel computing with task scheduling. BSD-3
dask.distributed (🥇31 · ⭐ 1.2K · 📉) - A distributed task scheduler for Dask. BSD-3
horovod (🥈29 · ⭐ 11K) - Distributed training framework for TensorFlow, Keras, PyTorch, and.. Apache-2
ipyparallel (🥈28 · ⭐ 1.9K) - Interactive Parallel Computing in Python. BSD-3
Mesh (🥈26 · ⭐ 910) - Mesh TensorFlow: Model Parallelism Made Easier. Apache-2
BigDL (🥈25 · ⭐ 3.7K) - BigDL: Distributed Deep Learning Framework for Apache Spark. Apache-2
Elephas (🥈25 · ⭐ 1.5K) - Distributed Deep learning with Keras & Spark. MIT
petastorm (🥈25 · ⭐ 1.1K) - Petastorm library enables single machine or distributed training.. Apache-2
mpi4py (🥈25 · ⭐ 390) - Python bindings for MPI. BSD-3
DeepSpeed (🥉24 · ⭐ 4.5K) - DeepSpeed is a deep learning optimization library that makes.. MIT
TensorFlowOnSpark (🥉24 · ⭐ 3.6K) - TensorFlowOnSpark brings TensorFlow programs to.. Apache-2
dask-ml (🥉24 · ⭐ 690) - Scalable Machine Learning with Dask. BSD-3
MMLSpark (🥉23 · ⭐ 2.3K) - Microsoft Machine Learning for Apache Spark. MIT
analytics-zoo (🥉22 · ⭐ 2.2K) - Distributed Tensorflow, Keras and PyTorch on Apache.. Apache-2
FairScale (🥉21 · ⭐ 850) - PyTorch extensions for high performance and large scale training. BSD-3
Submit it (🥉21 · ⭐ 310) - Python 3.6+ toolbox for submitting jobs to Slurm. MIT
Apache Singa (🥉19 · ⭐ 2.2K) - a distributed deep learning platform. Apache-2
BytePS (🥉18 · ⭐ 2.7K) - A high performance and generic framework for distributed DNN training. Apache-2
Fiber (🥉18 · ⭐ 860) - Distributed Computing for AI Made Simple. Apache-2
Hivemind (🥉18 · ⭐ 660) - Decentralized deep learning in PyTorch. Built to train models on.. MIT
sk-dist (🥉18 · ⭐ 260) - Distributed scikit-learn meta-estimators in PySpark. Apache-2
somoclu (🥉18 · ⭐ 220 · 💤) - Massively parallel self-organizing maps: accelerate training on.. MIT

Hyperparameter Optimization & AutoML

Libraries for hyperparameter optimization, AutoML, and neural architecture search.

Optuna (🥇31 · ⭐ 4.2K) - A hyperparameter optimization framework. MIT
Hyperopt (🥇30 · ⭐ 5.5K) - Distributed Asynchronous Hyperparameter Optimization in Python. BSD-3
scikit-optimize (🥇29 · ⭐ 2.1K) - Sequential model-based optimization with a `scipy.optimize`.. BSD-3
Keras Tuner (🥇28 · ⭐ 2.3K) - Hyperparameter tuning for humans. Apache-2
AutoKeras (🥈27 · ⭐ 7.8K) - AutoML library for deep learning. Apache-2
Bayesian Optimization (🥈27 · ⭐ 4.9K) - A Python implementation of global optimization with.. MIT
NNI (🥈26 · ⭐ 9.3K) - An open source AutoML toolkit for automate machine learning lifecycle,.. MIT
auto-sklearn (🥈26 · ⭐ 5.3K) - Automated Machine Learning with scikit-learn. BSD-3
AutoGluon (🥈26 · ⭐ 3K) - AutoGluon: AutoML for Text, Image, and Tabular Data. Apache-2
nevergrad (🥈26 · ⭐ 2.8K) - A Python toolbox for performing gradient-free optimization. MIT
BoTorch (🥈26 · ⭐ 1.9K) - Bayesian optimization in PyTorch. MIT
SMAC3 (🥈26 · ⭐ 560) - Sequential Model-based Algorithm Configuration. BSD-3
featuretools (🥈25 · ⭐ 5.4K) - An open source python library for automated feature engineering. BSD-3
Ax (🥈25 · ⭐ 1.4K) - Adaptive Experimentation Platform. MIT
Hyperas (🥈23 · ⭐ 2.1K) - Keras + Hyperopt: A very simple wrapper for convenient.. MIT
GPyOpt (🥈23 · ⭐ 720) - Gaussian Process Optimization using GPy. BSD-3
Talos (🥈22 · ⭐ 1.4K) - Hyperparameter Optimization for TensorFlow, Keras and PyTorch. MIT
Orion (🥈22 · ⭐ 180) - Asynchronous Distributed Hyperparameter Optimization. BSD-3
AdaNet (🥉21 · ⭐ 3.2K · 💤) - Fast and flexible AutoML with learning guarantees. Apache-2
mljar-supervised (🥉21 · ⭐ 950) - Automates Machine Learning Pipeline with Feature Engineering.. MIT
Neuraxle (🥉21 · ⭐ 380) - A Sklearn-like Framework for Hyperparameter Tuning and AutoML in.. Apache-2
lazypredict (🥉20 · ⭐ 400) - Lazy Predict help build a lot of basic models without much code.. MIT
optunity (🥉20 · ⭐ 360 · 💤) - optimization routines for hyperparameter tuning. BSD-3
Auto ViML (🥉20 · ⭐ 220) - Automatically Build Multiple ML Models with a Single Line of Code... Apache-2
Test Tube (🥉19 · ⭐ 660 · 💤) - Python library to easily log experiments and parallelize.. MIT
Dragonfly (🥉17 · ⭐ 570 · 💤) - An open source python library for scalable Bayesian optimisation. MIT
HyperparameterHunter (🥉16 · ⭐ 650) - Easy hyperparameter optimization and automatic result.. MIT
AlphaPy (🥉16 · ⭐ 560) - Automated Machine Learning [AutoML] with Python, scikit-learn, Keras,.. Apache-2
Parfit (🥉15 · ⭐ 200 · 💤) - A package for parallelizing the fit and flexibly scoring of.. MIT
ENAS (🥉13 · ⭐ 2.4K · 💤) - PyTorch implementation of Efficient Neural Architecture Search via.. Apache-2
Devol (🥉11 · ⭐ 920 · 💤) - Genetic neural architecture search with Keras. MIT
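To make the idea of hyperparameter optimization concrete, here is a minimal random-search sketch in plain Python. It is only a baseline illustration and does not reproduce the API or the search strategy of any library listed above. The names `random_search` and `toy_objective`, the parameter names `lr` and `dropout`, and the quadratic loss are all hypothetical stand-ins for a real training-and-validation loop.

```python
import random


def random_search(objective, space, n_trials=50, seed=0):
    """Sample each parameter uniformly from its (low, high) range and keep the best trial."""
    rng = random.Random(seed)  # seeded for reproducibility
    best_params, best_loss = None, float("inf")
    history = []
    for _ in range(n_trials):
        params = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        loss = objective(params)
        history.append((params, loss))
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss, history


def toy_objective(params):
    # Hypothetical stand-in for a validation loss; minimum at lr=0.1, dropout=0.3.
    return (params["lr"] - 0.1) ** 2 + (params["dropout"] - 0.3) ** 2


if __name__ == "__main__":
    space = {"lr": (0.0, 1.0), "dropout": (0.0, 0.9)}
    best_params, best_loss, _ = random_search(toy_objective, space, n_trials=200)
    print(best_params, best_loss)
```

Dedicated optimizers generally improve on this baseline by using earlier trials to decide where to sample next, and by pruning unpromising trials early.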

Reinforcement Learning

Libraries for building and evaluating reinforcement learning & agent-based systems.

OpenAI Gym (🥇35 · ⭐ 24K) - A toolkit for developing and comparing reinforcement learning.. MIT
Dopamine (🥇27 · ⭐ 9.3K) - Dopamine is a research framework for fast prototyping of.. Apache-2
TensorLayer (🥇27 · ⭐ 6.5K) - Deep Learning and Reinforcement Learning Library for.. Apache-2
TF-Agents (🥇27 · ⭐ 1.8K) - TF-Agents: A reliable, scalable and easy to use TensorFlow.. Apache-2
TensorForce (🥈25 · ⭐ 2.9K) - Tensorforce: a TensorFlow library for applied.. Apache-2
ViZDoom (🥈25 · ⭐ 1.2K) - Doom-based AI Research Platform for Reinforcement Learning from Raw.. MIT
Stable Baselines (🥈24 · ⭐ 3K) - A fork of OpenAI Baselines, implementations of reinforcement.. MIT
Acme (🥉23 · ⭐ 2K) - A library of reinforcement learning components and agents. Apache-2
garage (🥉22 · ⭐ 1.1K) - A toolkit for reproducible reinforcement learning research. MIT
ChainerRL (🥉22 · ⭐ 930) - ChainerRL is a deep reinforcement learning library built on top of.. MIT
PARL (🥉21 · ⭐ 1.9K) - A high-performance distributed training framework for Reinforcement.. Apache-2
TRFL (🥉19 · ⭐ 3.1K · 💤) - TensorFlow Reinforcement Learning. Apache-2
Coach (🥉19 · ⭐ 1.9K) - Reinforcement Learning Coach by Intel AI Lab enables easy.. Apache-2
PFRL (🥉19 · ⭐ 530) - PFRL: a PyTorch-based deep reinforcement learning library. MIT
ReAgent (🥉17 · ⭐ 2.8K) - A platform for Reasoning systems (Reinforcement Learning,.. BSD-3
RLax (🥉17 · ⭐ 570) - A library of reinforcement learning building blocks in JAX. Apache-2

Recommender Systems

Libraries for building and evaluating recommendation systems.

lightfm (🥇27 · ⭐ 3.5K) - A Python implementation of LightFM, a hybrid recommendation algorithm. Apache-2
implicit (🥇27 · ⭐ 2.3K) - Fast Python Collaborative Filtering for Implicit Feedback Datasets. MIT
scikit-surprise (🥈26 · ⭐ 4.7K · 💤) - A Python scikit for building and analyzing recommender.. BSD-3
TF Ranking (🥈22 · ⭐ 2.1K) - Learning to Rank in TensorFlow. Apache-2
Cornac (🥈22 · ⭐ 310) - A Comparative Framework for Multimodal Recommender Systems. Apache-2
Recommenders (🥈21 · ⭐ 9.3K) - Best Practices on Recommendation Systems. MIT
fastFM (🥉20 · ⭐ 910 · 💤) - fastFM: A Library for Factorization Machines. BSD-3
RecBole (🥉20 · ⭐ 770) - A unified, comprehensive and efficient recommendation library. MIT
TF Recommenders (🥉19 · ⭐ 750) - TensorFlow Recommenders is a library for building.. Apache-2
recmetrics (🥉18 · ⭐ 240) - A library of metrics for evaluating recommender systems. MIT
Case Recommender (🥉16 · ⭐ 320 · 💤) - Case Recommender: A Flexible and Extensible Python.. MIT

Privacy Machine Learning

Libraries for encrypted and privacy-preserving machine learning using methods like federated learning & differential privacy.

PySyft (🥇26 · ⭐ 6.9K) - A library for answering questions using data you cannot see. Apache-2
Opacus (🥈22 · ⭐ 760) - Training PyTorch models with differential privacy. Apache-2
FATE (🥈20 · ⭐ 2.8K) - An Industrial Grade Federated Learning Framework. Apache-2
TensorFlow Privacy (🥈20 · ⭐ 1.4K) - Library for training machine learning models with.. Apache-2
TFEncrypted (🥈20 · ⭐ 830 · 💤) - A Framework for Encrypted Machine Learning in TensorFlow. Apache-2
CrypTen (🥉16 · ⭐ 730) - A framework for Privacy Preserving Machine Learning. MIT

Workflow & Experiment Tracking

Libraries to organize, track, and visualize machine learning experiments.

Tensorboard (🥇36 · ⭐ 5.2K) - TensorFlow's Visualization Toolkit. Apache-2
mlflow (🥇32 · ⭐ 8.6K) - Open source platform for the machine learning lifecycle. Apache-2
DVC (🥇30 · ⭐ 7.5K) - Data Version Control | Git for Data & Models. Apache-2
wandb client (🥇30 · ⭐ 2.8K) - A tool for visualizing and tracking your machine learning.. MIT
SageMaker SDK (🥇30 · ⭐ 1.3K) - A library for training and deploying machine learning.. Apache-2
kaggle (🥈29 · ⭐ 3.9K) - Official Kaggle API. Apache-2
AzureML SDK (🥈29 · ⭐ 2.2K) - Python notebooks with ML and deep learning examples with Azure.. MIT
snakemake (🥈29 · ⭐ 880) - This is the development home of the workflow management system.. MIT
tensorboardX (🥈28 · ⭐ 6.8K) - tensorboard for pytorch (and chainer, mxnet, numpy, ...). MIT
sacred (🥈28 · ⭐ 3.3K) - Sacred is a tool to help you configure, organize, log and reproduce.. MIT
PyCaret (🥈28 · ⭐ 3K) - An open-source, low-code machine learning library in Python. MIT
Metaflow (🥈26 · ⭐ 4.2K) - Build and manage real-life data science projects with ease. Apache-2
Catalyst (🥈26 · ⭐ 2.5K) - Accelerated deep learning R&D. Apache-2
VisualDL (🥈24 · ⭐ 3.9K) - Deep Learning Visualization Toolkit. Apache-2
ClearML (🥈24 · ⭐ 2.2K) - ClearML - Auto-Magical Suite of tools to streamline your ML.. Apache-2
TNT (🥈24 · ⭐ 1.3K) - Simple tools for logging and visualizing, loading and training. BSD-3
livelossplot (🥈24 · ⭐ 1K) - Live training loss plot in Jupyter Notebook for Keras, PyTorch.. MIT
ml-metadata (🥈24 · ⭐ 290) - For recording and retrieving metadata associated with ML.. Apache-2
TensorWatch (🥉22 · ⭐ 3K) - Debugging, monitoring and visualization for Python Machine Learning.. MIT
knockknock (🥉22 · ⭐ 2K · 💤) - Knock Knock: Get notified when your training ends with only two.. MIT
lore (🥉21 · ⭐ 1.5K · 💤) - Lore makes machine learning approachable for Software Engineers and.. MIT
Guild AI (🥉21 · ⭐ 550) - Experiment tracking, ML developer tools. Apache-2
Studio.ml (🥉21 · ⭐ 370) - Studio: Simplify and expedite model building process. Apache-2
quinn (🥉21 · ⭐ 220) - pyspark methods to enhance developer productivity. Apache-2
hiddenlayer (🥉20 · ⭐ 1.4K · 💤) - Neural network graphs and training metrics for.. MIT
Labml (🥉20 · ⭐ 500) - Monitor deep learning model training and hardware usage from your mobile.. MIT
gokart (🥉19 · ⭐ 170) - A wrapper of the data pipeline library luigi. MIT
aim (🥉15 · ⭐ 880) - Aim a super-easy way to record, search and compare 1000s of ML training.. Apache-2

Model Serialization & Conversion

Libraries to serialize models to files, convert between a variety of model formats, and optimize models for deployment.

onnx (🥇33 · ⭐ 9.9K) - Open standard for machine learning interoperability. Apache-2
Core ML Tools (🥇26 · ⭐ 2.1K) - Core ML tools contain supporting tools for Core ML model.. BSD-3
TorchServe (🥈24 · ⭐ 1.6K) - Model Serving on PyTorch. Apache-2
mmdnn (🥈23 · ⭐ 5.3K · 💤) - MMdnn is a set of tools to help users inter-operate among different deep.. MIT
cortex (🥈21 · ⭐ 7.4K) - Model serving at scale. Apache-2
m2cgen (🥈21 · ⭐ 1.8K) - Transform ML models into a native code (Java, C, Python, Go, JavaScript,.. MIT
Hummingbird (🥉20 · ⭐ 2.3K) - Hummingbird compiles trained ML models into tensor computation for.. MIT
pytorch2keras (🥉18 · ⭐ 670 · 💤) - PyTorch to Keras model convertor. MIT
tfdeploy (🥉16 · ⭐ 350) - Deploy tensorflow graphs for fast evaluation and export to.. BSD-3

Model Interpretability

Libraries to visualize, explain, debug, evaluate, and interpret machine learning models.

shap (🥇34 · ⭐ 12K) - A game theoretic approach to explain the output of any machine learning model. MIT
Lime (🥇29 · ⭐ 8.5K) - Lime: Explaining the predictions of any machine learning classifier. BSD-2
pyLDAvis (🥇28 · ⭐ 1.4K) - Python library for interactive topic model visualization. Port of.. BSD-3
InterpretML (🥇27 · ⭐ 3.5K) - Fit interpretable models. Explain blackbox machine learning. MIT
Model Analysis (🥇27 · ⭐ 1K) - Model analysis tools for TensorFlow. Apache-2
yellowbrick (🥈25 · ⭐ 3.1K) - Visual analysis and diagnostic tools to facilitate machine.. Apache-2
Captum (🥈25 · ⭐ 2.2K) - Model interpretability and understanding for PyTorch. BSD-3
dtreeviz (🥈25 · ⭐ 1.4K) - A python library for decision tree visualization and model interpretation. MIT
Fairness 360 (🥈25 · ⭐ 1.2K) - A comprehensive set of fairness metrics for datasets and.. Apache-2
arviz (🥈25 · ⭐ 960) - Exploratory analysis of Bayesian models with Python. Apache-2
Lucid (🥈24 · ⭐ 4.1K) - A collection of infrastructure and tools for research in neural.. Apache-2
DoWhy (🥈24 · ⭐ 2.7K) - DoWhy is a Python library for causal inference that supports explicit.. MIT
keras-vis (🥈23 · ⭐ 2.8K · 💤) - Neural network visualization toolkit for keras. MIT
TreeInterpreter (🥈23 · ⭐ 650) - Package for interpreting scikit-learn's decision tree.. BSD-3
Alibi (🥈22 · ⭐ 910) - Algorithms for monitoring and explaining machine learning models. Apache-2
keract (🥈22 · ⭐ 860) - Activation Maps (Layers Outputs) and Gradients in Keras. MIT
random-forest-importances (🥈22 · ⭐ 420) - Code to compute permutation and drop-column.. MIT
Explainability 360 (🥉21 · ⭐ 780) - Interpretability and explainability of data and machine.. Apache-2
iNNvestigate (🥉21 · ⭐ 780) - A toolbox to iNNvestigate neural networks' predictions!. BSD-2
tf-explain (🥉21 · ⭐ 780) - Interpretability Methods for tf.keras models with Tensorflow 2.x. MIT
fairlearn (🥉21 · ⭐ 710) - A Python package to assess and improve fairness of machine.. MIT
aequitas (🥉21 · ⭐ 360) - Bias and Fairness Audit Toolkit. MIT
explainerdashboard (🥉20 · ⭐ 370) - Quickly build Explainable AI dashboards that show the inner.. MIT
checklist (🥉19 · ⭐ 1.3K) - Beyond Accuracy: Behavioral Testing of NLP models with CheckList. MIT
CausalNex (🥉19 · ⭐ 1K) - A Python library that helps data scientists to infer.. Apache-2
deeplift (🥉19 · ⭐ 510) - Public facing deeplift repo. MIT
What-If Tool (🥉19 · ⭐ 460) - Source code/webpage/demos for the What-If Tool. Apache-2
sklearn-evaluation (🥉19 · ⭐ 290) - Machine learning model evaluation made easy: plots,.. MIT
tcav (🥉18 · ⭐ 440) - Code for the TCAV ML interpretability project. Apache-2
fairness-indicators (🥉18 · ⭐ 180) - Tensorflow's Fairness Evaluation and Visualization.. Apache-2
LIT (🥉17 · ⭐ 2.4K) - The Language Interpretability Tool: Interactively analyze NLP models for.. Apache-2
ExplainX.ai (🥉17 · ⭐ 190) - Explainable AI framework for data scientists. Explain & debug any.. MIT
imodels (🥉17 · ⭐ 190) - Interpretable ML package for concise, transparent, and accurate predictive.. MIT
DiCE (🥉16 · ⭐ 480) - Generate Diverse Counterfactual Explanations for any machine.. MIT
LOFO (🥉16 · ⭐ 310 · 💤) - Leave One Feature Out Importance. MIT
model-card-toolkit (🥉16 · ⭐ 180) - a tool that leverages rich metadata and lineage.. Apache-2
FlashTorch (🥉15 · ⭐ 560 · 💤) - Visualization toolkit for neural networks in PyTorch! Demo --. MIT
Anchor (🥉14 · ⭐ 630) - Code for High-Precision Model-Agnostic Explanations paper. BSD-2
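
Several entries above (for example random-forest-importances and LOFO) revolve around feature importance. As a rough, library-independent illustration of the permutation idea, the sketch below shuffles one feature column and measures how much accuracy drops. The toy `model`, the data, and every function name here are hypothetical and do not come from any of the listed packages.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the right label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(model, rows, labels, feature, seed=0):
    """Accuracy drop after shuffling one feature column (hypothetical helper)."""
    rng = random.Random(seed)
    column = [r[feature] for r in rows]
    rng.shuffle(column)
    # Rebuild each row with the shuffled value in place of the original one.
    shuffled = [r[:feature] + [v] + r[feature + 1:] for r, v in zip(rows, column)]
    return accuracy(model, rows, labels) - accuracy(model, shuffled, labels)

def model(row):
    """Toy classifier (an assumption for this sketch) that only reads feature 0."""
    return int(row[0] > 0.5)

rows = [[0.1, 5.0], [0.9, 3.0], [0.2, 8.0], [0.8, 1.0]]
labels = [0, 1, 0, 1]
importances = [permutation_importance(model, rows, labels, f) for f in range(2)]
```

Because the toy model ignores feature 1, shuffling that column cannot change its predictions, so its importance is exactly zero. The listed libraries offer more careful variants (repeated shuffles, drop-column retraining, cross-validation).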

Vector Similarity Search (ANN)

Libraries for Approximate Nearest Neighbor Search and Vector Indexing/Similarity Search.

🔗 ANN Benchmarks (⭐ 2.1K) - Benchmarks of approximate nearest neighbor libraries in Python.

Faiss (🥇29 · ⭐ 13K) - A library for efficient similarity search and clustering of dense vectors. MIT
Annoy (🥇29 · ⭐ 8.2K) - Approximate Nearest Neighbors in C++/Python optimized for memory usage.. Apache-2
NMSLIB (🥈28 · ⭐ 2.3K) - Non-Metric Space Library (NMSLIB): An efficient similarity search.. Apache-2
hnswlib (🥈26 · ⭐ 1.4K) - Header-only C++/python library for fast approximate nearest neighbors. Apache-2
Milvus (🥈25 · ⭐ 5.3K) - An open source embedding vector similarity search engine powered by.. Apache-2
PyNNDescent (🥈25 · ⭐ 380) - A Python nearest neighbor descent for approximate nearest neighbors. BSD-2
Magnitude (🥉23 · ⭐ 1.4K · 💤) - A fast, efficient universal vector embedding utility package. MIT
NGT (🥉19 · ⭐ 630) - Nearest Neighbor Search with Neighborhood Graph and Tree for High-.. Apache-2
N2 (🥉19 · ⭐ 460) - TOROS N2 - lightweight approximate Nearest Neighbor library which runs fast.. Apache-2
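
For orientation, the libraries in this section approximate the result of an exact nearest-neighbor scan while avoiding its linear cost. The sketch below shows that exact baseline in plain Python, using cosine similarity. The function names and sample vectors are illustrative assumptions and are not the API of any library listed here.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)

def exact_top_k(query, vectors, k=1):
    """Indices of the k most similar vectors, found by scanning all of them."""
    scored = [(cosine_similarity(query, v), i) for i, v in enumerate(vectors)]
    # Highest similarity first; ties broken by lower index.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scored[:k]]

vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
nearest = exact_top_k([0.9, 0.1], vectors, k=2)
```

The scan costs one similarity computation per stored vector for every query. ANN indexes generally trade a small loss in recall for much faster lookups on large collections.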

Probabilistics & Statistics

Libraries providing capabilities for probabilistic programming/reasoning, Bayesian inference, Gaussian processes, or statistics.

PyMC3 (🥇32 · ⭐ 5.6K) - Probabilistic Programming in Python: Bayesian Modeling and.. Apache-2
tensorflow-probability (🥇31 · ⭐ 3.3K) - Probabilistic reasoning and statistical analysis in.. Apache-2
hmmlearn (🥇29 · ⭐ 2.2K) - Hidden Markov Models in Python, with scikit-learn like API. BSD-3
Pyro (🥈28 · ⭐ 6.8K) - Deep universal probabilistic programming with Python and PyTorch. Apache-2
GPyTorch (🥈28 · ⭐ 2.3K) - A highly efficient and modular implementation of Gaussian Processes.. MIT
pomegranate (🥈27 · ⭐ 2.6K) - Fast, flexible and easy to use probabilistic modelling in Python. MIT
filterpy (🥈27 · ⭐ 1.7K) - Python Kalman filtering and optimal estimation library. Implements.. MIT
GPflow (🥈27 · ⭐ 1.4K) - Gaussian processes in TensorFlow. Apache-2
pgmpy (🥉25 · ⭐ 1.7K) - Python Library for learning (Structure and Parameter) and inference.. MIT
SALib (🥉24 · ⭐ 440) - Sensitivity Analysis Library in Python (Numpy). Contains Sobol, Morris,.. MIT
bambi (🥉20 · ⭐ 580) - BAyesian Model-Building Interface (Bambi) in Python. MIT
scikit-posthocs (🥉20 · ⭐ 190) - Multiple Pairwise Comparisons (Post Hoc) Tests in Python. MIT
Funsor (🥉19 · ⭐ 160) - Functional tensors for probabilistic programming. Apache-2
pyhsmm (🥉18 · ⭐ 480 · 💤) - Bayesian inference in HSMMs and HMMs. MIT
Orbit (🥉18 · ⭐ 340) - A Python package for Bayesian forecasting with object-oriented design.. Apache-2
Baal (🥉17 · ⭐ 320) - Using approximate bayesian posteriors in deep nets for active learning. Apache-2
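
To make "Bayesian inference" concrete, here is a minimal sketch of the simplest closed-form case: a Beta prior on a coin's success probability, updated with observed successes and failures. The function names and the example counts are assumptions made for illustration. The probabilistic programming libraries above target models where no such closed form exists.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing binomial data (conjugate update)."""
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Uniform prior Beta(1, 1), then 7 successes and 3 failures (made-up data).
prior = (1.0, 1.0)
posterior = beta_binomial_update(*prior, successes=7, failures=3)
posterior_mean = beta_mean(*posterior)
```

With these counts the posterior is Beta(8, 4), whose mean of 2/3 sits between the prior mean of 0.5 and the observed frequency of 0.7.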

Adversarial Robustness

Libraries for testing the robustness of machine learning models against attacks with adversarial/malicious examples.

CleverHans (🥇27 · ⭐ 5K) - An adversarial example library for constructing attacks, building.. MIT
Foolbox (🥇27 · ⭐ 1.8K) - A Python toolbox to create adversarial examples that fool neural networks.. MIT
ART (🥈23 · ⭐ 2.1K) - Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning.. MIT
TextAttack (🥈23 · ⭐ 1.3K) - TextAttack is a Python framework for adversarial attacks, data.. MIT
robustness (🥉18 · ⭐ 490) - A library for experimenting with, training and evaluating neural.. MIT
AdvBox (🥉16 · ⭐ 1.1K · 💤) - Advbox is a toolbox to generate adversarial examples that fool.. Apache-2
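
As a rough illustration of what an adversarial example is, the sketch below applies one fast-gradient-sign step to a tiny logistic-regression classifier written in plain Python. The weights, input, and `epsilon` are made-up values, and the helper names are hypothetical. The toolkits above implement this and many stronger attacks for real neural networks.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def sign(g):
    """Sign of g as -1, 0, or 1."""
    return (g > 0) - (g < 0)

def fgsm_perturb(weights, bias, x, label, epsilon):
    """One fast-gradient-sign step that increases the log loss for `label`."""
    p = predict_proba(weights, bias, x)
    # For log loss, the gradient with respect to input x_i is (p - label) * w_i.
    grad = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]

# Made-up model and input: the clean input is classified as class 1.
weights, bias = [2.0, -1.0], 0.0
x, label = [0.5, 0.2], 1
x_adv = fgsm_perturb(weights, bias, x, label, epsilon=0.5)
```

In this toy setup the perturbed input moves each coordinate by at most `epsilon`, yet the predicted probability for class 1 falls from above 0.5 to below it.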

GPU Utilities

Libraries that require and make use of CUDA/GPU system capabilities to optimize data handling and machine learning tasks.

CuPy (🥇31 · ⭐ 4.9K) - A NumPy-compatible array library accelerated by CUDA. MIT
gpustat (🥇26 · ⭐ 2.3K) - A simple command-line utility for querying and monitoring GPU status. MIT
PyCUDA (🥈25 · ⭐ 1.1K · 📉) - CUDA integration for Python, plus shiny features. MIT
Apex (🥈23 · ⭐ 5.1K) - A PyTorch Extension: Tools for easy mixed precision and distributed.. BSD-3
ArrayFire (🥈23 · ⭐ 3.3K) - ArrayFire: a general purpose GPU library. BSD-3
scikit-cuda (🥈23 · ⭐ 800) - Python interface to GPU-powered libraries. BSD-3
cuDF (🥉21 · ⭐ 3.7K) - cuDF - GPU DataFrame Library. Apache-2
py3nvml (🥉21 · ⭐ 170 · 💤) - Python 3 Bindings for NVML library. Get NVIDIA GPU status inside.. BSD-3
DALI (🥉20 · ⭐ 3.1K) - A library containing both highly optimized building blocks and an.. Apache-2
cuML (🥉19 · ⭐ 2K) - cuML - RAPIDS Machine Learning Library. Apache-2
BlazingSQL (🥉17 · ⭐ 1.4K) - BlazingSQL is a lightweight, GPU accelerated, SQL engine for.. Apache-2
Vulkan Kompute (🥉17 · ⭐ 350) - General purpose GPU compute framework for cross vendor.. Apache-2
cuGraph (🥉16 · ⭐ 670) - cuGraph - RAPIDS Graph Analytics Library. Apache-2
cuSignal (🥉15 · ⭐ 460) - GPU accelerated signal processing. Apache-2

Tensorflow Utilities

Libraries that extend TensorFlow with additional capabilities.

tensorflow-hub (🥇32 · ⭐ 2.8K) - A library for transfer learning by reusing parts of.. Apache-2
tensor2tensor (🥇31 · ⭐ 11K) - Library of deep learning models and datasets designed to.. Apache-2
TF Addons (🥇31 · ⭐ 1.2K) - Useful extra functionality for TensorFlow 2.x maintained by.. Apache-2
TensorFlow Transform (🥈29 · ⭐ 860) - Input pipeline framework. Apache-2
TensorFlow I/O (🥈26 · ⭐ 420) - Dataset, streaming, and file system extensions.. Apache-2
TF Model Optimization (🥉25 · ⭐ 980) - A toolkit to optimize ML models for deployment for.. Apache-2
efficientnet (🥉23 · ⭐ 1.7K) - Implementation of EfficientNet model. Keras and.. Apache-2
TensorFlow Cloud (🥉22 · ⭐ 230) - The TensorFlow Cloud repository provides APIs that.. Apache-2
Neural Structured Learning (🥉21 · ⭐ 790) - Training neural models with structured signals. Apache-2
TensorNets (🥉19 · ⭐ 980) - High level network definitions with pre-trained weights in.. MIT
tffm (🥉18 · ⭐ 760 · 💤) - TensorFlow implementation of an arbitrary order Factorization Machine. MIT
TF Compression (🥉18 · ⭐ 450) - Data compression in TensorFlow. Apache-2
Saliency (🥉17 · ⭐ 640) - TensorFlow implementation for SmoothGrad, Grad-CAM, Guided.. Apache-2

Sklearn Utilities

Libraries that extend scikit-learn with additional capabilities.

imbalanced-learn (🥇31 · ⭐ 5.1K) - A Python Package to Tackle the Curse of Imbalanced.. MIT
MLxtend (🥇30 · ⭐ 3.4K) - A library of extension and helper modules for Python's data.. BSD-3
category_encoders (🥈24 · ⭐ 1.6K · 💤) - A library of sklearn compatible categorical variable.. BSD-3
sklearn-contrib-lightning (🥈24 · ⭐ 1.4K) - Large-scale linear classification, regression and.. BSD-3
scikit-opt (🥈22 · ⭐ 2K) - Genetic Algorithm, Particle Swarm Optimization, Simulated.. MIT
fancyimpute (🥈22 · ⭐ 940) - Multivariate imputation and matrix completion algorithms.. Apache-2
combo (🥈22 · ⭐ 480) - (AAAI' 20) A Python Toolbox for Machine Learning Model.. BSD-2 xgboost
scikit-lego (🥉20 · ⭐ 440) - Extra blocks for scikit-learn pipelines. MIT
DESlib (🥉20 · ⭐ 320) - A Python library for dynamic classifier and ensemble selection. BSD-3
iterative-stratification (🥉19 · ⭐ 530) - scikit-learn cross validators for iterative.. BSD-3
scikit-tda (🥉19 · ⭐ 270) - Topological Data Analysis for Python. MIT
skggm (🥉16 · ⭐ 180) - Scikit-learn compatible estimation of general graphical models. MIT

Pytorch Utilities

Libraries that extend PyTorch with additional capabilities.

pretrainedmodels (🥇27 · ⭐ 7.8K · 💤) - Pretrained ConvNets for pytorch: NASNet, ResNeXt,.. BSD-3
pytorch-summary (🥇25 · ⭐ 3K · 💤) - Model summary in PyTorch similar to `model.summary()` in.. MIT
pytorch-optimizer (🥇25 · ⭐ 1.7K) - torch-optimizer -- collection of optimizers for.. Apache-2
EfficientNet-PyTorch (🥈24 · ⭐ 5.5K) - A PyTorch implementation of EfficientNet. Apache-2
torchdiffeq (🥈24 · ⭐ 3.4K) - Differentiable ODE solvers with full GPU support and.. MIT
PML (🥈24 · ⭐ 2.8K) - The easiest way to use deep metric learning in your application. Modular,.. MIT
SRU (🥈23 · ⭐ 1.9K) - Training RNNs as Fast as CNNs (https://arxiv.org/abs/1709.02755). MIT
Torchmeta (🥈21 · ⭐ 1.2K) - A collection of extensions and data-loaders for few-shot learning.. MIT
torch-scatter (🥈21 · ⭐ 610) - PyTorch Extension Library of Optimized Scatter Operations. MIT
PyTorch Sparse (🥈21 · ⭐ 360) - PyTorch Extension Library of Optimized Autograd Sparse.. MIT
reformer-pytorch (🥈20 · ⭐ 1.4K) - Reformer, the efficient Transformer, in Pytorch. MIT
EfficientNets (🥈20 · ⭐ 1.3K) - Pretrained EfficientNet, EfficientNet-Lite, MixNet,.. Apache-2
Higher (🥈20 · ⭐ 1.1K) - higher is a pytorch library allowing users to obtain higher.. Apache-2
TabNet (🥈20 · ⭐ 860) - PyTorch implementation of TabNet paper :.. MIT
Pytorch Toolbelt (🥉19 · ⭐ 940) - PyTorch extensions for fast R&D prototyping and Kaggle.. MIT
Performer Pytorch (🥉17 · ⭐ 540 · 🐣) - An implementation of Performer, a linear attention-.. MIT
Tensor Sensor (🥉17 · ⭐ 530) - The goal of this library is to generate more helpful.. MIT
tinygrad (🥉15 · ⭐ 4.1K · 🐣) - You like pytorch? You like micrograd? You love tinygrad!. MIT
Lambda Networks (🥉15 · ⭐ 1.4K · 🐣) - Implementation of LambdaNetworks, a new approach to.. MIT
Torch-Struct (🥉15 · ⭐ 910) - Fast, general, and tested differentiable structured prediction.. MIT
torchsde (🥉15 · ⭐ 680) - Differentiable SDE solvers with GPU support and efficient.. Apache-2
Pywick (🥉15 · ⭐ 320) - High-level batteries-included neural network training library for.. MIT
Tez (🥉14 · ⭐ 580 · 🐣) - Tez is a super-simple and lightweight Trainer for PyTorch. It.. Apache-2
micrograd (🥉12 · ⭐ 1.6K · 💤) - A tiny scalar-valued autograd engine and a neural net library.. MIT

Database Clients

Libraries for connecting to, operating, and querying databases.

🔗 best-of-python - DB Clients (⭐ 1.5K · 🐣) - Collection of database clients for python.

Others

scipy (🥇40 · ⭐ 8K) - Ecosystem of open-source software for mathematics, science, and engineering. BSD-3
SymPy (🥇36 · ⭐ 7.9K) - A computer algebra system written in pure Python. BSD-3
Autograd (🥇30 · ⭐ 5.2K) - Efficiently computes derivatives of numpy code. MIT
hdbscan (🥇29 · ⭐ 1.8K) - A high performance implementation of HDBSCAN clustering. BSD-3
PyOD (🥇28 · ⭐ 4.2K) - (JMLR'19) A Python Toolbox for Scalable Outlier Detection (Anomaly.. BSD-2
Keras-Preprocessing (🥇28 · ⭐ 920) - Utilities for working with image data, text data, and.. MIT
Cython BLIS (🥇28 · ⭐ 160) - Fast matrix-multiplication as a self-contained Python library no.. BSD-3
Streamlit (🥈27 · ⭐ 14K) - Streamlit: The fastest way to build data apps in Python. Apache-2
carla (🥈26 · ⭐ 5.7K) - Open-source simulator for autonomous driving research. MIT
Datasette (🥈26 · ⭐ 4.8K) - An open source multi-tool for exploring and publishing data. Apache-2
DeepChem (🥈26 · ⭐ 2.8K) - Democratizing Deep-Learning for Drug Discovery, Quantum Chemistry,.. MIT
agate (🥈26 · ⭐ 1K) - A Python data analysis library that is optimized for humans instead of machines. MIT
pyclustering (🥈26 · ⭐ 800) - pyclustering is a Python, C++ data mining library. BSD-3
Trax (🥈25 · ⭐ 5.9K) - Trax: Deep Learning with Clear Code and Speed. Apache-2
causalml (🥈25 · ⭐ 1.8K) - Uplift modeling and causal inference with machine learning.. Apache-2
Pythran (🥈25 · ⭐ 1.5K) - Ahead of Time compiler for numeric kernels. BSD-3
TabPy (🥈25 · ⭐ 1K) - Execute Python code on the fly and display results in Tableau visualizations:. MIT
kmodes (🥈25 · ⭐ 820) - Python implementations of the k-modes and k-prototypes clustering.. MIT
metric-learn (🥈24 · ⭐ 1.1K · 💤) - Metric learning algorithms in Python. MIT
PennyLane (🥈24 · ⭐ 800) - PennyLane is a cross-platform Python library for differentiable.. Apache-2
pyopencl (🥈24 · ⭐ 790 · 📉) - OpenCL integration for Python, plus shiny features. MIT
PySwarms (🥈24 · ⭐ 740) - A research toolkit for particle swarm optimization in Python. MIT
pyjanitor (🥈24 · ⭐ 640) - Clean APIs for data cleaning. Python implementation of R package Janitor. MIT
findspark (🥈24 · ⭐ 390 · 💤) - Find pyspark to make it importable. BSD-3
datalad (🥈24 · ⭐ 230) - Keep code, data, containers under control with git and git-annex. MIT
Gradio (🥉23 · ⭐ 2.1K) - Wrap UIs around any model, share with anyone. Apache-2
modAL (🥉23 · ⭐ 1.1K) - A modular active learning framework for Python. MIT
PaddleHub (🥉22 · ⭐ 4.7K) - Awesome pre-trained models toolkit based on.. Apache-2
pycm (🥉22 · ⭐ 1.1K) - Multi-class confusion matrix library in Python. MIT
Prince (🥉22 · ⭐ 590) - Python factor analysis library (PCA, CA, MCA, MFA, FAMD). MIT
SUOD (🥉22 · ⭐ 240) - (MLSys' 21) An Acceleration System for Large-scale Unsupervised.. BSD-2
Mars (🥉21 · ⭐ 2.1K) - Mars is a tensor-based unified framework for large-scale data.. Apache-2
tensorly (🥉21 · ⭐ 970) - TensorLy: Tensor Learning in Python. BSD-2
StreamAlert (🥉20 · ⭐ 2.5K) - StreamAlert is a serverless, realtime data analysis framework.. Apache-2
AstroML (🥉20 · ⭐ 730) - Machine learning, statistics, and data mining for astronomy and.. BSD-2
alibi-detect (🥉20 · ⭐ 600) - Algorithms for outlier and adversarial instance detection,.. Apache-2
baikal (🥉20 · ⭐ 570) - A graph-based functional API for building complex scikit-learn pipelines. BSD-3
BioPandas (🥉20 · ⭐ 330) - Working with molecular structures in pandas DataFrames. BSD-3
scikit-rebate (🥉20 · ⭐ 310) - A scikit-learn-compatible Python implementation of ReBATE, a.. MIT
rrcf (🥉20 · ⭐ 290 · 💤) - Implementation of the Robust Random Cut Forest algorithm for anomaly.. MIT
Feature Engine (🥉19 · ⭐ 470) - Feature engineering package with sklearn like functionality. BSD-3
apricot (🥉18 · ⭐ 310) - apricot implements submodular optimization for the purpose of selecting.. MIT
River (🥉17 · ⭐ 1.4K) - Online machine learning in Python. BSD-3
traingenerator (🥉10 · ⭐ 940 · 🐣) - A web app to generate template code for machine learning. MIT

Related Resources


3. Machine Learning - TensorFlow - Part III

https://www.tensorflow.org/

4. Artificial Intelligence (AI) - Part I

Reproduced from GitHub https://github.com/

A curated list of Artificial Intelligence (AI) courses, books, video lectures and papers.

Contents

  1. Courses
  2. Books
  3. Programming
  4. Philosophy
  5. Free Content
  6. Code
  7. Videos
  8. Learning
  9. Organizations
  10. Journals
  11. Competitions
  12. Newsletters
  13. Misc

Courses

Books

Programming

Philosophy

Free Content

Code

Videos

Learning

Organizations

Journals

Competitions

Newsletters

Misc

5. Artificial Intelligence (AI) - Part II

A curated list of awesome awesomeness about artificial intelligence(AI).

Table of Contents

Artificial Intelligence(AI)

Machine Learning(ML)

Deep Learning(DL)

Computer Vision(CV)

Natural Language Processing(NLP)

Speech Recognition

Other Research Topics

Programming Languages

Framework

Datasets

AI Career

6. Artificial Intelligence (AI) - Part III

A curated list of artificial intelligence resources (Courses, Tools, App, Open Source Project)

Contents

  1. Courses & Articles
  2. Artificial Intelligence Company & Research Institute
  3. Artificial Intelligence Tools
  4. Books
  5. Development
  6. News
  7. Events and Conferences

Courses & Articles

Artificial Intelligence
Generative Adversarial Networks (GANs)
Robotics

Artificial Intelligence Company & Research Institute

Business Intelligence & Analytics
Machine Learning
Robotics
Conversational Interfaces & Chatbots
Data Science
Development
Vehicle
Insurance / Legal

Artificial Intelligence Tools

Personal Tools
Education Tools
Health / Medical Tools
Travel AI Tools
Finance AI Tools
Language / Translation AI Tools
IoT / IIoT
Research Tools

CaptionBot - Microsoft describes any photo
Crowdfunding.ai - crowdfunding platform for AI projects
Fieldguide - universal field guide that suggests possible matches

Books

Blogs, Papers, and Articles

Development

Bot Development
Haskell
C++
Java
Julia
Javascript
Python
PHP
R
TensorFlow

News

Podcast

Events and Conferences

Location