Search Results for "python data analysis" - Page 10

Showing 4112 open source projects for "python data analysis"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Build experiences that drive engagement and increase transactions Icon
    Build experiences that drive engagement and increase transactions

    Connect your users - doctors, gamers, shoppers, or lovers - wherever they are.

    Sendbird's chat, voice, and video APIs power conversations and communities in hundreds of the most innovative apps and products. Sendbird’s feature-rich platform, and pre-fab UI components make developers more productive. We take care of a ton of operational complexity under the hood, so you can power a rich chat service, and life-like voice, and video experiences, and not worry about features, edge cases, reliability, or scale.
    Learn More
  • 1
    DataChain

    DataChain

    AI-data warehouse to enrich, transform and analyze unstructured data

    Datachain enables multimodal API calls and local AI inferences to run in parallel over many samples as chained operations. The resulting datasets can be saved, versioned, and sent directly to PyTorch and TensorFlow for training. Datachain can persist features of Python objects returned by AI models, and enables vectorized analytical operations over them. The typical use cases are data curation, LLM analytics and validation, image segmentation, pose detection, and GenAI alignment. Datachain is especially helpful if batch operations can be optimized – for instance, when synchronous API calls can be parallelized or where an LLM API offers batch processing.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 2
    Bytewax

    Bytewax

    Python Stream Processing

    Bytewax is a Python framework that simplifies event and stream processing. Because Bytewax couples the stream and event processing capabilities of Flink, Spark, and Kafka Streams with the friendly and familiar interface of Python, you can re-use the Python libraries you already know and love. Connect data sources, run stateful transformations, and write to various downstream systems with built-in connectors or existing Python libraries.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 3
    DeepVariant

    DeepVariant

    DeepVariant is an analysis pipeline that uses a deep neural networks

    DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data. DeepVariant is a deep learning-based variant caller that takes aligned reads (in BAM or CRAM format), produces pileup image tensors from them, classifies each tensor using a convolutional neural network, and finally reports the results in a standard VCF or gVCF file.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    julep

    julep

    A new DSL and server for AI agents and multi-step tasks

    Julep is a platform for creating AI agents that remember past interactions and can perform complex tasks. It offers long-term memory and manages multi-step processes. Julep enables the creation of multi-step tasks incorporating decision-making, loops, parallel processing, and integration with numerous external tools and APIs. While many AI applications are limited to simple, linear chains of prompts and API calls with minimal branching, Julep is built to handle more complex scenarios.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Deliver trusted data with dbt Icon
    Deliver trusted data with dbt

    dbt Labs empowers data teams to build reliable, governed data pipelines—accelerating analytics and AI initiatives with speed and confidence.

    Data teams use dbt to codify business logic and make it accessible to the entire organization—for use in reporting, ML modeling, and operational workflows.
    Learn More
  • 5
    Sentry

    Sentry

    Cross-platform application monitoring and error tracking software

    ...It lets you build better software faster and more efficiently by showing you all issues in one place and providing the trail of events that lead to errors. It also provides real-time monitoring and data visualization through dashboards. Sentry’s server is in Python, but its API enables for sending events from any language, in any application. More than fifty-thousand companies already ship better software faster thanks to Sentry; let yours be one of them!
    Downloads: 7 This Week
    Last Update:
    See Project
  • 6
    Fli

    Fli

    Google Flights MCP and Python Library

    Fli is a powerful Python library and command-line tool that provides direct programmatic access to Google Flights data through reverse-engineered API interactions rather than traditional web scraping. This approach enables faster, more reliable, and more stable access to flight information, avoiding the fragility associated with HTML parsing and UI changes.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 7
    msgspec

    msgspec

    A fast serialization and validation library, with builtin

    msgspec is a fast serialization and validation library, with builtin support for JSON, MessagePack, YAML, and TOML.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 8
    pdfly

    pdfly

    CLI tool to extract (meta)data from PDF and manipulate PDF files

    A Python library designed for manipulating PDF files with functionalities for extraction, transformation, and document generation.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 9
    Cortex Analyzers

    Cortex Analyzers

    Cortex Analyzers Repository

    Analyzers can be written in any programming language supported by Linux such as Python, Ruby, Perl, etc. Refer to the How to Write and Submit an Analyzer page for details on how to write and submit one.
    Downloads: 6 This Week
    Last Update:
    See Project
  • Award-winning proxy networks, AI-powered web scrapers, and business-ready datasets for download.
 Icon
    Award-winning proxy networks, AI-powered web scrapers, and business-ready datasets for download.


    How the world collects public web data

    Bright Data is a leading data collection platform, enabling businesses to collect crucial structured and unstructured data from millions of websites through our proprietary technology. Our proxy networks give you access to sophisticated target sites using precise geo-targeting. You can also use our tools to unblock tough target sites, accomplish SERP-specific data collection tasks, manage and optimize your proxy performance as well as automating all of your data collection needs.
    Learn More
  • 10
    Matplotlib

    Matplotlib

    matplotlib: plotting with Python

    Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. Matplotlib makes easy things easy and hard things possible. Matplotlib ships with several add-on toolkits, including 3D plotting with mplot3d, axes helpers in axes_grid1 and axis helpers in axisartist. A large number of third party packages extend and build on Matplotlib functionality, including several higher-level plotting interfaces (seaborn, HoloViews, ggplot, ...), and a...
    Downloads: 22 This Week
    Last Update:
    See Project
  • 11
    E2B Cookbook

    E2B Cookbook

    Examples of using E2B

    ...The repository acts as a practical learning resource for developers who want to integrate AI agents with secure cloud execution environments that allow large language models to run code and interact with tools. The examples illustrate how developers can build AI workflows capable of performing tasks such as data analysis, code execution, and application generation inside isolated sandbox environments. E2B itself provides secure Linux-based sandboxes that enable AI systems to safely run generated code and interact with real computing resources without compromising the host environment. The cookbook organizes examples across multiple frameworks and model providers, allowing developers to experiment with integrations involving models from OpenAI, Anthropic, and other ecosystems.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Dash

    Dash

    Build beautiful web-based analytic apps, no JavaScript required

    Dash is a Python framework for building beautiful analytical web applications without any JavaScript. Built on top of Plotly.js, React and Flask, Dash easily achieves what an entire team of designers and engineers normally would. It ties modern UI controls and displays such as dropdown menus, sliders and graphs directly to your analytical Python code, and creates exceptional, interactive analytics apps. Dash apps are very lightweight, requiring only a limited number of lines of Python or...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 13
    JILL.py

    JILL.py

    A cross-platform installer for the Julia programming language

    The enhanced Python fork of JILL, Julia Installer for Linux (and every other platform), Light.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 14
    Apache Hamilton

    Apache Hamilton

    Helps data scientists define testable self-documenting dataflows

    Apache Hamilton is an open-source Python framework designed to simplify the creation and management of dataflows used in analytics, machine learning pipelines, and data engineering workflows. The framework enables developers to define data transformations as simple Python functions, where each function represents a node in a dataflow graph and its parameters define dependencies on other nodes.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 15
    pytype

    pytype

    A static type analyzer for Python code

    pytype is a static type analyzer that checks and infers types for Python code without executing it, catching errors at “compile time” and generating actionable diagnostics. It grew alongside Python typing at Google and can understand both inline annotations and unannotated code via powerful inference. The tool consumes stub files (.pyi) for the standard library and third-party packages (from typeshed and its own built-ins), enabling accurate checks even in large, mixed-quality codebases....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    pytablewriter

    pytablewriter

    pytablewriter is a Python library to write a table in various formats

    pytablewriter is a Python library to write a table in various formats: AsciiDoc / CSV / Elasticsearch / HTML / JavaScript / JSON / LaTeX / LDJSON / LTSV / Markdown / MediaWiki / NumPy / Excel / Pandas / Python / reStructuredText / SQLite / TOML / TSV / YAML.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 17
    simplejson

    simplejson

    simplejson is a simple, fast, extensible JSON encoder/decoder

    simplejson is a simple, fast, complete, correct and extensible JSON <http://json.org> encoder and decoder for Python 3.3+ with legacy support for Python 2.5+. It is pure Python code with no dependencies but includes an optional C extension for a serious speed boost. simplejson is the externally maintained development version of the json library included with Python (since 2.6). This version is tested with the latest Python 3.8 and maintains backward compatibility with Python 3.3+ and the...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 18
    py-pdf-parser

    py-pdf-parser

    A Python tool to help extracting information from structured PDFs

    py-pdf-parser is a Python tool designed to help extract information from structured PDFs. It provides a simple interface to define parsing rules and extract data from PDF documents. ​
    Downloads: 2 This Week
    Last Update:
    See Project
  • 19
    Memvid

    Memvid

    Video-based AI memory library. Store millions of text chunks in MP4

    Memvid encodes text chunks as QR codes within MP4 frames to build a portable “video memory” for AI systems. This innovative approach uses standard video containers and offers millisecond-level semantic search across large corpora with dramatically less storage than vector DBs. It's self-contained—no DB needed—and supports features like PDF indexing, chat integration, and cloud dashboards.
    Downloads: 34 This Week
    Last Update:
    See Project
  • 20
    GemGIS

    GemGIS

    Spatial data processing for geomodeling

    GemGIS is a Python-based, open-source geographic information processing library. It is capable of preprocessing spatial data such as vector data (shape files, geojson files, geopackages,…), raster data (tif, png,…), data obtained from online services (WCS, WMS, WFS) or XML/KML files (soon). Preprocessed data can be stored in a dedicated Data Class to be passed to the geomodeling package GemPy in order to accelerate the model-building process. ...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 21
    Argus

    Argus

    Python toolkit for OSINT and reconnaissance with 135+ modules

    Argus is a Python-based open source toolkit designed to simplify information gathering and reconnaissance tasks in cybersecurity. It provides an integrated command-line environment that consolidates numerous reconnaissance utilities into a single framework. The tool enables users to collect data about networks, domains, web applications, and infrastructure in an organized and efficient manner.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Mimesis

    Mimesis

    High-performance fake data generator for Python

    Mimesis is an open source high-performance fake data generator for Python, able to provide data for various purposes in various languages. It's currently the fastest fake data generator for Python, and supports many different data providers that can produce data related to people, food, transportation, internet and many more. Mimesis is really easy to use, with everything you need just an import away.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Anna’s Archive

    Anna’s Archive

    Comprehensive search engine for books, papers, comics, magazines

    Anna’s Archive is a large-scale open-source search engine and data aggregation platform designed to index and provide access to a vast collection of books, academic papers, comics, magazines, and other digital texts through a unified interface. The project includes all the infrastructure required to run a full instance locally or in production, combining web servers, databases, and search indexing systems into a scalable architecture. It relies heavily on technologies such as Elasticsearch...
    Downloads: 78 This Week
    Last Update:
    See Project
  • 24
    jsonschema

    jsonschema

    An implementation of the JSON Schema specification for Python

    jsonschema is an implementation of the JSON Schema specification for Python. Full support for Draft 2020-12, Draft 2019-09, Draft 7, Draft 6, Draft 4 and Draft 3. Lazy validation that can iteratively report all validation errors. Programmatic querying of which properties or items failed validation.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 25
    Mercury

    Mercury

    Convert Python notebook to web app and share with non-technical users

    Turn Python notebooks to web applications with open-source Mercury framework. Hide code and add interactive widgets. Non-technical users can tweak widgets and execute notebook with new parameters. The core of Mercury is Open Source under AGPLv3. We provide Mercury Pro with additional features, dedicated support and friendly commercial license. Mercury is a perfect tool to convert Python notebook to interactive web application and share with non-programmers. You define interactive widgets for...
    Downloads: 5 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB