GIT_FEED

D4Vinci/Scrapling

🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!


What it does

Scrapling is a Python tool that automatically collects data from websites at any scale — from grabbing a single page to running massive, coordinated web crawls. It's smart enough to adapt when websites change their layout, and it can slip past anti-bot protections that typically block automated data collection.

Why it matters

With 66,000+ stars, this is one of the most widely adopted open-source web data tools available, signaling massive demand for affordable, scalable data collection outside expensive third-party APIs. For builders, it dramatically lowers the cost and complexity of feeding products with real-time web data — a core requirement for AI applications, market intelligence tools, and price-monitoring services.

Why it's trending

Web scraping has become a critical infrastructure problem for AI teams and data businesses, and Scrapling is catching fire because it solves the part that always breaks: when sites update their layouts or start blocking bots, most scrapers just fail silently. The project added over 8,100 stars this week alone and logged 114 commits in the last 30 days, which suggests an active, fast-moving project that builders are adopting rather than a one-off viral moment. With thousands of forks and a small but highly productive team of 15 contributors driving that commit volume, this looks like serious infrastructure being built by practitioners for practitioners.

38 · Active

On the radar — signal detected

Stars: 66.8k
Forks: 6.6k
Contributors: 15
Language: Python
Downloads (7d): 95.1k

pypi/scrapling

Score updated Jun 29, 2026

Related projects

ClickHouse is an open-source database built specifically for analyzing massive amounts of data at high speed, returning results in real time rather than making you wait minutes or hours. Think of it as a supercharged spreadsheet engine that can crunch billions of rows of data almost instantly, making it ideal for dashboards, reports, and any product that needs to show users live insights from large datasets.

// why it matters

As user expectations shift toward real-time everything, products that can surface instant insights from data have a significant competitive edge over those with slow, laggy reporting. With more than 48,000 stars and almost 3,000 contributors, ClickHouse has become a proven, battle-tested foundation that startups and enterprises alike are using to build analytics features without paying the enormous costs of proprietary alternatives like Snowflake or BigQuery.

C++ · 48.3k stars · 8.6k forks · 2,951 contrib

AFNI is a comprehensive software toolkit used by neuroscientists to process, analyze, and visualize brain scan images, including the functional MRI scans (brain imaging that shows activity over time) used in research studies. It handles every step of the brain imaging workflow, from initial data collection through final statistical analysis and visual reporting.

// why it matters

Brain imaging research underpins a massive and growing market spanning clinical neurology, mental health diagnostics, and neurotechnology, and AFNI is a foundational open-source tool trusted by academic and medical research institutions worldwide. For founders or investors in brain health, medical imaging, or research software, understanding that AFNI represents the established standard workflow gives important context for where new AI-driven or cloud-based neuroimaging products can integrate or compete.

C · 191 stars · 118 forks · 81 contrib

Foxglove SDK is a toolkit that lets robotics and engineering teams record, stream, and visually explore complex sensor data — think camera feeds, GPS tracks, and sensor readings — all in one place. It connects to the popular Foxglove visualization platform, allowing teams to replay and analyze what their robots or autonomous systems are doing in real time or from saved recordings.

// why it matters

As robotics, autonomous vehicles, and industrial automation become major investment areas, teams need better tools to understand and debug what their machines are actually doing, and Foxglove is positioning itself as the standard observability platform for that space. With more than 40 contributors, support for multiple programming languages, and integration with the widely used ROS robotics framework, this SDK signals a maturing ecosystem that could become a critical dependency for any company building physical AI products.

Rust · 265 stars · 101 forks · 45 contrib

Apache Airflow is an open-source platform that lets teams build, schedule, and monitor automated workflows — think of it as a smart traffic controller for your data pipelines, ensuring the right tasks run in the right order at the right time. With nearly 46,000 stars and over 4,300 contributors, it has become the industry standard for orchestrating complex sequences of tasks, from pulling data out of databases to training AI models.

// why it matters

For any company building data-driven products or AI features, Airflow is often the backbone that keeps everything running reliably, making it a critical piece of infrastructure that reduces engineering overhead and accelerates time-to-insight. Its massive adoption signals that data orchestration is now a foundational business need, and teams that implement it early gain a significant operational advantage as their data complexity grows.

Python · 46.0k stars · 17.3k forks · 4,456 contrib · 4,289.7k dl/wk