GIT_FEED

datalab-to/marker

Convert PDF to markdown + JSON quickly with high accuracy

What it does

Marker is a free, open-source tool that converts PDFs and other documents into clean, structured formats such as markdown and JSON. It handles tables, equations, images, and multiple languages. It can run entirely on your own hardware, and an optional AI-assisted mode is meant to bring accuracy close to that of paid services like LlamaParse and Mathpix.

Why it matters

For any product that needs to extract or process information from documents (legal tech, finance, research tools, or AI apps that ingest reports), this removes a costly dependency on paid APIs and gives teams full control over their document pipeline. Its 36,000+ GitHub stars signal strong demand for affordable, high-quality document parsing as a foundational layer in AI-powered products.

Score: 25 (Active)

On the radar — signal detected

Stars: 36.4k
Forks: 2.5k
Contributors: 29
Language: Python

Score updated Jun 27, 2026

Related projects

Quarkdown is a writing and publishing tool that lets you create books, academic papers, presentations, and websites all from a single document using an enhanced version of Markdown (a simple text formatting language). Instead of juggling multiple tools for different output formats, you write once and the system automatically produces polished, print-ready results in whatever format you need.

// why it matters

With more than 15,000 stars on GitHub, there is clear demand for a unified authoring tool that removes the fragmentation between documentation, publishing, and presentation software, a space currently dominated by expensive or clunky incumbents like LaTeX and Microsoft Office. For builders, this signals a growing market of creators and researchers who want developer-friendly, version-controllable workflows for professional publishing without the overhead of traditional desktop tools.

Kotlin | 15.6k stars | 484 forks | 16 contributors

Web Platform Tests (WPT) is a massive shared test suite that checks whether all major web browsers — Chrome, Firefox, Safari, Edge, and others — behave consistently when displaying websites and web apps. Think of it as a universal quality checklist that browser makers run to confirm their software follows the agreed-upon rules of how the web should work.

// why it matters

When browsers behave differently, developers must build workarounds that add cost and slow down shipping. WPT is the industry's shared mechanism for reducing that friction, making the web a more reliable platform for products to run on. For builders, broader browser consistency means less money spent on cross-browser bug fixes and greater confidence that web-based products will reach users as intended, regardless of what device or browser they use.

HTML | 6.0k stars | 3.8k forks | 3245 contributors

LLVM is the foundational software that turns code written by developers into programs that actually run on computers and chips — it's the engine behind how most modern programming languages get translated into working software. It includes tools like Clang (which handles C and C++ code) and powers compilers used by Apple, Google, and countless other companies across nearly every platform and device.

// why it matters

Almost every major tech product, from iPhone apps to AI chips, relies on LLVM to build and run software efficiently, making it one of the most critical pieces of infrastructure in the entire industry. For founders and investors, understanding LLVM matters because teams building new programming languages, custom hardware, or performance-critical software almost always depend on or integrate with it, so its evolution directly shapes what is technically possible in product development.

LLVM | 39.0k stars | 17.6k forks | 8791 contributors

The Supabase CLI is a command-line tool that lets developers manage projects on Supabase (an open-source alternative to Google Firebase) directly from their computer, including setting up local development environments, managing database changes, and deploying serverless functions. It gives builders a fast, scriptable way to control their entire backend infrastructure without touching a web dashboard.

// why it matters

As more startups choose Supabase over Firebase or custom backends to move faster, a robust CLI means entire backend workflows can be automated, version-controlled, and reproduced, which reduces errors and speeds up shipping. With more than 2,000 stars and 169 contributors, this is a well-adopted tool in a growing ecosystem, signaling strong developer momentum behind Supabase as a serious Firebase competitor.

Go | 2.3k stars | 485 forks | 169 contributors