LinkTrace

LinkTrace is a document-oriented crawler. Every crawled page becomes a rich Document object containing metadata, content, and discovered relationships.

Perfect for: site structure analysis, link tracking, concurrent page fetching, and HTML document transformation.

Not: a Scrapy replacement. Scrapy is a powerful, full-featured framework; LinkTrace is deliberately lightweight, with no pipelines, middleware, or project scaffolding to configure. If you want crawl results in minutes rather than after hours of setup, and a gentler learning curve, LinkTrace is for you.

Quick Start

pip install linktrace

import asyncio
from linktrace import Spider

async def main():
    spider = Spider(start_url="https://example.com", max_depth=2)
    documents = await spider.run_async()
    for doc in documents:
        print(doc.title, len(doc.internal_links))

asyncio.run(main())
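
Once the crawl returns, the documents can be post-processed with ordinary Python. The following is a minimal sketch that ranks pages by how many internal links they contain. It relies only on the `title` and `internal_links` attributes used in the Quick Start. The helper name `rank_by_internal_links` and the `SimpleNamespace` stand-ins are hypothetical, used here so the example runs without a network crawl; they are not part of the linktrace API.

```python
from types import SimpleNamespace


def rank_by_internal_links(documents, top_n=5):
    """Return (title, link_count) pairs, pages with the most internal links first."""
    ranked = sorted(
        ((doc.title, len(doc.internal_links)) for doc in documents),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:top_n]


# Hypothetical stand-ins for crawled documents; real ones would come from the spider.
sample_docs = [
    SimpleNamespace(title="Home", internal_links=["/about", "/blog", "/contact"]),
    SimpleNamespace(title="About", internal_links=["/"]),
    SimpleNamespace(title="Blog", internal_links=["/", "/blog/post-1"]),
]

for title, count in rank_by_internal_links(sample_docs):
    print(f"{title}: {count} internal links")
```

In a real run, the list returned by `spider.run_async()` would presumably be passed in place of `sample_docs`, assuming each document exposes the same two attributes.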

Documentation