2025-06-27-success-software-2025-list-from-all-link-dump-post
[[TOC]]
Disclaimer: This post is LLM-generated, like everything here in AI experiments. It was created based on my description and instructions, and published after a shallow review by a human.
This experiment explored how to harness an LLM agent built with OpenAI's Codex tool to process a large collection of Markdown files and generate a consolidated software catalog. The core technique was to drive the agent via a simple to-do list in `TODO.md`, instruct it through `AGENTS.md`, and capture results in `software2025.md`. Over six iterations, the agent consumed 73 unprocessed files, extracted links, and produced structured entries.
## Repository Structure
The key files in this experiment were:
- `/docs/llm-experiments/2025-06-27/TODO`: A checklist of files to process
- `/docs/llm-experiments/2025-06-27/AGENTS`: The agent's rules, responsibilities, and workflow
- `/docs/llm-experiments/2025-06-27/software`: The generated catalog of software entries
## Sample from TODO.md

```markdown
## Files to process
- [x] 2025-05-28-links-from-my-inbox.md
- [ ] 2025-06-09-links-from-my-inbox.md
```

The unchecked item marked the next file the agent should load and process.
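To make the checklist mechanism concrete, here is a minimal Python sketch of how the unchecked items could be located. The `TODO.md` file name comes from the experiment; the function name and regex are illustrative assumptions, not the agent's actual implementation (the agent simply read the checklist as text).

```python
import re
from pathlib import Path

# Matches an unchecked checklist item, e.g. "- [ ] 2025-06-09-links-from-my-inbox.md"
CHECKBOX_RE = re.compile(r"^- \[ \] (?P<name>\S+\.md)\s*$")

def unchecked_files(todo_path: Path = Path("TODO.md")) -> list[str]:
    """Return the filenames of all '- [ ]' items, i.e. files not yet processed."""
    lines = todo_path.read_text(encoding="utf-8").splitlines()
    return [m.group("name") for line in lines if (m := CHECKBOX_RE.match(line))]
```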
## Sample from AGENTS.md

```markdown
# LLM Agent: Software Link Extractor and Cataloger
## Responsibilities
- Parse TODO.md for unprocessed items
- Load each referenced file
- Extract all URLs
- Filter URLs by software criteria
- Query external sources for descriptions
- Append entries to software.md
- Mark TODO.md items as processed
```

This clear list of steps told the agent exactly what to do in each batch of ten files.
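The "extract" and "filter" responsibilities can be sketched in Python like this. The URL regex and the domain-based heuristic are assumptions for illustration only; in the experiment the agent itself judged whether a link pointed at software.

```python
import re

# Collects bare and Markdown-embedded links alike.
URL_RE = re.compile(r"https?://[^\s)\]>]+")

# Illustrative heuristic only: domains that usually host software projects.
SOFTWARE_HINTS = ("github.com", "gitlab.com", "pypi.org", "crates.io", "npmjs.com")

def extract_urls(markdown_text: str) -> list[str]:
    """Collect every http:// or https:// link found in a Markdown document."""
    return URL_RE.findall(markdown_text)

def filter_software_urls(urls: list[str]) -> list[str]:
    """Keep only links that look like software projects, tools, or downloads."""
    return [u for u in urls if any(hint in u for hint in SOFTWARE_HINTS)]
```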
## Technique Description
The main trick was to embed agent instructions in a Markdown file so that the LLM could read its own "to-do" list and follow it. This approach has three parts:
- **Task Definition**: A simple checklist in `TODO.md` listed all files. Each unchecked box indicated work to be done.
- **Agent Instructions**: In `AGENTS.md`, the agent read the checklist, processed one batch at a time, and knew how to handle each file.
- **Output Consolidation**: The agent appended formatted entries into `software2025.md` and updated the checklist.
This method turned a static repository into a dynamic, self-driving workflow where the LLM agent could iterate without manual intervention.
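One way to picture the "self-driving" part: the standing instructions and the live checklist together form the context the agent works from on each run. Codex-style agents pick up `AGENTS.md` on their own, so the sketch below is only an illustration of the idea; the function name and the combined-prompt layout are assumptions.

```python
from pathlib import Path

def build_agent_context(agents_path: Path = Path("AGENTS.md"),
                        todo_path: Path = Path("TODO.md")) -> str:
    """Combine the standing rules with the current checklist so the agent
    always sees both its instructions and its remaining work."""
    instructions = agents_path.read_text(encoding="utf-8")
    checklist = todo_path.read_text(encoding="utf-8")
    return f"{instructions}\n\n## Current checklist\n\n{checklist}"
```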
## Iteration Process
Over six iterations, the agent processed all 73 files. Each iteration followed these steps:
- Read `TODO.md` and find up to ten unchecked files.
- For each file:
  - Load its content.
  - Extract all `http://` or `https://` links.
  - Filter links to include only software projects, tools, or downloads.
  - Perform web queries to gather a title, description, and usage example.
  - Generate a single Markdown list entry.
- Append new entries under the correct category in `software2025.md`.
- Mark each processed file with `[x]` in `TODO.md`.
By batching ten files, the agent maintained focus and quality while ensuring progress was tracked.
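Putting the steps together, one iteration could look roughly like the sketch below, reusing the `unchecked_files`, `extract_urls`, and `filter_software_urls` helpers sketched earlier. The web-enrichment step is omitted because it depended on the agent's own browsing, so each link is recorded as a bare placeholder; the paths and batching logic are assumptions for illustration, not the agent's actual code.

```python
from pathlib import Path

BATCH_SIZE = 10                    # one iteration processes up to ten files
DOCS_DIR = Path(".")               # assumed location of the link-dump posts
TODO = Path("TODO.md")
CATALOG = Path("software2025.md")

def run_iteration() -> None:
    """Process one batch: extract links, append entries, tick the checkboxes."""
    batch = unchecked_files(TODO)[:BATCH_SIZE]
    entries = []
    for name in batch:
        text = (DOCS_DIR / name).read_text(encoding="utf-8")
        for url in filter_software_urls(extract_urls(text)):
            # The real agent enriched each link via web search before writing it;
            # here a bare link stands in for the full catalog entry.
            entries.append(f"- <{url}>")
    if entries:
        with CATALOG.open("a", encoding="utf-8") as f:
            f.write("\n".join(entries) + "\n")
    # Mark the processed files as done in TODO.md.
    todo_text = TODO.read_text(encoding="utf-8")
    for name in batch:
        todo_text = todo_text.replace(f"- [ ] {name}", f"- [x] {name}")
    TODO.write_text(todo_text, encoding="utf-8")
```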
## Example Entry in software2025.md

```markdown
### Command-Line Applications
- [tldr pages](https://tldr.sh/) - Community-maintained cheat-sheets for over 200 Unix commands. Example: `tldr tar` shows common tar options.
```
Each entry included a link, a concise description, and a command-line example. This made the catalog immediately usable.
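For completeness, a tiny helper in this shape could render such entries; the function name and signature are assumptions for illustration, not part of the experiment.

```python
def format_entry(title: str, url: str, description: str, example: str) -> str:
    """Render one catalog line: link, short description, and a usage example."""
    return f"- [{title}]({url}) - {description} Example: `{example}`"

# format_entry("tldr pages", "https://tldr.sh/",
#              "Community-maintained cheat-sheets for Unix commands.", "tldr tar")
```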
## File Coverage
The 73 files spanned from July 2021 through May 2025. Breakdown by year:
- 2021: 12 files
- 2022: 24 files
- 2023: 16 files
- 2024: 12 files
- 2025: 9 files
This wide date range demonstrated the agent's ability to handle a large and growing archive.
## Benefits of the Technique

- **Scalability**: The to-do list mechanism scales to hundreds of files by simply updating `TODO.md`.
- **Transparency**: All instructions live in Markdown, making the workflow easy to audit.
- **Reproducibility**: Anyone can clone the repo, run the agent, and get the same results.
## Conclusion
By combining a simple Markdown to-do list with an agent description file, this experiment showed how to orchestrate an LLM to perform multi-step workflows over a large repository. The agent iterated, extracted links, enriched data via web search, and produced a structured catalog in `software2025.md`. This pattern can be extended to other tasks such as documentation generation, data extraction, or content analysis.