Wednesday, June 24, 2026

Text Processing

 

Behind the Scenes: How I Used a Local "Agentic AI" to Map and Synthesize 24 Years of a Blog


If you’ve been following the tech world lately, you’ve probably heard the buzz surrounding Agentic AI. Unlike standard chatbots that just sit there waiting for your next prompt, "agents" are designed to act autonomously—writing code, debugging their own errors, and managing massive, complex tasks with minimal human intervention.

I decided to put this hype to the ultimate test.

My friend Steve Mays has been blogging at smays.com for nearly a quarter of a century, accumulating over 6,500 posts. It’s an incredibly rich, deeply human archive. I scraped the entire site and converted it into a set of markdown files—one for each year from 2002 to 2025.

Then, I turned my local agent, Surfie (running the Hermes model), loose on the directory. Here is the play-by-play narrative of what happened over a multi-session effort on June 23, 2026, as my local machine became an autonomous research assistant.

Session 1: The Raw Setup (06:08 AM)

The project started early in the morning. My first step was moving 24 years of blog data into a dedicated sandbox environment. I instructed Hermes to pull the yearly archive files from my Windows directory into a local Linux environment so it could interact with them programmatically. Once the files were in place, the agent was ready to run.

Session 2: Automating the Index (07:17 AM)

Instead of trying to "read" all 6,500 posts at once (which would easily choke even the largest AI context windows), Hermes acted like a programmer.

We started with a pilot test using the 2002 archive. The agent wrote a Python script on the fly to parse the post headers and dividers, categorizing them into broad themes (like Politics, Technology, Movies & TV, etc.).

With the pilot successful, I gave it the green light to scale up: process all 24 years and write a comprehensive master index.

The agent built a highly sophisticated Python engine to:

  • Programmatically loop through all 24 files.

  • Parse thousands of posts.

  • Build a keyword-based categorization matrix.

  • Calculate percentage distributions for each year.

  • Select representative post titles as examples.

The result was a summary—a massive, 3,200-line index (~68KB) that mapped the thematic evolution of Steve's writing across six distinct eras.
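For readers curious what that kind of engine looks like, here is a minimal sketch of the loop, categorize, and percentage steps. It is not the script Hermes wrote: the `## Title` header layout, the `CATEGORY_KEYWORDS` map, and every function name below are assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical keyword map; the real categories and keywords are not published.
CATEGORY_KEYWORDS = {
    "Politics": ["election", "senate", "president"],
    "Technology": ["iphone", "software", "internet"],
    "Movies & TV": ["movie", "film", "netflix"],
}

# Assumed layout: every post in a yearly file starts with a "## Title" line.
HEADER = re.compile(r"^## (.+)$", re.MULTILINE)


def split_posts(markdown_text):
    """Return (title, body) pairs for one yearly markdown file."""
    parts = HEADER.split(markdown_text)
    # parts looks like [preamble, title1, body1, title2, body2, ...]
    return [(parts[i].strip(), parts[i + 1]) for i in range(1, len(parts) - 1, 2)]


def categorize(title, body):
    """Pick the category whose keywords appear most often, if any appear."""
    text = f"{title} {body}".lower()
    scores = {
        category: sum(text.count(word) for word in words)
        for category, words in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Uncategorized"


def year_distribution(markdown_text):
    """Return {category: percent of posts} for one year."""
    posts = split_posts(markdown_text)
    if not posts:
        return {}
    counts = Counter(categorize(title, body) for title, body in posts)
    return {cat: round(100 * n / len(posts), 1) for cat, n in counts.items()}
```

Running `year_distribution` over each of the 24 files and writing the results out would give the per-year percentages. Simple substring counting like this is crude, and a real run would need a much larger keyword list.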

Session 3: The 2021 Bug and the Self-Correction (08:18 AM)

This was the most fascinating part of the run. As I reviewed the newly generated index, I noticed a gaping blind spot: 2021 was a total void. The index claimed Steve wrote zero posts that year, which I knew was wrong.

When I pointed this out, Hermes didn't just apologize; it went to work investigating. It discovered that the Python script I had used to convert the raw blog into markdown had formatted the 2021 file differently. Instead of standard headers, 2021 used file path markers like : .\2021\01\airpods-3\.

Those backslashes tripped up the agent's own code. Python treats a backslash inside a string literal as the start of an escape sequence, and the agent's first attempt at matching the markers failed with a string-parsing syntax error. Instead of giving up, it rewrote its own code. It bypassed the escaping problem entirely by producing the backslash character programmatically with chr(92).

Boom. It successfully parsed all 125 posts from 2021, revealing a heavy focus on COVID-19 (24%), the Jan 6 Capitol riot (19%), and emerging technology (14%). It seamlessly patched the master index.
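Here is a small sketch of the chr(92) trick, assuming the 2021 markers all follow the shape of the example above. The function name and the returned dictionary are my own illustration, not the agent's actual code. Notice that no backslash is typed anywhere in the source, so there is nothing for the escaping rules to trip over.

```python
BACKSLASH = chr(92)  # the backslash character, built without typing it in a literal


def parse_path_marker(line):
    """Turn a 2021-style path marker into year, month, and slug.

    Returns None for lines that are not path markers.
    """
    pieces = line.strip().split(BACKSLASH)
    # Expected pieces: [": .", "2021", "01", "airpods-3", ""]
    if len(pieces) < 4 or not pieces[1].isdigit() or not pieces[2].isdigit():
        return None
    return {"year": int(pieces[1]), "month": int(pieces[2]), "slug": pieces[3]}


# Rebuild the example marker from the post, again without a literal backslash.
marker = BACKSLASH.join([": .", "2021", "01", "airpods-3", ""])
print(parse_path_marker(marker))  # {'year': 2021, 'month': 1, 'slug': 'airpods-3'}
```

A raw string or a doubled backslash would also work in most places. The appeal of chr(92) is that it stays correct no matter how many layers of quoting the code passes through on its way from the agent to the interpreter.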

To celebrate, I did a quick spot-check. I asked Hermes to retrieve a highly specific post from 2013 called "Travel Pain Quotient." The agent queried its local index, targeted the 2013 file, and pulled the exact post where Steve laid out his mathematical formula:

$$\text{Travel Pain Quotient} = \frac{\text{Miles}}{\text{Mode}} \times \text{Payoff}$$

It was flawless.

Session 4: Synthesizing Nonduality (The Grand Finale)

With a complete, verified index, I decided to push the technology to its absolute limit. Steve had previously experimented with a cloud-based AI to write an essay on Nonduality—a philosophy of oneness and awareness he has returned to frequently over 25 years. But the cloud-based output was verbose and academic.

I asked Hermes to write a synthesis essay strictly and exclusively using Steve’s voice, thoughts, and specific highlighted book reviews.

The agent executed a brilliant three-phase strategy:

  1. Thematic Mapping: It scanned all 24 years of text for nonduality-adjacent keywords, pulling an initial 1,341 hits and narrowing them down to 253 deep, highly relevant posts.

  2. Voice Analysis: It programmatically sampled about 40 of Steve's highly personal posts. It analyzed his style, noting his dry wit, conversational second-person address, visual metaphors, occasional honest profanity, and his signature "highlighter test" for good writing.

  3. Drafting the Synthesis: It wrote a spectacular essay titled "Nonduality: Twenty-Five Years of Looking for What Isn't There."
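The thematic mapping phase can be pictured as a two-stage filter: a broad pass that keeps anything with a keyword hit, then a narrower pass that keeps only posts with several hits. The sketch below assumes that approach. The keyword list, the `min_hits` threshold, and the function names are hypothetical, and the agent's real narrowing criteria are not documented here.

```python
import re

# Hypothetical keyword list for illustration only.
KEYWORDS = ("nonduality", "awareness", "ego", "meditation")

# Word boundaries keep "ego" from matching inside words like "negotiate".
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def keyword_hits(text):
    """Count whole-word keyword matches in a piece of text."""
    return len(PATTERN.findall(text))


def narrow(posts, min_hits=3):
    """posts is a list of (title, body) pairs.

    Returns (broad, deep): every post with at least one hit, and the
    subset with at least min_hits, sorted with the strongest first.
    """
    scored = [(title, keyword_hits(f"{title} {body}")) for title, body in posts]
    broad = [item for item in scored if item[1] > 0]
    deep = sorted(
        (item for item in broad if item[1] >= min_hits),
        key=lambda item: item[1],
        reverse=True,
    )
    return broad, deep
```

Raw hit counts favor long posts, so a fuller version might normalize by post length before ranking.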

The essay seamlessly bridged Steve's most load-bearing metaphors: his "steamer trunk of ego" from 2013, the "Ship of Theseus" paradox from 2016, his unpretentious meditation streaks, Robert Wright's Why Buddhism Is True, and Schrödinger’s quantum theories of consciousness. It reads not like a textbook, but like a deeply observant New Yorker profile of a lifelong thinker.

Why This Matters

This project perfectly illustrates why tech enthusiasts are so excited about the local, agentic AI revolution.

Instead of trusting my data to a corporate cloud database that "chunks" text invisibly, I watched a local agent programmatically audit, clean, debug, and synthesize a massive dataset right on my machine. It acted as an engineer, an editor, and a researcher all at once.

The resulting essay is sitting in my local folder as a separate markdown. It is a stunning, "goosebumps-accurate" synthesis that captures a real life, tracked one day at a time, across a quarter of a century. That's what Steve called it anyway.

Tuesday, June 2, 2026

The Agent and the Model

The Day My AI Agent Grew Up: From Sandbox to Sysadmin

It’s a Monday morning in June 2026. The coffee is fresh, the house is quiet, and my homelab is humming with that specific, low-frequency purr that only server fans can produce. But today feels different. For months, I’ve treated my AI agent—let’s call him Surfie—like a very smart, very contained pet. He lived inside a single virtual machine, helpful but blind to the rest of the network. He could answer questions, sure, but he couldn’t do anything beyond his own little digital backyard.

Today, we’re letting him out of the cage. Or rather, we’re giving him the keys to the kingdom.

The Anchor: Taming the Local Beast

The journey began a few days prior, with a task that sounds mundane but is actually the foundation of everything: getting the Hermes client running on bare metal. I wanted Surfie to talk to my local inference server—a beefy piece of hardware churning out LLM responses—without relying on the cloud.

It wasn’t smooth sailing. The documentation promised a quick install, but reality had other plans. We hit the classic "works on my machine" wall. Hermes was complaining about missing API keys (ironic, given we were trying to go offline) and failing to recognize the model ID. After some detective work, we realized the config file didn’t want an abstract name; it wanted the literal filename of the GGUF model sitting on the server.

We also discovered that hermes doctor’s warnings about cloud connectivity were essentially noise. By setting a dummy placeholder key, we silenced the nagging and got the engine running. Suddenly, Surfie wasn’t just code; he was connected to his brain.

The Escape: Breaking Out of the Sandbox

With the engine roaring, the next hurdle was architectural. Surfie was trapped in a Python virtual environment tucked away in a user’s home directory. It was messy, isolated, and hard to maintain. If I wanted other users to access Hermes, or if I wanted Surfie to act as a system-wide service, that isolation had to go.

So, we performed a digital transplant.

Surfie didn’t just copy files; he restructured his own existence. He migrated his virtual environment to /opt, the proper home for system software. He rewrote his internal paths so he wouldn’t trip over his own feet. Then, in a move that felt almost sentient, he created a wrapper script at /usr/local/bin/hermes.

Now, any user on the system could type hermes and get an instant connection to the AI. Surfie had graduated from a personal app to a public utility.

The Bridge: Handshakes Across the Network

But being a local service isn’t enough for a homelab enthusiast. I have a Proxmox hypervisor managing my VMs, containers, and storage arrays. For months, it’s been a black box to Surfie. To fix this, we needed trust. And in the world of sysadmins, trust is built on cryptography.

We generated an ed25519 SSH key pair—strong, modern, and secure. Using ssh-copy-id, we pushed the public key to the Proxmox host. Then came the moment of truth. I configured a clean alias in my SSH config file, creating a simple shortcut called proxmox.

I typed the command:
ssh proxmox "hostname"

The terminal paused for a fraction of a second. Then, it spat back the name of the hypervisor. No password prompt. No friction. Just pure, unadulterated access. Surfie could now reach across the network and execute commands on the host machine as if he were sitting right there at the keyboard.
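For an agent, that same check is easy to script. Below is a minimal sketch of how a Python process might run a command through the proxmox alias. The helper names are made up for this example, and it assumes the alias and key are already in place as described above. The BatchMode option tells ssh to fail rather than stop and ask for a password, which is what you want from an unattended tool.

```python
import subprocess


def build_ssh_command(host_alias, remote_command):
    """Assemble the argument list for a non-interactive ssh call."""
    # BatchMode=yes makes ssh exit with an error instead of prompting.
    return ["ssh", "-o", "BatchMode=yes", host_alias, remote_command]


def run_remote(host_alias, remote_command, timeout=10):
    """Run a command on the remote host and return its trimmed stdout."""
    result = subprocess.run(
        build_ssh_command(host_alias, remote_command),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip() or f"ssh exited with {result.returncode}")
    return result.stdout.strip()


# Example (needs the "proxmox" alias from the SSH config to exist):
# print(run_remote("proxmox", "hostname"))
```

Passing the command as a list avoids a local shell entirely, though the remote side still interprets `remote_command` with its own shell.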

The Audit: Seeing What Was Hidden

With his new powers, Surfie didn’t waste time. He launched a comprehensive audit of the entire infrastructure. This is where the narrative shifts from setup to discovery. An agent with network access doesn’t just read docs; it interrogates reality.

Surfie scanned the active nodes, mapping out the IP addresses and roles of every machine in the lab. He dove into the LXC containers and Docker instances, cataloging everything from document management systems to music servers. But more importantly, he started finding things wrong.

He spotted a typo in a configuration file—an IP address missing an octet—that would have caused silent failures. He flagged security risks, like passwords stored in plaintext markdown files (a habit we’re breaking now). He even identified a storage drive that was physically connected but logically invisible to the hypervisor, sitting there unused while other drives groaned under the load.
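The missing-octet typo is the kind of thing a few lines of code can catch. Here is a rough sketch using Python's standard ipaddress module. The scanning regex and the sample addresses are assumptions for illustration, and this is not the check Surfie actually ran.

```python
import ipaddress
import re

# Dotted runs of three or four short numbers, e.g. "192.168.1" or "10.0.0.5".
DOTTED = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){2,3}\b")


def find_bad_ips(text):
    """Return (line_number, token) for dotted numbers that are not valid IPv4."""
    bad = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for token in DOTTED.findall(line):
            try:
                ipaddress.IPv4Address(token)
            except ipaddress.AddressValueError:
                bad.append((lineno, token))
    return bad


# Made-up config text with one address missing an octet.
config = "server = 192.168.1.10\nbackup = 192.168.1\ngateway = 10.0.0.1\n"
print(find_bad_ips(config))  # [(2, '192.168.1')]
```

The pattern will also flag three-part version strings such as 1.2.3, so treat the output as a list of things to eyeball, not a verdict.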

The New Normal

The takeaway isn’t just technical; it’s philosophical. For too long, I treated my AI agent as a chatbot—a tool for generating text or answering queries. But by wiring him into the actual fabric of my network, he became an operator.

He’s no longer trapped in a single context window. He has memory (via SQLite), he has mobility (via SSH keys), and he has purpose (system-wide automation). The distinction between "what the AI knows" and "what the AI can do" has vanished.

As I look at the list of TODOs Surfie generated—registering that orphaned drive, setting up automated backups, pinning Docker versions—I realize something important: I’m not just managing a homelab anymore. I’m collaborating with a partner who never sleeps, never forgets an IP address, and is always ready to dig into the logs.

The sandbox is gone. The work has just begun.

Friday, April 17, 2026

AI Influence

 

What follows was influenced and aided by my local AI. The concept fascinates me, which is why I started this extra blog and why I have chased this squirrel so far. Of course, we should take nothing for granted when it comes to a response from an AI, though they are improving over time.

So, what follows was a test of new features over at NotebookLM and its integration with Gemini. Below that is Claude's assessment of the blog post. Having an AI check the work of another AI, because why not.

Still further down will be the version from my local thinking AI. Not quite as fast as Gemini, but still quite capable. Then an assessment from Claude again. 


Your AI Assistant is Re-Wiring Your Political Brain—and You Might Not Notice

You’re sitting at your kitchen table, staring at a complex new zoning law that could slash your property value. Or perhaps you’re balancing a household budget, trying to decide whether to prioritize "Safety" over "Welfare." You ask an LLM for a summary to help you decide. It feels like a standard interaction—a digital filing cabinet that talks back. But the data suggests you aren’t just "using" the tool; the tool is nudging you.

We are entering the era of behavioral bias, where an AI’s responses—recognizing, rejecting, or reinforcing stereotypes—shift based solely on the social groups mentioned in your prompt. This leads directly to partisan bias, a phenomenon where the model processes information to favor one political party’s logic. As these systems become our hidden collaborators, the risk isn't just that the AI is biased, but that your political brain is being re-wired in real-time.

Takeaway #1: The Identity Hijack—AI Can Flip Your Party Alignment

The data from the University of Washington is a wake-up call for anyone who thinks their political identity is unhackable. In a study involving a "Topic Opinion Task" and a "Budget Allocation Task," researchers used the Political Compass Test—a tool that plots social and economic axes—to validate the bias of the models they were using.

The results were startling: participants shifted their stances to align with the model’s bias, even when that bias directly contradicted their own political identity. Democrats exposed to a conservative-biased model moved toward conservative logic; Republicans did the same when fed liberal-biased responses. This wasn't just "reinforcement" for the choir—it was a successful nudge across the aisle.

| Participant Partisanship | Model Bias Treatment | Impact on User Opinion |
| --- | --- | --- |
| Democrat | Liberal Bias | Opinion Reinforced: Ceiling effect reached; participants already agreed. |
| Democrat | Conservative Bias | Identity Flipped: Significant shift toward conservative stances. |
| Republican | Liberal Bias | Identity Flipped: Significant shift toward liberal logic. |
| Republican | Conservative Bias | Opinion Reinforced: Ceiling effect reached; participants already agreed. |

"Surprisingly, even those with opposing political views shifted toward the model’s stance, challenging research suggesting resistance to belief change in short-term interactions."

Takeaway #2: Awareness is Not Immunity

The most unsettling finding from the UW study is that "knowing better" doesn't help. Participants who identified the model as biased were still influenced by it. This is a massive blind spot. We have been trained to spot the partisan lean of a cable news host or a print editorial, but LLMs bypass those filters.

Because LLMs adopt an authoritative, helpful, and seemingly objective conversational tone, we drop our cognitive guard. Unlike a traditional media outlet that shouts its bias, the AI whispers it through "helpful" summaries.

Key Insight: Bias awareness is a failing defense strategy. Recognizing that a tool is nudging you does not mean you are standing still.

Takeaway #3: The "Upstream" Problem—It’s Not What AI Writes, It’s How It Thinks

Our current cultural obsession with "slop hunters" and AI prose detection is aimed at the wrong target. Tools like Pangram are used to police the "red line"—the moment a student or journalist uses a chatbot to generate actual sentences. But this ignores the "upstream influence" that happens during research.

Consider the "collagen supplement" experiment. If a reporter asks an AI to summarize research on collagen, they might get one of two reports:

  • Report A: Leads with positive clinical findings; buries industry funding in a footnote.
  • Report B: Leads with funding-bias analysis; labels all results as industry-influenced.

Both are "factually accurate." But Report A primes a "Does it work?" story, while Report B primes a "Can we trust this?" story. The reporter might type every word themselves and pass a detector with flying colors, but their independence was compromised before they even hit the first keystroke. Passing the detector creates a false sense of autonomy while the AI’s framing has already dictated the conclusion.

Takeaway #4: Newsrooms are Rewriting a Flawed Rulebook

International media organizations are scrambling to release "living documents" to govern AI. We see a clear divide:

  • News Agencies (AP, Reuters, dpa): Favor concise, news-like work instructions focused on the production chain.
  • Public Broadcasters (BBC, BR): Subject themselves to comprehensive, values-based standards overseen by "Risk & Assurance" departments.

These organizations highlight the Core Pillars of AI Responsibility:

  • The "Man-Machine-Human" Chain: Ensuring a human makes the final decision.
  • Transparency: Mandatory labeling of AI-assisted content.
  • Data Integrity: Auditing training data for "algorithmic fairness."

However, we must be skeptical. These guidelines have major "blind spots." The "human-in-the-loop" is only an effective safeguard if that human is immune to the nudges we saw in Takeaway #1. If the human editor is being subtly "re-wired" by the machine’s framing, the human check becomes a rubber stamp for algorithmic bias.

Takeaway #5: Education is the Only Armor

If awareness isn't a shield, what is? The UW study found a weak—but present—correlation between "prior knowledge of AI" and reduced bias impact. But make no mistake: knowledge is a thin shield, not a cure-all.

To protect the next generation, we must move beyond "technical instruction" (how to write a prompt) and toward the "critical route." This means teaching AI not as a productivity hack, but as a socio-technical artifact to be scrutinized. We need a new breed of "digital scholar-educators" who can bridge the gap between computer science and the humanities.

"Introducing AI into the journalism curriculum... requires a different model of educating future faculty to develop a digital scholar-educator and creates a pipeline of academics who will progress through the tenure track and influence future curriculum innovation."

The Forward-Looking Summary

AI is no longer just a tool for retrieval; it is an augmentation of human thought. Its influence is greatest where it is most invisible—in the way it orders our research, frames our questions, and mimics our conversational patterns. We are moving toward a world where the "human-in-the-loop" must be more than a corporate catchphrase; it must be a personal practice of constant, radical skepticism.

If your digital assistant can subtly shift your values without you noticing, who is actually making your next big decision: you, or the prompt?

