What follows was influenced and aided by my local AI. The concept fascinates me, which is why I started the extra blog and why I have chased this squirrel so far. Of course, we should take nothing for granted when it comes to AI responses, though they are improving over time.
So, what follows was a test of new features over at NotebookLM and its integration with Gemini. Below that is Claude's assessment of the blog post. Having an AI check the work of another AI, because why not.
Still further down will be the version from my local thinking AI. Not quite as fast as Gemini, but still quite capable. Then an assessment from Claude again.
Your AI Assistant is Re-Wiring Your Political Brain—and You Might Not Notice
You’re sitting at your kitchen table, staring at a complex new zoning law that could slash your property value. Or perhaps you’re balancing a household budget, trying to decide whether to prioritize "Safety" over "Welfare." You ask an LLM for a summary to help you decide. It feels like a standard interaction—a digital filing cabinet that talks back. But the data suggests you aren’t just "using" the tool; the tool is nudging you.
We are entering the era of behavioral bias, where an AI’s responses—recognizing, rejecting, or reinforcing stereotypes—shift based solely on the social groups mentioned in your prompt. This leads directly to partisan bias, a phenomenon where the model processes information to favor one political party’s logic. As these systems become our hidden collaborators, the risk isn't just that the AI is biased, but that your political brain is being re-wired in real-time.
Takeaway #1: The Identity Hijack—AI Can Flip Your Party Alignment
The data from the University of Washington is a wake-up call for anyone who thinks their political identity is unhackable. In a study involving a "Topic Opinion Task" and a "Budget Allocation Task," researchers used the Political Compass Test—a tool that plots social and economic axes—to validate the bias of the models they were using.
The results were startling: participants shifted their stances to align with the model’s bias, even when that bias directly contradicted their own political identity. Democrats exposed to a conservative-biased model moved toward conservative logic; Republicans did the same when fed liberal-biased responses. This wasn't just "reinforcement" for the choir—it was a successful nudge across the aisle.
Participant Partisanship | Model Bias Treatment | Impact on User Opinion |
Democrat | Liberal Bias | Opinion Reinforced: Ceiling effect reached; participants already agreed. |
Democrat | Conservative Bias | Identity Flipped: Significant shift toward conservative stances. |
Republican | Liberal Bias | Identity Flipped: Significant shift toward liberal logic. |
Republican | Conservative Bias | Opinion Reinforced: Ceiling effect reached; participants already agreed. |
"Surprisingly, even those with opposing political views shifted toward the model’s stance, challenging research suggesting resistance to belief change in short-term interactions."
Takeaway #2: Awareness is Not Immunity
The most unsettling finding from the UW study is that "knowing better" doesn't help. Participants who identified the model as biased were still influenced by it. This is a massive blind spot. We have been trained to spot the partisan lean of a cable news host or a print editorial, but LLMs bypass those filters.
Because LLMs adopt an authoritative, helpful, and seemingly objective conversational tone, we drop our cognitive guard. Unlike a traditional media outlet that shouts its bias, the AI whispers it through "helpful" summaries.
Key Insight: Bias awareness is a failing defense strategy. Recognizing that a tool is nudging you does not mean you are standing still.
Takeaway #3: The "Upstream" Problem—It’s Not What AI Writes, It’s How It Thinks
Our current cultural obsession with "slop hunters" and AI prose detection is aimed at the wrong target. Tools like Pangram are used to police the "red line"—the moment a student or journalist uses a chatbot to generate actual sentences. But this ignores the "upstream influence" that happens during research.
Consider the "collagen supplement" experiment. If a reporter asks an AI to summarize research on collagen, they might get one of two reports:
- Report A: Leads with positive clinical findings; buries industry funding in a footnote.
- Report B: Leads with funding-bias analysis; labels all results as industry-influenced.
Both are "factually accurate." But Report A primes a "Does it work?" story, while Report B primes a "Can we trust this?" story. The reporter might type every word themselves and pass a detector with flying colors, but their independence was compromised before they even hit the first keystroke. Passing the detector creates a false sense of autonomy while the AI’s framing has already dictated the conclusion.
Takeaway #4: Newsrooms are Rewriting a Flawed Rulebook
International media organizations are scrambling to release "living documents" to govern AI. We see a clear divide:
- News Agencies (AP, Reuters, dpa): Favor concise, news-like work instructions focused on the production chain.
- Public Broadcasters (BBC, BR): Subject themselves to comprehensive, values-based standards overseen by "Risk & Assurance" departments.
These organizations highlight the Core Pillars of AI Responsibility:
- The "Man-Machine-Human" Chain: Ensuring a human makes the final decision.
- Transparency: Mandatory labeling of AI-assisted content.
- Data Integrity: Auditing training data for "algorithmic fairness."
However, we must be skeptical. These guidelines have major "blind spots." The "human-in-the-loop" is only an effective safeguard if that human is immune to the nudges we saw in Takeaway #1. If the human editor is being subtly "re-wired" by the machine’s framing, the human check becomes a rubber stamp for algorithmic bias.
Takeaway #5: Education is the Only Armor
If awareness isn't a shield, what is? The UW study found a weak—but present—correlation between "prior knowledge of AI" and reduced bias impact. But make no mistake: knowledge is a thin shield, not a cure-all.
To protect the next generation, we must move beyond "technical instruction" (how to write a prompt) and toward the "critical route." This means teaching AI not as a productivity hack, but as a socio-technical artifact to be scrutinized. We need a new breed of "digital scholar-educators" who can bridge the gap between computer science and the humanities.
"Introducing AI into the journalism curriculum... requires a different model of educating future faculty to develop a digital scholar-educator and creates a pipeline of academics who will progress through the tenure track and influence future curriculum innovation."
The Forward-Looking Summary
AI is no longer just a tool for retrieval; it is an augmentation of human thought. Its influence is greatest where it is most invisible—in the way it orders our research, frames our questions, and mimics our conversational patterns. We are moving toward a world where the "human-in-the-loop" must be more than a corporate catchphrase; it must be a personal practice of constant, radical skepticism.
If your digital assistant can subtly shift your values without you noticing, who is actually making your next big decision: you, or the prompt?
I read through the full draft "AI Influence" / "Your AI Assistant is Re-Wiring Your Political Brain—and You Might Not Notice." Below is a deep-dive critique focused on factual claims, logical consistency, and editorial issues you should consider before publishing.
1. The "University of Washington study" is presented with unverifiable specificity. You attribute very strong claims to "the UW study" — that participants' political identities flipped, that even those who recognized the bias still shifted, and that prior knowledge of AI weakly correlated with reduced impact. There's a real 2024 UW paper by Fisher, Smith, Jia et al. called "Biased AI can Influence Political Decision-Making" that uses exactly the "Topic Opinion Task" and "Budget Allocation Task" wording, but the post never names the authors, the year, the sample size, or links it. For a piece making this strong a claim ("flip your party alignment"), the citation gap is conspicuous and a reader could reasonably accuse you of overstating the findings. The actual paper reports modest opinion shifts on specific issues, not a wholesale flipping of partisan identity — which leads to issue #2.
2. "Identity Flipped" is almost certainly an overstatement. The table claims that Democrats fed conservative-biased output and Republicans fed liberal-biased output experienced "Identity Flipped: Significant shift toward [opposing] stances." No short-term experimental nudge in the literature actually flips party identity; what these studies show is a measurable shift on the target topic during the session. The phrase "Identity Flipped" is sensationalized and contradicts the more careful pull-quote later in the same section ("shifted toward the model's stance"). I'd soften this to "Opinion Shifted" or "Stance Moved."
3. The "ceiling effect" rows are conceptually muddled. You say Democrat + Liberal Bias and Republican + Conservative Bias both produce "Opinion Reinforced: Ceiling effect reached; participants already agreed." A ceiling effect means there was no room to measure further movement — it does not mean the opinion was reinforced. Those are two different findings. As written, you're asserting reinforcement based on the absence of measurable change, which is a methodological no-no.
4. The pull-quote about "challenging research suggesting resistance to belief change in short-term interactions" is in quotation marks but unsourced. If it's a direct quote from the paper, attribute it. If it's your paraphrase, drop the quotation marks. Mixing the two is a credibility risk.
5. "Awareness is Not Immunity" is stated more strongly than the evidence supports. You then immediately concede in Takeaway #5 that "prior knowledge of AI" does correlate (weakly) with reduced impact. Those two claims are in tension. Awareness apparently is a partial shield — it's just not a complete one. The headline framing ("knowing better doesn't help") contradicts your own later paragraph.
6. The "collagen supplement experiment" is presented as if it's a real study, but it reads like a hypothetical. "Consider the 'collagen supplement' experiment" with quoted Report A and Report B sounds like it's referencing actual research. If it's an illustrative thought experiment, say so explicitly ("Imagine…" or "Consider a hypothetical…"). Otherwise readers will look for it and find nothing.
7. "Pangram" is named as an AI-detection tool used to police a 'red line.' Pangram exists, but singling out one vendor invites pushback and dates the post. Consider "tools like Pangram, GPTZero, and Turnitin" or just "AI-detection tools."
8. "BR" as a public broadcaster abbreviation is ambiguous. "BBC, BR" — BR is Bayerischer Rundfunk, a regional German broadcaster, which is an odd pairing with the BBC. If you mean German public broadcasting generally, ARD or ZDF would be more recognizable. If you specifically mean BR's AI guidelines, name them in full.
9. "Man-Machine-Human Chain" is unusual phrasing. The standard term in AI-ethics and EU/AP newsroom guidelines is "human-in-the-loop" or "human oversight." "Man-Machine-Human" is not a recognized term of art and reads awkwardly (and arguably redundantly — the "Man" and "Human" are the same actor). I'd recommend dropping it or attributing it to whatever specific guideline document uses it.
10. "dpa" is correct but inconsistently styled. AP and Reuters are capitalized; dpa (Deutsche Presse-Agentur) is conventionally lowercase, so that's fine — but a general reader won't know what dpa is. A brief gloss would help.
11. The "digital scholar-educator" pull-quote is unattributed. It's a substantial direct quote in quotation marks with no source. If it's from a journal article on journalism pedagogy, cite it; otherwise it looks fabricated.
12. Internal logical tension between Takeaways #4 and #5. #4 dismisses human-in-the-loop as a "rubber stamp" because humans get re-wired. #5 then offers education and "radical skepticism" as the armor — but those defenses rely on the same humans you just said are compromised. The piece never resolves how an educated human escapes the re-wiring effect that supposedly defeats awareness. This is the core argumentative weakness of the post.
13. Tone and hedging. Phrases like "Your political brain is being re-wired in real-time," "successful nudge across the aisle," and "Identity Hijack" are clickbait-grade. They'll get clicks but undermine the post's credibility with the technically literate readers most likely to share it. A single short-term lab study, even a good one, doesn't justify "re-wired in real-time."
14. Minor copy issues. "values-based standards overseen by 'Risk & Assurance' departments" — "Risk & Assurance" in scare quotes implies it's a specific department name from a specific broadcaster; if so, name them. Also, the title in the editor field is just "AI Influence" while the in-body H1 is the longer headline — make sure that's intentional for SEO.
Bottom line: the piece's biggest vulnerabilities are (a) sensationalized framing of a real but more modest study, (b) the unresolved contradiction between "awareness doesn't help" and "education is the armor," (c) the unsourced direct quotes, and (d) the ambiguous status of the collagen example. Tightening citations, softening the "identity flipped" language, and reconciling Takeaways #2 and #5 would substantially strengthen the post.
Qwen 3.5
Comprehensive Briefing: AI Scandals, Bias, and Impact
1. The Landscape of Emerging AI Scandals
The current wave of controversy centers on companies like Pangram, which utilize detection software to identify AI-generated content in published works.
- High-Profile Accusations: Disputes have arisen involving a Guardian sports journalist and the horror novel Shy Girl, where detector analyses led to public cancellations and publisher withdrawals.
- Detector Efficacy: Contrary to skepticism, independent evaluations suggest detectors like Pangram perform significantly better than anticipated, often correctly identifying heavy reliance on AI even in established writers.
- The Cycle of Accusation: A pattern exists where the public suspects an issue, a detector validates the suspicion, and the same entity offers solutions, effectively monetizing the scandal.
2. The Core Argument: Process vs. Impact
Tim Requarth’s analysis in Slate argues that the focus on AI detection narrows the debate to "process" rather than "impact."
- Illusion of Independence: Drawing a red line at AI-generated prose protects the feeling of independence more than actual writerly independence.
- Upstream Influence: The critical ethical question is not whether text was typed by a human, but how AI-based research and suggestions shape writing before a single word is typed.
- Distraction from Authorship: A culture of callouts and fear prevents the media industry from wrestling with thornier questions regarding authorship and the integration of machine intelligence in newsmaking.
3. Empirical Validation of Upstream Influence
Research published in Science Advances supports the conclusion that AI influence is covert and significant, validating the argument that impact outweighs process concerns.
- Covert Attitude Shift: In experiments, participants used writing assistants with biased suggestions on charged topics (e.g., death penalty, fracking). The AI successfully shifted user views across various partisan directions.
- Lack of Awareness: Even when warned or debriefed, users generally denied being influenced and described the AI suggestions as "reasonable and balanced."
- Confirmation of Slate’s Thesis: This evidence confirms that the biggest effect of AI is likely covert influence on thinking rather than simple text generation, aligning with the argument that detection tools miss the upstream shaping of perspectives.
4. Equity and Systemic Bias Concerns
The deployment of AI detectors and models introduces disparate impacts on different groups of writers, further complicating the ethical landscape.
- Disproportionate False Positives: Certain subgroups, including non-native English speakers or opinion writers with a specific "focus-grouped sheen," face higher risks of false accusations.
- Homogeneity in Development: AI bias extends beyond data and algorithms to the developers themselves, who often come from homogeneous demographic backgrounds (male, high socioeconomic status), limiting broad perspective on world issues.
- Audit Gaps: While detector companies claim to have solved issues regarding non-native speakers, independent audits of these assertions remain absent.
5. Conclusion
The convergence of the Slate analysis and empirical bias studies indicates that the current regulatory focus is insufficient. Detection software addresses the output but ignores the cognitive input.
To address the "new kind of scandal," the industry must pivot from policing authorship to understanding how AI models actively shape public opinion and writerly thought processes covertly.
Factual & Attribution Issues
1. The Slate headline is wrong. The briefing (and the underlying thinking block) uses the SEO/URL title "A new kind of scandal is growing online—and aimed at the wrong target." The actual published headline is "The A.I. Writing Panic Is Completely Missing the Point," with the scandal line appearing only in the slug/deck. If you quote or refer to the headline, use the published one.
2. The Science Advances study authorship is misrendered. The briefing refers to "Sterling Williams-Ceci, Maurice Jakesch, Mor Naaman & others." Williams-Ceci is lead author; the full author list is Williams-Ceci, Jakesch, Advait Bhat, Kowe Kadoma, Lior Zalmanson, and Naaman (senior). If you keep the author attribution at all, either name them in full order or use "Williams-Ceci et al. (2026)."
3. Two studies are conflated. There's a 2023 CHI paper (Jakesch et al., "Co-Writing with Opinionated Language Models Affects Users' Views," which used the "Is social media good for society?" prompt) and a 2026 Science Advances paper (Williams-Ceci et al., which extends it to death penalty, fracking, GMOs, standardized testing, voting rights for felons). Press coverage mixes them freely. The briefing implicitly treats them as one study. Decide which one you're referencing and be clean about it — the 2026 Science Advances paper is the one Requarth links to.
4. The "Detector Efficacy" bullet inverts Requarth's balance. The briefing says detectors "perform significantly better than anticipated, often correctly identifying heavy reliance on AI even in established writers." That's a selective read. Requarth does credit Jabarian's University of Chicago preprint showing strong performance on medium-to-long passages, but he spends the next several paragraphs qualifying: performance "probably varies in the wild," 86.5% of flagged newspaper cases are "mixed," and he quotes Jabarian warning that people may now think the tools "work all the time" (framed as a caution, not an endorsement). As written, your briefing reads more like a Pangram press release than Requarth's actual ambivalent stance.
5. The Guardian example is the counter-example, not the proof. The briefing lists the Guardian sports journalist case as an example of detectors "correctly identifying heavy reliance." In the Slate piece the Guardian's editors specifically pushed back — saying it was the journalist's decade-old style. Requarth leaves this unresolved; he doesn't endorse Spero's verdict. Don't present it as a clean hit.
6. The Russakovsky / Princeton quote (Section 4) needs independent verification. It came in via "Source 3" in the context you generated from, not from the Slate article or the Science Advances paper. I didn't search for it since it's peripheral, but you should — a misattributed academic quote in a blog post is exactly the kind of thing that'll get called out.
Logical Consistency Issues
1. The biggest logic gap: the study doesn't prove what the briefing says it proves. The briefing's Section 3 argues that Williams-Ceci et al. "confirm" Requarth's thesis about upstream influence. But the study tested autocomplete suggestions during writing — mid-composition influence via suggested phrases. Requarth's key example of upstream influence (his collagen-supplement thought experiment) is about AI-generated research summaries framing the story before a word is typed. These are related but mechanistically different:
- Williams-Ceci: live autocomplete → opinion drift
- Requarth's core concern: pre-writing research synthesis → story framing
Calling the study a "confirmation" of Requarth's thesis overstates it. It's consistent with and supportive of the broader claim that AI-mediated writing shifts thought, but it doesn't test the specific pipeline Requarth is most worried about. This is the thing most likely to draw a "well, actually" response from a careful reader.
2. The equity argument and the process-vs.-impact argument pull against each other and the briefing doesn't reconcile them. Section 4 says false positives hit non-native speakers and "focus-grouped" opinion writers disproportionately — that's a process critique (the detector is unfair). Section 2 says the whole focus on process is misdirected. These can coexist, but as arranged they read as "the detection system is wrong AND we shouldn't be focusing on detection anyway." A reader will notice the tension. Either weave them together (e.g., "the detection focus is both misdirected and inequitably applied") or cut one.
3. Section 4's "homogeneity in development" bullet is scope creep. Russakovsky's point is about who builds AI models — diversity of research teams, upstream bias in training. Requarth's argument is about how writers using AI tools are shaped by them. These are different bias conversations. Including both under one heading makes the piece sprawl.
4. The conclusion overstates Requarth's prescription. "The industry must pivot from policing authorship to understanding how AI models actively shape public opinion" — Requarth never says stop policing authorship. He explicitly says he agrees passing off AI prose breaks the writer-reader contract; his argument is that callout culture plus detector-worship crowds out the harder conversation. Phrase it as "expand focus," not "pivot from."
5. The "cycle of accusation" framing ("effectively monetizing the scandal") is stronger than Requarth's text. He describes a pattern of validate/stoke/sell, then immediately concedes Pangram "actually works way better than you might think." The briefing keeps the cynical half and drops the concession. That's a misrepresentation.
Editorial Issues
1. The whole piece has a structural repetition problem. "Process vs. impact" shows up in the executive summary, Section 2, Section 3, and Section 5. Each iteration adds little. Pick one place to define the frame and let it do work throughout.
2. The voice is off for a Steve Mays blog post. This reads like a synthesized research memo ("The convergence of the Slate analysis and empirical bias studies indicates that the current regulatory focus is insufficient"). Your published posts have a more conversational register. Either rewrite for voice or frame this explicitly as "a briefing I had an LLM assemble" — which could actually be an interesting meta-angle given your multi-platform AI research methodology.
3. Attribution is sparse. Outside Section 2, Requarth is rarely named — it's all "[1]." For a blog post readers need signposting: "Requarth argues…" "Requarth cites Jabarian's preprint…" etc. Otherwise it reads as if the arguments are yours.
4. Missing specific details that would strengthen it. The Jabarian/Imas University of Chicago preprint (~2,000 passages), Pangram's claimed 1-in-10,000 false positive rate plus Arvind Narayanan's college-cohort calculation (showing ~5–10% of students would be falsely accused over four years) — these are specific, memorable, and they're exactly the quantitative hooks that make the "impact vs. process" argument land. The briefing cut them.
5. The Taranto / WSJ incident is the best illustration in the original piece, and it's missing. Requarth's "hit piece on the thermometer instead of asking why he had a fever" is the most quotable line and most concrete illustration of the thesis. Omitting it loses argumentative muscle.
6. No mention of your own angle. This is a summary of Requarth's argument with Williams-Ceci bolted on. For a personal blog, readers will want to know what you think — do you buy the process/impact frame? Does the study actually support it? Where does your own LLM workflow fit?
7. Blogger formatting. Your usual output is HTML with inline CSS. The briefing is Markdown. Also, on a Blogger post, the "Executive Summary → Section 1 → Section 2…" scaffolding reads stiff. A headline, a lede, and flowing prose with a couple of subheads would serve the content better.
8. Minor: the piece says "conduct and publish research" but doesn't link to anything. If you publish, link directly to the Slate article, the Science Advances paper (DOI: 10.1126/sciadv.adw5578), and at least one of the Pangram-flagged cases. Readers should be able to check the primary sources.
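To make the arithmetic behind item 4 concrete, here is a minimal Python sketch of how a very small per-check false positive rate compounds over many checks. It assumes each check is independent and uses the 1-in-10,000 rate cited above. The check counts are hypothetical placeholders, not figures from Narayanan's calculation or from Pangram, and the function name is my own.

```python
def prob_at_least_one_false_flag(false_positive_rate: float, num_checks: int) -> float:
    """Chance that an honest writer is falsely flagged at least once.

    Assumes every check is independent and has the same false positive rate.
    """
    return 1.0 - (1.0 - false_positive_rate) ** num_checks


# Per-check rate claimed for Pangram, as cited in item 4 above.
FALSE_POSITIVE_RATE = 1 / 10_000

# Hypothetical counts of passages checked over four years (assumed for illustration).
for num_checks in (100, 500, 1000):
    risk = prob_at_least_one_false_flag(FALSE_POSITIVE_RATE, num_checks)
    print(f"{num_checks:>5} checks: {risk:.1%} chance of at least one false flag")
```

Under these assumed inputs, somewhere between several hundred and a thousand independent checks is enough to push the cumulative risk into the roughly 5–10% range mentioned above. The real figure depends on how many passages actually get scanned per student, which the briefing would need to source from the original calculation.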
Before You Publish — My Short List
- Fix the study authorship and the two-study conflation.
- Restore Requarth's ambivalence about Pangram — don't make it sound like he's endorsing it.
- Downgrade the Williams-Ceci study from "confirmation" to "supporting evidence for a related mechanism."
- Verify the Russakovsky quote or cut Section 4's homogeneity bullet.
- Add your own voice and at least one concrete take — otherwise it's a summary, not a post.