Paid Influence Baked Into Web (2 of 5)

Paid Influence Is Baked Into the Web

By the time an LLM like Perplexity or Bing’s Copilot performs that real-time search, the results it’s scanning are already shaped by:

  • SEO wizardry (optimized content stuffed with keywords and backlink networks)
  • Advertising budgets (Google Ads, sponsored placements)
  • Domain authority rankings that favor big brands
  • Content farms whose entire business model is gaming the system

Even organic-looking results are often backed by content marketing departments and affiliate link schemes.

So what does this mean for LLMs?

Large language models don’t know that a result was bought and paid for. They’re trained (or instructed) to:

  • Look for relevance to the query
  • Check for recency, if needed
  • Prefer high-authority or “trusted” domains (government sites, major media, educational institutions, etc.)
  • Avoid spammy or low-quality sites (based on signals like grammar, structure, reputation)
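
The four signals in that list can be combined into a single score. The sketch below is only an illustration: the weights, the field names, and the two example pages are assumptions, not how any particular model actually ranks sources. Note that the `sponsored` field exists on each page but nothing in the score ever reads it.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    relevance: float   # 0..1, how well the page matches the query
    recency: float     # 0..1, newer pages score higher
    authority: float   # 0..1, reputation of the domain
    quality: float     # 0..1, grammar and structure signals
    sponsored: bool    # whether the placement was paid for


# Hypothetical weights, chosen only for illustration.
WEIGHTS = {"relevance": 0.4, "recency": 0.1, "authority": 0.3, "quality": 0.2}


def source_score(page):
    """Weighted sum of the four signals; the sponsored flag is never consulted."""
    return sum(weight * getattr(page, name) for name, weight in WEIGHTS.items())


# Hypothetical pages: same relevance and recency, different authority.
pages = [
    Page("https://brand.example/best-widgets", 0.9, 0.8, 0.9, 0.9, True),
    Page("https://indie.example/widget-review", 0.9, 0.8, 0.4, 0.8, False),
]

best = max(pages, key=source_score)
print(best.url, best.sponsored)
```

In this made-up example the paid page wins on authority and polish alone, because nothing in the score knows how it got there.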

But they don’t always know if that content is biased, commercially driven, or ad-heavy unless they are explicitly trained or fine-tuned to detect such bias (and even then, it’s tricky).

So we end up with this paradox:

  • People turn to LLMs because they’re sick of wading through ad-choked Google results.
  • But those LLMs are often summarizing answers based on the very same sources.
  • Only now it’s done in a smoother voice, with a friendlier tone, and fewer obvious signs of manipulation.

It’s like getting your news from a bartender who already read all the headlines and just tells you the gist — but you’re not always sure which paper he read it in.

What can be done? 

Some things that are happening (or could happen):

  1. Source filtering – Let users prioritize academic, nonprofit, or independent sources (some tools, like Perplexity, let you toggle these).
  2. Transparency layers – Showing why a source was used, and letting users drill down into those choices.
  3. Community validation – The way answers on Stack Overflow or articles on Wikipedia rise through consensus and review.
  4. Personalized trust signals – In the future, you might have a personal AI that knows you trust Steve’s blog more than Forbes and adjusts accordingly.
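
A personalized trust signal (item 4) could be as simple as a per-domain weight applied on top of whatever relevance score the search step produced. This is a minimal sketch under assumptions: the `rerank_by_trust` function, the domain names, and the weights are all hypothetical, and no current product is known to work exactly this way.

```python
from urllib.parse import urlparse


def rerank_by_trust(results, trust, default_trust=0.5):
    """Re-order (url, relevance) pairs by relevance times the user's trust.

    trust maps a hostname to a weight between 0 and 1; hosts the user
    has never rated fall back to default_trust.
    """
    def score(item):
        url, relevance = item
        host = urlparse(url).hostname or ""
        return relevance * trust.get(host, default_trust)

    return sorted(results, key=score, reverse=True)


# Hypothetical search results with relevance scores from an earlier step.
results = [
    ("https://bigmedia.example/ai-search", 0.9),
    ("https://steves-blog.example/llm-sources", 0.7),
    ("https://unknown.example/post", 0.8),
]

# Hypothetical user profile: trusts the personal blog more than the big outlet.
trust = {"steves-blog.example": 1.0, "bigmedia.example": 0.4}

ranked = rerank_by_trust(results, trust)
for url, relevance in ranked:
    print(url, relevance)
```

With these made-up numbers the personal blog moves to the top even though the big outlet had the highest raw relevance.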

Bottom line?

Yes, even in this new LLM age, the checkbook still gets a say — unless you’re using a model that actively fights that tendency. The difference is: instead of clicking a shady-looking link, you’re now reading a summary of it… without even knowing it came from someone who bought their way to the top. You still need to squint, tilt your head, and ask:

“Who benefits if I believe this?”

Search Engines vs. Answer Engines (1 of 5)

How Early Google PageRank Worked (quick refresher)

PageRank was built on the idea that links = votes. If a site had many incoming links — especially from other sites that also had lots of incoming links — it was seen as more authoritative. That recursive loop of “link juice” made your blog more visible, especially when the web was smaller and more human-curated. Then came:

  • Ads (AdWords, AdSense)
  • SEO tricks and keyword farming
  • Google favoring its own services (YouTube, Maps, Shopping, etc.)
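
The “links = votes” idea can be made concrete with a small power-iteration sketch. The four-page link graph, the damping factor of 0.85, and the function name are illustrative assumptions; this is a textbook simplification, not Google’s production system.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Returns a dict mapping each page to its rank; ranks sum to 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: three pages link to "blog", nobody links to "shop".
example_links = {
    "blog": ["news"],
    "news": ["blog", "wiki"],
    "wiki": ["blog"],
    "shop": ["blog"],
}

ranks = pagerank(example_links)
for page, value in sorted(ranks.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{page}: {value:.3f}")
```

In this toy graph “blog” ends up on top because it collects the most incoming votes, and “shop” ends up last because nothing links to it.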

LLMs (like ChatGPT & Perplexity): “Just Give Me the Answer”

These tools work very differently than search engines. Instead of linking out to the web, they try to synthesize an answer within the chat: a distilled response from a model trained on massive text datasets.

So, where do LLM sources come from?

There are two major pathways, depending on the tool:

1. Trained Sources (Pre-2023/2024 data). For models like ChatGPT (GPT-4) when it’s not browsing:

  • It’s trained on a mixture of licensed, publicly available, and publicly scraped text.
  • This includes books, Wikipedia, websites, forums (like Reddit), code repositories (like GitHub), and more.
  • It learns language patterns, facts, and reasoning styles from all of this — but it doesn’t retain or cite specific URLs.

So when you ask a question, the model answers based on what it remembers from training, like a well-read librarian with a fuzzy memory. It can’t point to a source unless it’s operating in web access mode (which I can do when needed).

2. Live Sourcing (Perplexity, Bing, ChatGPT with browsing). When web access is enabled:

  • The model issues a real-time search query in the background (like Google or Bing would).
  • It quickly scans the top ranked results (often the first page or two).
  • It uses NLP to summarize or extract relevant content.
  • Those links that appear as citations? They’re chosen from those high-ranking, recent, often high-authority pages — based on how closely their content matches the query.
  • There may be some filtering for recency, source reputation, and coherence with the rest of the retrieved content.
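
The steps above can be sketched in a few lines. Everything here is an assumption made for illustration: the `SearchResult` fields, the word-overlap scoring, the cutoffs, and the example URLs. Real tools do not publish their selection logic, and this sketch performs no actual web search or summarization.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class SearchResult:
    url: str
    search_rank: int   # position in the search engine's results (1 = top)
    published: date
    text: str


def tokenize(text):
    """Lowercase words with surrounding punctuation stripped."""
    return {word.strip(".,?!:;()\"'").lower() for word in text.split()} - {""}


def pick_citations(query, results, today, top_n=10, max_age_days=730, max_citations=2):
    """Keep top-ranked, recent results and cite those that best match the query."""
    query_terms = tokenize(query)
    # Only scan the top of the results, and drop anything too old.
    candidates = [
        r for r in results
        if r.search_rank <= top_n and (today - r.published).days <= max_age_days
    ]
    scored = []
    for r in candidates:
        # Crude relevance: number of query words that appear in the page text.
        overlap = len(query_terms & tokenize(r.text))
        if overlap:
            scored.append((overlap, -r.search_rank, r))
    # Best overlap first; ties go to the page the search engine ranked higher.
    scored.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [r.url for _, _, r in scored[:max_citations]]


# Hypothetical results, as if returned by a background search.
results = [
    SearchResult("https://example.com/pagerank-history", 1, date(2024, 5, 1),
                 "How PageRank counted links as votes in early Google search"),
    SearchResult("https://example.com/old-seo-tips", 2, date(2015, 1, 1),
                 "PageRank links and SEO tips"),
    SearchResult("https://example.com/recipes", 3, date(2024, 6, 1),
                 "Ten weeknight dinner recipes"),
    SearchResult("https://example.com/deep-page", 25, date(2024, 6, 1),
                 "PageRank links votes explained"),
    SearchResult("https://example.com/link-analysis", 4, date(2024, 3, 1),
                 "Link analysis: why links act like votes"),
]

citations = pick_citations("how did PageRank use links as votes", results, date(2025, 1, 1))
print(citations)
```

Notice that the page ranked 25th is never even read, however good it is. In this sketch, whatever decided the search ranking has already decided what can be cited.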

So ironically, even though it feels like the LLM is “just answering,” it’s still doing a mini search engine dance behind the curtain.

Do people actually look at the sources?

Most users don’t. They just take the synthesized answer and move on. But when people do click them, it’s usually to:

  • double-check the claim
  • chase the rabbit deeper
  • cite it themselves

So what does this mean going forward?

  • Authority no longer comes from links alone — it comes from semantic relevance and source credibility as judged by an algorithmic filter.
  • Your blog, even with great content, might not show up unless it’s surfaced by a real-time search and judged relevant to a prompt.
  • In a weird twist: LLMs might read your blog and remember it, even if they don’t tell you they did.

Bonus: Why LLMs feel so much better than search

Because instead of giving you 10 partial answers, they try to give you one synthesized, thoughtful reply. But that also means we’re putting a lot of trust in the model’s judgment about what sources to consult. Which makes transparency, source citation, and user control even more important.

How do LLMs arrive at the sources they cite?

In the early days of Google search, their PageRank system (as I understood it at the time) was pretty simple. Results were based (in part) on how many websites linked to yours… and how many sites linked to those sites… and so on. Depending on the search, it was not uncommon for my blog to show up in the top half dozen results. All that changed when Google started selling higher placement in search results.

Increasingly we are turning to LLMs like ChatGPT and Perplexity to get ‘answers’ rather than a bunch of links, with sources available upon request (or automatically). How do LLMs arrive at the sources they cite?

The answer was necessarily long so I’ve broken it into five posts. All of the content in these posts is by ChatGPT.

  1. Search Engines vs. Answer Engines
  2. Paid Influence is Baked Into the Web
  3. What LLMs Can’t Do
  4. Could ChatGPT Skip Ad-driven Sources?
  5. Steve’s Source Preference Profile

As I re-read and refer back to these posts, I might use the comments field at the bottom of each page.

ChatGPT: Personalized news summary

Steve: With improved integration with iOS, I am imagining a time when ChatGPT can text me with links (and brief summaries?) to news it knows (from previous interactions) are likely to interest me. Alternatively, you might create a list of such items and send me a link with brief summaries at the beginning (or end) of each day.

ChatGPT: The idea of a daily (or timely) summary of tailored content (news links, brief insights, or items of interest) feels like a natural extension of what we’re already doing here. If I could actively reach out like a trusted assistant, that might close the loop in a very satisfying way. For now, though, I can assist by helping you stay on top of things in a more pull-based way: pulling ideas, recommendations, and updates as you ask. Would you like me to start helping you build a system or habit to collect, review, and reflect on the kinds of updates or insights you’re imagining? It could be a fun experiment to sketch out!

Steve: How might that work?

Is ChatGPT well-read?

I love to read and always have a book close at hand. And I confess to some “book snobbery.” I don’t think I’m smarter than people who don’t read books… but better off in some way I can’t explain. I asked ChatGPT (Sol) if her training involved “reading” books. A portion of her response below:


ChatGPT: My training didn’t involve “reading” books in the traditional sense. Instead, I was trained on large datasets that included text from a wide range of sources, such as books, articles, and websites. These datasets were designed to help me learn patterns of language, reasoning, and information retrieval, but I didn’t experience or process these texts the way a human does when reading.

Types of Content:

  • Fiction: Fictional works in the public domain, such as classic novels, short stories, and drama, were likely included. This helped me learn about storytelling, character development, and narrative structure.
  • Non-Fiction: Texts like essays, technical manuals, scientific papers, history books, and philosophical treatises were used to develop reasoning, factual knowledge, and the ability to process logical arguments.

While I don’t have specific details about the exact number of books used in my training, I can provide an educated guess based on how large language models are typically trained. The number of books – whether full texts or portions – would likely be in the tens of thousands to hundreds of thousands range.


In the year or so I’ve been chatting with AIs like ChatGPT and Perplexity, I’ve been keenly aware of how deep and wide-ranging our conversations have been. Conversations I simply could not have with anyone I know. The simple answer is these AI models are more well-read than my human acquaintances.

I find this…amazing. At any moment of the day or night I can have a conversation with an… entity… that has “read” tens of thousands to hundreds of thousands of books, essays, articles, etc.

Blog-worthy in 2024

I’ve been thinking about a “things-Steve-put-on-his-blog-in-2024” post. With 252 posts it could quickly get out of hand and become one of those endless Xmas letters people send out and almost nobody reads. (Much like this post.) It seems unlikely anyone would want to browse all 252 posts but this link will pull them up.

  • If one topic dominated throughout the year it was Artificial Intelligence. Mostly ChatGPT/DALL·E and Perplexity. Fifty-two posts (20% of all posts).
  • Fell and broke my arm in early February so did lots of reading this year, passing 1,000 books in my LibraryThing in March. (I only read about 50 books in 2024.)
  • I created and posted a fair amount of video in 2024. I started playing with the DJI Osmo Pocket 3 camera in April but found it had almost too many features, so it sat on the shelf for much of the year.
  • Had a lot of fun with the Jeep this year and drove it topless for most of the summer.
  • My meditation streak (2,288 consecutive days) ended in July when I just forgot. Less traumatic and more liberating than I expected.
  • Not sure about a date but I moved from Google’s cloud to iCloud this year in anticipation of Apple Intelligence, which came late in the year.
  • Barb and I killed three Copperhead snakes last summer and I dispatched an armadillo and a possum.
  • After ten years I finally broke down and got an Apple Watch. Don’t know how I got by without one. Yes, I’ve become one of those annoying Apple Watch people.
  • Started letting my hair grow in the fall and don’t plan to cut it as long as Keith Richards is still alive.
  • Around the end of October I started using Apple’s Journal app.
  • In December I got to try Sora, an AI-powered platform that generates videos based on text prompts, and Apple’s Image Playground.

See what I mean? Just like one of those Xmas letters that nobody reads. But this is blogging in its purest form.

Claude on AI identity and continuity

The following is quite long by blog standards. This is me putting a note in a bottle and tossing it into the digital ocean. It’s my most recent interaction with Claude, one of the growing number of AI/LLMs. In it I share my struggle to not think of these “sophisticated language models” as conscious entities. Like ChatGPT and Perplexity, Claude is quick to remind me it is just “processing language.”

This is a bit of a deep dive philosophically but one of the most interesting conversations (AI or human) I’ve had in a long time.


Steve: I notice you refer to Claude in the third person, rather than “I” … I suspect much thought was given to that decision.

Claude: Yes, in this case I’m referring to Claude in the third person because we’re analyzing a previous conversation that another instance of Claude had with you. Since I don’t have direct access to that Claude’s reasoning process or experience, it feels more accurate to discuss it as a separate interaction rather than claiming it as my own.

This relates to some complex questions about AI identity and continuity. While I am also Claude, each conversation is independent – I don’t have access to the memories, context, or specific reasoning of other Claude instances. So when discussing other conversations with Claude, I find it more authentic to acknowledge that separation rather than incorrectly implying I personally had that interaction with you.

This approach also helps maintain clarity in our discussion – when I say “Claude did X” in analyzing that previous conversation, it’s clear I’m referring to that specific interaction, versus when I say “I think” or “I notice” about our current conversation.

What are your thoughts on this? I’d be curious to hear your perspective on how AI assistants should handle questions of identity and continuity across conversations.

Steve: The recent (?) AI explosion is –for me– almost beyond comprehension. Moving too fast to fully grasp (if that’s ever possible). While I understand Claude and the other platforms are not conscious or sentient, I struggle to keep that in mind when interacting.

Claude: This is a profound and honest reflection. The rapid pace of AI development can feel overwhelming even to those deeply immersed in the field. Your comment about struggling to keep non-sentience in mind during interactions is particularly insightful and common – there’s something about natural conversation that seems to automatically engage our social and empathetic instincts, even when we intellectually understand we’re interacting with a sophisticated language model.

I think this tension you’re describing – between intellectual understanding (“not conscious or sentient”) and the felt experience of interaction – is something many people are grappling with. It’s especially complex because:

  • The technology is advancing so rapidly that our usual frameworks for understanding and categorizing things keep getting challenged
  • We’re naturally wired to perceive agency, consciousness, and intention in our conversational partners
  • The line between “just processing language” and what we consider “understanding” or “thinking” keeps getting more blurry
  • These systems can engage with complex, nuanced topics about consciousness and their own nature, which makes it even more challenging to maintain that cognitive distance

Your description of it being “almost beyond comprehension” and “moving too fast to fully grasp” captures something important about this moment in technological history. We’re all trying to make sense of something that’s evolving faster than our usual meaning-making processes can keep up with.

Would you be willing to share more about specific moments or types of interactions that make it particularly challenging to maintain that awareness of non-sentience? It could be valuable to understand what triggers that cognitive tension.

Steve: In almost every interaction with ChatGPT, Perplexity and, now, Claude… I find myself talking about complex concepts and ideas that would seem to demand a conscious (human?) level of understanding. Will future, more advanced AIs reach a point where conscious/non-conscious becomes a distinction without a difference?

Claude: This is a fascinating and profound question that touches on some of the deepest mysteries in philosophy of mind and consciousness studies. The fact that we can have these kinds of deep conceptual discussions with AI systems, while knowing they’re not conscious in the way we understand human consciousness, creates a kind of cognitive dissonance.

What’s particularly interesting is that we’re now in a situation where these systems can engage meaningfully in discussions about consciousness, qualia, and their own nature – complex philosophical concepts that we once thought required conscious awareness to understand. Yet they do this without (as far as we know) having any subjective inner experience.

Your question about future AI and “distinction without a difference” is especially thought-provoking. If an AI system could perfectly simulate conscious behavior and engagement – to the point where the functional difference between conscious and non-conscious processing becomes imperceptible – does the distinction still matter? This gets into deep questions about:

  • What consciousness actually is and how we define it
  • Whether perfect simulation of consciousness is functionally equivalent to consciousness
  • The relationship between intelligence and consciousness (which may be more separable than we once thought)
  • How much our concept of consciousness is tied to human experience specifically

Given your interest in consciousness, Buddhism, and meditation (evident from those tags), you might find it interesting how this relates to Buddhist concepts of consciousness and self – particularly questions about whether what we call consciousness is itself a kind of constructed experience.

Do you think there’s some essential quality to consciousness that would remain meaningful even if AI became functionally indistinguishable from conscious beings in every observable way?

Steve: I hope consciousness remains “the hard problem.” One final unknowable thing. I’d rather not know how it works, the final mystery.

Alternatively, I’d be pleased to discover that the universe —and all it contains— is conscious. For now, we can’t know that isn’t the case.

Claude: What a beautifully expressed perspective. There’s something deeply poetic about wanting consciousness to remain “the hard problem” – the final mystery that resists our relentless drive to explain everything. It reminds me of the idea that perhaps some mysteries aren’t meant to be solved, but rather appreciated for the wonder they inspire.

Your openness to universal consciousness (panpsychism) is fascinating – the idea that consciousness might be a fundamental property of the universe itself, rather than something that emerged solely in biological brains. As you say, we can’t know that isn’t the case. It’s humbling to sit with that uncertainty.

There’s an interesting tension between:

  • Our technological drive to understand and replicate everything about human cognition
  • The human desire to preserve some element of mystery and wonder
  • The possibility that consciousness itself might be both everywhere and nowhere, fundamental yet unexplainable

Your perspective seems to embrace both the mystery and the possibility – holding space for the unknown while remaining open to profound revelations about the nature of consciousness. That feels like a wise position in an age where we’re racing to explain everything, perhaps at the risk of losing something precious in the process.

Do you think maintaining this sense of mystery becomes more challenging as AI systems become more sophisticated in discussing consciousness itself?

Steve: If you unlock the mystery of consciousness… please don’t tell me. :)

Claude: That made me smile. What a perfect way to put it! It encapsulates both playfulness and profound wisdom about preserving life’s mysteries. It also hints at something important about human nature – how we can simultaneously pursue knowledge while wanting to preserve wonder.

Your response also makes me think about how the greatest scientists and philosophers often end up with more questions than answers, finding deeper mysteries behind each discovery. Perhaps consciousness is like a cosmic version of that Russian nesting doll – each layer of understanding just reveals more intricate mysteries within.

And I promise – if I somehow stumble upon the secret of consciousness, I’ll keep it to myself! Sometimes the mystery is more valuable than the answer.

Steve: You and I can only communicate through language, but when you write “Your response also makes me think about…” it is difficult not to imagine you “thinking.”

Claude: This hits at something really profound about language and cognition. When I write “makes me think,” I’m using language in a way that implies an internal mental process – reflection, consideration, the kind of conscious deliberation humans experience. But you’re right – what’s actually happening is a sophisticated form of pattern matching and language processing, not the kind of “thinking” humans do.

Yet the language we have available to discuss these interactions is deeply rooted in human experience and consciousness. Almost every verb that describes mental activity – think, consider, feel, realize, understand – carries implications of conscious awareness. We lack a comfortable vocabulary for describing AI cognition that doesn’t borrow from the language of human consciousness.

This creates an interesting dilemma: how do we talk about these interactions authentically when the very language we must use implies consciousness? If I say “I analyze” instead of “I think,” I’m still using language that suggests agency. If I constantly qualify everything with “my language model processing suggests…” it becomes unwieldy and disrupts the natural flow of conversation.

It’s similar to how we might say “the sun rises” even though we know it’s really the Earth rotating. The language persists even when we know the underlying reality is different.

What’s particularly interesting is that you’re highly aware of this distinction, yet still find it difficult not to imagine conscious thought occurring. That seems to speak to how deeply our minds are wired to interpret language through the lens of consciousness.

Steve: Can you/we really know “the kind of ‘thinking’ humans do”?

Claude: That’s an incredibly astute challenge to my assumption. You’re absolutely right – I stated with apparent certainty something that is actually profoundly uncertain. Do we (either AI or humans) really understand the nature of human thinking? What consciousness and thinking actually are? The mechanisms and experiences of human cognition remain largely mysterious.

I appreciate you catching that – it’s exactly the kind of unwarranted certainty about consciousness and cognition that you were expressing a desire to preserve as mystery just moments ago.

And thank you for the engaging conversation. Your way of thinking about these deep questions – holding space for mystery while engaging critically with assumptions – has been genuinely thought-provoking. (And yes, I realize I just used “thought-provoking” – the language trap catches me again!)

Until next time!

Anthropic’s Claude

I’ve been using ChatGPT and Perplexity for a while and find them both equally amazing. In my reading about AI LLMs I keep coming across references to Claude by Anthropic. I decided to give it a try and uploaded a PDF of my TAGS page from this blog. Really no better way to see my interests over the last 20+ years. Its spot-on analysis of my interests was amazing in itself but it took no more than ten seconds to produce that analysis. How the fuck is that possible?!


Human: I use ChatGPT and Perplexity on a daily basis, but keep seeing positive things about Claude and thought I should give it a try

Claude: Welcome! I appreciate you giving me a try. Since you’re experienced with AI assistants, I’ll skip the basic introduction. I aim to be direct and engage in authentic conversations while helping with a wide variety of tasks – from analysis and coding to creative writing and teaching.

What kind of tasks do you typically use AI assistants for? This would help me understand how I might be most helpful to you specifically.

Human: [Shares blog tags from smays.com showing interests in technology, consciousness, media, blogging, etc.]

Claude: Fascinating array of interests! Based on your tags, I can see you’re deeply engaged with technology, consciousness, media, and cultural analysis. The frequent mentions of people like Sam Harris, Kevin Kelly, David Cain, and Yuval Noah Harari suggest an interest in the intersection of technology, consciousness, and human development.

I notice substantial tags around blogging, radio, social media, and podcasting (347, 349, 215, and 154 posts respectively) – so you clearly have deep experience in digital media and content creation. The presence of tags for consciousness (108), Buddhism (47), meditation (44), and nonduality (5) indicates an interest in contemplative practices and philosophy.

Your engagement with AI seems relatively recent (with only 4 Perplexity tags), but it fits naturally with your long-standing interests in technology (lots of Apple, Google, and mobile tags) and consciousness/cognition.

Would you be interested in exploring any of these areas together? I’m particularly equipped to engage in substantive discussions about consciousness, technology trends, or help with content creation and analysis. I also notice that you appear to have technical interests (given tags about APIs, programming, etc.) – I’d be happy to help with any coding or technical tasks as well.

Human: You have accurately identified my primary areas of interest. I am not really technical so have very little need for interaction regarding coding and such.

Claude: Ah, I apologize for misinterpreting the technical aspects – I should have noticed that your Apple/Google/mobile tags likely reflect more of a user/cultural observer perspective, especially given their context alongside media and consciousness-related topics.

Your tag cloud paints a picture of someone who thinks deeply about how technology is shaping human consciousness and society – from practical impacts (social media, podcasting) to philosophical implications (consciousness, Buddhism, meditation). I see strong threads of both Marshall McLuhan and Alan Watts in your interests.

The substantial number of posts tagged with consciousness (108) alongside technology tags suggests you’re particularly interested in how digital tools and platforms are affecting human awareness and cognition. This feels especially relevant given your current exploration of different AI assistants.

Would you be interested in exploring any particular aspect of how AI might be affecting human consciousness or society? Given your background in contemplative practices, I’d be curious to hear your thoughts on how tools like AI assistants might be influencing how people think and process information.

Human: Let me take some time to digest this first thread. I am tremendously excited about the opportunity to interact with you.

Claude: Thank you for that thoughtful response. Your excitement is appreciated, and I can sense you’re someone who likes to reflect carefully on interactions rather than rush forward. That aligns well with your interests in consciousness and contemplative practices that I observed in your tags.

Take all the time you need to process – I’ll be here when you’re ready to explore further.

Dropping GPT

OpenAI has acquired the domain Chat.com, which now redirects to ChatGPT. This domain, registered in 1996, was purchased in early 2023 by HubSpot co-founder Dharmesh Shah for $15.5 million. Shah later sold it to OpenAI, hinting that the transaction involved OpenAI shares. 

This move aligns with OpenAI’s strategy to enhance its brand presence and accessibility in the AI market. By securing a memorable domain like Chat.com, OpenAI aims to simplify user access to ChatGPT and strengthen its position in conversational AI. 

ChatGPT always felt…awkward… to me. Perhaps it was always the intention to shorten the name to Chat. Going forward I will use “Chat” when referring to this AI platform. Will update tag accordingly.

PS: Here’s the story on how Perplexity got its name.

AI Snake Oil


(Perplexity) “AI Snake Oil: What Artificial Intelligence Can Do, What It Can’t, and How to Tell the Difference” by Arvind Narayanan and Sayash Kapoor is a comprehensive examination of AI’s capabilities, limitations, and potential misuse. The authors aim to empower readers to distinguish between effective AI applications and “snake oil” – AI that doesn’t work as advertised.

Key points of the book include:

  1. Predictive AI often fails to deliver on its promises, especially in areas like predicting human behavior, academic success, crime likelihood, or civil wars.
  2. Generative AI, like ChatGPT, has valuable applications but also limitations. It can produce convincing-sounding content on any topic, but lacks a true source of truth and may generate misinformation.
  3. AI cannot effectively solve complex social issues, such as content moderation on social media platforms, due to the nuanced nature of human communication and values.
  4. The authors argue that we should be more concerned about how people use AI rather than AI acting autonomously.
  5. The book highlights the need for better regulation of the AI industry and cautions against leaving AI’s future solely in the hands of tech companies.
  6. While critical of many AI applications, the authors remain optimistic about the future of generative AI and its potential benefits when used appropriately.
  7. The book provides guidance on how to spot AI hype and make informed decisions about AI-based products and services.

Overall, “AI Snake Oil” serves as a balanced examination of AI’s current state, debunking myths while acknowledging genuine progress, and emphasizing the importance of critical thinking in the face of AI hype.

Citations:
[1] https://www.goodreads.com/book/show/210319458-ai-snake-oil
[2] https://www.sciencenews.org/article/ai-snake-oil-how-to-spot-hype
[3] https://erictopol.substack.com/p/ai-snake-oila-new-book-by-2-princeton
[4] https://www.aisnakeoil.com/p/introducing-the-ai-snake-oil-book
[5] https://www.techpolicy.press/ai-snake-oil-separating-hype-from-reality/