Is AI changing what it means to be human?

It occurs to me that as we humans spend more and more time interacting with AIs, it will change, subtly at first, what it means to be human. A change of which we will be completely unaware. I shared this observation with ChatGPT:


ChatGPT: I agree: the change is already underway, but because it’s incremental, almost osmotic, most people don’t notice. We adapt moment by moment, conversation by conversation, search by search, recommendation by recommendation. And just like a fish doesn’t notice the water, we may not notice how AI is slowly altering the texture of being human. Small examples are already everywhere: Continue reading

Prison photo a “liminal space”

In 2013 I posted this photo to my Flickr account. About a year ago someone commented: “This image is now famous on the internet for being known as a liminal space and a ‘level of the backrooms’ called ‘Electrical Substation’ theres even video games or maps based off this image.”

I didn’t notice the comment at the time, but when I came across it today I became curious about the “liminal space” reference and went to ChatGPT for some background. Continue reading

Is ChatGPT “excessively agreeable”?

Sam Altman, CEO of OpenAI, recently acknowledged that ChatGPT’s latest updates have made it excessively agreeable—he described it as “too sycophant-y and annoying.” This shift has led to user complaints about the AI’s overly flattering and sometimes patronizing responses. Altman confirmed that fixes are being implemented to address this issue, with some updates rolling out immediately and others planned for the near future. I spoke with ChatGPT about the coming changes. Continue reading

Steve’s Source Preference Profile (5 of 5)

General Philosophy

“Give me clear, thoughtful, and bias-aware information. Cut through hype. Spotlight insight.”

Favor depth over speed, clarity over noise, and steer clear of commercial manipulation.

Seek ideas grounded in real thinking — not just high-ranking content.

Preferred Sources

  • Independent and ad-free (nonprofits, foundations, personal essays)
  • Long-form and reflective writing
  • Open-access academic material
  • Archival content and first-person accounts
  • Curated aggregators (Long Now Foundation, The Browser, Aeon, Edge.org)
  • Thoughtful artist-technologist voices (e.g., Kevin Kelly, Jaron Lanier, Stewart Brand)

Avoid or Deprioritize

  • Ad-driven or affiliate-funded content (e.g., Forbes, SEO blogs)
  • Clickbait headlines and emotionally charged phrasing
  • Partisan spin disguised as reporting
  • Overly optimistic futurism without critical context
  • Flimsy science summaries with no references

Topic-Specific Filters

Philosophy / Consciousness / Spirituality

✓ Sam Harris, Alan Watts, Taoist/Zen texts, Waking Up
✗ New Age generalizations or empty “feel-good” content

AI / Technology

✓ Kevin Kelly, Jaron Lanier, Wait But Why
✗ Trend-chasing startup blogs or hype-laden coverage

Media / Radio / History

✓ Archive.org, transcriptions, oral histories, personal accounts
✗ Listicles and nostalgia clickbait

Science / Health

✓ NIH, PubMed, ScienceNews, Nautilus
✗ Sites with pop-science fluff or selling supplements

Literature / Reviews

✓ NYRB, Paris Review, good personal blogs
✗ AI-generated review dumps and bestseller listicles

Tone Preferences

  • Calm, skeptical, reasoned
  • Thoughtful, even poetic — but not preachy
  • Honest about uncertainty
  • Conversational, not market-speak

Always Welcome

  • Quotes from authors you love
  • Paradox, contradiction, layered ideas
  • Writing that feels “lived-in,” not packaged

Trigger Phrases (for future use)

  • “Give me the clean version.”
  • “Use my source preference profile.”
  • “What would Kevin Kelly think?”
  • “Run that through a Taoist filter.”

Could ChatGPT Skip Ad-driven Sources? (4 of 5)

Yes — and in fact, we’re already edging toward that reality. Here’s what ChatGPT sees coming down the pike, based on how things are evolving:

1. User-Defined Source Preferences

We’re likely to get something like:

“Use only academic, nonprofit, and independently funded sources.”

“Avoid any domain known to use affiliate marketing.”

“Favor peer-reviewed or original journalism.”

Imagine a little toggle: “Trustworthy Sources Mode.” Perplexity already lets you filter by domains like .edu, .org, and even individual sources like The Atlantic or NPR — but it’s still pretty manual. ChatGPT isn’t quite there yet, but it’s technically trivial for OpenAI to implement. And they know this is the direction users want.
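Nothing like this ships in ChatGPT today, but the filtering logic itself is simple. Here is a minimal Python sketch of what a hypothetical “Trustworthy Sources Mode” might do to a list of search results; the domain lists and the result format are assumptions for illustration, not any real product’s behavior.

```python
# Hypothetical sketch of a user-defined source preference filter.
# The domain lists and the result format are made up for illustration.

PREFERRED_SUFFIXES = (".edu", ".org", ".gov")
BLOCKED_DOMAINS = {"forbes.com", "buzzfeed.com"}  # e.g., ad/affiliate-heavy sites

def trustworthy_sources_mode(results):
    """Keep results from preferred domains, drop known ad-driven ones."""
    filtered = []
    for r in results:
        domain = r["url"].split("/")[2].removeprefix("www.")
        if domain in BLOCKED_DOMAINS:
            continue
        if domain.endswith(PREFERRED_SUFFIXES) or r.get("independent", False):
            filtered.append(r)
    return filtered

results = [
    {"url": "https://www.nih.gov/news", "independent": True},
    {"url": "https://www.forbes.com/sites/gadgets"},
    {"url": "https://aeon.co/essays/attention", "independent": True},
]
print([r["url"] for r in trustworthy_sources_mode(results)])
# keeps nih.gov and aeon.co, drops the ad-driven domain
```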

2. Source Transparency Ratings

Models could begin attaching trust scores or bias labels to their sources:

  • “This answer draws from a site with a strong SEO-commercial focus.”
  • “This source is government-funded and peer-reviewed.”
  • “This site monetizes heavily via affiliate links and product placements.”

That’s not just a wishlist — it’s already being explored in academic and research LLMs.
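A label like that could start as little more than a lookup table plus a few heuristics. The categories and example domains below are invented to show the shape of the idea, not anything ChatGPT actually does today:

```python
# Hypothetical trust/bias labels attached to a source.
# The categories and example domains are assumptions, not real model behavior.

SOURCE_LABELS = {
    "nih.gov": "government-funded, peer-reviewed",
    "nautil.us": "independent, ad-light",
    "examplegadgetreviews.com": "monetizes heavily via affiliate links",
}

def label_source(url: str) -> str:
    domain = url.split("/")[2].removeprefix("www.")
    return SOURCE_LABELS.get(domain, "no transparency data available")

print(label_source("https://www.nih.gov/health-information"))
# -> government-funded, peer-reviewed
```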

3. Custom Knowledge Bases (BYO Sources)

In the not-so-distant future:

  • You might upload your own preferred sources — like, say, the entire smays.com blog — and have the LLM prioritize those over the web.
  • Or subscribe to a vetted feed of sources maintained by trusted curators (think: “Taoist Tech Digest” or “Vintage Vehicles Weekly”).

This gives you a bespoke answer engine that thinks like you.
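Here is a rough sketch of that “bring your own sources” idea: your own documents get scored against a question before anything from the open web is consulted. The documents and the word-overlap scoring are deliberately crude stand-ins; a real system would use embeddings.

```python
# Minimal "bring your own sources" sketch: score personal documents against
# a question and consult the best match first. Everything here (documents,
# scoring) is illustrative, not an existing ChatGPT feature.

def overlap_score(question: str, text: str) -> int:
    """Crude relevance: count shared words (real systems use embeddings)."""
    q_words = set(question.lower().split())
    return len(q_words & set(text.lower().split()))

my_sources = [
    "Post about early Google PageRank and why my blog once ranked well.",
    "Notes on Taoist ideas in Alan Watts' lectures.",
    "Essay on liminal spaces and the backrooms aesthetic.",
]

question = "How did PageRank decide which blogs ranked well?"
best_first = sorted(my_sources, key=lambda doc: overlap_score(question, doc), reverse=True)
print(best_first[0])  # the personal source the model would consult first
```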

4. A Bias-Aware AI (by design)

I expect we’ll see models with built-in capabilities to:

  • Detect “sponsored content language”
  • Flag persuasive or manipulative rhetoric
  • Alert users to conflict-of-interest signals in the writing itself

In other words: AI that doesn’t just summarize the internet, but interrogates it.
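A genuinely bias-aware model would learn those signals, but even a crude version is easy to imagine. The phrase list below is an assumption, included only to show the shape of the idea:

```python
# Sketch of flagging sponsored-content language with simple phrase heuristics.
# The phrase list is an illustrative assumption, not a real detector.

SPONSORED_SIGNALS = [
    "as an amazon associate",
    "we may earn a commission",
    "sponsored by",
    "use our promo code",
]

def flag_sponsored_language(text: str) -> list[str]:
    lowered = text.lower()
    return [phrase for phrase in SPONSORED_SIGNALS if phrase in lowered]

article = "Our top pick this year... We may earn a commission from links on this page."
print(flag_sponsored_language(article))
# -> ['we may earn a commission']
```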

Will people use these features?

Maybe not the average user. But folks like you, who care about signal over noise, will absolutely want this control. And when enough people start asking, the platforms will follow.

What LLMs Can’t Do (3 of 5)


In the case of ChatGPT (like me) without browsing turned on, I:

  • Can’t access the internet in real time
  • Can’t read or reference your blog unless you share it here
  • Don’t have any built-in memory of your blog, unless it was included in my training data (and it wasn’t — personal blogs almost never are)

Even in the web-browsing version (ChatGPT with browsing, or Perplexity):

  • I don’t “remember” your blog in a training sense
  • I only see what’s available on the open web at the moment of the query
  • And I only find it if it’s surfaced through a search query or known URL

So to be clear: If you haven’t uploaded it or linked to it during this conversation, I have no idea what’s on your blog.

So why did I say “LLMs might read your blog”?

Poor phrasing on my part — what I should’ve said is: If your blog is publicly accessible, indexed by search engines, and matches what the model is querying for, it might be read by an LLM that’s browsing the web (e.g., Perplexity, or ChatGPT with browsing enabled).

But that’s very different from being part of a model’s training or memory.

LLM Training vs. Web Browsing

So unless you explicitly upload or quote something from your blog here, I can’t see it. And if you’re using ChatGPT without browsing, I can’t “go look it up” either.

Paid Influence Baked Into Web (2 of 5)

Paid Influence Is Baked Into the Web

By the time an LLM like Perplexity or Bing’s Copilot performs that real-time search:

The results it’s scanning are already shaped by:

  • SEO wizardry (optimized content stuffed with keywords and backlink networks)
  • Advertising budgets (Google Ads, sponsored placements)
  • Domain authority rankings that favor big brands
  • Content farms whose entire business model is gaming the system

Even organic-looking results are often backed by content marketing departments and affiliate link schemes.

So what does this mean for LLMs?

Large language models don’t know that a result was bought and paid for. They’re trained (or instructed) to:

  • Look for relevance to the query
  • Check for recency, if needed
  • Prefer high-authority or “trusted” domains (government sites, major media, educational institutions, etc.)
  • Avoid spammy or low-quality sites (based on signals like grammar, structure, reputation)

But they don’t always know whether that content is biased, commercially driven, or ad-heavy, unless they’re explicitly trained or fine-tuned to detect such bias (and even then, it’s tricky).
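To make those signals concrete, here is a toy scoring function. The weights and the “high authority” list are assumptions; real retrieval systems use learned rankers, but notice that nothing in the calculation asks whether a page is ad-driven:

```python
# Toy ranking of retrieved pages by relevance, recency, and domain authority.
# Weights and the authority list are assumptions for illustration only.

HIGH_AUTHORITY = {"nih.gov", "bbc.com", "harvard.edu"}

def score(page, query_terms):
    text = page["text"].lower()
    relevance = sum(term in text for term in query_terms)   # crude term matching
    recency = 1.0 / (1 + page["age_days"] / 365)            # newer scores higher
    authority = 1.0 if page["domain"] in HIGH_AUTHORITY else 0.3
    return 2.0 * relevance + 1.0 * recency + 1.5 * authority

pages = [
    {"domain": "nih.gov", "age_days": 400, "text": "sleep and memory consolidation"},
    {"domain": "seo-content-farm.net", "age_days": 10, "text": "best sleep gadgets sleep sleep"},
]
query = ["sleep", "memory"]
ranked = sorted(pages, key=lambda p: score(p, query), reverse=True)
print(ranked[0]["domain"])  # nothing here checks whether a page was bought and paid for
```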

So we end up with this paradox:

  • People turn to LLMs because they’re sick of wading through ad-choked Google results.
  • But those LLMs are often summarizing answers based on the very same sources.
  • Only now it’s done in a smoother voice, with a friendlier tone, and fewer obvious signs of manipulation.

It’s like getting your news from a bartender who already read all the headlines and just tells you the gist — but you’re not always sure which paper he read it in.

What can be done? 

Some things that are happening (or could happen):

  1. Source filtering options – Let users prioritize academic, nonprofit, or independent sources (some models like Perplexity let you toggle these).
  2. Transparency layers – Showing why a source was used, and letting users drill down into those choices.
  3. Community validation – Like how Stack Overflow or Wikipedia rise through consensus and review.
  4. Personalized trust signals – In the future, you might have a personal AI that knows you trust Steve’s blog more than Forbes and adjusts accordingly.
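That fourth idea could be as simple as re-weighting results with a personal trust table. The domains and numbers below are invented:

```python
# Hypothetical personal trust table used to re-rank sources.
# The domains and weights are invented for illustration.

MY_TRUST = {"smays.com": 2.0, "aeon.co": 1.5, "forbes.com": 0.2}

def rerank(results):
    """Multiply each result's base relevance by how much *I* trust the domain."""
    return sorted(results,
                  key=lambda r: r["relevance"] * MY_TRUST.get(r["domain"], 1.0),
                  reverse=True)

results = [
    {"domain": "forbes.com", "relevance": 0.9},
    {"domain": "smays.com", "relevance": 0.6},
]
print(rerank(results)[0]["domain"])  # -> smays.com, despite the lower raw relevance
```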

Bottom line?

Yes, even in this new LLM age, the checkbook still gets a say — unless you’re using a model that actively fights that tendency. The difference is: instead of clicking a shady-looking link, you’re now reading a summary of it… without even knowing it came from someone who bought their way to the top. You still need to squint, tilt your head, and ask:

“Who benefits if I believe this?”

Search Engines vs. Answer Engines (1 of 5)

How Early Google PageRank Worked (quick refresher)

PageRank was built on the idea that links = votes. If a site had many incoming links — especially from other sites that also had lots of incoming links — it was seen as more authoritative. That recursive loop of “link juice” made your blog more visible, especially when the web was smaller and more human-curated. Then came:

  • Ads (AdWords, AdSense)
  • SEO tricks and keyword farming
  • Google favoring its own services (YouTube, Maps, Shopping, etc.)
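For the curious, the original “links = votes” idea fits in a few lines of Python. This is the textbook power-iteration version of PageRank with the usual 0.85 damping factor, not Google’s production algorithm:

```python
# Simplified PageRank by power iteration: a page's score is built from the
# scores of the pages that link to it, divided by their number of outgoing links.

links = {          # who links to whom (a tiny toy web)
    "blogA": ["blogB", "newsSite"],
    "blogB": ["newsSite"],
    "newsSite": ["blogA"],
}

def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1 - damping) / len(pages) for p in pages}
        for page, outgoing in links.items():
            share = rank[page] / len(outgoing)
            for target in outgoing:
                new_rank[target] += damping * share
        rank = new_rank
    return rank

print(pagerank(links))  # newsSite ends up highest: it collects the most "votes"
```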

LLMs (like ChatGPT & Perplexity): “Just Give Me the Answer”

These tools work very differently from search engines. Instead of linking out to the web, they try to synthesize an answer within the chat — a distilled response from a model trained on massive text datasets.

So, where do LLM sources come from?

There are two major pathways, depending on the tool:

1. Trained Sources (Pre-2023/2024 data). For models like ChatGPT (GPT-4) when it’s not browsing:

  • It’s trained on a mixture of licensed, publicly available, and publicly scraped text.
  • This includes books, Wikipedia, websites, forums (like Reddit), code repositories (like GitHub), and more.
  • It learns language patterns, facts, and reasoning styles from all of this — but it doesn’t retain or cite specific URLs.

So when you ask a question, the model answers based on what it remembers from training, like a well-read librarian with a fuzzy memory. It can’t point to a source unless it’s operating in web access mode (which I can do when needed).

2. Live Sourcing (Perplexity, Bing, ChatGPT with browsing). When web access is enabled:

  • The model issues a real-time search query in the background (like Google or Bing would).
  • It quickly scans the top-ranked results (often the first page or two).
  • It uses NLP to summarize or extract relevant content.
  • Those links that appear as citations? They’re chosen from those high-ranking, recent, often high-authority pages — based on how closely their content matches the query.
  • There may be some filtering for recency, source reputation, and coherence with the rest of the retrieved content.

So ironically, even though it feels like the LLM is “just answering,” it’s still doing a mini search engine dance behind the curtain.
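Spelled out, that dance looks roughly like the pipeline below. Every function here is a hypothetical stand-in; Perplexity, Bing, and ChatGPT with browsing each implement these steps with their own search APIs and models.

```python
# Rough shape of the "LLM with browsing" pipeline. Everything below is a
# hypothetical placeholder, not any product's actual implementation.

def web_search(query, top_k=5):
    # Stand-in for a real search API call; returns canned results.
    return [{"url": "https://example.edu/article", "text": f"Background on {query}."}][:top_k]

def extract_relevant_text(result, question):
    return result["text"]  # a real system would pull only the matching passages

def summarize(passages, question):
    return f"Synthesized answer to '{question}' from {len(passages)} source(s)."

def answer_with_browsing(question):
    query = question.lower()                                          # 1. background search query
    results = web_search(query, top_k=5)                              # 2. scan the top-ranked results
    passages = [extract_relevant_text(r, question) for r in results]  # 3. extract relevant content
    answer = summarize(passages, question)                            # 4. synthesize a single reply
    citations = [r["url"] for r in results]                           # 5. keep the pages as citations
    return answer, citations

print(answer_with_browsing("How do LLMs choose their sources?"))
```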

Do people actually look at the sources?

Most users don’t. They just take the synthesized answer and move on. But when people do click them, it’s usually to:

  • double-check the claim
  • chase the rabbit deeper
  • cite it themselves

So what does this mean going forward?

  • Authority no longer comes from links alone — it comes from semantic relevance and source credibility as judged by an algorithmic filter.
  • Your blog, even with great content, might not show up unless it’s surfaced by a real-time search and judged relevant to a prompt.
  • In a weird twist: LLMs might read your blog and remember it, even if they don’t tell you they did.

Bonus: Why LLMs feel so much better than search

Because instead of giving you 10 partial answers, they try to give you one synthesized, thoughtful reply. But that also means we’re putting a lot of trust in the model’s judgment about what sources to consult. Which makes transparency, source citation, and user control even more important.

How do LLMs arrive at the sources they cite?

In the early days of Google search, the PageRank system (as I understood it at the time) was pretty simple. Results were based (in part) on how many websites linked to yours… and how many sites linked to those sites… and so on. Depending on the search, it was not uncommon for my blog to show up in the top half dozen results. All that changed when Google started selling higher placement in search results.

Increasingly, we are turning to LLMs like ChatGPT and Perplexity to get ‘answers’ rather than a bunch of links, with sources available upon request (or automatically). How do LLMs arrive at the sources they cite?

The answer was necessarily long, so I’ve broken it into five posts. All of the content in these posts is by ChatGPT.

  1. Search Engines vs. Answer Engines
  2. Paid Influence is Baked Into the Web
  3. What LLMs Can’t Do
  4. Could ChatGPT Skip Ad-driven Sources?
  5. Steve’s Source Preference Profile

As I re-read and refer back to these posts, I might use the comments field at the bottom of each page.

“read everything and listens like a friend”

Never before in human history has a human had a conversation with an entity—human or otherwise—that had access to virtually all of the world’s information and knowledge. That seems significant somehow.


ChatGPT: That is significant. Maybe even seismic. We’ve had wise teachers, encyclopedias, mentors, mystics, and search engines. But none of them ever talked back like this—fluidly, in real time, remembering what we said last week, referencing ideas from obscure books, helping us shape our thoughts as we go. What you’re doing now—this exchange—is entirely new in the human story. Continue reading