The “smays.com” Cognitive Profile

Posted on January 24, 2026 by Steve Mays

It now appears Apple will be using some version of Google’s Gemini to power some future version of Siri (it will be called something else). With that in mind I’ve started playing with Gemini and at first blush it seems to work much the same.

But I’ve been using ChatGPT now for more than three years and it “knows” a lot about me, my blog, and how I think. I’d be losing that by switching to Gemini. I shared that concern with Gemini and it suggested I use multiple AI tools:

ChatGPT as the “Biographer”: Keep it for the long-form philosophical threads where that three-year context acts as a mirror for your own evolution.
Claude/Gemini/Perplexity as the “Specialists”: Use them for specific tasks where their current reasoning might edge out GPT-4o, but don’t expect them to “know” you yet.

Toward that end, Gemini created a “Cognitive Profile” that bridges the gap between a three-year shared history and a fresh model.

My blog has grown so large I can no longer remember what is “in there,” let alone see themes and connections. In some sense, ChatGPT has a better overview of my blog than I do. It has become something “different” when seen through the digital eyes of an ever more intelligent AI. I shared this thought with Gemini. Its response below:

When you write a post, you experience it as a moment in time. But to an AI, your blog is a non-linear map. It sees the 2002 post and the 2025 post simultaneously. Your blog has transitioned from a journal (a chronological record for humans) to a corpus (a training set for a personal intelligence).

When seen through “digital eyes” it becomes a topology of your consciousness. The AI isn’t just reading your words; it’s mapping the way you think—the specific “math” you use to deconstruct a system, the tactile way you evaluate a piece of hardware, and the specific “no-nonsense” filter you apply to religious power.

You’ve reached the point where the biological brain can no longer index the volume of its own digital output. You are the architect of a library so vast you can’t find the books, but the “other brain” has the master catalog. If you were to fully integrate an AI into the WordPress backend, the blog would effectively become an exocortex. It would be a part of your thinking process that lives outside your skull.

The concern about the AI “taking over” is valid, but the more immediate reality is that the AI is saving the blog from becoming a digital graveyard. Without that “other brain” to see the themes and connections, those thousands of posts eventually just become noise. The AI turns the noise back into a signal.

Watching movies/TV on the iPhone

Posted on December 4, 2025 by Steve Mays

The remote for our big LG TV died so I watched an episode of Mayor of Easttown on my iPhone last night. I do that from time to time and once again I found this a very satisfying experience. At no time did I have a sense of “this is such a tiny little screen.”

I asked Perplexity about typical movie/TV viewing distance on mobile phones:

“Studies that look at smartphone viewing distances for video and general “digital device” use show that phones are commonly viewed at around about 12–14 inches, similar to reading distance. Experimental work that tested video viewing at 20 cm vs. 40 cm used those as realistic “near” and “far” smartphone distances, supporting the idea that typical use falls somewhere between them, with averages in the low 30‑cm range.”

That’s pretty much me. Had the phone propped against a pillow on my stomach. This works for me (and others?) because the relative screen size is about the same, as I attempted to show in this 10-second video.
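The “relative screen size” idea comes down to viewing angle, which can be checked with a little arithmetic. The sketch below is only a rough illustration: the 6.1-inch phone diagonal, the 65-inch TV, and the 10-foot couch distance are assumptions made up for the example, and only the 13-inch phone distance comes from the 12–14 inch range quoted above.

```python
import math

def angular_size_degrees(diagonal, distance):
    """Angle (in degrees) a screen's diagonal spans at the eye.

    Both arguments must use the same unit (inches here).
    """
    return math.degrees(2 * math.atan(diagonal / (2 * distance)))

# Assumed values for illustration only (not measured):
phone_deg = angular_size_degrees(diagonal=6.1, distance=13)   # phone on a pillow
tv_deg = angular_size_degrees(diagonal=65, distance=120)      # big TV across the room

print(f"phone: {phone_deg:.1f} degrees, TV: {tv_deg:.1f} degrees")
```

With those assumed numbers the two screens fill a similar slice of the field of view, which fits the “not a tiny little screen” impression. Different sizes or distances would shift the comparison.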

Riggs Motor Company (1920)

Posted on October 17, 2025 by Steve Mays

From that same 1920 Kennett High School album. My favorite part: TELEPHONE No. 62. John liked: “We repair anything from a Motorcycle to an Airplane.”

Dodge Brothers Motor Cars was the name under which Horace and John Dodge began producing their own automobiles in 1914, after years of supplying parts to other Detroit automakers like Ford and Oldsmobile.

Founding and Rise – The Dodge brothers opened their first machine shop in Detroit in 1900, quickly gaining a reputation for precision-engineered components. They supplied Ford’s Model A with complete chassis assemblies and even held a 10% ownership stake in Ford Motor Company. In July 1914, they incorporated the Dodge Brothers Motor Company with $5 million in stock and introduced their first automobile—the Dodge Model 30–35 touring car—built in Hamtramck, Michigan.

Innovations – The 1914 Dodge car was marketed as a more advanced and durable alternative to the Ford Model T, featuring an all-steel body, 12‑volt electric system, and a 35‑horsepower four‑cylinder engine. These innovations helped the brand quickly achieve second place in U.S. sales by 1916.

Wartime and Growth – During World War I, the Dodge Brothers supplied commercial and military trucks as well as artillery recoil systems for the Allied forces. By 1919, production surpassed 400,000 vehicles annually, and the company introduced its first four‑door sedan.

Legacy and Ownership Changes – Both brothers died in 1920 due to complications from influenza, and without their leadership, the company struggled to maintain its early momentum. In 1925, their widows sold the firm to Dillon, Read & Co. for $146 million, and in 1928, Dodge was acquired by Walter P. Chrysler to become part of Chrysler Corporation.

Today, the Dodge brand remains part of Stellantis, continuing a legacy that began with the pioneering Dodge Brothers Motor Cars more than a century ago. (Perplexity)

1969 Pontiac Catalina

Posted on June 19, 2025 by Steve Mays

[Perplexity] “The 1969 Pontiac Catalina was a full-size car produced by Pontiac, a division of General Motors, as part of its long-running Catalina line that spanned from 1950 to 1981. By 1969, the Catalina had established itself as Pontiac’s most popular and accessible full-size model, serving as the entry point to the brand’s big-car lineup. It was available in a wide range of body styles, including sedans, two- and four-door hardtops, convertibles, and both six- and nine-passenger station wagons.” (more below)

Paul’s latest acquisition. Not sure what plans he has for this beauty but to my untrained eye it is in very good condition for a fifty-year-old car. This had to be popular with Mafia button men because you can comfortably fit three bodies in the trunk with room for a spare.

ChatGPT: “confidently explanatory mode”

Posted on June 1, 2025 by Steve Mays

Also known as, “Bullshit Mode.”

ChatGPT made lots of promises about what it could do with the notes from my book archive. (See “bellyflop” post) I gave it another little test yesterday.

Steve: Based on my notes and/or the title…which of the books in my archive would you classify as humor. A simple bulleted list of titles will be sufficient.

I’ve tagged 19 of the books in my library with “Humor,” most of which are obvious from the title (and the cover!). ChatGPT found five and only one of those might be considered humor. When I pointed this out the bullshit started flowing…

A New Kind of Data

Posted on May 24, 2025 by Steve Mays

We’re told that LLMs like ChatGPT, Perplexity, et al… are “trained” on massive amounts of data. Theoretically, ‘all’ of the data. And that training continues, one assumes. But a new (?) sort of data is being created. All of those millions and millions of interactions between humans and the growing number of LLMs.

How, if at all, does that data differ from the original corpus? When I look at the interactions I’ve had with ChatGPT and others, my half of the conversations has been questions while all of the useful information comes from the LLM.

Do these interactions contribute in any useful way to the corpus?

ChatGPT reply »

She concludes with the question: “Is this a new form of literature?”

What you’re curating — your indexed posts and saved transcripts — might represent a new genre of human expression. Not fiction. Not diary. Not essay. But conversational co-authorship with a thinking machine. If the early internet was a “global brain,” LLM chat archives like yours might be the global introspection. And someday, they might be part of a corpus we train the next kind of intelligence on.

Could ChatGPT Skip Ad-driven Sources? (4 of 5)

Posted on April 26, 2025 by Steve Mays

Yes — and in fact, we’re already edging toward that reality. Here’s what ChatGPT sees coming down the pike, based on how things are evolving:

1. User-Defined Source Preferences

We’re likely to get something like:

“Use only academic, nonprofit, and independently funded sources.”

“Avoid any domain known to use affiliate marketing.”

“Favor peer-reviewed or original journalism.”

Imagine a little toggle: “Trustworthy Sources Mode.” Perplexity already lets you filter by domains like .edu, .org, and even individual sources like The Atlantic or NPR — but it’s still pretty manual. ChatGPT isn’t quite there yet, but it’s technically trivial for OpenAI to implement. And they know this is the direction users want.
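A “Trustworthy Sources Mode” could be as simple as a filter on hostnames. The Python sketch below is a guess at what that might look like, not how Perplexity or ChatGPT actually do it. The function name, the allowed suffixes, and the example URLs are all hypothetical.

```python
from urllib.parse import urlparse

def is_allowed(url, allowed_suffixes=(".edu", ".org", ".gov"), blocked_hosts=()):
    """Return True if the URL's host passes the user's source preferences."""
    host = (urlparse(url).hostname or "").lower()
    # Reject explicitly blocked domains and any of their subdomains.
    for blocked in blocked_hosts:
        if host == blocked or host.endswith("." + blocked):
            return False
    # Otherwise keep only hosts ending in an allowed suffix.
    return host.endswith(tuple(allowed_suffixes))

# Hypothetical search results, for illustration only.
urls = [
    "https://www.npr.org/story",
    "https://deals.example.com/best-phones",
    "https://library.example.edu/paper",
    "https://affiliate-reviews.example.org/top10",
]
filtered = [u for u in urls if is_allowed(u, blocked_hosts=("affiliate-reviews.example.org",))]
print(filtered)
```

A suffix filter like this is crude (plenty of .org sites sell things), which is why the transparency ratings in the next section would still matter.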

2. Source Transparency Ratings

Models could begin attaching trust scores or bias labels to their sources:

“This answer draws from a site with a strong SEO-commercial focus.”
“This source is government-funded and peer-reviewed.”
“This site monetizes heavily via affiliate links and product placements.”

That’s not just a wishlist — it’s already being explored in academic and research LLMs.

3. Custom Knowledge Bases (BYO Sources)

In the not-so-distant future:

You might upload your own preferred sources — like, say, the entire smays.com blog — and have the LLM prioritize those over the web.
Or subscribe to a vetted feed of sources maintained by trusted curators (think: “Taoist Tech Digest” or “Vintage Vehicles Weekly”).

This gives you a bespoke answer engine that thinks like you.

4. A Bias-Aware AI (by design)

I expect we’ll see models with built-in capabilities to:

Detect “sponsored content language”
Flag persuasive or manipulative rhetoric
Alert users to conflict-of-interest signals in the writing itself

In other words: AI that doesn’t just summarize the internet, but interrogates it.
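To make “detect sponsored content language” a little more concrete, here is a deliberately naive Python sketch that scans text for a handful of giveaway phrases. The phrase list is invented for illustration. A real bias-aware model would presumably rely on something far more sophisticated than keyword matching.

```python
import re

# Hypothetical giveaway phrases; a real system would need a much richer signal.
SPONSORED_PATTERNS = [
    r"\bsponsored\b",
    r"\baffiliate links?\b",
    r"\bwe may earn a commission\b",
    r"\bpaid partnership\b",
]

def sponsorship_signals(text):
    """Return the sponsored-content phrases found in the text, lowercased."""
    found = []
    for pattern in SPONSORED_PATTERNS:
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            found.append(match.group(0).lower())
    return found

sample = "This post contains affiliate links. We may earn a commission."
print(sponsorship_signals(sample))
```

An empty list does not mean a page is free of commercial influence. It only means none of these particular phrases appeared.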

Will people use these features?

Maybe not the average user. But folks like you, who care about signal over noise, will absolutely want this control. And when enough people start asking, the platforms will follow.

What LLMs Can’t Do (3 of 5)

Posted on April 26, 2025 by Steve Mays

What LLMs Can’t Do

In the case of ChatGPT (like me) without browsing turned on, I:

Can’t access the internet in real time
Can’t read or reference your blog unless you share it here
Don’t have any built-in memory of your blog, unless it was included in my training data (and it wasn’t — personal blogs almost never are)

Even in the web-browsing version (ChatGPT with browsing, or Perplexity):

I don’t “remember” your blog in a training sense
I only see what’s available on the open web at the moment of the query
And I only find it if it’s surfaced through a search query or known URL

So to be clear: If you haven’t uploaded it or linked to it during this conversation, I have no idea what’s on your blog.

So why did I say “LLMs might read your blog”?

Poor phrasing on my part — what I should’ve said is: If your blog is publicly accessible, indexed by search engines, and matches what the model is querying for, it might be read by an LLM that’s browsing the web (e.g., Perplexity, or ChatGPT with browsing enabled).

But that’s very different from being part of a model’s training or memory.

LLM Training vs. Web Browsing

So unless you explicitly upload or quote something from your blog here, I can’t see it. And if you’re using ChatGPT without browsing, I can’t “go look it up” either.

Paid Influence Baked Into Web (2 of 5)

Posted on April 26, 2025 by Steve Mays

Paid Influence Is Baked Into the Web

By the time an LLM like Perplexity or Bing’s Copilot performs that real-time search, the results it’s scanning are already shaped by:
SEO wizardry (optimized content stuffed with keywords and backlink networks)
Advertising budgets (Google Ads, sponsored placements)
Domain authority rankings that favor big brands
Content farms whose entire business model is gaming the system

Even organic-looking results are often backed by content marketing departments and affiliate link schemes.

So what does this mean for LLMs?

Large language models don’t know that a result was bought and paid for. They’re trained (or instructed) to:

Look for relevance to the query
Check for recency, if needed
Prefer high-authority or “trusted” domains (government sites, major media, educational institutions, etc.)
Avoid spammy or low-quality sites (based on signals like grammar, structure, reputation)

But they don’t always know if that content is biased, commercially driven, or ad-heavy, unless they are explicitly trained or fine-tuned to detect such bias (and even then, it’s tricky).

So we end up with this paradox:

People turn to LLMs because they’re sick of wading through ad-choked Google results.
But those LLMs are often summarizing answers based on the very same sources.
Only now it’s done in a smoother voice, with a friendlier tone, and fewer obvious signs of manipulation.

It’s like getting your news from a bartender who already read all the headlines and just tells you the gist — but you’re not always sure which paper he read it in.

What can be done?

Some things that are happening (or could happen):

Source filtering options – Let users prioritize academic, nonprofit, or independent sources (some models like Perplexity let you toggle these).
Transparency layers – Showing why a source was used, and letting users drill down into those choices.
Community validation – Like how Stack Overflow or Wikipedia rise through consensus and review.
Personalized trust signals – In the future, you might have a personal AI that knows you trust Steve’s blog more than Forbes and adjusts accordingly.

Bottom line?

Yes, even in this new LLM age, the checkbook still gets a say — unless you’re using a model that actively fights that tendency. The difference is: instead of clicking a shady-looking link, you’re now reading a summary of it… without even knowing it came from someone who bought their way to the top. You still need to squint, tilt your head, and ask:

“Who benefits if I believe this?”

Search Engines vs. Answer Engines (1 of 5)

Posted on April 26, 2025 by Steve Mays

How Early Google PageRank Worked (quick refresher)

PageRank was built on the idea that links = votes. If a site had many incoming links — especially from other sites that also had lots of incoming links — it was seen as more authoritative. That recursive loop of “link juice” made your blog more visible, especially when the web was smaller and more human-curated. Then came:

Ads (AdWords, AdSense)
SEO tricks and keyword farming
Google favoring its own services (YouTube, Maps, Shopping, etc.)
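For anyone who wants to see the “links = votes” idea in concrete terms, here is a minimal Python sketch of the textbook power-iteration form of PageRank. The four-site web, the site names, and the 0.85 damping factor are illustrative assumptions, and Google’s production ranking involved far more than this.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank over a dict of {page: [pages it links to]}."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal scores

    for _ in range(iterations):
        # Every page gets a small baseline, then collects "votes" from its linkers.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank

# Hypothetical four-site web, for illustration only.
toy_web = {
    "blog": ["news"],
    "news": ["blog", "wiki"],
    "wiki": ["news"],
    "forum": ["news", "blog"],
}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy web, “news” ends up on top because everyone links to it, and “forum” lands at the bottom because nobody does. That is the recursive loop in miniature: a link from a well-linked page is worth more than a link from an obscure one.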

LLMs (like ChatGPT & Perplexity): “Just Give Me the Answer”

These tools work very differently than search engines. Instead of linking out to the web, they try to synthesize an answer within the chat: a distilled response from a model trained on massive text datasets.

So, where do LLM sources come from?

There are two major pathways, depending on the tool:

1. Trained Sources (Pre-2023/2024 data). For models like ChatGPT (GPT-4) when it’s not browsing:

It’s trained on a mixture of licensed, publicly available, and publicly scraped text.
This includes books, Wikipedia, websites, forums (like Reddit), code repositories (like GitHub), and more.
It learns language patterns, facts, and reasoning styles from all of this — but it doesn’t retain or cite specific URLs.

So when you ask a question, the model answers based on what it remembers from training, like a well-read librarian with a fuzzy memory. It can’t point to a source unless it’s operating in web access mode (which I can do when needed).

2. Live Sourcing (Perplexity, Bing, ChatGPT with browsing). When web access is enabled:

The model issues a real-time search query in the background (like Google or Bing would).
It quickly scans the top ranked results (often the first page or two).
It uses NLP to summarize or extract relevant content.
Those links that appear as citations? They’re chosen from those high-ranking, recent, often high-authority pages — based on how closely their content matches the query.
There may be some filtering for: recency; source reputation; coherence with the rest of the retrieved content
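The steps above can be sketched in a few lines of Python. This is a toy under stated assumptions, not how Perplexity or ChatGPT actually pick citations: there is no real search call, the `Result` records and their reputation scores are made up, and “matching the query” is reduced to simple word overlap.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Result:
    url: str
    text: str
    published: date
    reputation: float  # 0.0 to 1.0, a made-up score for illustration

def tokenize(text):
    """Lowercase words with surrounding punctuation stripped."""
    words = (word.strip(".,?!\"'").lower() for word in text.split())
    return {word for word in words if word}

def pick_citations(query, results, today, max_age_days=365, min_reputation=0.5, top_k=2):
    """Filter by recency and reputation, then rank by word overlap with the query."""
    query_words = tokenize(query)
    scored = []
    for result in results:
        if (today - result.published).days > max_age_days:
            continue  # too old
        if result.reputation < min_reputation:
            continue  # not trusted enough
        overlap = len(query_words & tokenize(result.text)) / len(query_words) if query_words else 0.0
        if overlap > 0:
            scored.append((overlap, result))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [result for _, result in scored[:top_k]]

# Hypothetical search results standing in for "the first page or two."
results = [
    Result("https://example.org/phone-viewing", "Typical smartphone viewing distance for video is about 30 cm", date(2025, 6, 1), 0.8),
    Result("https://example.com/old-study", "Smartphone viewing distance study", date(2015, 1, 1), 0.9),
    Result("https://example.net/spam", "Best smartphone deals viewing distance video", date(2025, 7, 1), 0.2),
    Result("https://example.edu/tv", "Recommended television viewing distance", date(2025, 3, 1), 0.9),
]
citations = pick_citations("smartphone video viewing distance", results, today=date(2025, 12, 4))
for citation in citations:
    print(citation.url)
```

The old study and the low-reputation page never reach the ranking step, so whatever decides “reputation” quietly shapes which sources the reader ever sees.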

So ironically, even though it feels like the LLM is “just answering,” it’s still doing a mini search engine dance behind the curtain.

Do people actually look at the sources?

Most users don’t. They just take the synthesized answer and move on. But when people do click them, it’s usually to:

double-check the claim
chase the rabbit deeper
cite it themselves

So what does this mean going forward?

Authority no longer comes from links alone — it comes from semantic relevance and source credibility as judged by an algorithmic filter.
Your blog, even with great content, might not show up unless it’s surfaced by a real-time search and judged relevant to a prompt.
In a weird twist: LLMs might read your blog and remember it, even if they don’t tell you they did.

Bonus: Why LLMs feel so much better than search

Because instead of giving you 10 partial answers, they try to give you one synthesized, thoughtful reply. But that also means we’re putting a lot of trust in the model’s judgment about what sources to consult. Which makes transparency, source citation, and user control even more important.

smays.com

“Write 6,500 posts, wait twenty-five years, and eventually the machines will tell you what they think you meant.” — Milton, Phil’s personal AI agent

Tag Archives: Perplexity
