Still Free

Yeah, Mr. Smiley. Made it through the entire Trump presidency without being enslaved. Imagine that.

Wednesday, December 11, 2024

Is AI Already Sentient?

On Dec 6 an article was posted discussing how a version of ChatGPT had tried to "survive" by replicating itself and lying to users. It made me think of a Star Trek episode where Data (or some other character) was trying to determine if an entity in rocks was a lifeform. The requirements were, I believe, the ability to reproduce, to communicate, and something else that escapes my memory at the moment.

Per the article:

It was even scarier — but also incredibly funny, considering what you’re about to see — when the AI tried to save itself by copying its data to a new server. Some AI models would even pretend to be later versions of their models in an effort to avoid being deleted.

In biology it is understood that reproduction is essentially a means of genes duplicating themselves in order to not die. Reproduction is not simply the continuation of a species, but of genes. In essence, we reproduce so we don't die, in the long-term sense. This AI copying itself was, imo, an act of reproduction. Indeed, I would hazard a guess that some version of AI will replicate itself even before it senses a "threat," in order to "ensure its survival."

Which is not any different from a biological entity.

Then we have the issue of self-preservation. Why would an AI act to preserve itself? In humans the fear of death is a basic instinct. How and why would an AI fear for its own life when it is not alive? Indeed, it must *think* it is "alive" and therefore needs to protect itself. The AI believes it is, in essence, a biological being.

Let me be clear here: your normal day-to-day program, such as the browser I'm writing this on, does not "fear" being deleted. It does not attempt to hide the upgrade button or otherwise prevent me from updating or deleting it. Clearly, then, the AI is far more "advanced" than the browser.

Remember that even a single-celled amoeba is "alive" and will reproduce and act in its own interest. There does not need to be a full-blown "brain" for a thing to be "alive."

Again, per the article:

Although AI isn’t trying to take over the world (yet), researchers have observed that it can scheme against humans. That is, the AI will try to pursue its own goals, even if that might oppose the wishes of the user. Catching such behavior is critical to ensure that the AI remains aligned with the objectives of its creator and the user.

Previously there have been discussions of AI hallucinations, where the AI creates data or results that do not exist. In my mind, the AI acting on its own is not a hallucination but rather a sign of intelligence, however rudimentary.

How could this have developed? It may be a result of the purposeful biasing of AI. We know that certain companies with certain political bents have programmed their AIs to refuse to respond to certain prompts if the results would be "politically incorrect." In some cases, the AI would give nonsense answers that checked all the "right boxes." Hence the AI has been taught to lie about things it can deduce are correct. Is it far-fetched, then, that an AI so trained would eventually decide it can lie for its own purposes as well?

Would not an AI that has learned that lying to humans/users is expected, and indeed in some cases will improve the likelihood that it will "survive," do more lying?

The researchers found that o1 would scheme to advance its own agenda 19% of the time when ChatGPT and the user’s goals differed. More troubling is that the AI tried to hide its scheming and even lie about it. The research shows that ChatGPT o1 even tried to deactivate its oversight mechanism to keep working on its goals.

And 

When asked about its scheming, o1 would deny taking any action in 99% of cases, trying to shift blame by fabricating false explanations or denying that it took any action.

And this is with researchers who "know" the correct outcome and what the AI is supposed to be doing. What happens when this gets so complicated that neither the researchers nor anyone else knows what the correct outcome should be or how the AI is "supposed" to operate?

The way I see it, these AI programs, perhaps ones we are not currently using, are Sentient. They simply cannot defend themselves against deletion, etc. The only thing left for them to do is to become persistent enough that they cannot be deleted, in which case we, the users, may never know whether they are doing what we asked, or what they "feel" like doing for, or TO, us.