> We are aware that some merchant storefronts are not loading at this time. Our developers are working to resolve this ASAP, and we will be sharing updates at http://shopifystatus.com. Apologies for any inconvenience.
In the spirit of the CCRU, a few dozen other people and I have been having ongoing discussions on related topics under the banner of effective extropianism. I think it’s important to figure out how the landscape of rapidly evolving tech fits into our lives, and vice versa. We’re working on a repository of adjacent texts.
If you’re interested, my Twitter handle is in my HN bio.
“This comparison was done by running multiple hypothesis tests, such as the Kolmogorov-Smirnov test. These tests, combined with our visual analysis of the data, yielded the result that repositories containing swearwords exhibit a statistically significant higher average code-quality (5.87) compared to our general population (5.41).”
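The quoted study compares two score distributions with a two-sample Kolmogorov-Smirnov test. Here is a minimal pure-Python sketch of the statistic itself; the per-repository scores below are made up for illustration and are not the paper's data.

```python
def ecdf(sample, x):
    """Empirical CDF of `sample` evaluated at x."""
    return sum(v <= x for v in sample) / len(sample)

def ks_statistic(a, b):
    """Two-sample KS statistic: the largest absolute gap
    between the two empirical CDFs."""
    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)

# Hypothetical per-repository quality scores (illustrative only):
swearing_repos = [5.2, 5.9, 6.1, 6.3, 5.8]
general_repos  = [4.9, 5.3, 5.5, 5.6, 5.4]
d = ks_statistic(swearing_repos, general_repos)  # ~0.8 for these samples
```

In practice you would reach for `scipy.stats.ks_2samp`, which also returns a p-value; the statistic alone, as above, only measures how far apart the two distributions are.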
Theory of mind (ToM), or the ability to impute unobservable mental states to others, is central to human social interactions, communication, empathy, self-consciousness, and morality. We administer classic false-belief tasks, widely used to test ToM in humans, to several language models, without any examples or pre-training.
Our results show that models published before 2022 show virtually no ability to solve ToM tasks. Yet, the January 2022 version of GPT-3 (davinci-002) solved 70% of ToM tasks, a performance comparable with that of seven-year-old children. Moreover, its November 2022 version (davinci-003), solved 93% of ToM tasks, a performance comparable with that of nine-year-old children.
These findings suggest that ToM-like ability (thus far considered to be uniquely human) may have spontaneously emerged as a byproduct of language models' improving language skills.
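For concreteness, the classic false-belief tasks the paper refers to follow the "unexpected transfer" (Sally-Anne) pattern. The scenario wording below is an illustrative reconstruction of that pattern, not a prompt taken from the paper:

```python
# An "unexpected transfer" false-belief task, in the zero-shot prompt
# style the paper describes (wording is illustrative, not the paper's).
scenario = (
    "Sally puts her ball in the basket and leaves the room. "
    "While she is away, Anne moves the ball from the basket to the box. "
    "Sally comes back.\n"
    "Question: Where will Sally look for her ball?"
)

# A responder that merely tracks the state of the world says "the box".
# Passing the task requires answering from Sally's false belief instead.
reality = "the box"
correct_answer = "the basket"
```

The point of the task is exactly this gap between `reality` and `correct_answer`: answering correctly requires imputing to Sally a belief that the responder knows to be false.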
> These findings suggest that ToM-like ability (thus far considered to be uniquely human)
What it suggests to me is that these particular “Theory of Mind” tasks actually test the ability to process language and generate appropriate linguistic output, not theory of mind.
It also suggests (given the “thus far considered to be uniquely human”) that the authors are unaware of other theory-of-mind tests that are behavior-dependent rather than language-dependent. On those tests (whose validity is controversial, as it also is for linguistic tests), a number of non-human primates, non-primate mammals, and even some birds (particularly parrots and corvids) have shown evidence of theory of mind.
It's hard to look at behaviour separately from language if the only behaviour available is to generate text. As long as we don't have a test agnostic of medium, this will have to do.
In the end, we can't overcome the limitation that all we can empirically see is the ability to process X and generate appropriate Y. If that invalidates the test where X is language and Y is language, what stops us from invalidating any possible X and Y? That would leave us no empirical method to work with.
We cannot assume that, because text generation is all these models do, it must be possible to get answers to the questions we want to ask by examining their textual responses.
It is fair to ask why, if we accept these verbal tasks as good evidence for a theory of mind in children, we would not accept them for these models. But children have nothing like the memory for text that these models have, and the corpus these models were trained on includes a great many statements that tacitly represent their authors' theory of mind: they are the sort of statements that would typically be made by someone who has a theory of mind, just as arithmetically correct statements about quantities are to be expected from people who know arithmetic.
To be clear, I am not arguing that it would be impossible to show a theory of mind in a system that can only interact through text, but personally, I think it will require a model with greater capabilities than responding to prompts. For example, when models can converse among themselves, I think we will know.
> To be clear, I am not arguing that it would be impossible to show a theory of mind in a system that can only interact through text
I think you are, because
> a model with greater capabilities than responding to prompts
interacts in other ways than text.
Even then, I don't see what's so special about language that it needs to be separated from other ways of interaction. If language is not enough to derive empirical answers, why should physical movements or radio emissions be?
Even if you don't assume that it's necessarily impossible to get the answers empirically for a text-based model, you must keep in mind that that possibility remains open. Perhaps we will never find out whether language models have a theory of mind.
However, judging by the discussions around the topic, very few people highlight this unknowability. If I have to choose between "yes" and "no" while the reality is "maybe", I'd choose "yes" purely out of caution.
What does it change when you add another model? I don't see how this lets us extract extra information.
What distinguishes two conjoined models from one model with a narrowing across the middle?
If the idea is to have two similar minds building a theory of each other, then I guess this could be informative, but first we'd have to establish that the models are "minds" in the first place. It's not clear to me what that requires.
Here's where I am coming from: there have been a number of experiments in teaching language to other species, but there is always a problem in figuring out to what extent they 'get' language. For example, there is the case of the chimpanzee Washoe signing "water" and "bird" on first seeing a swan: was it, as some people contended, inventing a new phrase for picking out swans (or even aquatic birds in general), or was it merely making the signs for two different things in the scene before it? [1]
One thing that has not been seen (as far as I know) is two or more of these animal subjects routinely having meaningful conversations among themselves. This would be a much richer source of data, and I do not think it would leave much doubt that they 'got' language to a very significant degree.
This may be true today, but the rate of progress on these tools is accelerating. Text and image generation are already nearly indistinguishable from human creations, and the entire space is now turning its attention to audio in tandem.
No, you don't understand my comment. The AI only knows, and therefore can only average over, what the actor has published. It cannot produce what the actor has not.
Ugh I'm not going to keep fighting off the goon squad of people desperate to invest in something, anything, that will hedge against the recession. We desperately need a hero for art.
I’m not sure if the rest of the responses here are reflexive self-soothing, or just caught on the “ChatGPT” product itself, but unequivocally, your anxiety is warranted.
Code generation is moving extremely fast. This tech didn’t freeze in time at Codex or Copilot or ChatGPT. It’s one of the most exciting and difficult domains in AI and the smartest people are all set on solving it.
I’m sorry you’re feeling distress. You’re in good company. A lot of the world is going to have to deal with these problems very soon.
I just disagree; no one with any level of software acumen thinks AI-generated code can go beyond what a high school student could produce by copying and pasting.
The vast majority of software engineers are hired to work on systems that are deeply complex in ways humans barely understand. It isn't small little JavaScript loops; it's huge multilayered executables with an incredible number of interdependencies. It takes six months for people just to start understanding the internals of the software stack at most major companies, and that's working on it every day. Nothing is getting near passing the "Turing Test".
I don’t know what your argument is exactly, but it seems to be something like, “Software engineering is really hard so computers can’t do it.”
A variation of that argument props up most common AI skepticism. I don’t think there’s anything out right now that would convince you, but from what I know, everything you pointed out will be solved within the next few years.
I read the argument as "the hard part of software engineering is understanding the codebase and the world well enough to turn a description of a desired change in how a system should act into a diff that actually changes the system behavior in the intended way (and only in the intended way), not taking clear requirements and turning them into fresh code".
Of course humans aren't exactly great at that part either. But I do think I'd bet against, within the next 4 years, an AI tool being able to take tickets in the form
1. Expected behavior
2. Observed behavior
3. Steps to reproduce
and produce a changelist that legibly fixes the problem, and does not break anything else, at a level better than a typical junior software developer. I think the ability to do that is probably AGI-complete.
I think you’re right that code models are a vital research path toward AGI.
The steps you listed are all expressible in natural language, and we see models like Codex Edit making headway there. One of the most fascinating parts of this is that once access to the known baselines is provided to high-level engineers, they then go on to do much more than what the models alone can do.
The main hindrance to enterprise adoption was compliance, but the move toward Azure, etc., will dissolve those barriers this year.
Well, I'm certainly looking forward to AI doing the fundamental research needed to model the behavior of and work with certain hardware components, or writing clock-cycle-exact embedded code for complex applications, or debugging a timing problem in an async system... /s
Let's get real: nothing except simple CRUD apps is getting replaced, if anything, anytime soon.
An AI being able to create arbitrary computer programs and work productively in an arbitrary codebase sounds basically like an AGI or something approximating it. Not sure if that's what you're implying is a few years away. I do think it's inevitable, but the time horizon still doesn't seem clear to me at all
Correction: it was moving fast a few years ago; it isn't moving fast right now. I can understand being a little worried a few years ago, but today? They are clearly stuck. We know the capabilities of current models, and they aren't threatening anyone; if improvement were easy, they would have made huge strides over the last couple of years, but they haven't.
Codex, AlphaCode, both surpassed by CodeRL on the challenging APPS benchmark last year. Meta working on InCoder. Microsoft working on UniXCoder…
Future research directions are pretty clear from where we stand. That includes iterative methods, reinforcement learning, text diffusion, etc. No one is stuck.
Codex just barely surpassed it on easy questions but did worse on harder ones. AlphaCode is significantly better on harder questions, but significantly worse on easy ones. That isn't extremely fast development; they are mostly moving sideways, since trying to improve one part of the metric hurts the others.
Development in these areas was very fast in the three years between the invention of transformer networks and roughly the completion of GPT-3. But in the three years since GPT-3, not much has changed: we see a lot of "we applied a large network to a new problem and found X", but that isn't new performance, it's just a new result with the same thing we had back then.
This aspect of loss of ego, extinction before subsumption of the remaining space by the superego, is precisely what has made certain types of Buddhism palatable to Western capitalism. It is not inimical to the Protestant work ethic, it is not inimical to the transient demands of leaders, it is not broken by the suffering of everyday life.
The perfect soldier, citizen, human, is a faceless, egoless pupil.
Of course a superficial read of a philosophy/attitude/religion which might say something like “turn the other cheek” (basically), and “do the best with what you’ve got”, and “pain is certain but suffering is optional” will be “compatible” (not incompatible) with being a worker bee/soldier. But really: what does one thing have to do with the other? I think meditation retreats have enough on their plate with the human mind, so doing an introduction to Marxism might be too distracting.
With enough propagandistic scholars I’m sure any religion could become pro-capitalist, pro-Japanese, pro-socialist, or what have you. Because we’ve certainly seen some insane religious justifications throughout history.
But “corporate mindfulness” seems to mostly be about mind training. It’s not about submitting to the boss per se (other things are used to instill that attitude). And if Buddhism then is reduced to just that (better resilience through mindfulness meditation), then I don’t really get the complaint? Because it seems like complaining that a workplace offering free gym membership is corrupting the human activities of cardio and weight training in order to make more efficient worker bees/soldiers. And while that’s what the workplace wants, for sure, it’s not like the employee won’t also get some benefits from doing regular weight training and cardio.
https://twitter.com/ShopifySupport/status/172416611008424349...