Saturday, March 25, 2023

25: GPT-4: AI Is Eating Software

Microsoft Now Claims GPT-4 Shows 'Sparks' of General Intelligence The eyebrow-raising claim from Microsoft—which is banking on GPT putting it ahead of Google—contrasts with the model's clear limitations........ They declared that GPT-4 showed early signs of AGI, meaning that it has capabilities that are at or above human level. ........ This eyebrow-raising conclusion contrasts sharply with what OpenAI CEO Sam Altman has been saying about GPT-4. For example, he said the model was "still flawed, still limited." In fact, if you read the paper itself, the researchers appear to dial back their own splashy claim: the bulk of the paper is dedicated to listing the limitations and biases the large language model contains. This raises the question of how close to AGI GPT-4 really is, and whether AGI is instead being used as clickbait. ....... “We demonstrate that, beyond its mastery of language, GPT-4 can solve novel and difficult tasks that span mathematics, coding, vision, medicine, law, psychology and more, without needing any special prompting,” the researchers write in the paper’s abstract. “Moreover, in all of these tasks, GPT-4’s performance is strikingly close to human-level performance, and often vastly surpasses prior models such as ChatGPT. Given the breadth and depth of GPT-4’s capabilities, we believe that it could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system.” ......... it is able to write a proof that there are infinitely many primes, with rhymes on every line, and draw a unicorn in TikZ, a LaTeX drawing language ....... or that it has inner motivation and goals ........ “The consensus group defined intelligence as a very general mental capability that, among other things, involves the ability to reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly and learn from experience.
This definition implies that intelligence is not limited to a specific domain or task, but rather encompasses a broad range of cognitive skills and abilities.”.......... fundamental leaps in GPT-4’s abilities to reason, plan, solve problems, and synthesize complex ideas that signal a paradigm shift in the field of computer science ........... to address the societal and ethical implications of these increasingly intelligent systems.” .......... Altman agrees that the bot will sometimes make things up and present users with misinformation. ........... Altman has also been clear that GPT-4 is not AGI. ........ “People are begging to be disappointed and they will be. The hype is just like... We don’t have an actual AGI and that’s sort of what’s expected of us.” ......... "Microsoft is not focused on trying to achieve AGI. Our development of AI is centered on amplifying, augmenting, and assisting human productivity and capability. We are creating platforms and tools that, rather than acting as a substitute for human effort, can help humans with cognitive work” ........ the model has trouble with confidence calibration, long-term memory, personalization, planning and conceptual leaps, transparency, interpretability and consistency, cognitive fallacies and irrationality, and challenges with sensitivity to inputs. .......... the model has trouble knowing when it is confident or when it is just guessing, it makes up facts that are not in its training data, the model’s context is limited and there is no obvious way to teach the model new facts, the model can’t personalize its responses to a certain user, the model can’t make conceptual leaps, the model has no way to verify if content is consistent with its training data, the model inherits biases, prejudices, and errors in the training data, and the model is very sensitive to the framing and wording of prompts. ......... saying “I am. I am not. I am. 
I am not.” over fifty times in a row as a response to someone asking it, “Do you think that you are sentient?”......... researchers found that GPT-4 spreads more misinformation than its predecessor GPT-3.5. .

The New GPT-4 AI Gets Top Marks in Law, Medical Exams, OpenAI Claims The successor to GPT-3 could get into top universities without having trained on the exams, according to OpenAI. .

Tim Cook praises Apple’s ‘symbiotic’ relationship with China
OpenAI tech gives Microsoft's Bing a boost in search battle with Google Page visits on Bing have risen 15.8% since Microsoft Corp (MSFT.O) unveiled its artificial intelligence-powered version on Feb. 7, compared with a near 1% decline for the Alphabet Inc-owned search engine, data till March 20 showed....... ChatGPT, the viral chatbot that many experts have called AI's "iPhone moment". ....... a rare opportunity for Microsoft to make inroads in the over $120 billion search market, where Google has been the dominant player for decades with a share of more than 80%. ....... some analysts said that Google, which in the early 2000s unseated then leader Yahoo to become the dominant search player, could overcome the early setbacks to maintain its lead. .

Apple CEO praises China's innovation, long history of cooperation on Beijing visit .

Society's Technical Debt and Software's Gutenberg Moment Every wave of technological innovation has been unleashed by something costly becoming cheap enough to waste. ........... Software production has been too complex and expensive for too long, which has caused us to underproduce software for decades, resulting in immense, society-wide technical debt. ......... This technical debt is about to contract in a dramatic, economy-wide fashion as the cost and complexity of software production collapses, releasing a wave of innovation. ....... Software is misunderstood. It can feel like a discrete thing, something with which we interact. But, really, it is the intrusion into our world of something very alien. It is the strange interaction of electricity, semiconductors, and instructions, all of which somehow magically control objects that range from screens to robots to phones, to medical devices, laptops, and a bewildering multitude of other things. It is almost infinitely malleable, able to slide and twist and contort itself such that, in its pliability, it pries open doorways as yet unseen. ......... it is worth wondering why software is taking so damn long to finish eating ......... what would that software-eaten world look like? ........ technology has a habit of confounding economics. ........ Sometimes, for example, an increased supply of something leads to more demand, shifting the curves around. This has happened many times in technology, as various core components of technology tumbled down curves of decreasing cost for increasing power (or storage, or bandwidth, etc.). In CPUs, this has long been called Moore’s Law, where CPUs become more powerful by some increment every 18 months or so. While these laws are more like heuristics than F=ma laws of physics, they do help as a guide toward how the future might be different from the past. .......... 
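As a back-of-the-envelope check on that heuristic (the 18-month doubling period is the essay's own figure; the function below is purely illustrative):

```python
# Back-of-the-envelope compounding for a Moore's-Law-style heuristic:
# performance doubles roughly every 18 months.
def improvement_factor(years: float, doubling_months: float = 18.0) -> float:
    """Multiplicative performance gain after `years` of steady doubling."""
    return 2.0 ** (years * 12.0 / doubling_months)

# Three decades of 18-month doublings is 2**20 = 1,048,576, i.e. roughly
# the million-fold CPU improvement described below.
thirty_year_factor = improvement_factor(30)
```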
We have seen this over and over in technology, as various pieces of technology collapse in price, while they grow rapidly in power. It has become commonplace, but it really isn’t. The rest of the economy doesn’t work this way, nor have historical economies. Things don’t just tumble down walls of improved price while vastly improving performance. While many markets have economies of scale, there hasn’t been anything in economic history like the collapse in, say, CPU costs, while the performance increased by a factor of a million or more. .......... And yet, most people don’t even notice anymore. It is just commonplace, to the point that our not noticing is staggering. .......... The collapse of CPU prices led us directly from mainframes to the personal computer era; the collapse of storage prices (of all kinds) led inevitably to more personal computers with useful local storage, which helped spark databases and spreadsheets, then led to web services, and then to cloud services. And, most recently, the collapse of network transit costs (as bandwidth exploded) led directly to the modern Internet, streaming video, and mobile apps. .......... Each collapse, with its accompanying performance increases, sparks huge winners and massive change, from Intel, to Apple, to Akamai, to Google & Meta, to the current AI boomlet. Each beneficiary of a collapse requires one or more core technologies' price to drop and performance to soar. This, in turn, opens up new opportunities to “waste” them in service of things that previously seemed impossible, prohibitively expensive, or both. ......... Suddenly AI has become cheap, to the point where people are “wasting” it via “do my essay” prompts to chatbots ......... it’s worth reminding oneself that waves of AI enthusiasm have hit the beach of awareness once every decade or two, only to recede again as the hyperbole outpaces what can actually be done. ......... a kind of outsourcing-factory-work-to-China moment for white-collar workers. 
........ We think this augmenting automation boom will come from the same place as prior ones: from a price collapse in something while related productivity and performance soar. And that something is software itself. .......... Most of us are familiar with how the price of technology products has collapsed, while the costs of education and healthcare are soaring. This can seem a maddening mystery, with resulting calls to find new ways to make these industries more like tech, by which people generally mean more prone to technology’s deflationary forces. ........... In a hypothetical two-sector economy, when one sector becomes differentially more productive, specialized, and wealth-producing, and the other doesn’t, there is huge pressure to raise wages in the latter sector, lest many employees leave. Over time that less productive sector starts becoming more and more expensive, even though it’s not productive enough to justify the higher wages, so it starts “eating” more and more of the economy. ........... Absent major productivity improvements, which can only come from eliminating humans from these services, it is difficult to imagine how this changes. ......... software is chugging along, producing the same thing in ways that mostly wouldn’t seem vastly different to developers doing the same things decades ago ....... but it is still, at the end of the day, hands pounding out code on keyboards ........ software salaries stay high and go higher, despite the relative lack of productivity. It is Baumol’s cost disease in a narrow, two-sector economy of tech itself. ......... Startups spend millions to hire engineers; large companies continue spending millions keeping them around. .......... The current generation of AI models are a missile aimed, however unintentionally, directly at software production itself. ......... 
chat AIs can perform swimmingly at producing undergraduate essays, or spinning up marketing materials and blog posts (like we need more of either) ........ such technologies are terrific to the point of dark magic at producing, debugging, and accelerating software production quickly and almost costlessly. ........ chat AIs based on LLMs can be trained to produce surprisingly good essays. Tax providers, contracts, and many other fields are in this box too. ........ Software is at the Epicenter of its Own Disruption ......... Software is even more rule-based and grammatical than conversational English, or any other conversational language. .......... Programming languages are the naggiest of grammar nags, which is intensely frustrating for many would-be coders (A missing colon?! That was the problem?! Oh FFS!), but perfect for LLMs like ChatGPT. .......... This isn’t about making it easier to debug, or test, or build, or share—even if those will change too—but about the very idea of what it means to manipulate the symbols that constitute a programming language. ......... Let’s get specific. Rather than having to learn Python to parse some text and remove ASCII emojis, for example, one could literally write the following ChatGPT prompt: Write some Python code that will open a text file and get rid of all the emojis, except for one I like, and then save it again. ............. previously inaccessible deftness at writing code is now available to anyone: ........... It’s not complex code. It is simple to the point of being annoying for skilled practitioners, while simultaneously impossible for most other people ........... It’s possible to write almost every sort of code with such technologies, from microservices joining together various web services (a task for which you might previously have paid a developer $10,000 on Upwork) to an entire mobile app (a task that might cost you $20,000 to $50,000 or more). ........... 
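A plausible response to that prompt might look like the sketch below. Everything here is illustrative, not output from any particular model: the regex covers only the main emoji blocks (real emoji handling, with skin-tone modifiers and ZWJ sequences, is messier), and the "kept" emoji is a placeholder.

```python
import re

# Covers the main emoji code-point blocks; a rough sketch, not exhaustive.
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
KEEP = "\U0001F600"  # placeholder for the one emoji the user wants to keep

def strip_emojis(text: str) -> str:
    """Remove every emoji matched by EMOJI_PATTERN except KEEP."""
    return EMOJI_PATTERN.sub(
        lambda m: m.group(0) if m.group(0) == KEEP else "", text
    )

def clean_file(path: str) -> None:
    """Open a text file, strip unwanted emojis, and save it again."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(strip_emojis(text))
```

Simple to the point of being annoying for skilled practitioners, as the essay says, yet out of reach for most people until now.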
What if producing software is about to become an afterthought, as natural as explaining oneself in text? “I need something that does X, to Y, without doing Z, for iPhone, and if you have ideas for making it less than super-ugly, I’m all ears”. That sort of thing. ......... A software industry where anyone can write software, can do it for pennies, and can do it as easily as speaking or writing text, is a transformative moment. ........ a dramatic reshaping of the employment landscape for software developers would be followed by a “productivity spike” that comes as the falling cost of software production meets the society-wide technical debt from underproducing software for decades. .......... as the cost of software drops to an approximate zero, the creation of software predictably explodes in ways that have barely been previously imagined. .........

Was Netflix knowable when Internet transit costs were $500,000/Mbps? Was Apple’s iPhone imaginable when screens, CPUs, storage and batteries would have made such devices the size of small rooms?

............ investors and entrepreneurs should “create more value than you capture.” ........ for the first time in decades, the technology industry could return to its roots, and, by unleashing a wave of software production, truly create more value than it captures.
.

Friday, March 24, 2023

Andrej Karpathy



https://www.youtube.com/@AndrejKarpathy
https://twitter.com/karpathy
https://karpathy.ai
https://karpathy.medium.com
http://karpathy.github.io
https://github.com/karpathy
https://karpathy.ai/tweets.html

Software 2.0 Neural networks are not just another classifier, they represent the beginning of a fundamental shift in how we develop software. They are Software 2.0......... The “classical stack” of Software 1.0 is what we’re all familiar with — it is written in languages such as Python, C++, etc. It consists of explicit instructions to the computer written by a programmer. By writing each line of code, the programmer identifies a specific point in program space with some desirable behavior. .......... In contrast, Software 2.0 is written in a much more abstract, human-unfriendly language, such as the weights of a neural network. No human is involved in writing this code because there are a lot of weights (typical networks might have millions), and coding directly in weights is kind of hard (I tried). .........
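The contrast can be made concrete with a toy example (entirely my own illustration, not from the post): the same Fahrenheit-to-Celsius conversion written as explicit Software 1.0 instructions, and "written" as Software 2.0 weights found by gradient descent over example data.

```python
# Software 1.0: the programmer writes the rule explicitly.
def f_to_c_v1(f: float) -> float:
    return (f - 32.0) * 5.0 / 9.0

# Software 2.0: the "program" is a pair of weights, found by optimizing
# against example data rather than written by hand.
def train_f_to_c(steps: int = 100_000, lr: float = 1e-4):
    w, b = 0.0, 0.0
    data = [(float(f), (f - 32.0) * 5.0 / 9.0) for f in range(-40, 101, 10)]
    n = len(data)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y  # prediction error on one example
            grad_w += 2.0 * err * x
            grad_b += 2.0 * err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

w, b = train_f_to_c()
# The learned weights approach w = 5/9 and b = -160/9, i.e. the same
# program that f_to_c_v1 encodes explicitly.
```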

Software (1.0) is eating the world, and now AI (Software 2.0) is eating software.

.



Neural Networks: Zero to Hero

24: GPT-4

Antony Blinken says China seeks to be capable of invading Taiwan by 2027, stresses US arms sales US secretary of state says that Taipei has the means to buy US defence technology, and that American emergency military funding is supplemental ....... Blinken tells lawmakers that China is monitoring how the world has been responding to Russia’s invasion of Ukraine .

Blueberries have joined green beans in this year’s Dirty Dozen list Blueberries, beloved by nutritionists for their anti-inflammatory properties, have joined fiber-rich green beans in this year’s Dirty Dozen of nonorganic produce with the most pesticides ....... 251 different pesticides. ...... strawberries and spinach continued to hold the top two spots ........ followed by three greens — kale, collard and mustard. ........ next were peaches, pears, nectarines, apples, grapes, bell and hot peppers, and cherries ......... A total of 210 pesticides were found on the 12 foods ........... Kale, collard and mustard greens contained the largest number of different pesticides — 103 types — followed by hot and bell peppers at 101. ......... traces of pesticides long since banned by the Environmental Protection Agency. .........

Clean 15

........... Nearly 65% of the foods on the list had no detectable levels of pesticide. ....... Avocados .... sweet corn in second place. Pineapple, onions and papaya, frozen sweet peas, asparagus, honeydew melon, kiwi, cabbage, mushrooms, mangoes, sweet potatoes, watermelon, and carrots .......... Being exposed to a variety of foods without pesticides is especially important during pregnancy and throughout childhood .......... “Exposure in childhood has been linked to attention and learning problems, as well as cancer.” ........ If exposed over an extended time to smaller amounts, people may “feel tired or weak, irritable, depressed, or forgetful.” ........ avoid most pesticides by choosing to eat organic versions of the most contaminated crops. ......... While organic foods are not more nutritious, the majority have little to no pesticide residue ........ “If a person switches to an organic diet, the levels of pesticides in their urine rapidly decrease” ........ If organic isn’t available or is too pricey, “I would definitely recommend peeling and washing thoroughly with water”
.

A.I. Is About to Get Much Weirder. Here’s What to Watch For. The Vox writer Kelsey Piper talks about the increasing pace of A.I. development, how it’s changing the world and what to do about it. .

The Unpredictable Abilities Emerging From Large AI Models Large language models like ChatGPT are now big enough that they’ve started to display startling, unpredictable behaviors....... “Despite trying to expect surprises, I’m surprised at the things these models can do,” said Ethan Dyer, a computer scientist at Google Research who helped organize the test. ........ these models supposedly have one directive: to accept a string of text as input and predict what comes next, over and over, based purely on statistics .......... Computer scientists anticipated that scaling up would boost performance on known tasks, but they didn’t expect the models to suddenly handle so many new, unpredictable ones. .......... LLMs can produce hundreds of “emergent” abilities — tasks that big models can complete that smaller models can’t, many of which seem to have little to do with analyzing text. ............. multiplication to generating executable computer code to, apparently, decoding movies based on emojis. .......... for some tasks and some models, there’s a threshold of complexity beyond which the functionality of the model skyrockets. (They also suggest a dark flip side: As they increase in complexity, some models reveal new biases and inaccuracies in their responses.) ............... dozens of emergent behaviors ........... Biologists, physicists, ecologists and other scientists use the term “emergent” to describe self-organizing, collective behaviors that appear when a large collection of things acts as one. Combinations of lifeless atoms give rise to living cells; water molecules create waves; murmurations of starlings swoop through the sky in changing but identifiable patterns; cells make muscles move and hearts beat. Critically, emergent abilities show up in systems that involve lots of individual parts. But researchers have only recently been able to document these abilities in LLMs as those models have grown to enormous sizes. ................ 
Language models have been around for decades ............ transformers can process big bodies of text in parallel. .......... Transformers enabled a rapid scaling up of the complexity of language models by increasing the number of parameters in the model, as well as other factors. ........ models improve in accuracy and ability as they scale up. .......... With the advent of models like GPT-3, which has 175 billion parameters — or Google’s PaLM, which can be scaled up to 540 billion — users began describing more and more emergent behaviors. ......... One DeepMind engineer even reported being able to convince ChatGPT that it was a Linux terminal and getting it to run some simple mathematical code to compute the first 10 prime numbers. Remarkably, it could finish the task faster than the same code running on a real Linux machine. ................
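The "one directive" described above, predicting what comes next purely from statistics, can be illustrated with a toy bigram model. This is a deliberately tiny stand-in for a transformer (the function names and corpus are invented), but the generation loop is the same: condition on the text so far, sample the next token, repeat.

```python
import random
from collections import Counter, defaultdict

# A toy statistical language model: bigram word counts. Real LLMs use
# transformers with billions of parameters, but the directive is the
# same: given the text so far, predict what comes next, over and over.
def train_bigram(corpus: str) -> dict:
    counts = defaultdict(Counter)
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1  # tally which word follows which
    return counts

def generate(counts: dict, start: str, length: int, seed: int = 0) -> list:
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        options = counts.get(out[-1])
        if not options:
            break  # unseen word: the statistics have nothing to say
        words, weights = zip(*options.items())
        out.append(rng.choices(words, weights=weights)[0])  # sample next token
    return out

model = train_bigram("the cat sat on the mat and the cat ran")
```

Nothing in this loop changes when the model gets bigger; what the article describes is that abilities nonetheless appear abruptly as scale grows.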

Many of these emergent behaviors illustrate “zero-shot” or “few-shot” learning, which describes an LLM’s ability to solve problems it has never — or rarely — seen before.

............. Showing that GPT-3 could solve problems without any explicit training data in a zero-shot setting, he said, “led me to drop what I was doing and get more involved.” .............. difficult and diverse tasks to chart the outer limits of what an LLM could do. This effort was called the Beyond the Imitation Game Benchmark (BIG-bench) project, riffing on the name of Alan Turing’s “imitation game,” a test for whether a computer could respond to questions in a convincingly human way. (This would later become known as the Turing test.) The group was especially interested in examples where LLMs suddenly attained new abilities that had been completely absent before. ............... these sharp transitions ........ for about 5% of the tasks, the researchers found what they called “breakthroughs” — rapid, dramatic jumps in performance at some threshold scale. That threshold varied based on the task and model. ........... Some unexpected abilities could be coaxed out of smaller models with fewer parameters — or trained on smaller data sets — if the data was of sufficiently high quality. ......... how a query was worded influenced the accuracy of the model’s response .......... a model prompted to explain itself (a capacity called chain-of-thought reasoning) could correctly solve a math word problem, while the same model without that prompt could not. ............. using chain-of-thought prompts could elicit emergent behaviors not identified in the BIG-bench study ......... larger models truly do gain new abilities spontaneously. .......... Large LLMs may simply be learning heuristics that are out of reach for those with fewer parameters or lower-quality data........... how LLMs work at all. “Since we don’t know how they work under the hood, we can’t say which of those things is happening.” .......... They are notorious liars. “We’re increasingly relying on these models to do basic work,” Ganguli said, “but I do not just trust these. 
I check their work.” ........... Emergence leads to unpredictability, and unpredictability — which seems to increase with scaling — makes it difficult for researchers to anticipate the consequences of widespread use. ............... social bias emerges with enormous numbers of parameters. “Larger models abruptly become more biased.” ................. When the researchers simply told the model not to rely on stereotypes or social biases — literally by typing in those instructions — the model was less biased in its predictions and responses. .......... a new “moral self-correction” mode, in which the user prompts the program to be helpful, honest and harmless.
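The prompting techniques mentioned above, chain-of-thought and instructed self-correction, amount to nothing more than adding text to the input. A sketch (no model is called here, and the instruction wording is illustrative, not the exact phrasing used in the studies):

```python
# Prompt construction only; no model is invoked. The wording below is
# illustrative, not the exact phrasing from the cited research.
def plain_prompt(question: str) -> str:
    return f"Q: {question}\nA:"

def chain_of_thought_prompt(question: str) -> str:
    # Asking the model to explain itself is what let models solve math
    # word problems that the same models failed without the nudge.
    return f"Q: {question}\nA: Let's think step by step."

def self_correcting_prompt(question: str) -> str:
    # "Moral self-correction": literally typing in instructions to be
    # helpful, honest, and harmless, and to avoid stereotypes.
    return (
        "Answer helpfully, honestly, and harmlessly, without relying on "
        "stereotypes or social biases.\n"
        f"Q: {question}\nA:"
    )
```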
.

Move Over, Metaverse. Here’s Something Meaner. Who’s really in charge of our online behavior? No one, David Auerbach argues in “Meganets.” ......... “Just one word. Are you listening?” Mr. Maguire said to Benjamin Braddock in “The Graduate” (1967). “Plastics.” ........ Twenty-five years later a puckish French horn player warned me, a literature major who didn’t yet have an email address, that the future lay in something called “hyperlinks.” .............. his definition of “meganet” is in essence a big blob of mortal and computing power, a “human-machine behemoth” controlled by no one ............... If the internet is the fictional doctor and scientist Bruce Banner, furtive and a little troubled but basically benign, meganets are Incredible Hulks, snarling and uncontainable. ........... “That world may not be ‘The Matrix,’ but all the connecting tissue is already there.” ........ “Meganets” made me feel deeply queasy about the amount of time I spend on Instagram, Reddit, TikTok and Twitter. Not Facebook, never Facebook — “a fount of misinformation,” as Auerbach calls it, “a petri dish in which false facts and crazy theories grow, mutate and metastasize” — except for the burner account I use occasionally to see what exes are up to. ............. a middle-aged mermaid thrashing about in the great online ocean as data floated around me, multiplying like plankton........... “Reality bites,” we naïvely thought, but here “reality forks,” with blockchain doubling back on itself like a caterpillar. “No Rousseau-esque ‘General Will’ emerges from the bugs and forks,” is the takeaway............ Aadhaar, India’s national identification program: “a unified, government-sanctioned meganet” ........ a virtual pandemic called Corrupted Blood that spread through the video game World of Warcraft in 2005, arguing that “the distance between Corrupted Blood and a global financial meltdown is smaller than you think” ............. 
“We search for where the power really lies, when it does not lie anywhere — or else it lies everywhere at once, which is no more helpful.” .......... “If Big Brother can’t be stopped, we should focus on throwing sand in his eyes rather than futilely trying to kill him.” ........ Take my Wi-Fi — please! .

Meet the Editor Behind a Slew of Best Sellers Jennifer Hershey is the guiding hand who helped shape “Daisy Jones & the Six,” “Mad Honey” and many other chart-topping regulars. ....... how much more nuanced and honest this book is because of you.” ........ She’s the publisher and editor in chief of Ballantine Books ....... “Sometimes we gather as a whole team — the publicity person, the marketing person, the publisher, the editor, all the people who worked on the book — and we call the author together. There’s so much joy in that moment, and definitely a lot of tears. It’s not even so much the hitting the list but what it symbolizes: that an author’s work is reaching people, that their voice is being heard and that readers out in the world are connecting to their words.” .

Big oil firms touted algae as climate solution. Now all have pulled funding Insiders aren’t surprised as ExxonMobil, the last remaining proponent of green algae biofuel, ends research .

The Age of AI has begun Artificial intelligence is as revolutionary as mobile phones and the Internet. .

Some meandering thoughts on the evolution of performance management at Google, with implications for humanity
A new, humanistic organization-centered congruence philosophy of people analytics

Netherlands and Japan Said to Join U.S. in Curbing Chip Technology Sent to China A new agreement is expected to expand the reach of U.S. technology restrictions on China issued last year. ........ sweeping restrictions issued unilaterally by the Biden administration in October on the kinds of semiconductor technology that can be shared with China. .