Perspective on Risk Jan. 25, 2023 - Technology Implications
ChatGPT (and its Successors) Is Transformative; 2nd Order Effects; Deskilling Knowledge Work; Anthropomorphizing A.I.; Technology Peaked When You Were Born; Worrying About AI; The EU AI Act
I promised I had a few things in the hopper. I finished up a further post on the transformative nature of Large Language Models (LLMs); I hope this builds on what I’ve written in the past; I think it does. I have another large post coming in a week or so following up on the Pozsar decoupling hypothesis, and maybe a month from now I hope to publish a piece on demographics. Interspersed will be the usual, more practical, risk management posts.
ChatGPT (and its Successors) Is Transformative
AI Passes Law and Economics Exam
From Marginal Revolution: AI Claude Passes Law and Economics Exam.
The Claude AI from Anthropic earned a marginal pass on a recent GMU law and economics exam! Graded blind. Claude is a competitor to GPT3 and in my view an improvement.
ChatGPT is a Published Researcher
ChatGPT listed as author on research papers (Nature)
At least four articles credit the AI tool as a co-author
ChatGPT is one of 12 authors on a preprint about using the tool for medical education, posted on the medical repository medRxiv in December last year.
Relatedly, Nature Is Setting Out Ground Rules For AIs Use
Tools such as ChatGPT threaten transparent science; here are our ground rules for their use.
First, no LLM tool will be accepted as a credited author on a research paper. That is because any attribution of authorship carries with it accountability for the work, and AI tools cannot take such responsibility.
Second, researchers using LLM tools should document this use in the methods or acknowledgements sections. If a paper does not include these sections, the introduction or another appropriate section can be used to document the use of the LLM.
ChatGPT Can (Probably) Pass the United States Medical Licensing Exam
ChatGPT performed at or near the passing threshold for all three exams without any specialized training or reinforcement.
ChatGPT Can’t Pass the Bar (on its first attempt)
In GPT Takes the Bar Exam, the authors observe:
GPT-3.5 achieves a headline correct rate of 50.3% on a complete NCBE MBE practice exam, significantly in excess of the 25% baseline guessing rate, and performs at a passing rate for both Evidence and Torts. GPT-3.5’s ranking of responses is also highly correlated with correctness; its top two and top three choices are correct 71% and 88% of the time, respectively, indicating very strong non-entailment performance. … [T]hese results strongly suggest that an LLM will pass the MBE component of the Bar Exam in the near future.
Can ChatGPT Automate Much of What Professors Do Today?
Prof. Mollick in The Mechanical Professor (One Interesting Thing) asks ChatGPT to try to do his job:
Rather than automating jobs that are repetitive & dangerous, there is now the prospect that the first jobs that are disrupted by AI will be more analytic; creative; and involve more writing and communication.
To demonstrate why I think this is the case, I wanted to see how much of my work an AI could do right now. And I think the results will surprise you.
Create a syllabus for a 12 session MBA-level introduction to entrepreneurship class, and provide the first four sessions. For each, include readings and assignments, as well as a summary of what will be covered. Include class policies at the end.
Could you create a final assignment, to create a business plan in teams. Show a table of the business plan elements and information on how many points each are worth, and how they will be evaluated
Write the first part of the lecture for the second class. Include details. also a topical example. Have a warm tone.
I want to write an academic review paper on why crowdfunding can help entrepreneurship. Write me the introduction in an academic style for a top management journal. Explain why crowdfunding is a context that generalizes to the study of venture-backed companies, and what theories it can help explore.
Create STATA code reshaping a dataset. The stem of the variable to be converted from wide to long is truthly. famid is the unique identifier for records in their wide format. reshape the suffix of truthly into a variable called year
Write an opinion piece calling for crowdfunding to be deregulated.
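For the curious, the reshape that the Stata prompt above describes can be sketched in Python with pandas. This is a minimal, hypothetical example (the `truthly1996`/`truthly1997` columns and their values are invented for illustration, since Mollick’s data isn’t shown):

```python
import pandas as pd

# Hypothetical wide-format data: one row per famid, with the "truthly"
# stem repeated across year suffixes, as in the prompt.
wide = pd.DataFrame({
    "famid": [1, 2, 3],
    "truthly1996": [10, 20, 30],
    "truthly1997": [11, 21, 31],
})

# wide_to_long melts the truthly<year> columns into rows, capturing the
# numeric suffix in a new "year" column — the pandas analogue of
# Stata's `reshape long truthly, i(famid) j(year)`.
long = pd.wide_to_long(wide, stubnames="truthly", i="famid", j="year").reset_index()
print(long)
```

The result has one row per (famid, year) pair with a single `truthly` column, which is exactly what the prompt asks ChatGPT to produce in Stata.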
He concludes:
While not nearly as good as a human professor at any task (please note, school administrators), and with some clear weaknesses, it can do a shocking amount right now. … Think of it like having an intern, but one who just happens to work instantaneously, can write both code and solid descriptive writing, and has a large chunk of the world’s knowledge in their brain.
Click through to see the actual results.
In another post, All my classes suddenly became AI classes, Mollick writes:
All of my classes have become AI classes. And I wanted to share with you the experiments I am running to integrate AI into class.
This should significantly reduce the demand for entry-level consultants. ChatGPT also writes significantly better than the average Bank Examiner. And it will level the playing field for people whose primary language is not English.
AI Can Probably Replace the Average Programmer
Competition-Level Code Generation with AlphaCode (arxiv)
AlphaCode performs roughly "at the level of the median competitor" in coding contests by using a purely data-driven approach.
Patients Trust ChatGPT to Answer Less Complex Medical Questions
From Putting ChatGPT's Medical Advice to the (Turing) Test
On average, responses toward patients' trust in chatbots' functions were weakly positive (mean Likert score: 3.4), with lower trust as the health-related complexity of the task in questions increased.
ChatGPT responses to patient questions were close to indistinguishable from provider responses. Laypeople appear to trust the use of chatbots to answer lower risk health questions.
ChatGPT Would Get a ‘B’ on Wharton’s Operations Management Final Exam
Wharton’s Christian Terwiesch fed ChatGPT the final exam questions from an Operations Management course in Would Chat GPT Get a Wharton MBA? A Prediction Based on Its Performance in the Operations Management Course. He then graded the responses and summarized his findings as follows:
First, it does an amazing job at basic operations management and process analysis questions including those that are based on case studies. Not only are the answers correct, but the explanations are excellent.
Second, Chat GPT at times makes surprising mistakes in relatively simple calculations at the level of 6th grade Math. These mistakes can be massive in magnitude.
Third, the present version of Chat GPT is not capable of handling more advanced process analysis questions, even when they are based on fairly standard templates. This includes process flows with multiple products and problems with stochastic effects such as demand variability.
Finally, Chat GPT is remarkably good at modifying its answers in response to human hints. In other words, in the instances where it initially failed to match the problem with the right solution method, Chat GPT was able to correct itself after receiving an appropriate hint from a human expert.
Considering this performance, Chat GPT would have received a B to B- grade on the exam.
Thinking About 2nd Order Effects
Generative AI and The Future of Work (Maestro’s Musings)
There are disruptive first-order effects of generative AI that we’re already seeing play out, particularly relevant to the future of work. New, AI-powered word processors like Jasper and Lex are promising to help you write faster and even cure writer’s block by using generative AI to suggest the next paragraph as you write. The old guard is taking notice: Microsoft plans to integrate OpenAI’s models into their Office products. Similarly, Shutterstock recently announced it will also integrate AI-generated stock “photos” from DALL-E 2 in its search results, angering the content creators who power their platform (and as it turned out, also helped train DALL-E 2). The promise of these new technologies at helping people be more productive is exciting, but they also raise thorny questions about trust, intellectual property, and plagiarism that we’re just starting to grapple with.
However, lately, I’ve been thinking about some second-order effects that may be less obvious.
AI is more likely to transform content-creation jobs rather than replace them. Instead of completely replacing human workers, AI will augment their abilities and automate certain tasks, allowing them to focus on higher-level, more creative, and more strategic work.
there are two market forces that seem almost certain with respect to content: (1) the amount and quality of per-capita content produced by organizations will increase, and (2) how people value this content will qualitatively change.
Streamlining Routine Tasks
In the near future, we will start to see large language models in the workplace as the first station in a knowledge work assembly line—as a sort of conversational routing layer that can hand off your requests to the right internal human + AI systems.
Reducing the Frictions Between Systems
Any organization is a collection of interdependent systems—systems of people, systems of software, systems of operations, and systems of processes. When different systems interact, there is always some degree of friction that can cause delays, errors, and inefficiencies, ultimately impacting the overall performance and success of the organization.
Widespread adoption of generative AI will act as a lubricant between systems, reducing friction and improving the ease with which work moves across systems.
For example, if your team is talking about a task in Slack, AI will be able to synthesize that discussion and automatically update your task-tracking system. Or when the marketing team notices a bug with similar symptoms to the one you’re trying to solve in a different part of the product, you should get notified about it since it’s relevant to your current task.
Deskilling Knowledge Work
In the Jan 17 Perspective, towards the end I highlighted a paper that discusses how the distribution of both productivity and wages had widened. I posited that this might be the link between the two camps on productivity, and wondered:
Some high-skilled workers are becoming more empowered, but perhaps a larger portion of the mid-distribution population are being displaced and are employed in lower-productivity (service?) jobs, resulting in an aggregate decrease in consumption due to returns accruing to those with a lower marginal propensity to consume?
Ashwin Parameswaran has written an interesting piece: Artificial Intelligence, and Deskilling
LLMs and recent developments in AI will cause a gradual deskilling of the average knowledge worker in the same way that automation in other domains has already deskilled workers in many other domains over the last century.
He argues that by automating away the grunge work, we remove the ability for the novice to gain expertise:
it is precisely by doing mundane, simple and somewhat repetitive tasks that most people acquire the skills to become an expert in their domain. Automation and AI deny this opportunity to the novice human operator.
He states that this is nothing new, that James Bright documented [this] in 1958 (via Harry Braverman’s ‘Labor and Monopoly Capital’).
But what IS new is that this trend is on the verge of moving up the chain to ‘knowledge work.’
[M]achines can now perform not just codifiable/legible tasks but also the illegible tasks that are the bread and butter of the jobs that comprise the “knowledge economy”. What has already happened in quantitative domains where numbers are the output will now take place where words and images are the output.
Anthropomorphizing A.I.
You Perceive That Technology Peaked Around Your Date of Birth
Ethan Mollick highlighted a paper (that I haven’t read because $$) titled The Golden Age Is Behind Us: How the Status Quo Impacts the Evaluation of Technology. He summarizes their findings as:
A set of experiments shows that folks evaluate technologies developed after they were born as being worse than older ones, because newer innovations threaten their status quo
Now get off my lawn.
Worrying About AI
If I haven’t already bored you to death, and you crave more, there was a 3.5-hour debate among a list of luminaries. Here is the YouTube link.
Gary Marcus gives you a tease in An epic AI Debate—and why everyone should be at least a little bit worried about AI going into 2023 (The Road to AI We Can Trust)
What do Noam Chomsky, living legend of linguistics, Kai-Fu Lee, perhaps the most famous AI researcher in all of China, and Yejin Choi, the 2022 MacArthur Fellowship winner who was profiled earlier this week in The New York Times Magazine—and more than a dozen other scientists, economists, researchers, and elected officials—all have in common?
They are all worried about the near-term future of AI. The most worrisome thing of all? They are all worried about different things.
Each spoke last week at December 23’s AGI Debate (co-organized by Montreal.AI’s Vince Boucher and myself). No summary can capture all that was said (though Tiernan Ray’s 8,000 word account at ZDNet comes close)
The EU AI Act
Noah Smith Gives Reason For Optimism in 2023
Techno-optimism for 2023 (Noahpinion)
I see lots of technological developments that are either changing our world in major ways already, or seem likely to change it soon.
The A.I. Breakout
The energy revolution rolls onward
The strange biotech boom