Rod has a very useful habit of blogging his working notes on things: carbon emissions of AI, content design style guides, some UK fintech metrics.
In the same spirit, these are some things I've read recently and found useful about 'AI'. That seems to be what we're settling on calling it.
There's a spectrum of views on AI. Obvs. And I'd say I'm in a box you might call 'sceptical/alarmed/mocking'. The views below are generally from a 'cautiously pro' box. Maybe 'pro but aware of the problems'. They're smart people and worth reading.
Maggie Appleton on benchmarks and Humanity's Last Exam. Ends with this thought:
"So, next time you hear someone making grand statements about AI capabilities (both critical and overhyped), ask: which model are they talking about? On what benchmark? With what prompting techniques? With what supporting infrastructure around the model? Everything is in the details, and the only way to be a sensible thinker in this space is to learn about the details."
Laurie Voss on what they've learned about writing AI apps:
"Is what you're doing taking a large amount of text and asking the LLM to convert it into a smaller amount of text? Then it's probably going to be great at it. If you're asking it to convert into a roughly equal amount of text it will be so-so. If you're asking it to create more text than you gave it, forget about it."
and
"Depending how much of the hype around AI you've taken on board, the idea that they "take text and turn it into less text" might seem gigantic back-pedal away from previous claims of what AI can do. But taking text and turning it into less text is still an enormous field of endeavour, and a huge market. It's still very exciting, all the more exciting because it's got clear boundaries and isn't hype-driven over-reaching, or dependent on LLMs overnight becoming way better than they currently are. Take a look at the possibilities, find something that fits within these boundaries, and then have fun with it."
Harper Reed's workflow for using AI.
Matt Webb on Model Context Protocol.
Benedict Evans on Deep Research: "OpenAI’s Deep Research is built for me, and I can’t use it. It’s another amazing demo, until it breaks. But it breaks in really interesting ways."
and
"Stepping back, I feel ambivalent in writing this, because there are only so many times that I can say that these systems are amazing, but get things wrong all the time in ways that matter, and so the best uses cases so far are things where the error rate doesn’t matter or where it’s easy to see. It would be much easier just to say that these things are amazing and getting better all the time and leave it at that, or to claim that the error rate means these things are the biggest waste of time and money since NFTs. But exploring puzzlement, as I’m really doing here, seems more interesting.
And these things are useful. If someone asks you to produce a 20 page report on a topic where you have deep domain expertise, but you don’t already have 20 pages sitting in a folder somewhere, then this would turn a couple of days’ work into a couple of hours, and you can fix all the mistakes. I always call AI ‘infinite interns’, and there are a lot of teachable moments in what I’ve just written for any intern, but there’s also Steve Jobs’ line that a computer is ‘a bicycle for the mind’ - it lets you go further and faster for much less effort, but it can’t go anywhere by itself."