I wanted to get back to non-AI content this week, but a few things launched on Thursday and Friday last week that I think needed to be shared:
Document to Presentation
Tome is an AI-powered presentation tool. On Thursday they launched one-click document-to-presentation. I dropped in Friday’s essay, and it spit out a base presentation. Available here. It’s not “good”, but if I wanted to build a presentation based on that content (and I do! I need to present a version of it in May!), then it is a nice skeleton to start from. And once started, why not use Tome’s other tools to finish the presentation: natural-language editing (“create a page about xxx”), “standard” AI editing tools (“re-write this headline”), automatic adjustments across platforms, video and image embedding (with DALL-E integration), etc. It would be very hard to disrupt PowerPoint, but they have a chance.
GPT-API
OpenAI has launched plug-in APIs for ChatGPT. These allow connections between ChatGPT and other apps. There is a waitlist for direct access, and it prioritizes people already subscribed to ChatGPT+ (which also gets you access to GPT-4). Once you have access, you can connect it to a Zapier account and do things directly in your ChatGPT window like:
Draft and send emails / Slack messages / etc
Find old messages
Search your note-taking app (Notion / Roam / Reflect / Obsidian / etc) and interact with the output
Add events or ask questions about your calendar (“When is my next flight?”)
Other services that are already connected:
Expedia
FiscalNote
Instacart
Klarna Shopping
Milo Family AI
OpenTable
Speak
Wolfram
It also has a code interpreter (see below) and a web-browsing plug-in (so you can get real-time information). This is quickly turning ChatGPT into a full personal assistant for ~$50/month…
Code Interpretation
The plug-ins already allow ChatGPT to interact with specialized tools like WolframAlpha. No more worrying that the AI is bad at math (although GPT-4 is MUCH better): now, when asked a math question, ChatGPT just needs to know the right phrasing to feed the question to WolframAlpha and get the right answer to any number of decimal places.
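To make that hand-off concrete, here is a toy sketch in Python. It is not how ChatGPT plug-ins actually work under the hood: the names `answer` and `calculator_tool` are made up for illustration, the question matching is a crude regular expression, and Python's `decimal` module stands in for WolframAlpha. The only point is the shape of the idea: spot a math question, pass it to a tool that is good at math, and return the tool's answer.

```python
import re
from decimal import Decimal, localcontext

# Hypothetical trigger phrase; a real assistant decides this far more flexibly.
MATH_PATTERN = re.compile(r"square root of (\d+)")


def calculator_tool(number, places):
    """Stand-in for a math plug-in: square root to `places` decimal places."""
    with localcontext() as ctx:
        # Work with a few guard digits beyond what we will report.
        ctx.prec = places + len(str(number)) + 5
        root = Decimal(number).sqrt()
        return root.quantize(Decimal(1).scaleb(-places))


def answer(question, places=10):
    """Route math questions to the tool; leave everything else to the model."""
    match = MATH_PATTERN.search(question.lower())
    if match:
        return str(calculator_tool(int(match.group(1)), places))
    return "(the language model would answer this directly)"


if __name__ == "__main__":
    print(answer("What is the square root of 2?"))
    print(answer("What is the square root of 2?", places=40))
    print(answer("Write me a haiku about bridges"))
```

Asking for more decimal places only changes the `places` argument; the "model" part never has to do the arithmetic itself.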
That leads to an OpenAI-created plug-in called “Code Interpreter”. This plug-in allows developers to work with ChatGPT to create and execute code while staying within the interface. Andrew Mayne has a blog post describing some of the possibilities:
Generate songs (“generate xx and turn it into music. Save as a wav file”)
Generate QR codes for a website URL
Upload images and ask ChatGPT to analyze and edit them (“Find faces”, “convert to ASCII format”). ChatGPT is NOT connected to DALL-E, so it cannot create images (yet), but it can create simple animations (think about what someone could do on a home computer programming in BASIC in 1982).
There is more at the link, including simulations of chess games, simple drawings, plotting moving data on graphs and more.
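For a sense of what is going on behind a request like “turn it into music and save as a wav file”, here is a minimal sketch of the kind of Python that could do it. This is my own illustration, not output from Code Interpreter or anything taken from Andrew Mayne's post: the note list, the file name `melody.wav`, and the function names are all assumptions. It uses only Python's standard library.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100  # samples per second (CD quality)


def tone(frequency_hz, duration_s, volume=0.5):
    """Return 16-bit PCM samples for a plain sine tone."""
    n_samples = int(SAMPLE_RATE * duration_s)
    return [
        int(volume * 32767 * math.sin(2 * math.pi * frequency_hz * i / SAMPLE_RATE))
        for i in range(n_samples)
    ]


def write_melody(path, notes):
    """Write (frequency_hz, duration_s) pairs to a mono 16-bit WAV file.

    Returns the number of samples written.
    """
    samples = []
    for frequency_hz, duration_s in notes:
        samples.extend(tone(frequency_hz, duration_s))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return len(samples)


# Illustrative melody: a C major arpeggio (C4, E4, G4, C5).
MELODY = [(261.63, 0.25), (329.63, 0.25), (392.00, 0.25), (523.25, 0.5)]

if __name__ == "__main__":
    write_melody("melody.wav", MELODY)
```

Running it writes a 1.25-second file that any audio player can open. The interesting part of Code Interpreter is that ChatGPT writes and runs code like this for you from a one-line request.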
This is still very early days, but the applications are not coming slowly. Hard to focus on reach vs frequency when this stuff is dropping every day. But I will try to slip in some traditional marketing this week…
Keep it simple,
Edward
PS. I could not think of an image for today’s post, so I asked GPT-4 what a good image would be for a post about APIs. It recommended a bridge between two “digital islands”. I plugged the idea into Midjourney and got four pretty cool results (two of which are used in this post).
Great find with Tome! Sometimes having the skeleton you can build off of is the push needed to get into action.
I can imagine a world very soon where presentations become much more polished right out of the gate with AI.