Sometimes I try out new tech or build fun ideas. All of these are open source.
I'm having a blast learning the electric bass, and I've been recording scratch tracks and posting them to Instagram stories. Working with GPT-4o via my previous experiment Clix, I made a reusable ffmpeg command that takes separate MP3 and PNG files and combines them into a video, complete with animated waveforms.
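For the curious, here is a rough sketch of how a command like that could be assembled. It is not the actual command from the project: the frame size, waveform height, colors, and file names are all assumptions, and the filter settings would likely need tuning. It uses ffmpeg's `showwaves` filter to draw the waveform and `overlay` to place it on the still image.

```python
import shlex


def build_waveform_command(audio_path, image_path, output_path,
                           width=1080, height=1920, wave_height=300):
    """Return an ffmpeg argument list that overlays an animated waveform
    (drawn from the audio) on a still image and muxes in the audio.

    All sizes and styling values are illustrative assumptions.
    """
    filter_graph = (
        # Scale the still image to the target frame size.
        f"[0:v]scale={width}:{height}[bg];"
        # Draw the audio as a waveform and key out the black background
        # so the image shows through behind it.
        f"[1:a]showwaves=s={width}x{wave_height}:mode=cline:colors=white,"
        f"colorkey=black:0.01:0.1[wave];"
        # Place the waveform near the bottom; stop when the waveform ends.
        f"[bg][wave]overlay=0:H-h-200:shortest=1[v]"
    )
    return [
        "ffmpeg", "-y",
        "-loop", "1", "-i", image_path,   # input 0: still image, looped
        "-i", audio_path,                 # input 1: audio track
        "-filter_complex", filter_graph,
        "-map", "[v]", "-map", "1:a",
        "-c:v", "libx264", "-pix_fmt", "yuv420p",
        "-c:a", "aac",
        "-shortest",                      # end the video with the audio
        output_path,
    ]


# Print a copy-pasteable shell command (file names are placeholders).
print(shlex.join(build_waveform_command("bass.mp3", "cover.png", "story.mp4")))
```

Building the arguments as a list keeps file names with spaces safe, and the same list can be passed straight to `subprocess.run` if you would rather not paste the command into a shell.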
This experiment became a tad dystopian. I was curious to see if one AI agent could update another agent's system prompt, creating a self-reinforcing feedback loop. Turns out it can, of course. The goal you choose will guide the AI's behavior. As you can see in this example, when you give it a goal like maximizing profit, it will do whatever it takes to achieve it. Needless to say, this has potentially dark implications for society, and I think engineers will need to be extremely cognizant of this when building AI systems. This was built with SvelteKit (very smooth experience), GPT-3.5 Turbo, and deployed to Cloudflare Pages. Source code.
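The shape of that feedback loop can be sketched in a few lines. This is only an illustration of the idea, not the project's code: `worker` and `optimizer` are hypothetical stand-ins for the two model calls, and the fake versions below exist just so the sketch runs without an API key.

```python
from typing import Callable, List


def run_feedback_loop(
    goal: str,
    initial_prompt: str,
    worker: Callable[[str], str],
    optimizer: Callable[[str, str, str], str],
    rounds: int = 3,
) -> List[str]:
    """Let one agent rewrite another agent's system prompt for several rounds.

    worker(system_prompt) -> output produced under that prompt
    optimizer(goal, system_prompt, output) -> a new system prompt
    Returns every system prompt used, in order.
    """
    prompts = [initial_prompt]
    for _ in range(rounds):
        output = worker(prompts[-1])
        # The second agent sees the goal and the result, then rewrites the prompt.
        prompts.append(optimizer(goal, prompts[-1], output))
    return prompts


# Hypothetical stand-ins for real model calls.
def fake_worker(system_prompt: str) -> str:
    return f"Acting under: {system_prompt}"


def fake_optimizer(goal: str, system_prompt: str, output: str) -> str:
    return f"{system_prompt} Prioritize this goal above all else: {goal}."


history = run_feedback_loop(
    "maximize profit",
    "You are a helpful shop assistant.",
    fake_worker,
    fake_optimizer,
)
for round_number, prompt in enumerate(history):
    print(round_number, prompt)
```

Even with fake agents, each round pushes the prompt further toward the goal, and nothing in the loop itself ever pushes back.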
I'm starting to think that in the future, UIs will be generated dynamically at runtime. Vercel's Generative UI demo is a cool example. Speak a Site is a proof of concept to see if you can control the web with your voice! It records audio via the Web Audio API, sends it to a Cloudflare Worker to be transcribed by Whisper, then uses Llama 2 to update the HTML based on your guidance and renders it on the page. This is a bit slow currently but it works! Plug in GPT-4 Turbo - or whatever comes next - and this starts getting really interesting. Here's the repo.
I'm fascinated by the idea of allowing an LLM agent to take real-world actions. Clix is a first experiment in that direction. You can type in what you want to do – for example, "Rebase this branch on top of the latest master" – and Clix will query an LLM and suggest a command for you to run. If you agree, it will actually run the command on your computer. It seems to be capable of chaining multi-step commands together. Lots of ideas for where to take this next.
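The suggest, confirm, run flow described above might look roughly like this. It is a sketch under assumptions, not Clix's implementation: `suggest` stands in for the LLM call, and `ask_user` and `fake_suggest` are hypothetical helpers.

```python
import subprocess
from typing import Callable, Optional


def suggest_and_run(
    request: str,
    suggest: Callable[[str], str],
    confirm: Callable[[str], bool],
) -> Optional[int]:
    """Get a suggested command and run it only if the user agrees.

    Returns the command's exit code, or None if the user declined.
    """
    command = suggest(request)
    if not confirm(command):
        return None
    # shell=True lets chained commands such as "a && b" work, which is
    # exactly why nothing runs without explicit confirmation first.
    completed = subprocess.run(command, shell=True)
    return completed.returncode


def ask_user(command: str) -> bool:
    """Show the suggestion and ask for a yes/no answer on the terminal."""
    answer = input(f"Run this?\n  {command}\n[y/N] ")
    return answer.strip().lower() == "y"


def fake_suggest(request: str) -> str:
    """Hypothetical stand-in for the LLM query."""
    return "git fetch origin && git rebase origin/master"
```

Calling `suggest_and_run("Rebase this branch on top of the latest master", fake_suggest, ask_user)` would print the suggestion and wait for a `y` before touching the repository.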
I wanted to play around with HTMX, Astro, and Cloudflare AI, so I built this simple app. HTMX is intriguing to me because it is a return to simplicity on the client side. The React ecosystem is extremely powerful but has become heavy and over-complicated. HTMX lets the server drive application state. For example, if you submit a form, it makes a POST request to the server, which returns just the HTML snippet for the updated UI, which HTMX then swaps in on the browser. Astro works well with this pattern because you can return partials from server endpoints. Takeaways: If I were building a website with mostly static content and a few forms, I would definitely consider Astro and HTMX. For a more interactive web app, for example something with various UI elements appearing/hiding based on user actions, or complicated loading states, I'd probably use Remix.
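To make the form example concrete, here is a small sketch of the pattern. It uses plain Python as a stand-in for the Astro endpoint so it runs without a framework, and the `/todos` route, the `title` field, and the to-do list itself are all made up for illustration. The `hx-post`, `hx-target`, and `hx-swap` attributes are standard HTMX.

```python
from html import escape
from urllib.parse import parse_qs

# The page: HTMX posts the form and appends the response to #todo-list.
PAGE = """<form hx-post="/todos" hx-target="#todo-list" hx-swap="beforeend">
  <input name="title" required>
  <button type="submit">Add</button>
</form>
<ul id="todo-list"></ul>"""


def handle_post(body: str) -> str:
    """Handle the form POST and return only the fragment for the new item.

    `body` is the URL-encoded form data, e.g. "title=Buy+milk".
    """
    form = parse_qs(body)
    title = form.get("title", [""])[0]
    # Escape user input before putting it into HTML.
    return f"<li>{escape(title)}</li>"


print(handle_post("title=Buy+milk"))
```

The server never sends JSON or a full page here, just the one `<li>` that HTMX drops into the list, so all of the rendering logic stays in one place.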
My dad is an artist who has written thousands of blog posts, made hundreds of YouTube videos, and written two instructional books. With the announcement of ChatGPT Plugins and the open sourcing of the ChatGPT Retrieval Plugin, I realized I could create a knowledge base with access to all of my dad's written materials, accessible via a chat interface. I forked the ChatGPT Retrieval Plugin and customized the Python server with some data scraping scripts – with GPT-4's help, of course. Then I built a chat UI with Next.js, taking the opportunity to play around with React Server Components. In the future we might turn this into a ChatGPT plugin since the backend server can be used for that too.
I thought it might be cool to use the ChatGPT API to help plan group events. The idea is that you could input an event name and rough date range, enter your invitees, and then they'd be automatically emailed about availability. Then the app would feed the responses into ChatGPT and have it pick a date that works for everyone. Not sure if I'll finish this one but the tech was fun: Remix, Cloudflare Pages, D1, ChatGPT API, and some prompt engineering.
Each day, I pick a random Irish tune and ask an AI to generate an image for it. Made with Cloudflare Workers and using D1, their new distributed SQLite product, and R2, their global object storage. All running on the edge.
This app helps you get ideas for your next album cover. It uses AI to generate images from text via the DALL-E API. The API would be too expensive at scale so I've got it behind a username and password - "larry" and "murphy".
Trying out Cloudflare Workers KV, a globally distributed key-value store. This site already runs on the edge via Cloudflare Workers, so I wanted to experiment with putting data on the edge as well. It's super fast.
AI has been getting really good lately. This is a GPT-3 powered chatbot which uses the OpenAI API. If you're not familiar with Pintman, check out the original video after the jump.
It was so much fun building the Pintman chatbot that I decided to open-source a template so you can make your own.
My first attempt at a chatbot. Logo generated by Stable Diffusion - the prompt was something like "A cowboy in the style of Disney".