
wrote a phoenix liveview app that searches across youtube video transcripts and the real time search feels absurdly good
i work at a mid size marketing agency and we have about 200 youtube videos. client case study recordings, internal strategy sessions, conference talks from our founders, onboarding walkthroughs for new hires. all unlisted and shared through notion. nobody can find anything because the only way to search is by video title which is usually something useless like "Q3 strategy call sept 14."
i've been looking for an excuse to build something real in phoenix liveview, so this became the excuse.
the app is a single liveview page. search box at the top, results below. as you type, results update live through the socket. each result shows the video title, date, speaker, and a snippet of the transcript around the matching text with the match highlighted. click the result and it opens the youtube video.
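a minimal sketch of how a page like this could be wired up, not the actual code from the app. it assumes a standard phoenix 1.7+ project with liveview 0.18+. the module names (TranscriptSearchWeb.SearchLive, TranscriptSearch.Transcripts), the search/1 function, and the result fields (video_id, title, speaker, recorded_on, snippet) are all made up for illustration. the render function is inlined here to keep the sketch in one place; a separate template file works the same way.

```elixir
defmodule TranscriptSearchWeb.SearchLive do
  # hypothetical names; assumes the usual `use MyAppWeb, :live_view` setup
  use TranscriptSearchWeb, :live_view

  # hypothetical context module that runs the postgres query
  alias TranscriptSearch.Transcripts

  @impl true
  def mount(_params, _session, socket) do
    {:ok, assign(socket, query: "", results: [])}
  end

  # fired by phx-change on the form; phx-debounce on the input delays it
  # until typing has paused for the given number of milliseconds
  @impl true
  def handle_event("search", %{"q" => query}, socket) do
    results =
      if String.trim(query) == "" do
        []
      else
        Transcripts.search(query)
      end

    {:noreply, assign(socket, query: query, results: results)}
  end

  # the snippet is rendered with raw/1 because ts_headline marks matches with
  # <b> tags. ts_headline does not html-escape the source text, so this is
  # only reasonable when the transcripts are trusted.
  @impl true
  def render(assigns) do
    ~H"""
    <form phx-change="search" phx-submit="search">
      <input type="text" name="q" value={@query} phx-debounce="250" placeholder="search transcripts" />
    </form>

    <ul>
      <li :for={r <- @results}>
        <a href={"https://www.youtube.com/watch?v=#{r.video_id}"} target="_blank" rel="noopener">
          <%= r.title %>
        </a>
        <span><%= r.recorded_on %> · <%= r.speaker %></span>
        <p><%= Phoenix.HTML.raw(r.snippet) %></p>
      </li>
    </ul>
    """
  end
end
```

the whole "results update as you type" behavior is the form's phx-change plus one handle_event clause; liveview diffs the re-rendered list and pushes only the changes over the socket.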
the backend is postgres with full text search. tsvector on the transcript column, GIN index, ts_headline for the snippet extraction. the search input has a 250ms phx-debounce on it, so the phx-change event only fires once typing pauses instead of hammering postgres on every keystroke. 250ms feels right: fast enough that it seems instant, slow enough that it isn't running a query per character.
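a sketch of what the search query in the context module might look like with ecto, under some loud assumptions: a `videos` table with `video_id`, `title`, `speaker`, `recorded_on`, and `transcript` columns, plus a `transcript_tsv` tsvector column with the GIN index on it. none of those names come from the real app. it also uses websearch_to_tsquery (postgres 11+) to parse the user's input, which is one reasonable choice, not necessarily the one used here.

```elixir
defmodule TranscriptSearch.Transcripts do
  # hypothetical context module; table and column names are assumptions
  import Ecto.Query

  alias TranscriptSearch.Repo

  @doc """
  Returns up to `limit` videos whose transcript matches `query`,
  best matches first, each with a highlighted snippet.
  """
  def search(query, limit \\ 20) do
    from(v in "videos",
      # @@ is the full text match operator; the GIN index on transcript_tsv serves it
      where: fragment("? @@ websearch_to_tsquery('english', ?)", v.transcript_tsv, ^query),
      order_by: [
        desc: fragment("ts_rank(?, websearch_to_tsquery('english', ?))", v.transcript_tsv, ^query)
      ],
      limit: ^limit,
      select: %{
        video_id: v.video_id,
        title: v.title,
        speaker: v.speaker,
        recorded_on: v.recorded_on,
        # ts_headline runs on the raw text, not the tsvector, and wraps
        # matches in <b>...</b> by default
        snippet:
          fragment(
            "ts_headline('english', ?, websearch_to_tsquery('english', ?), 'MaxWords=30, MinWords=10')",
            v.transcript,
            ^query
          )
      }
    )
    |> Repo.all()
  end
end
```

ts_headline is the expensive part because it re-parses the full transcript for every row it is applied to, so keeping the limit small matters more than it looks.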
for pulling the actual transcripts i use transcript api:
npx skills add ZeroPointRepo/youtube-skills --skill youtube-full
i wrote a mix task for ingestion. give it a youtube url and it pulls the transcript, parses it, and inserts it into the database. added a --file flag so i could point it at a text file with all 200 urls and let it run through them. the whole ingestion took maybe 3 minutes.
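a rough sketch of a mix task shaped like the one described above, using only the elixir standard library for the argument and url handling. the task name, the one-url-per-line file format, and the helper functions are assumptions. the transcript fetch and database insert are left as a placeholder, since the post doesn't show how the transcript tool is called.

```elixir
defmodule Mix.Tasks.Transcripts.Ingest do
  # hypothetical task: mix transcripts.ingest URL [URL...] [--file urls.txt]
  use Mix.Task

  @shortdoc "ingest youtube transcripts from urls or a --file of urls"

  @impl Mix.Task
  def run(args) do
    {opts, urls, _invalid} = OptionParser.parse(args, strict: [file: :string])

    file_urls =
      case opts[:file] do
        nil -> []
        path -> path |> File.read!() |> parse_url_list()
      end

    # start the app so the repo is available (assumes a standard ecto project)
    Mix.Task.run("app.start")

    for url <- urls ++ file_urls do
      case video_id(url) do
        {:ok, id} -> ingest(id)
        :error -> Mix.shell().error("skipping unrecognized url: #{url}")
      end
    end

    :ok
  end

  # assumed file format: one url per line; blank lines and # comments are ignored
  def parse_url_list(contents) do
    contents
    |> String.split(["\r\n", "\n"])
    |> Enum.map(&String.trim/1)
    |> Enum.reject(&(&1 == "" or String.starts_with?(&1, "#")))
  end

  # handles the two common url shapes: youtu.be/<id> and youtube.com/watch?v=<id>
  def video_id(url) do
    case URI.parse(url) do
      %URI{host: "youtu.be", path: "/" <> id} when id != "" ->
        {:ok, id}

      %URI{host: host, query: query}
      when host in ["www.youtube.com", "youtube.com"] and is_binary(query) ->
        case URI.decode_query(query) do
          %{"v" => id} when id != "" -> {:ok, id}
          _ -> :error
        end

      _ ->
        :error
    end
  end

  # placeholder: fetch the transcript for this video, parse it, insert it
  defp ingest(_video_id), do: :ok
end
```

running it against a file would then look like `mix transcripts.ingest --file urls.txt`, and single urls can still be passed as plain arguments.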
the thing that sold my coworkers on it was the liveview search. i demoed it in a meeting and people immediately started shouting out search terms to try. someone typed in a client name and found every video where that client was discussed. someone else searched for "attribution modeling" and found a conference talk from 2022 that nobody remembered existed.
the codebase is small. one liveview module, one context module with the search query, the mix task for ingestion, and two templates. maybe 300 lines of elixir total. deployed it on fly.io on the free tier since it's just internal and the traffic is light.
the part i keep coming back to is how well liveview fits this use case. server rendered search with live updates over websockets and zero javascript written by me (liveview ships its own small client). the search box, the debounce, the result list, the highlighting, all just liveview and postgres. in most other stacks i would have reached for react or vue for this.