u/starkruzr

UltraBridge: search, TODOs, high-quality handwriting recognition and AI interactivity for notes and TODOs, across devices and ecosystems for Supernote Private Cloud and Boox users
r/eink

tl;dr: UltraBridge lets you build an integrated search index and annotated TODO list across Boox and Supernote, with handwriting recognition from any VLM (vision-language model) you want to use. The repository is here: https://github.com/jdkruzr/ultrabridge. You can then use it with ChatGPT, Claude, etc. to help you get your shit together:

  • Sync TODOs to and from CalDAV and Supernote
  • Write in red on Boox Notes to create TODOs
  • Have TODOs and search results linked back to original documents
  • Let an LLM platform of your choice use an MCP server to work with your notes and TODO list (e.g. "hey Claude, what did I forget in my last two weeks of notes that was important? Cool, can you create a TODO for me to follow up on that one thing?")

Demo: https://youtu.be/O6lM1hBpkWg

Background/Introduction (WARNING: Long. Very long.)

Hello. My name is J, and I have an e-ink problem. [AUDIENCE: "HI, J"] Like quite a few of us in this sub, I probably have too many of these devices, although I think I can make a solid case to anyone that they really DO each have use cases they're uniquely suited for.

Regardless, after diversifying platforms away from just Boox, it became clear to me that I needed some way of gluing their data together that I didn't currently have. Primarily, the things I wanted were:

  • Reliable cross-platform handwriting search
  • Cross-platform TODO functionality
  • AI RAG (retrieval-augmented generation, more on this later) across my notes to help me figure out what I might be missing

The Supernote Private Cloud Gets Me Thinking

Then Ratta came out with Supernote Private Cloud, and I got VERY excited, because a dive into that software made a few things clear:

  • They have no interest in "hiding" or gating anything away from their users
  • There were a ton of touchpoints, especially in the database, that made it easy to flip all kinds of interesting switches (TODOs especially)
  • API endpoints were easy to document

Building a CalDAV TODO endpoint against this was easy (SPC TODOs basically recapitulate the RFC for VTODO, but there is no such endpoint available from Ratta). But it also got me thinking about what else was possible. So I redid what I did for Boox last year: a global index of all handwritten text across all notes, using a vision-language model to do the transcription.
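
To give a rough idea of why the CalDAV side is straightforward, here is a minimal sketch of rendering one task as a VTODO. The `Todo` struct and its field names are hypothetical (the real SPC schema isn't shown here), and this is not UltraBridge's actual code. It also skips the 75-octet line folding that RFC 5545 recommends.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// Todo is a hypothetical, simplified task record. The real Supernote
// Private Cloud schema is not reproduced here.
type Todo struct {
	UID       string
	Summary   string
	Due       time.Time // zero value means "no due date"
	Completed bool
	Modified  time.Time
}

// icalTime is the Go layout for an iCalendar UTC date-time.
const icalTime = "20060102T150405Z"

// escapeText applies the RFC 5545 TEXT escapes for backslash,
// semicolon, comma and newline.
func escapeText(s string) string {
	r := strings.NewReplacer(`\`, `\\`, ";", `\;`, ",", `\,`, "\n", `\n`)
	return r.Replace(s)
}

// ToVTODO renders one task as a VCALENDAR object holding a single VTODO.
func ToVTODO(t Todo) string {
	status := "NEEDS-ACTION"
	if t.Completed {
		status = "COMPLETED"
	}
	lines := []string{
		"BEGIN:VCALENDAR",
		"VERSION:2.0",
		"PRODID:-//example//vtodo-sketch//EN", // placeholder product ID
		"BEGIN:VTODO",
		"UID:" + t.UID,
		"DTSTAMP:" + t.Modified.UTC().Format(icalTime),
		"SUMMARY:" + escapeText(t.Summary),
		"STATUS:" + status,
	}
	if !t.Due.IsZero() {
		lines = append(lines, "DUE:"+t.Due.UTC().Format(icalTime))
	}
	lines = append(lines, "END:VTODO", "END:VCALENDAR")
	// iCalendar content lines end in CRLF.
	return strings.Join(lines, "\r\n") + "\r\n"
}

func main() {
	fmt.Print(ToVTODO(Todo{
		UID:      "todo-1",
		Summary:  "Follow up with Sam; bring notes",
		Due:      time.Date(2025, 1, 15, 17, 0, 0, 0, time.UTC),
		Modified: time.Date(2025, 1, 10, 9, 30, 0, 0, time.UTC),
	}))
}
```

A CalDAV client that fetches this resource should see an open task with a due date; going the other direction means parsing the same properties back out.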

Flying Too Close to the Sun

Then I tried to reverse engineer the Supernote note format -- Supernote's on-device MyScript OCR leaves a lot to be desired; could I do better somehow? I discovered that if you reprocess .note files and edit their JIIX encoding, you can inject text recognition data into notes. I tried this first with RTR notes, which are designed to have that data in them, but the device would just clobber the corrected data from the VLM with its own bad interpretation. Then I found out that if you inject Standard notes with recognition data, the device WILL INDEX IT FOR SEARCH ANYWAY. In other words, you can:

  • Create a Standard note
  • Have it upload to Private Cloud
  • Grab the note and run it through high quality OCR
  • Inject the OCR data into the Standard note
  • Push it back to the device
  • Go into the device's Search option from the drop-down bar, pick handwriting and search for a term in that Standard note
  • Whoomp
  • There it is

Now. Unfortunately this does not work consistently. There is something about how the on-device Search feature works where sometimes it will return these results and sometimes it won't. I'm hoping I can get Ratta to quasi-support this thing that theoretically shouldn't work, because I would love to have just a couple of things change so that we can reliably introduce our own OCR to our Private Cloud installations generally -- but at least in the current incarnation of the note format this might not be possible. They've written before about how they're completely redoing their sync protocols and file formats, so it probably wouldn't even be worth it for them to support this now. Still, a guy can dream! I have this option turned off by default in the application's Settings because it currently breaks file sync and creates a bunch of CONFLICT files if the timing of file editing is wrong. So I just started using RTR notes again and letting UB do the higher quality text recognition for search in the application.

Why Not Zoidb^H^H^H^H^HBoox?

Adding Boox support was fairly straightforward:

  • Add a WebDAV endpoint for Booxen to auto-export .note files to
  • Parse the .note files for metadata and render the stroke data to a JPEG to pass to the OCR pipeline
  • Use more or less the same OCR pipeline we used for Supernote
  • The fun part: run it again looking for red ink and turn any passages in red ink into CalDAV TODOs

This last bit is great because we are still waiting for Boox to finish adding CalDAV VTODO integration after they updated Calendar Memo in firmware 4.2 to support VEVENT. Once we have it, the red ink thing will be sort of obsolete. Maybe I'll think of something different to do with it!
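
For the curious, the red ink pass can be approximated with a plain color threshold over the rendered page. This is a minimal sketch under assumptions, not the actual UltraBridge implementation: the threshold values, `isRed`, `RedBounds` and `demoPage` are all made up for illustration. It finds the bounding box of red pixels so that region could be cropped and handed to the OCR step.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// isRed reports whether a pixel looks like red pen ink.
// The thresholds are guesses and would need tuning on real exports.
func isRed(c color.Color) bool {
	r, g, b, _ := c.RGBA() // 16-bit channels, 0..65535
	r8, g8, b8 := r>>8, g>>8, b>>8
	return r8 > 150 && g8 < 100 && b8 < 100
}

// RedBounds returns the smallest rectangle containing every red pixel,
// and false if the page has no red ink at all.
func RedBounds(img image.Image) (image.Rectangle, bool) {
	var box image.Rectangle
	b := img.Bounds()
	for y := b.Min.Y; y < b.Max.Y; y++ {
		for x := b.Min.X; x < b.Max.X; x++ {
			if isRed(img.At(x, y)) {
				// Union with an empty rectangle returns the other one,
				// so the first red pixel seeds the box.
				box = box.Union(image.Rect(x, y, x+1, y+1))
			}
		}
	}
	return box, !box.Empty()
}

// demoPage builds a white 100x100 page with one short red stroke.
func demoPage() image.Image {
	img := image.NewRGBA(image.Rect(0, 0, 100, 100))
	for y := 0; y < 100; y++ {
		for x := 0; x < 100; x++ {
			img.Set(x, y, color.White)
		}
	}
	for x := 20; x < 40; x++ {
		img.Set(x, 50, color.RGBA{R: 200, G: 30, B: 30, A: 255})
	}
	return img
}

func main() {
	box, ok := RedBounds(demoPage())
	fmt.Println(box, ok)
}
```

A single bounding box is the crudest possible version. Splitting red regions into separate passages (one TODO each) would need something like row grouping or connected components on top of this.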

RAG: The Game-Changer?

This is the part I'm really excited about as someone with ADHD and a fair amount of executive dysfunction. The MCP server support the software has now makes it possible for an LLM like ChatGPT or Claude to "talk" to your notes database: it can search and read your notes using RAG (retrieval-augmented generation), and then perform all the operations you'd want it to be able to do with your TODO list. That turns the combination of your notes and your AI agent into a kind of executive assistant. This is basically a patch for my brain's broke-ass software, and it works INSANELY well. You can see the way it's supposed to work in the demo. It consists of two parts: an embedding model, which can run on CPU and does the work of telling the LLM "this concept is related to this other concept", and the actual MCP server, which presents tools to the LLM.
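
The "this concept is related to this other concept" part usually boils down to comparing embedding vectors. Here is a minimal sketch of that retrieval step, assuming the embedding model has already turned each chunk of note text into a vector. `Chunk`, `cosine` and `TopK` are hypothetical names, the two-dimensional vectors are toy values, and this says nothing about how UltraBridge actually stores or ranks its embeddings.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Chunk is a hypothetical piece of transcribed note text plus its embedding.
type Chunk struct {
	NoteID string
	Text   string
	Vec    []float64
}

// cosine returns the cosine similarity of two vectors, or 0 if the
// lengths differ or either vector is all zeros.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// TopK returns the k chunks most similar to the query vector,
// best match first, without modifying the input slice.
func TopK(query []float64, chunks []Chunk, k int) []Chunk {
	ranked := append([]Chunk(nil), chunks...)
	sort.SliceStable(ranked, func(i, j int) bool {
		return cosine(query, ranked[i].Vec) > cosine(query, ranked[j].Vec)
	})
	if k < 0 {
		k = 0
	}
	if k > len(ranked) {
		k = len(ranked)
	}
	return ranked[:k]
}

func main() {
	chunks := []Chunk{
		{NoteID: "a", Text: "grocery list", Vec: []float64{0, 1}},
		{NoteID: "b", Text: "follow up with vendor", Vec: []float64{1, 0.1}},
		{NoteID: "c", Text: "meeting recap", Vec: []float64{0.5, 0.5}},
	}
	for _, c := range TopK([]float64{1, 0}, chunks, 2) {
		fmt.Println(c.NoteID, c.Text)
	}
}
```

The MCP server would then hand the top matches to the LLM as context. A brute-force scan like this is fine for a personal notes collection; a large corpus would want a proper vector index.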

Installation

Ideally, you deploy this on the same server as your Supernote Private Cloud deployment. If you are Boox-only, that's fine too; you'll receive your note files via WebDAV from whatever devices you configure for sync. IMPORTANT: when you add your UltraBridge server as a WebDAV sync target, make sure the device is configured to export as .note rather than as .pdf.

  • Clone from GitHub
  • Run install.sh, follow instructions
  • Point your reverse proxy at the UltraBridge server the same way you would for Supernote Private Cloud, but on port 8443 (or whichever port you pick)
  • Go to Settings and configure your source(s) and options like RAG
  • Fin.

Usage

Your tabs on the side navigate you around like you'd expect. You can get details on imported notes, start and stop the note processors, access your TODOs, etc. The CalDAV server, for those using CalDAV clients, lives (oddly enough) at /caldav. The WebDAV server (for receiving Boox exports) is at /webdav. The schnozzberries taste like schnozzberries.
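
If it helps to picture the layout, here is a minimal sketch of two path prefixes hanging off one HTTP server in Go. Everything in it is an assumption for illustration: `uploadStatus` and `newMux` are made-up names, the status codes are my choice, and a real WebDAV client also sends requests like PROPFIND and MKCOL that this ignores. It only models the final PUT of an exported file and the ".note, not .pdf" rule from the Installation section.

```go
package main

import (
	"fmt"
	"net/http"
	"path"
	"strings"
)

// uploadStatus decides how a request to the export endpoint is answered:
// only PUTs of .note files are accepted in this sketch.
func uploadStatus(method, urlPath string) int {
	if method != http.MethodPut {
		return http.StatusMethodNotAllowed
	}
	if strings.ToLower(path.Ext(urlPath)) != ".note" {
		return http.StatusUnsupportedMediaType
	}
	return http.StatusCreated
}

// newMux wires the two path prefixes mentioned in the post to stub handlers.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/webdav/", func(w http.ResponseWriter, r *http.Request) {
		// A real handler would stream r.Body to disk and queue it for OCR.
		w.WriteHeader(uploadStatus(r.Method, r.URL.Path))
	})
	mux.HandleFunc("/caldav/", func(w http.ResponseWriter, r *http.Request) {
		http.Error(w, "CalDAV is not implemented in this sketch", http.StatusNotImplemented)
	})
	return mux
}

func main() {
	fmt.Println(uploadStatus("PUT", "/webdav/Notes/meeting.note"))
	fmt.Println(uploadStatus("PUT", "/webdav/Notes/meeting.pdf"))
	// To actually serve it: http.ListenAndServe(":8443", newMux())
	_ = newMux()
}
```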

Part of the problem with trying to explain why and how to use this thing is that I've been neck deep in it for months and so now everything is "obvious" to me. This is not helpful when trying to evangelize! Please ask questions, I need reality checks.

A Note on Software Development with AI

(Same note I put on PowerSearch)

Look, it's very difficult to ignore how good the robots are at writing code at this point. It helps a lot to use plugins that keep them on task and organized and force the use of best practices like consistent test coverage and documentation. So I am pretty confident after extensive usage testing that this is good enough for an Alpha release.

However, using AI to develop software gives me some pause -- the only reason it is as good as it is is that it was trained on an enormous corpus of open source software, textbooks and other sources of truth. For that reason, I personally promise that nothing I create with AI will ever be anything other than fully open-source software itself, and I will never attempt to charge for its usage.

A Second Effort

So, this is my second time around trying something like this. The first time I had essentially zero software engineering experience and didn't really know how to design an app at all; I have always been a utility programmer who Knows Enough Python to Be Dangerous™ at best. The second time I had slightly more than zero experience. I am pretty sure I made much better choices this time -- e.g., this is basically middleware, and middleware is sort of what Golang is for in an existential sense -- but I'm sure smarter people than me have a lot of... constructive(?) criticism of this software. Feel free to post below!

u/starkruzr — 5 days ago