On Shave

I’ve been teasing you all with talk about my own transliteration tool for quite some time. I hadn’t released it, as there were some features I really wanted to have nailed before letting you guys get your hands on it.

The wait is over. I just released a web UI for my Shave transliterator so you can give it a spin.

The Shave text-mode UI.

Shave is not like the other Shavian transliteration tools out there. It builds on an extended version of Readlex and takes inspiration from Dave Coffin’s Python transliterator. Like his, it uses rules and heuristics to figure out words that aren’t in the dictionary, and a machine learning engine for part-of-speech (POS) tagging. The tagging helps disambiguate words that are pronounced differently in different contexts, such as the different tenses of the verb to read, or the way the word project is pronounced entirely differently depending on whether it is a verb or a noun.
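
To make the POS idea concrete, here is a minimal sketch of a tag-aware lookup. It is not Shave's actual code: the `HETERONYMS` table, the `transliterate_tagged` function, the Penn Treebank-style tags and the Shavian spellings are all assumptions chosen for illustration, and the tags are supplied by hand rather than by a real tagger.

```python
# Illustrative lexicon: spelling -> {POS tag: Shavian}. "*" is the fallback.
# Entries and tag names are assumptions for this sketch, not Shave's data.
HETERONYMS = {
    "read": {"VBD": "𐑮𐑧𐑛", "VBN": "𐑮𐑧𐑛", "*": "𐑮𐑰𐑛"},
    "project": {"NN": "𐑐𐑮𐑪𐑡𐑧𐑒𐑑", "*": "𐑐𐑮𐑩𐑡𐑧𐑒𐑑"},
}


def transliterate_tagged(tagged_tokens):
    """Map (word, POS tag) pairs to Shavian, leaving unknown words untouched."""
    output = []
    for word, tag in tagged_tokens:
        senses = HETERONYMS.get(word.lower())
        if senses is None:
            output.append(word)  # not in the toy lexicon: pass through
        else:
            output.append(senses.get(tag, senses["*"]))
    return output


# Tags are written by hand here; a real pipeline would get them from a tagger.
past = transliterate_tagged([("I", "PRP"), ("read", "VBD"), ("it", "PRP")])
present = transliterate_tagged([("I", "PRP"), ("read", "VBP"), ("it", "PRP")])
```

The same spelling comes out differently in `past` and `present` purely because of the tag, which is the whole point of running a tagger before the dictionary lookup.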

But it goes beyond this. I have built a custom ML model for the specific task of disambiguating the words that can’t be figured out from grammar alone. Words like lead, tear and bow. This Word Sense Disambiguator (WSD) is what really takes the tool to the next level: I’ve been using the latest one for a couple of weeks now, and it’s gotten to the stage where it more often than not gets it right.

I didn’t stop there. I realized that I could use the same techniques to tackle the much harder reverse transliteration problem: taking a text written in Shavian and converting it back to plain old regular English. I trained a network on the specific problem cases that occur in that direction: does 𐑲 mean I, eye or aye? Is it their, there or they’re? A cunning combination of looking for context in neighboring words and the custom-trained neural network makes the reverse transliteration mode very usable indeed.
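
As a rough illustration of the "context in neighboring words" half of that combination, the sketch below resolves 𐑲 by looking at the word before it. Everything here is hypothetical: the `DETERMINER_CUES` set and the `resolve_i` function are invented for this example, the rule ignores aye entirely, and the trained network is not modeled at all.

```python
# Hypothetical neighbor rule for the ambiguous Shavian word 𐑲 (I / eye / aye).
# The cue list is an assumption for illustration only.
DETERMINER_CUES = {"𐑞", "𐑩𐑯", "𐑥𐑲"}  # the, an, my


def resolve_i(tokens, index):
    """Pick a Latin spelling for 𐑲 at tokens[index] from the previous word."""
    previous = tokens[index - 1] if index > 0 else None
    if previous in DETERMINER_CUES:
        return "eye"  # "my 𐑲", "the 𐑲", "an 𐑲" read as a noun
    return "I"        # default to the pronoun


noun_case = resolve_i(["𐑥𐑲", "𐑲"], 1)       # "my eye"
pronoun_case = resolve_i(["𐑲", "𐑔𐑦𐑙𐑒"], 0)  # "I think"
```

A rule like this only covers the easy cases, which is presumably why a trained model is needed for the rest.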

But wait, there’s even more! The tool can be run interactively – giving you the ability to step in and correct it, or fill in Shavian versions of words it is not sure about. To make this work, every decision is flagged with a confidence indicator.

When using Shave interactively, you can filter the errors by confidence level. For a nice quick transliteration, set the confidence filter to zero, and it will fill in its best guess everywhere (or leave the word untouched if it really didn’t know what to do). Or you can take it to 100%, and review every single word that wasn’t found verbatim in the dictionary (or a homograph). Up to you.
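
One way to picture the confidence filter is as a simple threshold over per-word decisions. This is a sketch under assumptions, not the tool's implementation: the `Decision` class, the `needs_review` function, the 0.0 to 1.0 scale and the sample scores are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    source: str        # word as written in the input
    guess: str         # best-guess transliteration (or the source, if none)
    confidence: float  # 0.0 (no idea) to 1.0 (verbatim dictionary hit)


def needs_review(decisions, threshold):
    """Return the decisions whose confidence falls below the threshold."""
    return [d for d in decisions if d.confidence < threshold]


# Sample scores are invented for this example.
decisions = [
    Decision("the", "𐑞", 1.0),             # found in the dictionary
    Decision("reshave", "𐑮𐑰𐑖𐑱𐑝", 0.6),  # built by a heuristic
    Decision("xyzzy", "xyzzy", 0.0),        # unknown, left untouched
]

quick_pass = needs_review(decisions, 0.0)  # filter at zero: nothing to review
full_pass = needs_review(decisions, 1.0)   # filter at 100%: all non-exact hits
```

With the filter at zero every best guess is accepted as-is, while at 100% only the verbatim dictionary hit escapes review.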

The plain text mode of the tool allows you to type (or paste) Shavian or Latin script text into the window, and see its transliteration appear live right next to it, and it gives you an intuitive way to step through the errors and correct them. You can add missing/unknown words to a custom dictionary which is stored locally in your browser cache.

There’s one other mode, and it is the one I’m most excited about: it’s an e-book converter. Upload any (legally owned) ePub file, and go to town with its interactive review system. Like in the plain text mode, you can filter by confidence level, and you can save unknown words to your custom dictionary. You can upload a custom book cover if you want, before downloading the converted ebook.

The Shave E-Book UI in action.

This is just the beginning. The tool is still under active development – it still has some weak spots that need ironing out, and no doubt you guys will find (and hopefully report!) a bunch of bugs. The command-line tool, native macOS and iOS apps, and Safari browser extensions will follow close on its heels, and I’ll probably put some or all of it on GitHub sooner or later too.

In the meantime: happy shave-ing! Don’t hesitate to let me know what you think, how you get on and whether you encounter any issues, either here, on Bluesky or in Discord.

-Joro


Comments

3 responses to “On Shave”

  1. ·𐑨­𐑤𐑦𐑒­𐑕𐑭𐑯­𐑛𐑼

    𐑞𐑦𐑕 𐑦𐑟 𐑦𐑒𐑕𐑲𐑑𐑦𐑙! 𐑲 𐑢𐑪𐑟 𐑮𐑰𐑕𐑩𐑯𐑑𐑤𐑦 𐑑𐑮𐑲𐑦𐑙 𐑑 𐑒𐑮𐑨𐑓𐑑 𐑩 ·𐑡𐑧𐑥𐑦𐑯𐑲 𐑡𐑧𐑥 𐑞𐑨𐑑 𐑒𐑫𐑛 𐑛𐑵 𐑣𐑲 𐑓𐑦𐑛𐑧𐑤𐑦𐑑𐑦 𐑑𐑮𐑨𐑯𐑟𐑤𐑦𐑑𐑼𐑱𐑖𐑩𐑯. 𐑦𐑑 𐑢𐑪𐑟 𐑥𐑪𐑛𐑼𐑩𐑑𐑤𐑦 𐑕𐑩𐑒𐑕𐑧𐑕𐑓𐑩𐑤, 𐑚𐑳𐑑 𐑲 𐑣𐑨𐑝 𐑯𐑪𐑑 𐑐𐑫𐑑 𐑦𐑑 𐑔𐑮𐑵 𐑦𐑑𐑕 𐑐𐑱𐑕𐑩𐑟 𐑘𐑧𐑑. 𐑲 𐑔𐑦𐑙𐑒 𐑞𐑦𐑕 𐑕𐑬𐑯𐑛𐑟 𐑥𐑳𐑗 𐑥𐑹 𐑨𐑒𐑕𐑧𐑕𐑦𐑚𐑩𐑤 𐑯 𐑛𐑦𐑑𐑼𐑥𐑦𐑯𐑦𐑕𐑑𐑦𐑒. 𐑚𐑮𐑭𐑝𐑴!

    1. Joro

      𐑦𐑟𐑩𐑯𐑑 𐑦𐑑 𐑡𐑳𐑕𐑑?

      𐑲 𐑒𐑨𐑯𐑪𐑑 𐑧𐑥𐑓𐑩𐑕𐑲𐑟 𐑦𐑯𐑳𐑓 𐑞𐑨𐑑 𐑞𐑦𐑕 𐑦𐑟 𐑴𐑯𐑤𐑦 𐑞 𐑚𐑦𐑜𐑦𐑯𐑦𐑙. 𐑦𐑑𐑕 𐑩 𐑚𐑰𐑑𐑩, 𐑯 𐑞 𐑧𐑯𐑡𐑦𐑯 𐑪𐑯 𐑞 𐑕𐑻𐑝𐑼 𐑦𐑟 𐑩 𐑕𐑯𐑨𐑐𐑖𐑪𐑑 𐑝 𐑩 𐑤𐑲𐑚𐑮𐑼𐑦 𐑞𐑨𐑑 𐑲𐑥 𐑕𐑑𐑦𐑤 𐑨𐑒𐑑𐑦𐑝𐑤𐑦 𐑛𐑦𐑝𐑧𐑤𐑩𐑐𐑦𐑙. 𐑥𐑲 𐑮𐑰𐑕𐑩𐑯𐑑 𐑓𐑴𐑒𐑩𐑕 𐑢𐑪𐑟 𐑪𐑯 𐑛𐑾𐑜𐑯𐑪𐑕𐑑𐑦𐑒𐑕, 𐑥𐑱𐑒𐑦𐑙 𐑞 𐑧𐑯𐑡𐑦𐑯𐑟 𐑛𐑦𐑕𐑦𐑠𐑩𐑯𐑟 𐑝𐑦𐑟𐑦𐑚𐑩𐑤 – 𐑯 𐑞𐑨𐑑𐑕 𐑢𐑪𐑑 𐑤𐑧𐑛 𐑑 𐑞 𐑦𐑯𐑑𐑼𐑨𐑒𐑑𐑦𐑝 ∘𐑿𐑦 𐑲 𐑮𐑦𐑤𐑰𐑕𐑑.

      𐑞 𐑧𐑯𐑡𐑦𐑯 𐑣𐑨𐑟 𐑦𐑒𐑕𐑑𐑧𐑯𐑕𐑦𐑚𐑩𐑤 𐑮𐑵𐑤𐑟 𐑯 𐑑𐑿𐑯𐑩𐑚𐑩𐑤 𐑢𐑱𐑑𐑕. 𐑲 𐑦𐑒𐑕𐑐𐑧𐑒𐑑 𐑑 𐑚𐑰 𐑱𐑚𐑩𐑤 𐑑 𐑑𐑿𐑯 𐑞𐑧𐑥 𐑥𐑳𐑗 𐑚𐑧𐑑𐑼 𐑦𐑯 𐑛𐑿 𐑒𐑹𐑕, 𐑷𐑤𐑥𐑴𐑕𐑑 𐑡𐑩𐑯𐑧𐑑𐑦𐑒 𐑨𐑤𐑜𐑼𐑦𐑞𐑥 𐑕𐑑𐑲𐑤.

      𐑢𐑳𐑯 𐑔𐑦𐑙 𐑦𐑟 𐑕𐑻𐑑𐑦𐑯, 𐑞𐑦𐑕 𐑤𐑲𐑚𐑮𐑼𐑦 𐑦𐑟 𐑥𐑹 𐑐𐑼𐑓𐑹𐑥𐑩𐑯𐑑 𐑞𐑨𐑯 𐑧𐑯𐑦 ∘𐑤𐑤𐑥 𐑢𐑦𐑤 𐑧𐑝𐑼 𐑚𐑰…

      𐑐𐑤𐑰𐑟 𐑐𐑤𐑱 𐑢𐑦𐑞 𐑦𐑑! 𐑤𐑧𐑑 𐑥𐑰 𐑯𐑴 𐑣𐑬 𐑿 𐑜𐑧𐑑 𐑪𐑯!

  2. Accompanying yesterday’s post announcing Shave I thought it might be interesting for some of you to learn a bit more about what is going on under […]
