My messaging app golden age is that of Allo. Remember Allo? It was one of Google's messaging product attempts, and it had the killer feature that all mobile messaging apps should have: voice message transcription.

See, I was taking the bus a lot back then, a college student living far off-campus, and I liked messaging my mom while walking. Often it was far more convenient to send a voice message than to type with my thumbs. However, if she responded with another voice message, I didn't always have headphones on that could blast it over the street noise. Google would kindly transcribe the messages so we could each emote and express with tone of voice whether or not the other could listen to it right at that second.

Yay, open source, yay, encrypted defaults, yay, decentralization, whatever, but I'm not just an engineer: I'm also a young-ish woman with demands of the tech I use with my friends. I want stickers and I want voice messages, Matrix. They're just starting to introduce the latter into the major clients, so I thought I'd see if I could duct tape on automatic transcription.

Now, let me get this out of the way: I am sending voice messages to Google, which is obviously Bad in the way that sending big corporations your data is always bad. I recommend not enabling data logging, and putting up a disclaimer on any room with your bot enabled a la "note: voice messages to this room will be transcribed by sending the audio to Google's speech-to-text API [...] data logging enabled."

I already had a setup of maubot on my server, and I vaguely recall its installation being pretty straightforward, so I won't walk through that part. You will need to set up a Google Cloud Platform account, enable the speech API for a project, and get its relevant credentials in a file [1].

    from mautrix.types import EventType, MessageType

I will thus provide most of the systemd file I use [2] so you can see where I set the environment variable.

    # not sure if this is really right but don't care

I threw in people's names and some other phrases I wanted to cue Google to expect, but it hasn't seemed to work all that well.

    # I've taken out the actual content here.
    _speech_context = speech.SpeechContext(phrases=)

    def handle_tombstone(self, evt: MessageEvent) -> None:
        # I don't remember whether this method has to be named
        # there is a way to filter message types in the bot framework,
        # but it also uses a regex that's unnecessary here
        # this bit is hacky/bad because pydub wants to work with files
        original_format_filename = os.path.join(gettempdir(), evt.event_id)
        with open(original_format_filename, "wb") as fp:
            ...
        tgt_filename = original_format_filename + ".flac"
        # since we have to process the audio to convert it, we might
        ...
        # quiet speech or decreasing accuracy, but it'll lower the
        # amount of audio you have to send over the wire.
        # if you want to play it safer, lower silence_thresh further
        # the easy speech API that doesn't need to upload into their file storage
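The plugin comments mention `silence_thresh` and say that lowering it is the safer choice. Here is a rough, standard-library-only sketch of why that is, assuming 16-bit PCM samples and a loudness threshold in dBFS. This is not pydub's implementation and not the bot's actual code: `dbfs`, `trim_silence`, and `chunk_size` are hypothetical names made up for the illustration.

```python
import math
from typing import List, Sequence


def dbfs(samples: Sequence[int], max_amplitude: int = 32768) -> float:
    """RMS loudness of a chunk in dB relative to full scale (16-bit assumed)."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return float("-inf")
    return 20 * math.log10(rms / max_amplitude)


def trim_silence(
    samples: Sequence[int],
    chunk_size: int = 10,
    silence_thresh: float = -40.0,
) -> List[int]:
    """Drop leading and trailing chunks that are quieter than silence_thresh.

    Hypothetical helper: chunks at or below the threshold count as silence,
    so a lower (more negative) threshold trims less audio.
    """
    chunks = [samples[i:i + chunk_size] for i in range(0, len(samples), chunk_size)]
    loud = [i for i, chunk in enumerate(chunks) if dbfs(chunk) > silence_thresh]
    if not loud:
        return []
    first, last = loud[0], loud[-1]
    return [s for chunk in chunks[first:last + 1] for s in chunk]


if __name__ == "__main__":
    quiet = [10] * 20      # very quiet stretch (made-up sample values)
    speech = [8000] * 20   # clearly audible stretch
    clip = quiet + speech + quiet
    print(len(trim_silence(clip, silence_thresh=-40.0)))  # quiet edges removed
    print(len(trim_silence(clip, silence_thresh=-80.0)))  # nothing counts as silence
```

With the threshold at -40 dBFS only the loud middle survives; at -80 dBFS the quiet edges are kept too, which is the "play it safer" direction: less risk of clipping quiet speech, at the cost of sending more audio over the wire.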