[Hpr] How to check if the intro and outro are added

Carl D Hamann carl.hamann at gmail.com
Wed Dec 30 14:56:08 PST 2015


On Dec 30, 2015 3:17 PM, "Dave Morriss" <perloid at autistici.org> wrote:
> Using a recent HPR show in my podcast queue and running
> echoprint-codegen on the entire thing I found I got a chunk of JSON with
> metadata and a humongous fingerprint string.

After spending some time reading the server-side code, I found out the
"fingerprint" is an encoded (details in an upcoming episode) list of
timestamped "onset events" from the audio, which is why the lengths are
correlated.

That list then has to be fuzzy (fuzzily?) matched against a candidate
(essentially by counting how many events it has in common and whether they
occur the same distance apart; again, more details to come).
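
As a rough illustration of that matching idea (not the server's actual
code), here is a minimal Python sketch. It assumes each decoded fingerprint
is a list of (code, time) pairs; the function name match_score and the
bucket parameter are made up for this example. It counts events with the
same code in both lists and keeps only those that agree on a single time
offset.

```python
from collections import Counter, defaultdict


def match_score(query, candidate, bucket=1):
    """Count events shared by both lists at one consistent time offset.

    query, candidate: lists of (code, time) pairs (an assumed layout).
    bucket: offsets are grouped into bins of this width so that small
            timing differences still count as "the same distance apart".
    """
    # Index the candidate's event times by code for quick lookup.
    times_by_code = defaultdict(list)
    for code, t in candidate:
        times_by_code[code].append(t)

    # For every event the two lists have in common, record the time offset.
    offsets = Counter()
    for code, t in query:
        for ct in times_by_code.get(code, ()):
            offsets[(ct - t) // bucket] += 1

    # A real match piles up on one offset; chance matches scatter.
    return max(offsets.values()) if offsets else 0


intro = [(10, 0), (20, 5), (30, 9)]
upload = [(10, 100), (20, 105), (30, 109), (99, 120)]
scrambled = [(10, 100), (20, 50), (30, 7)]
print(match_score(intro, upload))     # 3: all events line up at offset 100
print(match_score(intro, scrambled))  # 1: same codes, inconsistent offsets
```

A threshold on that score (relative to the number of events in the intro)
would then decide whether the candidate counts as a match.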

> Then I started wondering how much you'd need to chop off a new show
> given that any intro might be in a multitude of formats and of a
> variable length.

The codegen tool uses ffmpeg, so it should support a lot of formats out of
the box. And if we're only checking whether the very beginning of an upload
matches the intro, selecting a good sample shouldn't be too hard.
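
A sketch of how the start of an upload might be sampled, assuming
echoprint-codegen accepts a start offset and a duration in seconds after
the filename, and that it prints JSON on stdout. The helper names and the
30-second default are assumptions for illustration, not tested values.

```python
import json
import subprocess


def codegen_command(path, start=0, duration=30, binary="echoprint-codegen"):
    """Build the command line: binary, file, start seconds, duration seconds."""
    return [binary, path, str(start), str(duration)]


def fingerprint_start(path, duration=30):
    """Fingerprint only the first `duration` seconds of `path`.

    Returns the parsed JSON exactly as the tool printed it; no assumption
    is made here about its internal layout.
    """
    result = subprocess.run(
        codegen_command(path, 0, duration),
        capture_output=True,
        check=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling fingerprint_start("show.mp3") would then give a fingerprint of just
the opening of the upload to compare against the known intro.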

> Then I realised I was probably out of my depth.

You and I both. Fortunately (unfortunately?) that hasn't stopped me yet.

> I'll be fascinated to know how people cleverer than I am work this out,
> and look forward to the show on it!

You know what they say: give a man a hammer and he'll fish for nails; teach
a man to code and he'll waste hours using awk to analyse audio and
misquoting proverbs.