digitalmars.D - D for Speech and Signal Processing

"Chris" <wendlec tcd.ie> writes:

There are voice analysis and speech processing toolkits like 
Covarep and Voicebox (see links below) that were coded in Matlab, 
because they were originally only prototypes. There has been talk 
of porting them to C++. My first thought, as you might imagine, 
was why not use D? However, I don't know if there are any 
performance issues, especially for real time systems (in speech 
recognition), talking about GC, or in fact any other issues 
(number grinding etc.).

A lot of the analysis tools are based on some sort of HMM 
(http://en.wikipedia.org/wiki/Hidden_Markov_model) and I think D 
could handle that elegantly.

https://github.com/covarep/covarep
http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html

Nov 28 2013

"Joakim" <joakim airpost.net> writes:

On Thursday, 28 November 2013 at 10:30:36 UTC, Chris wrote:
 [...]

I agree that D would make an excellent platform for such signal 
processing work, though I don't know what real-time constraints 
would have to be worked out either.  Perhaps a better approach 
would be to write a D wrapper for the C version of the CMU Sphinx 
suite, then move development to D over time:

http://cmusphinx.sourceforge.net/

I don't know the relative merits of those three software projects 
though.  Maybe Sphinx isn't the best implemented, but it 
certainly might be the quickest to get up and running with D.
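
A minimal sketch of the wrapping idea, under stated assumptions: D can declare a C function with `extern (C)` and call it directly, so a wrapper layer is mostly a set of such declarations plus thin D-friendly functions. The C standard library's `strlen` stands in for a real Sphinx function here (no Sphinx API is reproduced), and the wrapper name `cLength` is hypothetical.

```d
import std.string : toStringz;

// Declaration of an existing C function. A real binding would declare
// the wrapped library's own functions here instead (not shown).
extern (C) size_t strlen(const(char)* s) nothrow @nogc;

// Hypothetical D-friendly wrapper: takes a D string and hides the
// conversion to a zero-terminated C string.
size_t cLength(string s)
{
    return strlen(toStringz(s));
}
```

Callers then work with ordinary D strings, e.g. `cLength("speech")`, and never touch the C pointer types.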

Nov 28 2013

"Frustrated" <c1514843 drdrb.com> writes:

On Thursday, 28 November 2013 at 10:30:36 UTC, Chris wrote:
 [...]

The GC is a stop-the-world collector, which is a huge issue for
real-time processing. It is one of the major drawbacks of D for
use in (near) real-time software. Luckily you can get D to work
without the GC, but it is a huge pain in the ass.
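
A lighter option than removing the GC altogether is to pause automatic collections around a time-critical section using `GC.disable` and `GC.enable` from `core.memory`. The sketch below assumes a hypothetical routine `sumSquares` as the time-critical work. Note that the runtime may still collect in exceptional cases such as running out of memory, so this reduces pauses rather than guaranteeing none.

```d
import core.memory : GC;

// Hypothetical stand-in for a time-critical processing routine.
float sumSquares(const(float)[] frame)
{
    float acc = 0;
    foreach (x; frame)
        acc += x * x;
    return acc;
}

float processWithoutCollections(const(float)[] frame)
{
    GC.disable();             // suppress automatic collections
    scope (exit) GC.enable(); // re-enable on every exit path
    return sumSquares(frame);
}
```

Outside the critical section, `GC.collect()` can be called at a moment when a pause is acceptable.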

But even given the drawbacks, it may well be worth starting with
D. The more interest there is in D, the more likely these issues
will be fixed.

Unfortunately I'm in a similar scenario and I hope that the GC 
will not be an issue in the end.

Nov 28 2013

"Baz" <burg.basile yahoo.com> writes:

On Thursday, 28 November 2013 at 10:30:36 UTC, Chris wrote:
 [...]

Hi, I have a little experience in DSP programming using OOP
languages, so I'll try to give you my view, though it relates
more to entertainment DSP software (ASIO, VST, etc.).

Regarding "talking about GC":

In "pseudo" real-time (RT) audio (where one or more buffers are
overlapped) you are in a loop (an interesting example is
bufferSwitch in ASIO). It's time-critical and
performance-critical, so you never create a class instance or
allocate a buffer there. The idea is to ask what triggers the
GC: memory allocation and dynamic class instance creation. It's
like GUI programming: you don't destroy and recreate many
objects in the "resize/realign" message handler. So the GC
problem is solved: there is no GC problem, because in RT DSP you
won't do something stupid that triggers a GC pass.
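
A minimal sketch of that rule, assuming a hypothetical moving-average processor: all memory is obtained once in `setup`, and `process` (the part that would run inside the audio callback) only reads and writes memory that already exists. The names `MovingAverage`, `setup` and `process` are illustrative and not taken from ASIO or any other API.

```d
struct MovingAverage
{
    private float[] window; // scratch buffer, allocated once in setup()
    private size_t pos;
    private float sum = 0;

    // Called before streaming starts; allocating here is harmless.
    void setup(size_t length)
    {
        window = new float[](length); // GC allocation, outside the loop
        window[] = 0;                 // floats default to NaN in D
        pos = 0;
        sum = 0;
    }

    // Called once per audio buffer; performs no allocation.
    void process(float[] buffer)
    {
        foreach (ref sample; buffer)
        {
            sum += sample - window[pos]; // add newest, drop oldest
            window[pos] = sample;
            pos = (pos + 1) % window.length;
            sample = sum / window.length;
        }
    }
}
```

Since `process` never allocates, it cannot be the cause of a collection, whatever the rest of the program does between callbacks.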

In speech recognition you'll mostly use frequency-domain
techniques (not to mention the FFT), so basically, if you don't
want to trigger a GC pass, don't use built-in arrays; make your
own array type using alloc/malloc/free. For classes it's the
same: you can still write your own class allocator/deallocator,
as described in the manual (even though it says they are
deprecated). With user-managed classes and arrays you'll avoid
most GC passes. But that doesn't change the most important
point: do not allocate in the audio buffer loop.
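
A rough sketch of user-managed arrays and classes, under stated assumptions. The array part wraps `malloc`/`free` from `core.stdc.stdlib`. For the class part it uses `std.conv.emplace` to construct an object in malloc'd memory, which is a different technique from the deprecated class allocators mentioned above. `MallocBuffer`, `Window`, `makeWindow` and `destroyWindow` are hypothetical names.

```d
import core.stdc.stdlib : malloc, free;
import std.conv : emplace;

// Array whose storage comes from malloc, not from the GC heap.
struct MallocBuffer
{
    float[] data;

    this(size_t length)
    {
        auto p = cast(float*) malloc(length * float.sizeof);
        assert(p !is null, "malloc failed");
        data = p[0 .. length]; // slice the raw memory as a D array
        data[] = 0;
    }

    @disable this(this); // no copies, so the memory is freed exactly once

    ~this()
    {
        free(data.ptr);
        data = null;
    }
}

class Window
{
    float gain;
    this(float gain) { this.gain = gain; }
}

// Construct a Window in malloc'd memory. If the class held references
// into GC memory, the block would also need registering with the GC.
Window makeWindow(float gain)
{
    enum size = __traits(classInstanceSize, Window);
    void* mem = malloc(size);
    assert(mem !is null, "malloc failed");
    return emplace!Window(mem[0 .. size], gain);
}

void destroyWindow(Window w)
{
    destroy(w);          // run the destructor
    free(cast(void*) w); // release the memory manually
}
```

Both would be created during setup and released at shutdown, so the audio loop itself still never allocates.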

Nov 29 2013

"ponce" <contact g3mesfrommarsSPAM.fr> writes:

On Friday, 29 November 2013 at 16:58:47 UTC, Baz wrote:
 In speech recognition you'll mostly use frequency-domain
 techniques (not to mention the FFT), so basically, if you don't
 want to trigger a GC pass, don't use built-in arrays; make
 your own array type using alloc/malloc/free. For classes it's
 the same: you can still write your own class
 allocator/deallocator, as described in the manual (even though
 it says they are deprecated). With user-managed classes and
 arrays you'll avoid most GC passes. But that doesn't change
 the most important point: do not allocate in the audio buffer
 loop.

From what I hear, Pro Tools for Mac is especially demanding and
requires that there be no allocation in the audio thread. Some
people run into problems merely by acquiring a lock.

Nov 29 2013

"Chris" <wendlec tcd.ie> writes:

On Friday, 29 November 2013 at 16:58:47 UTC, Baz wrote:
 [...]

Thanks. That's very interesting, I'll look into it.

Nov 29 2013

"Nick" <nmsmith65 gmail.com> writes:

I'm also waiting on D to address garbage collection better. As
has been brought up multiple times before (then quickly
forgotten), a @nogc attribute would do wonders for those D
programmers looking to avoid sneaky GC allocations. That's a
great start, and then we can work out how best to handle the
parts of Phobos that currently allocate. I would tackle this
myself if I had the skill.
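
For illustration only: D releases later than this thread do provide an attribute of this kind, spelled `@nogc`, which makes the compiler reject GC allocations inside the marked function. The sketch below assumes such a compiler; the function `peak` is a hypothetical example.

```d
// The compiler rejects any GC allocation inside a @nogc function.
@nogc nothrow
float peak(const(float)[] frame)
{
    float best = 0;
    foreach (x; frame)
    {
        immutable mag = x < 0 ? -x : x; // absolute value
        if (mag > best)
            best = mag;
    }
    return best;
}

// Uncommenting the line below should fail to compile, because array
// concatenation allocates from the GC:
// @nogc float[] grow(float[] a) { return a ~ 1.0f; }
```

Marking the audio callback this way turns a "sneaky" allocation into a compile error instead of a run-time pause.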

Nov 29 2013
