A few weeks ago, I joined the Audio Forensic Training, which was jointly conducted by the Forensic Laboratory Centre of the Indonesian National Police Headquarters and Cedar Cambridge. In this training, we developed our skills in the latest noise-filtering techniques using the Cedar system installed in the Audio Laboratory at my office. According to Dr. David Robinson, who was also the instructor for this training, our audio lab is the best Cedar lab in Southeast Asia.
In this training, we practised removing a wide range of noise. In some cases, the recorded voice is unclear because of the noise, or cannot be heard at all, because the noise is much louder than the human voice. With the assistance of Cedar, which provides many powerful filtering modules, we succeeded in removing the noise and making the human voice clear enough to listen to. Besides this, Cedar also provides a feature for recognising edit points when a voice recording has been edited. Cedar detects the time at which the editing occurred and displays a vertical line there. This line shows that the recording differs before and after that point. If such a line appears, the voice recording is no longer original. The editing could have been done to remove unwanted parts or to add some parts. In this case, the recording could be rejected and not accepted for forensic analysis because its content has been changed.
Cedar also provides a spectrogram for each word spoken. This is useful for voice identification or verification. With this feature, we can apply forensic phonetics to compare the spectrograms of a questioned voice and a known voice, so that we can determine whose voice it is. For this purpose, we developed a technique based on the FBI procedure for forensic phonetics. This procedure was described by Bruce E. Koenig in the paper "Spectrographic Voice Identification: A Forensic Survey", published in 1986. In this paper he also explained that, for meaningful results, the comparison should be performed on at least 20 different words that are pronounced similarly. Below is the complete quotation from his paper on the comparison procedure:
(1) Only original recordings of voice samples were accepted for examination, unless the original recording had been erased and a high-quality copy was still available.
(2) The recordings were played back on appropriate professional tape recorders and recorded on a professional full-track tape recorder at 7 1/2 ips. When possible, playback speed was adjusted to correct for original recording speed errors by analyzing the recorded telephone and AC line tones on spectrum analysis equipment. When necessary, special recorders were used to allow proper playback of original recordings that had incorrect track placement or
(3) Spectrograms were produced on Voice Identification, Inc., Sound Spectrographs, model 700, in the linear expand frequency range (0-4000 Hz), wideband filter (300 Hz) and bar display mode. All spectrograms for each separate comparison were prepared on the same spectrograph. The spectrograms were phonetically marked below each voice sound.
(4) When necessary, enhanced tape copies were also prepared from the original recordings using equalizers, notch filters, and digital adaptive predictive deconvolution programs [13,14] to reduce extraneous noise and correct telephone and recording channel effects. A second set of spectrograms was then prepared from the enhanced copies and was used together with the unprocessed spectrograms for comparison.
(5) Similarly pronounced words were compared between two voice samples, with most known voice samples being verbatim with the unknown voice recording. Normally, 20 or more different words were needed for a meaningful comparison. Less than 20 words usually resulted in a less conclusive opinion, such as possibly instead of probably.
(6) The examiners made a spectral pattern comparison between the two voice samples by comparing beginning, mean and end formant frequency, formant shaping, pitch, timing, etc., of each individual word. When available, similarly pronounced words within each sample were compared to insure voice sample consistency. Words with spectral patterns that were distorted, masked by extraneous sounds, too faint, or lacked adequate identifying characteristics were
(7) An aural examination was made of each voice sample to determine if pattern similarities or dissimilarities noted were the product of pronunciation differences, voice disguise, obvious drug or alcohol use, altered psychological state, electronic manipulation, etc.
(8) An aural comparison was then made by repeatedly playing two voice samples simultaneously on separate tape recorders, and electronically switching back and forth between the samples while listening on high-quality headphones. When one sample had a wider frequency response than the other, bandpass filters were used to compensate during at least some of the aural listening tests.
(9) The examiner then had to resolve any differences found between the aural and spectral results, usually by repeating all or some of the comparison steps.
(10) If the examiner found the samples to be very similar (identification) or very dissimilar (elimination), an independent evaluation was always conducted by at least one, but usually two other examiners to confirm the results. If differences of opinions occurred between the examiners, they were then resolved through additional comparisons and discussions by all the examiners involved. No or low confidence decisions were usually not reviewed by another examiner.
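Steps (5) and (6) of the quoted procedure can be illustrated with a little bookkeeping code. This is only a minimal sketch under my own assumptions, not the FBI's method and not anything Cedar provides: the `WordFormants` record, the function names, and the idea of averaging absolute differences in Hz are all hypothetical, and the measurements are assumed to have been read off the spectrograms by hand. In the real procedure the judgement is made by trained examiners, visually and aurally.

```python
from dataclasses import dataclass

# "20 or more different words" for a meaningful comparison, from step (5).
MIN_WORDS = 20


@dataclass(frozen=True)
class WordFormants:
    """Hypothetical record: one formant of one word, measured in Hz at the
    beginning, as a mean, and at the end of the word (see step (6))."""
    word: str
    begin_hz: float
    mean_hz: float
    end_hz: float


def comparable_words(questioned, known):
    """Return the words that occur in both samples (case-insensitive)."""
    q = {w.word.lower() for w in questioned}
    k = {w.word.lower() for w in known}
    return sorted(q & k)


def enough_words(questioned, known, minimum=MIN_WORDS):
    """True if both samples share at least `minimum` different words."""
    return len(comparable_words(questioned, known)) >= minimum


def formant_gaps(questioned, known):
    """For each shared word, average the absolute begin/mean/end
    differences in Hz. An illustrative number only, not a verdict."""
    q = {w.word.lower(): w for w in questioned}
    k = {w.word.lower(): w for w in known}
    gaps = {}
    for word in sorted(q.keys() & k.keys()):
        a, b = q[word], k[word]
        diffs = (
            abs(a.begin_hz - b.begin_hz),
            abs(a.mean_hz - b.mean_hz),
            abs(a.end_hz - b.end_hz),
        )
        gaps[word] = sum(diffs) / len(diffs)
    return gaps


if __name__ == "__main__":
    # Made-up measurements, for illustration only.
    questioned = [WordFormants("money", 510.0, 640.0, 700.0)]
    known = [WordFormants("Money", 500.0, 650.0, 700.0)]
    print(enough_words(questioned, known))  # one shared word is far below 20
    print(formant_gaps(questioned, known))
```

A check like `enough_words` only tells us whether a comparison is worth attempting. Fewer than 20 shared words does not forbid an examination, but according to the quotation it usually leads to a less conclusive opinion.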
According to his survey, only 1 false identification (i.e. 0.31%) was found in 318 forensic phonetics cases, while only 2 false eliminations (i.e. 0.53%) were found in 378 forensic phonetics cases. These figures suggest that the FBI technique is reliable for voice identification or verification.
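The percentages quoted from the survey follow directly from the case counts. The short check below is just arithmetic on the numbers given in this post. The function name `error_rate_percent` is my own and does not come from the paper.

```python
def error_rate_percent(errors: int, cases: int) -> float:
    """Return `errors` as a percentage of `cases`."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    return 100.0 * errors / cases


# Counts as reported above: 1 false identification in 318 cases,
# 2 false eliminations in 378 cases.
false_identification = error_rate_percent(1, 318)
false_elimination = error_rate_percent(2, 378)

print(f"False identification rate: {false_identification:.2f}%")  # 0.31%
print(f"False elimination rate: {false_elimination:.2f}%")  # 0.53%
```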
Cedar is a reliable tool for carrying out this forensic phonetics procedure, as well as for noise filtering and for recognising edit points.