Noise Reduction and Dereverberation

Noise reduction is critical in most conferencing audio systems. Unless your room is exceptionally quiet, some of the natural background noise in the room will be picked up by microphones and transmitted to the far end of your conference, increasing distractions and decreasing speech intelligibility. Noise reduction can help by rejecting noise signals while leaving desirable speech signals unaffected.

This article provides an overview of Tesira's noise reduction features in a custom configuration. For details of the features in a Biamp Launch configured system, please see this article. There are two types of noise reduction available in Tesira: Standard Noise Reduction is available on an AEC input, and AI Noise Reduction is available with a TesiraFORTÉ X processor.

Standard Noise Reduction

Tesira Standard Noise Reduction (NR) is part of the AEC input block because NR is implemented using the AEC input's onboard DSP processing hardware. NR is available on each AEC input channel, even when AEC is disabled.

Standard Noise Reduction is most effective at managing cyclical, repetitive mechanical noise such as projector cooling fans or refrigerator motors. While people are speaking, the suppressed noise becomes more audible because the NR filters are relaxed to avoid affecting the spoken word.

There are four user-defined NR levels: Off, Low, Medium, and High. Low can achieve up to about 12dB of noise reduction, Medium up to approximately 18dB, and High up to about 24dB. Results may vary depending on the properties of the noise being reduced.
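To put these figures in perspective, a reduction in decibels can be converted to a linear amplitude ratio with 10^(-dB/20). The short Python sketch below applies that conversion to the three maximum figures quoted above. The dictionary and function names are illustrative only and are not part of any Tesira software.

```python
# Maximum reduction figures quoted above for each Standard NR level (in dB).
NR_MAX_REDUCTION_DB = {"Low": 12.0, "Medium": 18.0, "High": 24.0}


def db_to_amplitude_ratio(reduction_db: float) -> float:
    """Convert a reduction in dB to the remaining linear amplitude ratio."""
    return 10 ** (-reduction_db / 20.0)


def remaining_noise_table() -> dict:
    """Fraction of the original noise amplitude left at each level (best case)."""
    return {
        level: db_to_amplitude_ratio(reduction_db)
        for level, reduction_db in NR_MAX_REDUCTION_DB.items()
    }


if __name__ == "__main__":
    for level, ratio in remaining_noise_table().items():
        reduction_db = NR_MAX_REDUCTION_DB[level]
        print(f"{level}: up to {reduction_db:.0f} dB, about {ratio:.1%} of the amplitude remains")
```

In the best case, Low leaves roughly a quarter of the original noise amplitude, while High leaves roughly 6%.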

AI Noise Reduction

Systems containing one or more TesiraFORTÉ X processors may benefit from Biamp's flagship AI Noise Reduction (AI NR) and Dereverberation technologies. AI NR reduces non-continuous noise sources that are short in duration or highly variable, such as keyboard typing, paper rustling, or coughing: any noise common to a conference room that isn't human speech. The effect is similar to steady-state noise reduction, though AI NR achieves its results using a deep learning algorithm trained to isolate the speech signal from noise. Dereverberation reduces a signal's reverberation time. The technology is available in a Biamp Launch or custom-configured Tesira system running firmware v4.7.0 or later.

The AI Noise Reduction Block

A TesiraFORTÉ X processor is required to host each single-channel AI NR block in a system. The AI NR block runs on a dedicated processing core of the device, so its addition does not impact regular DSP utilization.

AI NR is added to the layout from the Dynamics Blocks dropdown in the Audio Objects toolbar. Settings are provided to configure the intensity of the AI noise reduction and dereverberation processes. Use the dropdown menus to select the desired level for each parameter. The Bypass button allows quick A/B testing of the processing's effect.

The block may be controlled via presets to allow for external control of a system requiring multiple configurations.

Note: Each AI block in a layout requires its own TesiraFORTÉ X DSP.

AI Noise Reduction

There are four user-defined AI Noise Reduction levels: Off, Low, Medium, and High. The AI noise reduction is an adaptive filter in that it will be transparent when no noise is present but will remove more and more content as non-speech sounds increase.

AI NR, like AEC, is a destructive filter. Noise is by nature random, and its frequency components often overlap those of the audio we wish to retain, so removing noise can also affect that audio.

In environments with poor signal-to-noise ratios Biamp recommends using the lowest setting that achieves the desired outcome to avoid negative interaction with speech content. Negative interactions would result from noises being nearly equal to or louder than speech. In removing noise content in poor signal-to-noise environments there may be audible artifacts in speech if the frequency content of the noise significantly overlaps the frequency content of the speech. If the noise is in a different frequency range than the speech then interaction will be minimal.
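As a rough way to quantify "poor signal-to-noise", the sketch below estimates SNR in dB from two separate captures: one of speech and one of room noise alone. It assumes the captures are available as lists of linear sample values; the function names and the synthetic test tones are illustrative and do not come from Tesira.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of linear sample values."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(speech_samples, noise_samples):
    """Signal-to-noise ratio in dB from separate speech and noise captures."""
    return 20.0 * math.log10(rms(speech_samples) / rms(noise_samples))


if __name__ == "__main__":
    # Synthetic stand-ins for real captures: a 200 Hz "speech" tone and a
    # quieter 1 kHz "noise" tone, each one second long at 48 kHz.
    fs = 48000
    speech = [0.5 * math.sin(2 * math.pi * 200 * n / fs) for n in range(fs)]
    noise = [0.05 * math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs)]
    print(f"Estimated SNR: {snr_db(speech, noise):.1f} dB")
```

A result near 0 dB means the noise is about as loud as the speech, which is the condition in which artifacts are most likely.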

It may be desirable to select a lower level of reduction for specific applications and acoustic environments. This will allow more ambient noise to intrude but gives the far end a truer sense of what is happening in the near-end space. Disabling all noise reduction allows noises to pass to the far end unimpeded. Disabling may be required if music content is being transmitted, to avoid the noise reduction suppressing tones.

Low will act on static and dynamic noise sources, providing a reduction in their intensity. Some room noise is expected to pass through to the output with this setting; however, the natural tone of the source signal (usually speech from near-end microphones) is preserved.

Medium significantly reduces the level of noise passed to the block's output. In most cases, reasonably static noise sources will be removed, while dynamic noise sources will be greatly reduced. This setting offers a good balance between noise reduction and the preservation of speech quality.

High effectively removes unwanted noise from the audio path. The trade-off is audio artifacts may be experienced as the algorithm works harder to remove high-intensity noise sources from the speech component.

The better the signal-to-noise ratio, the more effective the noise cancellation will be, with fewer artifacts. With any new system designs, the Conference Room Designer & Classroom Designer are invaluable tools with which to achieve proper signal-to-noise ratio.

Dereverberation

The reverberation time of a room has a significant impact on audio clarity, particularly when microphones are used for far-end conferencing. The RT60 measurement of the space defines the decay time, and generally shorter values (<~500ms) are desirable for this application. When the ratio between direct and reflected speech is too low, voice clarity suffers as words begin to "smear". The microphone choice, distance to the talker, and axial placement have the greatest impact on the direct-to-reverberant ratio at the microphone.
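The relationship between RT60, microphone distance, and the direct-to-reverberant ratio can be sketched with textbook diffuse-field approximations: Sabine's formula for RT60 and the critical-distance formula. These are general acoustics estimates, not Biamp's method, and the room dimensions, absorption figure, and directivity factor below are hypothetical example values.

```python
import math


def sabine_rt60(volume_m3: float, absorption_m2: float) -> float:
    """Sabine estimate of RT60 in seconds (volume in m^3, absorption in metric sabins)."""
    return 0.161 * volume_m3 / absorption_m2


def critical_distance_m(volume_m3: float, rt60_s: float, directivity_q: float = 1.0) -> float:
    """Distance at which direct and reverberant levels are equal (diffuse-field estimate)."""
    return 0.057 * math.sqrt(directivity_q * volume_m3 / rt60_s)


def direct_to_reverberant_db(distance_m: float, critical_m: float) -> float:
    """Direct-to-reverberant ratio in dB at a given talker-to-microphone distance."""
    return 20.0 * math.log10(critical_m / distance_m)


if __name__ == "__main__":
    volume = 6.0 * 5.0 * 3.0   # hypothetical 6 m x 5 m x 3 m room
    absorption = 25.0          # hypothetical total absorption in metric sabins
    rt60 = sabine_rt60(volume, absorption)
    d_c = critical_distance_m(volume, rt60)
    print(f"RT60 estimate: {rt60 * 1000:.0f} ms, critical distance: {d_c:.2f} m")
    for distance in (0.5, 1.0, 2.0):
        ratio = direct_to_reverberant_db(distance, d_c)
        print(f"Talker at {distance} m: direct-to-reverberant ratio {ratio:+.1f} dB")
```

Under these assumptions, each doubling of talker-to-microphone distance lowers the direct-to-reverberant ratio by about 6 dB, which is why distance to the talker matters so much.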

Biamp's new Dereverberation algorithm offers a reduction in the signal's perceived decay time for the far-end listener. Dereverberation is an advanced process that employs a deconvolution algorithm to separate direct acoustic paths from reflected/diffuse paths. The result may be described as the source sounding more "dry". The impact of the processing becomes more evident in rooms with a higher RT60 time. There are four user-defined Dereverberation levels: Off, Low, Medium, and High.

Low provides reverb reduction while retaining some of the natural decay in environments with medium RT60 times. In spaces with a shorter RT60, a focused, dry signal may be achieved. This setting best preserves the natural tone of speech.

Medium offers a good balance between reverb reduction and natural tone.

High offers the most significant reduction in reverb and can be used to achieve a very "dry" output in rooms with a low to medium RT60 time. If a room has an excessive decay time (800ms or more), or the direct-to-reverberant ratio is too low (mics too far from talkers), the High setting may impart audio artifacts on the speech signal.

Block placement

When an AI NR block is included in a layout, the compiler will add the most appropriate TesiraFORTÉ X model to the Equipment Table. A TesiraFORTÉ X will be added to the Equipment Table for each AI block present. There are no limitations as to where a block may be placed within a layout (except for preceding an AEC processing block, as with all time-based DSP blocks). However, there are placements that best utilize the technology, and placements which should generally be avoided. In a multi-device system, implicit AVB provides decentralized processing to place the block on a path otherwise hosted on a different DSP processor.

Common use cases

The typical placement for AI noise reduction in a conferencing system is the mixed output of one or more microphones, prior to routing to the far end of a call. Room noise and reflections are reduced so the far end experiences clear speech and a call free from distraction.

In conferencing applications, Biamp recommends placing a mute control after the AI NR block. This allows constant analysis of the input for noise reduction, resulting in a more consistent performance as microphones are muted and unmuted. Muting at the USB output is ideal since this is often synchronized to the connected UC device.

AI NR may be placed in the signal path that routes speech to the local loudspeakers, such as a conference room or public address system.

AI Noise Reduction should not be placed in the signal path of any music that will be played through the system.

The AI processing time should be taken into consideration for applications that require low latency, such as a performance venue. While an audience member is unlikely to notice a small amount of added latency, a presenter may perceive the time difference between their own voice and the sound reinforcement. This can be distracting, so steps should be taken to mitigate this. Tesira automatically synchronizes outputs using Delay Equalization and supports custom grouping to allow different audio paths to pass through the DSP at different latencies. An example solution is to create multiple microphone mixes, one post-AI NR for the audience loudspeakers, and one pre-AI NR feeding the stage monitors. When each output is added to a separate Delay Equalization group, the shortest route to output is taken, dependent on the required processing latency:

Delay EQ Example.png
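The effect of grouping can be modeled in a few lines. The sketch below assumes that every output in a Delay Equalization group is padded up to the longest processing latency in that group, which is a simplified reading of the behavior described above. The latency values are placeholders chosen for illustration, not published Biamp figures, and the function name is hypothetical.

```python
from collections import defaultdict


def equalize_groups(paths):
    """Pad every output up to the longest latency within its own group.

    paths: iterable of (output_name, group_name, processing_latency_ms).
    Returns {output_name: (added_delay_ms, total_latency_ms)}.
    """
    longest = defaultdict(float)
    for _, group, latency_ms in paths:
        longest[group] = max(longest[group], latency_ms)
    return {
        name: (longest[group] - latency_ms, longest[group])
        for name, group, latency_ms in paths
    }


if __name__ == "__main__":
    POST_AI_MS = 40.0  # placeholder latency for the mix that passes through AI NR
    PRE_AI_MS = 3.0    # placeholder latency for the mix that bypasses AI NR

    shared = equalize_groups([
        ("audience loudspeakers", "group 1", POST_AI_MS),
        ("stage monitors", "group 1", PRE_AI_MS),
    ])
    separate = equalize_groups([
        ("audience loudspeakers", "group 1", POST_AI_MS),
        ("stage monitors", "group 2", PRE_AI_MS),
    ])
    print("Shared group, stage monitors:", shared["stage monitors"])
    print("Separate groups, stage monitors:", separate["stage monitors"])
```

With both outputs in one group, the stage monitors inherit the full AI NR latency; in separate groups they keep only their own shorter path.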

Paths to avoid

Inserting AI processing into the AEC reference path is not recommended unless that path leads to the local loudspeakers. Otherwise, the AEC reference instructs the AEC filter to remove a signal that differs from the one actually present at the microphones.

AI on AEC Path.png

Processing program material may be detrimental depending on the content. Consider the two signal paths below that represent the same system. Content and microphone audio are mixed and sent to a recorder. In the second example, if the program material contains music, it will be negatively impacted as the AI process attempts to separate voice from anything not considered voice, such as instruments.

AI on PGM.png

Tuning

You will need to be able to hear the output of the mics to properly tune them. This is often achieved by calling into a conference room from a remote location, or by implementing a cue output. A pair of headphones directly connected to an output provides the opportunity to verify performance before any additional processing is applied by a telephony service, VC codec, or UC engine.

Standard Noise Reduction

Applying Standard NR processing and AI NR to the same audio path can be a valid and useful solution in some situations. When Standard NR is used to mitigate steady-state noise (HVAC, etc.) and AI NR is used for dynamic noise sources, Biamp recommends the Low or Medium setting for each NR stage.

Setting mics with proper input gain structure is important to avoid creating an artificially high noise floor. Interaction between the Noise Reduction process and the noise floor can result in audible artifacts as the NR attempts to chase random sounds, with varying success.

Before applying NR, listen to the mic output and raise the High Pass Filter until most of the low-frequency "rumble" is cleaned up. This setting is found in the AEC block > Ch Processing > Advanced Filters tab. The default is 70Hz, but rooms commonly benefit from a change to between 125Hz and 195Hz.
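To illustrate why raising the cutoff helps with rumble, the sketch below compares how much a high-pass filter attenuates 60Hz content at the three cutoff frequencies mentioned above. The filter type and slope used by Tesira are not stated here, so a second-order Butterworth biquad (the widely used audio EQ cookbook form) at a 48kHz sample rate is assumed purely for illustration.

```python
import cmath
import math


def butterworth_highpass_biquad(cutoff_hz, sample_rate_hz=48000.0):
    """Second-order Butterworth high-pass coefficients, normalized so a[0] == 1."""
    q = 1.0 / math.sqrt(2.0)  # Butterworth alignment
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = [(1.0 + cos_w0) / (2.0 * a0), -(1.0 + cos_w0) / a0, (1.0 + cos_w0) / (2.0 * a0)]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a


def magnitude_db(b, a, freq_hz, sample_rate_hz=48000.0):
    """Magnitude response of the biquad at one frequency, in dB."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / sample_rate_hz)
    numerator = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    denominator = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return 20.0 * math.log10(abs(numerator / denominator))


if __name__ == "__main__":
    rumble_hz = 60.0  # example low-frequency rumble component
    for cutoff in (70.0, 125.0, 195.0):
        b, a = butterworth_highpass_biquad(cutoff)
        print(f"Cutoff {cutoff:.0f} Hz: {magnitude_db(b, a, rumble_hz):.1f} dB at {rumble_hz:.0f} Hz")
```

Under this assumption, moving the cutoff from 70Hz to 195Hz removes substantially more energy at 60Hz, leaving less low-frequency content for the NR stage to deal with.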

If Standard NR is to be used in conjunction with AI NR & Dereverberation, the former should be set first with AI NR & Dereverberation disabled. Listen with NR Off, then move the setting to Low and listen, then repeat as needed with Medium and High, choosing the best solution for each mic in the room. Localized sound sources may result in differing levels of NR being used for each mic.

If there are audible artifacts in the mic output (often perceived as water dripping or ghostly whispers) there are two ways to attempt to resolve this issue:

1. Reduce the amount of Noise Reduction being applied by one step and listen to the output. Is it clear? If not, reduce it again. If the results are acceptable, move on. If not, go to step 2.
2. Reduce the input gain of the microphone to reduce the noise floor. Listen anew with NR Off, then move to Low and listen, then repeat as needed with Medium and High, choosing the best solution for each mic in the room.

AI Noise Reduction & Dereverberation

Once Standard NR performance is acceptable (if used), then begin tuning the AI processes. There are additional factors to consider when tuning AI processing:

AI NR and Dereverberation may be acting on a microphone mix, or a mix and other sources, depending on the block's placement in the layout. When tuning the processing, all inputs must be tested, as their acoustic paths will be different. A satisfactory audition for one microphone may not match another, particularly if the signal-to-noise ratio differs significantly between the two.
AI NR and Dereverberation are trained to act on dynamic inputs: noise sources may be ever-changing, and the reverberation path will differ between microphones. With this in mind, ensure every microphone location is tested, with a wide array of noise stimuli.

Ensure all microphones are configured for proper gain structure, then disable Dereverberation and begin with AI NR. Listen at the output, and move from Off to Low. Test every microphone position, confirm steady-state noise reduction first, then simulate other noise sources that could be common to the space. Compare each processing level from Low to High and choose the lowest setting that provides the desired result.
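Because one AI NR block often acts on a mix of several microphones, it can help to record the listening results per position and then pick the lowest level that was acceptable everywhere. The sketch below is a hypothetical note-taking helper, not a Tesira feature; "acceptable" is the tester's own judgment that noise is sufficiently reduced without audible artifacts.

```python
# Levels in ascending order of processing intensity.
LEVELS = ["Off", "Low", "Medium", "High"]


def lowest_acceptable_level(results):
    """Return the lowest level judged acceptable at every tested position.

    results: {position_name: {level_name: bool}} from listening tests.
    A level that was not tested at a position counts as not acceptable.
    Returns None when no single level satisfies every position.
    """
    for level in LEVELS:
        if all(per_level.get(level, False) for per_level in results.values()):
            return level
    return None


if __name__ == "__main__":
    # Hypothetical listening-test notes for three microphone positions.
    notes = {
        "table front": {"Off": False, "Low": True, "Medium": True, "High": True},
        "table rear": {"Off": False, "Low": False, "Medium": True, "High": False},
        "ceiling": {"Off": False, "Low": True, "Medium": True, "High": True},
    }
    print("Chosen AI NR level:", lowest_acceptable_level(notes))
```

A result of None suggests that no single setting suits every position, in which case gain structure or microphone placement is worth revisiting before raising the level further.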

Once noise reduction is satisfactory, move on to Dereverberation. Be sure to test every microphone location, choosing the Dereverberation level that achieves the best balance of reverb reduction across all inputs being processed.

If the noise and reverberation performance is satisfactory, but there are audible artifacts affecting speech quality, revisit the AI processing. The AI filters act on frequency components that overlap the speech band. Artifacts occur when AI is working too hard as a result of a low signal-to-noise ratio, or when the processing level is set higher than required for a given input signal. If this is the case, reduce the applied Noise Reduction and Dereverberation levels and retest. If acceptable speech quality and processing filter impact cannot be obtained, revisit microphone placement and distance to the talker.

Noise reduction offload

The dedicated noise reduction processing functions of Biamp DSPs are valued features. However, UC platforms often do not natively support noise reduction offload, even when connected to certified peripherals like a Biamp DSP. Many vendors support noise reduction handoff to the DSP via manual configuration. More information on vendor support is available in our Biamp UC Compatibility Matrix.

Processing Library (.tlf) files for use with Parlé mics and AI NRD

To simplify adding AI Noise Reduction to manually configured TesiraFORTÉ X room files, we've built a new Tesira Processing Library file with custom blocks. It allows drag-and-drop import of the recommended AI NRD signal chain, with optimal block settings, into your designs. It has been refined for the Parlé TCM-X mics but will work with TCM-1 and TTM-X mics as well.

These custom blocks are recommended instead of the Parlé Processing block when AI Noise Reduction is used.

Save the linked "Biamp Parlé Beamtracking Mics with AI NRD v4b.tlf" file to your PC, import it to the Processing Library, then drag and drop the custom blocks into layouts as needed.

Biamp Parlé Beamtracking Mics with AI NRD v4b.tlf

The use of AI NRD allows TesiraFORTÉ X to dynamically adjust the amount of noise reduction and dereverberation applied. In practice this allows for a wide range of good behavior without manual intervention. The default settings in the AI NRD block of Medium for noise suppression and Medium for dereverberation are a good fit for most rooms. Due to these improvements, we do not employ the manual toggles of Poor, Fair, Good, Great, and Perfect in this block as we do in the standard Parlé Processing block.

For conferencing applications, each mic still requires a channel of AEC processing. We recommend leaving the Noise Reduction in the AEC processing block set to Low as this setting allows additional noise floor gains when used in tandem with the AI NRD block.

Only one AI NRD block instance is supported per TesiraFORTÉ X device. The AI NRD block is not available in other Tesira devices.

Since only one AI NRD block is available per device, the placement of the block must allow it to act on all mics that need noise reduction. This is typically after the Gating Auto Mixer or Gain Sharing Auto Mixer brings all the room mic signals together, but before they are split out to multiple transmit paths.

The AI Noise Reduction block should be placed before AGC or compression. This allows the noise reduction to optimize signal-to-noise performance before dynamic processing is applied.

Notice that in our AEC processing block, noise reduction is applied at the beginning of the signal chain - prior to dynamic processing. Combining multiple mic channels and sub-mixing the mics before noise reduction is fine. It is preferable to have the noise reduction done prior to AGC and compression to provide the best signal-to-noise ratio for dynamic processing. If dynamic processing is done prior to noise reduction then noise levels may be raised artificially and unintentionally, resulting in lower audio quality.

The AI Noise Reduction block should be placed before the mic signal path mute, allowing the algorithms to continue to monitor and adapt to noise changes while the mute is active.

The blocks are sized for 1, 2, 3, 4, 6, 8, 10, 12, or 16 mics. Choose the nearest fit for your application. If fewer mics are used, simply leave the unused input nodes disconnected.
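Choosing a block can be expressed as a simple lookup. The sketch below interprets "nearest fit" as the smallest available block with at least as many inputs as there are mics, since unused inputs can be left disconnected; that interpretation and the function name are assumptions made for illustration.

```python
import bisect

# Block sizes listed above, in ascending order.
BLOCK_SIZES = [1, 2, 3, 4, 6, 8, 10, 12, 16]


def choose_block_size(mic_count):
    """Return (block_size, unused_inputs) for the smallest block that fits."""
    if mic_count < 1:
        raise ValueError("mic_count must be at least 1")
    index = bisect.bisect_left(BLOCK_SIZES, mic_count)
    if index == len(BLOCK_SIZES):
        raise ValueError(f"no custom block is sized for {mic_count} mics")
    size = BLOCK_SIZES[index]
    return size, size - mic_count


if __name__ == "__main__":
    for mics in (1, 5, 9, 16):
        size, unused = choose_block_size(mics)
        print(f"{mics} mics: use the {size}-mic block, leave {unused} input(s) disconnected")
```

For example, a room with 5 mics would use the 6-mic block with one input left disconnected.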

The custom blocks do not provide a Mix Out node from the Gating Auto Mixer for Auto Mixer Combiner scenarios. This is because the single AI NRD block will not support noise reduction in two or more rooms in a divided-space scenario. If multiple AI NRD blocks are required, then multiple processors are needed; otherwise, our standard noise reduction in the AEC processing block can be used.