Need advice with audio editing
  • I usually prefer "learn by doing" but in case of voice audio, I'm pretty lost. I have been watching youtube tutorials etc. but every author seems to have their own opinions how to do things and the results haven't really impressed me.

    So could someone tell me how to edit this clip in practice: http://tonalt.kapsi.fi/audiosample.wav

    It's an interview-type clip, recorded with a shotgun microphone (NTG-3, placed just above the subject, pointing down) through a MixPre and from there directly into the GH2. Yes, a shotgun is not the best option indoors because of echoes, but this is what we have. I have Sony Vegas Pro, but lately I have been using Audacity, especially to pre-process the audio.

    I simply want it to sound as good as possible: get rid of the background noise (couldn't avoid that), do some kind of equalization, compress, and maybe even reduce the echoes somehow (possible?).

    Here's what I have figured out about work steps (or maybe I haven't figured out anything ;) )

    1. Remove noise: In Audacity, select a part of the clip that contains nothing but the background noise, then choose Effect -> Noise Removal -> Get Noise Profile. Then select all -> Effect -> Noise Removal -> (I decreased sensitivity to -4.5 dB, otherwise the effect was too obvious) -> OK, done.

    2. Equalization: This is probably one of the most difficult tasks and takes years of experience. Effect -> Equalization -> ???

    3. Compression: Effect -> Compressor -> (default values sounded pretty good actually, but what you recommend) ???

    4. Remove echoes: Possible ???

    If I would just get these real world instructions for this clip and information why some settings were chosen, I believe I would be much wiser already. I could also consider getting some better tools for audio if it really makes huge difference. Thanks!

  • I don't really hear big echoes, so maybe you should check your playback system. There is an electrical hum, for which I recommend the Waves X-Hum plugin. If you add compression, it will boost the hum, so you will either need a noise gate or have to remove the hum first and then add compression. It doesn't need a lot of compression. When there is another voice that is farther away, you can certainly cut the file into clips and process those clips independently to make them closer in sound, but they won't ever sound the same. You can EQ out the low frequencies below 100 Hz for starters.

  • @tonalt. A few questions: Are we supposed to hear the person in the background? Do you have the footage that goes with the audio available for download? What kind of format is the interview for: documentary, reality show, short film, corporate web video, etc.?

  • Ah, the playback system is fine; I think I confused echo with other terms. I mean the sound is a little "hard", like it's coming from a tin box or something. Well, that's natural, because it was recorded in a small room where one wall is almost all window. Maybe "reverb" would be a better term? But whatever it is, it's definitely not that bad.

    Waves plugin X-Hum, like this? http://www.waves.com/plugins/x-hum . Pretty expensive, I think the Audacity noise removal worked pretty ok for the hum already.

    Yep I remove the hum before compression so there is no hum to be boosted. The noise gate at least in Sony Vegas pretty much sucks.

    Remove low frequencies - why? Is this common habit with all voice editing? Isn't the voice more appealing if there is lot of low frequencies, like "deep radio voice" ?

    @spacewig Nope, we are going to remove the background person completely. I have the footage but I don't really find it necessary to be uploaded, this is about audio editing. Then I would need to ask permissions etc.

    It's interview scene, part of one comedy type of material for private event. I'm looking for the sound of those great documentaries where persons are interviewed.

  • The human voice normally produces no combination tones or subsonics (although it can), so the lowest it goes is about 110 Hz. So even your deep radio voice is above 110, maybe 100 if you are a heavy smoker. If you use a rolloff, don't use a steep one, that means that even at 70 or 80 Hz, there will still be some sound to provide some ambiance, but your hum--in the 60-70 Hz range, plus all the junk in the sound will just fade away.
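
To put rough numbers on "don't use a steep one", here is a small sketch that evaluates the idealized Butterworth high-pass magnitude response at a few frequencies. The 80 Hz corner, the two filter orders, and the function name `highpass_gain_db` are assumptions chosen for illustration, not settings taken from the clip or from any particular plugin.

```python
import math

def highpass_gain_db(freq_hz, cutoff_hz, order):
    """Gain in dB of an ideal Butterworth high-pass of the given order."""
    # |H(f)|^2 = 1 / (1 + (fc / f)^(2 * order))
    ratio = (cutoff_hz / freq_hz) ** (2 * order)
    return -10.0 * math.log10(1.0 + ratio)

CUTOFF_HZ = 80.0  # assumed corner frequency, for illustration only

for order, label in ((1, "gentle, about 6 dB/octave"), (4, "steep, about 24 dB/octave")):
    gains = {f: round(highpass_gain_db(f, CUTOFF_HZ, order), 1) for f in (60, 70, 110, 200)}
    print(label, gains)
```

With these assumed values the gentle slope takes off roughly 4 dB at 60 Hz and the steep one roughly 10 dB, while both leave 200 Hz almost untouched. A gentle rolloff alone therefore only softens a 60 Hz hum, which fits with the notch filter suggested in the next paragraph.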

    If you don't want X hum you can make an octave notch filter, that's all it is, except the notches are tunable. Much, much better than a noise filter. So if your hum is 60 Hz, you want notches, tiny dips in the EQ at 60, 120, 240, 480 and so on. The narrower the notch, the less audible it is. If your hum--electrical--is 50Hz, then it is 50, 100, 200 and so on.
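
A minimal sketch of the notch idea in plain Python, assuming the widely used Audio EQ Cookbook biquad notch formula. The function names, the Q of 30, the 8 kHz sample rate and the synthetic test tones are all hypothetical choices for illustration; this is not how X-Hum or any other plugin is implemented. The harmonic multipliers follow the octave series described above; mains hum can also carry energy at other integer multiples (3x, 5x), which could be added to the tuple.

```python
import math

def notch_coefficients(freq_hz, sample_rate, q):
    """Biquad notch coefficients (Audio EQ Cookbook form), normalized so a0 = 1."""
    w0 = 2.0 * math.pi * freq_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0)
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a

def biquad(samples, b, a):
    """Run one biquad over a list of samples (direct form I)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def remove_hum(samples, sample_rate, hum_hz, harmonics, q=30.0):
    """Apply one narrow notch per listed multiple of the hum frequency."""
    for n in harmonics:
        b, a = notch_coefficients(hum_hz * n, sample_rate, q)
        samples = biquad(samples, b, a)
    return samples

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

SAMPLE_RATE = 8000  # low rate keeps the pure-Python demo fast
t = [i / SAMPLE_RATE for i in range(2 * SAMPLE_RATE)]
hum = [0.5 * math.sin(2 * math.pi * 50 * x) for x in t]      # stand-in for 50 Hz hum
tone = [0.5 * math.sin(2 * math.pi * 1000 * x) for x in t]   # stand-in for wanted audio

hum_left = remove_hum(hum, SAMPLE_RATE, 50, harmonics=(1, 2, 4, 8))
tone_left = remove_hum(tone, SAMPLE_RATE, 50, harmonics=(1, 2, 4, 8))

settled = slice(SAMPLE_RATE, None)  # skip the first second while the filters settle
print("hum  before/after:", rms(hum[settled]), rms(hum_left[settled]))
print("tone before/after:", rms(tone[settled]), rms(tone_left[settled]))
```

A higher Q makes each notch narrower, which is the "less audible" trade-off mentioned above, at the cost of a longer settling time at the start of the clip.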

    Love audacity, open source and so on. But use something else.

  • I suggest to check spectrum audio editors / plugins. We have topics about them in same category.

    http://www.personal-view.com/talks/discussion/4160/sony-spectralayers/p1

  • Sorry, but I cannot explain this in English. (Translated from Finnish:)

    First of all, the microphone and its position are fairly off. This way you record a lot of the room and so on. But to the point. Those noise removal tools do remove hiss and the like, but a lot of other material goes with it, for example part of the room reverb, after which the sound can be surprisingly metallic. I would approach the job as follows.

    1. Normalization = you get everything out of the audio without distortion.

    2. EQ = Start removing frequencies from the bottom upward until the EQ begins to affect the speaker's voice itself. Then remove frequencies from the top downward until the same thing happens. This gives you extra headroom for processing the audio and removes part of the hiss and hum (in a quick test, the audio sits roughly between 100 Hz and 16 kHz). Next, try boosting frequencies with a small Q. Whenever you find a frequency that sounds really nasty when boosted, lower it by a few dB, that is, below zero (purely trial and error). Compare with the original every now and then so that the speaker's sound doesn't change too much (usually in the midrange). With EQ it is usually better to remove or cut frequencies than to boost them.

    Editor = lower the volume to zero whenever nobody is speaking, and do it with fades so that the hiss and so on goes away smoothly (during speech it doesn't really disturb that much). A noise gate usually gives too abrupt a result.

    Compression = preferably use a multiband compressor, so you can adjust the compression more precisely at different frequencies (you can try presets such as broadcast or CD master). Compression is fairly situational and a matter of taste, though. Do you want the so-called radio sound or a sound with more dynamics?

    Optimization = a gadget that tries to get everything out of the available dynamic range. These usually work fairly automatically.

    I don't want to take much of a stand on the tools used in the process, because I mainly use Sonic-Core systems, which are rather pricey, but there are quite good ones among those plugins. It often depends more on the user.

    The room reverb itself is pretty much impossible to remove, so it's generally worth concentrating on getting the recording itself right. Hopefully this is of some help.

  • @otcx why don't you put your text into the translator? Google can do it for you :-)

  • I think Google is not able to translate that, but if my way of doing this helps tonalt or makes any kind of sense, I'll try to translate it. But it was just basic stuff.

  • @tonalt the audio quality of speech isn't bad at all. It would be worth maybe to work only on the parts of audio where the person isn't talking, it might improve the general feeling about audio quality. Cut before and right after talking, you may use some denoising method, but possibly also record again similar room sounds in good quality and replace the most disturbing ones.

    @otcx sorry, didn't know tonalt also understands Finnish. But as we are an English-speaking forum, you could also send him a PM instead.

  • @tonalt Sorry, but I disagree. You're not audio editing, you're doing post-production audio editing, which means working on sound that's married to an image. This has consequences, as whatever you do to the audio has to relate to and support what's in the visuals. For example, we hear a few thuds/squeaks in the audio file you posted. Normally you would want to remove those, but if they correspond to the subject's obvious movement you probably want to keep them there. However, you would most likely want to separate them from the dialogue track, as all processing you do on dialogue will be done equally on production fx sounds if they are part of the same tracks. Whatever you do remove, though, you'll need to replace with room tone, which has to integrate seamlessly with the audio leading up to it and coming after it in order for it not to jump out.

    I somewhat get the impression that you're not interested in the methodology so much as a list of quick settings that will make everything sound better. Unfortunately, these aren't individual tracks in a song that you tweak and twist until it sounds cool. You need to make sure that all material preceding and following this clip will flow together as transparently as possible.

    Anyhow, I was going to edit your track and post my steps to be as clear as possible but since there's no video to work with here's a list of steps you, and others, might find useful:

    • Room tone. Make some before anything else as you'll need it to replace whatever you remove and to tie scenes together. A pain in the ass but if you do it right away you don't have to think about it again. This might involve using the audio of multiple shots taken with the same set-up.
    • Remove all unnecessary sounds and replace with room tone. Sounds you want to keep should be moved to another track. You might want to name that track production fx.
    • Remove/reduce noise once your track flows as you want it to and you've removed all undesirable sounds. Notch filters are very good for removing hums (steady tones and their harmonics, like a fridge). Try to use as steep a Q as possible; if not, you'll probably cut too much out of the vocals. Izotope hum remover is a very good tool for this, offers Q up to 1000. Once hum is removed you can address broadband noise, if there is any. Many different plugins are available, I prefer those in the Izotope RX bundle. I always use earphones for this type of work (and for creating room tone) as they allow you to see deep into the sound, whereas with speakers you'd have to crank them to crazy, unsafe levels to pick up the same amount of detail. This is also where the self-noise of the preamps you were using will become glaringly obvious. Try whenever possible to listen to the noise you are removing (most plugins have a button for output noise only) to make sure you're not damaging your dialogue. If you hear part of the dialogue in this mode, roll back the amount of reduction you are applying. All of these noise reduction plugins can cause severe artifacts.
    • Reduce reverb if necessary. There are now a few plugins that can do this. I like Zynaptiq Unveil. SPL De-Verb can also work well. I don't think your track really needs it. Forget what mic you used, just listen to the track. All rooms have some reverb, it's their signature and what gives us a sense of their space. It's only really a problem when it is distractingly obvious (not in your case) or when you have actors in different positions of a room and the reverb in the dialogue tracks from different shots doesn't match. This does not seem to apply to your audio tracks.
    • Expander/gate. The above two can often be achieved by using an expander. I don't like gates for dialogue as I find when you cut room tone out the dialogue sounds artificial and unnatural, like taking a fish out of his aquarium whenever he's not moving. I much prefer an expander and usually set it to reduce room tone to about half its volume. You'll need to set the threshold by ear.
    • Compression. Is it really necessary? One of the most misunderstood tools in audio, I'd say forget about it for now unless you know exactly what you want to accomplish with it. I'd advise focusing on making your dialogue EBU R128 compliant instead. Tons of articles about this standard online, as well as plugin EBU meters. If you CANNOT get your dialogue to sit at -23 LUFS (the R128 target level) because parts of it are clipping, then it's time to use compression.
    • EQ. Also easy to misuse. Deep vocals are nice on radio but what's the purpose in your film? Usually lower frequency means closer up, more intimate (proximity effect). Does the subject even have a deep voice? Most audio tracks will have plenty of low frequency garbage so I would follow DrDave's advice and cut below 80Hz. If this ends up reducing the low end spectrum of your subject's vocal tone then boost around 120Hz to make up for it. Between 1.2k and 2.5k can add some clarity if the hi-mids are lacking (and I'm not saying they are in your clip) and 10k and up will add some 'air' to the sound. On the other hand, once you clean and boost your dialogue you may find the esses 's' are harsh, in which case you'll want to de-ess the dialogue. Don't forget, as with everything else, whatever EQ you apply will affect all sounds on the track, not just the dialogue. This is why step 1 & 2 are crucial.
    • Brickwall limiter. Add one to the output. Clipped tracks sound like ass.
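
The expander step in the list above (reducing room tone to about half its volume instead of gating it to silence) can be sketched roughly as follows. This is a simplified fixed-range version, effectively a gate with a floor rather than a ratio-based expander, and every name and setting here (`expand`, the 0.1 threshold, the release and smoothing times, the synthetic test tones) is an assumption for illustration only.

```python
import math

def expand(samples, sample_rate, threshold, floor_gain=0.5,
           release_s=0.05, smooth_s=0.01):
    """Attenuate passages whose envelope falls below `threshold` to `floor_gain`."""
    release = math.exp(-1.0 / (release_s * sample_rate))
    smooth = math.exp(-1.0 / (smooth_s * sample_rate))
    envelope = 0.0
    gain = 1.0
    out = []
    for x in samples:
        level = abs(x)
        # Peak envelope: jump up instantly, decay slowly.
        envelope = level if level > envelope else envelope * release
        target = 1.0 if envelope >= threshold else floor_gain
        # Glide toward the target gain to avoid clicks.
        gain = target + (gain - target) * smooth
        out.append(x * gain)
    return out

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

SAMPLE_RATE = 8000

def tone(freq_hz, amplitude, seconds):
    count = int(seconds * SAMPLE_RATE)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(count)]

# Synthetic stand-ins: quiet "room tone", loud "dialogue", quiet "room tone" again.
signal = tone(200, 0.02, 0.5) + tone(300, 0.5, 0.5) + tone(200, 0.02, 0.5)
processed = expand(signal, SAMPLE_RATE, threshold=0.1)

print("room tone ratio:", rms(processed[2000:4000]) / rms(signal[2000:4000]))
print("dialogue ratio: ", rms(processed[5000:8000]) / rms(signal[5000:8000]))
```

Because `floor_gain` is 0.5 rather than 0, the room tone stays audible under the pauses, which is the point made above about gates sounding unnatural on dialogue. On real material the threshold would still have to be set by ear.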

    If your video is somewhat documentary in nature then perhaps you should aim for presenting the subject as naturally and clearly as possible with boost or cuts intended to achieve this objective.

    Don't start audio post until your edit is locked. It can be a nightmare trying to re-sync after editing changes.

    BTW, is the language Finnish? Sounds like you're interviewing Kimi Raikkonen...

    Edit: for clarity, typos, etc

  • @spacewig Great list! Even though I'm not editing anything right now, thanks.

  • Part of the post production process is deciding what you want the audio to sound like. In this case, the hum was the thing that I noticed first, and then you have to ask the question: do I want to remove the hum, and, if so, how do I remove it without making the remaining audio sound rubbery or processed? So you can use one of these "denoiser" things, and some of these will do a pretty good job, especially the more expensive ones. But I myself would start with a notch filter and a rolloff or high pass or whatever, because that removes the least amount of original material. Then you move on to the other problems in the sound.

    @otcx normalization really is not the answer here, but if used it should be used as part of the final filter stack.

    Most people don't know how to set up a notch filter, and sometimes the hum is an odd frequency, but in many cases a notch filter will remove a lot of the noise and leave the rest untouched. I can make my own notch filter but I use the X Hum from Waves because it is really good. The secret to a good notch filter is that you can change the dB of each notch in the harmonic series to avoid the sound you want to keep. So the first notch can be pretty steep, like 16 dB, and the second and third notches, in the 100-240 range, will be more shallow, and then with a little trial and error you can sort the harmonics out for the best balance. As @Spacewig says, you want to keep those notches really narrow. Most EQ filters won't go that narrow, and, if they do, they don't sound quite right. And it may turn out that a really good noise remover, like Cedar, is going to work better for your kind of noise, so you have to try a few things and see if you like them.

    @Spacewig's suggestion to EQ back in some frequencies if you feel that they are missing is good.

    How do you tell what pitch the hum is? Good DAWs like Samplitude have a tuner built in, or you can use a tuner like a Korg, or one of those tuner apps for iPhones or Android.
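
If no tuner is at hand, the same question can be answered numerically. The sketch below uses the Goertzel algorithm to compare the energy at a few candidate frequencies. The function names, the candidate list and the synthetic test signal are assumptions for illustration; on a real recording you would feed it a stretch of the clip where nobody is speaking.

```python
import math

def goertzel_power(samples, sample_rate, freq_hz):
    """Signal power at one frequency, via the Goertzel algorithm."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2

def strongest_hum(samples, sample_rate, candidates=(50.0, 60.0)):
    """Return the candidate frequency that carries the most energy."""
    return max(candidates, key=lambda f: goertzel_power(samples, sample_rate, f))

SAMPLE_RATE = 8000
# Synthetic stand-in: a quiet 50 Hz hum underneath a louder 220 Hz tone.
test_signal = [
    0.05 * math.sin(2 * math.pi * 50 * i / SAMPLE_RATE)
    + 0.3 * math.sin(2 * math.pi * 220 * i / SAMPLE_RATE)
    for i in range(SAMPLE_RATE)
]

print("strongest hum candidate:", strongest_hum(test_signal, SAMPLE_RATE))
```

For a hum at an odd frequency, the candidate list can be replaced with a finer sweep, for example every 0.5 Hz between 45 and 65 Hz.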

    I favor a combination app like the Waves L2 limiter for compression and limiting, plus noise shaping, if desired. The Waves L2 is based on the work of Michael Gerzon, so it has a natural sound that you won't find elsewhere. You just will have to see how much noise you can take out, and then what the effect of some light L2 will be on the final product.

    Then some acoustical modelling, if desired, to shape the sound into a particular space/sound concept.

    As Vitaliy points out, you should look at Spectral editing. Obviously, some DAWs have this built in, like Sequoia, but you can also go the plugin or standalone method. And the reason you should look at these is because that's what a lot of people are using to fix noise problems. So you should learn how to use these tools because they are powerful tools for audio. They don't, however, use acoustical modelling, so you need something for that as the final sound concept. Sony has some acoustical modelling built in, but it isn't as good as programs that are designed to do that. However, a notch filter is less destructive than a spectral plugin in most cases. Depends on the harmonics in the noise and also how even the sound is.

    This leads me to my last point, which is you really should consider paying someone for an hour of DAW time to just fix it, assuming you can find someone who knows what they are doing. I can take the motor out of my car and replace the bearings, but I don't want to. I want someone else to do it. And the reason is I would have to rent all that gear, plus I might drop a 10mm socket into the motor and not realize it until later. I mean, it would take probably half an hour to just set a few filters and render it out.... The fact is that someone with some high end filters, a real DAW and some experience is going to just cruise through it. So, yes, it is fun to do it yourself, absolutely, but it might be more fun to have someone else do it in this case, and you can pick up the process faster by watching how they do it.

  • You're not audio editing, you're doing post-production audio editing, which means working on sound that's married to an image. This has consequences, as whatever you do to the audio has to relate to and support what's in the visuals. For example, we hear a few thuds/squeaks in the audio file you posted. Normally you would want to remove those, but if they correspond to the subject's obvious movement you probably want to keep them there. However, you would most likely want to separate them from the dialogue track, as all processing you do on dialogue will be done equally on production fx sounds if they are part of the same tracks. Whatever you do remove, though, you'll need to replace with room tone, which has to integrate seamlessly with the audio leading up to it and coming after it in order for it not to jump out.

    That's a solid piece of knowledgeable perspective, thanks @spacewig =)

  • Wow, that's a lot of information. Thanks for all, I need some time to study it.

    @spacewig Yep Finnish.

  • @DrDave yes, normalization really is not the answer here, but I use it a lot to get the most out of the audio spectrum. At what stage of the process you use it is a bit a matter of taste and of how your workflow goes.

  • @otcx if you normalize the noise, the noise will be louder. Assuming you are recording in a 24-bit, multitrack environment, your output mix plus the final filter in the stack will bring you to -0.5 dB. So there is no need to normalize any audio, and you avoid any artifacts or boosting of unwanted sounds associated with normalization.

    Even two channel, stereo, 24 bit audio will reach -0.5 dB at output stage even if you accidentally set your levels 16dB too low.
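
The headroom argument can be checked with a little arithmetic. The sketch below uses the theoretical dynamic range of linear PCM (about 6.02 dB per bit); real converters deliver less, so treat the numbers as upper bounds. The function names are hypothetical, and the -0.5 dB ceiling and the 16 dB shortfall are the figures quoted above.

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM at the given bit depth."""
    return 20.0 * math.log10(2 ** bits)

def remaining_range_db(bits, recorded_below_full_scale_db):
    """Range left between the recorded peak and the quantization floor."""
    return dynamic_range_db(bits) - recorded_below_full_scale_db

def gain_to_ceiling_db(peak_dbfs, ceiling_dbfs=-0.5):
    """Gain needed at the output stage to bring the peak up to the ceiling."""
    return ceiling_dbfs - peak_dbfs

print("24-bit range:             ", round(dynamic_range_db(24), 1), "dB")
print("24-bit, recorded 16 dB low:", round(remaining_range_db(24, 16.0), 1), "dB")
print("16-bit range:             ", round(dynamic_range_db(16), 1), "dB")
print("make-up gain needed:      ", gain_to_ceiling_db(-16.0), "dB")
```

Even with levels set 16 dB too low, a 24-bit recording keeps more theoretical range than a full-scale 16-bit one, so the make-up gain can be applied at the output stage. Note that any gain, whether from normalization or from the output stage, raises the recorded noise by the same number of dB as the wanted signal.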

    So for example a nice combination in the stack is noise removal+physical modelling+L2 or multiband limiter (or parallel compression in multitrack)=output at limit. You can add the physical modelling as the last stage, but then you usually cannot set your limit as accurately, depending on the output driver of the reverb or modelling filter. You set your limit where you want the audio to wind up.

  • For the clip above, I ended up just removing the noise and adding very basic EQ (removing below 80Hz and above 16kHz). The EQ didn't really have any effect that I could have noticed but maybe there's no harm either. Tried to boost some higher frequencies around 10k but it boosted also the background noise a lot so I left it untouched.

    I tried Waves and iZotope stuff and the iZotope R3 noise removal seemed superior when compared to Waves Z-Noise.

    Then I also added a brickwall limiter at around -0.2 dB to the audio master track in Sony Vegas. This was a trick that I really needed; I didn't think about it earlier, but it now makes life much easier. The same plugin also applies some light compression, which isn't really needed, so I might remove it at some point.
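
For reference, here is what a -0.2 dB ceiling means in linear sample values, as a small sketch. The helper names are hypothetical. Note that this checks sample peaks only; peaks between samples (true peak) can be higher, which is what inter-sample-aware limiters are meant to catch.

```python
import math

def db_to_linear(db):
    """Convert a level in dBFS to a linear amplitude (1.0 = full scale)."""
    return 10.0 ** (db / 20.0)

def peak_dbfs(samples):
    """Sample peak of a signal in dBFS; silence reports minus infinity."""
    peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(peak) if peak > 0.0 else float("-inf")

def exceeds_ceiling(samples, ceiling_db=-0.2):
    """True if any sample is louder than the limiter ceiling."""
    return peak_dbfs(samples) > ceiling_db

print("ceiling as linear amplitude:", round(db_to_linear(-0.2), 4))
print("0.99 peak over ceiling?     ", exceeds_ceiling([0.0, 0.99, -0.5]))
print("0.90 peak over ceiling?     ", exceeds_ceiling([0.9]))
```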

    Removing the "reverb" without affecting too much to the audio seemed to be almost impossible so I let it be.

    Anyway, another question. I want to replace the person's voice in the background. So I replace it with room tone, and then I do voice over. But how to make the voice over sound like it's coming from the same room?

  • Surprised you would try Waves Z noise instead of X hum, but maybe the noise was not an electrical hum. Certainly Izotope makes some good stuff, although the only thing I use of theirs is the "dither".

    I have a very simple rule, which is that if it sounds acceptable, I move on to the next project.

    Having said that...

    Your sample has some noise below 80Hz. So if the EQ did not do anything, there is a problem with the stack of effects, or, if you removed most of the noise, there was nothing left to EQ. Not all EQ is the same; I happen to like the Gerzon EQ but there are several that are decent. You can find the Gerzon in the "renaissance EQ". Personally, I would do the roll off before adding the noise reducer filter in the stack, but then again I would probably try a notch filter as well.

    To make the new audio sound like the old room, the easiest way is usually to go back to the room and use the same mic. But if that is not possible, then you are looking at what we call physical modelling. You can download the free SIR plugin, then start looking for room simulators that match the room you were in. You could also generate an impulse response wav file for the room, in the event that you have access to the room but the speaker is in another place.

    If you are using a stack of effects, you want the limiter, like the L2 limiter, positioned either last or next to last in the stack, after the room simulator, or you can use the limiter at the track level. If you use the room simulator, make sure the "bass boost" is off in this case, if it has one. You will have to experiment with the stack to see what gives you the best results, e.g. source -> roll off -> X hum or notch filter -> noise print -> room simulator -> L2 or multiband.
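
The impulse-response approach to matching the room boils down to convolution: every sample of the dry voice-over triggers a scaled copy of the room's response. A toy sketch of that idea follows. The impulse response here is a made-up six-sample list, not a measured room, and real convolution reverbs use FFT-based methods on responses that are many thousands of samples long.

```python
def convolve(dry, impulse_response):
    """Direct convolution of a dry signal with a room impulse response."""
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, x in enumerate(dry):
        if x == 0.0:
            continue  # silent samples contribute nothing
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h
    return out

# Hypothetical toy impulse response: direct sound plus two decaying reflections.
toy_ir = [1.0, 0.0, 0.0, 0.5, 0.0, 0.25]
dry_voice = [1.0, -1.0, 0.5]

wet_voice = convolve(dry_voice, toy_ir)
print(wet_voice)
```

An impulse response of just `[1.0]` returns the dry signal unchanged, which is a handy sanity check before loading a real room recording.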

    BUT if you are using a DAW with spectral noise reduction built in, like Samplitude, you have some different options. You can remove noise differently on different tracks, or right and left, and so on. So if you have a loud noise on the right, you process that track independently. Makes a difference. And some people do the NR at the end so they can remove coughs and clicks and so on working on the bounced wave file for faster processing.

    I don't see a noise gate in your workflow, I usually do the noise gate manually anyways.

    Feeding the limiter: You should feed the limiter with gain. Don't use the limiter instead of gain. That way, you "limit" the effects of the limiter.

    Anyway there is no "right" way to adjust the stack, just try a few things out, but depending on the DAW the limiter may have to be last in the food chain for the best effect. You can always go back to the original, the layers of effects simply sit on top of the original audio.

    One more trick: you can use progressive reduction limiting, say three passes, first 3dB, then 2dB, then 1dB. Interesting in some cases.

    Other plugins:

    • Waves Renaissance EQ--Gerzon algorithms.
    • L2 Limiter--ppl have tried to make a better one.
    • Steinberg Portico 5033--has more of a "hardware" sound. Think Neve. 5034 is the compressor.
    • Waves H-EQ--can add a warming effect if desired. Cool display.

  • Found this video, it's pure gold as it shows how audio professional post-processes short film's audio and gives great tips:

  • Lots of good info here. Don't instantly reach for the compressor or limiter though, and I wouldn't normalise ever, but horses for courses :) I hardly ever use compression and only use an IS-capable limiter to catch any brief spikes to pass QC, unless I am mixing something for the States, as their "sound" is slightly different from here in the UK. For noise reduction, hum removal and "room tone fill" generation, take a peek at Izotope; it has a few more parameters to focus on the problem areas than Waves IMHO. Also, a nice free artefact removal tool is ISSE, which can provide astonishing results on noises off and stuff like mic clunks etc.

    http://isse.sourceforge.net/

    Rule of thumb is do it right at source and then you don't have to polish the turd, or sprinkle it with glitter later lol

    There are no fixed rules - have fun experimenting :)

  • I can't imagine using this workflow, but different horses for different courses as @soundgh2 says. The fact is, a lot of ppl do things differently. The one thing I would watch out for in dialog is grainy audio; it's hard to listen to. A lot of the audio in the video (and I haven't heard the final, production version) is quite grainy. I would want to chase that down.
    I must point out that a lot of broadcast companies in Europe require conforming to the new (not so new anymore) loudness standards. Everyone needs to get up to speed on that, if you haven't already.

  • Isn't it a little too much effort for 48 hour film with quite shitty contents?

  • @DrDave

    I must point out that a lot of broadcast companies in Europe require conforming to the new (not so new anymore) loudness standards. Everyone needs to get up to speed on that, if you haven't already.

    Hearing many broadcasters on TV I am sure most people will conclude that they have almost no standards considering loudness. :-)

  • You say standard, but even among UK broadcasters there's still some slight deviation on the R128 spec: some are ±1 dB, some ±0.5. Sky diverge dialogue in 5.1 mixes (cheap M&E lol), Discovery are -24 on some and -23 on others, and most others don't know what they want, so it's still always a good idea to check what they've decided this week's flavour is hehe. I mixed an MTV show last week that was still PPM 6, so go figure lol you never know! At least with R128 you have a bit more headroom (to -1dBTP) rather than squishing everything to -10dB these days, so you can allow some room for loud FX and let the dialogue sit in the space it should be in, not fighting with everything.
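
Since the tolerances differ per broadcaster, one way to keep track is to write each delivery spec down and check meter readings against it. A minimal sketch, assuming the readings come from an external loudness meter (this code does not measure loudness itself). The `DeliverySpec` class and the two example specs are hypothetical and only reuse the figures mentioned above, not any broadcaster's actual published requirements.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DeliverySpec:
    target_lufs: float         # integrated programme loudness target
    tolerance_lu: float        # allowed deviation either side of the target
    max_true_peak_dbtp: float  # highest permitted true peak

def meets_spec(integrated_lufs, true_peak_dbtp, spec):
    """Check meter readings against one delivery spec."""
    loud_ok = abs(integrated_lufs - spec.target_lufs) <= spec.tolerance_lu
    peak_ok = true_peak_dbtp <= spec.max_true_peak_dbtp
    return loud_ok and peak_ok

# Hypothetical example specs built from the numbers quoted in this thread.
strict = DeliverySpec(target_lufs=-23.0, tolerance_lu=0.5, max_true_peak_dbtp=-1.0)
relaxed = DeliverySpec(target_lufs=-23.0, tolerance_lu=1.0, max_true_peak_dbtp=-1.0)

# Example meter readings (made up): -23.8 LUFS integrated, -1.5 dBTP.
print("relaxed spec:", meets_spec(-23.8, -1.5, relaxed))
print("strict spec: ", meets_spec(-23.8, -1.5, strict))
```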

  • Right, but there are some standards elsewhere and it's a good idea to check. There is an EU loudness spec--EBU R128 as @soundgh2 noted--that's what I was referring to, but it isn't in use everywhere. Loudness isn't just dB.