Another speech sample reached 216kbps. That's weird (v6 abr > v6 vbr), because my tests showed the opposite, and you yourself said v6 vbr had decreased bit usage considerably (so it should imply higher efficiency, which is what I noticed). ABR with -aac_coder anmr. But I got a solution. And what makes 320kbps particularly bad? The error messages are "av_interleaved_write_frame(): Not enough space" or "Audio encoding failed (avcodec_encode_audio2)". So, for VBR, you make psy work at transparent settings, and compensate bit allocation based on RD scaling. -aac_pred 1 enables -profile:a aac_main and -profile:a aac_main enables -aac_pred 1. ffmpeg77758 -i in.wav -c:a aac -profile:a aac_pred out.mp4 The black bar is 256 samples. I can fix it (and I might post a patch fixing it), but I'm not sure how long it will take, so I don't want to delay committing these advances even more just for something that isn't even a regression. I find it works better; the other was pretty dull for 64k/ch, which ought to be transparent for AAC. I managed to reproduce comment:459, but I'm still investigating it. As for N-54096-ge41bf19, which I got from git: the first sample collapses at 256-432kbps. q=2-31, 200 kb/s, 90k tbn, 29.92 tbc. This is certainly better, although the exact optimal value is debatable. A new issue, or just an existing problem? It is possible that they need to be updated to avoid the crashing, although I don't see how exactly. Kinda like coding dyslexia. This paper, fig. Now the patch does not apply to the git head. I tracked it all the way to codebook_trellis_rate and encode_window_bands_info with short windows. Mahler (brass in general, I guess) still sounds artifacty. Then, make sure that the average tested sample bitrate isn't very far from the "standard" bitrate.
Just doing some regression ABX testing, since the objective (PSNR) A/B script pointed out some seemingly significant regressions (until now I couldn't confirm any with ABX, but I'm not done testing yet). I'd like to use options like -b:a 256k -vbr. v9b version, based on v9, matched behavior against v7. When the optimum allocation spans more than SCALE_MAX_DIFF scalefactors, anmr is careful not to create allocations that result in deltas to be encoded greater than SCALE_MAX_DIFF, but codebook_trellis_rate and the other both undo this. Speech still takes more bits, but fewer than before. It shouldn't be hard. N-54096-ge41bf19 at 352kbps is the worst quality. Ancient FFmpeg versions set the bitrate in units of kbps. This seems to work better, but it's a bit rushed. ffmpeg_r62950_v8f -i %i -c:a aac -strict experimental -q:a 0.7 -aac_coder faac %o Average Speed: 7.5x I'll probably do that tonight. Oops, you said original release 1.2. -aac_coder anmr causes the assertion error at aacenc.c line 399 for all -b:a values and for -q:a 0.1695 or higher. In VBR encoding, speech takes many bits and music takes fewer bits. In that kind of test, v6 vbr sometimes requires lots more bits for some pathological files (techno seems to drive it crazy, can't blame it). I did manage to find a few low-hanging bugs in ANMR, but, and this is quaint, fixing them makes ANMR 10x slower. This process will be massively faster than decoding and reencoding everything: Which means FFmpeg can’t convert audio (stream #0.1) to AAC. Std.Average is the average bitrate of my large collection of encoded CDs. While Claudio has expressed interest in working on ANMR later, I don't think committing a patch that makes something worse than it should be is a good idea. I wouldn't want to waste your time without a warning: this hack can most assuredly be improved. The patch added another entry, so it'd now be size 13.
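To illustrate the SCALE_MAX_DIFF point above (deltas between consecutive scalefactors must stay within the limit, and later stages must not undo that), here is a rough Python sketch. It is not the encoder's code. The value 60 is my assumption for SCALE_MAX_DIFF, the names deltas_ok and clamp_deltas are made up for illustration, and it ignores zeroed bands and special codebooks.

```python
SCALE_MAX_DIFF = 60  # assumed value; check the definition in your FFmpeg tree


def deltas_ok(scalefactors, max_diff=SCALE_MAX_DIFF):
    """Return True if every consecutive scalefactor delta is within max_diff."""
    return all(abs(b - a) <= max_diff
               for a, b in zip(scalefactors, scalefactors[1:]))


def clamp_deltas(scalefactors, max_diff=SCALE_MAX_DIFF):
    """Clamp each scalefactor so it stays within max_diff of its predecessor."""
    out = []
    for sf in scalefactors:
        if out:
            low, high = out[-1] - max_diff, out[-1] + max_diff
            sf = max(low, min(high, sf))
        out.append(sf)
    return out


if __name__ == "__main__":
    sfs = [100, 200, 50]
    print(deltas_ok(sfs))                 # allocation violates the limit
    print(deltas_ok(clamp_deltas(sfs)))   # clamped allocation respects it
```

The idea would be to run a check like deltas_ok after every stage that rewrites scalefactors, so that a later stage undoing the constraint is caught before the bitstream writer asserts.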
It's related to the -Y switch, and LAME sometimes encodes 16~18kHz content. The stereo images are still somewhat buggy, especially in ItCouldBeSweet.wv. This only affects heavy synth samples but should fix a lot of bugs which might be related currently. I need CBR. Alright, I'm attaching a new VBR patch. Now, I let bit allocation zero out beyond 1.2. FDK-AAC still beats FFmpeg's native AAC encoder by a significant margin. Another bug, typically happens when hi-hats are present. So while hackish and probably suboptimal, I'll probably leave it as-is since it works well enough. What should be the next opponents in the next blind listening test including the newer patch? If you require a low audio bitrate, such as ≤ 32 kb/s per channel, then HE-AAC would be worth considering if your player or device can support HE-AAC decoding. Can schedule it for later. This has the worst quality. Keep up the good work on a great piece of software. I've improved TNS and have made it the default (-aac_tns 1). My current impression, from a non-blind test of non-samples: v5 abr = v6 abr > v6 vbr >> v4 abr. Stream #0.0: Video: mpeg4, yuv420p, 320×240 [PAR 1:1 DAR 4:3], BTW... on which revision are you applying v9b? And some transients are indeed encoded up to 22kHz. aacenc.c lines 106 and 133: this worked properly, and the resulting 7350Hz AAC MP4s were playable on FFmpeg, foobar2000 v1.2.9 and Media Player Classic. Whitenoise.flac without the sound of the right channel. Is applying the new LPF method from comment:51 easy? Is it discardable? This causes the assertion error on -q:a 1 on v9b. It is correct since the option aac_pred will enable AAC-Main prediction, even though it's not the name of the profile. @cehoyos: this patch makes ANMR worse than the default twoloop, even though it is theoretically better and takes more time. v5 seems to be stable. A sound that degrades with the FFmpeg native AAC encoder. Anyway, looking forward to hearing about the results of those tests.
If you are only going to play it on your computer, or you are sure that your hardware player supports HE-AAC, you can aim for a bit rate of 160kb/s for version 1, or 128kb/s for version 2. But I don't think I'll waste time improving a hack, since the real solution is to implement a dynamic programming coder, which I intend to do in the future. It did in all the cases I tested. I'm only going to check whether it's an issue with M/S coding (doesn't seem to be), because I'd like the patch set to end up making M/S coding the default. Two crash bugs at 240kbps on both -profile:a aac_ltp and default. I doubt the sanity of the lower spreading function at the highest band, because using the -cutoff 18000 option improves the quality on problematic samples, and these problematic samples always include strong 20-22kHz sounds. So a novice user of modern FFmpeg may set the bitrate like -b:a 128. I will try to hunt for that fatboy issue again tonight, and if no progress is made, I will separate the patch into progressive improvements and test each separately. Hm, I can't seem to replicate either. Avoid the situation where one can hear the 12-20kHz content in some parts of the music and the dull 12kHz LPF-like sound in other parts. I have just finished the blind listening test and the result is here. With all the quantization noise I don't think we care that much about ripple, but even if we did, FFTs can be made to minimize it. I failed to reproduce the results after cutting to distributable short clips. Actually quite well. This encoder is the default AAC encoder, natively implemented in FFmpeg. The whitenoise I mention is generated with the random generator; I'll try with the flac first chance I get. Another type of hole. Yes, the speech bug I noticed, because VBR was unconstrained. I'd like to hear your opinion.
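On the -b:a 128 pitfall mentioned above (modern FFmpeg reads a bare number as bits per second, so 128 is not 128k), a small Python sketch of a sanity check. The helper names parse_bitrate and probably_meant_kbps and the 1000 bit/s threshold are my own assumptions, and only the k and M suffixes are handled, which is a simplification of what FFmpeg accepts.

```python
_SUFFIXES = {"k": 1_000, "K": 1_000, "M": 1_000_000}


def parse_bitrate(text):
    """Convert a bitrate string such as '128k' or '128' to bits per second."""
    text = text.strip()
    if text and text[-1] in _SUFFIXES:
        return int(float(text[:-1]) * _SUFFIXES[text[-1]])
    return int(float(text))


def probably_meant_kbps(text, threshold=1_000):
    """Flag values so low that the user likely forgot the 'k' suffix."""
    return parse_bitrate(text) < threshold


if __name__ == "__main__":
    for arg in ("128", "128k", "1.5M"):
        bps = parse_bitrate(arg)
        note = " (did you mean %sk?)" % arg if probably_meant_kbps(arg) else ""
        print("-b:a %s -> %d bit/s%s" % (arg, bps, note))
```

A wrapper script could run this over the -b:a argument before launching the encoder and warn instead of silently encoding at a useless bitrate.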
Stream #0.0: Video: h264, yuv420p, 320×240 [PAR 1:1 DAR 4:3], 66 kb/s, Might look into a quick fix but it's a lower priority than reviewing that patch, considering how many artifacts it fixes. mybloodrusts.flac encoded at -b:a 128k by ffmpeg75043-gb31041a. v9 VBR performs better too except in fatboy; I'll analyze the differences to v7 next to see why that's so. I thought of using 32-bit float as the intermediate format, but FFmpeg's float pcm_f32le output had half the gain it should have, and even after adjusting the gain, a large error (average of |lossy-original|) remained, unlike with faad or madplay. I'm thinking of conducting a personal listening test of the stable v7 or the experimental M/S-enabled v8g (or whatever is latest). Maybe I am missing something. No, the current git master contains no changes from the previous v9b or the future v9c yet. This VBR is experimental and likely to get even worse results than the CBR. Great to see that the native AAC encoder is getting some attention, and that there is an effort to make it mainstream. Male speech gets a higher bitrate than female speech. I have tested -b:a 128k -ar 44100, -b:a 192k -ar 32000, -b:a 320k -ar 48000, -q:a 1 -ar 44100, -q:a 2 -ar 48000, without additional -aac_pns enable or -aac_is enable settings. But for Nero, vbr may not be noticeably superior to abr. These settings target a specific bit rate, with less variation between samples. These six outputs were shuffled and I listened to them without knowing which is which. The v6 patch is already a very good one, so I'm very satisfied with the current quality, but fairness can be a problem, so if you have the version that reduces bitrate on speech samples, I'd like to test the one with reduced bitrate. Still, everything seems quite specific to Opus. My checkout has no such commit hash. If you can attach the rare samples I can debug. The patch does not apply here to current git head. It sounds like a stopwatch. I'll try my best to find problems before the test.
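For the error measure mentioned above (average of |lossy-original|, checked again after adjusting gain), a minimal Python sketch on plain lists of samples. The function names are hypothetical, the least-squares gain fit is my own choice of how to "adjust gain", and it assumes the two signals are already time-aligned, which real decoder output usually is not because of encoder delay.

```python
def mean_abs_error(original, decoded):
    """Average of |decoded - original| over the overlapping samples."""
    pairs = list(zip(original, decoded))
    if not pairs:
        raise ValueError("no samples to compare")
    return sum(abs(d - o) for o, d in pairs) / len(pairs)


def best_gain(original, decoded):
    """Least-squares gain g that minimises sum((g * decoded - original) ** 2)."""
    pairs = list(zip(original, decoded))
    num = sum(o * d for o, d in pairs)
    den = sum(d * d for _, d in pairs)
    return num / den if den else 1.0


if __name__ == "__main__":
    original = [0.0, 0.5, -0.5, 1.0]
    decoded = [0.0, 0.25, -0.25, 0.5]   # same signal at half gain
    g = best_gain(original, decoded)
    print("raw error:", mean_abs_error(original, decoded))
    print("gain:", g)
    print("error after gain fix:",
          mean_abs_error(original, [g * d for d in decoded]))
```

If the error stays large after the gain fit, as reported for pcm_f32le, the difference is not a pure scaling problem and something else (alignment, clipping, format conversion) is worth checking.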
Kamedo2: Not sure how but I got an email invitation from Shion to Slack (Audio Video Encoding Community) which you are apparently a member of. I reproduced the aacenc.c assertion errors on ARM, but not the audio_frame_queue.c assertion error on comment:484. Feel free to choose. However, I tested most of the samples in your session, and they've all improved. EDIT 1: as @Gyan suggested, I've tried reducing the -bufsize parameter, but the bitrate is still too high. When the native AAC encoder calculates a masking curve, almost inaudible sounds such as 18kHz, 20kHz and 22kHz are taken into account, and audible sounds such as 14kHz are masked by the inaudible ones. Anything higher may benefit more from AAC-LC due to less processing. And on the RD-reduction step, both v7 and v8g were assuming decreasing scalefactors had a predictable effect on distortion, and v9 just recomputes distortion, which proved to be a big improvement on VBR. It also has a more robust tonality boost (form factor in this patch) method, which accounts for what psy already does (I noticed it does its own bit). Releases 1.2 and 1.2.1 have a much narrower degradation range, and in 1.2* the degradation is less severe within that range. Sounds good, that resolves my concerns. I think it will contribute to higher quality at 160kbps and 192kbps.