ffplay Source Code Analysis 6: Audio Resampling

Author: leisure. This article is the author's original work; please cite the source when reposting: https://www.cnblogs.com/leisure_chn/p/10312713.html

ffplay is the simple player bundled with the FFmpeg project. It uses the decoders provided by FFmpeg together with the SDL library to play video. This article analyzes FFmpeg 4.1; the ffplay source is at:
https://github.com/FFmpeg/FFmpeg/blob/n4.1/fftools/ffplay.c

Before diving into the source, the following reference articles may serve as background:
[1]. Lei Xiaohua, A Zero-Basis Learning Method for Video and Audio Codec Technology
[2]. Basic Concepts of Video Encoding and Decoding
[3]. Color Spaces and Pixel Formats
[4]. Audio Parameters Explained
[5]. FFmpeg Basic Concepts

The articles in the "ffplay Source Code Analysis" series are:
[1]. ffplay Source Code Analysis 1: Overview
[2]. ffplay Source Code Analysis 2: Data Structures
[3]. ffplay Source Code Analysis 3: Code Framework
[4]. ffplay Source Code Analysis 4: Audio-Video Synchronization
[5]. ffplay Source Code Analysis 5: Image Format Conversion
[6]. ffplay Source Code Analysis 6: Audio Resampling
[7]. ffplay Source Code Analysis 7: Playback Control

6. Audio Resampling

The audio frames FFmpeg decodes are not necessarily in a format SDL supports. When they are not, the audio must be resampled, that is, converted to a format SDL supports; otherwise it cannot be played correctly.

Audio resampling involves two steps:
1) Preparation when the audio device is opened: determine the audio format SDL supports, which becomes the target format for later resampling.
2) In the audio playback thread, after an audio frame is fetched, resample it if needed (the frame format does not match the SDL-supported format); otherwise output it directly.

6.1 Opening the Audio Device

The audio device is actually opened in the demuxing thread. The demuxing thread first opens the audio device (registering the callback that the SDL audio playback thread will call) and then creates the audio decoding thread. The call chain is:

```
main() -->
stream_open() -->
read_thread() -->
stream_component_open() -->
    audio_open(is, channel_layout, nb_channels, sample_rate, &is->audio_tgt);
    decoder_start(&is->auddec, audio_thread, is);
```

audio_open() fills in the desired audio parameters and opens the audio device, then stores the actual parameters in the output argument is->audio_tgt. The audio playback thread later uses these parameters to resample the raw audio data into the format the audio device supports.

```c
static int audio_open(void *opaque, int64_t wanted_channel_layout, int wanted_nb_channels, int wanted_sample_rate, struct AudioParams *audio_hw_params)
{
    SDL_AudioSpec wanted_spec, spec;
    const char *env;
    static const int next_nb_channels[] = {0, 0, 1, 6, 2, 6, 4, 6};
    static const int next_sample_rates[] = {0, 44100, 48000, 96000, 192000};
    int next_sample_rate_idx = FF_ARRAY_ELEMS(next_sample_rates) - 1;

    env = SDL_getenv("SDL_AUDIO_CHANNELS");
    if (env) {  // If the environment variable is set, it takes priority for the channel count and channel layout
        wanted_nb_channels = atoi(env);
        wanted_channel_layout = av_get_default_channel_layout(wanted_nb_channels);
    }
    if (!wanted_channel_layout || wanted_nb_channels != av_get_channel_layout_nb_channels(wanted_channel_layout)) {
        wanted_channel_layout = av_get_default_channel_layout(wanted_nb_channels);
        wanted_channel_layout &= ~AV_CH_LAYOUT_STEREO_DOWNMIX;
    }
    // Derive nb_channels from channel_layout; this corrects wanted_nb_channels when the value passed in does not match
    wanted_nb_channels = av_get_channel_layout_nb_channels(wanted_channel_layout);
    wanted_spec.channels = wanted_nb_channels;  // Number of channels
    wanted_spec.freq = wanted_sample_rate;      // Sample rate
    if (wanted_spec.freq <= 0 || wanted_spec.channels <= 0) {
        av_log(NULL, AV_LOG_ERROR, "Invalid sample rate or channel count!\n");
        return -1;
    }
    // Find the largest entry in next_sample_rates that is below wanted_spec.freq;
    // it is the first sample rate to fall back to if opening the device fails
    while (next_sample_rate_idx && next_sample_rates[next_sample_rate_idx] >= wanted_spec.freq)
        next_sample_rate_idx--;

    // Audio sample formats fall into two families: planar and packed. For a two-channel
    // file, with L denoting a left-channel sample and R a right-channel sample:
    //   planar layout: (plane1)LLLLLLLL...LLLL (plane2)RRRRRRRR...RRRR
    //   packed layout: (plane1)LRLRLRLR...........................LRLR
    // Each family has several sample formats, such as AV_SAMPLE_FMT_S16 and AV_SAMPLE_FMT_S16P.
    // Note that SDL 2.0 currently does not support planar formats.
    // channel_layout is an int64_t that describes the channel layout, one bit per channel;
    // see the definitions in channel_layout.h.
    // Data rate (bits/second) = sample rate (Hz) * sample depth (bits) * number of channels
    wanted_spec.format = AUDIO_S16SYS;  // Sample format: S = signed, 16 = sample depth in bits, SYS = system byte order; the macro is defined by SDL
    wanted_spec.silence = 0;            // Silence value
    wanted_spec.samples = FFMAX(SDL_AUDIO_MIN_BUFFER_SIZE, 2 << av_log2(wanted_spec.freq / SDL_AUDIO_MAX_CALLBACKS_PER_SEC));  // SDL audio buffer size in sample frames (one frame holds one sample for every channel)
    wanted_spec.callback = sdl_audio_callback;  // Callback; if NULL, the SDL_QueueAudio() mechanism must be used instead
    wanted_spec.userdata = opaque;              // Argument passed to the callback

    // Open the audio device and create the audio processing thread. wanted_spec holds the
    // desired parameters; spec receives the actual hardware parameters.
    // 1) SDL offers two ways for the audio device to obtain audio data:
    //    a. pull: SDL invokes the callback at a certain frequency, and the callback supplies the audio data
    //    b. push: the application calls SDL_QueueAudio() at a certain frequency to feed the device;
    //       in this case wanted_spec.callback = NULL
    // 2) After it is opened, the device plays silence and the callback is not invoked. Callbacks and
    //    normal playback start once the device is unpaused (SDL_PauseAudio(0), or
    //    SDL_PauseAudioDevice(audio_dev, 0) for a device opened with SDL_OpenAudioDevice())
    // When the first argument of SDL_OpenAudioDevice() is NULL, the default device is opened, much like SDL_OpenAudio()
    while (!(audio_dev = SDL_OpenAudioDevice(NULL, 0, &wanted_spec, &spec, SDL_AUDIO_ALLOW_FREQUENCY_CHANGE | SDL_AUDIO_ALLOW_CHANNELS_CHANGE))) {
        av_log(NULL, AV_LOG_WARNING, "SDL_OpenAudio (%d channels, %d Hz): %s\n",
               wanted_spec.channels, wanted_spec.freq, SDL_GetError());
        // If opening fails, retry with a different channel count or sample rate.
        // The fallback order looks a little odd; it is not examined further here.
        wanted_spec.channels = next_nb_channels[FFMIN(7, wanted_spec.channels)];
        if (!wanted_spec.channels) {
            wanted_spec.freq = next_sample_rates[next_sample_rate_idx--];
            wanted_spec.channels = wanted_nb_channels;
            if (!wanted_spec.freq) {
                av_log(NULL, AV_LOG_ERROR,
                       "No more combinations to try, audio open failed\n");
                return -1;
            }
        }
        wanted_channel_layout = av_get_default_channel_layout(wanted_spec.channels);
    }
    // Check the actual parameters of the opened device: sample format
    if (spec.format != AUDIO_S16SYS) {
        av_log(NULL, AV_LOG_ERROR,
               "SDL advised audio format %d is not supported!\n", spec.format);
        return -1;
    }
    // Check the actual parameters of the opened device: channel count
    if (spec.channels != wanted_spec.channels) {
        wanted_channel_layout = av_get_default_channel_layout(spec.channels);
        if (!wanted_channel_layout) {
            av_log(NULL, AV_LOG_ERROR,
                   "SDL advised channel count %d is not supported!\n", spec.channels);
            return -1;
        }
    }

    // wanted_spec holds the desired parameters and spec the actual ones; both are SDL structures.
    // audio_hw_params is an FFmpeg-side structure, filled in here as output for the caller.
    audio_hw_params->fmt = AV_SAMPLE_FMT_S16;
    audio_hw_params->freq = spec.freq;
    audio_hw_params->channel_layout = wanted_channel_layout;
    audio_hw_params->channels = spec.channels;
    audio_hw_params->frame_size = av_samples_get_buffer_size(NULL, audio_hw_params->channels, 1, audio_hw_params->fmt, 1);
    audio_hw_params->bytes_per_sec = av_samples_get_buffer_size(NULL, audio_hw_params->channels, audio_hw_params->freq, audio_hw_params->fmt, 1);
    if (audio_hw_params->bytes_per_sec <= 0 || audio_hw_params->frame_size <= 0) {
        av_log(NULL, AV_LOG_ERROR, "av_samples_get_buffer_size failed\n");
        return -1;
    }
    return spec.size;
}
```

Opening the audio device touches on basic concepts of how FFmpeg stores audio. For clarity, the relevant comments are excerpted below.

6.1.1 Audio Formats

**planar & packed**
Audio sample formats fall into two families: planar and packed. For a two-channel audio file, with L denoting a left-channel sample and R a right-channel sample:
planar layout: (plane1)LLLLLLLL...LLLL (plane2)RRRRRRRR...RRRR
packed layout: (plane1)LRLRLRLR...........................LRLR
Each family has several sample formats, such as AV_SAMPLE_FMT_S16 and AV_SAMPLE_FMT_S16P. Note that SDL 2.0 currently does not support planar formats.

The SDL structure that describes audio parameters is defined as follows:

```c
/**
 *  The calculated values in this structure are calculated by SDL_OpenAudio().
 *
 *  For multi-channel audio, the default SDL channel mapping is:
 *  2:  FL FR                       (stereo)
 *  3:  FL FR LFE                   (2.1 surround)
 *  4:  FL FR BL BR                 (quad)
 *  5:  FL FR FC BL BR              (quad + center)
 *  6:  FL FR FC LFE SL SR          (5.1 surround - last two can also be BL BR)
 *  7:  FL FR FC LFE BC SL SR       (6.1 surround)
 *  8:  FL FR FC LFE BL BR SL SR    (7.1 surround)
 */
typedef struct SDL_AudioSpec
{
    int freq;                   /**< DSP frequency -- samples per second */
    SDL_AudioFormat format;     /**< Audio data format */
    Uint8 channels;             /**< Number of channels: 1 mono, 2 stereo */
    Uint8 silence;              /**< Audio buffer silence value (calculated) */
    Uint16 samples;             /**< Audio buffer size in sample FRAMES (total samples divided by channel count) */
    Uint16 padding;             /**< Necessary for some compile environments */
    Uint32 size;                /**< Audio buffer size in bytes (calculated) */
    SDL_AudioCallback callback; /**< Callback that feeds the audio device (NULL to use SDL_QueueAudio()). */
    void *userdata;             /**< Userdata passed to callback (ignored for NULL callbacks). */
} SDL_AudioSpec;
```

The SDL audio formats are defined as follows:

```c
/**
 *  \brief Audio format flags.
 *
 *  These are what the 16 bits in SDL_AudioFormat currently mean...
 *  (Unspecified bits are always zero).
 *
 *  \verbatim
    ++-----------------------sample is signed if set
    ||
    ||       ++-----------sample is bigendian if set
    ||       ||
    ||       ||          ++---sample is float if set
    ||       ||          ||
    ||       ||          || +---sample bit size---+
    ||       ||          || |                     |
    15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
    \endverbatim
 *
 *  There are macros in SDL 2.0 and later to query these bits.
 */
typedef Uint16 SDL_AudioFormat;

/**
 *  \name Audio format flags
 *
 *  Defaults to LSB byte order.
 */
/* @{ */
#define AUDIO_U8        0x0008  /**< Unsigned 8-bit samples */
#define AUDIO_S8        0x8008  /**< Signed 8-bit samples */
#define AUDIO_U16LSB    0x0010  /**< Unsigned 16-bit samples */
#define AUDIO_S16LSB    0x8010  /**< Signed 16-bit samples */
#define AUDIO_U16MSB    0x1010  /**< As above, but big-endian byte order */
#define AUDIO_S16MSB    0x9010  /**< As above, but big-endian byte order */
#define AUDIO_U16       AUDIO_U16LSB
#define AUDIO_S16       AUDIO_S16LSB
/* @} */
```

The FFmpeg data structures that describe audio parameters are:

```c
// This structure is defined in ffplay.c:
typedef struct AudioParams {
    int freq;
    int channels;
    int64_t channel_layout;
    enum AVSampleFormat fmt;
    int frame_size;
    int bytes_per_sec;
} AudioParams;

/**
 * Audio sample formats
 *
 * - The data described by the sample format is always in native-endian order.
 *   Sample values can be expressed by native C types, hence the lack of a signed
 *   24-bit sample format even though it is a common raw audio data format.
 *
 * - The floating-point formats are based on full volume being in the range
 *   [-1.0, 1.0]. Any values outside this range are beyond full volume level.
 *
 * - The data layout as used in av_samples_fill_arrays() and elsewhere in FFmpeg
 *   (such as AVFrame in libavcodec) is as follows:
 *
 * @par
 * For planar sample formats, each audio channel is in a separate data plane,
 * and linesize is the buffer size, in bytes, for a single plane. All data
 * planes must be the same size. For packed sample formats, only the first data
 * plane is used, and samples for each channel are interleaved. In this case,
 * linesize is the buffer size, in bytes, for the 1 plane.
 *
 */
enum AVSampleFormat {
    AV_SAMPLE_FMT_NONE = -1,
    AV_SAMPLE_FMT_U8,          ///< unsigned 8 bits
    AV_SAMPLE_FMT_S16,         ///< signed 16 bits
    AV_SAMPLE_FMT_S32,         ///< signed 32 bits
    AV_SAMPLE_FMT_FLT,         ///< float
    AV_SAMPLE_FMT_DBL,         ///< double

    AV_SAMPLE_FMT_U8P,         ///< unsigned 8 bits, planar
    AV_SAMPLE_FMT_S16P,        ///< signed 16 bits, planar
    AV_SAMPLE_FMT_S32P,        ///< signed 32 bits, planar
    AV_SAMPLE_FMT_FLTP,        ///< float, planar
    AV_SAMPLE_FMT_DBLP,        ///< double, planar
    AV_SAMPLE_FMT_S64,         ///< signed 64 bits
    AV_SAMPLE_FMT_S64P,        ///< signed 64 bits, planar

    AV_SAMPLE_FMT_NB           ///< Number of sample formats. DO NOT USE if linking dynamically
};
```

**channel_layout**
channel_layout is an int64_t that describes the audio channel layout, with each bit representing one specific channel. See the definitions in channel_layout.h:

```c
/**
 * @defgroup channel_masks Audio channel masks
 *
 * A channel layout is a 64-bits integer with a bit set for every channel.
 * The number of bits set must be equal to the number of channels.
 * The value 0 means that the channel layout is not known.
 * @note this data structure is not powerful enough to handle channels
 * combinations that have the same channel multiple times, such as
 * dual-mono.
 *
 * @{
 */
#define AV_CH_FRONT_LEFT             0x00000001
#define AV_CH_FRONT_RIGHT            0x00000002
#define AV_CH_FRONT_CENTER           0x00000004
#define AV_CH_LOW_FREQUENCY          0x00000008
#define AV_CH_BACK_LEFT              0x00000010
#define AV_CH_BACK_RIGHT             0x00000020
#define AV_CH_FRONT_LEFT_OF_CENTER   0x00000040
#define AV_CH_FRONT_RIGHT_OF_CENTER  0x00000080
#define AV_CH_BACK_CENTER            0x00000100
#define AV_CH_SIDE_LEFT              0x00000200
#define AV_CH_SIDE_RIGHT             0x00000400
#define AV_CH_TOP_CENTER             0x00000800
#define AV_CH_TOP_FRONT_LEFT         0x00001000
#define AV_CH_TOP_FRONT_CENTER       0x00002000
#define AV_CH_TOP_FRONT_RIGHT        0x00004000
#define AV_CH_TOP_BACK_LEFT          0x00008000
#define AV_CH_TOP_BACK_CENTER        0x00010000
#define AV_CH_TOP_BACK_RIGHT         0x00020000
#define AV_CH_STEREO_LEFT            0x20000000  ///< Stereo downmix.
#define AV_CH_STEREO_RIGHT           0x4000000
```