PyAV Code Analysis

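PyAV provides a Python interface to ffmpeg. In practice it uses ffmpeg as its backend and wraps the ffmpeg API with Cython, so the code that actually runs is still ffmpeg.

In other words, PyAV wraps the API that ffmpeg provides in classes, so the key to using it well is understanding its overall architecture.
PyAV wraps several of ffmpeg's key structures in classes: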

Name	Role
Packet	Wraps ffmpeg's AVPacket
Frame	Wraps ffmpeg's AVFrame
Stream	Wraps ffmpeg's AVStream
Option	Wraps ffmpeg's AVOption
InputContainer	Wraps ffmpeg's avformat_open_input (demux)
OutputContainer	Wraps ffmpeg's av_interleaved_write_frame (mux)
CodecContext	Wraps ffmpeg's codec-related code

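To use it: if you have your own build of ffmpeg, compile and install it first, then run:

pip install av --no-binary av

If you do not have your own ffmpeg:

pip install av

Once installed, it is ready to use.

Let's start with a few simple examples: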

import os
import subprocess
import logging
import time

logging.basicConfig(level=logging.DEBUG)
logging.getLogger('libav').setLevel(logging.DEBUG)

import av
import av.datasets

# We want an H.264 stream in the Annex B byte-stream format.
# We haven't exposed bitstream filters yet, so we're gonna use the `ffmpeg` CLI.
h264_path = "libx264_640x360_baseline_5_frames.h264"
# if not os.path.exists(h264_path):
#     subprocess.check_call(
#         [
#             "ffmpeg",
#             "-i",
#             av.datasets.curated("zzsin_1920x1080_60fps_60s.mp4"),
#             "-vcodec",
#             "copy",
#             "-an",
#             "-bsf:v",
#             "h264_mp4toannexb",
#             h264_path,
#         ]
#     )

fh = open(h264_path, "rb")

codec = av.CodecContext.create("h264_efcodec", "r")
codec.options = {"hw_id": "15"}

print(codec.name)
first = True
count = 0
while True:
    chunk = fh.read(1 << 16)
    packets = codec.parse(chunk)
    print("Parsed {} packets from {} bytes:".format(len(packets), len(chunk)))

    for packet in packets:
        print("   ", packet)
        frames = codec.decode(packet)
        if first:
            time.sleep(2)
            first = False
        for frame in frames:
            print("       ", frame)
            count += 1
            print('--count:%d--' % count)
            frame.to_image().save(
                "night-sky.{:04d}.jpg".format(count),
                quality=80,
            )

    # We wait until the end to bail so that the last empty `buf` flushes
    # the parser.
    if not chunk:
        break

p = av.Packet(None)
print("send eos:", p)
frames = codec.decode(p)
for frame in frames:
    print("       ", frame)
    count += 1
    print('--count:%d--' % count)
    frame.to_image().save(
        "night-sky.{:04d}.jpg".format(count),
        quality=80,
    )

print('all count:%d' % count)

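The example above decodes by creating a codec directly. The decoder name can be given at creation time, and options can be set on it. One thing to note: only Annex B video streams can be parsed here. AVCC video streams cannot be parsed this way.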
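Since the parser in this example only accepts Annex B input, it can help to look at the first bytes of a stream before feeding it in. The sketch below uses only the standard library; `looks_like_annexb` and `split_annexb` are hypothetical helpers, not part of PyAV. It relies on the fact that Annex B separates NAL units with a 00 00 01 or 00 00 00 01 start code, while AVCC prefixes each NAL unit with its length. The check is a heuristic: an AVCC length field can happen to look like a start code.

```python
def looks_like_annexb(data: bytes) -> bool:
    """Heuristic: True if the data begins with an Annex B start code."""
    return data.startswith(b"\x00\x00\x01") or data.startswith(b"\x00\x00\x00\x01")


def split_annexb(data: bytes) -> list:
    """Split an Annex B byte stream into NAL units with start codes removed."""
    nals = []
    i = data.find(b"\x00\x00\x01")
    while i != -1:
        start = i + 3
        j = data.find(b"\x00\x00\x01", start)
        end = len(data) if j == -1 else j
        # Zero bytes before the next start code belong to a 4-byte start
        # code or to padding, never to the NAL unit itself.
        nal = data[start:end].rstrip(b"\x00")
        if nal:
            nals.append(nal)
        i = j
    return nals


# Illustrative bytes only: three tiny fake NAL units (SPS, PPS, IDR headers).
sample = b"\x00\x00\x00\x01\x67\x42" + b"\x00\x00\x01\x68\xce" + b"\x00\x00\x00\x01\x65\x88"
nal_types = [nal[0] & 0x1F for nal in split_annexb(sample)]
print(looks_like_annexb(sample), nal_types)
```

The low five bits of the first byte of each H.264 NAL unit give its type, which is how the usage lines recover the SPS, PPS and IDR types from the sample.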

Here is another example, this time using demux together with the container's codec:

import time

import av
import av.datasets

container = av.open('ocr_400_400_5_frames.mp4')
first = True
count = 0
start_time = time.time()
for packet in container.demux():
    print(packet)
    for frame in packet.decode():
        print(frame)
        count += 1
        print('---frame:%d---' % count)
    if first:
        time.sleep(2)
        first = False

auto_time = time.time() - start_time
container.close()
print('all frame:%d' % count)

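The codec here is a decoder built into the container, and with this approach you cannot choose which specific decoder is used.

Combining the two examples above, we can decode as follows: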

import os
import subprocess
import logging
import time

import av

logging.basicConfig(level=logging.DEBUG)
logging.getLogger('libav').setLevel(logging.DEBUG)

h264_path = "ocr_400_400_5_frames.mp4"
input_ = av.open(h264_path, options={"vsync": "0"})
in_stream = input_.streams.video[0]

codec = av.CodecContext.create("h264_efcodec", "r")
codec.options = {"hw_id": "15"}
# codec.options={"device_id":"0"}
print(codec.name)
# print(codec.extradata_size)
codec.extradata = in_stream.codec_context.extradata

first = True
num = 0

for packet in input_.demux(in_stream):
    print('----packet---')
    packet.dts = 0
    packet.pts = 0
    print("   ", packet)
    frames = codec.decode(packet)
    print('---after decode---')
    if first:
        time.sleep(2)
        first = False
    for frame in frames:
        print("       ", frame)
        num += 1
        print('-----frame:%d-----' % num)

print('all:%d' % num)

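This example combines the first two approaches: demux, plus decode with a decoder of our own choosing. It may look simple, but I only worked it out after reading through the entire wrapper source code. The one drawback is that the InputContainer still allocates its own codec internally, which costs some extra memory, but that hardly matters.

You may have noticed the sleep(2) in the examples above. Why is it there? Mainly because open() on our hardware decoder takes a long time, so the sleep waits for the underlying hardware decoder to start up completely. Without it, all of the input data can be sent before the decoder has produced a single frame. This also exposes a limitation of PyAV: you can only call the wrapped decode() interface, not the finer-grained ffmpeg functions, so you cannot fetch decoded data by calling avcodec_receive_frame() in a loop the way ffmpeg code does: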

static int decode(AVCodecContext *dec_ctx) {
    int ret;
    AVPacket packet;
    AVFrame *p_frame;
    int eos = 0;

    p_frame = av_frame_alloc();

    while (1) {
        ret = av_read_frame(g_ifmt_ctx, &packet);
        if (ret == AVERROR_EOF) {
            av_log(g_dec_ctx, AV_LOG_INFO, "av_read_frame got eof\n");
            eos = 1;
        } else if (ret < 0) {
            av_log(g_dec_ctx, AV_LOG_ERROR, "av_read_frame failed, ret(%d)\n", ret);
            goto fail;
        }

        if (packet.stream_index != video_stream_idx) {
            av_packet_unref(&packet);
            continue;
        }

        ret = avcodec_send_packet(dec_ctx, &packet);
        if (ret < 0) {
            av_log(dec_ctx, AV_LOG_ERROR,
                   "send pkt failed, ret(%d), %s, %d\n", ret, __FILE__, __LINE__);
            goto fail;
        }

        // This loop drains the decoded YUV data from the decoder.
        while (ret >= 0 || eos) {
            ret = avcodec_receive_frame(dec_ctx, p_frame);
            if (ret == AVERROR_EOF) {
                av_log(g_dec_ctx, AV_LOG_INFO, "dec receive eos\n");
                av_frame_unref(p_frame);
                av_frame_free(&p_frame);
                return 0;
            } else if (ret == 0) {
                save_yuv_file(dec_ctx, p_frame);
                av_frame_unref(p_frame);
            } else if (ret < 0 && ret != AVERROR(EAGAIN)) {
                av_log(dec_ctx, AV_LOG_ERROR, "receive frame failed\n");
                goto fail;
            }
        }
        av_packet_unref(&packet);
    }

fail:
    av_frame_free(&p_frame);
    return -1;
}
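The send/receive contract in the C function can be tried out without any hardware using a toy model. `ToyDecoder`, `TryAgain`, `EndOfStream` and `decode_all` below are all hypothetical names; the model only imitates the behaviour of avcodec_send_packet() and avcodec_receive_frame(), assuming a decoder that holds back a fixed number of packets (its latency) before it emits anything. It shows why a slow-starting decoder yields no frames until either enough input has been queued or a flush packet (None here) is sent.

```python
from collections import deque


class TryAgain(Exception):
    """Stands in for AVERROR(EAGAIN): no frame is ready yet."""


class EndOfStream(Exception):
    """Stands in for AVERROR_EOF: the decoder is fully drained."""


class ToyDecoder:
    """Toy decoder that buffers `latency` packets before emitting frames."""

    def __init__(self, latency):
        self.latency = latency
        self.pending = deque()
        self.flushing = False

    def send_packet(self, packet):
        # Sending None plays the role of the flush (end-of-stream) packet.
        if packet is None:
            self.flushing = True
        else:
            self.pending.append(packet)

    def receive_frame(self):
        ready = self.flushing or len(self.pending) > self.latency
        if self.pending and ready:
            return "frame<%s>" % self.pending.popleft()
        if self.flushing:
            raise EndOfStream
        raise TryAgain


def decode_all(decoder, packets):
    """Send every packet, draining all available frames after each send."""
    frames = []
    for packet in list(packets) + [None]:
        decoder.send_packet(packet)
        while True:
            try:
                frames.append(decoder.receive_frame())
            except TryAgain:
                break  # decoder needs more input
            except EndOfStream:
                return frames
    return frames


print(decode_all(ToyDecoder(latency=2), ["p0", "p1", "p2"]))
```

With a latency larger than the number of packets, every frame in this model comes out only after the flush, which mirrors the situation the sleep(2) works around.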

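That covers the basic usage of PyAV. Next, the architecture.
Packet class:
It mainly wraps AVPacket and provides properties for getting and setting the packet's members. Its to_bytes() function converts the data into a bytes object, and its decode() function decodes by following the packet's internal stream pointer to the codec.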

Kind	Value
Member variable 1	AVPacket* ptr
Member variable 2	Stream _stream
Property	stream_index
Property	stream
Property	time_base
Property	pts
Property	dts
Property	pos
Property	size
Property	is_keyframe
Property	is_corrupt
Property	buffer_size (the size of the packet's data)
Property	buffer_ptr
Fun	to_bytes() (converts the data to Python bytes)
Fun	decode() (self._stream.decode(self))
Constructor	self.ptr = lib.av_packet_alloc()
Destructor	lib.av_packet_free(&self.ptr)
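The delegation in decode() can be pictured with a small pure-Python model. The `Toy` classes below are hypothetical and only mirror the call chain listed in the table (packet.decode() calls stream.decode(packet), which hands the packet to the codec); they are not PyAV's Cython implementation.

```python
class ToyCodecContext:
    """Stands in for CodecContext: turns a packet into a list of frames."""

    def __init__(self, name):
        self.name = name

    def decode(self, packet):
        # A real decoder may return zero or more frames per packet.
        return ["%s-frame(pts=%s)" % (self.name, packet.pts)]


class ToyStream:
    """Stands in for Stream: owns a codec context and forwards to it."""

    def __init__(self, index, codec_context):
        self.index = index
        self.codec_context = codec_context

    def decode(self, packet):
        return self.codec_context.decode(packet)


class ToyPacket:
    """Stands in for Packet: holds data and a reference to its stream."""

    def __init__(self, data, pts, stream):
        self.data = data
        self.pts = pts
        self._stream = stream

    @property
    def size(self):
        return len(self.data)

    def to_bytes(self):
        return bytes(self.data)

    def decode(self):
        # Same shape as the table entry: self._stream.decode(self)
        return self._stream.decode(self)


stream = ToyStream(0, ToyCodecContext("h264"))
packet = ToyPacket(bytearray(b"\x00\x01\x02"), pts=40, stream=stream)
print(packet.size, packet.to_bytes(), packet.decode())
```

In this model the packet never touches the codec directly, so whichever codec the stream owns is the one that decodes.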

Frame class: this is a base class, so it holds only basic information.

Kind	Value
Member variable 1	AVFrame *ptr
Member variable 2	int index
Member variable 3	AVRational _time_base
Member variable 4	_SideDataContainer _side_data
Property	dts
Property	pts
Property	time (the presentation time in seconds for this frame)
Property	time_base (fractions.Fraction)
Property	is_corrupt (is this frame corrupt?)
Property	side_data
Constructor	self.ptr = lib.av_frame_alloc()
Destructor	lib.av_frame_free(&self.ptr)
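The time property ties pts and time_base together: the presentation time in seconds is pts multiplied by time_base. A minimal standard-library sketch of that arithmetic follows; `frame_time` is a hypothetical helper, and the 1/90000 time base is only an illustrative value.

```python
from fractions import Fraction


def frame_time(pts, time_base):
    """Return the presentation time in seconds, or None if pts is unknown."""
    if pts is None:
        return None
    return float(pts * time_base)


time_base = Fraction(1, 90000)  # assumed value, for illustration only
print(frame_time(180000, time_base))  # prints 2.0
```

Keeping time_base as a Fraction avoids rounding until the final conversion to float.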

VideoFrame class:
This class inherits from Frame. Besides exposing the fields of the underlying AVFrame, it provides functions for color space conversion (CSC), for saving to JPEG, and for building a frame from an image or a NumPy array.

Kind	Value
Member variable 1	VideoReformatter reformatter
Member variable 2	VideoFormat format
Property	width
Property	height
Property	key_frame
Property	interlaced_frame
Property	pict_type
Property	planes
Fun	to_rgb() (return self.reformat(format="rgb24", **kwargs))
Fun	to_image() (the result can be saved as a JPEG)
Fun	to_ndarray()
Fun	from_image(img)
Fun	from_ndarray()
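To give a feel for what a color space conversion such as to_rgb() has to do for each pixel, here is a standard-library sketch of a YUV to RGB conversion. It assumes the common BT.601 limited-range coefficients; `yuv_to_rgb_bt601` is a hypothetical helper, and the matrix that reformat() actually applies depends on the frame's color settings, so this is an illustration of the idea rather than PyAV's code path.

```python
def _clamp(value):
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, int(round(value))))


def yuv_to_rgb_bt601(y, u, v):
    """Convert one limited-range BT.601 YUV pixel to 8-bit RGB."""
    c = y - 16   # luma offset for limited range
    d = u - 128  # blue-difference chroma, centered on zero
    e = v - 128  # red-difference chroma, centered on zero
    r = 1.164 * c + 1.596 * e
    g = 1.164 * c - 0.392 * d - 0.813 * e
    b = 1.164 * c + 2.017 * d
    return _clamp(r), _clamp(g), _clamp(b)


print(yuv_to_rgb_bt601(16, 128, 128))   # black
print(yuv_to_rgb_bt601(235, 128, 128))  # white
```

A real conversion also has to upsample the chroma planes for formats such as yuv420p, which this single-pixel sketch leaves out.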

Stream class:
A Stream also holds two other classes: Container and CodecContext.

Kind	Value
Member variable 1	AVStream *ptr
Member variable 2	Container container
Member variable 3	CodecContext codec_context
Member variable 4	dict metadata
Property	id
Property	profile
Property	index
Property	average_rate
Property	base_rate
Property	guessed_rate
Property	start_time
Property	duration
Property	frames (the number of frames this stream contains)
Property	language
Property	type (examples: `'audio'`, `'video'`, `'subtitle'`)
Fun	encode()
Fun	decode()
Fun	get()/set() attr
