Recognizing Video Subtitles with Python

Date: 2023-05-26

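import cv2 as cv
import easyocr

# Path of the video whose subtitles should be recognized.
video_file = cv.VideoCapture(r"f4459201ee68667a36dee475fe96159c.mp4")
video_fps = video_file.get(cv.CAP_PROP_FPS)
print(video_fps)
total_frames = int(video_file.get(cv.CAP_PROP_FRAME_COUNT))
frames_height = int(video_file.get(cv.CAP_PROP_FRAME_HEIGHT))
frames_width = int(video_file.get(cv.CAP_PROP_FRAME_WIDTH))
print(frames_height)

count_frame_start = 0  # index of the frame currently being read
count_frame_end = 0    # index of the frame at which the last subtitle was written
thresh = 220           # binarization threshold
previous_frame = None  # binarized subtitle strip of the previous frame
reader = easyocr.Reader(["ch_sim", "en"], gpu=False)  # OCR for Simplified Chinese and English

# The function below serves two checks: whether a subtitle is present and
# whether the subtitle has changed.
# none -> present: compute the error between each image and an all-zero image.
# present -> changed: compute the error between the images of two adjacent frames,
# which is small for the same subtitle and large for different subtitles.
def cal_video(img, imgo=None):
    # Percentage of pixels that differ between two binarized images.
    # On 0/255 uint8 images this matches the original expression
    # ((img - imgo) ** 2).sum() / img.size * 100, whose uint8 arithmetic wraps around.
    if imgo is None:
        return (img != 0).mean() * 100
    return (img != imgo).mean() * 100

while True:
    success, frames = video_file.read()
    if not success:  # end of the video, or the frame could not be read
        break
    print("Reading frame {}".format(count_frame_start))
    frames_cut = frames[:, :, 0]  # first channel only; frame shape here is (486, 864)
    # Strip near the bottom of the frame where the subtitles appear.
    frames_wh_cut = frames_cut[frames_height - 75: frames_height - 6, :]
    _, frames_threshold = cv.threshold(frames_wh_cut, thresh, 255, cv.THRESH_BINARY)
    # Run OCR only when the strip differs enough from the previous frame.
    if previous_frame is not None and cal_video(frames_threshold, previous_frame) > 2:
        print("Subtitle change detected at frame {}".format(count_frame_start))
        result = reader.readtext(frames_wh_cut)
        if len(result) > 0:
            with open("嫦娥奔月.txt", "a", encoding="utf-8") as f:
                f.write(str(count_frame_end) + "--->" + str(count_frame_start) + "\n")
                f.write(str(result[0][1]) + "\n")
            count_frame_end = count_frame_start
            # cv.imwrite(r"D:PycharmProjectspythonProjectfeijiimage{}.jpg".format(count_frame_start), frames_wh_cut)
    previous_frame = frames_threshold
    count_frame_start += 1

video_file.release()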
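The change-detection test used in the loop can be tried in isolation. Below is a minimal sketch on synthetic NumPy arrays; the function name `changed_percent` and the strip size are illustrative assumptions and are not part of the original script.

```python
import numpy as np

def changed_percent(current, previous):
    """Return the percentage of pixels that differ between two binarized strips."""
    return (current != previous).mean() * 100

# Hypothetical 69 x 864 subtitle strips (0 = background, 255 = text pixels).
blank = np.zeros((69, 864), dtype=np.uint8)
with_text = blank.copy()
with_text[20:50, 300:500] = 255  # pretend a subtitle appeared in this area

print(changed_percent(blank, blank))      # identical strips
print(changed_percent(with_text, blank))  # a subtitle appeared
```

On 0/255 `uint8` images, the expression `((img - imgo) ** 2).sum() / img.size * 100` should give the same number, because `uint8` arithmetic wraps around and each differing pixel contributes exactly 1. Under that reading, a threshold of 2 means that roughly 2% of the strip has changed.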

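Keep the following points in mind when running the program:

1. Change the video path to the video whose subtitles you want to recognize.

2. The cropped subtitle image is binarized; the threshold (220 in the code) can be adjusted.

3. The similarity threshold between adjacent frames (2 in the code) can also be adjusted.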
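One way to make these three settings easy to change is to expose them as command-line options. The sketch below uses only the standard library; the option names `--video`, `--binary-thresh` and `--change-thresh` are hypothetical, and the defaults are the values hard-coded in the script.

```python
import argparse

def build_parser():
    """Command-line options for the three values that usually need tuning."""
    parser = argparse.ArgumentParser(description="Recognize subtitles in a video")
    parser.add_argument("--video", required=True,
                        help="path of the video to process")
    parser.add_argument("--binary-thresh", type=int, default=220,
                        help="binarization threshold (0-255)")
    parser.add_argument("--change-thresh", type=float, default=2.0,
                        help="difference between adjacent frames that triggers OCR")
    return parser

# Parse an explicit argument list here so the example runs anywhere;
# a real script would call parse_args() without arguments.
args = build_parser().parse_args(["--video", "f4459201ee68667a36dee475fe96159c.mp4"])
print(args.video, args.binary_thresh, args.change_thresh)
```

The parsed values would then replace the literals passed to `cv.VideoCapture`, `cv.threshold` and the comparison against `cal_video`.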

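Overall, the recognition still has some problems, which become apparent once you run the code. If anyone manages to improve the results, please let me know. Thanks.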
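One possible direction, offered only as an untested idea: the script writes just the first fragment, `result[0][1]`, whereas EasyOCR's `readtext` returns a list of `(bounding box, text, confidence)` entries. The sketch below joins all fragments from left to right and drops low-confidence ones. The helper `merge_fragments`, the cutoff `min_conf=0.3` and the sample data are assumptions made for illustration.

```python
def merge_fragments(result, min_conf=0.3):
    """Join OCR fragments from left to right, skipping low-confidence ones."""
    kept = [item for item in result if item[2] >= min_conf]
    kept.sort(key=lambda item: item[0][0][0])  # x coordinate of the top-left corner
    return "".join(text for _, text, _ in kept)

# Made-up entries in the (bounding box, text, confidence) layout.
sample = [
    ([[400, 5], [600, 5], [600, 60], [400, 60]], "world", 0.91),
    ([[100, 5], [380, 5], [380, 60], [100, 60]], "hello ", 0.88),
    ([[700, 5], [720, 5], [720, 60], [700, 60]], "|", 0.05),
]
print(merge_fragments(sample))
```

Whether this helps depends on the video; the cutoff would need tuning in the same way as the two thresholds above.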
