Minor fixes

Add audio & video export
Add multiline search
2017-11-06 20:42:37 +03:00 · 2017-11-06 20:09:14 +03:00 · 2017-10-14 23:18:50 +03:00 · 2017-10-08 19:41:04 +03:00 · 2017-10-08 19:40:32 +03:00 · 2017-08-27 01:26:58 +03:00
2 changed files with 176 additions and 49 deletions
@@ -1,16 +1,12 @@
-# Playphrase
-Search for specific words in srt files and watch with mpv.  
+# PlayPhrase
+
+Search for specific words or phrases in subtitle files and watch video fragments with [mpv](https://mpv.io/).
+
 Inspired by [videogrep](http://lav.io/2014/06/videogrep-automatic-supercuts-with-python/) and [playphrase.me](http://playphrase.me/).

-# Download
+# Video

-See [https://github.com/kelciour/playphrase/releases](https://github.com/kelciour/playphrase/releases)
-
-# Requirements
-
-* Python 2.7
-* grep
-* mpv
+[![YouTube: PlayPhrase for Movies](http://i.imgur.com/QZ9QSiO.png)](http://youtu.be/ciMEY3moATU)

 # Usage

@@ -19,17 +15,20 @@ Run ```python playphrase.py -i <media_dir> _init_``` to generate txt files from
 After that use 
 ```python playphrase.py -i <media_dir> <phrase>```

-### Keyboard Shortcuts 
-Use ```Enter``` to move to the next clip or ```Shift + <``` and ```Shift + >``` to switch between clips, ```q``` to close player.
+Regular expressions can be used in the search phrase, for example, ```\b``` for a word boundary.

-More info: [https://mpv.io/manual/master/#keyboard-control](https://mpv.io/manual/master/#keyboard-control)
+### Keyboard Shortcuts 
+
+Use ```Enter``` to move to the next clip or ```Shift + <``` and ```Shift + >``` to switch between clips, ```Ctrl + Left``` and ```Ctrl + Right``` to move to the previous/next subtitle, and ```q``` to close the video player.
+
+More info: [https://mpv.io/manual/stable/#keyboard-control](https://mpv.io/manual/stable/#keyboard-control)

 ### Batch Scripts

-There's ```.bat``` (Windows) and ```.sh``` (Linux) files to simplify user input. First time before running edit them and update ```media_dir``` path.
-Use ```quit```, ```exit``` or ```q```, ```x``` to exit from batch script.
+There are ```videogrep.bat``` (Windows) and ```videogrep.sh``` (Linux) files to simplify user input. Before the first run, edit them and update the ```media_dir``` path. Use ```quit```, ```exit```, ```q``` or ```x``` to exit from the batch script.
+
+### Additional Options:

-### Additional options:
 * ```-ph, --phrases GAP_BETWEEN_PHRASES``` 
 move start time of the clip to the beginning of the current phrase. Value is optional (default=1.75 seconds)
 * ```-l, --limit``` 
@@ -39,13 +38,57 @@ padding in seconds to add to the start and end of each clip (default=0.0 seconds
 * ```-e, --ending``` 
 play only matching lines (or phrases)
 * ```-r, --randomize``` 
-randomize the clips
+randomize clips
 * ```-o, --output``` 
 name of the file in which output of \'grep\' command will be written
 * ```-d, --demo``` 
 only show grep results
+* ```-a, --audio```
+create audio fragments
+* ```-v, --video```
+create video fragments
+* ```-s, --video-sub```
+create video fragments with subtitles
+* ```-m, --mpv-options OPTIONS```
+mpv player options
+
+### Optional Configuration Changes
+
+For example, you can modify [mpv.conf](https://mpv.io/manual/stable/#configuration-files)
+
+
+```
+autofit=900
+geometry=50%:50%
+```
+
+and [input.conf](https://mpv.io/manual/stable/#interactive-control)
+
+
+```
+ENTER playlist-next force
+```
+
+More info: [https://mpv.io/manual/](https://mpv.io/manual/)
+
+# Download
+
+See [https://github.com/kelciour/playphrase/releases/latest](https://github.com/kelciour/playphrase/releases/latest)
+
+# Usage with AudioBooks
+
+It's possible to use audiobooks as media input. For that purpose there are ```audiogrep.bat``` and ```audiogrep.sh``` files to simplify user input. However, you need to generate subtitles for every audio file. This can be done almost automatically using [aeneas](https://github.com/readbeyond/aeneas). Also, [csplit](https://en.wikipedia.org/wiki/Csplit) can be used to split the text of the book into chapters, and [Pragmatic Segmenter](https://github.com/diasks2/pragmatic_segmenter) to split each chapter's content into "sentences".
+
+Here's an example video showing how it looks (YouTube):
+
+[![YouTube: PlayPhrase for AudioBooks](http://i.imgur.com/gUFXeVI.png)](https://youtu.be/LEyRfy7TsnE)
+
+# Requirements
+
+* Python 2.7
+* grep
+* mpv

 # Note

 * playphrase requires the subtitle track and the video file to have the exact same name, up to the extension.
-* just re-run your command if there was an error with pipe
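The README change above notes that regular expressions such as ```\b``` can be used in the search phrase. As a rough illustration of what a word boundary changes, here is a minimal sketch using Python's `re` module. This is only an assumption for demonstration: the tool itself hands the pattern to `grep -P`, and the helper name `find_matching_lines` and the sample lines are hypothetical.

```python
import re

def find_matching_lines(pattern, lines):
    """Return the lines that match `pattern`, ignoring case (similar to grep -i)."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [line for line in lines if regex.search(line)]

# Hypothetical sample lines in the "(start, end)<TAB>text" layout of the .txt files
sample = [
    "(00:00:01,000, 00:00:03,000)\tThe cat sat on the mat.",
    "(00:00:04,000, 00:00:06,000)\tPlease concatenate the files.",
]

# Without a boundary, "cat" also matches inside "concatenate"
print(find_matching_lines(r"cat", sample))

# With \b on both sides, only the standalone word matches
print(find_matching_lines(r"\bcat\b", sample))
```

On the command line the equivalent search would presumably be passed as the phrase argument, quoted so that the shell does not strip the backslashes.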
@@ -7,6 +7,7 @@ import re
 import sys
 import subprocess
 import time
+import locale

 from collections import OrderedDict

@@ -114,6 +115,59 @@ def update_mpv_player_cmd(cmd_options, mpv_options):

    return cmd

+def get_fragment_filename(phrase):
+    s = phrase.strip().replace(' ', '_')
+    s = s.replace('.*', '...')
+    max_filename_length = 30
+    if len(s) > max_filename_length:
+        s = s[:max_filename_length] + "..."
+    return re.sub(r'(?u)[^-\w\'\.]', '', s)
+
+def create_fragments(search_phrase, clips, export_mode):
+    idx = 1
+    for video_file, clip_start, clip_end in clips:
+        fragment_filename = get_fragment_filename(search_phrase)
+
+        if len(clips) > 1:
+            fragment_filename += "_" + str(idx).zfill(3)
+
+        ss = clip_start
+        to = clip_end
+        t = to - ss
+
+        t_fade = 0.2
+        af = "afade=t=in:st=%s:d=%s,afade=t=out:st=%s:d=%s" % (0, t_fade, t - t_fade, t_fade)
+        video_encoding_mode = "ultrafast"
+
+        if export_mode["audio"]:
+            cmd = " ".join(["ffmpeg", "-y", "-ss", str(ss), "-i", '"' + video_file + '"', "-loglevel", "quiet", "-t", str(t), "-map", "0:a:0", "-af", af, '"' + fragment_filename + ".mp3" + '"'])
+            p = subprocess.Popen(cmd)
+            p.wait()
+
+        if export_mode["video"]:
+            cmd = " ".join(["ffmpeg", "-y", "-ss", str(ss), "-i", '"' + video_file + '"', "-strict", "-2", "-loglevel", "quiet", "-t", str(t), "-map", "0:v:0", "-map", "0:a:0", "-c:v", "libx264", "-preset", video_encoding_mode, "-c:a", "aac", "-ac", "2", "-af", af, '"' + fragment_filename + ".mp4" + '"'])
+            p = subprocess.Popen(cmd)
+            p.wait()
+
+        if export_mode["video-sub"]:
+            srt_style = "FontName=Arial,FontSize=22"
+
+            srt_filename = video_file[:-4] + ".srt"
+            if srt_filename[1] == ":": # Windows
+                srt_filename = srt_filename.replace("\\", "\\\\\\\\")
+                srt_filename = srt_filename.replace(":", "\\\\:")
+                srt_filename = srt_filename.replace(",", "\\\\\\,")
+                srt_filename = srt_filename.replace("'", "\\\\\\'")
+
+            vf = "\"" + "subtitles=" + srt_filename + ":force_style='" + srt_style + "',setpts=PTS-STARTPTS" + "\""
+            af = "afade=t=in:st=%s:d=%s,afade=t=out:st=%s:d=%s,asetpts=PTS-STARTPTS" % (ss, t_fade, to - t_fade, t_fade)
+
+            cmd = " ".join(["ffmpeg", "-y", "-ss", str(ss), "-i", '"' + video_file + '"', "-strict", "-2", "-loglevel", "quiet", "-t", str(t), "-map", "0:v:0", "-map", "0:a:0", "-c:v", "libx264", "-preset", video_encoding_mode, "-c:a", "aac", "-ac", "2", "-vf", vf, "-af", af, "-copyts", '"' + fragment_filename + ".sub.mp4" + '"'])
+            p = subprocess.Popen(cmd)
+            p.wait()
+
+        idx += 1
+
 def play_clips(clips, ending_mode, mpv_options):
    if len(clips) != 0:
        clip_filename, clip_start, clip_end = clips[0]
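`create_fragments` above joins the ffmpeg arguments into one string and wraps file names in double quotes by hand. A possible alternative, sketched here under assumptions, is to pass `subprocess` an argument list so that file names with spaces need no manual quoting. The function name `build_audio_export_args` and the sample file names are hypothetical; the ffmpeg flags are the ones already used in the audio branch of the diff.

```python
def build_audio_export_args(video_file, clip_start, clip_end, output_file, fade=0.2):
    """Build the ffmpeg argument list for exporting one audio fragment."""
    duration = clip_end - clip_start
    # Fade in at the start of the fragment and fade out just before its end
    audio_filter = "afade=t=in:st=%s:d=%s,afade=t=out:st=%s:d=%s" % (
        0, fade, duration - fade, fade)
    return [
        "ffmpeg", "-y",
        "-ss", str(clip_start),
        "-i", video_file,          # no manual quoting needed in list form
        "-loglevel", "quiet",
        "-t", str(duration),
        "-map", "0:a:0",
        "-af", audio_filter,
        output_file,
    ]

# Hypothetical file names, including one with a space
args = build_audio_export_args("My Movie.mkv", 12.5, 15.0, "hello_001.mp3")
print(args)

# To actually run the export (requires ffmpeg on PATH):
#   import subprocess
#   subprocess.call(args)
```

With a list, `subprocess` passes each element to the program as a separate argument on both Windows and POSIX systems, whereas a single command string given without `shell=True` is only interpreted as a full command line on Windows.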
@@ -127,7 +181,7 @@ def play_clips(clips, ending_mode, mpv_options):
        with open(pipe_name, 'w'): # create pipe
            pass

-        p = subprocess.Popen(cmd, shell=False) # start mpv player in idle mode
+        p = subprocess.Popen(cmd) # start mpv player in idle mode
        
        with open(pipe_name, "wb", 0) as f_pipe:
            for clip_filename, clip_start, clip_end in clips:
@@ -151,30 +205,39 @@ def play_clips(clips, ending_mode, mpv_options):
                        p.kill()
                    return

-def main(media_dir, search_phrase, phrase_mode, phrases_gap, padding, limit, output_file, ending_mode, randomize_mode, demo_mode, mpv_options):
-    cmd = " ".join(["grep", "-r", "-n", "-i", "--include", "\*.txt", "-P", '"' + search_phrase + '"', '"' + media_dir + '"'])
+def main(media_dir, search_phrase, phrase_mode, phrases_gap, padding, limit, output_file, ending_mode, randomize_mode, demo_mode, mpv_options, audio_mode, video_mode, video_with_sub_mode):
+    search_phrase = search_phrase.decode(locale.getpreferredencoding())
+    search_phrase_in_utf8_representation = repr(search_phrase.encode("UTF-8"))
+    search_phrase_in_grep = "\"(?s)\(\d\d:\d\d:\d\d,\d\d\d\, \d\d:\d\d:\d\d,\d\d\d\)\\t[^\\n]*" + search_phrase_in_utf8_representation.strip("\'") + "[^\\n]*\""
+
+    cmd = " ".join(["grep", "-r", "-z", "-o", "-i", "--include", "\*\.txt", "-P", search_phrase_in_grep, '"' + media_dir + '"'])
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, shell=True, bufsize=-1)
    output, error = p.communicate()

    if p.returncode == 0:
+        matches = output.rstrip("\x00").split("\x00")
+        
        if output_file != None:
            with open(output_file, 'w') as f_results:
-                f_results.write(output)
+                f_results.write("\n".join(matches))

-        matches = output.splitlines()
-        
        clips = []
        for match in matches:
            filename, line = match.split(".txt:", 1)
-            line_number, line = line.split(":", 1)
-            line_number = int(line_number)
            
-            sub_timing, sub_content = line.split("\t", 1)            
-            sub_start, sub_end = sub_timing.strip("()").split(", ")
-            
-            match_sub_start = srt_time_to_seconds(sub_start)
-            match_sub_end = srt_time_to_seconds(sub_end)
+            lines = line.splitlines()

+            def get_line_timings(line):
+                sub_timing, sub_content = line.split("\t", 1)            
+                sub_start, sub_end = sub_timing.strip("()").split(", ")
+                return (sub_start, sub_end)
+
+            sub_start, sub_end = get_line_timings(lines[0])
+            match_sub_start = srt_time_to_seconds(sub_start)
+            
+            sub_start, sub_end = get_line_timings(lines[-1])
+            match_sub_end = srt_time_to_seconds(sub_end)
+            
            phrase_start = match_sub_start
            phrase_end = match_sub_end

@@ -182,10 +245,18 @@ def main(media_dir, search_phrase, phrase_mode, phrases_gap, padding, limit, out
                with open(filename + ".txt") as f_txt:
                    txt_lines = f_txt.read().splitlines()

-                    txt_line_start_idx = line_number - 1
-                    txt_line_end_idx = line_number - 1
+                    def find_line_number(lines, line):
+                        for idx, txt_line in enumerate(lines):
+                            if line in txt_line:
+                                return idx + 1

-                    for txt_line in reversed(txt_lines[:line_number - 1]):
+                    line_number_start = find_line_number(txt_lines, lines[0])
+                    line_number_end = find_line_number(txt_lines, lines[-1])
+
+                    txt_line_start_idx = line_number_start - 1
+                    txt_line_end_idx = line_number_end - 1
+
+                    for txt_line in reversed(txt_lines[:line_number_start - 1]):
                        sub_timing, sub_content = txt_line.split("\t", 1)            
                        sub_start, sub_end = sub_timing.strip("()").split(", ")

@@ -198,7 +269,7 @@ def main(media_dir, search_phrase, phrase_mode, phrases_gap, padding, limit, out
                        else:
                            break

-                    for txt_line in txt_lines[line_number:]:
+                    for txt_line in txt_lines[line_number_end:]:
                        sub_timing, sub_content = txt_line.split("\t", 1)            
                        sub_start, sub_end = sub_timing.strip("()").split(", ")

@@ -251,7 +322,9 @@ def main(media_dir, search_phrase, phrase_mode, phrases_gap, padding, limit, out
        if randomize_mode:
            random.shuffle(clips)

-        if not demo_mode:
+        if audio_mode or video_mode or video_with_sub_mode:
+            create_fragments(search_phrase, clips, {"audio": audio_mode, "video": video_mode, "video-sub": video_with_sub_mode})
+        elif not demo_mode:
            play_clips(clips, ending_mode, mpv_options)

    elif p.returncode == 1:
@@ -316,7 +389,7 @@ def parse_args(argv):
        print "Search phrase can't be empty"
        sys.exit()

-    args = {"padding": 0, "limit": 15, "output_file": None, "phrase_mode": False, "phrases_gap":1.75, "search_phrase":search_phrase, "ending_mode":False, "randomize_mode":False, "demo_mode":False, "mpv_options":""}
+    args = {"padding": 0, "limit": 60, "output_file": None, "phrase_mode": False, "phrases_gap":1.25, "search_phrase":search_phrase, "ending_mode":False, "randomize_mode":False, "demo_mode":False, "mpv_options":"", "audio_mode":False, "video_mode":False, "video_with_sub_mode":False }
    
    argv = argv[:-1]
    idx = 0
@@ -347,6 +420,12 @@ def parse_args(argv):
            args["randomize_mode"] = True
        elif argv[idx] == "--demo" or argv[idx] == "-d":
            args["demo_mode"] = True
+        elif argv[idx] == "--audio" or argv[idx] == "-a":
+            args["audio_mode"] = True
+        elif argv[idx] == "--video" or argv[idx] == "-v":
+            args["video_mode"] = True
+        elif argv[idx] == "--video-sub" or argv[idx] == "-s":
+            args["video_with_sub_mode"] = True
        elif argv[idx] == "--phrases" or argv[idx] == "-ph":
            args["phrase_mode"] = True
            if idx + 1 < len(argv):
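The new export flags are added to the hand-written `argv` loop. For comparison only, here is a sketch of how the same three flags could be declared with the standard `argparse` module. This is a swapped-in technique, not what the script does; `build_parser` is a hypothetical name, and the other options (`-ph`, `-l`, `-p`, and so on) are deliberately left out.

```python
import argparse

def build_parser():
    """Declare the media directory, the export flags and the search phrase."""
    parser = argparse.ArgumentParser(prog="playphrase")
    parser.add_argument("-i", dest="media_dir", required=True,
                        help="directory with media and subtitle files")
    parser.add_argument("-a", "--audio", dest="audio_mode", action="store_true",
                        help="create audio fragments")
    parser.add_argument("-v", "--video", dest="video_mode", action="store_true",
                        help="create video fragments")
    parser.add_argument("-s", "--video-sub", dest="video_with_sub_mode",
                        action="store_true",
                        help="create video fragments with subtitles")
    parser.add_argument("search_phrase", help="phrase or regular expression to search for")
    return parser

# Hypothetical invocation: export audio and subtitled video for one phrase
args = build_parser().parse_args(["-i", "media", "-a", "-s", "hello world"])
print(args.audio_mode, args.video_mode, args.video_with_sub_mode, args.search_phrase)
```

The `dest` names mirror the keys used in the `args` dictionary of the diff, so the values could be passed on to `main` in the same way.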
@@ -372,22 +451,27 @@ def parse_args(argv):
    return args

 def usage():
-    print "python videogrep.py -i <media_dir> <phrase>"
-    print "python videogrep.py -i <media_dir> _init_"
+    print "Usage: playphrase -i <media_dir> <phrase>"
+    print ""
+    print "Init: playphrase -i <media_dir> _init_"
    print ""
    print "Additional options:"
-    print "-ph, --phrases GAP_BETWEEN_PHRASES", "\t", "move start time of the clip to the beginning of the current phrase. Value is optional (default=1.75 seconds)"
-    print "-l, --limit", "\t", "maximum duration of the phrase (default=30 seconds)"
-    print "-p, --padding", "\t", "padding in seconds to add to the start and end of each clip (default=0.0 seconds)"
-    print "-e, --ending", "\t", "play only matching lines (or phrases)"
-    print "-r, --randomize", "\t", "randomize the clips"
-    print "-o, --output", "\t", "name of the file in which output of \'grep\' command will be written"
-    print "-d, --demo", "\t", "only show grep results"
-    print "-m, --mpv-options OPTIONS", "\t", "mpv player options"
+    print "-ph, --phrases GAP_BETWEEN_PHRASES", " ", "move start time of the clip to the beginning of the current phrase. Value is optional (default=1.25 seconds)"
+    print "-l, --limit", "     ", "maximum duration of the phrase (default=60 seconds)"
+    print "-p, --padding", "   ", "padding in seconds to add to the start and end of each clip (default=0.0 seconds)"
+    print "-e, --ending", "    ", "play only matching lines (or phrases)"
+    print "-r, --randomize", " ", "randomize the clips"
+    print "-o, --output", "    ", "name of the file in which output of \'grep\' command will be written"
+    print "-d, --demo", "      ", "only show grep results"
+    print "-a, --audio", "     ", "create audio fragments"
+    print "-v, --video", "     ", "create video fragments"
+    print "-s, --video-sub", " ", "create video fragments with subtitles"
+    print "-m, --mpv-options OPTIONS", " ", "mpv player options"

 if __name__ == '__main__':
    os.environ["PATH"] += os.pathsep + "." + os.sep + "utils" + os.sep + "grep"
    os.environ["PATH"] += os.pathsep + "." + os.sep + "utils" + os.sep + "mpv"
+    os.environ["PATH"] += os.pathsep + "." + os.sep + "utils" + os.sep + "ffmpeg"

    args = parse_args(sys.argv[1:])
    if args != False:
@@ -395,8 +479,8 @@ if __name__ == '__main__':
            init(args["media_dir"], args["limit"])
        else:
            if need_update(args["media_dir"]):
-                print "WARNING: number of '.srt' and '.txt' files doesn't match. Maybe you need to use 'videogrep <media_dir> _init_'."
+                print "WARNING: number of '.srt' and '.txt' files doesn't match. Maybe you need to use 'playphrase -i <media_dir> _init_'."
            
-            main(args["media_dir"], args["search_phrase"], args["phrase_mode"], args["phrases_gap"], args["padding"], args["limit"], args["output_file"], args["ending_mode"], args["randomize_mode"], args["demo_mode"], args["mpv_options"])
+            main(args["media_dir"], args["search_phrase"], args["phrase_mode"], args["phrases_gap"], args["padding"], args["limit"], args["output_file"], args["ending_mode"], args["randomize_mode"], args["demo_mode"], args["mpv_options"], args["audio_mode"], args["video_mode"], args["video_with_sub_mode"])
    else:
        usage()
Author	SHA1	Message	Date
kelciour	424db15167	Minor fixes	2017-11-06 20:42:37 +03:00
kelciour	c2c622f0ce	Add audio & video export	2017-11-06 20:09:14 +03:00
kelciour	51620fc3a5	Add multiline search	2017-10-14 23:18:50 +03:00
kelciour	c6990ad164	Merge branch 'master' of https://github.com/kelciour/playphrase	2017-10-08 19:41:04 +03:00
kelciour	e23e8ef947	Search by using non-latin characters in Windows console	2017-10-08 19:40:32 +03:00
kelciour	2f5b419010	Fix typo in README	2017-08-27 01:26:58 +03:00
kelciour	8f878e2e35	Update README	2017-08-27 01:22:21 +03:00