Screencasts are an increasingly common way of explaining software products. I probably preferred Turbogears to Django because of the 20-minute wiki screencast by Kevin Dangoor – http://files.turbogears.org/video/20MinuteWiki2nd.mov. So, this month, we will create a movie. We have chosen a programmed slide show as a simple illustration. The ideas can be expanded and generalised to create effective and compelling screencasts. The same concept can transform your digital images into an exciting audio-video treat for your parents.

A movie is really a sequence of images displayed at a predefined rate, with the display of the images synchronised with audio. A group of images and the corresponding audio can be regarded as a scene in a movie. Independently created scenes can then be pieced together to create the illusion of a longer movie.

Defining the Screencast

In this tutorial, you will take a sequence of screen shots of the application you wish to explain and write a script for each of them. The application, festival or espeak, will be the actor which converts the script into a voice. Each scene will display a screen shot for as long as it takes to speak the associated script.

Create a set of screen shots for the product you wish to explain in a directory, numbering them sequentially, e.g. PhotoApp00.png, ..., PhotoApp05.png. You could write the script in a separate text file for each slide or in a single file. In this tutorial, you will write it in a single file: for each slide, in order, a header line, followed by the script, followed by a blank line. The first two characters of the header will be the image number.

00
Start the python application my_photos from the terminal.

01
A new image will be displayed to you.

02
Type the text you would like to appear as a caption in the text box.

03
Once you press enter, the text will be displayed on the image as you can see.

04
Now, click on the save and next button. The image will be saved.

05
And you will be shown the next picture. Repeat the steps until all the photographs
have been processed. Note that if you do not wish to put a caption on a picture and
save it, you can just press the next button.

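With the screenshots and Script.txt in place, it is worth a quick check that every header in the script has a matching image before any frames are generated. The helper below is not part of the tutorial's code; it is only a minimal sketch, assuming the file layout described above (PhotoApp<nn>.png files in a screencast subdirectory), and check_script is just an illustrative name.

import os

def check_script(script_path='Script.txt', image_dir='screencast'):
    """Report scenes in the script that have no matching screenshot."""
    script_file = open(script_path)
    while True:
        header = script_file.readline()
        # an empty string means end of file
        if header.strip() == '':
            break
        scene_id = header[:2]
        im_file = os.path.join(image_dir, 'PhotoApp' + scene_id + '.png')
        if not os.path.exists(im_file):
            print('missing image for scene ' + scene_id + ': ' + im_file)
        # skip this scene's script lines, up to the blank separator line
        while True:
            line = script_file.readline()
            if line.strip() == '':
                break
    script_file.close()

check_script()

If the check prints nothing, the script and the screenshots are consistent.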
The Implementation

The core logic of your application will be:

#!/usr/bin/env python
import os, sys
import wave
import Image, ImageTk, ImageDraw

script_file = open('Script.txt')
# iterate over each scene
for scene_id, image, text in scene_data(script_file):
    duration = text_to_speech(text)
    # create frames assuming 25 frames per sec
    for frame_no in range(int(25*duration)):
        image.save(scene_id + "%03d" % frame_no + ".jpg")
    # convert the frames into a scene
    os.system('mencoder -audiofile ' + scene_id + 'text.wav -oac mp3lame "mf://' \
        + scene_id + '*.jpg" -mf fps=25 -o out_' \
        + scene_id + '.avi -ovc lavc -lavcopts vcodec=mpeg4')
# create an animated scene to end, using the last image
animated_scene(image)
# combine the scenes into a single film
os.system('mencoder -ovc copy -oac mp3lame -o output.avi out_*.avi')

The script file is opened. It is best to make the scene id a numeric string with a fixed number of digits; that ensures that the order of the scenes is easily maintained. For each scene, an image and the corresponding text are selected, and the text is converted to a speech file. The image is copied as many times as the number of frames needed for the duration of the speech file. The speech file and the images (referenced with the mf://xx*.jpg URL) are combined and converted into an AVI file using mencoder; the sound is converted to MP3 in the process. If you are familiar with ffmpeg, you may use that instead of mencoder. Finally, all the AVI files are combined into a single AVI file. Note that scene_data, text_to_speech and animated_scene are shown below for clarity; in the actual script, their definitions must appear before this loop.
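If you would rather use ffmpeg, roughly the same per-scene conversion can be expressed as shown below. This is only a sketch, not a tested command from the article; the option names vary between ffmpeg releases (older builds spell them -r, -vcodec and -acodec), so check them against your installed version. It assumes the frame and audio file names generated above.

# a rough ffmpeg equivalent of the per-scene mencoder call
# (sketch only; adjust the option names to your ffmpeg version)
os.system('ffmpeg -framerate 25 -i ' + scene_id + '%03d.jpg'
          + ' -i ' + scene_id + 'text.wav'
          + ' -c:v mpeg4 -c:a libmp3lame -shortest out_' + scene_id + '.avi')

The -shortest flag stops the output when the shorter input ends, which keeps the scene length tied to the image sequence in much the same way as the frame count computed from the speech duration does above.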
The code for the generator that fetches the image and the text will be:

def scene_data(script_file):
    while True:
        # the first two characters of the header line are the scene id
        scene_id = script_file.readline()[:2]
        # readline will return an empty string after EOF
        if scene_id.strip() == '':
            break
        # the images are png files in the screencast subdirectory
        im_file = 'screencast/PhotoApp' + scene_id + '.png'
        image = Image.open(im_file)
        frame = image.resize((640,480))
        # read lines until an empty line
        text = ''
        while True:
            line = script_file.readline()
            if line.strip() == '':
                break
            # append, replacing the newline by a space
            text += line.replace('\n', ' ')
        yield scene_id, frame, text

The script file structure was explained above. The code keeps reading the script file until there is no more data. The first two characters of the header line are the scene id, and the images must be named in a fixed format in which those two characters identify the scene. The image is resized to a fixed size. The generator yields the scene id, the resized image and the text associated with that image.

The next step is the code to convert the text to speech.

def text_to_speech(text):
    # use either espeak or festival: uncomment the ESPEAK lines
    # and comment out the FESTIVAL ones to switch
    #ESPEAK = 'espeak -w text.wav -s120 "%s"'
    #os.system(ESPEAK % (text))
    FESTIVAL = 'echo %s | text2wave -o text.wav -F 44100 -scale 2.0'
    os.system(FESTIVAL % (text))
    win = wave.open('text.wav')
    # modify the wave file to add a short silence
    # before the start and at the end
    # (scene_id is the loop variable of the main loop, visible at module scope)
    wout = wave.open(scene_id + 'text.wav', 'w')
    # create the wave file with the same parameters as the input file
    wout.setnchannels(win.getnchannels())
    wout.setsampwidth(win.getsampwidth())
    wout.setframerate(win.getframerate())
    # half a second of silence
    silence_frames = win.getframerate()/2
    # mono 16-bit sound frames
    silence_data = silence_frames*'\x00\x00'
    wout.writeframes(silence_data)
    data = win.readframes(win.getnframes())
    wout.writeframes(data)
    wout.writeframes(silence_data)
    # divide the number of frames by the frame rate to get the duration in seconds
    duration = float(wout.getnframes())/wout.getframerate()
    win.close()
    wout.close()
    return duration

The code to get a wave file of the speech is a mere two lines. You can use either the espeak command or the text2wave command from the festival package; the latter's voice quality is better. (I needed the sampling frequency of the wave file to be 44100 Hz for the sound of the various scenes to stay synchronised after conversion to MP3 audio.) The wave module is then used to insert a short silence at the start and at the end of the wave file, which makes the presentation sound more natural.

A Little Animation to End

For the closing scene, a red square with the text 'The' written on it moves from left to right across the final image, while a green circle with the text 'End' written on it moves from right to left. The two merge at the centre and the frame is frozen for a second. The logout sound of the desktop is added to the scene. As this is the final scene, you can ignore any difference between the duration of the sound file and that of the video.

def animated_scene(bg_image):
    """A square with 'The' and a circle with 'End' float across a
    background image from opposite sides and merge.
    """
    duration = 2
    nframes = 25*duration
    box_size = (100,100)
    x_step = (640 - 100)/(2*nframes)
    # create a red square 100x100
    im_square = Image.new('RGB', box_size)
    draw_s = ImageDraw.Draw(im_square)
    draw_s.rectangle([(0,0), box_size], fill='RED')
    draw_s.text((25,25), 'The')
    # create a green circle with a diameter of 100
    im_circle = Image.new('RGB', box_size)
    draw_c = ImageDraw.Draw(im_circle)
    draw_c.ellipse([(0,0), box_size], fill='GREEN')
    draw_c.text((25,75), 'End')
    # create a mask to show only the (green) circle
    r, mask, b = im_circle.split()
    # create the frames
    scene_id = '99'
    x_s = 0
    x_c = 640 - 100
    for frame_no in range(nframes - 1):
        image = bg_image.copy()
        image.paste(im_square, (x_s,200))
        image.paste(im_circle, (x_c,200), mask)
        x_s += x_step
        x_c -= x_step
        image.save(scene_id + "%03d" % frame_no + ".jpg")
    # freeze the final, merged frame for a second
    for frame_no in range(nframes, nframes + 25):
        image = bg_image.copy()
        image.paste(im_square, (270,200))
        image.paste(im_circle, (270,200), mask)
        image.save(scene_id + "%03d" % frame_no + ".jpg")
    # convert the frames into a scene, using the logout sound
    os.system('mencoder -audiofile /usr/share/sounds/logout.wav -oac mp3lame "mf://' \
        + scene_id + '*.jpg" -mf fps=25 -o out_' \
        + scene_id + '.avi -ovc lavc -lavcopts vcodec=mpeg4')

In our restless world, time is at a premium. So go ahead and create 30-second, just-in-time tutorials for your application; they are sure to be a hit.