Translation of videos by subtitling

  • This topic has 5 replies, 3 voices, and was last updated Jun 8-1:16 am by Robin.
Viewing 6 posts - 1 through 6 (of 6 total)
  • #61145
    Member
    Robin

    The videos linked in the antiX help menu are spoken in English. I have to admit I have difficulty understanding the particular flavour of English @dolphin_oracle speaks. I’m not a native speaker, so anything but Oxford English is quite a problem for me. Although I can follow most literary audiobooks read by professional British actors without any problems (even when read very fast), understanding these antiX videos is way above my pay grade. So native English speakers may want to step in here for the first stage of the process described below.

    I reckon I’m not the only non-native English speaker having problems with this. And I believe I have found a really simple solution: subtitling. Once a native English speaker has transcribed the spoken words using subtitling software, everybody who reads English as a foreign language will be able to understand the videos. Secondly, it is quite easy for native speakers of other languages to create a set of translated subtitles from the original-language transcription, once it exists.
    The only thing left would be to ship the resulting small subtitle files with antiX, make them available depending on the language the user has chosen in the system settings, and bind the use of the language-specific subtitles to the start command of the corresponding help menu entry. This could be done by an additional community-driven translation package in deb format, for example.

    But first things first. We need subtitles in the original language before any of this can start.

    How to create the translated subtitles? This task is stunningly easy. The workflow is split into two (three) main steps:

    1.) Write down the sentences in the original language in the gaupol editor.
    2.) Translate these sentences into a target language.
    (3.) You may have somebody read the texts in the target language while recording, creating a translated audio track this way. But this is another construction site not covered here.)

    First you need to install subtitling software. I have chosen “gaupol”.

    sudo apt-get install gaupol gstreamer1.0-x gstreamer1.0-alsa gstreamer1.0-libav gstreamer1.0-plugins-base gstreamer1.0-plugins-good gir1.2-polkit-1.0

    After this has completed successfully, start “gaupol” from the antiX menu, submenu “Programs–>Media”.

    Step A: Watch the video file and write down all spoken words and text popups in sequence.

    Select from menu “File” the entry “New”.
    Select from menu “Video” the entry “Load Video…” and open the video you’d like to subtitle. Arrange the display as you like, menu “View–>Layout”. For first practice it may be wise to choose a video with long pauses between short sentences that are not spoken too fast, as found in some documentary films which tend to let the images speak for themselves.

    [see attached screenshots 01, 02 and 03]

    Now you need to know about the keys. (You could do all of the following just as well by clicking around with your mouse, but progress would be excruciatingly slow.)

    Most important: qwer,op,jk
    Read on to get their meaning.

    “p” starts and pauses your video immediately. (You may want to adjust the settings to your reaction time by pressing “h” once beforehand, so the position gets corrected during processing.)

    1.) Now start the video with “p” and hit “j” the very moment the speaker starts his first sentence.
    - Short sentences: press “p” to pause the video after the sentence has ended.
    - Long sentences: press “j” whenever the speaker takes a breath within the sentence, or the sentence has a comma. You may have to split even more often on very long, breathless sentences. Once the sentence is finished, press “p” to pause the video.

    2.) Select the first of the new entries you have created in the list and press “o” to check the result. Press “k” the moment the speaker reaches the end of the very word you intended as the point of separation.

    3.) Click on your entry in the list, column “text”, and enter what you have just heard.
    You will soon figure out how many letters you want per subtitle entry. The letter count is displayed at the end of each entry. Please bear in mind that an entry has to fit into one, two or at most a few lines at the bottom of the screen. But you will see the result immediately, so you can decide whether you like the way you split up the sentences.

    4.) You will probably need to adjust the start and end time of the entry to make it fit the spoken words. You may enter the estimated corrected values (seconds and fractions of seconds) manually in the respective field in columns “start” or “end”. But often it is much more convenient to use the keys
    “q” and “w” to decrease/increase the start time of the selected entry by 1/50 second,
    “e” and “r” to decrease/increase the end time by 1/50 second.
    You may press these keys repeatedly and alternately to set the values to your estimate.

    5.) Press “o” again to check whether the subtitle position fits the spoken sentence or sentence part well. Press the “q”, “w”, “e”, “r” keys again if further correction is needed for this entry, and use “o” again and again to recheck.
    - Long sentences: once the first part fits, select the next entry (the next part of the sentence) and repeat the procedure (steps 2-5): enter the text first, then check by pressing “o” as often as needed while adjusting the times with “qwer” to make the subtitle fit the spoken words.

    6.) Once a spoken sentence is completed, with all its entries filled in and adjusted, move on by pressing “p” to listen to the next sentence. Start over with step 1, hitting “j” to create a new subtitle entry.
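
To make the 1/50 second steps from step 4 concrete: each press of “q”, “w”, “e” or “r” moves a time by 20 ms. The following is only an illustrative sketch of that arithmetic, not anything gaupol itself runs. The function name shift_time and the HH:MM:SS.mmm notation are assumptions made for this example.

```shell
#!/bin/sh
# Illustration only: shift_time is a made-up helper, not part of gaupol.
# Usage: shift_time HH:MM:SS.mmm STEPS
# STEPS counts key presses; one step is 1/50 second = 20 ms.
# A negative STEPS value moves the time backwards (never below zero).
shift_time() {
    printf '%s %s\n' "$1" "$2" | awk '{
        # split the timestamp into hours, minutes, seconds, milliseconds
        split($1, t, /[:.,]/)
        ms = ((t[1] * 60 + t[2]) * 60 + t[3]) * 1000 + t[4] + $2 * 20
        if (ms < 0) ms = 0
        printf "%02d:%02d:%02d.%03d\n",
            int(ms / 3600000), int(ms / 60000) % 60,
            int(ms / 1000) % 60, ms % 1000
    }'
}
```

So two presses of “w” on a start time of 00:01:02.980 correspond to `shift_time 00:01:02.980 2`, and one press of “q” on 00:00:10.000 to `shift_time 00:00:10.000 -1`.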

    Click “Save” in the “File” menu regularly (at least once after completing each sentence), so you will not lose your work in an accidental shutdown, whatever the cause.

    It is always the same procedure: while watching the next film sequence by pressing “p”, use “j” to create one or more additional entries in the list at the current position, check with “o”, set the end with “k” if needed, type in the words heard, and adjust the times precisely. And so on, until you are done with the video.

    Once you are familiar with the keys, you can manage this task quite fast, depending on how fast you type…

    Once all the subtitles are entered, you may want to check your subtitle file using a normal video player. E.g. the command
    mpv --screen=0 --fs --fs-screen=1 --sub-file='/path/to/subtitle-file' '/path/to/video-file'
    will display the video with subtitles on a secondary (TV) monitor connected to your computer. (The exact arguments for the screen options depend on your graphics configuration.)

    If you want to check your result on default pc screen use e.g.:
    mpv --sub-file='/path/to/subtitle-file' '/path/to/video-file'
    instead.

    You don’t necessarily need to close gaupol while watching the result, if your PC has sufficient resources to manage both tasks without delay.

    Observation: on TV-out, subtitles are displayed within the bottom black border (if there is one, depending on the aspect ratios of screen and video), whereas when playing in a window on the PC screen you will find the subtitles overlaid on the bottom of the video image itself.

    You may want to make minor corrections or timing adjustments by editing the corresponding entries of the subtitle file in gaupol again.

    Now that we have a transcription of the spoken words in the original language, let’s move on to the translation. This should be done by somebody familiar with the target language, preferably a native speaker.

    Step B: Translation.

    Still in gaupol, uncheck “Display video” in menu “View”.
    This hides the video display and frees up space on screen for translation. (Not necessary if you didn’t load a video file previously.)
    Click on “Save translation” in the file menu and choose a new filename for the translated subtitle file. You will probably want to mark it with its language identifier (e.g. fr_BE), added to the original subtitle filename. [see attached screenshot 04]

    In menu “View–>Columns” check the box “translations”. You will get an additional column, named “translation”. [see attached screenshot 05 and 06]

    I believe what comes next is self-evident: click in an empty field of this new column and enter what you feel is the best-fitting translation of the original string to its left… [see attached screenshot 07]

    Save your translation frequently by choosing “save translation” from file menu while working.

    Once finished with all entries, you may use the translation file in media player programs the same way as your original subtitle file. You can use your original subtitle file as the base for any number of foreign-language subtitle files, always created the same way. Make sure to add the language identifier to the file name so you can tell the subtitle files of the different languages apart.
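
One possible way to keep the file names consistent is a tiny shell helper. This is just a sketch: the function name translation_name and the pattern “name.LANG.ext” are my assumptions for illustration, gaupol does not enforce any naming scheme.

```shell
#!/bin/sh
# Hypothetical helper: build the name of a translated subtitle file by
# inserting a language identifier before the file extension.
# Usage: translation_name ORIGINAL_SUBTITLE_FILE LANGUAGE_ID
translation_name() {
    path=$1 lang=$2
    base=${path##*/}        # file name without leading directories
    dir=${path%"$base"}     # "" or "some/dir/"
    case $base in
        # name has an extension: put the identifier in front of it
        ?*.*) printf '%s%s.%s.%s\n' "$dir" "${base%.*}" "$lang" "${base##*.}" ;;
        # no extension: simply append the identifier
        *)    printf '%s%s.%s\n' "$dir" "$base" "$lang" ;;
    esac
}
```

For example, `mpv --sub-file="$(translation_name video.srt fr_BE)" video.mkv` would then look for the Belgian French subtitles in video.fr_BE.srt (file names made up for this example).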

    ’nuff said. It’s your turn.

    #61153
    Member
    skidoo

    For YouTube-hosted videos, it would be simpler to raise awareness of the availability of auto-generated CC subtitles.

    FWIW, the auto-generated English subtitles for most of the Spanish- and Russian-language videos I have watched have been imperfect (but passable, overall).

    For handcrafted subtitles, I believe a YT channel owner can delegate the task of uploading translations to selected (trusted) YT account holders. You might investigate the prospect of directly assisting the various content creators (channel owners).

    • This reply was modified 2 weeks ago by skidoo.
    • This reply was modified 2 weeks ago by skidoo. Reason: edited for brevity
    #61154
    Member
    Robin

    @skidoo Yes, you are right. But this feature seems to be available on recent fast multicore (64bit?) hardware only; I have never seen it so far. On my machines (single core, 32bit, but fast) it is simply impossible to watch any videos directly in a web browser, since the websites don’t care about wasting resources and simply overload the CPU when playing video. Even at extremely low resolution they drive the CPU to 100% while skipping frames. I have to view all video in a dedicated video player program, which performs much more efficiently and uses GPU acceleration. This way the CPU load stays at no more than 15-20% even at quite high resolutions, with no frames dropped at all, and I can keep some other programs plus browser and email client open in parallel.

    It seems there are no subtitles available when accessing the videos that way. I believe I am not the only antiX user who doesn’t own such high-performance equipment. And after all, I for one would rather handle all the translation tasks without needing to create a Google account.
    So the main question is: is there a way to get the subtitles from the video hosting provider while watching the videos outside of a browser?

    Btw, I’m not quite sure whether all the packages listed in the installation step above are actually needed to make it work. This is what I tentatively installed after my installation question in the other thread and before you pointed me to the missing one over there. So you might be able to correct my instructions above by shortening the list, for the benefit of other users.

    #61157
    Moderator
    ModdIt

    Hi Robin. FreeTube is maybe/certainly not the lightest player, but it does allow translating subtitles from English to German.
    Probably to other languages too.
    The AppImage should run on 32 bit as well as 64 bit.
    YouTube ads sometimes seem to cause freezing on my IceWM runit setup.

    #61158
    Member
    skidoo

    I have never “logged into” youtube and do not foresee doing so in the future… so I cannot personally test this “bright idea”, but

    While not logged in, I can request auto-subtitles and they are presented to me in English.
    Possibly this detail is based on my browser’s user-agent string (or other http request header).
    Possibly a logged-in user can choose different, specific, locales for the UI (and subtitles).

    In addition to the overlaid subtitles… the text, with or without timestamps, is displayed in a scrollbox at the right side of the page and can be captured (copy-pasted). Harvesting these, for various languages, might provide a head start for your proposed project.

    #61162
    Member
    Robin

    “For YouTube-hosted videos, it would be simpler to raise awareness of the availability of auto-generated CC subtitles.”

    Yes. But we can do more than this, with a small budget of resources.

    Moreover, I remember your original posting contained some remarks about the scalability of the approach of providing all the translations. That was simply true.

    Meanwhile, you know me well enough to know that I will keep digging. And I have already found something. I do have to eat my words from above:
    “It seems there are no subtitles available when accessing the videos that way.”
    There IS a way to retrieve automatically generated subtitles directly from YouTube without the need to watch the video in a browser.

    It will take some time, but I’m quite sure we can have a script that presents all existing help videos to antiX users subtitled in their native language (provided YouTube offers the service in their language). Automatically.

    The command

    youtube-dl --write-sub --write-auto-sub --sub-lang 'fr' --sub-format 'vtt' -f '(136+251)' 'https://www.youtube.com/watch?v=jpdl1ES1zOg'

    will download the video along with French subtitles, and

    mpv --sub-file="antiX 17 - What's New-jpdl1ES1zOg.fr.vtt" "antiX 17 - What's New-jpdl1ES1zOg.mkv"

    will present you one of the antiX help videos subtitled accordingly.

    To see a list of the languages and formats in which subtitles are available, just enter:

    youtube-dl --list-subs 'https://www.youtube.com/watch?v=jpdl1ES1zOg'

    which will show all subtitle languages and files present for the video file.

    So my script will be outlined like this:

    -- get system language
    -- check whether original video language is identical to system language
    	-- if true, download video file only, to temp folder
    -- download subtitle language and format list
    -- check whether system language is available in downloaded list
    	-- if true, download subtitle file along with video file to temp folder
    	-- if false, present user a list of available subtitle languages and let him choose one;
    		download chosen subtitle file along with video file to temp folder
    -- present video either with or without subtitle file, according to results from above.
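
A first rough sketch of this outline in shell could look like the following. It is untested against real videos and rests on several assumptions: the function names (lang_from_locale, lang_in_list, show_video) are made up, the original video language is simply passed in and defaults to “en”, and the language code is assumed to be the first column in the output of youtube-dl --list-subs.

```shell
#!/bin/sh
# Sketch only, not the final script. All function names are hypothetical.

# Reduce a locale such as "fr_BE.UTF-8" to the bare language code "fr".
lang_from_locale() {
    code=${1%%.*}       # strip encoding, e.g. ".UTF-8"
    code=${code%%@*}    # strip modifier, e.g. "@euro"
    code=${code%%_*}    # strip territory, e.g. "_BE"
    case $code in
        ''|C|POSIX) code=en ;;   # assumption: fall back to English
    esac
    printf '%s\n' "$code"
}

# Read a subtitle listing on stdin and succeed if some line starts with the
# language code $1 (assumes the code is the first column of the listing).
lang_in_list() {
    awk -v want="$1" '$1 == want { found = 1 } END { exit !found }'
}

# Usage: show_video URL [ORIGINAL_LANGUAGE]
show_video() {
    url=$1
    video_lang=${2:-en}          # assumed original language of the video
    sys_lang=$(lang_from_locale "${LC_ALL:-${LANG:-en}}")
    tmp=$(mktemp -d) || return 1

    if [ "$sys_lang" = "$video_lang" ]; then
        # same language: video only, no subtitles needed
        youtube-dl -o "$tmp/video.%(ext)s" "$url" || return 1
    else
        subs=$(youtube-dl --list-subs "$url") || return 1
        if printf '%s\n' "$subs" | lang_in_list "$sys_lang"; then
            sub_lang=$sys_lang
        else
            # system language not offered: let the user pick one
            printf '%s\n' "$subs"
            printf 'Subtitle language to use: '
            read -r sub_lang
        fi
        youtube-dl --write-sub --write-auto-sub --sub-lang "$sub_lang" \
            --sub-format vtt -o "$tmp/video.%(ext)s" "$url" || return 1
    fi

    # pick up whatever was downloaded: *.vtt is the subtitle, the rest the video
    sub_file='' video_file=''
    for f in "$tmp"/video.*; do
        case $f in
            *.vtt) sub_file=$f ;;
            *)     video_file=$f ;;
        esac
    done

    if [ -n "$sub_file" ]; then
        mpv --sub-file="$sub_file" "$video_file"
    else
        mpv "$video_file"
    fi
    rm -rf "$tmp"
}

# Only start downloading when the script is called with a URL.
if [ "$#" -gt 0 ]; then
    show_video "$@"
fi
```

Saved under a name of your choice and called with a video URL as its first argument, it should walk through the steps above; the format selection with -f from my command above is left out here to keep the sketch short.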

    Many thanks for your great inspiration, skidoo!
    I do hope this concept will scale better now. Maybe we can get along without a temporary download simply by feeding the download addresses directly to the player software. I’ll check this.

    So we can reserve the gaupol method from my original posting for special tasks, when automatic translations are not available (or are of such poor quality that we can’t use them).

    So long
    Robin
