Convert the English apostrophe in all Transifex files


  • This topic has 15 replies, 5 voices, and was last updated Apr 12-8:38 pm by Wallon.
Viewing 15 posts - 1 through 15 (of 16 total)
  • #103755
    Member
    Wallon

      Dear Translators,

      I can’t find a quick way to convert “straight” English apostrophes to “typographic” apostrophes which have a curved shape.

      At the moment, I’m going through all the lines one by one at Transifex by deleting and pasting the right apostrophe. Fortunately, in Transifex, there is a function to search for bad apostrophes. This is already a good start.

      I tried opening the *.po file in Geany and using its search/replace function. The problem is that the English strings are mixed with the French strings, so you have to be really careful not to break the English code. And then, how do I export the *.po file to overwrite the old translations?
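One way to avoid touching the English strings is to change only the translation (msgstr) lines of the *.po file. The following is a minimal sketch, not a finished tool. It assumes a simple line-based PO layout, it only converts a straight apostrophe that sits between two letters (the elision case), and `fix_po_text` is a hypothetical helper name. A dedicated PO library would be more robust, and how the corrected file gets back into Transifex is not covered here.

```python
import re

CURLY = "\u2019"  # typographic apostrophe (U+2019)

def fix_po_text(po_text: str) -> str:
    """Replace letter'letter apostrophes in msgstr lines only.

    msgid lines (the English source) and comments are left untouched.
    """
    out = []
    in_msgstr = False
    for line in po_text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("msgstr"):
            in_msgstr = True            # also covers msgstr[0], msgstr[1], ...
        elif stripped.startswith(("msgid", "msgctxt", "#")) or not stripped:
            in_msgstr = False           # source string, comment or blank line
        # Continuation lines (starting with a double quote) keep the state.
        if in_msgstr:
            line = re.sub(r"(?<=[^\W\d_])'(?=[^\W\d_])", CURLY, line)
        out.append(line)
    trailing = "\n" if po_text.endswith("\n") else ""
    return "\n".join(out) + trailing
```

Apostrophes used as quotation marks, and strings that contain shell code or HTML, are deliberately not handled and would still need a human decision.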

      All the French translators before me used the English apostrophe. I understand why some cli… programs do not always react well in French. Nobody told me, be careful, don’t use that in translations, your keyboard will put the wrong apostrophe directly… I was even told, press the “translate” button and go ahead!

      I have thousands of lines to correct for antiX and MX Linux. It’s a big job.

      Do you have an idea?

      Best regards,
      Wallon

      #103758
      Member
      Robin

        Dear Wallon,

        I can’t find a quick way […] Do you have an idea?

        Unfortunately not. In my opinion it can only be done manually. The reason is that the ASCII apostrophe serves more purposes than just an apostrophe (e.g. as an opening and closing quotation mark), and each purpose needs a different typographic replacement for this single ASCII character. Only a human, or at least an AI, could decide which typographic replacement to use in which position. A really skilled programmer could probably write a program for this task by setting up a very sophisticated net of replacement rules (if the apostrophe is preceded by a blank, then…; if the apostrophe is not followed by a blank, a newline, a tab or…, then…; if the apostrophe is followed by a full stop, a colon, a semicolon, an exclamation mark, a…, then…, etc.). Such a program would also need to know the specific typographic conventions of every language (e.g. opening and closing quotes are properly written « … » in French, with an additional narrow blank inside, while in German they are written the other way around, »…«, without the additional blank). This is way beyond my skills; I have already tried it for the automatic translation sections of some of my scripts.

        All the French translators before me used the English apostrophe. I understand why some cli… programs do not always react well in French.

        This could easily be the reason for a script or program misbehaving when run in translation. Just cross-check whether it behaves well when you call it from a console window with one of these prefixes:

        LANG=C <command>
        LANG=en_US.UTF-8 <command>

        Replace the full string <command> with the actual command that starts the program or script, e.g.
        LANG=en_US.UTF-8 roxterm
        (Try starting some programs or commands from within the roxterm window that has just opened; they should then come up in English for testing.)
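The same cross-check can be scripted when several programs need testing. This is a minimal sketch assuming Python 3; `run_in_locale` is a hypothetical helper name. Dropping LC_ALL and LANGUAGE is an addition that goes beyond the plain LANG=... prefix shown above: it is done here because those variables, when set, take precedence over LANG.

```python
import os
import subprocess

def run_in_locale(command, lang="C"):
    """Run `command` (a list of arguments) with LANG overridden.

    Roughly equivalent to typing `LANG=C <command>` in a console.
    """
    env = dict(os.environ, LANG=lang)
    # Assumption: remove variables that would override LANG if they are set.
    env.pop("LC_ALL", None)
    env.pop("LANGUAGE", None)
    return subprocess.run(command, env=env, capture_output=True, text=True)
```

For example, `run_in_locale(["ls", "/nonexistent"], "en_US.UTF-8").stderr` would hold the error message produced under that locale, which can be compared with the message produced under the translated locale.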

        You have to be really careful not to break the English code. […] Nobody told me, be careful, don’t use that in translations, your keyboard will put the wrong apostrophe directly

        Marcelo and I have preached exactly this, again and again, here in the forum as well as in the antiX-translators IRC channel. It must have passed you by 🙂

        Many greetings
        Robin

        Windows is like a submarine. Open a window and serious problems will start.

        #103770
        Member
        Wallon

          Dear Robin,

          Don’t shoot the translator.
          Eleven years ago, anticapitalista translated strings of characters into French using the English apostrophe.
          You can imagine, it took 11 years to discover the right way to do a good translation.
          I felt quite alone to understand what was wrong with the translations of the yad-updater program.
          I am not a computer scientist. I do my best and share my findings with everyone. I hope this will help other translators.

          Cordialement,
          Wallon

          #103774
          Forum Admin
          anticapitalista


            Eleven years ago, anticapitalista translated strings of characters into French using the English apostrophe.

            No I didn’t. I haven’t touched the French translations at Transifex.

            Philosophers have interpreted the world in many ways; the point is to change it.

            antiX with runit - leaner and meaner.

            #104003
            Member
            Wallon

              My dear friends, Robin and Marcelocripe,

              Please don’t put in any more automatic translations with the English apostrophe and the English inverted commas. As a reminder, Google Translate and DeepL put the wrong characters in the French translations. Even if the translation is good, I have to review everything too! You have to review the procedure and establish a new strategy for automatic translations into French.

              Imagine what I have to correct for antiX and MX Linux;

              For French [fr]:
              ++++++++++++++++++++
              antiX-development: 8,944 strings
              antiX-contribs: 4,874 strings
              -----------------------------------
              Total: 13,818 strings

              For French [fr_BE]:
              +++++++++++++++++++
              Total: also 13,818 strings

              Total [fr] + [fr_BE] = 13,818 x 2 = 27,636 strings

              I’ll be done by Debian 13 or 14 or 15…. I don’t know

              Sincerely,
              Wallon

              #104007
              Moderator
              Brian Masinick

                Hi Wallon,

                Is the issue simply the incorrect apostrophe value or symbol?
                If that’s the only error, couldn’t that be fixed by globally checking the instances of this in the files and running a few test replacement programs on a small sample, putting the results in a temporary location?

                If the sample works correctly, put the outputs to their permanent location and repeat, perhaps with an increasingly large sample until the process is completed.

                Since it’s already incorrect, doing it in this manner isn’t any riskier than what’s there now, especially if the “test” isn’t directly sent to the final location until it’s verified iteratively until satisfied.

                I don’t know what the correct sequence is. Are the apostrophe characters always surrounded by other characters, or can they be in arbitrary places within a word, a sentence, or a paragraph? The more deterministically the changes can be defined, the easier it is to automate them, even if 3, 4, 5, or some other reasonably small number of possibilities has to be explored, sampled, and tested.

                Thoughts?

                --
                Brian Masinick

                #104008
                Moderator
                Brian Masinick

                  By the way, tools like Perl, Python, and a few others could literally rip through thousands of possibilities.
                  Care needs to be taken, lest a script make the situation far worse; however, if you know what the English apostrophe and the English inverted commas are and what the correct apostrophe is, one easy thing is to search for all instances and dump that into a temporary file.

                  Analyze if making a change to the apostrophe and comma values would adversely affect anything else.
                  If yes, modify the condition and test again.
                  Repeat until some valid cases are established.

                  This approach would take some work and some care, but it’s an awful lot easier than manually combing through all of those files.
                  If someone other than Google (and whatever other tools there are) is responsible for the code, then carefully sorting and listing the information in a temporary location for analysis,
                  and repeating the analysis until it’s correct, beats the other options in my opinion.
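The "search for all instances and dump them into a temporary file" step could look roughly like the sketch below. It is an illustration under stated assumptions, not an existing tool: `find_suspects` and `dump_report` are hypothetical names, and the input is assumed to be the plain text of a translation file. Nothing is modified; the report only lists each straight quote with some surrounding context so a person can decide which rule applies.

```python
import tempfile

SUSPECT = {"'", '"'}  # straight apostrophe and straight double quote

def find_suspects(text, width=10):
    """Return (line number, column, character, context) for each straight quote."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line):
            if ch in SUSPECT:
                context = line[max(0, col - width): col + width + 1]
                hits.append((line_no, col, ch, context))
    return hits

def dump_report(hits):
    """Write the hits to a temporary text file and return its path."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as fh:
        for line_no, col, ch, context in hits:
            fh.write(f"{line_no}:{col}\t{ch}\t{context}\n")
        return fh.name
```

Reviewing such a report on a small sample first, before any replacement is written back, matches the iterative approach described above.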

                  I’m not an expert coder, but with some help I might be able to either design it or at least help.
                  PPC, though he claims not to be an expert either, has still done some pretty good work with his scripts.
                  Robin is another sharp individual; that makes three of us right there who could put our heads together with thoughts, ideas, and code to test.

                  --
                  Brian Masinick

                  #104015
                  Member
                  Wallon

                    Dear Brian,

                    A little bit of history.
                    Do you know the origin of the apostrophe in English?
                    Once again, the English wanted to imitate the French, who used the straight apostrophe (= typewriter apostrophe), as in bash code.
                    It’s not me who says it, it’s Wikipedia.

                    https://en.wikipedia.org/wiki/Apostrophe
                    https://en.wikipedia.org/wiki/Geoffroy_Tory

                    FR = Apostrophe
                    EN = Apostrophe

                    This is called elision in French. If a short word such as an article (de, la, le, ce…) is followed by a word that starts with a vowel, the final vowel of the short word is dropped and replaced by an apostrophe. This is a little like contractions in English.

                    The user = Le utilisateur = 2 vowels (e) followed by (u).
                    Elision gives this = L’utilisateur.

                    The computer = Le ordinateur = 2 vowels (e) followed by (o)
                    Elision gives this = L’ordinateur.

                    All software uses the old French apostrophe (typewriter apostrophe).
                    Google Translate
                    DeepL
                    Geany
                    Leafpad
                    FeatherPad
                    Nano
                    Vim
                    The antiX forum
                    The MX forum
                    …. same for all Windows editors.

                    I think only ChatGPT is able to make the corrections for the human being.
                    Here are two bad examples. I give the sentence in English and French for your understanding.

                    English sentence to be translated into French -> The computer user must be ‘root’

                    1st example

                    
                    Bad translation for bash code or HTML -> L'utilisateur de l'ordinateur doit être 'root'.
                    Should be converted like this;
                    Good translation for bash code or HTML -> L’utilisateur de l’ordinateur doit être « root ».
                    

                    2nd example

                    
                    Bad translation for bash code or HTML -> L'utilisateur de l'ordinateur doit être "root".
                    Should be converted like this;
                    Good translation for bash code or HTML -> L’utilisateur de l’ordinateur doit être « root ».
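The two examples above can be reproduced with two replacement rules. The sketch below is only an illustration of that idea under loud assumptions: `to_french_typography` is a hypothetical name, rule 1 treats a straight apostrophe between two letters as an elision, rule 2 treats any remaining pair of straight single or double quotes on one line as a quotation, and the spacing inside the guillemets defaults to a no-break space (U+00A0). It does not know about shell code, HTML attributes or placeholders inside a string, so its output would still need proofreading.

```python
import re

def to_french_typography(text, space="\u00a0"):
    """Convert straight apostrophes and quotes to French typographic forms."""
    # Rule 1: elision, a straight apostrophe between two letters -> U+2019.
    text = re.sub(r"(?<=[^\W\d_])'(?=[^\W\d_])", "\u2019", text)
    # Rule 2: remaining '...' or "..." pairs on one line -> « ... ».
    text = re.sub(
        r"(['\"])([^'\"\n]*)\1",
        lambda m: f"\u00ab{space}{m.group(2)}{space}\u00bb",
        text,
    )
    return text

bad_1 = "L'utilisateur de l'ordinateur doit être 'root'."
bad_2 = "L'utilisateur de l'ordinateur doit être \"root\"."
# Both calls give the same corrected sentence.
print(to_french_typography(bad_1, space=" "))
print(to_french_typography(bad_2, space=" "))
```

The order matters: running rule 1 first removes the elision apostrophes, so that rule 2 only sees the quotes that really come in pairs.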
                    

                    Honestly, I’ve been to French developer forums for websites and such… They have to proofread all French translations to manually remove dangerous characters. Exactly as I do in Transifex.

                    To get the “typographic” apostrophe, with my Belgian azerty keyboard;
                    – under linux, I use 3 keys (Shift key + AltGr + B)
                    – under Windows, it’s 5 keys (Alt + 0,1,4,6)

                    I’m lucky to know a little bit about Linux; there are fewer keys to use.

                    I thank PPC again for finding the compatibility problem for languages that use the typewriter apostrophe or the straight double inverted comma.

                    Cordialement,
                    Wallon

                    #104016
                    Moderator
                    Brian Masinick

                      If you really want to do manual translation, that’s up to you.

                      Is it possible to change all instances of ' to ’ and all instances of '. to ».,
                      or are there any other considerations?

                      If it’s that simple, automation could take care of that with only two sed or Perl statements (at least for the translation itself).

                      So are there any other combinations of characters, and do they occur anywhere other than mid-word for the apostrophe and at the end of a sentence for the ». ?

                      --
                      Brian Masinick

                      #104031
                      Member
                      Wallon

                        Dear Brian,

                        I don’t understand how a PC can do this. Zoom in on your screen to analyse the apostrophes.

                        Double inverted commas (" ", which become « » in typographic French) can surround a word, several words, a sentence or a paragraph.

                        Cordialement,
                        Wallon

                        #104149
                        Member
                        marcelocripe

                          Dear Wallon.

                          It took about 4 hours to change almost all straight double quotes and straight single quotes to curly quotes for the pt_BR language in Official Transifex and Contribs.

                          Thanks to Robin’s teachings on how to do research on Transifex, the job could be done.

                          Some text I didn’t change to smart quotes, because I’m not sure if it will work with smart quotes.

                          I know you have many more occurrences in your language; however, you can copy and paste the characters whenever you need them. This was my way of reducing the number of times I needed to press the keys “Shift + Alt Gr + V”, “Shift + Alt Gr + B”, “Alt Gr + V” or “Alt Gr + B”. The “Alt Gr” key has never been used as many times as it was for this.

                          But I only did that after completing the translation of the “antix23-desktop-files” resource.


                          #104187
                          Member
                          Wallon

                            Dear Marcelocripe,

                            Yes, thanks to Robin, I have found how to put all the strings on the screen.
                            I have so many apostrophes to change that I have to use the Ctrl + F keys to help me find them all. I use Ctrl + V to paste the right apostrophe directly.
                            I don’t understand why my keyboard in antiX 23 is not the same as antiX 21/22. I have different symbols when I use AltGr + another key.

                            I use Geany a lot for my translations. I changed the font to see the differences between the apostrophes. I use the Microsoft Times font. It gives me the best result on the screen.

                            I do the same as you do. We have to work in two stages.
                            First, translate the texts. Then I look for the wrong apostrophes.

                            We have become little developers. Linux is not compatible with Latin languages. We have to tinker with our translations.

                            Cordialement,
                            Wallon

                            #104201
                            Member
                            Robin

                              You have to review the procedure, establish a new strategy for automatic translations for French.

                              Well, these are not automatic translations. Even when Google returns these characters (along with a bunch of other stuff not suitable for our purposes), my automatic translation scripts filter for them and replace them (stupidly) following a simple replacement logic, and have done so since I first became aware of this issue a year ago. (I have been warning people ever since.) The result might look a bit funny in languages other than English, but at least it doesn’t cause programs to fail:
                              Teststring with 'single' and "double" quotings and apostrophe (l'ordinateur)
                              Will be automatically changed to
                              Teststring with ’single’ and ”double” quotings and apostrophe (l’ordinateur)
                              before inserting it back to translated file.
                              So manual translators will have to replace it still, according to the needs of the respective language.
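The simple replacement logic described above can be pictured as a plain character-for-character mapping. The sketch below is not the actual script, only an illustration that reproduces the test string shown in this post; `simple_replace` is a hypothetical name, and the mapping (straight apostrophe to U+2019, straight double quote to U+201D) is inferred from that example.

```python
# Map each straight quote to one fixed typographic character, regardless of
# whether it opens a quotation, closes one, or marks an elision.
SIMPLE_MAP = str.maketrans({"'": "\u2019", '"': "\u201d"})

def simple_replace(text):
    """Replace every straight quote by its fixed counterpart."""
    return text.translate(SIMPLE_MAP)

before = "Teststring with 'single' and \"double\" quotings and apostrophe (l'ordinateur)"
print(simple_replace(before))
```

Because opening and closing quotes get the same character, the result is safe for the programs but still has to be corrected by hand for each language.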

                              Please don’t put any more automatic translations with the English apostrophe and the English inverted commas.

                              I’m not running automatic translations scripts on antiX-development, only on antiX-contribs. But as said, they won’t use the show-stopper apostrophe anyway. Some resources translated long ago might still suffer from this issue.

                              I have to use the Ctrl + F keys to help me find them all

                              You can use the following filter strings in the Transifex editor to filter a resource for all entries containing the “wrong” type of apostrophe and quotation mark.

                              translation_text:\'

                              translation_text:"

                              Best regards
                              Robin


                              #104205
                              Member
                              Wallon

                                Dear Robin,

                                Thanks for the explanation but you didn’t understand what I was writing.

                                I use the filters as you said.

                                In column 1, the apostrophes are in yellow. In this string, I have more than 30 apostrophes to replace. There are several screens for one string.

                                In column 2, the apostrophes are not set to yellow. I use Ctrl + F to turn the apostrophes in column 2 yellow. There are too many apostrophes to replace.

                                Cordialement,
                                Wallon

                                #104213
                                Member
                                Robin

                                  In column 2, the apostrophes are not set to yellow. I use Ctrl + F to turn the apostrophes in column 2 yellow.

                                  I see. Yes, you are right. Unfortunately the Transifex editor doesn’t highlight them in the editable field, only in the list. For this, Ctrl+F is a great help, as you have suggested.

                                  There are several screens for one string.

                                  This annoyance should not happen again. More recent resources split these giant entries into multiple small entries instead. It depends on how the program author has cut up the original strings.

