txt encoding
Tagged: encoding detect convert
- This topic has 24 replies, 7 voices, and was last updated Feb 25-5:43 pm by andfree.
February 18, 2023 at 6:10 am #99915 Member andfree
Hi. I opened a txt file that I created on a laptop running an old version of Windows, but there is an encoding problem. I tried “Set Encoding” in Geany, but I don’t see any difference.
February 18, 2023 at 7:55 am #99918 Member sybok
Hi, I’ve used the console-based tools ‘enca’ (detection; it needs a sufficient number of characters to make a reliable estimate) and ‘iconv’ (conversion) in the past.
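For reference, a minimal sketch of a rough check that needs only iconv: ask it to convert the file from UTF-8 to UTF-8 and look at the exit status. The helper name is_utf8 and the two sample files are made up for illustration. This only tells valid UTF-8 apart from everything else; it does not name the legacy encoding and is not a replacement for enca.

```sh
#!/bin/sh
# Hypothetical helper: succeeds if the file is valid UTF-8, fails otherwise.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Two throwaway sample files (assumed names) in a temporary directory.
tmpdir=$(mktemp -d)
printf 'plain ascii\n' > "$tmpdir/ascii.txt"
printf '\341\360\357\n' > "$tmpdir/legacy.txt"   # bytes 0xE1 0xF0 0xEF: not valid UTF-8

for f in "$tmpdir/ascii.txt" "$tmpdir/legacy.txt"; do
    if is_utf8 "$f"; then
        echo "$(basename "$f"): valid UTF-8"
    else
        echo "$(basename "$f"): not UTF-8, probably some legacy 8-bit encoding"
    fi
done

rm -rf "$tmpdir"
```

If the check fails, the file still has to be converted with an explicit source encoding, which is where the guessing starts.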
February 18, 2023 at 8:28 am #99922 Member RJP
February 18, 2023 at 8:29 am #99923 Member andfree
Thank you. I installed enca, I read this, and I tried the commands below:
$ file mg.txt
mg.txt: ISO-8859 text, with no line terminators
$ enca -L none mg.txt
Unrecognized encoding
$ iconv --from-code=ISO-8859 --to-code=UTF-8 mg.txt > mg1.txt
iconv: failed to start conversion processing
$ iconv --from-code=ISO-8859-1 --to-code=UTF-8 mg.txt > mg1.txt
The last command created a new “mg1.txt” file, but, although this file appears to be “UTF-8 Unicode text, with no line terminators”, the displayed text seems to be the same.
February 18, 2023 at 10:41 am #99935 Member Robin
In Geany you can set the line endings separately from the encoding in the Document menu. The available settings for a document are CR/LF (classic Windows style), LF (Unix, the default in antiX), and CR (classic Apple style). To see which line endings your original file actually uses, enable the “Show Line Endings” checkbox in the View menu.
Additionally check
iconv -l
to see a complete list of available input (and output) encodings. Try some of the WINDOWS-xxxx encodings instead of ISO-8859, or try some of the ISO-8859-x encodings.
Windows is like a submarine. Open a window and serious problems will start.
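A minimal sketch of this trial-and-error approach: run the same file through a handful of candidate source encodings and compare the first line of each result by eye. The helper name try_encodings, the candidate list, and the three sample bytes are assumptions chosen for illustration; substitute names from iconv -l and your own file.

```sh
#!/bin/sh
# Sample input standing in for the real file: bytes 0xE1 0xF0 0xEF plus a newline.
sample=$(mktemp)
printf '\341\360\357\n' > "$sample"

# Hypothetical helper: print the first line of FILE decoded from each given encoding.
try_encodings() {
    file=$1; shift
    for enc in "$@"; do
        if out=$(iconv -f "$enc" -t UTF-8 "$file" 2>/dev/null); then
            printf '%-14s %s\n' "$enc" "$(printf '%s\n' "$out" | head -n 1)"
        else
            printf '%-14s (conversion failed)\n' "$enc"
        fi
    done
}

try_encodings "$sample" ISO-8859-1 ISO-8859-7 WINDOWS-1252 WINDOWS-1253
rm -f "$sample"
```

A conversion that succeeds is not necessarily the right one. Several of these encodings accept almost any byte, so only reading the result tells you which candidate fits.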
February 19, 2023 at 6:06 am #100013 Member andfree
Thank you.
Robin wrote: “You can set in geany line-ends separately from encoding in the document menu.”
It doesn’t seem to make any difference.
Robin wrote: “Try some of the WINDOWS-xxxx encodings instead of ISO-8859, or try some of the ISO-8859-x encodings.”
For these encodings the output is “iconv: illegal input sequence at position x”. Only WINDOWS-1252 produced a new file, but it doesn’t make any difference.
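One possible explanation for a conversion that runs cleanly yet changes nothing on screen, sketched under the assumption that the editor was already showing the original bytes as Western European text: converting from a Latin encoding rewrites the bytes into UTF-8, but the characters they stand for remain the same Latin letters. The three sample bytes are an assumption for illustration.

```sh
#!/bin/sh
# Original bytes, shown in hex.
printf '\341\360\357' | od -An -tx1

# After "ISO-8859-1 to UTF-8": every byte has become a two-byte sequence,
# so the file did change, but it still spells the same Latin letters.
printf '\341\360\357' | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1
```

Comparing the hex dumps shows whether a conversion actually happened, independently of how any editor chooses to display the file.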
February 19, 2023 at 8:12 am #100016 Member sybok
What is the output of ‘enca’ on the newly created file(s)?
This situation reminds me of the encoding troubles I have occasionally had with downloaded subtitles…
BTW, there is a utility, ‘flip’, for switching between different types of line endings.
February 19, 2023 at 9:15 am #100018 Member andfree
sybok wrote: “What is the output of ‘enca’ on the newly created file(s)?”
$ enca -L none mg1.txt
Unrecognized encoding
sybok wrote: “there is an utility ‘flip’ to switch between different types of line endings.”
$ flip -u mg.txt
mg.txt: binary file, not converted
$ flip -m mg.txt
mg.txt: binary file, not converted
$ flip -ub mg.txt
The last command seems to have converted the initial file, but this doesn’t seem to have helped.
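To check line endings without relying on an editor or on flip, a minimal sketch is to count the CR and LF bytes directly. The helper name count_line_endings and the sample file are made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: count CR (0x0D) and LF (0x0A) bytes in a file.
count_line_endings() {
    cr=$(tr -cd '\r' < "$1" | wc -c)
    lf=$(tr -cd '\n' < "$1" | wc -c)
    # $((...)) normalizes any padding that wc may add on some systems.
    echo "CR=$((cr)) LF=$((lf))"
}

# Throwaway sample with Windows-style CR/LF line endings.
sample=$(mktemp)
printf 'one\r\ntwo\r\n' > "$sample"
count_line_endings "$sample"
rm -f "$sample"
```

Equal CR and LF counts suggest CR/LF endings, LF only suggests Unix endings, and CR only suggests the classic Apple style. Line endings are a separate matter from the character encoding, so fixing them would not be expected to repair garbled letters.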
February 19, 2023 at 10:18 am #100021 Member PPC
Hi, I have a simple, low-tech suggestion: rename (for example, to new.txt) and then open a file that is encoded the way you want your target file to be encoded. Open the target file. Copy its contents and paste them into the new.txt file. Save the new file. Close the files, then try to open new.txt.
This always works for me… Not ideal, but effective.
If all else fails, try opening the .txt file in LibreOffice Writer…
P.
February 20, 2023 at 5:46 am #100111 Member andfree
Thank you. I’m not sure I understood. I created an empty “new.txt” file. I opened it with Geany and noted that its encoding was already UTF-8. I opened the mg.txt file, copied its content, pasted it into the new.txt file, saved new.txt, closed both files, and opened new.txt again. Unfortunately, it doesn’t seem to have worked.
The encoding problem remains when I open the file with LibreOffice Writer, and I can’t find how to set the encoding there.
February 20, 2023 at 10:53 am #100117 Member sybok
Could you please post a screenshot of what you see in LibreOffice (LO), if the content is not too private?
Selecting encoding with LO: https://ask.libreoffice.org/t/how-do-you-change-the-encoding-for-certain-file-types/52833
It seems that you may also try to convert the encoding of a file using LO: https://unix.stackexchange.com/questions/259361/specify-encoding-with-libreoffice-convert-to-csv
February 20, 2023 at 10:58 am #100118 Member zblsv
andfree, can you show the contents of the file? At least part of it.
This converts binary data to Base64 and thus can be posted here as is:
dd bs=56 count=1 status=none if=mg.txt | base64
First 56 bytes (value of bs parameter).
Words are carried away by the wind...
February 22, 2023 at 6:12 am #100289 Member andfree
Thanks for the replies. I attach a sample of how it looks in LO and in leafpad (or Geany). The language is Greek.
I can’t see how “File -> Open and then select Text or Text – Choose Encoding as the file type” helps me to set the encoding.
$ dd bs=56 count=1 status=none if=mg.txt | base64
4fDv9PHd8OXpCurh9OHq8d7s7enz5wrz+evn7dzx6eEK4fXu5+zd7efyCurh9OHt3Ov58+fyCuM=
- This reply was modified 2 months, 2 weeks ago by andfree.
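Since the text is said to be Greek, one candidate worth testing is ISO-8859-7, the ISO code page for Greek (WINDOWS-1253 is a close relative). A minimal sketch, using only the Base64 sample posted above: decode it back to raw bytes and run those through iconv. The choice of ISO-8859-7 is an assumption to be checked, not something the thread has confirmed.

```sh
#!/bin/sh
# The Base64 sample posted above (first 56 bytes of mg.txt).
sample='4fDv9PHd8OXpCurh9OHq8d7s7enz5wrz+evn7dzx6eEK4fXu5+zd7efyCurh9OHt3Ov58+fyCuM='

# Decode to raw bytes, then interpret them as ISO-8859-7 (an assumed candidate).
decoded=$(printf '%s\n' "$sample" | base64 -d | iconv -f ISO-8859-7 -t UTF-8)
printf '%s\n' "$decoded"
```

With this sample the first line comes out as αποτρέπει, which suggests the guess is at least plausible. That is an observation about these 56 bytes only; whether the whole of mg.txt converts without errors would still need to be tried, for example with iconv -f ISO-8859-7 -t UTF-8 mg.txt > mg1.txt.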
February 22, 2023 at 6:45 am #100293 Member sybok
Selecting via File -> Open … in LO means that you choose the input file’s encoding yourself. The expectation is that LO would then display the text correctly.
Perhaps something is missing from your antiX system, or it comes down to how (old) Windows handled encodings.
The first thing that comes to mind is fonts (reminds me of a poster announcing Hitchcock’s “The Birds is coming” 🙂 ), https://packages.debian.org/stable/fonts/
But shouldn’t this be handled by Unicode?
It turns out my understanding of these matters is … in need of expanding.
February 22, 2023 at 7:30 am #100300 Member andfree
sybok wrote: “The File -> Open … selection in the case of LO would mean that you do the job of selecting input file encoding. The expectation is that LO would display the text correctly then.”
Does this mean that it would automatically select the appropriate encoding, or that there would be a list of encodings for me to select from? I ask because I can’t see such a list.

