|
OpenBSD Packages and Ports Installation and upgrading of packages and ports on OpenBSD. |
|
|
||||
Package or program to convert Msword .doc files
Besides LibreOffice, does anyone know of other packages or software for OpenBSD that can read MS Word documents and convert them to a plain, readable text format? E.g.:
my_example_file.DOC converted to a normal .txt file. Thank you.
__________________
My best friends are parrots Last edited by PapaParrot; 2nd June 2020 at 05:59 AM. |
|
|||
On OpenBSD, packages:
catdoc
docx2txt
On FreeBSD and Linux, the best program for this is pandoc. |
|
|||
Since LibreOffice is installed on my workstation, I also use it to convert from the console:
Code:
$ libreoffice --headless --convert-to odt *.docx
Code:
$ libreoffice --headless --convert-to pdf file
Code:
$ libreoffice --headless --convert-to txt:Text file
@victorvas: thanks for your tips!
__________________
GPG:Fingerprint ed25519 : 072A 4DA2 8AFD 868D 74CF 9EA2 B85E 9ADA C377 5E8E GPG:Fingerprint rsa4096 : 4E0D 4AF7 77F5 0FAE A35D 5B62 D0FF 7361 59BF 1733 |
|
|||
Quote:
__________________
May the source be with you! |
|
||||
Thank you very much. "catdoc" works perfectly for my needs; it is very rare that I even need to view or read this type of document. I just tried catdoc, and it is perfect. Thank you also for the other responses.
====edit==== Thanks, I will look at those as well, but like I said, "catdoc" seems to be fine.
__________________
My best friends are parrots Last edited by PapaParrot; 2nd June 2020 at 07:26 PM. Reason: posted almost the same time I did, |
|
||||
Yeah, I ran into that myself, so I tried "docx2txt". It was able to convert some files, but a couple still failed, so I tried "antiword", which converted a few more. There were still 2 or 3 that none of these could convert; "antiword" said they were not Word files. In the end I found an online converter site that handled the remaining files. Fortunately this is something I do not need to do often.
It does not make sense to me that MS Windows cannot produce files that are universal and can be read and edited with a general text editor. Nobody should be forced to use MS Windows and Word whenever some secretary sends them documents, but that would be a whole other topic. Anyway, I did convert all of them and now have them in plain text (.txt) format.
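The try-one-tool-then-the-next routine described above could be scripted. This is only a rough sketch: the function name `convert_with_fallback` is made up for illustration, and it assumes each converter accepts the file path as its last argument and prints the extracted text to standard output, which is how catdoc and antiword behave by default.

```python
import subprocess


def convert_with_fallback(path, commands):
    """Try each converter in turn; return (tool, text) from the first success.

    `commands` is a list of argv prefixes; the file path is appended to each.
    Returns (None, "") when every converter fails or is not installed.
    """
    for argv in commands:
        try:
            result = subprocess.run(
                argv + [path],
                capture_output=True,
                encoding="utf-8",
                errors="replace",  # do not crash on odd bytes in the output
                check=False,
            )
        except OSError:
            # Converter missing or not executable; move on to the next one.
            continue
        # Count a zero exit status with non-empty output as success.
        if result.returncode == 0 and result.stdout.strip():
            return argv[0], result.stdout
    return None, ""
```

It could then be called as `convert_with_fallback("letter.doc", [["catdoc"], ["antiword"]])`, writing the returned text to a .txt file when the first element is not `None`.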
Quote:
These files were pretty old, about 5 years; I don't know if that counts as "these days" or not. What seems strange to me is why some docx files were convertible and others were not. It seems like there is a consistency problem with the mal-ware used to produce these types of files. Another topic as well.
__________________
My best friends are parrots |
|
||||
Well, 5 years old is still well after 2007, when Word switched to the XML-based .docx format by default, so it seems likely they would be unconvertible by catdoc. But it is also possible that files today are sometimes (or often?) saved in the pre-2007 format. I don't know how common that would be.
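Since the extension does not always say which format a file really is, one way to check is to look at its first few bytes. The sketch below is only an illustration (the function names are hypothetical): legacy binary .doc files are OLE2 compound files, .docx files are zip archives, and some ".doc" files are actually RTF. Roughly speaking, catdoc and antiword target the first kind and docx2txt the second.

```python
# Leading bytes that identify the container, regardless of file extension.
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")  # legacy binary .doc
ZIP_MAGIC = b"PK\x03\x04"  # .docx (or any other zip archive)
RTF_MAGIC = b"{\\rtf"  # Rich Text Format, sometimes saved as .doc


def sniff_format(header):
    """Classify a file from its first bytes."""
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    if header.startswith(RTF_MAGIC):
        return "rtf"
    return "unknown"


def sniff_file(path):
    """Read just enough of the file to classify it."""
    with open(path, "rb") as f:
        return sniff_format(f.read(8))
```

A result of "unknown" for a file named .doc or .docx would be consistent with antiword's complaint that the file is not a Word file at all.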
I'm lucky I don't run into them often. There is one friend of the family that once a year sends a letter about what they did in the past year. It always comes in .docx format. I always politely ask if they could send a .pdf, and they send that and I can read it. Next year, same thing again. I think for typical Windows users there just isn't much mindspace for anything outside their ecosystem, understandably. So we must adapt, and if we can avoid being absorbed it's a victory. |
|
|||
Don't forget that .docx files are just "Open XML" files, in other words zipped XML files. So, if all else fails, you can always
$ tar xf file.docx
and parse the XML files. The main contents are stored in a file called document.xml IIRC. Of course, XML is an abomination, but I did say "if all else fails".
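Because a .docx is a zip archive, it can also be opened without any external unpacking tool. A minimal sketch using Python's standard zipfile module follows; the function name is made up, and it assumes the usual layout written by Word, where the main part lives at word/document.xml inside the archive.

```python
import zipfile


def read_document_xml(docx_path):
    """Return the main document part of a .docx file as a string."""
    with zipfile.ZipFile(docx_path) as archive:
        # Assumption: the main part is stored at word/document.xml,
        # as in files written by Word itself.
        return archive.read("word/document.xml").decode("utf-8")
```

If the member is missing, `archive.read` raises `KeyError`, and `archive.namelist()` shows what the archive actually contains.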
__________________
May the source be with you! |
|
|||
Ah my bad. It supports zip files on FreeBSD and I assumed it did on OpenBSD too.
__________________
May the source be with you! |
|
||||
Thanks jggimi, I do have
Quote:
__________________
My best friends are parrots |
|
||||
You might be able to get GNU tar to support zip using the -I (capital 'eye', or --use-compress-program) option to call a shell script designed to call zip or unzip as necessary, but I'm not sure I'd really want to go there, lol.
Last edited by IdOp; 5th June 2020 at 03:28 AM. |
|
|||
Quote:
Code:
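#!/usr/local/bin/python
import re

# Read the main document part extracted from the .docx archive.
with open('document.xml', encoding='utf-8') as f1:
    # Turn each closing paragraph tag (</w:p>) into a newline.
    text = re.sub('<[^<]+w:p ?>', '\n', f1.read())

# Strip all remaining XML tags, leaving only the text.
text = re.sub('<[^<]+>', '', text)

with open('output.txt', 'w', encoding='utf-8') as f2:
    f2.write(text)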
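The regex approach above leaves XML entities such as `&amp;` undecoded. As an alternative sketch, not the poster's method, the standard library XML parser can walk the paragraphs instead. The function name is hypothetical, and the namespace is the standard WordprocessingML one; text boxes nested inside paragraphs may show up twice with this simple walk.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the w: prefix in document.xml.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def paragraphs(xml_text):
    """Yield the plain text of each w:p paragraph in document.xml."""
    root = ET.fromstring(xml_text)
    for para in root.iter(W + "p"):
        # A paragraph's text is split across w:t elements inside its runs.
        yield "".join(node.text or "" for node in para.iter(W + "t"))
```

Joining the results with `"\n".join(paragraphs(xml_text))` gives roughly the same output.txt as the script, with entities already decoded.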
__________________
May the source be with you! |
|
|