unmht is an unpacker for MIME HTML (.mht / .mhtml) archives. I wrote it because mht-rip did not work for me, because both mht-rip and mhtconv appeared to have been inactive for some time, and because Perl makes programming this kind of thing ridiculously easy.
unmht saves every file except the primary HTML file to a subdirectory and rewrites HTML links to point to the saved files, so that the page can be viewed offline.
Get unmht: Download
unmht has the following Perl module dependencies: HTML::PullParser, HTML::Tagset, MIME::Base64, MIME::QuotedPrint and Getopt::Long. After checking you have these (try perldoc HTML::PullParser etc.), put unmht somewhere in your path and extract the manual page with pod2man unmht > /usr/local/man/man1/unmht.1 (or similar).
unmht - Unpack a MIME HTML archive
unmht unpacks MIME HTML archives that some browsers (such as Opera) save by default. The file extensions of such archives are .mht or .mhtml.
The first HTML file in the archive is taken to be the primary web page; the other contained files are taken to be "page requisites" such as images or frames. The primary web page is written to the output directory (the current directory by default). The requisites are written to a subdirectory named after the primary HTML file name without its extension, with "_files" appended. Link URLs in all HTML files that refer to requisites are rewritten to point to the saved files.
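The naming rule for the requisites subdirectory can be illustrated with a short Python sketch. This is not unmht's own Perl code; the function names `requisites_dir` and `rewrite_target` are hypothetical, and the sketch assumes that rewritten links are relative paths from the primary page into the subdirectory.

```python
import os.path


def requisites_dir(primary_name: str) -> str:
    """Name of the subdirectory for page requisites (hypothetical helper).

    Follows the rule described above: primary HTML file name without
    its extension, with "_files" appended.
    """
    stem, _ext = os.path.splitext(os.path.basename(primary_name))
    return stem + "_files"


def rewrite_target(primary_name: str, requisite_name: str) -> str:
    """Relative link from the primary page to a saved requisite.

    Assumption: links are rewritten as paths relative to the primary page.
    """
    return requisites_dir(primary_name) + "/" + requisite_name


print(requisites_dir("article.html"))              # article_files
print(rewrite_target("article.html", "logo.png"))  # article_files/logo.png
```

Only the last extension is stripped in this sketch, so a primary file called `page.old.htm` would map to `page.old_files`; whether unmht treats such names the same way is not stated here.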
Print a brief usage summary.
List archive contents instead of unpacking. Four columns are output: file name, MIME type, size and URL. Unavailable entries are replaced by "(?)".
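As an illustration of the listing format, the following Python sketch builds one row with the four columns and the "(?)" placeholder. It is not taken from unmht; the function name `listing_row` and the tab separator are assumptions, since the column separator is not specified above.

```python
def listing_row(name=None, mime_type=None, size=None, url=None, sep="\t"):
    """Format one listing row: file name, MIME type, size, URL.

    Entries that are unavailable (None) are replaced by "(?)".
    The tab separator is an assumption made for this sketch.
    """
    fields = (name, mime_type, size, url)
    return sep.join("(?)" if field is None else str(field) for field in fields)


# A part whose original URL is unknown:
print(listing_row("logo.png", "image/png", 2048, None))
```

The check is against `None` rather than truthiness, so a size of 0 is printed as `0` and not as a placeholder.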
If the argument ends in a slash or names an existing directory, unpack to that directory instead of the current directory. Otherwise the argument is taken as the path of the file to which the primary HTML file is written. If the output directory does not exist, it is created.
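The way the output argument is interpreted can be sketched in Python as follows. This is an illustration of the rule described above, not unmht's Perl implementation; the function name `resolve_output` and its parameters are hypothetical.

```python
import os


def resolve_output(arg: str, primary_name: str):
    """Split an output argument into (directory, primary file name).

    arg          -- the value given for the output option
    primary_name -- the primary HTML file name from the archive, used
                    when arg names a directory (hypothetical parameter)
    """
    # A trailing slash or an existing directory means "unpack into here".
    if arg.endswith("/") or os.path.isdir(arg):
        return arg, primary_name
    # Otherwise arg is the path of the primary HTML file itself.
    directory, file_name = os.path.split(arg)
    return directory or ".", file_name


print(resolve_output("out/", "index.html"))
print(resolve_output("saved/page.html", "index.html"))
```

A caller would then create the directory if it is missing, for example with `os.makedirs(directory, exist_ok=True)`, before writing any files.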
unmht is Copyright (c) 2012 Volker Schatz. It may be copied and/or modified under the same terms as Perl.