Convert::TNEF − Perl module to read TNEF files
use Convert::TNEF;
$tnef = Convert::TNEF−>read($iohandle, \%parms)
or die Convert::TNEF::errstr;
$tnef = Convert::TNEF−>read_in($filename, \%parms)
or die Convert::TNEF::errstr;
$tnef = Convert::TNEF−>read_ent($mime_entity, \%parms)
or die Convert::TNEF::errstr;
$tnef−>purge;
$message = $tnef−>message;
@attachments = $tnef−>attachments;
$attribute_value = $attachments[$i]−>data($att_attribute_name);
$attribute_value_size = $attachments[$i]−>size($att_attribute_name);
$attachment_name = $attachments[$i]−>name;
$long_attachment_name = $attachments[$i]−>longname;
$datahandle = $attachments[$i]−>datahandle($att_attribute_name);
TNEF stands for Transport Neutral Encapsulation Format, and if you've
ever been unfortunate enough to receive one of these files as an email
attachment, you may want to use this module.
read() takes as its first argument any file handle open
for reading. The optional second argument is a hash reference
which contains one or more of the following keys:
output_dir − Path for storing TNEF attribute data kept in files
(default: current directory).
output_prefix − File prefix for TNEF attribute data kept in files
(default: 'tnef').
output_to_core − TNEF attribute data will be saved in core memory unless
it is greater than this many bytes (default: 4096). May also be set to
'NONE' to keep all data in files, or 'ALL' to keep all data in core.
buffer_size − Buffer size for reading in the TNEF file (default: 1024).
debug − If true, outputs all sorts of info about what the read() function
is reading, including the raw ascii data along with the data converted
to hex (default: false).
display_after_err − If debug is true and an error is encountered,
reads and displays this many bytes of data following the error
(default: 32).
debug_max_display − If debug is true then read and display at most
this many bytes of data for each TNEF attribute (default: 1024).
debug_max_line_size − If debug is true then at most this many bytes of
data will be displayed on each line for each TNEF attribute
(default: 64).
ignore_checksum − If true, will ignore checksum errors while parsing
data (default: false).
read() returns an object containing the TNEF 'attributes' read from the
file and the data for those attributes. If all you want are the
attachments, then this is mostly garbage, but if you're interested then
you can see all the garbage by turning on debugging. If the garbage
proves useful to you, then let me know how I can maybe make it more
useful.
If an error is encountered, an undefined value is returned and the
package variable $errstr is set to some helpful message.
read_in() is a convienient front end for read() which takes a filename
instead of a handle.
read_ent() is another convient front end for read() which can take a
MIME::Entity object (or any object with like methods, specifically
open("r"), read($buff,$num_bytes), and close ).
purge() deletes any on−disk data that may be in the attachments of
the TNEF object.
message() returns the message portion of the tnef object, if any.
The thing it returns is like an attachment, but its not an attachment.
For instance, it more than likely does not have a name or any
attachment data.
attachments() returns a list of the attachments that the given TNEF
object contains. Returns a list ref if not called in array context.
data() takes a TNEF attribute name, and returns a string value for that
attribute for that attachment. Its your own problem if the string is too
big for memory. If no argument is given, then the 'AttachData' attribute
is assumed, which is probably the attachment data you're looking for.
name() is the same as data(), except the attribute 'AttachTitle' is
the default, which returns the 8 character + 3 character extension name
of the attachment.
longname() returns the long filename and extension of an attachment. This
is embedded within a MAPI property of the 'Attachment' attribute data, so
we attempt to extract the name out of that.
size() takes an TNEF attribute name, and returns the size in bytes for
the data for that attachment attribute.
datahandle() is a method for attachments which takes a TNEF attribute
name, and returns the data for that attribute as a handle which is
the same as a MIME::Body handle. See MIME::Body for all the applicable
methods. If no argument is given, then 'AttachData' is assumed.
# Here's a rather long example where mail is retrieved
# from a POP3 server based on header information, then
# it is MIME parsed, and then the TNEF contents
# are extracted and converted.
use strict;
use Net::POP3;
use MIME::Parser;
use Convert::TNEF;
my $mail_dir = "mailout";
my $mail_prefix = "mail";
my $pop = new Net::POP3 ( "pop3server_name" );
my $num_msgs = $pop−>login("user_name","password");
die "Can't login: $!" unless defined $num_msgs;
# Get mail by sender and subject
my $mail_out_idx = 0;
MESSAGE: for ( my $i=1; $i<= $num_msgs; $i++ ) {
my $header = join "", @{$pop−>top($i)};
for ($header) {
next MESSAGE unless
/^from:.*someone\@somewhere.net/im &&
/^subject:\s*important stuff/im
}
my $fname = $mail_prefix."−".$$.++$mail_out_idx.".doc";
open (MAILOUT, ">$mail_dir/$fname")
or die "Can't open $mail_dir/$fname: $!";
# If the get() complains, you need the new libnet bundle
$pop−>get($i, \*MAILOUT) or die "Can't read mail";
close MAILOUT or die "Error closing $mail_dir/$fname";
# If you want to delete the mail on the server
# $pop−>delete($i);
}
close MAILOUT;
$pop−>quit();
# Parse the mail message into separate mime entities
my $parser=new MIME::Parser;
$parser−>output_dir("mimemail");
opendir(DIR, $mail_dir) or die "Can't open directory $mail_dir: $!";
my @files = map { $mail_dir."/".$_ } sort
grep { −f "$mail_dir/$_" and /$mail_prefix−$$−/o } readdir DIR;
closedir DIR;
for my $file ( @files ) {
my $entity=$parser−>parse_in($file) or die "Couldn't parse mail";
print_tnef_parts($entity);
# If you want to delete the working files
# $entity−>purge;
}
sub print_tnef_parts {
my $ent = shift;
if ( $ent−>parts ) {
for my $sub_ent ( $ent−>parts ) {
print_tnef_parts($sub_ent);
}
} elsif ( $ent−>mime_type =~ /ms−tnef/i ) {
# Create a tnef object
my $tnef = Convert::TNEF−>read_ent($ent,{output_dir=>"tnefmail"})
or die $Convert::TNEF::errstr;
for ($tnef−>attachments) {
print "Title:",$_−>name,"\n";
print "Data:\n",$_−>data,"\n";
}
# If you want to delete the working files
# $tnef−>purge;
}
}
perl(1), IO::Wrap(3), MIME::Parser(3), MIME::Entity(3), MIME::Body(3)
The parsing may depend on the endianness (see perlport) and width of
integers on the system where the TNEF file was created. If this proves
to be the case (check the debug output), I'll see what I can do
about it.
Douglas Wilson, dougw@cpan.org
Personal Opportunity - Free software gives you access to billions of dollars of software at no cost. Use this software for your business, personal use or to develop a profitable skill. Access to source code provides access to a level of capabilities/information that companies protect though copyrights. Open source is a core component of the Internet and it is available to you. Leverage the billions of dollars in resources and capabilities to build a career, establish a business or change the world. The potential is endless for those who understand the opportunity.
Business Opportunity - Goldman Sachs, IBM and countless large corporations are leveraging open source to reduce costs, develop products and increase their bottom lines. Learn what these companies know about open source and how open source can give you the advantage.
Free Software provides computer programs and capabilities at no cost but more importantly, it provides the freedom to run, edit, contribute to, and share the software. The importance of free software is a matter of access, not price. Software at no cost is a benefit but ownership rights to the software and source code is far more significant.
Free Office Software - The Libre Office suite provides top desktop productivity tools for free. This includes, a word processor, spreadsheet, presentation engine, drawing and flowcharting, database and math applications. Libre Office is available for Linux or Windows.
The Free Books Library is a collection of thousands of the most popular public domain books in an online readable format. The collection includes great classical literature and more recent works where the U.S. copyright has expired. These books are yours to read and use without restrictions.
Source Code - Want to change a program or know how it works? Open Source provides the source code for its programs so that anyone can use, modify or learn how to write those programs themselves. Visit the GNU source code repositories to download the source.
Study at Harvard, Stanford or MIT - Open edX provides free online courses from Harvard, MIT, Columbia, UC Berkeley and other top Universities. Hundreds of courses for almost all major subjects and course levels. Open edx also offers some paid courses and selected certifications.
Linux Manual Pages - A man or manual page is a form of software documentation found on Linux/Unix operating systems. Topics covered include computer programs (including library and system calls), formal standards and conventions, and even abstract concepts.