34 Commits

Author SHA1 Message Date
ghostersk
0d12bea12d Update README.md 2025-03-16 21:41:21 +00:00
ghostersk
37a45344fe Update README.md 2025-03-16 21:38:54 +00:00
ghostersk
7dcf28f088 Update README.md 2025-03-16 21:37:28 +00:00
ghostersk
73bc53d3ae Add files via upload 2025-03-16 21:36:35 +00:00
ghostersk
b27f91bf5f Update README.md 2025-03-16 21:35:34 +00:00
ghostersk
552289b027 Create Gui-MSG-Viewer.py 2025-03-16 21:34:23 +00:00
ghostersk
10d73d5306 Create simple-PyQt6-gui.py 2025-03-16 21:24:17 +00:00
ghostersk
683d37a05b Delete setup.py 2025-03-16 21:22:13 +00:00
ghostersk
50aaa69932 Update README.md 2025-03-16 21:20:37 +00:00
ghostersk
cbcd61bf33 Create pyqt_pdf_print.py
Simple code for accessing .msg email content.
- Allows exporting it to PDF or HTML, along with all attachments.
2025-03-16 21:02:32 +00:00
ghostersk
6791a4a56c Update requirements.txt 2025-03-16 21:01:14 +00:00
ghostersk
a456dc94d8 Update and rename outlookmsgfile.py to msg-eml-convertor.py
Added two functions, one for single-file export and one for folder export, so they are easy to call from other modules.
2025-03-16 21:00:16 +00:00
giacom0c
e08d296478 Fix a crash when you can't get an attachment filename (#32)
Co-authored-by: giacom0c <giacomoc7@protonmail.com>
2025-01-31 21:45:53 -05:00
Alan Berezin
e9a45164d6 Fix rtfparse imports broken by rtfparse 0.9.0 and add shebang to top of outlookmsgfile.py (#27)
Co-authored-by: Alan Berezin <alan.berezin1.gmail.com>
2024-04-07 07:39:42 -04:00
Eric Xanderson
8175fe3e2a Update setup.py (#26)
Need to figure out the absolute path to the install file, then load README.md
2024-03-05 08:21:57 -05:00
Joshua Tauberer
8cb06da0b8 Credit dependencies in the README 2024-02-23 09:57:17 -05:00
Joshua Tauberer
85f6573ecb Use html2text to back-fill a plain text body if an HTML body is present 2024-02-23 09:57:17 -05:00
Joshua Tauberer
4104dc937d Use rtfparse to extract HTML message bodies from RTF containers and create multipart/alternative messages if both plain text and HTML are available
Also fixes #20.
2024-02-23 09:57:15 -05:00
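
As a rough illustration of the multipart/alternative behaviour described in this commit, the sketch below uses only the standard library `email.message.EmailMessage`. The helper name `build_message` is hypothetical and is not taken from the repository; the RTF extraction step is not shown.

```python
from email.message import EmailMessage


def build_message(plain_body, html_body):
    """Return an EmailMessage holding whichever bodies are available.

    Hypothetical helper for illustration only.
    """
    msg = EmailMessage()
    if plain_body is not None:
        msg.set_content(plain_body)
        if html_body is not None:
            # Adding an alternative turns the message into multipart/alternative.
            msg.add_alternative(html_body, subtype="html")
    elif html_body is not None:
        # Only HTML is available, so it becomes the sole body.
        msg.set_content(html_body, subtype="html")
    return msg


both = build_message("Hello", "<p>Hello</p>")
html_only = build_message(None, "<p>Hello</p>")
```

When both bodies are present, the plain text part comes first and the HTML part second, which is the conventional order for multipart/alternative.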
Joshua Tauberer
6fc382e9a6 Merge pull request #25 from MartijnVdS/string8_encoding
Decode byte strings in .msg files correctly
2024-02-23 09:30:54 -05:00
Joshua Tauberer
ce796116a5 Refactor how encodings are handled 2024-02-23 08:41:20 -05:00
Martijn van de Streek
674896d603 Decode byte strings in .msg files correctly
Non-Unicode strings in .msg files are encoded using an encoding that is
defined in a separate message property (PR_INTERNET_CPID for the body,
PR_MESSAGE_CODEPAGE for everything else).

The specification says that this property is required, however some real
world .msg files do not have it. This is why the decoding code has a
fallback to "cp1252" (Windows code page 1252, "Western Europe").

fixes #24
2024-02-23 08:01:46 +01:00
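
The fallback described in this commit could look roughly like the sketch below. It assumes the code page arrives as an integer (or `None` when the property is missing); the function name `decode_string8` and the lookup strategy are illustrative assumptions, not the repository's actual code.

```python
import codecs

FALLBACK_ENCODING = "cp1252"  # Windows code page 1252, as named in the commit


def decode_string8(raw, code_page=None):
    """Decode a non-Unicode .msg string using a Windows code page number.

    Hypothetical helper: falls back to cp1252 when the code page is
    missing or is not a codec Python knows about.
    """
    encoding = FALLBACK_ENCODING
    if code_page == 65001:
        # Code page 65001 is UTF-8.
        encoding = "utf-8"
    elif code_page is not None:
        candidate = f"cp{code_page}"
        try:
            codecs.lookup(candidate)
            encoding = candidate
        except LookupError:
            pass  # Unknown code page: keep the fallback.
    return raw.decode(encoding, errors="replace")
```

For example, the byte `0xE9` decodes to "é" with the cp1252 fallback but to "й" when the message declares code page 1251.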
Martijn van de Streek
6f1a6e4b4a Use a raw string in re, so \n and \s work (#23)
Newer versions of Python complain about "\s" not being correct syntax
(SyntaxError during import); changing the string to a raw string fixes
the issue.

Co-authored-by: Martijn van de Streek <martijn.vandestreek@exxellence.nl>
2022-03-10 18:35:07 -05:00
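
A minimal sketch of the raw-string fix follows. The pattern here is made up for illustration, since the commit message does not quote the original expression; the point is only that `r"..."` passes `\s` and `\n` to the regex engine unchanged.

```python
import re

# Raw string: the backslashes reach the regex engine untouched, so Python
# never tries to interpret "\s" as a string escape sequence.
LINE_BREAK_RUN = re.compile(r"\s*\n\s*")


def collapse_line_breaks(text):
    """Replace each line break and its surrounding whitespace with one space."""
    return LINE_BREAK_RUN.sub(" ", text)


collapsed = collapse_line_breaks("first line \n   second line")
```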
Martijn van de Streek
5fa8976f86 Fix a crash when all 64 bits in timestamp are 1 (#22)
We've found some .msg files in the wild that have a CREATION_TIME that
has all 64 bits set: 9223372036854775807.

Adding this number of 100ns intervals to the base timestamp of
1601-01-01 results in a timestamp somewhere in the year 30828 which is
not supported by Python's datetime module, as datetime.MAXYEAR is
currently 9999.

Co-authored-by: Martijn van de Streek <martijn.vandestreek@exxellence.nl>
2022-02-10 11:41:08 -05:00
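
The overflow can be reproduced and guarded against as sketched below. The function name and the choice to return `None` for out-of-range values are assumptions made for illustration; the commit does not say how the repository handles the value. Note that 9223372036854775807 is the largest signed 64-bit integer.

```python
from datetime import datetime, timedelta, timezone

# Base of the 100 ns tick count used by the timestamp, per the commit message.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(ticks):
    """Convert 100 ns ticks since 1601-01-01 to a datetime.

    Returns None when the result falls outside the range that
    Python's datetime supports (years 1 through 9999).
    """
    try:
        # Integer division keeps the arithmetic exact for very large values.
        return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)
    except OverflowError:
        return None


broken = filetime_to_datetime(9223372036854775807)
unix_epoch = filetime_to_datetime(116444736000000000)
```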
Martijn van de Streek
64c07db5b0 Use logging to log parse errors (#19)
Use `logging` to log parse errors, replacing print()
2021-07-21 10:05:27 -04:00
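
The switch from `print()` to `logging` might be sketched as follows. The parsing function is a stand-in invented for this example; only the use of a module-level logger reflects the commit.

```python
import logging

# A module-level logger lets the calling application decide where
# parse errors go, instead of always writing to stdout.
logger = logging.getLogger(__name__)


def parse_int_property(raw):
    """Parse an integer property, logging a warning instead of printing."""
    try:
        return int(raw)
    except ValueError:
        logger.warning("Could not parse property value %r", raw)
        return None
```

A caller that wants to see these messages can call `logging.basicConfig(level=logging.WARNING)`; a caller that does not can leave logging unconfigured or raise the level.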
Martijn van de Streek
560a513349 Skip attachments without "__properties_version1.0" streams (#18)
We've found that messages with RTF formatting that contain embedded images
contain attachments without a "__properties_version1.0" stream.

As the current code is built around the "__properties_version1.0" stream,
these are skipped for now.

These image attachments do contain streams named "Ole" and "MailStream"
that should help with decoding/parsing in the future, but that's a bigger
project.
2021-07-21 10:03:19 -04:00
Martijn van de Streek
a057080bad Fix removing of Content-Type header from transport headers (#16)
The fourth argument to `re.sub` is `count`, but `re.I` (a flag) was passed
instead.

Because of this, messages with a lower-case "content-type" header would
never have their content-type header removed, leading to parse errors.

By explicitly naming the parameter (`flags=`) to re.sub, the match
actually becomes case-insensitive.
2021-05-03 17:56:05 -04:00
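
The difference between the two calls can be seen in the sketch below. The header text and pattern are invented for illustration. The buggy call is written with `count=re.I` to make explicit what passing `re.I` as the fourth positional argument amounts to.

```python
import re

HEADERS = "content-type: text/plain\nSubject: hi\n"
PATTERN = r"Content-Type:[^\n]*\n"

# Buggy: re.I lands in the count slot, so the match stays case-sensitive
# and the lower-case header is left in place.
buggy = re.sub(PATTERN, "", HEADERS, count=re.I)

# Fixed: naming the parameter makes the match case-insensitive.
fixed = re.sub(PATTERN, "", HEADERS, flags=re.I)
```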
Martijn van de Streek
d9edd0d32f Make package "pip install"able (#15)
By specifying "py_modules" instead of "packages" in setup.py, the
single-file module is found and installed in site-packages correctly.
2020-10-22 10:39:37 -04:00
Manabu Niseki
7f80b8e6bc Improve attachment filename normalization (#14)
Use `os.path.basename` instead of `urllib.parse.quote_plus` to improve filename normalization.
2020-07-05 18:47:28 -04:00
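
To show how the two normalizations differ, here is a small comparison on a made-up attachment name. The sample filename is an assumption; the two functions are the ones named in the commit.

```python
import os.path
from urllib.parse import quote_plus

raw_name = "reports/Q1 summary.pdf"

# Old approach: percent-encodes the separator and turns spaces into "+".
encoded = quote_plus(raw_name)

# New approach: keeps only the final path component, leaving it readable.
normalized = os.path.basename(raw_name)
```

Here `encoded` is `reports%2FQ1+summary.pdf`, while `normalized` is `Q1 summary.pdf`.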
Rodrigo Salvador
d4a5944aba Include dependencies by requirements.txt (#10) 2019-09-16 19:49:04 -04:00
Alfredo
73fac36c80 Check for ATTACH_LONG_FILENAME before ATTACH_FILENAME (#7) 2019-05-22 06:21:36 -04:00
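
The lookup order from this commit could be sketched as below, assuming the attachment properties are available as a plain dictionary keyed by property name. The function name and the `default` value are hypothetical.

```python
def attachment_filename(props, default="attachment"):
    """Prefer the long filename, then the short one, then a default.

    Hypothetical helper: `props` is assumed to map property names to values.
    """
    for key in ("ATTACH_LONG_FILENAME", "ATTACH_FILENAME"):
        value = props.get(key)  # .get avoids a KeyError for missing keys
        if value:
            return value
    return default


name = attachment_filename(
    {"ATTACH_FILENAME": "QUARTE~1.PDF", "ATTACH_LONG_FILENAME": "Quarterly report.pdf"}
)
```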
Alfredo
eee84c759f Check for key in props (#8) 2019-05-22 06:20:10 -04:00
Jeff Kerr
a8e1e8f064 Create LICENSE (#6) 2018-11-12 11:10:02 -05:00
Joshua Tauberer
4779154c8c URL-encode attachment filenames to avoid a "recursion depth exceeded" error when the message is converted to bytes 2018-03-16 17:35:24 -04:00
Joshua Tauberer
3f72102e4b initial commit 2018-03-14 16:24:47 -04:00