Email Archive Script

I thought I'd share a small script I made to help with archiving Kerio mailboxes. Like the rest of you, I had users with 10, 20, 30, even 50GB of email from over the years.

This script runs on Python 2.6 (I have not tested it under Python 3, as our email server is CentOS 5 - ancient!). You can run it from the terminal in a user's mail folder, and it will move messages more than a year old into folders based on the year in each message's Date header.

So, on my system I run the script from:


You'll end up with something like

├── Deleted Items
├── Inbox
| ├── foo
| └── bar
└── Sent Items
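
Going by the target_dir_name line in the script further down, the year folders sit at the top level of the mail folder and the original folder path is repeated underneath each one. A minimal Python 3 sketch of that path logic, where the helper name archive_path and the example paths are made up purely for illustration:

```python
import os

def archive_path(mail_root, year, folder_path, filename):
    """Hypothetical helper: where a message would land after archiving.

    Mirrors os.path.join(cwd, str(msg_date.year), dirpath) from the script,
    with the file name appended.
    """
    return os.path.join(mail_root, str(year), folder_path, filename)

# Made-up example: a 2016 message that lives in Inbox/foo
print(archive_path("/mail/user", 2016, os.path.join("Inbox", "foo"), "0000001.eml"))
```

So a message from 2016 in Inbox/foo would be expected to end up under 2016/Inbox/foo, next to the live Inbox rather than inside it.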


After the script runs, you will want to re-index the mailbox, and possibly move the old files to another server or backup drive.
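
For the "move the old files" step, something along these lines might do as a starting point. It is a Python 3 sketch, not part of the script itself: the function name relocate_year_dirs and both path arguments are placeholders, and it assumes the archived folders are the four-digit year directories at the top of the mail folder.

```python
import os
import re
import shutil

def relocate_year_dirs(mail_root, backup_root):
    """Move top-level four-digit year folders from mail_root to backup_root.

    Both arguments are placeholder paths; nothing here is Kerio-specific.
    Returns the sorted list of year folder names that were moved.
    """
    moved = []
    for name in sorted(os.listdir(mail_root)):
        src = os.path.join(mail_root, name)
        # only touch directories named like a year, e.g. "2016"
        if not (os.path.isdir(src) and re.fullmatch(r"\d{4}", name)):
            continue
        dst = os.path.join(backup_root, name)
        if os.path.exists(dst):
            # skip rather than merge, so an earlier archive is never clobbered
            print("Already present, skipping:", dst)
            continue
        os.makedirs(backup_root, exist_ok=True)
        # shutil.move also works when backup_root is on another filesystem
        shutil.move(src, dst)
        moved.append(name)
    return moved
```

Skipping folders that already exist at the destination is a deliberate choice here: merging two archives of the same year is better done by hand.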

import os
import datetime
import email.utils

import pyzmail

def get_date_of_email(filename):
    """
    Return the date parsed from 'Date:' in the message header

    Args:
        filename: the name of the email file, typically .eml extension

    Returns:
        datetime.date object from the message header, or "NONE" if
        there is not one found, or the date is in a non-standard format
    """
    # close the file as soon as the headers have been parsed
    with open(filename) as input_file:
        msg = pyzmail.parse.message_from_file(input_file)
    date_string = msg.get_decoded_header('Date')
    if date_string == '':
        return "NONE"
    date_tuple = email.utils.parsedate(date_string)
    # dates like 11/16/2016 3:29:16PM cannot be parsed and come back as None
    if date_tuple is None:
        print "Date was not in RFC2822 format: ", date_string
        return "NONE"
    # handle yy dates, ie '16' instead of '2016'
    if date_tuple[0] < 100:
        # convert tuple to list so we can modify, then convert back
        l = list(date_tuple)
        l[0] += 2000
        date_tuple = tuple(l)
    try:
        return datetime.date(*date_tuple[:3])
    except ValueError:
        # header parsed, but year/month/day is out of range
        print "Date was out of range: ", date_string
        return "NONE"

def archive_folder(folder_name):
    """
    Crawls the mailbox folder and moves all items to a new folder,
    by the date of the message headers, for each calendar year

    Args:
        folder_name: name of the folder/directory to crawl
    """
    print "Archiving ", folder_name
    if not os.path.isdir(folder_name):
        print "Directory for folder doesn't seem to exist: ", folder_name
        return
    today = datetime.date.today()

    # recursively walk email directory
    for dirpath, dirnames, filenames in os.walk(folder_name):
        for f in filenames:
            # combine the full directory path with the file name
            full_filename = os.path.join(dirpath, f)
            print "* Processing ", full_filename

            # is this an .eml file?
            if os.path.splitext(f)[1].lower() != ".eml":
                print "Not an .eml file. Skipping ", full_filename
                continue
            # is 'noarchive' in the name?
            if any(s in full_filename.lower() for s in no_archive_directives):
                print "'NO ARCHIVE' is set for ", full_filename
                continue
            msg_date = get_date_of_email(full_filename)
            if msg_date == "NONE":
                print "DATE ERROR with ", full_filename
                continue
            # is this message over 1 year old? archive it.
            diff = today - msg_date
            if diff.days > 365:
                # build the new directory starting with the message year
                # create if it doesn't exist
                target_dir_name = os.path.join(cwd, str(msg_date.year), dirpath)
                if not os.path.exists(target_dir_name):
                    print "Need to create the directory ", target_dir_name
                    os.makedirs(target_dir_name)

                # os.rename only moves within one filesystem; the target
                # is under cwd, so that holds here
                os.rename(full_filename, os.path.join(target_dir_name, f))
                print "moved to ", target_dir_name

# main script begins

# folders to archive
folders_to_archive = ("INBOX", "Sent Items", "Deleted Items")
# ignore folders with this text in name. these will be compared case-insensitive
no_archive_directives = ("noarchive", "no archive")

cwd = os.getcwd()
print "current dir: ", cwd

for f in folders_to_archive:
    archive_folder(f)
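
For anyone on Python 3, a rough sketch of the date lookup using only the standard library might look like the following. It swaps pyzmail for email.parser.BytesParser, returns None rather than the "NONE" string, and the function name get_date_of_email_py3 is made up here. Treat it as a starting point rather than a tested port of the script above.

```python
from email.parser import BytesParser
from email.utils import parsedate_to_datetime

def get_date_of_email_py3(filename):
    """Return a datetime.date from the 'Date:' header, or None if unusable.

    Assumption: a stdlib-only stand-in for the pyzmail-based function.
    """
    with open(filename, "rb") as fh:
        # headersonly=True stops after the headers, so large bodies
        # and attachments are not parsed
        msg = BytesParser().parse(fh, headersonly=True)
    date_string = msg.get("Date")
    if not date_string:
        return None
    try:
        return parsedate_to_datetime(str(date_string)).date()
    except (TypeError, ValueError):
        # older Python 3 releases raise TypeError on unparseable dates,
        # newer ones raise ValueError
        return None
```

The rest of archive_folder would then need print() calls and a check for None instead of "NONE".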

[Updated on: Mon, 26 November 2018 13:53]
