Guide to Social Network Analysis of Emails
Emails are an interesting data source for social network analysis. The data is readily available in your email application and can reveal interesting aspects of your organization.
The following is a guide on how to export metadata from emails to be used in a social network analysis using Socilyzer.
Exporting Email Metadata from Outlook
This guide focuses on how to export social network data from Outlook.
The guide assumes that you are running Windows and have Python installed. The version demonstrated is Outlook 2013; however, the export functionality exists in older versions too. Follow the steps below to capture the "From", "To", "CC", and "BCC" metadata from emails in a specific mail folder.
In case you are trying to analyze and visualize Gmail metadata, Nathan Yau at Flowing Data has written a guide and a script for collecting this data.
- Click File in the ribbon then Open & Export and then Import/Export
- Choose Export to a File and proceed
- We want to export data as a Comma Separated Values (CSV) file. Select which mail folder to export metadata from, e.g. Inbox. Continue and specify the file destination.
- Mark Export E-Mail messages from... and click Map Custom Fields..., then drag the following fields from the left pane to the right one:
From: (Address) From: (Name)
To: (Address) To: (Name)
CC: (Address) CC: (Name)
BCC: (Address) BCC: (Name)
- Close the window and click Finish. The CSV file is now being generated.
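Before moving on, it can be worth checking that the export contains the mapped fields. The following is a minimal sketch and not part of the downloadable script. It assumes the header row of the CSV uses exactly the labels shown in the Map Custom Fields dialog. The function names and the default encoding are hypothetical choices for illustration, so adjust the encoding if your export uses a different one.

```python
import csv

# Column names as listed in the Map Custom Fields step above.
# Assumption: the exported header row uses exactly these labels.
EXPECTED_COLUMNS = [
    "From: (Address)", "From: (Name)",
    "To: (Address)", "To: (Name)",
    "CC: (Address)", "CC: (Name)",
    "BCC: (Address)", "BCC: (Name)",
]


def missing_columns(header):
    """Return the expected columns that are absent from a header row."""
    present = {name.strip() for name in header}
    return [name for name in EXPECTED_COLUMNS if name not in present]


def check_export(path, encoding="utf-8-sig"):
    """Read only the header row of the exported CSV and report missing columns."""
    # errors="replace" keeps the check from failing on unexpected bytes.
    with open(path, newline="", encoding=encoding, errors="replace") as handle:
        header = next(csv.reader(handle), [])
    return missing_columns(header)
```

Calling `check_export("sample_export.csv")` returns an empty list when all eight fields are present, and otherwise lists the ones that still need to be mapped.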
The next step is to process the generated CSV file with a Python script that extracts and structures the relevant data. To run the script, you will need Python installed (read how to do that here).
- Download this zip-file which includes the Python script, the settings file, and a CSV example file.
- Once unzipped, hold down Shift, right-click the folder, and select Copy as path...
- Click the Windows Start button and enter "cmd" in the search bar. This starts the command prompt.
- Enter "cd " (note the trailing space), then right-click in the window and select Paste. Hit Enter to navigate to the folder.
- Now move the CSV file generated earlier into the same folder so that it can be processed.
- Optional: you can change the default settings by opening the settings.json file in a text editor.
- Back in the command prompt: enter "python parse_exported_csv.py " followed by the name of the file, e.g., "python parse_exported_csv.py sample_export.csv" and hit enter.
- The script will ask whether to include CC and BCC data. It also lets you ignore emails sent to many people; you specify the recipient threshold for this yourself.
- The script will output three files: email_counts.csv, attributes.csv, and groups.csv
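To give an idea of the kind of processing involved, here is a minimal sketch of counting emails per sender and recipient pair. It is not the parse_exported_csv.py script itself. It assumes that the address columns carry the labels from the Outlook export step and that multiple recipients in one cell are separated by semicolons. The function names, the parameters `include_cc`, `include_bcc`, and `max_recipients`, and the sample data are all hypothetical.

```python
import csv
import io
from collections import Counter


def split_addresses(cell):
    """Split a cell into lower-cased addresses (assumes ';' as separator)."""
    return [part.strip().lower() for part in (cell or "").split(";") if part.strip()]


def count_emails(rows, include_cc=True, include_bcc=False, max_recipients=10):
    """Count messages per (sender, recipient) pair.

    Rows with more than max_recipients recipients are skipped, which
    mirrors the idea of ignoring emails sent out to many people.
    """
    fields = ["To: (Address)"]
    if include_cc:
        fields.append("CC: (Address)")
    if include_bcc:
        fields.append("BCC: (Address)")

    counts = Counter()
    for row in rows:
        senders = split_addresses(row.get("From: (Address)"))
        if not senders:
            continue  # no usable sender, nothing to count
        sender = senders[0]
        recipients = []
        for field in fields:
            recipients.extend(split_addresses(row.get(field)))
        if len(recipients) > max_recipients:
            continue  # treat as a mass mailing and ignore it
        # set() so a person listed twice on one message counts once
        for recipient in set(recipients):
            if recipient != sender:
                counts[(sender, recipient)] += 1
    return counts


# Made-up sample data for illustration only.
SAMPLE = """From: (Address),To: (Address),CC: (Address),BCC: (Address)
ann@example.com,bob@example.com;cat@example.com,,
ann@example.com,bob@example.com,cat@example.com,
bob@example.com,ann@example.com,,
"""

counts = count_emails(csv.DictReader(io.StringIO(SAMPLE)))
for (sender, recipient), n in sorted(counts.items()):
    print(sender, recipient, n)
```

With the sample data, ann has two messages each to bob and cat, and bob has one to ann. Passing `include_cc=False` would drop the second message from ann to cat, since cat only appears in the CC field there.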
Preparing Data for Import
The final step is to prepare the metadata for import into Socilyzer.
The attributes.csv file should be saved as an Excel file, but the data does not need to be altered. The data in the groups.csv file will be entered manually later.
All that needs to be processed is the email_counts.csv data, so that it fits the 0-5 ratings that Socilyzer uses in the edgelist.xlsx file. This is done by following the steps in our guide on Data Extract.
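As an illustration of what such a conversion can look like, the sketch below maps raw email counts onto a 0-5 scale using fixed cut-offs. The thresholds are hypothetical and chosen only for the example; the Data Extract guide remains the reference for the actual procedure, and the names `THRESHOLDS` and `to_rating` are not taken from any Socilyzer tool.

```python
from bisect import bisect_right

# Hypothetical cut-offs: 0 emails -> 0, 1-2 -> 1, 3-5 -> 2,
# 6-10 -> 3, 11-20 -> 4, 21 or more -> 5.
THRESHOLDS = [1, 3, 6, 11, 21]


def to_rating(count, thresholds=THRESHOLDS):
    """Map an email count to a 0-5 rating using ascending thresholds."""
    # bisect_right returns how many thresholds are <= count.
    return bisect_right(thresholds, count)


for count in (0, 2, 5, 10, 20, 100):
    print(count, "->", to_rating(count))
```

Fixed cut-offs are easy to explain, but they may need tuning to the volume of email in your organization so that the ratings spread across the full scale.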
Once the edgelist.xlsx file is ready, the files just have to be uploaded to Socilyzer (if in doubt, see this guide).
The visualized email correspondence from the example file looks like this: