It depends on the number of people emailing each other and on the operations done on the graph.
If there is a high chance that any two people have emailed each other (a dense graph), then you should go with an adjacency matrix.
On the other hand, if the number of edges (pairs of people who emailed each other at least once) is small compared to the number of email addresses, you should go with an adjacency list.
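To make the trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes directed edges, a bit-packed matrix (one bit per ordered pair), and about 8 bytes per adjacency-list entry; the function names and the 8-byte figure are illustrative assumptions, not measurements.

```python
def density(num_nodes: int, num_edges: int) -> float:
    """Fraction of possible directed edges (no self-loops) that are present."""
    if num_nodes < 2:
        return 0.0
    return num_edges / (num_nodes * (num_nodes - 1))


def matrix_bytes(num_nodes: int) -> int:
    """Bytes for a bit-packed adjacency matrix: one bit per ordered pair."""
    return (num_nodes * num_nodes + 7) // 8


def list_bytes(num_nodes: int, num_edges: int, bytes_per_entry: int = 8) -> int:
    """Rough bytes for an adjacency list: one slot per node plus one per edge.

    bytes_per_entry is an assumed figure; real overhead depends on the language.
    """
    return (num_nodes + num_edges) * bytes_per_entry


# One million addresses and ten million edges (made-up numbers for illustration).
print(matrix_bytes(1_000_000))            # 125000000000 bytes, about 125 GB
print(list_bytes(1_000_000, 10_000_000))  # 88000000 bytes, about 88 MB
```

With node counts in the millions, the matrix is only practical if the graph is dense enough to justify that fixed cost.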
Another thing to look at is what types of operations you are doing on the graph.
If the majority of the operations consist of querying whether two nodes have an edge between them, then an adjacency matrix would be the best choice.
On the other hand, if the majority of the operations are traversing the graph or querying the list of nodes connected to a given node, then an adjacency list would be better.
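If you are doing a mix of both types of queries, you could represent the graph as an array of hash tables. This is an adjacency list representation that uses hash tables instead of lists, so checking for an edge takes expected constant time while iterating over a node's neighbors stays cheap.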
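A minimal sketch of that hybrid representation, assuming directed edges from sender to receiver. It uses a dict of sets rather than a literal array of hash tables, so addresses do not have to be mapped to integer indices first; the class and method names are hypothetical.

```python
from collections import defaultdict


class EmailGraph:
    """Adjacency list backed by hash sets: one set of receivers per sender."""

    def __init__(self):
        self._adj = defaultdict(set)

    def add_email(self, sender, receiver):
        # Duplicate emails between the same pair collapse into one edge.
        self._adj[sender].add(receiver)

    def has_edge(self, sender, receiver):
        # Expected O(1); .get avoids creating empty entries for unknown senders.
        return receiver in self._adj.get(sender, ())

    def neighbors(self, sender):
        # Returns a copy so callers cannot mutate the graph by accident.
        return set(self._adj.get(sender, ()))

    def edge_count(self):
        return sum(len(receivers) for receivers in self._adj.values())


g = EmailGraph()
g.add_email("alice@example.com", "bob@example.com")
g.add_email("alice@example.com", "bob@example.com")
g.add_email("alice@example.com", "carol@example.com")
print(g.has_edge("alice@example.com", "bob@example.com"))  # True
print(g.edge_count())                                      # 2
```

For an undirected graph, `add_email` would also need to add the reverse entry.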
Update
Please check the answers to this question. They explain in detail the differences between the adjacency list and adjacency matrix.
In order to find out the number of edges
I would run a program to calculate the number of edges. It would look like the following:
    edges = hash_set
    for email in emails
        # store each (sender, receiver) pair once
        if !edges.contains({email.sender, email.receiver})
            edges.insert({email.sender, email.receiver})
        end
    end
    return edges.size
If the program crashes, you have probably run out of memory, which means the number of edges is large compared to the number of email addresses (the number of email addresses is in the millions, as mentioned in the comments), and you might want to go with an adjacency list.
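If you really want to find the exact number of edges, you could split the emails into segments, where each segment consists of the emails with the same sender, and run the above program on each segment. The final answer would then be approximately the sum of the results: it is exact if each sender-to-receiver direction counts as its own edge, while two people who emailed each other in both directions are counted twice.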
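The segmented count could be sketched as follows. It assumes the emails arrive as `(sender, receiver)` tuples already grouped by sender (for example, after an external sort), so only one sender's set of receivers is held in memory at a time; the function name and input shape are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def count_edges_segmented(emails_grouped_by_sender):
    """Count distinct (sender, receiver) pairs, one sender segment at a time.

    Assumes all emails from the same sender are adjacent in the input.
    """
    total = 0
    for _sender, segment in groupby(emails_grouped_by_sender, key=itemgetter(0)):
        # Only this sender's receivers are kept in memory.
        receivers = {receiver for _, receiver in segment}
        total += len(receivers)
    return total


emails = sorted([
    ("a", "b"), ("a", "b"), ("a", "c"),
    ("b", "a"),
    ("c", "a"),
])
print(count_edges_segmented(emails))  # 4
```

Here `a` and `b` emailed each other in both directions, so that pair contributes two to the total.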