
This is a project I pursued on my own. It is a Python script that parses Google Voice HTML files and extracts per-contact statistics from them, such as text counts and call durations.

 

Working from the HTML files downloaded from Google Takeout, the script uses Python's HTMLParser class to inspect each start tag and its attributes. Depending on which tags are flagged, the script handles the data accordingly, either counting texts or totaling the duration of calls made to and from each person. All of the information is stored in a dictionary of dictionaries, keyed by each individual's name.
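The approach described above can be sketched roughly as follows. This is a minimal illustration, not the actual script: the class name VoiceRecordParser, the helpers parse_duration and parse_voice_html, and the inner keys "texts" and "call_seconds" are all hypothetical. The markup it looks for (a span with class "fn" for the contact name, a q tag for each text, and an abbr with class "duration" holding an ISO 8601 style title such as PT1M5S) is an assumption about the Takeout export and may not match every file.

```python
from html.parser import HTMLParser
import re

# Assumed duration format in the abbr title attribute, e.g. "PT1M5S".
_DURATION = re.compile(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?")


def parse_duration(text):
    """Convert an ISO 8601 style duration such as 'PT1M5S' to seconds."""
    match = _DURATION.fullmatch(text)
    if not match:
        return 0
    hours, minutes, seconds = (int(part or 0) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds


class VoiceRecordParser(HTMLParser):
    """Hypothetical parser that fills a dictionary of dictionaries by name."""

    def __init__(self, stats):
        super().__init__()
        self.stats = stats        # {name: {"texts": int, "call_seconds": int}}
        self._in_name = False     # True while inside <span class="fn">
        self._current = None      # most recently seen contact name

    def _entry(self, name):
        return self.stats.setdefault(name, {"texts": 0, "call_seconds": 0})

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if tag == "span" and "fn" in classes:
            self._in_name = True
        elif self._current is None:
            return
        elif tag == "q":
            # Assumption: each <q> element is one text message.
            self._entry(self._current)["texts"] += 1
        elif tag == "abbr" and "duration" in classes:
            seconds = parse_duration(attributes.get("title") or "")
            self._entry(self._current)["call_seconds"] += seconds

    def handle_data(self, data):
        if self._in_name and data.strip():
            self._current = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_name = False


def parse_voice_html(html, stats=None):
    """Parse one HTML document and return the updated stats dictionary."""
    stats = {} if stats is None else stats
    parser = VoiceRecordParser(stats)
    parser.feed(html)
    parser.close()
    return stats
```

Passing the same stats dictionary to parse_voice_html for each file would accumulate totals across the whole export. Note that this sketch credits each text to whichever name appeared most recently, so separating sent from received messages would need extra handling.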

Currently, the script is functional. Two improvements are planned. The first is to automatically export the dictionary values to CSV, allowing for better data representation. The second is to replace the dictionary with a trie class organized by date, so that specific date ranges can be accessed (a feature that is currently not available).
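As a rough idea of what the planned CSV export could look like, here is a small sketch using Python's standard csv module. The function name stats_to_csv and the column names are hypothetical, and it assumes each inner dictionary holds "texts" and "call_seconds" values, which may differ from the script's real keys.

```python
import csv
import io


def stats_to_csv(stats, out):
    """Write one row per contact to the file-like object `out`."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["name", "texts", "call_seconds"])
    # Sort by name so the output is stable between runs.
    for name in sorted(stats):
        record = stats[name]
        writer.writerow([name, record.get("texts", 0), record.get("call_seconds", 0)])


# Example with made-up data; pass an open file instead of StringIO to save it.
buffer = io.StringIO()
stats_to_csv({"Bob": {"texts": 0, "call_seconds": 65},
              "Alice": {"texts": 2, "call_seconds": 0}}, buffer)
csv_text = buffer.getvalue()
```

When writing to a real file, opening it with newline="" is the usual recommendation for the csv module so that line endings are not translated twice.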

Google Voice Analytics
