In late 2017, Bitcoin (BTC) was gaining popularity as the major cryptocurrency, and many people began to acquire and exchange it. One of the draws of BTC and other cryptocurrencies is the anonymity afforded to those involved in transactions. While all transactions are public and recorded on the blockchain, only addresses and accounts are visible. People's identities are detached from any particular address. In addition, wallets allow users to generate new addresses at virtually any time.
Sources in 2016 claimed that the number of BTC users was growing exponentially. This trend was also extrapolated to the number of BTC transactions occurring on the blockchain. However, during this time BTC was still not respected as an actual currency, and it was difficult to exchange it like money. As a result, I expected to see many users but few interactions between them. In this project I took snapshots of the blockchain and performed network analyses to characterize the network and its behavior, and to explore hypotheses and claims made about it.
Using a WebSocket API provided by Blockchain.info, I sampled blockchain transaction data, which included sender and receiver addresses, amount, hash difficulty, and other fields. I was concerned only with the sender, the receiver, and the amount of BTC sent. I took three samples of the network, lasting one, two, and six hours. The data, containing a little over 13,000 records, was parsed and converted into a .graphml file for analysis. Using the igraph package in R, I looked at the following network characteristics: transitivity, average degree, average distance, reciprocity, degree distribution, and maximal cliques. In addition, I evaluated community structure on the largest network.
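To make the parsing step concrete, here is a minimal sketch of how sender/receiver/amount records could be turned into a directed GraphML file and summarized with two of the metrics listed above. It is written in Python with only the standard library and is not the code used in the project, which relied on R and igraph. The record field names (`sender`, `receiver`, `amount`) and the sample addresses are hypothetical and do not reflect the actual Blockchain.info message schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample records; field names and addresses are assumptions,
# not the schema returned by the Blockchain.info API.
records = [
    {"sender": "addrA", "receiver": "addrB", "amount": 0.5},
    {"sender": "addrB", "receiver": "addrA", "amount": 0.1},
    {"sender": "addrA", "receiver": "addrC", "amount": 2.0},
    {"sender": "addrD", "receiver": "addrC", "amount": 0.3},
]

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def to_graphml(records):
    """Serialize records as a directed GraphML document, one edge per transaction."""
    root = ET.Element("graphml", xmlns=GRAPHML_NS)
    # Declare the edge attribute that carries the amount sent.
    ET.SubElement(root, "key", {
        "id": "amount", "for": "edge",
        "attr.name": "amount", "attr.type": "double",
    })
    graph = ET.SubElement(root, "graph", id="G", edgedefault="directed")
    addresses = sorted({r["sender"] for r in records} | {r["receiver"] for r in records})
    for address in addresses:
        ET.SubElement(graph, "node", id=address)
    for r in records:
        edge = ET.SubElement(graph, "edge", source=r["sender"], target=r["receiver"])
        ET.SubElement(edge, "data", key="amount").text = str(r["amount"])
    return ET.tostring(root, encoding="unicode")


def average_degree(records):
    """Mean total degree (in + out) per address, counting every transaction."""
    addresses = {r["sender"] for r in records} | {r["receiver"] for r in records}
    return 2 * len(records) / len(addresses)


def reciprocity(records):
    """Fraction of distinct directed edges (self-loops ignored) whose reverse also exists."""
    edges = {(r["sender"], r["receiver"]) for r in records if r["sender"] != r["receiver"]}
    mutual = sum(1 for (u, v) in edges if (v, u) in edges)
    return mutual / len(edges) if edges else 0.0


graphml_text = to_graphml(records)
print(average_degree(records))  # 2.0 for the sample above
print(reciprocity(records))     # 0.5 for the sample above
```

In a network with many users but few interactions, one would expect both numbers to be low: most addresses would appear in only one or two transactions, and few payments would be returned in the opposite direction.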
The graph file was then ported over to Gephi, mainly for visualization. I also performed network analysis in Gephi, because previous experience taught me that Gephi's metrics vary slightly from those gathered by igraph. For certain network metrics, Gephi and igraph use slightly different algorithms and therefore produce different values. Gephi helped visualize the influential players in the network as well as its community structures.
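One plausible example of such a difference is clustering. igraph's `transitivity()` reports the global ratio of closed triples by default, while Gephi's average clustering coefficient is a mean of per-node values. Whether this was the source of the discrepancies I saw is an assumption; the sketch below only illustrates, in stdlib Python on a made-up toy graph, that the two definitions give different numbers for the same network.

```python
from itertools import combinations

# Toy undirected graph (made up for illustration): a triangle A-B-C
# with two extra addresses, D and E, attached only to A.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("A", "D"), ("A", "E")]


def adjacency(edges):
    """Build an undirected adjacency map, ignoring self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def global_transitivity(adj):
    """Closed connected triples divided by all connected triples."""
    closed = triples = 0
    for nbrs in adj.values():
        for a, b in combinations(nbrs, 2):
            triples += 1
            if b in adj[a]:
                closed += 1
    return closed / triples if triples else 0.0


def average_local_clustering(adj):
    """Mean of per-node clustering coefficients; nodes with degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += links / (k * (k - 1) / 2)
    return total / len(adj) if adj else 0.0


adj = adjacency(edges)
print(global_transitivity(adj))       # 3 closed triples out of 8 -> 0.375
print(average_local_clustering(adj))  # (1/6 + 1 + 1 + 0 + 0) / 5, about 0.433
```

The hub A pulls the global ratio down because it contributes most of the open triples, while the per-node average weights every address equally. How degree-one nodes are treated (counted as zero, as here, or excluded) is a further choice that changes the result.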
Through this project I was able to analyze and characterize the BTC network. More importantly, I developed skills in working with APIs, parsing large data sets, analyzing that data, and presenting it visually with tools such as igraph and Gephi. These skills apply to many other situations that require gathering, analyzing, and presenting or visualizing large amounts of data. R is a powerful statistics tool, and I foresee my skills with it being useful in my pursuit of data-driven research. Networks are ever-present, and being able to recognize and analyze them is an important skill.