Monday, January 14, 2019

Bitcoin Census trickery

I wish I could have posted this on my gh-pages blog site,
but apparently I do not have sufficient skills to embed
script tags into their markdown rendering.
So, in order to show some fancy maps I have to use this blog again.
Stay tuned, and regularly check both of my blogs for updates,
because gh-pages is better for embedded code snippets while
this one seems better for embedding HTML.

I published my BTC mapping engine, hoschi.

It will crawl through the BTC p2p network, fetching all
addresses it can. The obtained raw data may be used for
further analysis like mapping it to geo locations or building
graphs of the connected nodes in order to spot unusual network
setups or to locate mining farms.
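The crawl described above is essentially a breadth-first walk over the peer graph. Here is a minimal sketch of that idea in Python; it is not hoschi's actual code. The `fetch_peers` callable is a hypothetical stand-in for the network part (connect to a node, do the version handshake, send `getaddr`, collect the returned addresses), and `crawl`, `Peer` and `max_nodes` are names made up for this illustration.

```python
from collections import deque
from typing import Callable, Dict, Iterable, Set, Tuple

Peer = Tuple[str, int]  # (ip, port); hypothetical type alias for this sketch


def crawl(seeds: Iterable[Peer],
          fetch_peers: Callable[[Peer], Iterable[Peer]],
          max_nodes: int = 100_000) -> Dict[Peer, Set[Peer]]:
    """Breadth-first walk; returns node -> set of peers that node advertised."""
    graph: Dict[Peer, Set[Peer]] = {}
    queue = deque(seeds)
    seen = set(queue)
    while queue and len(graph) < max_nodes:
        node = queue.popleft()
        try:
            # Assumption: fetch_peers raises OSError if the node is unreachable.
            peers = set(fetch_peers(node))
        except OSError:
            peers = set()  # keep the node in the census, but without edges
        graph[node] = peers
        for peer in peers - seen:
            seen.add(peer)
            queue.append(peer)
    return graph


# Usage with a fake in-memory network instead of real sockets.
fake_net = {
    ("a", 8333): [("b", 8333), ("c", 8333)],
    ("b", 8333): [("a", 8333)],
    ("c", 8333): [("d", 8333)],
}


def fake_fetch(node: Peer) -> Iterable[Peer]:
    if node not in fake_net:
        raise OSError("unreachable")
    return fake_net[node]


census = crawl([("a", 8333)], fake_fetch)
```

The returned dictionary is the raw data: its keys are the census, and its values are the edges for any later graph analysis.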

For instance, scanning through the BTC testnet yields this map:



The cool thing is that github will automatically render
geojson files and also cluster the points for me.
The map for the BTC main network is too large to be rendered
on gh.
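For reference, a GeoJSON file of the kind github renders can be produced with a few lines of Python. This is only a sketch: it assumes the IP-to-location lookup has already been done elsewhere, and the function name `to_geojson`, the `ip` property and the sample coordinates are made up for illustration.

```python
import json


def to_geojson(nodes):
    """nodes: iterable of (ip, latitude, longitude) tuples."""
    features = [
        {
            "type": "Feature",
            # GeoJSON wants [longitude, latitude], in that order.
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"ip": ip},
        }
        for ip, lat, lon in nodes
    ]
    return {"type": "FeatureCollection", "features": features}


# Made-up sample points, not census data.
sample = [("192.0.2.1", 52.52, 13.40), ("2001:db8::1", 37.77, -122.42)]
geojson_text = json.dumps(to_geojson(sample), indent=2)
```

Writing `geojson_text` to a file with a `.geojson` extension is what lets github pick it up as a map.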
Nevertheless, I was actually interested in big nodes, that is, nodes
which are connected to more than 1000 other BTC nodes in
the network. In my census, I found ~272k mainnet BTC nodes,
~1k of which are big nodes. These could either be just
long-running nodes, so that a lot of other nodes have had
the chance to connect to them, or nodes located near
mining farms in order to distribute mined blocks quickly
across the p2p network. However, it may also be possible
that mining farms use their own BTC software that does not
answer getaddr requests.
Some of the big nodes do seem to be special, as whois lookups
showed weird registrar entries such as Data Bureau.
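Picking out big nodes from the crawl data boils down to counting neighbours per node. The sketch below shows one way to do that. It rests on an assumption: it treats every address a node advertised, or was advertised by, as a connection, which may overcount real live connections. The name `big_nodes` and the `threshold` parameter are illustrative only.

```python
from collections import defaultdict


def big_nodes(graph, threshold=1000):
    """graph: node -> iterable of peers.

    Returns {node: degree} for every node with more than `threshold`
    distinct neighbours, counting edges in both directions.
    """
    neighbours = defaultdict(set)
    for node, peers in graph.items():
        for peer in peers:
            if peer != node:  # ignore nodes advertising themselves
                neighbours[node].add(peer)
                neighbours[peer].add(node)
    return {n: len(p) for n, p in neighbours.items() if len(p) > threshold}


# Tiny made-up graph; a low threshold stands in for the 1000 used above.
toy = {"hub": ["a", "b", "c"], "a": ["hub"], "d": ["hub"]}
hubs = big_nodes(toy, threshold=3)
```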

Other interesting occurrences were nodes that tried to
join both testnet and mainnet, heavily multihomed nodes, or
nodes distributing multicast or otherwise special IP addresses.
According to my census, IPv6 is already much more widespread than
one would think: 22% of the BTC nodes were IPv6.
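Sorting addresses into IPv4, IPv6, multicast and other special ranges can be done with Python's standard `ipaddress` module. The following is a sketch under my own assumptions about what counts as "special" (loopback, link-local, private, unspecified, reserved). The function names and sample addresses are illustrative and not taken from the census.

```python
import ipaddress
from collections import Counter


def classify(addr: str) -> str:
    """Return 'multicast', 'special', 'ipv4' or 'ipv6' for one address."""
    ip = ipaddress.ip_address(addr)
    if ip.is_multicast:
        return "multicast"
    if (ip.is_loopback or ip.is_link_local or ip.is_private
            or ip.is_unspecified or ip.is_reserved):
        return "special"
    return f"ipv{ip.version}"


def ipv6_share(addrs) -> float:
    """Fraction of ordinary (non-special) addresses that are IPv6."""
    counts = Counter(classify(a) for a in addrs)
    ordinary = counts["ipv4"] + counts["ipv6"]
    return counts["ipv6"] / ordinary if ordinary else 0.0


# Illustrative addresses only.
addrs = ["8.8.8.8", "1.1.1.1", "9.9.9.9",
         "2001:4860:4860::8888", "224.0.0.1", "10.0.0.1"]
share = ipv6_share(addrs)
```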