Causal Graph Inference on Cryptocurrencies

This post explores the use of a causal analysis tool that allows us to generate a partial causal graph from purely observational data.

And because cryptocurrency is the new investment hype, let's attempt to make sense of crypto prices using causal graph inference! 🤑


1. Correlation DOES NOT Imply Causation

People often confuse correlation with causation. The following charts should illustrate my point:

chart1

chart2

chart3

Note: You can find more entertaining correlations at Spurious Correlations.

In short:

  • Causation is hard to find, but very powerful (💪).
  • Correlation is easier to find, but less powerful (as evident in the charts above).
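
To make the point concrete, here is a minimal sketch on simulated data (not crypto prices). It assumes a hidden variable z that drives both x and y; the variable names, noise levels, and the linear "regress out z" step are all my own choices for illustration. x and y never influence each other, yet they come out strongly correlated until z is accounted for.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n_samples = 5000

# Assumed toy setup: z is a common cause of x and y.
# There is no arrow between x and y at all.
z = rng.normal(size=n_samples)
x = z + 0.5 * rng.normal(size=n_samples)
y = z + 0.5 * rng.normal(size=n_samples)

# Plain correlation between x and y.
raw_corr = np.corrcoef(x, y)[0, 1]


def residual(target, control):
    """Remove the part of target that a straight-line fit on control explains."""
    slope, intercept = np.polyfit(control, target, deg=1)
    return target - (slope * control + intercept)


# Correlation that is left once z has been regressed out of both variables.
adjusted_corr = np.corrcoef(residual(x, z), residual(y, z))[0, 1]

print(f"corr(x, y)             = {raw_corr:.2f}")
print(f"corr(x, y) given z     = {adjusted_corr:.2f}")
```

The first number should be large and the second close to zero: the correlation was real, but it came entirely from the shared cause.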

2. How to Quantify Causality?

We'll talk about causality from the perspective of the Pearlian causal framework, which uses graphs as the language of causality.

In other words, to represent "X causes Y" explicitly, we simply draw a graph:

x_to_y

It's intuitive and pictorial, and lets you talk about causal pathways from one variable to another: if you can put together a chain of cause and effect going from X to Y, then X might have a causal effect on Y. In that framework, it’s easy to enumerate the consequences of actions.
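
As a rough sketch of what "causal pathways" look like in code, the snippet below builds a tiny directed graph with networkx. The variables (rain, wet_road, and so on) are hypothetical and only there for illustration; the helper might_cause is my own name, not part of any library.

```python
import networkx as nx

# Hypothetical toy graph: an edge (a, b) reads as "a causes b".
causal_graph = nx.DiGraph()
causal_graph.add_edges_from([
    ("rain", "wet_road"),
    ("wet_road", "accident"),
    ("rain", "umbrella_sales"),
])


def might_cause(graph, cause, effect):
    """Return True if a chain of cause-and-effect edges leads from cause to effect."""
    return nx.has_path(graph, cause, effect)


# Every causal pathway from rain to accident.
paths = list(nx.all_simple_paths(causal_graph, "rain", "accident"))

# Everything that could be affected by an action on rain.
downstream = nx.descendants(causal_graph, "rain")

print(might_cause(causal_graph, "rain", "accident"))
print(might_cause(causal_graph, "umbrella_sales", "accident"))
print(paths)
print(sorted(downstream))
```

Note that umbrella_sales and accident share a cause but have no directed path between them, so the graph says neither can affect the other.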

But how do you find the graph? We'll use a causal analysis tool that's able to give you a partial causal graph from purely observational data. However, before we proceed to the next section, be sure to install the following:

  • pip install causality
  • pip install networkx
  • pip install colour (used later to color the nodes)

3. Let's "Causal Analyze" Cryptocurrencies

To retrieve crypto data, load the code from this blog post. We're only interested in the variable combined_df, which is a pandas.DataFrame object that contains the values of each cryptocurrency (in USD) w.r.t. time.

You should see the following table (i.e. last 5 rows) when you execute combined_df.tail():

                   DASH        ETC         ETH         LTC        SC       STR       XEM         XMR       XRP           BTC
date
2017-12-18  1093.617265  37.237390  743.536101  329.416791  0.014948  0.265881  0.755417  350.778285  0.731127  18684.557924
2017-12-19  1155.879642  39.012279  811.639229  346.905402  0.015313  0.272750  0.923820  363.436691  0.762403  18015.201393
2017-12-20  1347.509546  38.341707  780.718146  315.366494  0.018790  0.235122  0.889440  403.312159  0.701708  16628.158457
2017-12-21  1402.437089  38.373994  791.643752  302.663058  0.022633  0.251669  0.945312  425.916859  0.943081  15938.493055
2017-12-22  1247.069623  32.871011  748.259744  283.881877  0.019761  0.245783  0.867034  365.363276  1.127947  15438.632917

As always, we'll need to import the required packages.

from causality.inference.search import IC
from causality.inference.independence_tests import RobustRegressionTest
import networkx as nx

import matplotlib.pyplot as plt
%matplotlib inline
  
# Set figure width to 15 and height to 12
plt.rcParams["figure.figsize"] = [15., 12.]

We'll proceed to perform causal analysis on our cryptocurrency prices.

The output will be a variable called graph.

# Define the variable types: 'c' is 'continuous'.  
# The variables defined here are the ones the search is performed over, 
# i.e. the columns in our DataFrame that represent various cryptocurrencies.
variable_types = {x:'c' for x in combined_df.columns}

# Run the IC* algorithm (IC = Inductive Causation)
ic_algorithm = IC(RobustRegressionTest, alpha=0.1)
graph = ic_algorithm.search(data=combined_df, variable_types=variable_types)
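
If you're wondering how a search like this can get arrow directions out of observational data at all, the toy example below may help. It is not the causality package's implementation, just a sketch of the kind of conditional-independence reasoning that Inductive Causation builds on. I'm assuming simulated data, linear relationships, and a hand-rolled partial_corr helper (my own name). Two independent causes a and b share an effect c: they look unrelated on their own, but become dependent once you condition on c. That signature is what lets an algorithm orient the edges as a → c ← b.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n_samples = 5000

# Assumed toy setup: a and b are independent causes of c.
a = rng.normal(size=n_samples)
b = rng.normal(size=n_samples)
c = a + b + 0.5 * rng.normal(size=n_samples)


def partial_corr(first, second, given):
    """Correlate first and second after removing a linear fit on given from both."""
    def residual(target):
        slope, intercept = np.polyfit(given, target, deg=1)
        return target - (slope * given + intercept)

    return np.corrcoef(residual(first), residual(second))[0, 1]


# On their own, a and b are (nearly) uncorrelated.
marginal = np.corrcoef(a, b)[0, 1]

# Conditioning on their common effect c makes them dependent.
conditional = partial_corr(a, b, given=c)

print(f"corr(a, b)         = {marginal:.2f}")
print(f"corr(a, b) given c = {conditional:.2f}")
```

A common cause behaves the opposite way (dependence disappears when you condition on it), so the two patterns can be told apart from data alone.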

Let's take a peek at the nodes in the graph:

graph.nodes()

Output:

['DASH', 'ETH', 'SC', 'XEM', 'XRP', 'BTC', 'STR', 'XMR', 'LTC', 'ETC']

And then the edges in the graph:

graph.edges(data=True)

Output:

[('DASH', 'ETH', {'arrows': ['ETH', 'ETH'], 'marked': False}),
 ('DASH', 'ETC', {'arrows': ['ETC'], 'marked': False}),
 ('DASH', 'XMR', {'arrows': [], 'marked': False}),
 ('ETH', 'XEM', {'arrows': ['ETH', 'XEM'], 'marked': False}),
 ('ETH', 'ETC', {'arrows': ['ETH', 'ETH'], 'marked': True}),
 ('ETH', 'XRP', {'arrows': ['ETH', 'ETH', 'XRP', 'XRP'], 'marked': False}),
 ('SC', 'XRP', {'arrows': ['XRP', 'XRP', 'XRP'], 'marked': False}),
 ('SC', 'ETC', {'arrows': ['ETC'], 'marked': False}),
 ('XEM', 'XRP', {'arrows': ['XRP', 'XRP'], 'marked': True}),
 ('XEM', 'LTC', {'arrows': ['XEM', 'LTC'], 'marked': False}),
 ('XRP', 'STR', {'arrows': ['XRP', 'XRP', 'STR'], 'marked': False}),
 ('BTC', 'STR', {'arrows': ['BTC', 'STR'], 'marked': False}),
 ('BTC', 'XMR', {'arrows': ['BTC'], 'marked': False}),
 ('STR', 'LTC', {'arrows': ['LTC'], 'marked': True}),
 ('XMR', 'LTC', {'arrows': ['LTC'], 'marked': False})]
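
That output is a bit dense, so here is a small sketch for reading it. It uses a handful of entries copied from the list above rather than the real graph object, so it runs on its own. The helper summarize_edge is my own, and I'm assuming (from the output, not from the package docs) that 'arrows' lists the nodes an arrowhead points at and that 'marked' flags the edges the algorithm considers genuinely causal.

```python
# A few entries copied from the edge list above, in the same
# (node, node, attributes) shape that graph.edges(data=True) yields.
sample_edges = [
    ("DASH", "ETC", {"arrows": ["ETC"], "marked": False}),
    ("DASH", "XMR", {"arrows": [], "marked": False}),
    ("ETH", "ETC", {"arrows": ["ETH", "ETH"], "marked": True}),
    ("XEM", "XRP", {"arrows": ["XRP", "XRP"], "marked": True}),
    ("STR", "LTC", {"arrows": ["LTC"], "marked": True}),
]


def summarize_edge(u, v, attrs):
    """Collapse repeated arrowheads and keep the 'marked' flag."""
    heads = sorted(set(attrs["arrows"]))
    return {"edge": (u, v), "arrowheads": heads, "marked": attrs["marked"]}


summaries = [summarize_edge(u, v, d) for u, v, d in sample_edges]

# Edges flagged as marked.
marked = [s for s in summaries if s["marked"]]

# Edges with no arrowhead at all, i.e. no direction was assigned.
undirected = [s["edge"] for s in summaries if not s["arrowheads"]]

for s in marked:
    print(s["edge"], "arrowheads:", s["arrowheads"])
print("no direction:", undirected)
```

On the real graph you would pass graph.edges(data=True) instead of sample_edges.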

4. Sanitize the Data

In order to generate a visually appealing graph, we'll perform the following:

  1. Add colors to the nodes (representing various cryptocurrencies).
  2. Remove duplicated arrows from the edges.
  3. Remove unwanted labels from the edges.

4.1 Nodes: Generate unique colors per node. This is based on a previous blog post on generating a range of "n" colors.

from colour import Color

def get_color_range(n, output_type='hex'):
    red = Color('red')
    blue = Color('blue')
    color_range = list(red.range_to(blue, n))
    if output_type == 'hex':
        return [c.get_hex_l() for c in color_range]
    else:
        return [c.get_rgb() for c in color_range]
        
n_nodes = len(graph.nodes())
n_colors = get_color_range(n_nodes)

4.2 Edges: Sanitize the edges by removing duplicated arrows.

# Sanitize graph edges: remove duplicated arrows
sanitized_edges = []
for t in graph.edges(data=True):
    attr = t[2]
    attr['arrows'] = list(set(attr['arrows']))
    sanitized_edges.append((t[0], t[1], attr))

4.3 Edge Labels: Sanitize edge labels by removing labels marked as False, while converting labels marked as True into CAUSAL labels.

edge_labels = [((u,v,),'CAUSAL') if d['marked'] else ((u,v,),'') for u,v,d in graph.edges(data=True)]
edge_labels = dict(edge_labels)

5. Plot the Graph

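# The plotting calls below refer to G, so point it at the graph returned by the IC search
G = graph

pos = nx.spring_layout(G, k=2000, iterations=1000)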

# Add nodes
nx.draw_networkx_nodes(G, pos, node_color=n_colors, node_size=2000)

# Add labels to nodes
nx.draw_networkx_labels(G, pos, font_size=12)

# Add edges
nx.draw_networkx_edges(G, pos, edgelist=sanitized_edges, 
                       arrows=True, 
                       width=2.0,
                       edge_color='#1C2833',
                       style='dotted', 
                       alpha=0.8)

# Add labels to edges
nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels,
                             font_size=12,
                             font_weight='bold')

# plt.draw()
plt.show()

causal_inference_crypto

Analysis:

  • Each node represents a unique cryptocurrency, namely: Ethereum, Litecoin, Ripple, Ethereum Classic, Stellar, Dash, Siacoin, Monero, NEM, and, of course, Bitcoin.
  • A single-ended arrow (that encodes directional information of an edge) is represented by a thick, solid block.
    • Refresher: "X → Y" means "X causes Y".
  • The "CAUSAL" label on an edge indicates that the algorithm believes the arrow from X to Y is genuinely causal.
  • The absence of the label just means that the algorithm isn't sure. One possibility is a latent confounding variable (Z) that influences both X and Y.
    • In other words: "X ← Z → Y" (note the different arrow directions).

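Thus, based on historical crypto prices, the algorithm believes that the following edges are genuinely causal (these are the ones with 'marked': True in the edge output from section 3):

  • ETH and ETC (arrows: ETH)
  • XEM and XRP (arrows: XRP)
  • STR and LTC (arrows: LTC)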

I hope that by now you realize the power and importance of causal graph inference, and its potential to make you rich 😛.


Cheers! ☕️