# Causal Graph Inference on Cryptocurrencies

This post explores the use of a causal analysis tool that allows us to generate a partial causal graph from purely observational data.

And because cryptocurrency is the new investment hype, let's attempt to make sense of crypto prices using causal graph inference! 🤑

### 1. Correlation DOES NOT Imply Causation

People often confuse correlation with causation. The following charts should illustrate my point:

Note: You can find more entertaining correlations from Spurious Correlations.

In short:

• Causation is hard to find, but very powerful (💪).
• Correlation is easier to find, but less powerful (as evident in the charts above).
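
To make the point concrete, here is a minimal sketch on synthetic data (not crypto prices). It assumes a hidden variable `z` that drives both `x` and `y`; all names and noise levels are made up for illustration. The two series correlate strongly, yet forcing `x` to new values does nothing to `y`.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

# z is a hidden common cause; x and y each depend on z but not on each other.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + 0.3 * rng.gauss(0, 1) for zi in z]
y = [zi + 0.3 * rng.gauss(0, 1) for zi in z]
observed = pearson(x, y)

# "Intervene" on x: overwrite it with values that ignore z. y is left untouched.
x_forced = [rng.gauss(0, 1) for _ in range(n)]
after_intervention = pearson(x_forced, y)

print(f"observed correlation:        {observed:.2f}")
print(f"correlation after forcing x: {after_intervention:.2f}")
```

The first number should come out high and the second close to zero, which is the gap between "x moves with y" and "x moves y".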

### 2. How to Quantify Causality?

We'll talk about causality from the perspective of the Pearlian causal framework, which uses graphs as the language of causality.

In other words, to represent X causes Y explicitly, we simply use graphs:

It's intuitive and pictorial, and lets you talk about causal pathways from one variable to another: if you can put together a chain of cause and effect going from X to Y, then X might have a causal effect on Y. In that framework, it’s easy to enumerate the consequences of actions.
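
As a rough sketch of the "chain of cause and effect" idea, the snippet below stores a hypothetical toy graph as a plain dictionary and enumerates the directed paths between two variables. The variable names `X`, `M`, `W` and `Y` and the helper `causal_paths` are illustrative assumptions, not part of any library.

```python
# Hypothetical toy graph: each key is a cause of every node in its list.
causes = {
    "X": ["M"],
    "M": ["Y"],
    "W": ["Y"],
    "Y": [],
}

def causal_paths(graph, start, goal, path=None):
    """Return every directed path from start to goal (graph assumed acyclic)."""
    path = (path or []) + [start]
    if start == goal:
        return [path]
    found = []
    for child in graph.get(start, []):
        found.extend(causal_paths(graph, child, goal, path))
    return found

print(causal_paths(causes, "X", "Y"))  # [['X', 'M', 'Y']]
print(causal_paths(causes, "Y", "X"))  # []
```

A path from X to Y means X might affect Y; the empty result in the other direction means Y cannot affect X in this toy graph.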

But how do you find the graph? We'll use a causal analysis tool that's able to give you a partial causal graph from purely observational data. However, before we proceed to the next section, be sure to install the following:

• pip install causality
• pip install networkx
• pip install colour (used later to color the nodes)

### 3. Let's "Causal Analyze" Cryptocurrencies

To retrieve crypto data, load the code from this blog post. We're only interested in the variable combined_df, which is a pandas.DataFrame object that contains the values of each cryptocurrency (in USD) w.r.t. time.

You should see the following table (i.e. last 5 rows) when you execute combined_df.tail():

| date | DASH | ETC | ETH | LTC | SC | STR | XEM | XMR | XRP | BTC |
|---|---|---|---|---|---|---|---|---|---|---|
| 2017-12-18 | 1093.617265 | 37.237390 | 743.536101 | 329.416791 | 0.014948 | 0.265881 | 0.755417 | 350.778285 | 0.731127 | 18684.557924 |
| 2017-12-19 | 1155.879642 | 39.012279 | 811.639229 | 346.905402 | 0.015313 | 0.272750 | 0.923820 | 363.436691 | 0.762403 | 18015.201393 |
| 2017-12-20 | 1347.509546 | 38.341707 | 780.718146 | 315.366494 | 0.018790 | 0.235122 | 0.889440 | 403.312159 | 0.701708 | 16628.158457 |
| 2017-12-21 | 1402.437089 | 38.373994 | 791.643752 | 302.663058 | 0.022633 | 0.251669 | 0.945312 | 425.916859 | 0.943081 | 15938.493055 |
| 2017-12-22 | 1247.069623 | 32.871011 | 748.259744 | 283.881877 | 0.019761 | 0.245783 | 0.867034 | 365.363276 | 1.127947 | 15438.632917 |

As always, we'll need to import the required packages.

from causality.inference.search import IC
from causality.inference.independence_tests import RobustRegressionTest
import networkx as nx

import matplotlib.pyplot as plt
%matplotlib inline

# Set figure width to 15 and height to 12
plt.rcParams["figure.figsize"] = [15., 12.]


We'll proceed to perform causal analysis on our cryptocurrency prices.

The output will be a variable called graph.

# Define the variable types: 'c' is 'continuous'.
# The variables defined here are the ones the search is performed over,
# i.e. the columns in our DataFrame that represent various cryptocurrencies.
variable_types = {x:'c' for x in combined_df.columns}

# Run the IC* algorithm (IC = Inductive Causation)
ic_algorithm = IC(RobustRegressionTest, alpha=0.1)
graph = ic_algorithm.search(data=combined_df, variable_types=variable_types)
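
Searches of this kind lean on conditional independence tests: are two variables still related once a third is accounted for? The sketch below illustrates that question with a simple partial correlation on a synthetic chain `a -> b -> c`. It is not the library's `RobustRegressionTest`, and every name in it is an assumption made for illustration.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def residuals(ys, xs):
    """What is left of ys after removing a least-squares line fit on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 2000

# Synthetic chain: a influences b, and b influences c. a never touches c directly.
a = [rng.gauss(0, 1) for _ in range(n)]
b = [ai + rng.gauss(0, 1) for ai in a]
c = [bi + rng.gauss(0, 1) for bi in b]

marginal = pearson(a, c)                               # a and c look related
partial = pearson(residuals(a, b), residuals(c, b))    # ...until b is controlled for

print(f"corr(a, c)     = {marginal:.2f}")
print(f"corr(a, c | b) = {partial:.2f}")
```

When the second number is indistinguishable from zero, a search can drop the direct a-c edge. The `alpha` argument above presumably plays the role of the significance threshold for that decision.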


Let's take a peek at the nodes in the graph:

graph.nodes()


Output:

['DASH', 'ETH', 'SC', 'XEM', 'XRP', 'BTC', 'STR', 'XMR', 'LTC', 'ETC']


And then the edges in the graph:

graph.edges(data=True)


Output:

[('DASH', 'ETH', {'arrows': ['ETH', 'ETH'], 'marked': False}),
('DASH', 'ETC', {'arrows': ['ETC'], 'marked': False}),
('DASH', 'XMR', {'arrows': [], 'marked': False}),
('ETH', 'XEM', {'arrows': ['ETH', 'XEM'], 'marked': False}),
('ETH', 'ETC', {'arrows': ['ETH', 'ETH'], 'marked': True}),
('ETH', 'XRP', {'arrows': ['ETH', 'ETH', 'XRP', 'XRP'], 'marked': False}),
('SC', 'XRP', {'arrows': ['XRP', 'XRP', 'XRP'], 'marked': False}),
('SC', 'ETC', {'arrows': ['ETC'], 'marked': False}),
('XEM', 'XRP', {'arrows': ['XRP', 'XRP'], 'marked': True}),
('XEM', 'LTC', {'arrows': ['XEM', 'LTC'], 'marked': False}),
('XRP', 'STR', {'arrows': ['XRP', 'XRP', 'STR'], 'marked': False}),
('BTC', 'STR', {'arrows': ['BTC', 'STR'], 'marked': False}),
('BTC', 'XMR', {'arrows': ['BTC'], 'marked': False}),
('STR', 'LTC', {'arrows': ['LTC'], 'marked': True}),
('XMR', 'LTC', {'arrows': ['LTC'], 'marked': False})]


### 4. Sanitize the Data

In order to generate a visually appealing graph, we'll perform the following:

1. Add colors to the nodes (representing various cryptocurrencies).
2. Remove duplicated arrows from the edges.
3. Remove unwanted labels from the edges.

4.1 Nodes: Generate unique colors per node. This is based on a previous blog post on generating a range of "n" colors.

from colour import Color

def get_color_range(n, output_type='hex'):
    red = Color('red')
    blue = Color('blue')
    color_range = list(red.range_to(blue, n))
    if output_type == 'hex':
        return [c.get_hex_l() for c in color_range]
    else:
        return [c.get_rgb() for c in color_range]

n_nodes = len(graph.nodes())
n_colors = get_color_range(n_nodes)


4.2 Edges: Sanitize the edges by removing duplicated arrows.

# Sanitize graph edges: remove duplicated arrows
sanitized_edges = []
for t in graph.edges(data=True):
    attr = t[2]
    attr['arrows'] = list(set(attr['arrows']))
    sanitized_edges.append((t[0], t[1], attr))
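
To see what the deduplication buys us, here is a small self-contained sketch on three edges copied from the output above. It assumes that each entry in `'arrows'` names a node carrying an arrowhead; the helper `describe` is hypothetical and only renders the edge as text.

```python
# Three edges copied from the edge list printed earlier.
raw_edges = [
    ("DASH", "XMR", {"arrows": [], "marked": False}),
    ("SC", "XRP", {"arrows": ["XRP", "XRP", "XRP"], "marked": False}),
    ("ETH", "XRP", {"arrows": ["ETH", "ETH", "XRP", "XRP"], "marked": False}),
]

def describe(u, v, attr):
    """Render an edge as text, assuming 'arrows' lists the nodes with an arrowhead."""
    heads = sorted(set(attr["arrows"]))  # dedupe; sorted() keeps the order stable
    left = "<" if u in heads else ""
    right = ">" if v in heads else ""
    return f"{u} {left}-{right} {v}"

for u, v, attr in raw_edges:
    print(describe(u, v, attr))
# DASH - XMR
# SC -> XRP
# ETH <-> XRP
```

After deduplication an edge carries zero, one, or two arrowheads, which is all the plot needs.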


4.3 Edge Labels: Sanitize edge labels by removing labels marked as False, while converting labels marked as True into CAUSAL labels.

edge_labels = [((u,v,),'CAUSAL') if d['marked'] else ((u,v,),'') for u,v,d in graph.edges(data=True)]
edge_labels = dict(edge_labels)


### 5. Plot the Graph

# G is the graph returned by the IC* search above
G = graph

pos = nx.spring_layout(G, k=2000, iterations=1000)

# Add nodes
nx.draw_networkx_nodes(G, pos, node_color=n_colors, node_size=2000)

# Add labels to nodes
nx.draw_networkx_labels(G, pos, font_size=12)

# Add edges
nx.draw_networkx_edges(G, pos, edgelist=sanitized_edges,
                       arrows=True,
                       width=2.0,
                       edge_color='#1C2833',
                       style='dotted',
                       alpha=0.8)

# Add labels to edges
nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels,
                             font_size=12,
                             font_weight='bold')

# plt.draw()
plt.show()


Analysis:

• Each node represents a unique cryptocurrency, namely: Ethereum, Litecoin, Ripple, Ethereum Classic, Stellar, Dash, Siacoin, Monero, NEM, and (of course), Bitcoin.
• A single-ended arrow (that encodes directional information of an edge) is represented by a thick, solid block.
• Refresher: "X → Y" means "X causes Y".
• The "CAUSAL" label on an edge indicates that the algorithm believes there is a genuinely causal arrow from X to Y.
• The absence of the label just means that the algorithm isn't really sure. One possibility is a latent confounding variable (Z) between X and Y.
• In other words: "X ← Z → Y" (note the different arrow directions).

Thus, based on historical crypto prices, the algorithm believes that the three edges with 'marked': True in the edge list above (ETH-ETC, XEM-XRP, and STR-LTC) are genuinely causal.
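
One way to read the direction of those edges, as a minimal sketch: it copies the three marked edges from the output shown earlier and assumes that an entry in `'arrows'` names the node the arrowhead points at (so that node is the effect). The variable names are made up for illustration.

```python
# The three edges with 'marked': True, copied from the edge list printed earlier.
marked_edges = [
    ("ETH", "ETC", {"arrows": ["ETH", "ETH"], "marked": True}),
    ("XEM", "XRP", {"arrows": ["XRP", "XRP"], "marked": True}),
    ("STR", "LTC", {"arrows": ["LTC"], "marked": True}),
]

causal_pairs = []
for u, v, attr in marked_edges:
    heads = set(attr["arrows"])
    if not attr["marked"] or len(heads) != 1:
        continue  # skip anything that is not a marked, single-headed arrow
    effect = heads.pop()              # assumed: the arrowhead sits on the effect
    cause = v if effect == u else u   # the other endpoint is the cause
    causal_pairs.append((cause, effect))

for cause, effect in causal_pairs:
    print(f"{cause} -> {effect}")
# ETC -> ETH
# XEM -> XRP
# STR -> LTC
```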

I hope that by now you realize the power and importance of causal graph inference, and its potential to make you rich 😛.

Cheers! ☕️

#### Jovian Lin, Ph.D.

A Singaporean with a fiery passion in solving real-life problems with machine learning and intelligent hacks.