Google Summer of Code 2019

Quick Overview

Below is a condensed version of all the blog posts I wrote during my time working on the package symbolic-pymc. The code below is outdated; the package is now titled aesara and can be found at https://github.com/aesara-devs/aesara

Old Posts

Introduction

This summer I will be working as part of the 2019 Google Summer of Code program on the open-source package PyMC and its symbolic counterpart, aptly named symbolic-pymc. In the next few posts I will collect my thoughts and describe what both of these packages do and how I'll be contributing.

General Update

Since my last blog post I have been transitioning into working more with miniKanren. In the meantime I have created a PR that adds the ability to index graphs. This will make it easier, moving forward, for a user to study a graph piece by piece. For example,

import pymc4 as pm
import tensorflow as tf

from pymc4 import distributions as dist
from tensorflow.python.eager.context import graph_mode
from symbolic_pymc.tensorflow.printing import tf_dprint


@pm.model
def transform_example():
    x = yield dist.Normal('x', mu=0, sigma=1)
    y = yield dist.Normal('y', mu=0, sigma=1)
    q = tf.realdiv(x, y)
    return q


with graph_mode():
    model = transform_example()
    obs_graph, state = pm.evaluate_model(model)

_ = tf_dprint(obs_graph)

Tensor(RealDiv):0,  shape=[]    "RealDiv:0"
|  Op(RealDiv)  "RealDiv"
|  |  Tensor(Reshape):0,    shape=[]    "Normal_1/sample/Reshape:0"
|  |  |  Op(Reshape)    "Normal_1/sample/Reshape"
|  |  |  |  Tensor(Add):0,  shape=[1]   "Normal_1/sample/add:0"
|  |  |  |  |  Op(Add)  "Normal_1/sample/add"
|  |  |  |  |  |  Tensor(Mul):0,    shape=[1]   "Normal_1/sample/mul:0"
|  |  |  |  |  |  |  Op(Mul)    "Normal_1/sample/mul"
|  |  |  |  |  |  |  |  Tensor(Add):0,  shape=[1]   "Normal_1/sample/random_normal:0"
|  |  |  |  |  |  |  |  |  Op(Add)  "Normal_1/sample/random_normal"
|  |  |  |  |  |  |  |  |  |  Tensor(Mul):0,    shape=[1]   "Normal_1/sample/random_normal/mul:0"
|  |  |  |  |  |  |  |  |  |  |  Op(Mul)    "Normal_1/sample/random_normal/mul"
|  |  |  |  |  |  |  |  |  |  |  |  Tensor(RandomStandardNormal):0, shape=[1]   "Normal_1/sample/random_normal/RandomStandardNormal:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  Op(RandomStandardNormal) "Normal_1/sample/random_normal/RandomStandardNormal"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(ConcatV2):0,   shape=[1]   "Normal_1/sample/concat:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Op(ConcatV2)   "Normal_1/sample/concat"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[1]   "Normal_1/sample/concat/values_0:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(BroadcastArgs):0,    shape=[0]   "Normal_1/sample/BroadcastArgs:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Op(BroadcastArgs)    "Normal_1/sample/BroadcastArgs"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,  shape=[0]   "Normal_1/sample/Shape:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,  shape=[0]   "Normal_1/sample/Shape_1:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[]    "Normal_1/sample/concat/axis:0"
|  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[]    "Normal_1/sample/random_normal/stddev:0"
|  |  |  |  |  |  |  |  |  |  Tensor(Const):0,  shape=[]    "Normal_1/sample/random_normal/mean:0"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[]    "Normal/scale:0"
|  |  |  |  |  |  Tensor(Const):0,  shape=[]    "Normal/loc:0"
|  |  |  |  Tensor(ConcatV2):0, shape=[0]   "Normal_1/sample/concat_1:0"
|  |  |  |  |  Op(ConcatV2) "Normal_1/sample/concat_1"
|  |  |  |  |  |  Tensor(Const):0,  shape=[0]   "Normal_1/sample/sample_shape:0"
|  |  |  |  |  |  Tensor(StridedSlice):0,   shape=[0]   "Normal_1/sample/strided_slice:0"
|  |  |  |  |  |  |  Op(StridedSlice)   "Normal_1/sample/strided_slice"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[1]   "Normal_1/sample/Shape_2:0"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[1]   "Normal_1/sample/strided_slice/stack:0"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[1]   "Normal_1/sample/strided_slice/stack_1:0"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[1]   "Normal_1/sample/strided_slice/stack_2:0"
|  |  |  |  |  |  Tensor(Const):0,  shape=[]    "Normal_1/sample/concat_1/axis:0"
|  |  Tensor(Reshape):0,    shape=[]    "Normal_2_1/sample/Reshape:0"
|  |  |  Op(Reshape)    "Normal_2_1/sample/Reshape"
|  |  |  |  Tensor(Add):0,  shape=[1]   "Normal_2_1/sample/add:0"
|  |  |  |  |  Op(Add)  "Normal_2_1/sample/add"
|  |  |  |  |  |  Tensor(Mul):0,    shape=[1]   "Normal_2_1/sample/mul:0"
|  |  |  |  |  |  |  Op(Mul)    "Normal_2_1/sample/mul"
|  |  |  |  |  |  |  |  Tensor(Add):0,  shape=[1]   "Normal_2_1/sample/random_normal:0"
|  |  |  |  |  |  |  |  |  Op(Add)  "Normal_2_1/sample/random_normal"
|  |  |  |  |  |  |  |  |  |  Tensor(Mul):0,    shape=[1]   "Normal_2_1/sample/random_normal/mul:0"
|  |  |  |  |  |  |  |  |  |  |  Op(Mul)    "Normal_2_1/sample/random_normal/mul"
|  |  |  |  |  |  |  |  |  |  |  |  Tensor(RandomStandardNormal):0, shape=[1]   "Normal_2_1/sample/random_normal/RandomStandardNormal:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  Op(RandomStandardNormal) "Normal_2_1/sample/random_normal/RandomStandardNormal"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(ConcatV2):0,   shape=[1]   "Normal_2_1/sample/concat:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Op(ConcatV2)   "Normal_2_1/sample/concat"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[1]   "Normal_2_1/sample/concat/values_0:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(BroadcastArgs):0,    shape=[0]   "Normal_2_1/sample/BroadcastArgs:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Op(BroadcastArgs)    "Normal_2_1/sample/BroadcastArgs"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,  shape=[0]   "Normal_2_1/sample/Shape:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,  shape=[0]   "Normal_2_1/sample/Shape_1:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[]    "Normal_2_1/sample/concat/axis:0"
|  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[]    "Normal_2_1/sample/random_normal/stddev:0"
|  |  |  |  |  |  |  |  |  |  Tensor(Const):0,  shape=[]    "Normal_2_1/sample/random_normal/mean:0"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[]    "Normal_2/scale:0"
|  |  |  |  |  |  Tensor(Const):0,  shape=[]    "Normal_2/loc:0"
|  |  |  |  Tensor(ConcatV2):0, shape=[0]   "Normal_2_1/sample/concat_1:0"
|  |  |  |  |  Op(ConcatV2) "Normal_2_1/sample/concat_1"
|  |  |  |  |  |  Tensor(Const):0,  shape=[0]   "Normal_2_1/sample/sample_shape:0"
|  |  |  |  |  |  Tensor(StridedSlice):0,   shape=[0]   "Normal_2_1/sample/strided_slice:0"
|  |  |  |  |  |  |  Op(StridedSlice)   "Normal_2_1/sample/strided_slice"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[1]   "Normal_2_1/sample/Shape_2:0"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[1]   "Normal_2_1/sample/strided_slice/stack:0"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[1]   "Normal_2_1/sample/strided_slice/stack_1:0"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[1]   "Normal_2_1/sample/strided_slice/stack_2:0"
|  |  |  |  |  |  Tensor(Const):0,  shape=[]    "Normal_2_1/sample/concat_1/axis:0"

The above output can be tedious to read. With the changes I'm implementing, one can instead look at the graph in a reduced form, such as,

_ = tf_dprint(obs_graph, depth_index=range(3, 6))

|  |  |  Op(Reshape)    "Normal_1/sample/Reshape"
|  |  |  |  Tensor(Add):0,  shape=[1]   "Normal_1/sample/add:0"
|  |  |  |  |  Op(Add)  "Normal_1/sample/add"
|  |  |  |  Tensor(ConcatV2):0, shape=[0]   "Normal_1/sample/concat_1:0"
|  |  |  |  |  Op(ConcatV2) "Normal_1/sample/concat_1"
|  |  |  Op(Reshape)    "Normal_2_1/sample/Reshape"
|  |  |  |  Tensor(Add):0,  shape=[1]   "Normal_2_1/sample/add:0"
|  |  |  |  |  Op(Add)  "Normal_2_1/sample/add"
|  |  |  |  Tensor(ConcatV2):0, shape=[0]   "Normal_2_1/sample/concat_1:0"
|  |  |  |  |  Op(ConcatV2) "Normal_2_1/sample/concat_1"

Next I plan on tackling issue #19, which involves graph normalization and creating miniKanren goals for TensorFlow.

Converting PyMC4 to Symbolic-PyMC

Closing Loose Ends

Picking up from my last post, we are now in a position to use kanren and Symbolic-PyMC together to walk our SVD graph from the earlier problem and replace sections of it.

import tensorflow as tf
import numpy as np

from unification import var
from kanren import run, eq, lall

from symbolic_pymc.etuple import etuple, ExpressionTuple
from symbolic_pymc.relations.graph import graph_applyo
from symbolic_pymc.tensorflow.meta import mt
from symbolic_pymc.tensorflow.printing import tf_dprint

from tensorflow.python.framework.ops import disable_eager_execution

disable_eager_execution()

X = tf.convert_to_tensor(np.random.normal(0, 1, (10, 10)), name='X')
S = tf.matmul(X, X, transpose_a=True)
d, U, V = tf.linalg.svd(S)
S_2 = tf.matmul(U, tf.matmul(tf.linalg.diag(d), V, adjoint_b=True))
ans = S - S_2


def svd_reduceo(expanded_term, reduced_term):
    # Build a template for U @ diag(d) @ V^T, where d, U, V are the SVD
    # of the logic variable S_lv.
    S_lv = var()
    d_mt, U_mt, V_mt = mt.linalg.svd(S_lv, name=var())

    t1 = mt.matrixdiag(d_mt, name=var())
    t2 = mt.matmul(t1, V_mt, transpose_a=False, transpose_b=True, name=var())
    template_mt = mt.matmul(U_mt, t2, transpose_a=False, transpose_b=False, name=var())

    # # This is a workaround to reference issue #47.
    # d_mt.op.node_def.attr.clear()
    # t1.op.node_def.attr.clear()
    # t2.op.node_def.attr.clear()
    # template_mt.op.node_def.attr.clear()

    return lall(eq(expanded_term, template_mt),
                eq(reduced_term, S_lv))


def simplify_graph(expanded_term):
    # Walk the graph and relationally replace U @ diag(d) @ V^T with S.
    expanded_term = mt(expanded_term)
    reduced_term = var()

    graph_goal = graph_applyo(svd_reduceo, expanded_term, reduced_term)
    res = run(1, reduced_term, graph_goal)
    res_tf = res[0].eval_obj.reify()
    return res_tf


# Print the original graph and its simplified counterpart.
tf_dprint(ans)
tf_dprint(simplify_graph(ans))


def test():
    S_lv = var()
    d_mt, U_mt, V_mt = mt.linalg.svd(S_lv, name=var())

    t1 = mt.matrixdiag(d_mt, name=var())
    t2 = mt.matmul(t1, V_mt, transpose_a=False, transpose_b=True, name=var())
    template_mt = mt.matmul(U_mt, t2, transpose_a=False, transpose_b=False, name=var())
    return template_mt

Tensor(Sub):0,  shape=[10, 10]  "sub:0"
|  Op(Sub)  "sub"
|  |  Tensor(MatMul):0, shape=[10, 10]  "MatMul:0"
|  |  |  Op(MatMul) "MatMul"
|  |  |  |  Tensor(Const):0,    shape=[10, 10]  "X:0"
|  |  |  |  Tensor(Const):0,    shape=[10, 10]  "X:0"
|  |  Tensor(MatMul):0, shape=[10, 10]  "MatMul_2:0"
|  |  |  Op(MatMul) "MatMul_2"
|  |  |  |  Tensor(Svd):1,  shape=[10, 10]  "Svd:1"
|  |  |  |  |  Op(Svd)  "Svd"
|  |  |  |  |  |  Tensor(MatMul):0, shape=[10, 10]  "MatMul:0"
|  |  |  |  |  |  |  ...
|  |  |  |  Tensor(MatMul):0,   shape=[10, 10]  "MatMul_1:0"
|  |  |  |  |  Op(MatMul)   "MatMul_1"
|  |  |  |  |  |  Tensor(MatrixDiag):0, shape=[10, 10]  "MatrixDiag:0"
|  |  |  |  |  |  |  Op(MatrixDiag) "MatrixDiag"
|  |  |  |  |  |  |  |  Tensor(Svd):0,  shape=[10]  "Svd:0"
|  |  |  |  |  |  |  |  |  Op(Svd)  "Svd"
|  |  |  |  |  |  |  |  |  |  Tensor(MatMul):0, shape=[10, 10]  "MatMul:0"
|  |  |  |  |  |  |  |  |  |  |  ...
|  |  |  |  |  |  Tensor(Svd):2,    shape=[10, 10]  "Svd:2"
|  |  |  |  |  |  |  Op(Svd)    "Svd"
|  |  |  |  |  |  |  |  Tensor(MatMul):0,   shape=[10, 10]  "MatMul:0"
|  |  |  |  |  |  |  |  |  ...
Tensor(Sub):0,  shape=[10, 10]  "sub_1:0"
|  Op(Sub)  "sub_1"
|  |  Tensor(MatMul):0, shape=[10, 10]  "MatMul:0"
|  |  |  Op(MatMul) "MatMul"
|  |  |  |  Tensor(Const):0,    shape=[10, 10]  "X:0"
|  |  |  |  Tensor(Const):0,    shape=[10, 10]  "X:0"
|  |  Tensor(MatMul):0, shape=[10, 10]  "MatMul:0"
|  |  |  ...

We have now seen a way to move from TensorFlow to Symbolic-PyMC and traverse a graph. Note that simplify_graph plays the role of str_optimize from the string analogy (introduced in the "Unifying, Reifying and Symbolic-PyMC" post below). How does this relate to PyMC4?

A look into new PyMC4 models

As of the date this post was published, PyMC4 has received a large update introducing generative models. In previous iterations of PyMC4, conversion would have involved trying to pinpoint which TensorFlow object represented the observations. Luckily, with the recent changes this is controlled by the user creating the model, relieving Symbolic-PyMC of the need to search.

Consider the following model,

import numpy as np
import pymc4 as pm

from pymc4 import distributions as dist
from symbolic_pymc.tensorflow.meta import mt

from tensorflow.python.framework.ops import disable_eager_execution

disable_eager_execution()


@pm.model(keep_return=False)
def nested_model(intercept, x_coeff, x):
    y = yield dist.Normal("y", mu=intercept + x_coeff.sample() * x, sigma=1.0)
    return y


@pm.model
def main_model():
    intercept = yield dist.Normal("intercept", mu=0, sigma=10)
    x = np.linspace(-5, 5, 100)
    x_coeff = dist.Normal("x_coeff", mu=0, sigma=5)
    result = yield nested_model(intercept, x_coeff, x)
    return result


ret, state = pm.evaluate_model(main_model())
_ = [ret, state]

[<tf.Tensor 'y_3_1/sample/Reshape:0' shape=(100,) dtype=float32>,
 SamplingState(
    values: ['main_model/intercept', 'main_model/nested_model/y', 'main_model']
    distributions: ['Normal:main_model/intercept', 'Normal:main_model/nested_model/y']
    num_potentials=0
)]

The outputs of PyMC4 models are TensorFlow objects, which Symbolic-PyMC is already set up to deal with. This means one can trivially convert PyMC4 models to Symbolic-PyMC meta objects:

ret_mt = mt(ret)
_ = ret_mt

TFlowMetaTensor(tf.float32, TFlowMetaOp(TFlowMetaOpDef(obj=name: "Reshape"
i...f.Operation 'y_3_1/sample/Reshape' type=Reshape>), 0, TFlowMetaTensorShape(100,),, obj=TensorShape([100])), 'y_3_1/sample/Reshape:0', obj=<tf.Tensor 'y_3_1/sample/Reshape:0' shape=(100,) dtype=float32>)

To move in reverse, we only have to call reify on the new object:

_ = ret_mt.reify()

<tf.Tensor 'y_3_1/sample/Reshape:0' shape=(100,) dtype=float32>

Moving forward

From this point there are a few topics that need to be tackled. The first is how to implement the conversion of PyMC4 models into Symbolic-PyMC models behind the scenes. One way would be to expand the dispatcher that already handles TensorFlow objects so that it also considers PyMC4 models. Another question that has come up while digging into this is whether there exists a way to reconstruct a graph when eager mode is enabled.
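For instance, a minimal sketch of the kind of helper such a dispatcher could delegate to, using only the pm.evaluate_model and mt functions from earlier (the name model_to_meta is hypothetical, not part of either library):

import pymc4 as pm

from symbolic_pymc.tensorflow.meta import mt


def model_to_meta(model):
    """Hypothetical helper: evaluate a PyMC4 model and convert its
    TensorFlow output into a Symbolic-PyMC meta object."""
    # Evaluate the generative model to get its TensorFlow graph output.
    ret, _ = pm.evaluate_model(model)
    # Reuse the existing TensorFlow-object conversion.
    return mt(ret)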

Converting PyMC4 to Symbolic-PyMC Continued

Second evaluation and moving forward

During this second stretch I was able to have another PR accepted. This also began the phase of looking into how one might convert a PyMC4 model to symbolic-pymc. That process was the topic of my last post, where I not only closed my discussion of the SVD problem but also showed that converting a PyMC4 model to a symbolic-pymc meta object is a fairly straightforward operation after the recent changes to PyMC4.

What about converting a PyMC4 model to a symbolic-pymc meta object, making improvements, and then converting it back? In the following ratio-of-normals example I do not account for shape, since PyMC4 does not directly account for it without calling sample on the underlying TensorFlow Probability object.

import pymc4 as pm
import tensorflow as tf
import tensorflow_probability as tfp

from pymc4 import distributions as dist
from symbolic_pymc.tensorflow.meta import mt
from symbolic_pymc.tensorflow.printing import tf_dprint
from tensorflow.python.eager.context import graph_mode


@pm.model
def transform_example():
    x = yield dist.Normal('x', mu=0, sigma=1)
    y = yield dist.Normal('y', mu=0, sigma=1)
    q = tf.realdiv(x, y)
    return q


with graph_mode():
    model = transform_example()
    obs_graph, state = pm.evaluate_model(model)

_ = tf_dprint(obs_graph)

Tensor(RealDiv):0,  shape=[]    "RealDiv_1:0"
|  Op(RealDiv)  "RealDiv_1"
|  |  Tensor(Reshape):0,    shape=[]    "Normal_3_1/sample/Reshape:0"
|  |  |  Op(Reshape)    "Normal_3_1/sample/Reshape"
|  |  |  |  Tensor(AddV2):0,    shape=[1]   "Normal_3_1/sample/add:0"
|  |  |  |  |  Op(AddV2)    "Normal_3_1/sample/add"
|  |  |  |  |  |  Tensor(Mul):0,    shape=[1]   "Normal_3_1/sample/mul:0"
|  |  |  |  |  |  |  Op(Mul)    "Normal_3_1/sample/mul"
|  |  |  |  |  |  |  |  Tensor(Add):0,  shape=[1]   "Normal_3_1/sample/random_normal:0"
|  |  |  |  |  |  |  |  |  Op(Add)  "Normal_3_1/sample/random_normal"
|  |  |  |  |  |  |  |  |  |  Tensor(Mul):0,    shape=[1]   "Normal_3_1/sample/random_normal/mul:0"
|  |  |  |  |  |  |  |  |  |  |  Op(Mul)    "Normal_3_1/sample/random_normal/mul"
|  |  |  |  |  |  |  |  |  |  |  |  Tensor(RandomStandardNormal):0, shape=[1]   "Normal_3_1/sample/random_normal/RandomStandardNormal:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  Op(RandomStandardNormal) "Normal_3_1/sample/random_normal/RandomStandardNormal"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(ConcatV2):0,   shape=[1]   "Normal_3_1/sample/concat:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Op(ConcatV2)   "Normal_3_1/sample/concat"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[1]   "Normal_3_1/sample/concat/values_0:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(BroadcastArgs):0,    shape=[0]   "Normal_3_1/sample/BroadcastArgs:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Op(BroadcastArgs)    "Normal_3_1/sample/BroadcastArgs"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,  shape=[0]   "Normal_3_1/sample/Shape:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,  shape=[0]   "Normal_3_1/sample/Shape_1:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[]    "Normal_3_1/sample/concat/axis:0"
|  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[]    "Normal_3_1/sample/random_normal/stddev:0"
|  |  |  |  |  |  |  |  |  |  Tensor(Const):0,  shape=[]    "Normal_3_1/sample/random_normal/mean:0"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[]    "Normal_3/scale:0"
|  |  |  |  |  |  Tensor(Const):0,  shape=[]    "Normal_3/loc:0"
|  |  |  |  Tensor(ConcatV2):0, shape=[0]   "Normal_3_1/sample/concat_1:0"
|  |  |  |  |  Op(ConcatV2) "Normal_3_1/sample/concat_1"
|  |  |  |  |  |  Tensor(Const):0,  shape=[0]   "Normal_3_1/sample/sample_shape:0"
|  |  |  |  |  |  Tensor(StridedSlice):0,   shape=[0]   "Normal_3_1/sample/strided_slice:0"
|  |  |  |  |  |  |  Op(StridedSlice)   "Normal_3_1/sample/strided_slice"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[1]   "Normal_3_1/sample/Shape_2:0"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[1]   "Normal_3_1/sample/strided_slice/stack:0"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[1]   "Normal_3_1/sample/strided_slice/stack_1:0"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[1]   "Normal_3_1/sample/strided_slice/stack_2:0"
|  |  |  |  |  |  Tensor(Const):0,  shape=[]    "Normal_3_1/sample/concat_1/axis:0"
|  |  Tensor(Reshape):0,    shape=[]    "Normal_4_1/sample/Reshape:0"
|  |  |  Op(Reshape)    "Normal_4_1/sample/Reshape"
|  |  |  |  Tensor(AddV2):0,    shape=[1]   "Normal_4_1/sample/add:0"
|  |  |  |  |  Op(AddV2)    "Normal_4_1/sample/add"
|  |  |  |  |  |  Tensor(Mul):0,    shape=[1]   "Normal_4_1/sample/mul:0"
|  |  |  |  |  |  |  Op(Mul)    "Normal_4_1/sample/mul"
|  |  |  |  |  |  |  |  Tensor(Add):0,  shape=[1]   "Normal_4_1/sample/random_normal:0"
|  |  |  |  |  |  |  |  |  Op(Add)  "Normal_4_1/sample/random_normal"
|  |  |  |  |  |  |  |  |  |  Tensor(Mul):0,    shape=[1]   "Normal_4_1/sample/random_normal/mul:0"
|  |  |  |  |  |  |  |  |  |  |  Op(Mul)    "Normal_4_1/sample/random_normal/mul"
|  |  |  |  |  |  |  |  |  |  |  |  Tensor(RandomStandardNormal):0, shape=[1]   "Normal_4_1/sample/random_normal/RandomStandardNormal:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  Op(RandomStandardNormal) "Normal_4_1/sample/random_normal/RandomStandardNormal"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(ConcatV2):0,   shape=[1]   "Normal_4_1/sample/concat:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Op(ConcatV2)   "Normal_4_1/sample/concat"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[1]   "Normal_4_1/sample/concat/values_0:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(BroadcastArgs):0,    shape=[0]   "Normal_4_1/sample/BroadcastArgs:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Op(BroadcastArgs)    "Normal_4_1/sample/BroadcastArgs"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,  shape=[0]   "Normal_4_1/sample/Shape:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,  shape=[0]   "Normal_4_1/sample/Shape_1:0"
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[]    "Normal_4_1/sample/concat/axis:0"
|  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[]    "Normal_4_1/sample/random_normal/stddev:0"
|  |  |  |  |  |  |  |  |  |  Tensor(Const):0,  shape=[]    "Normal_4_1/sample/random_normal/mean:0"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[]    "Normal_4/scale:0"
|  |  |  |  |  |  Tensor(Const):0,  shape=[]    "Normal_4/loc:0"
|  |  |  |  Tensor(ConcatV2):0, shape=[0]   "Normal_4_1/sample/concat_1:0"
|  |  |  |  |  Op(ConcatV2) "Normal_4_1/sample/concat_1"
|  |  |  |  |  |  Tensor(Const):0,  shape=[0]   "Normal_4_1/sample/sample_shape:0"
|  |  |  |  |  |  Tensor(StridedSlice):0,   shape=[0]   "Normal_4_1/sample/strided_slice:0"
|  |  |  |  |  |  |  Op(StridedSlice)   "Normal_4_1/sample/strided_slice"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[1]   "Normal_4_1/sample/Shape_2:0"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[1]   "Normal_4_1/sample/strided_slice/stack:0"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[1]   "Normal_4_1/sample/strided_slice/stack_1:0"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[1]   "Normal_4_1/sample/strided_slice/stack_2:0"
|  |  |  |  |  |  Tensor(Const):0,  shape=[]    "Normal_4_1/sample/concat_1/axis:0"

In theory, the ratio of two independent standard normal random variables follows a Cauchy distribution. Looking at the above graph, it's clear that TensorFlow does not consider this reduction. This is a perfect situation for symbolic-pymc! To perform it we again need to construct a template to unify against, as in the SVD example.
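As a quick sanity check of the distributional claim itself, independent of symbolic-pymc, one can compare empirical quantiles in plain NumPy; medians and interquartile ranges are compared because the Cauchy distribution has no finite moments:

import numpy as np

rng = np.random.RandomState(42)
ratio = rng.normal(size=100000) / rng.normal(size=100000)
cauchy = rng.standard_cauchy(size=100000)

# Both should have a median near 0 and quartiles near -1 and 1.
print(np.median(ratio), np.median(cauchy))
print(np.percentile(ratio, [25, 75]), np.percentile(cauchy, [25, 75]))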

Converting a PyMC4 model to symbolic-pymc

As mentioned in my last post, converting PyMC4 objects to symbolic-pymc objects is relatively simple,

model_mt = mt(obs_graph)
_ = model_mt

TFlowMetaTensor(tf.float32, TFlowMetaOp(TFlowMetaOpDef(obj=name: "RealDiv"
i..._1', obj=<tf.Operation 'RealDiv_1' type=RealDiv>), 0, TFlowMetaTensorShape(,, obj=TensorShape([])), 'RealDiv_1:0', obj=<tf.Tensor 'RealDiv_1:0' shape=() dtype=float32>)

Manipulating the symbolic-pymc graph

To manipulate the graph we need to create a goal (cauchy_reduceo) that first takes the term we want to manipulate and creates a template to unify against (Q_mt). This template is then unified against another template representing what we want to substitute into the output graph (cauchy_mt).

import numpy as np

from kanren import lall, eq, run
from unification import var

from symbolic_pymc.relations.graph import graph_applyo
from symbolic_pymc.etuple import ExpressionTuple
from symbolic_pymc.tensorflow.meta import mt

from tensorflow.python.eager.context import graph_mode
from tensorflow_probability.python.internal import tensor_util


def cauchy_reduceo(expanded_term, reduced_term):
    '''Goal used for unification.'''
    X_mt = mt.reshape(tfp_normal(0, 1), shape=var(), name=var())
    Y_mt = mt.reshape(tfp_normal(0, 1), shape=var(), name=var())
    cauchy_mt = tfp_cauchy(0., 1.)
    Q_mt = mt.realdiv(X_mt, Y_mt, shape=var(), name=var())
    return lall(eq(Q_mt, expanded_term),
                eq(reduced_term, cauchy_mt))


def simplify_graph(expanded_term):
    '''Evaluates the goal.'''
    with graph_mode():
        expanded_term = mt(expanded_term)
        reduced_term = var()
        graph_goal = graph_applyo(cauchy_reduceo, expanded_term, reduced_term)
        res = run(1, reduced_term, graph_goal)
        res_tf = res[0].reify()
        return res_tf


def tfp_normal(loc, scale):
    '''Template that unifies against a sampled tfp Normal.'''
    sampled = var()
    return mt.add(mt.mul(sampled, scale, name=var()), loc, name=var())


def tfp_cauchy(loc, scale):
    '''Template that unifies against a sampled tfp Cauchy.'''
    shape = var()
    probs = mt.randomuniform(shape=shape, minval=0., maxval=1.)
    return mt.add(loc,
                  mt.mul(scale,
                         mt.tan(mt.mul(np.pi, mt.sub(probs, .5, name=var())),
                                name=var()), name=var()), name=var())


simplify_graph(obs_graph)

In the above code, tfp_normal and tfp_cauchy are created to unify against tfp.distributions.Normal and tfp.distributions.Cauchy. To create these I looked at tfp.distributions.Normal._sample_n and tfp.distributions.Cauchy._sample_n respectively. Now would be a good time to mention that symbolic-pymc no longer disables eager mode by default. The way around this is TensorFlow's own graph_mode, as shown above in simplify_graph.
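For reference, this is roughly how those sampling implementations can be looked up (a sketch; it assumes these private _sample_n methods are still present in your tensorflow_probability version):

import inspect

import tensorflow_probability as tfp

# Shows the mul/add graph a Normal sample is built from, i.e. what
# tfp_normal above mirrors with logic variables.
print(inspect.getsource(tfp.distributions.Normal._sample_n))
print(inspect.getsource(tfp.distributions.Cauchy._sample_n))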

Another feature of symbolic-pymc is its access to most of TensorFlow's API. Using this API is as simple as calling "mt.API_NAME", for example mt.add(1, 2). In the background this searches for the operation through op_def_library.OpDefLibrary and returns the corresponding meta object. It is important to use the "mt" representation because it allows us to use logic variables, i.e. var() from the unification library.
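For example, a minimal sketch of both uses (the name 'add_example' is just illustrative):

from unification import var

from symbolic_pymc.tensorflow.meta import mt

# Looks up TensorFlow's "Add" OpDef in the background and returns the
# corresponding meta tensor.
add_mt = mt.add(1, 2, name='add_example')

# The same call, but with a logic variable standing in for the name, so
# the result can serve as a unification template.
add_template_mt = mt.add(1, 2, name=var())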

Moving Forward

Now that it's possible to create a PyMC4 model, convert it to symbolic-pymc, manipulate the graph, and then convert back to a usable object, I'll be focusing on adding common algebraic operations.

Unifying, Reifying and Symbolic-PyMC

Introduction

Digging through TensorFlow, I started by computing basic examples and comparing them to NumPy's output. While doing this I came across a common theme of numerical approximation errors that theoretically should not be present. This led me to ponder: what would one have to do to get around these numerical errors that arise in today's software?

In this article, consider the situation where one uses an SVD in passing.

import numpy as np

X = np.random.normal(0, 1, (10, 10))
S = X.T.dot(X)
U, d, Vt = np.linalg.svd(S)
_ = S - np.dot(U*d, Vt)
7.10542736e-15 -1.15463195e-14 -2.66453526e-15 1.24344979e-14 2.22044605e-15 -6.66133815e-16 1.19904087e-14 -4.6629367e-15 3.33066907e-16 4.4408921e-15
-1.28785871e-14 5.32907052e-15 5.32907052e-15 -5.32907052e-15 -1.77635684e-15 8.8817842e-16 -1.52655666e-14 1.77635684e-15 8.8817842e-15 -3.55271368e-15
1.24344979e-14 -7.10542736e-15 -8.8817842e-15 1.77635684e-15 -8.8817842e-16 -1.77635684e-15 8.43769499e-15 -2.22044605e-15 -2.66453526e-15 6.21724894e-15
8.8817842e-15 -1.64313008e-14 7.10542736e-15 -1.77635684e-15 -6.21724894e-15 -4.4408921e-16 5.32907052e-15 -6.66133815e-15 2.22044605e-16 -2.44249065e-15
-4.4408921e-16 0.0 1.44328993e-15 -4.4408921e-15 -1.77635684e-15 -7.42461648e-16 -1.99840144e-15 1.11022302e-15 2.22044605e-15 -1.77635684e-15
-2.06501483e-14 1.66533454e-14 1.59872116e-14 -9.76996262e-15 6.52256027e-16 0.0 -1.33226763e-14 4.4408921e-15 5.77315973e-15 -7.10542736e-15
-1.99840144e-14 1.06026299e-14 1.7985613e-14 -7.10542736e-15 -8.8817842e-16 3.99680289e-15 -1.42108547e-14 2.66453526e-15 4.4408921e-15 -1.27675648e-14
5.55111512e-15 -2.66453526e-15 -7.10542736e-15 1.77635684e-15 6.66133815e-16 0.0 4.4408921e-16 -8.8817842e-16 -7.07767178e-16 2.66453526e-15
3.33066907e-14 -2.39808173e-14 -2.04281037e-14 1.17683641e-14 -8.8817842e-16 -3.99680289e-15 2.66453526e-14 -7.91033905e-15 -1.0658141e-14 1.37667655e-14
2.23154828e-14 -1.42108547e-14 -1.77635684e-14 1.02140518e-14 1.33226763e-15 0.0 1.44051437e-14 -5.32907052e-15 -7.10542736e-15 7.10542736e-15

Let's see if TensorFlow exhibits the same issue.

 1    """ Seeing if tensorflow has the same issue
 2    """
 3    import tensorflow as tf
 4    from tensorflow.python.framework.ops import disable_eager_execution
 5  
 6  
 7    tf.compat.v1.InteractiveSession()
 8    disable_eager_execution()
 9  
10    tfp = tfp.distributions
11  
12    X = np.random.normal(0, 1, (10, 10))
13  
14    S = tf.matmul(X, X, transpose_a=True)
15  
16    d, U, V = tf.linalg.svd(S)
17  
18    D = tf.matmul(U, tf.matmul(tf.linalg.diag(d), V, adjoint_b=True))
19    ans = S - D
20  
21    _ = ans.eval()
-3.01980663e-14 -4.4408921e-15 2.39808173e-14 4.4408921e-15 7.99360578e-15 -2.7533531e-14 1.37667655e-14 -1.59872116e-14 2.48689958e-14 7.10542736e-15
-5.99520433e-15 -1.24344979e-14 6.88338275e-15 -1.24344979e-14 1.77635684e-15 -1.82076576e-14 -1.66533454e-15 -5.77315973e-15 -3.99680289e-15 -1.95399252e-14
2.13162821e-14 2.88657986e-15 -1.77635684e-14 2.22044605e-15 -7.99360578e-15 2.57571742e-14 -1.02140518e-14 5.88418203e-15 -1.55431223e-14 3.33066907e-16
5.77315973e-15 -1.77635684e-14 2.22044605e-16 -1.77635684e-14 -1.94289029e-15 -1.0658141e-14 -8.8817842e-15 4.99600361e-15 -2.66453526e-15 -2.13162821e-14
7.77156117e-15 1.77635684e-15 -4.4408921e-15 -1.22124533e-15 -7.99360578e-15 1.46549439e-14 -4.08006962e-15 -1.99840144e-15 -1.0658141e-14 1.55431223e-15
-2.84217094e-14 -1.9095836e-14 2.66453526e-14 -8.8817842e-15 1.66533454e-14 -4.08562073e-14 3.55271368e-15 -7.77156117e-16 3.01980663e-14 -1.59872116e-14
1.28785871e-14 -1.88737914e-15 -1.0658141e-14 -8.8817842e-15 -3.1918912e-15 -8.8817842e-16 -4.97379915e-14 3.90798505e-14 1.19904087e-14 -3.55271368e-14
-1.59872116e-14 -5.32907052e-15 9.43689571e-15 5.32907052e-15 -6.66133815e-16 -2.44249065e-15 3.37507799e-14 -2.48689958e-14 -1.15463195e-14 1.0658141e-14
2.30926389e-14 -7.77156117e-16 -1.28785871e-14 -8.8817842e-16 -7.10542736e-15 2.57571742e-14 8.43769499e-15 -1.24344979e-14 -3.55271368e-14 1.15463195e-14
4.88498131e-15 -2.13162821e-14 -2.22044605e-15 -1.86517468e-14 3.77475828e-15 -1.77635684e-14 -3.73034936e-14 1.59872116e-14 1.50990331e-14 -5.32907052e-14

In theory this should be exactly 0, but due to rounding errors, stemming mostly from the limitations of floating-point arithmetic, it is not. Two questions one might ask are "Is there a way around this?" and "Why would one care?"
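(To put a number on those float limitations: the entries above are on the order of the 64-bit machine epsilon scaled by the magnitudes involved, e.g.

import numpy as np

# The smallest eps such that 1.0 + eps != 1.0 in float64; about 2.22e-16.
print(np.finfo(np.float64).eps)

)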

Answering the second question: one reason is, of course, optimizing away numerical deficiencies when possible. Another is the idea of automating "pen and paper" math. This would allow someone with a domain-specific skill set, be it in probability or numerical analysis, to automate their "tricks" and demystify more complex ideas in their respective fields.

Moving back to the first question: one method is to think of the process of performing the SVD above as a graph of operations. In this graph each node is the output of an operation, and the operations themselves are the edges connecting nodes. This would allow us to traverse the graph looking for operations that could be reduced or removed altogether.

  1. An analogy

    Consider an analogy using strings,

    input_src_code = """
    u, d, v = svd(S)
    ans = S - matmul(u, d, v)
    """
    

    Consider some optimize function that does the following:

    output_src_code = str_optimize(input_src_code)
    

    where

    assert output_src_code == """
    u, d, v = svd(S)
    ans = S - S
    """
    

    In this we need to replace "matmul(u, d, v)" with "S", but what do we need in order to implement this?

    1. Match the pattern for an SVD, e.g.

      import re
      res = re.search(r"([a-zA-Z]+), ([a-zA-Z]+), ([a-zA-Z]+) = svd\(([a-zA-Z]+)\)", input_src_code)
      U = res.group(1)
      D = res.group(2)
      V = res.group(3)
      S = res.group(4)
      
    2. If it matches, match and replace the “matmul”, e.g. with

      optimized_code = input_src_code.replace("matmul({}, {}, {})".format(U, D, V), S)
      

    Using this analogy how does this map back to the TensorFlow objects that we’ll be working with?

Graph reconstruction through TensorFlow

To begin answering the first question, let's look at what attributes our term ans exposes beyond the standard assigned objects. As a side note, this blog post covers some of the same material.

_ = [i for i in dir(ans) if not i.startswith('_')]

['OVERLOADABLE_OPERATORS',
 'consumers',
 'device',
 'dtype',
 'eval',
 'get_shape',
 'graph',
 'name',
 'op',
 'set_shape',
 'shape',
 'value_index']

One that immediately strikes some interest is ans.op.

_ = ans.op

<tf.Operation 'sub' type=Sub>

A tf.Operation is a node in a graph that corresponds to a computation. Among the properties included in tf.Operation are inputs and outputs. These correspond to the arguments and results of the operation: "S" and "matmul(…)" for inputs and "ans" for outputs in input_src_code.

Using our analogy, the above TensorFlow operation is the subtraction in the string input_src_code.

_ = [ans.op.inputs._inputs, ans.op.outputs]

[[<tf.Tensor 'MatMul:0' shape=(10, 10) dtype=float64>,
  <tf.Tensor 'MatMul_2:0' shape=(10, 10) dtype=float64>],
 [<tf.Tensor 'sub:0' shape=(10, 10) dtype=float64>]]

These look like references to the tensors that were subtracted to create ans. Of course, I can check this directly.

_ = [ans.op.inputs._inputs[0] == S, ans.op.inputs._inputs[1] == D]

[True, True]

Great! As a quick recap, I now have a way to take the result ans and walk backwards to the original matrices. Is it possible to determine what kinds of operations took place? Specifically, is it possible to determine whether there was an SVD operation? The quick answer is "Yes"! All I need to do is apply the same methods I've used thus far.

_ = ans.op.inputs._inputs[1].op.inputs._inputs[0].op

<tf.Operation 'Svd' type=Svd>

This is like the "svd(…)" in our analogy, and the argument to this "string operator" is op.inputs.

At this point it's clear there exists a way to move through operations and get the corresponding inputs and outputs. It is also possible to determine the nature of these operations. How do we do this using TensorFlow? We would need a way to traverse a TensorFlow graph and find patterns like we did above, which is analogous to searching strings with re.search and replacing with str.replace.
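As a naive sketch of such a traversal (assuming the ans tensor from the blocks above; this only prints the graph, it does not rewrite anything):

def walk_ops(tensor, depth=0):
    """Recursively print the operations feeding `tensor`."""
    op = tensor.op
    print("  " * depth + "{} ({})".format(op.name, op.type))
    for inp in op.inputs:
        walk_ops(inp, depth + 1)

walk_ops(ans)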

In later blog posts I’ll dive into creating functions that parse this graph and make the required replacements much like our string analogy. This is one of the main goals of the symbolic-pymc package I’ll be working with during GSoC 2019.

Unifying, Reifying and Symbolic-PyMC Continued

Graph Reconstruction Through TensorFlow Part 2

In the last blog post I focused on looking through TensorFlow objects to see what could be used to recreate the graph of operations. Considering the analogy given in the first post, I now have the basic information needed to recreate the str_optimize function for TensorFlow. However, there is still something to say about how manipulating the graphs will work.

 1    """ Seeing if tensorflow has the same issue
 2    """
 3    import numpy as np
 4    import tensorflow as tf
 5    from tensorflow.python.framework.ops import disable_eager_execution
 6  
 7    disable_eager_execution()
 8  
 9    X = np.random.normal(0, 1, (10, 10))
10  
11    S = tf.matmul(X, X, transpose_a=True)
12  
13    d, U, V = tf.linalg.svd(S)
14  
15    D = tf.matmul(U, tf.matmul(tf.linalg.diag(d), V, adjoint_b=True))
16    ans = S - D

Using symbolic-pymc, in particular the tf_dprint function, we can inspect the graph.

from symbolic_pymc.tensorflow.printing import tf_dprint

_ = tf_dprint(ans)

Tensor(Sub):0,  shape=[10, 10]  "sub_1:0"
|  Op(Sub)  "sub_1"
|  |  Tensor(MatMul):0, shape=[10, 10]  "MatMul_3:0"
|  |  |  Op(MatMul) "MatMul_3"
|  |  |  |  Tensor(Const):0,    shape=[10, 10]  "MatMul_3/a:0"
|  |  |  |  Tensor(Const):0,    shape=[10, 10]  "MatMul_3/b:0"
|  |  Tensor(MatMul):0, shape=[10, 10]  "MatMul_5:0"
|  |  |  Op(MatMul) "MatMul_5"
|  |  |  |  Tensor(Svd):1,  shape=[10, 10]  "Svd_1:1"
|  |  |  |  |  Op(Svd)  "Svd_1"
|  |  |  |  |  |  Tensor(MatMul):0, shape=[10, 10]  "MatMul_3:0"
|  |  |  |  |  |  |  Op(MatMul) "MatMul_3"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[10, 10]  "MatMul_3/a:0"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[10, 10]  "MatMul_3/b:0"
|  |  |  |  Tensor(MatMul):0,   shape=[10, 10]  "MatMul_4:0"
|  |  |  |  |  Op(MatMul)   "MatMul_4"
|  |  |  |  |  |  Tensor(MatrixDiag):0, shape=[10, 10]  "MatrixDiag_1:0"
|  |  |  |  |  |  |  Op(MatrixDiag) "MatrixDiag_1"
|  |  |  |  |  |  |  |  Tensor(Svd):0,  shape=[10]  "Svd_1:0"
|  |  |  |  |  |  |  |  |  Op(Svd)  "Svd_1"
|  |  |  |  |  |  |  |  |  |  Tensor(MatMul):0, shape=[10, 10]  "MatMul_3:0"
|  |  |  |  |  |  |  |  |  |  |  Op(MatMul) "MatMul_3"
|  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[10, 10]  "MatMul_3/a:0"
|  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[10, 10]  "MatMul_3/b:0"
|  |  |  |  |  |  Tensor(Svd):2,    shape=[10, 10]  "Svd_1:2"
|  |  |  |  |  |  |  Op(Svd)    "Svd_1"
|  |  |  |  |  |  |  |  Tensor(MatMul):0,   shape=[10, 10]  "MatMul_3:0"
|  |  |  |  |  |  |  |  |  Op(MatMul)   "MatMul_3"
|  |  |  |  |  |  |  |  |  |  Tensor(Const):0,  shape=[10, 10]  "MatMul_3/a:0"
|  |  |  |  |  |  |  |  |  |  Tensor(Const):0,  shape=[10, 10]  "MatMul_3/b:0"

The output of the top layer (furthest left) represents the subtraction that took place. Each subsequent step to the right effectively moves one step down in the list of operations until the original inputs are reached.

From this point the next step is to write a function that can replace the below portion,

|  |  Tensor(MatMul):0, shape=[10, 10]  "MatMul_2:0"
|  |  |  Op(MatMul) "MatMul_2"
|  |  |  |  Tensor(Svd):1,  shape=[10, 10]  "Svd:1"
|  |  |  |  |  Op(Svd)  "Svd"
|  |  |  |  |  |  Tensor(MatMul):0, shape=[10, 10]  "MatMul:0"
|  |  |  |  |  |  |  Op(MatMul) "MatMul"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[10, 10]  "MatMul/a:0"
|  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[10, 10]  "MatMul/b:0"
|  |  |  |  Tensor(MatMul):0,   shape=[10, 10]  "MatMul_1:0"
|  |  |  |  |  Op(MatMul)   "MatMul_1"
|  |  |  |  |  |  Tensor(MatrixDiag):0, shape=[10, 10]  "MatrixDiag:0"
|  |  |  |  |  |  |  Op(MatrixDiag) "MatrixDiag"
|  |  |  |  |  |  |  |  Tensor(Svd):0,  shape=[10]  "Svd:0"
|  |  |  |  |  |  |  |  |  Op(Svd)  "Svd"
|  |  |  |  |  |  |  |  |  |  Tensor(MatMul):0, shape=[10, 10]  "MatMul:0"
|  |  |  |  |  |  |  |  |  |  |  Op(MatMul) "MatMul"
|  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[10, 10]  "MatMul/a:0"
|  |  |  |  |  |  |  |  |  |  |  |  Tensor(Const):0,    shape=[10, 10]  "MatMul/b:0"
|  |  |  |  |  |  Tensor(Svd):2,    shape=[10, 10]  "Svd:2"
|  |  |  |  |  |  |  Op(Svd)    "Svd"
|  |  |  |  |  |  |  |  Tensor(MatMul):0,   shape=[10, 10]  "MatMul:0"
|  |  |  |  |  |  |  |  |  Op(MatMul)   "MatMul"
|  |  |  |  |  |  |  |  |  |  Tensor(Const):0,  shape=[10, 10]  "MatMul/a:0"
|  |  |  |  |  |  |  |  |  |  Tensor(Const):0,  shape=[10, 10]  "MatMul/b:0"

with the following:

|  |  Tensor(MatMul):0, shape=[10, 10]  "MatMul:0"
|  |  |  Op(MatMul) "MatMul"
|  |  |  |  Tensor(Const):0,    shape=[10, 10]  "MatMul/a:0"
|  |  |  |  Tensor(Const):0,    shape=[10, 10]  "MatMul/b:0"

How do we match graphs like the block above such that we obtain the argument of the "Svd" operator (i.e. S, the original MatMul)?

Unification and Reification

The idea behind unification is to make two terms equal by finding substitutions for logic variables that satisfy that equality. A logic variable is like an unknown term in algebra, and substitutions are simply mappings between logic variables and values. Let's look at a few quick examples where x is a logic variable,

from unification import unify, reify, var

x = var('x')
_ = [unify((4, x), (4, 5), {}),
     unify(['t', x, 'est'], ['t', 'e', 'est'], {}),
     unify((4, x), (2, 5), {})]

[{~x: 5}, {~x: 'e'}, False]

Reification is the opposite of unification: it takes a term containing logic variables and a substitution, and returns the term with those variables replaced by their values. Below is a quick example using Matt Rocklin's unification library,

from unification import unify, reify, var

_ = [reify(["m", x, "s", "i", "c"], {x: 'u'}),
     reify((4, x), {x: 5})]

[['m', 'u', 's', 'i', 'c'], (4, 5)]

The concepts of “unification” and “reification” are important in term rewriting, and what we have been discussing up to this point is term rewriting!

Now, we want to unify the graph above with a template graph for the SVD reconstruction that contains logic variables. We can then use those logic variables to reify.

How do we do this with a TensorFlow graph? The unification library already supports most basic builtin types such as "str", "tuple" and "list". However, unification can be extended further by extending the _unify and _reify dispatch functions. This extension is exactly what Symbolic-PyMC uses to manipulate TensorFlow graphs, as the meta-object code below shows.
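As a small illustration of that extension mechanism, the same unification library ships a unifiable class decorator that registers _unify/_reify handlers for a plain Python class (this example is unrelated to TensorFlow):

from unification import unify, reify, var
from unification.more import unifiable  # exported at the top level in some versions


@unifiable
class Account(object):
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance


x = var('x')
s = unify(Account('alice', x), Account('alice', 100), {})
print(s)            # {~x: 100}
print(reify(x, s))  # 100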

from symbolic_pymc.tensorflow.meta import mt

S_lv = var()
d_mt, U_mt, V_mt = mt.linalg.svd(S, compute_uv=var(),
                                 full_matrices=var(), name=var())

template_mt = mt.matmul(U, mt.matmul(mt.matrixdiag(d, name=var()), V,
                                     transpose_a=False, transpose_b=True, name=var()),
                        transpose_a=False, transpose_b=False, name=var())

D_mt = mt(D)
s = unify(D_mt, template_mt, {})
_ = s

{~_27: tf.float64, ~_23: tf.float64, ~_19: tf.float64, ~_18: 'MatrixDiag_1', ~_20: TFlowMetaTensorShape([Dimension(10), Dimension(10)],, obj=TensorShape([10, 10])), ~_21: 'MatrixDiag_1:0', ~_22: 'MatMul_4', ~_24: TFlowMetaTensorShape([Dimension(10), Dimension(10)],, obj=TensorShape([10, 10])), ~_25: 'MatMul_4:0', ~_26: 'MatMul_5', ~_28: TFlowMetaTensorShape([Dimension(10), Dimension(10)],, obj=TensorShape([10, 10])), ~_29: 'MatMul_5:0'}

Reification in this case is straightforward.

_ = reify(S_lv, s)

~_5

In our running example we would walk the graph, i.e. ans in our case. The output would be a new graph where the matched SVD-reconstruction subgraph has been replaced with S_lv. What can we use to implement walking through a graph?

The concepts of unification and reification are encapsulated in the language miniKanren as eq and run respectively. Luckily, miniKanren has a Python implementation!

from kanren import eq, run

x = var()
_ = run(1, x, eq((1, 2), (1, x)))

(2,)

In later posts I'll go into exactly how Symbolic-PyMC uses miniKanren, adding relations such as graph_applyo to walk graphs and replace sections.

Updates and end of GSoC

Quick update

In the final week I continued to work on issue #50 regarding updates to `tf_dprint`. There have been a number of updates since I first began; a notable one has been the change in user input. In my last post `depth_index` was specified as a range, but in the latest iteration the user assigns a `depth_lower` and a `depth_upper`. This change in notation will hopefully make it more understandable. Another change is how a truncated graph looks when printed.
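As a sketch of the new interface (parameter names as described above; the exact bounds semantics are per the PR), the earlier range-based call would now be written roughly as:

_ = tf_dprint(obs_graph, depth_lower=3, depth_upper=6)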

My Time with GSoC

During my time working through GSoC I made roughly one contribution in each graded period. I also made sure to attend project meetings and even hosted one journal club talk. In the following sections I'll discuss each contribution I made, what I am currently working on, how I did against my original proposal, and what my next steps will be.

First period 5/27-6/28

During this period I started by getting familiar with the code bases of PyMC4 and symbolic-pymc by going through open issues and following discussions in group meetings and on Slack. Since my work on this project started slightly before the official beginning of GSoC, I was able to make my first contribution early on. It involved removing the `from_obj` method from `MetaSymbol` (#15). `from_obj` was responsible for converting TensorFlow/Theano objects to related meta objects. After removing it, I implemented a dispatcher that provides the same conversion but makes it easier to add new cases as they show up.

Second period 6/28-7/22

In the second period I spent more time studying symbolic-pymc as well as miniKanren. I was able to close issue #41, which involved remapping `tf.Operation.inputs` so that they matched the `OpDef`'s signature. To accomplish this I had to dig deeper into TensorFlow's API to determine why the list of inputs was being flattened. I also hosted a journal club talk focusing on the minimalist version of miniKanren, "microKanren". Finally, I published a blog post showing, after the recent updates, how to convert PyMC4 objects to symbolic-pymc.
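To see the flattening in question, one can compare an operation's flat inputs against its OpDef signature, e.g. with an op that takes list-typed inputs (a sketch):

import tensorflow as tf
from tensorflow.python.framework.ops import disable_eager_execution

disable_eager_execution()

# ConcatV2 takes a *list* of values plus an axis...
t = tf.concat([tf.constant([1]), tf.constant([2])], axis=0)

# ...but `op.inputs` presents them as one flat sequence,
print([i.name for i in t.op.inputs])
# while the OpDef signature groups them as `values` and `axis`.
print([arg.name for arg in t.op.op_def.input_arg])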

Third period 7/22-8/26

This period I focused more on enhancing tools. I began drafting pull requests that add basic graph normalization and the ability to index graph printing, making graphs easier to read and interpret. Moving forward, I am in the process of closing the pull request related to graph indexing (#60) and will then create the pull request for graph normalization. With respect to my original proposal, after completing both of the mentioned PRs all that would be left is to implement Gibbs sampling.

Final Remarks

This marks the last blog post under the umbrella of GSoC. For those who have been following my blog, thank you; as I move forward I hope to make blog posts a regular thing! With regard to GSoC and PyMC, it's hard to describe how thankful I am for the chance to contribute to an open-source project. I know that as I progress in my career, what I learned during this time will be an asset and a starting point for what I hope will be a bright future.