In part 1 we imported some connections data from a PDF catalogue. We finished with a dictionary called catalogue containing a list of arrays, where the position of each value in an array corresponds to the column headers in the template dictionary. We then combined these into a pandas DataFrame so that we could QAQC the data.

Permutations

There are 92 Wedge 521 connections in our newly created catalogue, ranging in size from 18 5/8” to 4” (the nominal diameter of the pipe body). How many different ways can we combine these connections?

Definitions

First, let’s be clear what we mean by combine. For me, a valid combination is when one connection on pipe can pass through another connection on pipe. That means that the largest outer diameter of the smaller connection and pipe must be less than the drift of the larger coupling and pipe.
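To make that definition concrete, here's a minimal sketch of the check for a single pair. The function name will_drift and the numbers are made up for illustration; they aren't taken from the catalogue.

```python
def will_drift(outer_drift, inner_max_od):
    """Return True if the inner string's largest OD is less than the outer drift."""
    return inner_max_od < outer_drift


# illustrative values in inches (not catalogue data)
outer_drift = 12.25
inner_max_od = 10.75

print(will_drift(outer_drift, inner_max_od))  # True
print(will_drift(inner_max_od, outer_drift))  # False
```

Note that the relationship only works one way, which matters when we choose the type of graph later on.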

That escalated quickly

So, for each coupling, we need to test it against every other coupling to see if it will “drift”. If there are 92 connections, that means we have to make 92 x 91 = 8,372 checks to see which connection fits through what. Once we know what fits through what, we need to string together combinations of these pairs. There’s a nice bit of maths that can help us here called network theory.
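If you'd like to convince yourself of that number, a quick sketch with itertools counts every ordered (outer, inner) pair of distinct connections. The integers below are just stand-ins for the 92 connections.

```python
from itertools import permutations

n_connections = 92

# every ordered (outer, inner) pair of two different connections
pairs = list(permutations(range(n_connections), 2))

print(len(pairs))  # 8372
```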

Building relationships

Fortunately, there are a number of Python implementations of network graphs but for this exercise we’re going to use networkx, simply because it’s the simplest to install and easiest to work with. This comes at the cost of speed, but for the time being let’s not worry about that.

You’ll need to install networkx in your Python env with the following terminal command:

pip install networkx

Then we can start writing some code, importing our libraries (including the code we wrote last time - note that I made a couple of changes to the code from last time, so be sure to get the latest version):

import numpy as np
import networkx as nx
import import_from_tenaris_catalogue  # our code from part 1

# this runs the code from part 1 (make sure it's in your working directory)
# and assigns the data to the variable catalogue
catalogue = import_from_tenaris_catalogue.main()

Before we start, each node in our graph needs to have a unique identifier (UID), which we don’t currently have in our catalogue. So let’s quickly create a Counter class that we can use to generate UIDs.

class Counter:
    def __init__(self, start=0):
        """
        A simple counter class for storing a number that you want to increment
        by one.
        """
        self.counter = start
    
    def count(self):
        """
        Calling this method increments the counter by one and returns what was
        the current count.
        """
        self.counter += 1
        return self.counter - 1

    def current(self):
        """
        If you just want to know the current count, use this method.
        """
        return self.counter

This counter will run in the background and we can call it for a new UID each time we create a node for our graph and assign that UID to the new node.
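As an aside, if you'd rather not write your own class, the standard library's itertools.count behaves in a similar way. This is an alternative sketch only, the rest of this post sticks with the Counter class, and the name uid_source is just for illustration.

```python
from itertools import count

# an iterator that yields 0, 1, 2, ... each time next() is called
uid_source = count(start=0)

first_uid = next(uid_source)
second_uid = next(uid_source)

print(first_uid, second_uid)  # 0 1
```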

To initiate a new graph, which we’ll call graph, we use the following code:

graph = nx.DiGraph()
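A DiGraph is a directed graph, so an edge added from one node to another doesn't exist in the reverse direction. Here's a small sketch on a throwaway graph (the node names are placeholders):

```python
import networkx as nx

demo = nx.DiGraph()
demo.add_edge("outer", "inner")  # the edge points from outer to inner only

print(demo.has_edge("outer", "inner"))  # True
print(demo.has_edge("inner", "outer"))  # False
```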

Adding vertices (nodes)

We’ve now initiated an instance of a DiGraph class, the Di denoting that this is a directional graph, so the edges we’ll be adding will be one way (which will prevent looping when we start navigating the graph). It’s time to start adding nodes, but since our connections have 17 attributes each that we want to add to the node, we’ll write a function for this.

def add_node(
    counter, graph, manufacturer, type, type_name, tags, data
):
    """
    Function for adding a node with attributes to a graph.

    Parameters
    ----------
    counter: Counter object
    graph: (Di)Graph object
    manufacturer: str
        The manufacturer of the item being added as a node.
    type: str
        The type of item being added as a node.
    type_name: str
        The (product) name of the item being added as a node.
    tags: list of str
        The attribute tags (headers) of the attributes of the item.
    data: list or array of floats
        The data for each of the attributes of the item.

    Returns
    -------
    graph: (Di)Graph object
        Returns the (Di)Graph object with the addition of a new node.
    """
    # we'll first create a dictionary with all the node attributes
    node = {
        'manufacturer': manufacturer,
        'type': type,
        'name': type_name
    }
    # loop through the tags and data
    for t, d in zip(tags, data):
        node[t] = d
    # get a UID and assign it
    uid = counter.count()
    node['uid'] = uid

    # overwrite the size - in case it didn't import properly from the pdf
    node['size'] = (
        node['pipe_body_inside_diameter']
        + 2 * node['pipe_body_wall_thickness']
    )
    # use the class method to add the node, unpacking the node dictionary
    # and naming the node with its UID
    graph.add_node(uid, **node)
    
    return graph

We’ll use the above function to add all our connections to the graph, but once all these nodes or vertices are added, we need to connect them together with edges to indicate which connections can pass through which.
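Before moving on to edges, here's a quick sketch of how networkx stores the attributes we unpack into a node, and how to read them back. The attribute values are illustrative rather than catalogue data.

```python
import networkx as nx

demo = nx.DiGraph()

# a cut-down attribute dictionary, for illustration only
attrs = {"manufacturer": "Tenaris", "type": "casing_connection", "size": 7.0}

# keyword arguments to add_node become attributes of that node
demo.add_node(0, **attrs)

print(demo.nodes[0]["size"])  # 7.0
print(demo.nodes[0]["type"])  # casing_connection
```

We'll rely on this graph.nodes[uid] lookup when we compare one connection with another.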

Adding edges

We’ll use two functions for this, one for adding an edge and the other a helper function. First the helper function.

def check_connection_clearance(graph, node1, node2, cut_off=0.7):
    """
    Function for checking if one component will pass through another.

    Parameters
    ----------
    graph: (Di)Graph object
    node1: dict
        A dictionary of node attributes.
    node2: dict
        A dictionary of node attributes.
    cut_off: 0 <= float < 1
        A ratio of the nominal component size used as a filter, e.g. if set
        to 0.7, if node 1 has a size of 5, only a node with a size greater than
        3.5 will be considered.

    Returns
    -------
    graph: (Di)Graph object
        Graph with an edge added if node2 will drift node1, with a `clearance`
        attribute indicating the clearance between the critical outer diameter
        of node2 and the drift of node1.
    """
    try:
        node2_connection_od = node2['coupling_outside_diameter']
    except KeyError:
        node2_connection_od = node2['box_outside_diameter']
    clearance = min(
            node1['pipe_body_drift'],
            node1['connection_inside_diameter']
    ) - max(
        node2_connection_od,
        node2['pipe_body_inside_diameter']
        + 2 * node2['pipe_body_wall_thickness']
    )
    if all((
        clearance > 0,
        node2['size'] / node1['size'] > cut_off
    )):
        # nodes are keyed by the 'uid' attribute assigned in add_node
        graph.add_edge(node1['uid'], node2['uid'], clearance=clearance)
    
    return graph

This is essentially checking the largest outside diameter of the node2 component against the drift of the node1 component. However, you’ll see that there’s a filter added, the cut_off parameter. We could link all components that pass through each other in the network graph (e.g. a 4” connection will pass through an 18 5/8” connection), but this would make our decision space unnecessarily large - would we design a well where we run a 4” tubing directly inside an 18 5/8” surface casing? If we do want to consider these options, then we can set cut_off=0, but the default is cut_off=0.7, which for example means that we’re only looking at sizes larger than about 13” to run inside our 18 5/8” surface casing.
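The arithmetic behind that filter can be sketched in a few lines. The candidate sizes below are picked for illustration only.

```python
cut_off = 0.7
outer_size = 18.625  # 18 5/8" surface casing

# the smallest nominal size that survives the filter
min_inner_size = cut_off * outer_size
print(round(min_inner_size, 4))  # 13.0375

# illustrative nominal sizes in inches
candidates = [13.375, 9.625, 4.0]
passes = [size for size in candidates if size / outer_size > cut_off]
print(passes)  # [13.375]
```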

Now we can write our edge function.

def add_connection_edges(graph, cut_off=0.7):
    """
    Function to add edges between connection components in a network graph.

    Parameters
    ----------
    graph: (Di)Graph object
    cut_off: 0 <= float < 1
        A ratio of the nominal component size used as a filter, e.g. if set
        to 0.7, if node 1 has a size of 5, only a node with a size greater than
        3.5 will be considered.

    Returns
    -------
    graph: (Di)Graph object
        Graph with edges added for connections that can drift through other
        connections.
    """
    for node_outer in graph.nodes:
        # check if the node is a casing connection
        if graph.nodes[node_outer]['type'] != 'casing_connection':
            continue
        for node_inner in graph.nodes:
            if graph.nodes[node_inner]['type'] != 'casing_connection':
                continue
            graph = check_connection_clearance(
                graph, 
                graph.nodes[node_outer],
                graph.nodes[node_inner],
                cut_off
            )
    return graph

This is not a particularly efficient way to add our edges, since it checks every pair of nodes one at a time in a nested Python loop, but for the purposes of this exercise, using a small dataset for illustration, it will suffice. For larger datasets, filtering the nodes/vertices with numpy and multiprocessing with e.g. ray would get this done much faster.
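To give a flavour of the numpy approach, here's a rough sketch that does all the pairwise checks in one go using broadcasting. It isn't the code used in this post: the arrays and their values are made up for illustration, with one entry per connection.

```python
import numpy as np

# illustrative values in inches, one entry per connection (not catalogue data)
size = np.array([13.375, 9.625, 7.0])
drift = np.array([12.25, 8.5, 6.0])
max_od = np.array([14.375, 10.625, 7.656])

# clearance[i, j] = drift of connection i minus largest OD of connection j
clearance = drift[:, None] - max_od[None, :]

# ratio[i, j] = size of inner connection j / size of outer connection i
ratio = size[None, :] / size[:, None]

cut_off = 0.7
fits = (clearance > 0) & (ratio > cut_off)

# (outer, inner) index pairs that would become edges in the graph
edges = [(int(i), int(j)) for i, j in np.argwhere(fits)]
print(edges)  # [(0, 1), (1, 2)]
```

The resulting pairs could then be handed to the graph in a single call, rather than adding edges one at a time.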

Bringing it all together

Finally, we’ll write our main function to run the above and generate our populated network graph.

def main():
    # this runs the code from part 1 (make sure it's in your working directory)
    # and assigns the data to the variable catalogue
    catalogue = import_from_tenaris_catalogue.main()

    # initiate our counter and graph
    counter = Counter()
    graph = nx.DiGraph()

    # add the casing connections from our catalogue to the network graph
    for product, data in catalogue.items():
        for row in data['data']:
            graph = add_node(
                counter,
                graph,
                manufacturer='Tenaris',
                type='casing_connection',
                type_name=product,
                tags=data['headers'],
                data=row
            )
    
    # determine which connections fit through which and add the appropriate
    # edges to the graph.
    graph = add_connection_edges(graph)

    return graph

Results

To run the above script, we can add the following at the bottom of our .py file:

if __name__ == '__main__':
    graph = main()

    # as a QAQC step we can use matplotlib to draw the edges between connected
    # casing connections

    import matplotlib
    import matplotlib.pyplot as plt

    matplotlib.use("TKAgg")  # Ubuntu sometimes needs some help

    nx.draw_networkx(graph, pos=nx.circular_layout(graph))
    plt.show()

    print("Done")

You’ll see that we’re importing matplotlib here as a QAQC step to visually check that we have indeed imported the 92 connections from the catalogue and that we have successfully created edges between connections where one will drift through another.

Congratulations, you’ve just created a network graph, populated it with some casing connections and linked together the connections that will drift through other connections. In the next post we’ll see how we can list the different permutations of casing we can deploy in our well and how we can filter the data to provide us with choices for our specific environment.
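If you'd like a numeric check to go with the plot, networkx can report node and edge counts and confirm that the graph has no loops. Here's a sketch on a tiny stand-in graph; for the real thing you'd run the same calls on the graph returned by main().

```python
import networkx as nx

# a tiny stand-in graph: 0 -> 1 -> 2
demo = nx.DiGraph()
demo.add_nodes_from([0, 1, 2])
demo.add_edges_from([(0, 1), (1, 2)])

print(demo.number_of_nodes())  # 3
print(demo.number_of_edges())  # 2

# True when no path leads back to its starting node
print(nx.is_directed_acyclic_graph(demo))  # True
```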

Feel free to download the code.