
Cython Function With Variable Sized Matrix Input

I am trying to convert part of a native Python function to Cython to improve the compute time. I would like to write a Cython function just for the loop component that is taking up

Solution 1:

Cython code is (strategically) statically typed, but that doesn't mean that arrays must have a fixed size. In plain C, passing a multidimensional array to a function can be awkward, but in Cython a typed memoryview such as double[:,:] accepts an array of any size, so you should be able to do something like the following:

Note: I took the function and variable names from your follow-up question.

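import numpy as np
cimport numpy as np
cimport cython

@cython.boundscheck(False)
@cython.cdivision(True)
def cooccurance_probability_cy(double[:,:] X):
    cdef int P, N, i, j, k
    P = X.shape[0]  # number of rows
    N = X.shape[1]  # number of columns
    cdef double item
    # Row sums, used to normalise each entry of the result
    cdef double [:] CS = np.sum(X, axis=1)
    # Uninitialised P x P output; every entry is written in the loop below
    cdef double [:,:] D = np.empty((P, P), dtype=np.float64)

    for i in range(P):
        for j in range(P):
            item = 0
            # Dot product of rows i and j, taken over the columns
            for k in range(N):
                item += X[i,k] * X[j,k]
            D[i,j] = item / max(CS[i], CS[j])
    # Convert the memoryview back to a NumPy array for the caller
    return np.asarray(D)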

On the other hand, plain NumPy should also be quite fast for this problem if you use the right functions and some broadcasting. In fact, since the cost of the calculation is dominated by the matrix multiplication, I found the following to be much faster than the Cython code above (np.inner uses a highly optimized BLAS routine):

def new(X):
    CS = np.sum(X, axis=1, keepdims=True)
    D = np.inner(X,X) / np.maximum(CS, CS.T)
    return D
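
To convince yourself that the vectorized version computes the same thing as the triple loop, a quick check along these lines may help. This is only a sketch: the names cooccurrence_loops and cooccurrence_vectorized and the random test matrix are illustrative assumptions, and the loop function is a plain-Python stand-in for the Cython one, not the compiled code itself.

```python
import numpy as np

def cooccurrence_loops(X):
    # Plain-Python reference that mirrors the loop structure of the Cython version.
    P, N = X.shape
    CS = X.sum(axis=1)
    D = np.empty((P, P), dtype=np.float64)
    for i in range(P):
        for j in range(P):
            item = 0.0
            for k in range(N):
                item += X[i, k] * X[j, k]
            D[i, j] = item / max(CS[i], CS[j])
    return D

def cooccurrence_vectorized(X):
    # Same calculation using np.inner and broadcasting, as in new(X) above.
    CS = np.sum(X, axis=1, keepdims=True)
    return np.inner(X, X) / np.maximum(CS, CS.T)

# Hypothetical test input: a non-square matrix with strictly positive entries,
# so that no row sum is zero.
rng = np.random.default_rng(0)
X = rng.random((5, 8)) + 0.1

D_loops = cooccurrence_loops(X)
D_vec = cooccurrence_vectorized(X)
print(np.allclose(D_loops, D_vec))
```

Using a non-square X here is deliberate: it also exercises the case where the number of columns differs from the number of rows.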

Solution 2:

Have you tried getting rid of the for loops by using NumPy?

For the first part of your equation you could, for example, try:

(data[np.newaxis, :] * data[:, np.newaxis]).sum(2)

If memory is an issue, you can also use the np.einsum() function, which avoids building the intermediate three-dimensional array. For the second part one could probably also cook up a NumPy expression (a bit more difficult), if you've not already tried that.
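
As a rough sketch of the einsum route: the subscript string 'ik,jk->ij' multiplies rows i and j element by element and sums over the shared index k. The example assumes that "the second part" means dividing by the larger of the two row sums, as in Solution 1; the variable names and the random input are made up for illustration.

```python
import numpy as np

# Hypothetical input: 4 rows, 6 columns.
rng = np.random.default_rng(1)
data = rng.random((4, 6)) + 0.1

# Broadcasting version: builds a (4, 4, 6) temporary before summing over axis 2.
via_broadcast = (data[np.newaxis, :] * data[:, np.newaxis]).sum(2)

# einsum version: sums over the shared index k without the 3-D temporary.
via_einsum = np.einsum('ik,jk->ij', data, data)

# Assumed second part: divide each entry by the larger of the two row sums.
row_sums = data.sum(axis=1)
D = via_einsum / np.maximum(row_sums[:, np.newaxis], row_sums[np.newaxis, :])

print(np.allclose(via_broadcast, via_einsum))
```

For large inputs the broadcasting version needs memory proportional to rows × rows × columns, which is the main reason to prefer einsum (or np.inner) here.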

