[SciPy-dev] reimplementation of lfilter

Discussion:

Sturla Molden

2009-09-22 06:21:44 UTC

Looking at this atrocious piece of C:

http://svn.scipy.org/svn/scipy/trunk/scipy/signal/lfilter.c.src

I got some inspiration for reusing some of my other C code to
reimplement scipy.signal.lfilter. I basically just had to write
this small C function + a Cython wrapper:

static void lfilter_1d(double *restrict b, double *restrict a,
double *x, double *y, double *zi, npy_intp K, npy_intp N)
{
double Z[K];
double *restrict pz = zi, *restrict z = Z;
register npy_intp n, k;
for (n=0; n<N; n++) {
register double yn, xn = x[n];
yn = y[n] = b[0] * xn + pz[0];
for (k=0; k<K-1; k++)
z[k] = b[k+1] * xn + pz[k+1] - a[k+1]*yn;
z[K-1] = b[K]*xn - a[K]*yn;
double *tmp = pz;
pz = z;
z = tmp;
}
if (pz != zi)
memcpy(zi, pz, K*sizeof(double));
}

Relative speed compared to scipy.signal.lfilter:

One processor:
1D array, 1000000 floats: 150%
2D array, 1000 x 1000 floats, axis=0: 80%
2D array, 1000 x 1000 floats, axis=1: 150%

It is faster than lfilter except for 2D with axis=0. For
2D with axis=1 it is probably faster than lfilter due to
copy-in copy-out optimization. For arrays with more than
one dimension, multithreading helps as well:

Four processors:
1D array, 1000000 floats: 150%
2D array, 1000 x 1000 floats, axis=0: 160%
2D array, 1000 x 1000 floats, axis=1: 250%

Multithreading obviously does not help for filtering single
vector, as there is no work to share.

Some other improvements over signal.lfilter, apart from readable
Cython code:

* GIL is released. The C code is not littered with Python C
API calls that prevents this.

* Filtering can be done in reverse, i.e. no need for fliplr or
flipud for filtering backwards.

* Arrays can be filtered in-place.

* Cleaned up docstring.

Still it only work as a "proof of concept" with double precision
floats. Adding support for other dtypes would be easy, just
replace double with any of:

float
float _Complex
double _Complex
long double
long double _Complex

Regards,
Sturla Molden

Ralf Gommers

2009-09-22 14:23:55 UTC

Permalink

Looks like a very useful improvement.

Your docstring won't render well, and there is already a cleaned up version
of the old one here:
http://docs.scipy.org/scipy/docs/scipy.signal.signaltools.lfilter/ Could you
please integrate those changes with yours?

Cheers,
Ralf

Post by Sturla Molden
http://svn.scipy.org/svn/scipy/trunk/scipy/signal/lfilter.c.src
I got some inspiration for reusing some of my other C code to
reimplement scipy.signal.lfilter. I basically just had to write
static void lfilter_1d(double *restrict b, double *restrict a,
double *x, double *y, double *zi, npy_intp K, npy_intp N)
{
double Z[K];
double *restrict pz = zi, *restrict z = Z;
register npy_intp n, k;
for (n=0; n<N; n++) {
register double yn, xn = x[n];
yn = y[n] = b[0] * xn + pz[0];
for (k=0; k<K-1; k++)
z[k] = b[k+1] * xn + pz[k+1] - a[k+1]*yn;
z[K-1] = b[K]*xn - a[K]*yn;
double *tmp = pz;
pz = z;
z = tmp;
}
if (pz != zi) memcpy(zi, pz, K*sizeof(double)); }
1D array, 1000000 floats: 150%
2D array, 1000 x 1000 floats, axis=0: 80%
2D array, 1000 x 1000 floats, axis=1: 150%
It is faster than lfilter except for 2D with axis=0. For
2D with axis=1 it is probably faster than lfilter due to
copy-in copy-out optimization. For arrays with more than
1D array, 1000000 floats: 150%
2D array, 1000 x 1000 floats, axis=0: 160%
2D array, 1000 x 1000 floats, axis=1: 250%
Multithreading obviously does not help for filtering single
vector, as there is no work to share.
Some other improvements over signal.lfilter, apart from readable
* GIL is released. The C code is not littered with Python C
API calls that prevents this.
* Filtering can be done in reverse, i.e. no need for fliplr or
flipud for filtering backwards.
* Arrays can be filtered in-place.
* Cleaned up docstring.
Still it only work as a "proof of concept" with double precision
floats. Adding support for other dtypes would be easy, just
float
float _Complex
double _Complex
long double
long double _Complex
Regards,
Sturla Molden
# Copyright (C) 2009 Sturla Molden
import numpy as np
cimport numpy as np
cdef extern int _linear_filter( object a, object b,
object x, object y, object z,
int axis, int direction)
def lfilter(object B, object A, object X,
object zi=None,
object axis = None,
int direction=1,
"""
Filter data along one-dimension with an IIR or FIR filter.
Description
===========
Filter a data sequence, x, using a digital filter. This works for
many
fundamental data types (including Object type). The filter is a
direct
form II transposed implementation of the standard difference equation
(see "Algorithm").
=======
b : vector-like
The numerator coefficient vector in a 1-D sequence.
a : vector-like
The denominator coefficient vector in a 1-D sequence. If a[0]
is not 1, then both a and b are normalized by a[0].
x : array-like
An N-dimensional input array.
axis : int or None
The axis of the input data array along which to apply the
linear filter. The filter is applied to each subarray along
this axis. If axis is None, the filter is applied to x
flattened. Default: axis=None.
zi : array-like
Initial conditions for the filter delays. It is a vector
(or array of vectors for an N-dimensional input) of length
max(len(a),len(b)). If zi=None or is not given then initial
rest is assumed. If axis is None, zi should be a vector
of length K = max(M,N), where M=len(b)-1 and N=len(a)-1. If
an axis is not None, zi should either be a vector of length
K, or an array with same shape as x, except for zi.shape[axis]
which should be K. If zi is a vector, the same initial
conditions are applied to each vector along the specified
axis. Default: zi=None (initial rest).
See signal.lfiltic for more information.
direction : int
If negative, the filter is applied in reverse direction
along the axis. Default: direction=1 (filter forward).
inplace : bool
If True, allow x or zi to be modified inplace if possible.
Default: False.
Outputs: (y, {zf})
=======
y : ndarray
The output of the digital filter.
zf : ndarray
If zi is None, this is not returned, otherwise, zf holds the
final filter delay values.
==========
The filter function is implemented as a direct II transposed
structure.
This means that the filter implements
a[0]*y[n] = b[0]*x[n] + b[1]*x[n-1] + ... + b[nb]*x[n-nb]
- a[1]*y[n-1] - ... - a[na]*y[n-na]
y[m] = b[0]*x[m] + z[0,m-1]
z[0,m] = b[1]*x[m] + z[1,m-1] - a[1]*y[m]
...
z[n-3,m] = b[n-2]*x[m] + z[n-2,m-1] - a[n-2]*y[m]
z[n-2,m] = b[n-1]*x[m] - a[n-1]*y[m]
where m is the output sample number and n=max(len(a),len(b)) is the
model order.
The rational transfer function describing this filter in the
z-transform domain is
-1 -nb
b[0] + b[1]z + ... + b[nb] z
Y(z) = ---------------------------------- X(z)
-1 -na
a[0] + a[1]z + ... + a[na] z
"""
cdef np.ndarray b, a, x, y, z
cdef np.npy_intp i, ierr
cdef double tmp
cdef object K, zshape, iaxis
# get filter coeffs b, a
A = np.asarray(A)
B = np.asarray(B)
raise ValueError, 'a and b must be vectors'
K = max(len(A)-1,len(B)-1)
b = np.zeros(K+1)
a = np.zeros(K+1)
a[:A.shape[0]] = A[:]
b[:B.shape[0]] = B[:]
# normalize by a[0]
tmp = a[0]
a[i] /= tmp
b[i] /= tmp
# set up input signal
X = np.asarray(X, dtype=np.float64)
X = X.ravel()
raise ValueError, 'axis must be None or an integer >= 0'
raise ValueError, 'axis must be None or an integer >= 0'
x = X
# set up output signal
y = X
y = np.zeros(X.shape, dtype=np.float64)
# set up filter delay buffer
iaxis = 0 if (axis is None) else axis
# zi not specified, assume initial rest
zshape = list(X.shape)
zshape[iaxis] = K
z = np.zeros(zshape, dtype=np.float64)
zi = np.asarray(zi, dtype=np.float64)
# x and zi are both vectors
raise ValueError, 'length of zi must be K'
z = zi
z = zi.copy()
# x is nd array, zi is vector
# broadcast zi (cannot modify inplace)
raise ValueError, 'length of zi must be K'
zshape = list(X.shape)
zshape[iaxis] = K
z = np.zeros(zshape, dtype=np.float64)
zshape = [1] * X.ndim
zshape[iaxis] = K
z = z + zi.reshape(zshape)
# x and zi are both nd arrays
zshape = list(X.shape)
zshape[iaxis] = K
zshape = tuple(zshape)
raise ValueError, ('bad shape of zi: expected %r, got %r' %
(zshape,zi.shape))
z = zi
z = zi.copy()
# apply the linear filter
ierr = _linear_filter(a, b, x, y, z, <int> iaxis, direction)
# check for error conditions on return
raise MemoryError, 'malloc returned NULL'
raise ValueError, 'shape of x, y, and z did not match... should
never happen, debug please.'
raise ValueError, 'shape of a and b did not match... should never
happen, debug please.'
# return y or (y, z) depending on zi
# return (y if (zi is None) else (y, z)) ## Cython fails on this
return y
return (y,z)
/* Copyright (C) 2009 Sturla Molden */
#include <Python.h>
#include <numpy/arrayobject.h>
#include <string.h>
#include <stdlib.h>
typedef PyArrayObject ndarray; /* PyArrayObject is an ugly name */
typedef Py_ssize_t integer; /* Py_ssize_t of int for 64 bit support,
npy_intp is typedef'ed incorrectly. */
/* copy data to and from temporary work arrays */
static void copy_in(double *restrict y, double *restrict x, integer n,
integer s)
{
integer i;
char *tmp;
if (s == sizeof(double))
memcpy(y, x, n*sizeof(double));
else {
tmp = (char *)x;
for (i=0; i<n; i++) {
*y++ = *x;
tmp += s;
x = (double *)tmp;
}
}
}
static void copy_out(double *restrict y, double *restrict x, integer n,
integer s)
{
integer i;
char *tmp;
if (s == sizeof(double))
memcpy(y, x, n*sizeof(double));
else {
tmp = (char *)y;
for (i=0; i<n; i++) {
*y = *x++;
tmp += s;
y = (double *)tmp;
}
}
}
/* this computes the start adress for every vector along a dimension
(axis) of an ndarray */
inline char *datapointer(ndarray *a, integer *indices)
{
char *data = a->data;
int i;
integer j;
for (i=0; i < a->nd; i++) {
j = indices[i];
data += j * a->strides[i];
}
return data;
}
static double ** _get_axis_pointer(integer *indices, int axis,
ndarray *a, double **axis_ptr_array, int dim)
{
/* recursive axis pointer search for 4 dimensions or more */
integer i;
double *axis_ptr;
if (dim == a->nd) {
/* done recursing dimensions,
compute address of this axis */
axis_ptr = (double *) datapointer(a, indices);
*axis_ptr_array = axis_ptr;
return (axis_ptr_array + 1);
} else if (dim == axis) {
/* use first element only */
indices[dim] = 0;
axis_ptr_array = _get_axis_pointer(indices, axis, a, axis_ptr_array,
dim+1);
return axis_ptr_array;
} else {
/* iterate range of indices */
integer length = a->dimensions[dim];
for (i=0; i < length; i++) {
indices[dim] = i;
axis_ptr_array = _get_axis_pointer(indices, axis, a,
axis_ptr_array, dim+1);
}
return axis_ptr_array;
}
}
static double **get_axis_pointers(ndarray *a, int axis, integer *count)
{
integer indices[a->nd];
double **out, **tmp;
integer i, j;
j = 1;
for (i=0; i < a->nd; i++) {
if (i != axis)
j *= a->dimensions[i];
}
*count = j;
out = (double **) malloc(*count * sizeof(double*));
if (out == NULL) {
*count = 0;
return NULL;
}
tmp = out;
switch (a->nd) {
/* for one dimension we just return the data pointer */
*tmp = (double *)a->data;
break;
/* specialized for two dimensions */
switch (axis) {
for (i=0; i<a->dimensions[1]; i++)
*tmp++ = (double *)(a->data + i*a->strides[1]);
break;
for (i=0; i<a->dimensions[0]; i++)
*tmp++ = (double *)(a->data + i*a->strides[0]);
break;
}
break;
/* specialized for three dimensions */
switch (axis) {
for (i=0; i<a->dimensions[1]; i++)
for (j=0; j<a->dimensions[2]; j++)
*tmp++ = (double *)(a->data + i*a->strides[1] +
j*a->strides[2]);
break;
for (i=0; i<a->dimensions[0]; i++)
for (j=0; j<a->dimensions[2]; j++)
*tmp++ = (double *)(a->data + i*a->strides[0] +
j*a->strides[2]);
break;
for (i=0; i<a->dimensions[0]; i++)
for (j=0; j<a->dimensions[1]; j++)
*tmp++ = (double *)(a->data + i*a->strides[0] +
j*a->strides[1]);
}
break;
/* four or more dimensions: use recursion */
for (i=0; i<a->nd; i++) indices[i] = 0;
_get_axis_pointer(indices, axis, a, out, 0);
}
return out;
}
/* applies a linear filter along one dimension */
static void lfilter_1d( double *restrict b, double *restrict a,
double *x, double *y,
double *zi,
integer K, integer N)
{
double Z[K];
double *restrict pz = zi, *restrict z = Z;
register integer n, k;
for (n=0; n<N; n++) {
register double yn, xn = x[n];
yn = y[n] = b[0] * xn + pz[0];
for (k=0; k<K-1; k++)
z[k] = b[k+1] * xn + pz[k+1] - a[k+1]*yn;
z[K-1] = b[K]*xn - a[K]*yn;
double *tmp = pz;
pz = z;
z = tmp;
}
if (pz != zi)
memcpy(zi, pz, K*sizeof(double));
}
static void rev_lfilter_1d( double *restrict b, double *restrict a,
double *restrict x, double *restrict y,
double *zi,
integer K, integer N)
{
double Z[K];
double *restrict pz = zi, *restrict z = Z;
register integer n, k;
for (n=N-1; n >= 0; n--) {
register double yn, xn = x[n];
yn = y[n] = b[0] * xn + pz[0];
for (k=0; k<K-1; k++)
z[k] = b[k+1]*xn + pz[k+1] - a[k+1]*yn;
z[K-1] = b[K]*xn - a[K]*yn;
double *tmp = pz;
pz = z;
z = tmp;
}
if (pz != zi)
memcpy(zi, pz, K*sizeof(double));
}
typedef void (*lfilterfun_t)(double *restrict b, double *restrict a,
double *restrict x, double *restrict y,
double *zi, integer K, integer N);
int _linear_filter(ndarray *a, ndarray *b,
ndarray *x, ndarray *y, ndarray *z,
int axis,
int direction)
{
int retval = 0;
double *wx, *wy, *wz, *data_a, *data_b;
integer xcount, ycount, zcount;
integer xstride, ystride, zstride;
double *xdata, *ydata, *zdata;
double **x_axis_ptrs = NULL, **y_axis_ptrs = NULL, **z_axis_ptrs = NULL;
lfilterfun_t lfilter;
integer i, K, N;
/* release the GIL */
Py_BEGIN_ALLOW_THREADS
/* sanity check on a and b (these should never fail) */
if (a->nd != 1) goto error_ab;
if (b->nd != 1) goto error_ab;
if (b->dimensions[0] != a->dimensions[0]) goto error_ab;
data_a = (double *)a->data;
data_b = (double *)b->data;
K = a->dimensions[0] - 1;
/* sanity ckeck on x, y and z */
if ((x->nd != y->nd) || (y->nd != z->nd)) goto error_xyz;
for (int d=0; d < a->nd; d++) {
if (d == axis) continue;
if ((x->dimensions[d] != y->dimensions[d])
|| (y->dimensions[d] != z->dimensions[d])) goto error_xyz;
}
N = x->dimensions[axis];
/* extract axis pointers */
x_axis_ptrs = get_axis_pointers(x, axis, &xcount);
y_axis_ptrs = get_axis_pointers(y, axis, &ycount);
z_axis_ptrs = get_axis_pointers(z, axis, &zcount);
/* sanity check */
if (!x_axis_ptrs) goto error_malloc;
if (!y_axis_ptrs) goto error_malloc;
if (!z_axis_ptrs) goto error_malloc;
if ((xcount != ycount)||(ycount != zcount)) goto error_xyz;
/* all ok, we can start ... */
lfilter = direction < 0 ? rev_lfilter_1d : lfilter_1d;
xstride = x->strides[axis];
ystride = y->strides[axis];
zstride = z->strides[axis];
#pragma omp parallel \
firstprivate(lfilter, N, K, xcount, x_axis_ptrs, y_axis_ptrs,
z_axis_ptrs, \
xstride, ystride, zstride, data_b, data_a) \
private(i, xdata, ydata, zdata, wx, wy, wz) \
shared(retval) \
default(none)
{
if ((xstride==sizeof(double))
&& (ystride==sizeof(double))
&& (xstride==sizeof(double)))
{
#pragma omp for schedule(static)
for (i=0; i < xcount; i++) {
xdata = x_axis_ptrs[i];
ydata = y_axis_ptrs[i];
zdata = z_axis_ptrs[i];
/* strictly speaking we could be aliasing
xdata and ydata here, but it should not matter. */
lfilter(data_b, data_a, xdata, ydata, zdata, K, N);
}
} else {
/* axis is strided, use copy-in copy-out */
wx = (double *)malloc((2*N+K)*sizeof(double));
if (!wx) {
#pragma omp critical
retval = -1; /* error_malloc */
}
#pragma omp barrier
#pragma omp flush(retval)
if (retval < 0)
/* malloc failed in some thread */
goto filtering_complete;
wy = wx + N;
wz = wy + N;
#pragma omp for schedule(static)
for (i=0; i < xcount; i++) {
/* get pointe to the first element of the axis */
xdata = x_axis_ptrs[i];
ydata = y_axis_ptrs[i];
zdata = z_axis_ptrs[i];
/* copy to local contiguous buffers */
copy_in(wx, xdata, N, xstride);
copy_in(wz, zdata, K, zstride);
/* filter */
lfilter(data_b, data_a, wx, wy, wz, K, N);
/* copy data back */
copy_out(ydata, wy, N, ystride);
copy_out(zdata, wz, K, zstride);
}
if (wx) free(wx);
}
}
/* we are done... */
goto done;
/* error conditions */
retval = -3;
goto done;
retval = -2;
goto done;
retval = -1;
/* clean up and exit */
if (x_axis_ptrs) free(x_axis_ptrs);
if (y_axis_ptrs) free(y_axis_ptrs);
if (z_axis_ptrs) free(z_axis_ptrs);
/* get the GIL back and return */
Py_END_ALLOW_THREADS
return retval;
}
_______________________________________________
Scipy-dev mailing list
http://mail.scipy.org/mailman/listinfo/scipy-dev

Sturla Molden

2009-09-22 19:19:27 UTC

Permalink

Post by Ralf Gommers
Looks like a very useful improvement.
Your docstring won't render well, and there is already a cleaned up

version of the old one here:
http://docs.scipy.org/scipy/docs/scipy.signal.signaltools.lfilter/ Could
you please integrate those changes with yours?

The code from SVN I linked is current version used in signaltools.

Also lost of the lfilter C code deal with complex numbers, which
is something an ISO compliant C or C++ compiler can do for us.

#ifdef __cplusplus
#include <complex>
typedef std::complex<float> complex_float;
typedef std::complex<double> complex_double;
typedef std::complex<long double> complex_long_double;
#define restrict
#else
#include <complex.h>
typedef float _Complex complex_float;
typedef double _Complex complex_double;
typedef long double _Complex complex_long_double;
#endif

Perhaps a better option is to use C++ only... In any case, it is
better to leave the complex number arithmetics to the compiler,
instead of bloating the source.

Sturla Molden

Ralf Gommers

2009-09-22 22:02:48 UTC

Permalink

Post by Ralf Gommers

Post by Ralf Gommers
Looks like a very useful improvement.
Your docstring won't render well, and there is already a cleaned up

http://docs.scipy.org/scipy/docs/scipy.signal.signaltools.lfilter/ Could
you please integrate those changes with yours?
The code from SVN I linked is current version used in signaltools.
Sure, I know. The docs I linked have improvements over svn. If you merge

your changes, there'll be a conflict and those docs will not get merged soon
- I was hoping to avoid that.

I also wanted to draw your attention to the doc wiki, since in this case you
redid work that was already done by someone else on the wiki before.
Hopefully this will save you some effort next time.

Cheers,
Ralf

Sturla Molden

2009-09-23 00:25:13 UTC

Permalink

Post by Ralf Gommers
Sure, I know. The docs I linked have improvements over svn. If you
merge your changes, there'll be a conflict and those docs will not get
merged soon - I was hoping to avoid that.
I also wanted to draw your attention to the doc wiki, since in this
case you redid work that was already done by someone else on the wiki
before. Hopefully this will save you some effort next time.

Well, I changed all the code to C++, to use the std::complex type,
std::vector instead of malloc, and templates for specializing the filter
functions to all dtypes supported by lfilter. I'm quite happy with the
way the C++ looks :-)

It seems to be faster than signal.lfilter in most circumstances, except
one (annoyingly the most common). The most extreme case were complex64
dtype with inplace filtering, for which I got 369% speed improvement.

Here are some timings (speed improvement over signal.lfilter in percent):

<type 'numpy.float32'>
axis=0, shape=(1000000,), speed: 240
axis=0, shape=(1000, 1000), speed: 149
axis=1, shape=(1000, 1000), speed: 238

<type 'numpy.float64'>
axis=0, shape=(1000000,), speed: 151
axis=0, shape=(1000, 1000), speed: 93
axis=1, shape=(1000, 1000), speed: 146

<type 'numpy.complex64'>
axis=0, shape=(1000000,), speed: 297
axis=0, shape=(1000, 1000), speed: 204
axis=1, shape=(1000, 1000), speed: 292

<type 'numpy.complex128'>
axis=0, shape=(1000000,), speed: 209
axis=0, shape=(1000, 1000), speed: 137
axis=1, shape=(1000, 1000), speed: 193

Inplace filter:

<type 'numpy.float32'>
axis=0, shape=(1000000,), speed: 291
axis=0, shape=(1000, 1000), speed: 182
axis=1, shape=(1000, 1000), speed: 300

<type 'numpy.float64'>
axis=0, shape=(1000000,), speed: 227
axis=0, shape=(1000, 1000), speed: 137
axis=1, shape=(1000, 1000), speed: 228

<type 'numpy.complex64'>
axis=0, shape=(1000000,), speed: 369
axis=0, shape=(1000, 1000), speed: 257
axis=1, shape=(1000, 1000), speed: 345

<type 'numpy.complex128'>
axis=0, shape=(1000000,), speed: 288
axis=0, shape=(1000, 1000), speed: 179
axis=1, shape=(1000, 1000), speed: 276

To build on Windows I did this, using GCC 4.4.1 from
http://www.equation.com/servlet/equation.cmd?fa=fortran

g++ -c -O3 -ffast-math -msse3 -march=core2 -Ic:/Python26/include
-Ic:/Python26/Lib/site-packages/numpy/core/include _linear_filter.cpp
cython.py linear_filter.pyx
gcc -c -O2 -ffast-math -msse3 -march=core2 -Ic:/Python26/include
-Ic:/Python26/Lib/site-packages/numpy/core/include linear_filter.c
g++ -shared -o linear_filter.pyd -Lc:/Python26/libs _linear_filter.o
linear_filter.o -lpython26 -lmsvcr90

Best regards,

Sturla Molden

Sturla Molden

2009-09-23 00:30:00 UTC

Permalink

Post by Sturla Molden
It seems to be faster than signal.lfilter in most circumstances,
except one (annoyingly the most common). The most extreme case were
complex64 dtype with inplace filtering, for which I got 369% speed
improvement.

Sorry, my english were bad. The numbers are not speed "improvement",
they are speed relative to signal.lfilter.

S.M.

Sturla Molden

2009-09-23 06:10:39 UTC

Permalink

Post by Sturla Molden
Well, I changed all the code to C++, to use the std::complex type,
std::vector instead of malloc, and templates for specializing the
filter functions to all dtypes supported by lfilter. I'm quite happy
with the way the C++ looks :-)

I forgot to add support for the object dtype.

This also fixes a bug in the current signal.lfilter. This will crash the
interpreter:

import numpy as np
import scipy
from scipy.signal import butter, lfilter
b,a = butter(4, .25)
x = np.random.randn(1000000).astype(object).reshape((1000,1000))
fx = lfilter(b,a,x, axis=1)

The C++ version here does not crash on this. :-)

I just ripped some code from current lfilter and cleaned it up slightly;
I hope that is ok. (Cython was not happy about "cdef object *ptr", so I
reluctantly had to put it all in C++.) The reason my code does not
crash like current lfilter, is copy-in copy-out for strided vectors. It
seems lfilter messes up the strides and then segfaults.

<type 'object'>
axis=0, shape=(1000000,), speed: 102
axis=0, shape=(1000, 1000), speed: 105
axis=1, shape=(1000, 1000), scipy.signal.lfilter crashes the interpreter!!!

Building on Windows:

g++ -c -O3 -ffast-math -msse3 -march=core2 -Ic:/Python26/include
-Ic:/Python26/Lib/site-packages/numpy/core/include _linear_filter.cpp
cython.py --cplus linear_filter.pyx
g++ -c -O2 -ffast-math -msse3 -march=core2 -Ic:/Python26/include
-Ic:/Python26/Lib/site-packages/numpy/core/include linear_filter.cpp
g++ -shared -o linear_filter.pyd -Lc:/Python26/libs _linear_filter.o
linear_filter.o -lpython26 -lmsvcr90

Regards,
Sturla Molden

David Cournapeau

2009-09-23 08:31:40 UTC

Permalink

Post by Sturla Molden

Post by Sturla Molden
Well, I changed all the code to C++, to use the std::complex type,
std::vector instead of malloc, and templates for specializing the filter
functions to all dtypes supported by lfilter. I'm quite happy with the way
the C++ looks :-)

I forgot to add support for the object dtype.
This also fixes a bug in the current signal.lfilter. This will crash the
import numpy as np
import scipy
from scipy.signal import butter, lfilter
b,a = butter(4, .25)
x = np.random.randn(1000000).astype(object).reshape((1000,1000))
fx = lfilter(b,a,x, axis=1)
The C++ version here does not crash on this. :-)
I just ripped some code from current lfilter and cleaned it up slightly; I
hope that is ok. (Cython was not happy about "cdef object *ptr", so I
reluctantly had to put it all in C++.) The reason my code does not crash
like current lfilter, is copy-in copy-out for strided vectors. It seems
lfilter messes up the strides and then segfaults.

I am all for improving the lfilter code, here a few comments:
- it is much more likely that your improvements will be included if
you provide patches instead of rewrite of the full code - it makes
reviewing much easier.
- I would also prefer having C instead of C++ as well - in this case,
C++ does not bring much since we have our "templating" system and you
don't use the STL much.
- In any case, please do not use exception, it is not portable.
- you cannot use Py_ssize_t, as it is python 2.5 >= feature - there
is nothing wrong with npy_intp, I don't understand your comment.
- using cython is good - that could be a first patch, to replace the
existing manual wrapping by cython. You can declare pointer without
trouble in cython, you would have to be more explicit about the exact
problem you had.

cheers,

David

Sturla Molden

2009-09-23 11:24:41 UTC

Permalink

Post by David Cournapeau
- I would also prefer having C instead of C++ as well - in this case,
C++ does not bring much since we have our "templating" system and you
don't use the STL much.

It was mainly for complex numbers, since MSVC does not support ISO C.

Post by David Cournapeau
- In any case, please do not use exception, it is not portable.

Ok, the STL containers like vector<> can throw exceptions like
std::bad_alloc. That is why I did this.

Post by David Cournapeau
- you cannot use Py_ssize_t, as it is python 2.5 >= feature - there
is nothing wrong with npy_intp, I don't understand your comment.

Yes there is, cf. PEP 353.

Using Py_intptr_t for indexing would depend on sizeof(void*) ==
sizeof(size_t), which the C standard does not mandate. It can differ on
segment and offset architectures. Two examples are 16-bit x86 (cf. far
and near pointers) and x86 with 36-bit PAE. It accidentally works for
flat 32- and 64-bit address spaces. This is why Python uses Py_ssize_t
instead of Py_intptr_t. And as it happens, npy_intp is typedef'ed to the
latter. (This might be pedantic, but it is a formal error.)

Post by David Cournapeau
- using cython is good - that could be a first patch, to replace the
existing manual wrapping by cython. You can declare pointer without
trouble in cython,

No, you cannot create pointers to a variable declared object. This is
illegal Cython:

cdef object *ptr # would simliar to PyObject **ptr in C

So if we want to filter with dtype object, one could use Dag Sverre's
numpy syntax and fake "cdef object *ptr" with "cdef np.ndarray[object]
ptr", but it would not be efficient. We have to store two z-buffers and
swap them for each filter step. The other option is to use "cdef
PyObject**" instead of "cdef object *" in Cython, but then Cython will
not do reference counting.

S.M.

Ravi

2009-09-23 12:29:58 UTC

Permalink

Hi,

Post by David Cournapeau
- it is much more likely that your improvements will be included if
you provide patches instead of rewrite of the full code - it makes
reviewing much easier.

In this case, I respectfully disagree; the full rewrite actually makes sense
when comparing the previous code to the current one.

Post by David Cournapeau
- I would also prefer having C instead of C++ as well - in this case,
C++ does not bring much since we have our "templating" system and you
don't use the STL much.
- In any case, please do not use exception, it is not portable.

Are there any such compilers on which scipy can be compiled? The only compiler
on which exceptions are still a problem (as far as I know) is OpenWatcom, but
scipy cannot be compiled using that compiler anyway. For all major *nixes
(Linux, Solaris, HPUX, AIX(!), *BSD), both the system C++ compiler and the
major external compilers (usually gcc & icc) provide full exception support.
On Windows, only versions upto MSVC6 have problems (but they cannot be used to
compile python extensions). Even the proprietary ARM and MIPS compilers
support exceptions. On the embedded side, the only compiler that I can think
of that does not support real C++ is Keil, but is anyone running scipy on an
8051?

Regards,
Ravi

David Cournapeau

2009-09-23 13:58:40 UTC

Permalink

Post by Ravi
Hi,

- it is much more likely that your improvements will be included if
you provide patches instead of rewrite of the full code - it makes
reviewing much easier.

In this case, I respectfully disagree; the full rewrite actually makes sense
when comparing the previous code to the current one.

It can be a full rewrite, but still should be sent as patches. If I am
the one to review, I would prefer this way. That's especially
important to track regressions.

Post by Ravi

- I would also prefer having C instead of C++ as well - in this case,
C++ does not bring much since we have our "templating" system and you
don't use the STL much.
- In any case, please do not use exception, it is not portable.

Are there any such compilers on which scipy can be compiled?

It is a fundamental problem of C++. Different compilers do not
propagate exceptions the same way, and that's a problem when different
compilers are involved (happens easily when the C and C++ compilers
are not the same, for example). That has been a problem on every new
platform I have tried to port numpy and scipy to.

That's the same rationale as why avoiding fortran runtime calls - if
you only use fortran, it is ok, but once you use the fortran runtime
and the C runtime in the same extension, you get random crashes which
are impossible to debug.

cheers,

David

Sturla Molden

2009-09-23 15:07:29 UTC

Permalink

Post by David Cournapeau
It can be a full rewrite, but still should be sent as patches. If I am
the one to review, I would prefer this way. That's especially
important to track regressions.

Well.. I didn't intend this for inclusion in SciPy, at least not in the
beginning. I needed it for some neuroscience software I am writing.

Post by David Cournapeau
It is a fundamental problem of C++. Different compilers do not
propagate exceptions the same way, and that's a problem when different
compilers are involved (happens easily when the C and C++ compilers
are not the same, for example). That has been a problem on every new
platform I have tried to port numpy and scipy to.

Does this mean we cannot use g++ to compile extensions at all, when
Python VM is compiled with MSVC? Or does the rule just apply to the
particular extension, like C and Fortran runtimes?

I don't like the complexity of C++. But it has some advantages over C
for scientific computing; notably STL containers, templates for
generics, the std::complex<> type, and so on. The problem with C++ is
that it encourages bloat ugly style, and OOP stuff is etter left in Python.

I personally like exceptions because they remove the need for lot or
error checking. In C, we can achieve almost the same effect using
setjmp/longjmp. Is that bad style as well?

Sturla Molden

Charles R Harris

2009-09-23 15:31:10 UTC

Permalink

Post by Sturla Molden

Post by David Cournapeau
It can be a full rewrite, but still should be sent as patches. If I am
the one to review, I would prefer this way. That's especially
important to track regressions.

Well.. I didn't intend this for inclusion in SciPy, at least not in the
beginning. I needed it for some neuroscience software I am writing.

Does this mean we cannot use g++ to compile extensions at all, when
Python VM is compiled with MSVC? Or does the rule just apply to the
particular extension, like C and Fortran runtimes?
I don't like the complexity of C++. But it has some advantages over C
for scientific computing; notably STL containers, templates for
generics, the std::complex<> type, and so on. The problem with C++ is
that it encourages bloat ugly style, and OOP stuff is etter left in Python.
I personally like exceptions because they remove the need for lot or
error checking. In C, we can achieve almost the same effect using
setjmp/longjmp. Is that bad style as well?

The setjmp/longjmp machinery has been used (zeros.c, for instance), but I
thinks it makes the C code less transparent and should be avoided unless the
traditional use of return values is so ugly you can't bear to look at the
code. C really doesn't have very good error handling, that may even be a
feature in a low level language.

It might be useful to think up some way that setjmp/longjmp could be used in
some sort of standard error handling template.

Chuck

Sturla Molden

2009-09-24 01:10:45 UTC

Permalink

Post by Charles R Harris
It might be useful to think up some way that setjmp/longjmp could be
used in some sort of standard error handling template.

One way is to use a resource pool (a la Apache), see Cython code below.

cdef int _foobar() nogil:
cdef mempool *mp
cdef int rv
mp = mempool_init()
if (mp == NULL):
return -1 # error
else:
some_C_function(mp)
mempool_destroy(mp)
return int(rv)

def foobar():
if (_foobar() < 0):
raise MemoryError, 'malloc failed'

This alleviates any need for error checking on the return value from
mempool_malloc. It might not be obvious though. What happens is that a
NULL return form malloc trigger a longjmp back to mempool_init, after
which the memory error is raised. Anything left allocated by the pool is
freed on mempool_destroy, so it cannot leak. Pools can be used to manage
any type of resource, e.g. open file handles, not just memory. The
advantage is that you get C code that don't have to do error checking
everywhere.

The problem with Cython is that a longjmp can mess up refcounts, so it
probably should only be used in pure C mode, i.e. in a function safe to
call with nogil.

C++ exceptions are much cleaner than C resource pools though, for
example because destructors are called automatically. And the fact that
C++ can put objects on the stack, and references can be used instead of
pointers, means it is very safe against resource leaks if used
correctly. The problems people have with C++ comes from bad style, for
example calling new outside a constructor, using new[] instead of
std::vector, using pointers instead of references (a reference cannot be
dangling), etc.

Sturla Molden

# memory pool
# Copyright (C) 2009 Sturla Molden

import numpy as np
cimport numpy

cdef extern from "setjmp.h":
ctypedef struct jmp_buf:
pass
int setjmp(jmp_buf jb) nogil
void longjmp(jmp_buf jb, int errcode) nogil

cdef extern from "stdlib.h":
# Assume malloc, realloc and free are thread safe
# We may need to change this, in which case a lock
# is needed in the memory pool.
ctypedef unsigned long size_t
void free(void *ptr) nogil
void *malloc(size_t size) nogil
void *realloc(void *ptr, size_t size) nogil

cdef struct mempool:
mempool *prev
mempool *next

cdef public mempool *mempool_init() nogil:
cdef jmp_buf *jb
cdef mempool *p = <mempool *> malloc(sizeof(jmp_buf) + sizeof(mempool))
if not p: return NULL
jb = <jmp_buf*>(<np.npy_intp>p + sizeof(mempool))
p.prev = NULL
p.next = NULL
if setjmp(jb[0]):
# mempool_error has been called
mempool_destroy(p)
return NULL
else:
return p

cdef public void mempool_error(mempool *p) nogil:
# rewind stack back to mempool_init()
cdef jmp_buf *jb = <jmp_buf*>(<np.npy_intp>p + sizeof(mempool))
longjmp(jb[0], 1)

cdef public void *mempool_destroy(mempool *p) nogil:
# this releases all memory allocated to this pool
cdef mempool *tmp
while p:
tmp = p
p = p.next
free(tmp)

cdef public void *mempool_malloc(mempool *p, size_t n) nogil:
cdef mempool *block
cdef void *m = malloc(sizeof(mempool) + n)
if not m:
mempool_error(p)
block = <mempool*> m
block.next = p.next
if block.next:
block.next.prev = block
block.prev = p
p.next = block
return <void*>(<np.npy_intp>m + sizeof(mempool))

cdef public void *mempool_realloc(mempool *p, void *ptr, size_t n) nogil:
cdef void *retval
cdef mempool *block, *prev, *next, *new_addr
if not n:
# realloc(p, 0) is free(p)
mempool_free(ptr)
retval = NULL
elif not ptr:
# realloc(0, n) is malloc(n)
retval = mempool_malloc(p,n)
else:
# realloc(p, n)
block = <mempool *>(<np.npy_intp>ptr - sizeof(mempool))
prev = block.prev
next = block.next
new_addr = <mempool*> realloc(block, n + sizeof(mempool))
if not new_addr:
mempool_error(p)
if new_addr != block:
prev.next = new_addr
if next:
next.prev = new_addr
new_addr.prev = prev
new_addr.next = next
retval = <void *>(<np.npy_intp>new_addr + sizeof(mempool))
return retval

cdef public void *mempool_free(void *ptr) nogil:
cdef mempool *prev, *next, *cur
cdef mempool *block = <mempool *>(<np.npy_intp>ptr - sizeof(mempool))
prev = block.prev
next = block.next
if next:
free(block)
next.prev = prev
prev.next = next
else:
free(block)
prev.next = NULL

David Cournapeau

2009-09-24 01:26:15 UTC

Permalink

Post by Sturla Molden
The
advantage is that you get C code that don't have to do error checking
everywhere.

Really, this is a big disadvantage. In C, you have to check error codes.
Anything which gives the illusion you can get away with it is wrong
IMHO. Technically, setjmp/longjmp are also problematic for many reasons
(often the same as C++ exceptions: it is difficult to get right). At
least with goto you can often understand what's wrong - not so much with
longjmp/setjmp - if the error handling is too complicated to handle with
simple goto, then the code itself is too complicated. If you need to
handle deeply nested errors, then using the C API for python exception
is good enough.

I agree we are not as good as we should in numpy for error handling at
the C level, but I think it boils down to not having a unified error
handling (that is -3 may mean could not allocate in one function, and
could not convert the argument to something in another). But that can be
solved with a unified set of error codes and associated strings + a
simple log system (that's how sndfile does it for example, and I really
like it).

cheers,

David

David Cournapeau

2009-09-23 16:25:46 UTC

Permalink

Post by Sturla Molden

Post by David Cournapeau
It can be a full rewrite, but still should be sent as patches. If I am
the one to review, I would prefer this way. That's especially
important to track regressions.

Well.. I didn't intend this for inclusion in SciPy, at least not in the
beginning. I needed it for some neuroscience software I am writing.

Does this mean we cannot use g++ to compile extensions at all, when
Python VM is compiled with MSVC?

Oh, you certainly *can*, as we can use gcc to compile extensions
against python built with MS compiler. But there are limitations -
with C, those limitations are known, with C++, they become atrocious
once you use the C++ runtime (RTTI, exception, etc...). I had to debug
some stuff in sparsetools on windows 64 bits a few days ago, and this
was horrible because the exceptions are lost between DLL.

Post by Sturla Molden
I don't like the complexity of C++. But it has some advantages over C
for scientific computing; notably STL containers, templates for
generics, the std::complex<> type, and so on.

For basic templates, the .src stuff is good enough. For complex, yes,
that's a bit of a pain, although you could use the C99 complex type if
you only care about modern C compilers in your own extension.

Post by Sturla Molden
I personally like exceptions because they remove the need for lot or
error checking.

I call this a disadvantage :) And having exceptions in a language
without gc causes more trouble than it worths IMHO. Exception-safe C++
is incredibly hard to do right.

Post by Sturla Molden
In C, we can achieve almost the same effect using
setjmp/longjmp. Is that bad style as well?

The standard way to handle errors in C code using the python API is
goto. It works well, and if you need to jump several calls, you can
always use python exception mechanism.

cheers,

David

Sturla Molden

2009-09-24 01:43:30 UTC

Permalink

Post by David Cournapeau
Oh, you certainly *can*, as we can use gcc to compile extensions
against python built with MS compiler. But there are limitations -
with C, those limitations are known, with C++, they become atrocious
once you use the C++ runtime (RTTI, exception, etc...). I had to debug
some stuff in sparsetools on windows 64 bits a few days ago, and this
was horrible because the exceptions are lost between DLL.

Mingw (g++) will statically link its own C++ runtime. DLLs compiled
with g++ do not share C++ runtime. This obviously is a problem if you
need to propagate an exception from one DLL to another. MS Visual C++
does not have this issue.

You get the same issue in C if a DLL statically links its own C runtime,
or two DLLs use different shared C runtimes.

Post by David Cournapeau
I call this a disadvantage :) And having exceptions in a language
without gc causes more trouble than it worths IMHO. Exception-safe C++
is incredibly hard to do right.

I'd say exceptipn safe C++ is easy to do right, but C++ textbooks don't
teach you how. They generally teach you all the mistakes you can do. And
if a book has 'best practices' or 'effective' in the title, it is likely
full of BS.

The main issue is to never -- ever -- open an allocated reource outside
a constructor. That includes calling new or new[]. Always use
std::vector instead of new[], unless you have an extremely good reason.
Always deallocate resources in a destructor. Objects should always be
put on the stack (unless allocated in a constructor), and references
should be used instead of pointers.

For example, look at wxWidgets, that has an additional Destroy() method
in the GUI classes. Destroy() must be called manually. This is a classic
C++ mistake. Destroy() is not called automatically, so an exception will
mess this up.

Sturla Molden

Post by David Cournapeau

Post by Sturla Molden
In C, we can achieve almost the same effect using
setjmp/longjmp. Is that bad style as well?

The standard way to handle errors in C code using the python API is
goto. It works well, and if you need to jump several calls, you can
always use python exception mechanism.
cheers,
David
_______________________________________________
Scipy-dev mailing list
http://mail.scipy.org/mailman/listinfo/scipy-dev

David Cournapeau

2009-09-24 01:37:55 UTC

Permalink

Post by Sturla Molden
Mingw (g++) will statically link its own C++ runtime. DLLs compiled
with g++ do not share C++ runtime. This obviously is a problem if you
need to propagate an exception from one DLL to another. MS Visual C++
does not have this issue.

yes it does, when you mix C runtimes (/MD vs /MDd for example),
different C++ code is compiled with different incompatible options, or
when you mix mingw and msvc (Those are all concrete problems I had to
solve on windows at one point or the other BTW).

Those problems are not c++ specific, but they are much more manageable
in C than in C++. Interoperability with C++ from other languages is a
known problem - given that numpy supports so many compilers and
platforms, and that using 'weird' compilers is common in scientific
computing, I don't think it worths the hassle in most cases.

cheers,

David

Sturla Molden

2009-09-23 01:01:53 UTC

Permalink

Oh, the docs, not the code... Would something like this be better?

The Cython module doesn't need to have docs in it.

from linear_filter import lfilter as _lfilter

def lfilter(b, a, x, axis=-1, zi=None, direction=1, inplace=-1):
"""Filter data along one-dimension with an IIR or FIR filter.

Description

Filter a data sequence, x, using a digital filter. This works for many
fundamental data types (including Object type). The filter is a direct
form II transposed implementation of the standard difference equation
(see "Algorithm").

Inputs:

b -- The numerator coefficient vector in a 1-D sequence.
a -- The denominator coefficient vector in a 1-D sequence. If a[0]
is not 1, then both a and b are normalized by a[0].
x -- An N-dimensional input array.
axis -- The axis of the input data array along which to apply the
linear filter. The filter is applied to each subarray along
this axis. If axis is -1, the filter is applied to x
flattened. (*Default* = -1)
zi -- Initial conditions for the filter delays. It is a vector
(or array of vectors for an N-dimensional input) of length
max(len(a),len(b)). If zi=None or is not given then initial
rest is assumed. If axis is None, zi should be a vector
of length K = max(M,N), where M=len(b)-1 and N=len(a)-1. If
an axis is not None, zi should either be a vector of length
K, or an array with same shape as x, except for zi.shape[axis]
which should be K. If zi is a vector, the same initial
conditions are applied to each vector along the specified
axis. Default: zi=None (initial rest).
direction -- Filtered in the reverse direction if -1. (*Default* = 1)
inplace -- If possible, modification to x is done implace.
(*Default* = -1)

Outputs: (y, {zf})

y -- The output of the digital filter.
zf -- If zi is None, this is not returned, otherwise, zf holds the
final filter delay values.

Algorithm:

The filter function is implemented as a direct II transposed structure.
This means that the filter implements

a[0]*y[n] = b[0]*x[n] + b[1]*x[n-1] + ... + b[nb]*x[n-nb]
- a[1]*y[n-1] - ... - a[na]*y[n-na]

using the following difference equations:

y[m] = b[0]*x[m] + z[0,m-1]
z[0,m] = b[1]*x[m] + z[1,m-1] - a[1]*y[m]
...
z[n-3,m] = b[n-2]*x[m] + z[n-2,m-1] - a[n-2]*y[m]
z[n-2,m] = b[n-1]*x[m] - a[n-1]*y[m]

where m is the output sample number and n=max(len(a),len(b)) is the
model order.

The rational transfer function describing this filter in the
z-transform domain is
-1 -nb
b[0] + b[1]z + ... + b[nb] z
Y(z) = ---------------------------------- X(z)
-1 -na
a[0] + a[1]z + ... + a[na] z

"""
if isscalar(a):
a = [a]

if axis < 0: axis = None
if inplace < 0: inplace = None

return _lfilter(b, a, x, axis=axis, zi=zi,
inplace=inplace, direction=direction)

Regards,
S.M.

Ralf Gommers

2009-09-23 01:40:48 UTC

Permalink

Post by Sturla Molden

Oh, the docs, not the code... Would something like this be better?

See attached for a new version which incorporates your changes, the latest
wiki version, as well as some more edits. It fixes indentation, type
descriptions and rest markup.

I have one (very minor) comment on the options you added, "direction" and
"inplace". I think these should both be bools, because they select between
two options.

The C++ code looks clean, I do not know enough about that language to say
much more about it.

Cheers,
Ralf

Post by Sturla Molden
The Cython module doesn't need to have docs in it.
from linear_filter import lfilter as _lfilter
"""Filter data along one-dimension with an IIR or FIR filter.
Description
Filter a data sequence, x, using a digital filter. This works for many
fundamental data types (including Object type). The filter is a direct
form II transposed implementation of the standard difference equation
(see "Algorithm").
b -- The numerator coefficient vector in a 1-D sequence.
a -- The denominator coefficient vector in a 1-D sequence. If a[0]
is not 1, then both a and b are normalized by a[0].
x -- An N-dimensional input array.
axis -- The axis of the input data array along which to apply the
linear filter. The filter is applied to each subarray along
this axis. If axis is -1, the filter is applied to x
flattened. (*Default* = -1)
zi -- Initial conditions for the filter delays. It is a vector
(or array of vectors for an N-dimensional input) of length
max(len(a),len(b)). If zi=None or is not given then initial
rest is assumed. If axis is None, zi should be a vector
of length K = max(M,N), where M=len(b)-1 and N=len(a)-1. If
an axis is not None, zi should either be a vector of length
K, or an array with same shape as x, except for zi.shape[axis]
which should be K. If zi is a vector, the same initial
conditions are applied to each vector along the specified
axis. Default: zi=None (initial rest).
direction -- Filtered in the reverse direction if -1. (*Default* = 1)
inplace -- If possible, modification to x is done implace.
(*Default* = -1)
Outputs: (y, {zf})
y -- The output of the digital filter.
zf -- If zi is None, this is not returned, otherwise, zf holds the
final filter delay values.
The filter function is implemented as a direct II transposed structure.
This means that the filter implements
a[0]*y[n] = b[0]*x[n] + b[1]*x[n-1] + ... + b[nb]*x[n-nb]
- a[1]*y[n-1] - ... - a[na]*y[n-na]
y[m] = b[0]*x[m] + z[0,m-1]
z[0,m] = b[1]*x[m] + z[1,m-1] - a[1]*y[m]
...
z[n-3,m] = b[n-2]*x[m] + z[n-2,m-1] - a[n-2]*y[m]
z[n-2,m] = b[n-1]*x[m] - a[n-1]*y[m]
where m is the output sample number and n=max(len(a),len(b)) is the
model order.
The rational transfer function describing this filter in the
z-transform domain is
-1 -nb
b[0] + b[1]z + ... + b[nb] z
Y(z) = ---------------------------------- X(z)
-1 -na
a[0] + a[1]z + ... + a[na] z
"""
a = [a]
if axis < 0: axis = None
if inplace < 0: inplace = None
return _lfilter(b, a, x, axis=axis, zi=zi,
inplace=inplace, direction=direction)
Regards,
S.M.
_______________________________________________
Scipy-dev mailing list
http://mail.scipy.org/mailman/listinfo/scipy-dev