[SciPy-Dev] Adding a convenient method to create ufuncs in scipy.stats

Discussion:

Warren Weckesser

2017-03-16 19:19:08 UTC

I'm working on an update to the Frechet distribution in scipy.stats (see
https://github.com/scipy/scipy/issues/3258 and
https://github.com/scipy/scipy/pull/3275).

Instead of jumping through the "lazy_where" hoops that are required for
conditional computations, it would be much easier to create a ufunc for the
standard PDF, CDF and possibly other required functions. Easier, that is,
if I use the ufunc generation tools that we have over in scipy.special.
Would there be any objections to this? We already have quite a few
functions for probability distributions in scipy.special:
https://docs.scipy.org/doc/scipy/reference/special.html#raw-statistical-functions

I wouldn't mind creating ufuncs for some of the other distributions, too.
A ufunc implementation is more efficient, simplifies the code in
scipy.stats, and automatically handles broadcasting.
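[Editor's illustration] The contrast described above can be sketched in plain NumPy. This is only a sketch under stated assumptions: it uses the standard Frechet (inverse Weibull) density c * x**(-c - 1) * exp(-x**(-c)) for x > 0 as the example, the function names are hypothetical, and np.frompyfunc is a pure-Python stand-in for the compiled ufunc generation tools in scipy.special, so it shows the broadcasting behaviour but none of the speed benefit.

```python
import math

import numpy as np


def frechet_pdf_scalar(x, c):
    """Scalar density: the branch is an ordinary `if` (hypothetical helper)."""
    if x <= 0:
        return 0.0
    return c * x ** (-c - 1.0) * math.exp(-x ** (-c))


def frechet_pdf_masked(x, c):
    """Mask-based version, roughly the pattern a lazy-where helper automates."""
    x, c = np.broadcast_arrays(np.asarray(x, dtype=float),
                               np.asarray(c, dtype=float))
    out = np.zeros(x.shape)
    ok = x > 0
    # Evaluate the formula only where it is valid, then scatter back.
    xo, co = x[ok], c[ok]
    out[ok] = co * xo ** (-co - 1.0) * np.exp(-xo ** (-co))
    return out


# Wrapping the scalar function gives a real (object-dtype) ufunc, so
# broadcasting of x against c comes for free.
_frechet_pdf_obj = np.frompyfunc(frechet_pdf_scalar, 2, 1)


def frechet_pdf_ufunc(x, c):
    """Ufunc-based version; converts the object result back to float."""
    return np.asarray(_frechet_pdf_obj(x, c), dtype=float)


if __name__ == "__main__":
    x = np.array([-1.0, 0.0, 0.5, 1.0, 2.0])
    print(frechet_pdf_masked(x, 2.0))
    print(frechet_pdf_ufunc(x[:, None], np.array([1.0, 2.0])).shape)
```

In the scalar version the condition lives next to the formula, which is the simplification being proposed; the masked version has to broadcast, select, compute and scatter by hand.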

I'm bringing this up here to see if anyone has any objections to the
expansion of the statistical functions in scipy.special.

Warren

Warren Weckesser

2017-03-16 19:39:49 UTC

In my previous email, the heading hints at an alternative that I didn't
mention in the text. The question implied in the heading is: what do folks
think about adding ufunc generation tools to scipy.stats, instead of
generating the ufuncs in scipy.special? There are a lot of conditional
computations in scipy.stats that would benefit from being implemented as
ufuncs, but probably don't need to be public functions. So instead of
adding more functions to scipy.special, perhaps we could add code in
scipy.stats for generating ufuncs, many of which would be private. Of
course, we could just generate private ufuncs in scipy.special, and only
use them in scipy.stats.

What do you think?

Warren

Robert Kern

2017-03-16 20:14:32 UTC

+1 for adding more standard PDF/CDF functions to scipy.special
as needed.

There's already precedent for putting statistics-related but not
distribution-related ufuncs into scipy.special, specifically for the
conditional operations, e.g. boxcox(). On the other hand, if the functions
you are thinking of would not be part of the public API, then I'd prefer to
implement them in scipy.stats instead of scipy.special.
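[Editor's illustration] As a small sketch of the boxcox() precedent mentioned above: scipy.special.boxcox evaluates (x**lmbda - 1) / lmbda when lmbda is nonzero and log(x) when lmbda is 0, and it broadcasts its two arguments against each other. The sample values and variable names below are arbitrary choices for illustration.

```python
import numpy as np
from scipy import special

x = np.array([0.5, 1.0, 2.0, 4.0])
lmbda = np.array([[0.0], [0.5], [2.0]])  # one row of output per lambda

# The lmbda == 0 branch is handled inside the ufunc; no masking is needed.
y = special.boxcox(x, lmbda)

# The same two branches written out by hand, for comparison.
log_branch = np.log(x)                # lmbda == 0
power_branch = (x ** 2.0 - 1.0) / 2.0  # lmbda == 2

if __name__ == "__main__":
    print(y.shape)
    print(np.allclose(y[0], log_branch), np.allclose(y[2], power_branch))
```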

What work do you think is entailed in implementing the ufuncs in
scipy.stats? Is there infrastructure we need to duplicate? Can we abstract
out the build infrastructure to a common place? I haven't looked at the
build details for scipy.special in some time.

--
Robert Kern

Joshua Wilson

2017-03-16 20:28:10 UTC

Post by Robert Kern
Can we abstract out the build infrastructure to a common place?

This should be fairly easy to do. It could be set up so that each
module that needs ufuncs could have a local config file (cribbed off
of the current `FUNCS` string in `generate_ufuncs.py`).

Though I also don't object to adding more functions to special.

_______________________________________________
SciPy-Dev mailing list
https://mail.scipy.org/mailman/listinfo/scipy-dev

Joshua Wilson

2017-03-16 20:38:41 UTC

P.S. If we decide to abstract out the infrastructure, I'd be willing to
write the code.

On Thu, Mar 16, 2017 at 3:28 PM, Joshua Wilson