Getting error AttributeError: 'FlatParam' object has no attribute 'dist' #15

Open
spencerm89 opened this issue Dec 13, 2019 · 7 comments

Comments

@spencerm89

I am getting this error even though I am not passing any 'dist' attribute to FlatParam, only the test_value argument that it is supposed to take. Below is a simplified toy model that also produces the error. It is a random effects model. Initially I thought the problem was somehow caused by the LLB line and its inclusion in the return line, but the error still occurs when that is removed.

from pydream.parameters import FlatParam
from pydream.parameters import SampledParam
from pydream.core import run_dream
from pydream.convergence import Gelman_Rubin
import numpy as np
#from pysb.integrate import Solver
from scipy.stats import norm
from scipy.stats import uniform
from dill import dump_session
x=norm.rvs(size=10)
xb=np.repeat(x,10)
y=xb+norm.rvs(size=100)
explan=np.repeat(range(0,9),10)
pysb_sampled_parameter_names=['B','sigmaB','sigma']
sigma = SampledParam(uniform,loc=0,scale=1000)
sigmaB = SampledParam(uniform,loc=0,scale=1000)
Blst = np.linspace(0,0,num=10)
B = FlatParam(test_value=Blst)
sampled_parameter_names=[B,sigmaB,sigma]
def likFH(parameter_vector):
    param_dict = {pname: pvalue for pname, pvalue in zip(pysb_sampled_parameter_names, parameter_vector)}
    LLB = norm.logpdf(param_dict['B'],0,param_dict['sigmaB'])
    Mu = param_dict['B'][explan]
    LL1 = norm.logpdf(y,Mu,param_dict['sigma'])
    return (np.sum(LL1)+np.sum(LLB))

#Run model
sampled_params, log_ps = run_dream(sampled_parameter_names, likFH, model_name='ToyRE_5chain', verbose=True)

@ortega2247
Contributor

Hi, thank you for reporting this problem. From looking at your code there are some clarifications that I would like to point out:

  1. FlatParam is a Flat parameter class that returns 0 for the prior log probability at any location in parameter space. This FlatParam class is not a distribution and therefore doesn't support random draws to initialize the pydream algorithm. If you want to use this parameter you should provide a file with an initial sample matrix as an argument to the run_dream function.
  2. The test_value parameter of FlatParam determines the parameter dimension, so if you only want one FlatParam in your model you should use FlatParam(test_value=1)
  3. I am not sure I understand your likelihood function. It seems that when you calculate the logpdf you're assigning different loc and scale to the norm distribution.
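Because FlatParam cannot supply random starting points (point 1), the seed matrix has to be built by hand. The following is a minimal sketch, assuming one row per seed sample and one column per scalar parameter (10 entries for B, then sigmaB and sigma, 12 in total), and assuming the two standard deviations should start strictly positive. The row count, value ranges, and random seed are illustrative only.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only so the sketch is reproducible

n_seed_rows = 10  # number of seed samples (illustrative)
n_groups = 10     # dimension of B, matching a length-10 test_value

# Group effects B may be negative, so draw them from a standard normal.
b_seed = rng.normal(loc=0.0, scale=1.0, size=(n_seed_rows, n_groups))
# sigmaB and sigma are standard deviations: keep them strictly positive.
sigma_seed = rng.uniform(low=0.1, high=10.0, size=(n_seed_rows, 2))

# Assumed column order: [B_0 ... B_9, sigmaB, sigma].
seed_matrix = np.hstack([b_seed, sigma_seed])

# Written to the temp directory here; any writable path would do.
seed_path = os.path.join(tempfile.gettempdir(), 'model_seed.npy')
np.save(seed_path, seed_matrix)
```

A seed drawn entirely from a zero-mean normal puts negative values in the standard-deviation columns about half the time, and a normal log-density with a negative scale evaluates to NaN in scipy, so keeping those columns positive removes one plausible source of NaNs.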

I hope this information is helpful.

This is an example that runs, but the likelihood function returns only NaNs:

from pydream.parameters import FlatParam
from pydream.parameters import SampledParam
from pydream.core import run_dream
from pydream.convergence import Gelman_Rubin
import numpy as np
#from pysb.integrate import Solver
from scipy.stats import norm
from scipy.stats import uniform
from dill import dump_session
x=norm.rvs(size=10)
xb=np.repeat(x,10)
y=xb+norm.rvs(size=100)
explan=np.repeat(range(0,9),10)
pysb_sampled_parameter_names=['B','sigmaB','sigma']
sigma = SampledParam(uniform,loc=0,scale=1000)
sigmaB = SampledParam(uniform,loc=0,scale=1000)
Blst = np.linspace(0,0,num=10)
B = FlatParam(test_value=Blst)
sampled_parameter_names=[B,sigmaB,sigma]

mean = np.linspace(0,0,num=12)
cov = np.identity(12)
m = np.random.multivariate_normal(mean, cov, size=10)
np.save('model_seed.npy', m)


def likFH(parameter_vector):
    param_dict = {pname: pvalue for pname, pvalue in zip(pysb_sampled_parameter_names, parameter_vector)}
    LLB = norm.logpdf(param_dict['B'],0,param_dict['sigmaB'])
    Mu = param_dict['B']
    LL1 = norm.logpdf(y,Mu,param_dict['sigma'])
    return (np.sum(LL1)+np.sum(LLB))

starts = [m[chain] for chain in range(3)]

sampled_params, log_ps = run_dream(sampled_parameter_names, likFH, model_name='ToyRE_5chain', start=starts,
                                   history_file='model_seed.npy', start_random=False, parallel=False, verbose=True)

@spencerm89
Author

Thanks for the reply. However, I am still getting the error 'FlatParam' object has no attribute 'dist' with your example. Also, I believe the line Mu = param_dict['B'][explan] does need [explan], because param_dict['B'] is a vector of 10 while y is a vector of 100. I am sorry if there isn't an actual bug behind my issue, and perhaps I should have posted this on Stack Exchange rather than GitHub, but I thought I'd be much more likely to get a reply here. I also think it would be quite useful for you to have a working example of a hierarchical model for PyDREAM.
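To make the shape argument concrete: indexing a length-10 coefficient vector with a length-100 group index expands it to one mean per observation. The sketch below uses made-up values for B. Note that np.repeat(range(0, 9), 10) yields only 90 indices (groups 0 to 8), so 100 observations in 10 groups need range(0, 10).

```python
import numpy as np

B = np.arange(10, dtype=float)              # stand-in values for the 10 group effects
explan_short = np.repeat(range(0, 9), 10)   # 9 groups x 10 = 90 indices
explan = np.repeat(range(0, 10), 10)        # 10 groups x 10 = 100 indices

Mu = B[explan]                              # one mean per observation
print(len(explan_short), len(explan), Mu.shape)  # prints: 90 100 (100,)
```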

  1. Thanks for pointing out that the FlatParam function does not support random draws, I didn't realize that.
  2. I did not want to use 1 FlatParam in my model, I wanted to use 10.
  3. What I am trying to do is fit a hierarchical (multilevel) model. For more information on how these models work in Stan, see the radon case study (https://mc-stan.org/users/documentation/case-studies/radon.html). I am using partial pooling for the B values: there are 10 different levels of the B variable and each level has 10 observations. The B coefficients are normally distributed with mean 0 and standard deviation sigmaB, so the line LLB = norm.logpdf(param_dict['B'],0,param_dict['sigmaB']) can be thought of as the actual prior distribution for B. I was only trying to use FlatParam to initialize B, more or less.

So any ideas on getting this kind of system to work?
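One way the partial-pooling likelihood described above could be written is sketched below. It assumes the sampler hands the likelihood a single flat vector in the order the parameters were listed, so a 10-dimensional B followed by sigmaB and sigma arrives as 12 numbers that have to be sliced rather than zipped against three names. That ordering is an assumption, not something confirmed in this thread. The helper normal_logpdf is hand-written here to keep the sketch free of extra dependencies, and the toy data only mimics the original post.

```python
import numpy as np

N_GROUPS = 10

# Toy data in the spirit of the original post: 10 groups, 10 observations each.
rng = np.random.default_rng(1)
true_b = rng.normal(size=N_GROUPS)
group_index = np.repeat(np.arange(N_GROUPS), 10)
y = true_b[group_index] + rng.normal(size=group_index.size)


def normal_logpdf(x, loc, scale):
    """Elementwise log density of Normal(loc, scale) (hand-rolled helper)."""
    z = (x - loc) / scale
    return -0.5 * z ** 2 - np.log(scale) - 0.5 * np.log(2.0 * np.pi)


def likFH(parameter_vector):
    """Log posterior kernel over a flat vector [B_0..B_9, sigmaB, sigma]."""
    theta = np.asarray(parameter_vector, dtype=float)
    B = theta[:N_GROUPS]
    sigma_b = theta[N_GROUPS]
    sigma = theta[N_GROUPS + 1]
    # Non-positive standard deviations are outside the support.
    if sigma_b <= 0 or sigma <= 0:
        return -np.inf
    log_prior_b = normal_logpdf(B, 0.0, sigma_b)       # B_j ~ Normal(0, sigmaB)
    log_lik = normal_logpdf(y, B[group_index], sigma)  # y_i ~ Normal(B_group(i), sigma)
    return float(np.sum(log_lik) + np.sum(log_prior_b))


theta_good = np.concatenate([true_b, [1.0, 1.0]])
theta_bad = np.concatenate([true_b, [-1.0, 1.0]])
print(likFH(theta_good), likFH(theta_bad))
```

Returning -inf for invalid standard deviations is a common convention for rejecting a proposal; whether PyDREAM handles that value gracefully is not verified here.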

@ortega2247
Contributor

Hi!
I looked at the example that you pointed out. Here I reproduce the first figure of the radon example, using the PyMC3 version of the example and the data provided in their package: https://docs.pymc.io/notebooks/multilevel_modeling.html
I hope this is helpful.

import numpy as np
import pandas as pd
from pymc3 import __version__
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import norm, halfcauchy
from pydream.parameters import SampledParam
from pydream.core import run_dream

plt.style.use('seaborn-darkgrid')
print('Running on PyMC3 v{}'.format(__version__))

#### Use pymc3 data
from pymc3 import get_data

# Import radon data
srrs2 = pd.read_csv(get_data('srrs2.dat'))
srrs2.columns = srrs2.columns.map(str.strip)
srrs_mn = srrs2[srrs2.state=='MN'].copy()


srrs_mn['fips'] = srrs_mn.stfips*1000 + srrs_mn.cntyfips
cty = pd.read_csv(get_data('cty.dat'))
cty_mn = cty[cty.st=='MN'].copy()
cty_mn[ 'fips'] = 1000*cty_mn.stfips + cty_mn.ctfips


srrs_mn = srrs_mn.merge(cty_mn[['fips', 'Uppm']], on='fips')
srrs_mn = srrs_mn.drop_duplicates(subset='idnum')
u = np.log(srrs_mn.Uppm)

n = len(srrs_mn)

srrs_mn.county = srrs_mn.county.map(str.strip)
mn_counties = srrs_mn.county.unique()
counties = len(mn_counties)
county_lookup = dict(zip(mn_counties, range(len(mn_counties))))


county = srrs_mn['county_code'] = srrs_mn.county.replace(county_lookup).values
radon = srrs_mn.activity
srrs_mn['log_radon'] = log_radon = np.log(radon + 0.1).values
floor_measure = srrs_mn.floor.values

floor = srrs_mn.floor.values
log_radon = srrs_mn.log_radon.values
#### Use pymc3 data


# Define priors
beta0 = SampledParam(norm,loc=0,scale=1e5)
beta1 = SampledParam(norm,loc=0,scale=1e5)
sigma = SampledParam(halfcauchy, scale=5)
sampled_parameters = [beta0, beta1, sigma]


# Define likelihood function
def likFH(parameter_vector):
    theta = parameter_vector[0] + parameter_vector[1]*floor
    # Sum over the logpdf
    total_logp = np.sum(norm.logpdf(log_radon, theta, parameter_vector[2]))
    return total_logp

niterations = 20000
nchains = 3
sampled_params, log_ps = run_dream(sampled_parameters, likFH, model_name='radon_model', niterations=niterations,
                                   nchains=nchains, verbose=True)

b0 = np.mean([chain[:, 0][10000:] for chain in sampled_params])
m0 = np.mean([chain[:, 1][10000:] for chain in sampled_params])

plt.scatter(srrs_mn.floor, np.log(srrs_mn.activity+0.1))
xvals = np.linspace(-0.2, 1.2)
plt.plot(xvals, m0*xvals+b0, 'r--');
plt.show()

[Figure: radon_model (scatter of log radon against floor with the fitted regression line)]
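Before averaging the post-burn-in samples, it can be worth checking that the chains agree with each other. The sketch below computes the Gelman-Rubin statistic by hand with NumPy rather than calling the Gelman_Rubin function from pydream.convergence, whose exact signature is not shown in this thread. The synthetic chains stand in for sampled_params, which is assumed to be a list of arrays of shape (niterations, nparameters), as the slicing chain[:, 0] above suggests.

```python
import numpy as np


def gelman_rubin(chains):
    """Potential scale reduction factor for each parameter.

    chains: sequence of arrays, each of shape (n_samples, n_params).
    """
    x = np.stack(chains)                                  # (n_chains, n_samples, n_params)
    n = x.shape[1]
    within = x.var(axis=1, ddof=1).mean(axis=0)           # W: mean within-chain variance
    between_over_n = x.mean(axis=1).var(axis=0, ddof=1)   # B / n: variance of chain means
    var_hat = (n - 1) / n * within + between_over_n
    return np.sqrt(var_hat / within)


# Synthetic stand-in for sampled_params: 3 chains, 20000 iterations, 3 parameters.
rng = np.random.default_rng(2)
fake_chains = [rng.normal(size=(20000, 3)) for _ in range(3)]

burn_in = 10000
kept = [chain[burn_in:] for chain in fake_chains]

r_hat = gelman_rubin(kept)
posterior_mean = np.concatenate(kept).mean(axis=0)
print(r_hat, posterior_mean)
```

Values of the statistic close to 1 indicate that the chains are sampling the same distribution; values well above 1 suggest running longer or discarding more burn-in.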

@spencerm89
Author

Thanks for spending the time to write up that code. However, I have tried to run it on both my work and personal computers, and I always get the error "NameError: name 'floor' is not defined". It appears that for some reason the likFH function is unable to access the variable floor, which should be globally defined. I am running Python 3.7.2 in Jupyter, in case that has something to do with the error. I should also say that this example doesn't really fit what I wanted: the partial pooling example (code snippet 18) is what I actually want to do, not the pooling example (which isn't actually a hierarchical/multilevel model).

@ortega2247
Contributor

Are you running things on a Windows machine?

@spencerm89
Author

spencerm89 commented Jan 20, 2020 via email

@curtis-jones

I am similarly receiving an "AttributeError: Can't get attribute 'likFH' on <module '__main__'" error. If I import my likelihood function I can get around the error, but in that case I don't know how to pass the additional arguments needed for the likelihood calculations. I'm aware of a related thread (#16) but can't seem to resolve my issue. I am working on a Windows machine. Any ideas as to what I am doing wrong?
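One possible way to combine both requirements is sketched here, under the assumption that the error comes from worker processes on Windows having to import the likelihood rather than inherit it from the interactive session: keep the likelihood as a top-level function in its own module and bind the extra arguments with functools.partial. The module name my_likelihood.py, the function name, and the data are hypothetical, and this thread does not confirm that PyDREAM accepts a partial object in place of a plain function.

```python
# Contents of a hypothetical module, e.g. my_likelihood.py
import math
from functools import partial


def gaussian_loglik(parameter_vector, x, y):
    """Log-likelihood of y ~ Normal(b0 + b1 * x, sigma); data passed explicitly."""
    b0, b1, sigma = parameter_vector
    if sigma <= 0:
        return -math.inf
    total = 0.0
    for xi, yi in zip(x, y):
        resid = yi - (b0 + b1 * xi)
        total += -0.5 * (resid / sigma) ** 2 - math.log(sigma) - 0.5 * math.log(2 * math.pi)
    return total


# In the driver script: bind the data once, leaving a one-argument callable.
x_data = [0.0, 1.0, 0.0, 1.0]
y_data = [1.0, 0.4, 1.2, 0.5]
likelihood = partial(gaussian_loglik, x=x_data, y=y_data)

if __name__ == '__main__':
    # On Windows, code that starts worker processes belongs under this guard.
    print(likelihood([1.0, -0.5, 0.3]))
```

A partial of a top-level function can be pickled by the standard library, unlike a lambda or a function defined inside a notebook cell that the worker processes cannot import.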
