A problem about NAN #20

Open
Simon1zxb opened this issue Jul 30, 2018 · 2 comments

@Simon1zxb

Hi josedolz,
Sorry to bother you.
I am trying to use your network for some experiments.
However, when I train my own model, the cost becomes NaN after a single training step.
I really don't know why. Have you run into the same problem before?
The network parameters follow the first architecture in your paper (kernel 7×7×7, input 27×27×27).
I also modified the code to run under Python 3.
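
To pin down the exact step at which the cost stops being a number, one option is to check each batch cost as it is produced. The following is a minimal sketch, not part of LiviaNet; the helper name first_non_finite is hypothetical, and it assumes the per-batch costs can be converted to Python floats.

```python
import math


def first_non_finite(costs):
    """Return the index of the first NaN or infinite cost, or None if all are finite.

    Hypothetical helper: `costs` is assumed to be an iterable of per-batch
    cost values that can be converted to float.
    """
    for index, cost in enumerate(costs):
        if not math.isfinite(float(cost)):
            return index
    return None
```

Calling it on the list of batch costs after each subepoch shows whether the very first batch is already non-finite or whether the cost diverges later.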

@ytliang97

Same problem. I set the following in LiviaNet_Config.ini:

n_classes = 5
number Of Epochs = 10
number Of SubEpochs = 1000
imageTypes = 0
# custom dataset, no ROI
# all other settings are unchanged

I think it is because imagesSamplesAll contains 0 samples in src/LiviaNet/startTraining.py.
That makes numberBatches = 0, so the model stops training.

[imagesSamplesAll,
 gt_samplesAll] = getSamplesSubepoch(numberOfSamplesSupEpoch,
                                     imageNames_Train,
                                     groundTruthNames_Train,
                                     roiNames_Train,
                                     imageType,
                                     sampleSize_Train,
                                     receptiveField,
                                     applyPadding
                                     )

# Variable that will contain weights for the cost function
# --- In its current implementation, all the classes have the same weight
weightsCostFunction = np.ones(myLiviaNet3D.n_classes, dtype='float32')

numberBatches = len(imagesSamplesAll) / myLiviaNet3D.batch_Size

myLiviaNet3D.trainingData_x.set_value(imagesSamplesAll, borrow=True)
myLiviaNet3D.trainingData_y.set_value(gt_samplesAll, borrow=True)

costsOfBatches = []
evalResultsSubepoch = np.zeros([ myLiviaNet3D.n_classes, 4 ], dtype="int32")

for b_i in xrange(numberBatches):
    # TODO: Make a line that adds a point at each trained batch (Or percentage being updated)
    costErrors = myLiviaNet3D.networkModel_Train(b_i, weightsCostFunction)
    meanBatchCostError = costErrors[0]
    costsOfBatches.append(meanBatchCostError)
    myLiviaNet3D.updateLayersMatricesBatchNorm()

#======== Calculate and Report accuracy over subepoch
meanCostOfSubepoch = sum(costsOfBatches) / float(numberBatches)
print(" ---------- Cost of this subEpoch: {}".format(meanCostOfSubepoch))

I am still trying to figure out why getSamplesSubepoch() returns 0 samples:

[imagesSamplesAll,
 gt_samplesAll] = getSamplesSubepoch(numberOfSamplesSupEpoch,
                                     imageNames_Train,
                                     groundTruthNames_Train,
                                     roiNames_Train,
                                     imageType,
                                     sampleSize_Train,
                                     receptiveField,
                                     applyPadding
                                     )
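
One way to make this failure visible is to validate the batch count before the loop in the excerpt above runs. The following is only a sketch under stated assumptions: number_of_batches is a hypothetical helper, not a function from the repository. It uses floor division because, under Python 3, `/` returns a float and `xrange` no longer exists, so a ported loop needs an integer count passed to `range`.

```python
def number_of_batches(n_samples, batch_size):
    """Return how many full batches fit in n_samples, failing loudly on zero.

    Hypothetical helper, not part of LiviaNet. Floor division (//) keeps the
    result an int under both Python 2 and Python 3.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive, got {}".format(batch_size))
    n_batches = n_samples // batch_size
    if n_batches == 0:
        # Zero batches means the training loop would never execute.
        raise ValueError(
            "No full batch available: {} samples for batch size {}. "
            "Check that the sampling step returned any samples.".format(
                n_samples, batch_size
            )
        )
    return n_batches
```

In the excerpt this would stand in for the numberBatches assignment, for example number_of_batches(len(imagesSamplesAll), myLiviaNet3D.batch_Size), with the loop then written as `for b_i in range(numberBatches)` under Python 3.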

@josedolz
Owner

josedolz commented Jun 24, 2020

Hi, it has been a long time since I last used this code, but I recall that this happens when there are 0 samples to process (typically because of a wrong path to the image files). Check whether the files can be found and loaded, and then why the number of samples is 0.

Best,
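
The path check suggested above could be sketched as follows. This is an assumption-laden illustration rather than code from the repository: find_missing_files is a hypothetical helper, and it assumes the training image, ground-truth, and ROI names are available as lists of file paths.

```python
import os


def find_missing_files(paths):
    """Return the entries of `paths` that do not point to an existing file.

    Hypothetical helper, not part of LiviaNet.
    """
    return [path for path in paths if not os.path.isfile(path)]
```

Running it on each list of file names before sampling, and stopping if the result is non-empty, separates a wrong path from a genuine sampling problem.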
