About the validation #4

Open
junfengluo opened this issue Aug 31, 2017 · 13 comments

Comments

@junfengluo

Hi, can you tell me about the validation part? There is no description of validation in your code. Can I just use eval.py to evaluate the trained model on the validation data, in the same way as your inference.py commands, as shown below? Or can you give me an example?

python eval.py --eval_data_pattern="$path_to_features/validatea*.tfrecord" --model=NetVLADModelLF --train_dir=gatedlightvladLF-256k-1024-80-0002-300iter-norelu-basic-gatedmoe --frame_features=True --feature_names="rgb,audio" --feature_sizes="1024,128" --batch_size=1024 --base_learning_rate=0.0002 --netvlad_cluster_size=256 --netvlad_hidden_size=1024 --moe_l2=1e-6 --iterations=300 --learning_rate_decay=0.8 --netvlad_relu=False --gating=True --moe_prob_gating=True --lightvlad=True --run_once=True --top_k=50

@wincle

wincle commented Aug 31, 2017

The eval command is almost the same as the inference command. I have tried it and got a result like this:
INFO:tensorflow:epoch/eval number 266878 | Avg_Hit@1: 0.902 | Avg_PERR: 0.795 | MAP: 0.168 | GAP: 0.8706 | Avg_Loss: 3.936493
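For anyone comparing several runs, a summary line like the one above can be split into its metrics with a few lines of Python. This is only a minimal sketch that assumes the line keeps exactly the `name: value | name: value` layout quoted here; the helper name `parse_eval_line` is made up for illustration and is not part of the repository.

```python
import re


def parse_eval_line(line):
    """Split one eval summary line into a step number and a dict of metrics."""
    fields = [part.strip() for part in line.split("|")]
    # The first field ends with the step counter, e.g. "... epoch/eval number 266878".
    step_match = re.search(r"(\d+)\s*$", fields[0])
    step = int(step_match.group(1)) if step_match else None
    metrics = {}
    for field in fields[1:]:
        # Each remaining field looks like "GAP: 0.8706".
        name, _, value = field.partition(":")
        metrics[name.strip()] = float(value)
    return step, metrics


# The log line quoted in this thread.
line = ("INFO:tensorflow:epoch/eval number 266878 | Avg_Hit@1: 0.902 | "
        "Avg_PERR: 0.795 | MAP: 0.168 | GAP: 0.8706 | Avg_Loss: 3.936493")
step, metrics = parse_eval_line(line)
print(step, metrics["GAP"])
```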

@antoine77340
Owner

Yes, your example is correct. Is it working?
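Going only by the two commands quoted in this thread, an eval command seems to differ from the matching inference command in three ways: the script is eval.py, `--input_data_pattern` becomes `--eval_data_pattern`, and `--output_file` is dropped. The rough Python sketch below applies just those three changes. The function name `inference_to_eval` and the validation path in the usage line are hypothetical, and whether eval.py accepts every remaining flag is not checked here.

```python
import shlex

# Differences inferred from the commands in this thread, not from the eval.py source.
RENAMED = {"--input_data_pattern": "--eval_data_pattern"}
DROPPED = {"--output_file"}


def inference_to_eval(command, eval_data_pattern):
    """Turn an inference command string into a list of eval command tokens."""
    out = []
    for token in shlex.split(command):
        if token.endswith("inference.py"):
            # Keep any directory prefix such as "../" and swap the script name.
            out.append(token[: -len("inference.py")] + "eval.py")
            continue
        flag, sep, value = token.partition("=")
        if flag in DROPPED:
            continue
        if flag in RENAMED:
            # Point the renamed flag at the validation files instead of the test files.
            out.append(RENAMED[flag] + "=" + eval_data_pattern)
            continue
        out.append(flag + sep + value)
    return out


# Hypothetical example; the paths are placeholders.
cmd = ('python inference.py --output_file=out.csv '
       '--input_data_pattern="/data/test/test*.tfrecord" --model=GruModel --top_k=20')
print(shlex.join(inference_to_eval(cmd, "/data/validate/validate*.tfrecord")))
```

The model and training flags are passed through untouched, so the same `--train_dir` is evaluated that the inference command would have used.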

@junfengluo
Author

I am still in the training process; I just wanted to know the eval command in advance. OK, thanks very much anyway, @antoine77340.

@junfengluo
Author

OK, thanks, I will also try it based on the inference command. @wincle

@junfengluo
Author

@antoine77340 Hi, here is another question: if I use all the train and validation data to train these models with no eval process, are the test results reasonable? Have you tried this in your experiments?

@junfengluo
Author

Hi, when I ran your inference code on the test data, I got the following error: "tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value train_input/input_producer/limit_epochs/epochs". I did not modify the code anywhere. @antoine77340 @wincle

@antoine77340
Owner

Hi @junfengluo, if you use the train and validation data to train the models without an eval process, the results would be very similar.

  • Can you send the inference command that triggered this error?

@junfengluo
Author

The inference command is the same as yours; I just copied it, so I don't know where the problem is. For example, in "inference-GRU", my command is:

python ../inference.py --output_file=test-GRU-0002-1200.csv --input_data_pattern="/data/test/test*.tfrecord" --model=GruModel --train_dir=GRU-0002-1200 --frame_features=True --feature_names="rgb,audio" --feature_sizes="1024,128" --batch_size=1024 --base_learning_rate=0.0002 --gru_cells=1200 --learning_rate_decay=0.9 --moe_l2=1e-6 --run_once=True --top_k=20

The error message:
[image: screenshot of the error message]

@antoine77340
Owner

Hmm, it is strange. I do not understand this error (I am still not very good at understanding TensorFlow error messages, haha). I tried re-running this inference command with the latest TF version and it seems to work on my side. Are you sure you trained the model correctly and that at least one model was correctly exported?

@junfengluo
Author

Yeah, I also trained the models with TF 1.3.0, and I am sure the GRU model was trained correctly for 300000 steps. Two models are still training, each on a single GPU. I don't know how to solve this problem. Do these 7 models affect each other when the inference command is executed?

@wincle

wincle commented Sep 4, 2017

I haven't run into that problem; both inference and evaluation work fine for me.

@junfengluo
Author

junfengluo commented Sep 13, 2017

Hello, can you tell me how to transform a video id such as "-1VnJGJ6c2U" into the integer that is shown in the result *.csv file?

@junfengluo
Author

I found that the code that transforms the video id into an integer is mainly these two statements in inference.py:
1. video_id_batch, video_batch, num_frames_batch = get_input_data_tensors(reader, data_pattern, batch_size)
2. video_id_batch_val, video_batch_val, num_frames_batch_val = sess.run([video_id_batch, video_batch, num_frames_batch])
where video_id_batch_val is the integer. But I don't know the details. Can you tell me? @wincle @antoine77340


3 participants