Train via InputMode.TENSORFLOW
In this mode, each worker will load the entire MNIST dataset into memory (automatically downloading the dataset if needed).
Train via InputMode.SPARK
In this mode, Spark will distribute the MNIST dataset (as CSV) across the workers, so each of the workers will see only a portion of the dataset per epoch. Also note that InputMode.SPARK currently only supports a single input RDD, so the validation/test data is not used.
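To make the data-distribution difference between the two modes concrete, here is a toy, Spark-free sketch in plain Python (the function name and data are made up for illustration): in `InputMode.TENSORFLOW` every worker loads the full dataset itself, while in `InputMode.SPARK` each worker only ever consumes its own partition of the RDD per epoch.

```python
def partition(records, num_workers):
    """Round-robin split, roughly how an RDD's rows end up spread across executors."""
    parts = [[] for _ in range(num_workers)]
    for i, rec in enumerate(records):
        parts[i % num_workers].append(rec)
    return parts

dataset = list(range(10))  # stand-in for MNIST CSV rows
num_workers = 3

# InputMode.TENSORFLOW: each worker reads the entire dataset itself
tensorflow_mode = [list(dataset) for _ in range(num_workers)]

# InputMode.SPARK: each worker is fed only its partition
spark_mode = partition(dataset, num_workers)

for w in range(num_workers):
    print(f"worker {w}: TENSORFLOW sees {len(tensorflow_mode[w])} rows, "
          f"SPARK sees {len(spark_mode[w])} rows")
```

This is only a conceptual model of the partitioning, not how TFoS feeds data internally, but it shows why a single input RDD means the training data is what gets sharded across workers.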
Which mode would one want to use to achieve scalability and performance on larger datasets? I would think that the SPARK mode would be more efficient, but I'm confused by the subsequent comment: "currently only supports a single input RDD, so the validation/test data is not used." What does this mean? Would I be unable to use the validation and test datasets? If so, how could I assess the accuracy of the model?
Thanks.
Have you seen the wiki docs which explain the architecture details of TensorFlowOnSpark along with the FAQ?
Anyhow, at the most general level, TFoS basically just helps to stand up a distributed TensorFlow cluster on top of Spark executors... so to achieve scalability and performance on larger datasets, you really should understand and experiment with distributed TF first. The InputMode.SPARK is mostly a "compatibility" shim that allows you to use Spark RDDs w/ TF (with some limitations).
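As background on what "standing up a distributed TF cluster" involves: distributed TensorFlow processes discover each other through a `TF_CONFIG` environment variable describing the cluster, which TFoS effectively assembles from the Spark executors for you. A minimal sketch of that config, with hypothetical executor host names:

```python
import json
import os

# Hypothetical executor addresses; in TFoS these are discovered from the
# Spark executors at startup rather than hard-coded.
cluster = {
    "cluster": {
        "chief": ["executor-0:2222"],
        "worker": ["executor-1:2222", "executor-2:2222"],
    },
    # Each process sets its own role; this one is worker 0.
    "task": {"type": "worker", "index": 0},
}

# Each process exports its own TF_CONFIG before creating a
# tf.distribute strategy (e.g. MultiWorkerMirroredStrategy).
os.environ["TF_CONFIG"] = json.dumps(cluster)
print(os.environ["TF_CONFIG"])
```

Understanding this config is most of what "experiment with distributed TF first" means; TFoS mainly automates producing it on a Spark cluster.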
Could you provide an example/recipe for how to run a TensorFlow application on AWS EMR using TensorFlowOnSpark to achieve scale and distributability?
For example let's say I want to run this on AWS EMR using a cluster that EMR configures: https://www.tensorflow.org/recommenders/examples/basic_retrieval. It's a basic app from the Tensorflow Recommenders framework.
How would one run this example on TensorFlowOnSpark in EMR?
Also, my earlier question about InputMode.SPARK vs. InputMode.TENSORFLOW still applies to the examples in TensorFlowOnSpark, specifically https://github.com/yahoo/TensorFlowOnSpark/tree/master/examples/mnist/keras.