Getting "Ask timed out" on the latest Docker image #120

Open
ttashi opened this issue Jun 16, 2016 · 1 comment


ttashi commented Jun 16, 2016

I am getting the following timeout message when invoking a job. Any help would be appreciated.

{
"status": "ERROR",
"result": {
"message": "Ask timed out on [Actor[akka://JobServer/user/context-supervisor/xxxx#-1022892173]] after [20000 ms]",
"errorClass": "akka.pattern.AskTimeoutException",
"stack": ["akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:334)", "akka.actor.Scheduler$$anon$7.run(Scheduler.scala:117)", "scala.concurrent.Future$InternalCallbackExecutor$.scala$concurrent$Future$InternalCallbackExecutor$$unbatchedExecute(Future.scala:694)", "scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:691)", "akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(Scheduler.scala:467)", "akka.actor.LightArrayRevolverScheduler$$anon$8.executeBucket$1(Scheduler.scala:419)", "akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:423)", "akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375)", "java.lang.Thread.run(Thread.java:745)"]
}
}
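For context, akka.pattern.AskTimeoutException is what Akka's ask pattern produces when the actor being asked never replies within the ask timeout (20000 ms here). Below is a minimal sketch of that mechanism; it is not taken from spark-jobserver, the actor and message names (SilentActor, "start-job") are made up, and it assumes an Akka 2.4+ dependency on the classpath.

import akka.actor.{Actor, ActorSystem, Props}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._

// Hypothetical actor that never replies, standing in for a context
// supervisor that is stuck or being restarted.
class SilentActor extends Actor {
  def receive = { case _ => () } // swallow the message, send no reply
}

object AskTimeoutDemo extends App {
  val system = ActorSystem("Demo")
  val supervisor = system.actorOf(Props[SilentActor], "silent")

  // Same order of magnitude as the 20000 ms in the error above.
  implicit val timeout: Timeout = Timeout(20.seconds)

  // ask (?) returns a Future; if no reply arrives within the timeout,
  // the Future fails with akka.pattern.AskTimeoutException.
  val reply = supervisor ? "start-job"
  try Await.result(reply, 25.seconds)
  catch { case e: akka.pattern.AskTimeoutException => println(e.getMessage) }
  finally system.terminate()
}

The error returned by the job server's REST API has this same shape: the web layer asks the context supervisor to start the job and gives up after 20 seconds when no answer comes back.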

I am using the velvia/spark-jobserver:0.6.2.mesos-0.28.1.spark-1.6.1 image, and I have added my jobserver.conf at the bottom.

Below is the stack trace.

ERROR .jobserver.JobManagerActor [] [] - About to restart actor due to exception:
java.util.concurrent.TimeoutException: Futures timed out after [3 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at akka.dispatch.MonitorableThreadFactory$AkkaForkJoinWorkerThread$$anon$3.block(ThreadPoolBuilder.scala:169)
at scala.concurrent.forkjoin.ForkJoinPool.managedBlock(ForkJoinPool.java:3640)
at akka.dispatch.MonitorableThreadFactory$AkkaForkJoinWorkerThread.blockOn(ThreadPoolBuilder.scala:167)
at scala.concurrent.Await$.result(package.scala:107)
at spark.jobserver.JobManagerActor$$anonfun$startJobInternal$1.apply$mcV$sp(JobManagerActor.scala:200)
at scala.util.control.Breaks.breakable(Breaks.scala:37)
at spark.jobserver.JobManagerActor.startJobInternal(JobManagerActor.scala:192)
at spark.jobserver.JobManagerActor$$anonfun$wrappedReceive$1.applyOrElse(JobManagerActor.scala:144)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at ooyala.common.akka.ActorStack$$anonfun$receive$1.applyOrElse(ActorStack.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at ooyala.common.akka.Slf4jLogging$$anonfun$receive$1$$anonfun$applyOrElse$1.apply$mcV$sp(Slf4jLogging.scala:26)
at ooyala.common.akka.Slf4jLogging$class.ooyala$common$akka$Slf4jLogging$$withAkkaSourceLogging(Slf4jLogging.scala:35)
at ooyala.common.akka.Slf4jLogging$$anonfun$receive$1.applyOrElse(Slf4jLogging.scala:25)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at ooyala.common.akka.ActorMetrics$$anonfun$receive$1.applyOrElse(ActorMetrics.scala:24)
at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
at ooyala.common.akka.InstrumentedActor.aroundReceive(InstrumentedActor.scala:8)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Jobserver.conf

# Template for Spark Job Server Docker config
# You can easily override the spark master through SPARK_MASTER env variable

# Spark Cluster / Job Server configuration
spark {
  master = "local[4]"
  master = ${?SPARK_MASTER}

  # Default # of CPUs for jobs to use for Spark standalone cluster
  job-number-cpus = 4

  jobserver {
    port = 8090
    jobdao = spark.jobserver.io.JobSqlDAO

    context-per-jvm = true
    context-init-timeout = 90s

    sqldao {
      # Directory where default H2 driver stores its data. Only needed for H2.
      rootdir = /database

      # Full JDBC URL / init string.  Sorry, needs to match above.
      # Substitutions may be used to launch job-server, but leave it out here in the default or tests won't pass
      jdbc.url = "jdbc:h2:file:/database/h2-db"
    }
  }

  # predefined Spark contexts
  contexts {
    my-low-latency-context {
      num-cpu-cores = 1       # Number of cores to allocate. Required.
      memory-per-node = 512m  # Executor memory per node, -Xmx style eg 512m, 1G, etc.
    }
    # define additional contexts here
  }

  # universal context configuration. These settings can be overridden, see README.md
  context-settings {
    num-cpu-cores = 2        # Number of cores to allocate. Required.
    memory-per-node = 1024m  # Executor memory per node, -Xmx style eg 512m, #1G, etc.

    # in case spark distribution should be accessed from HDFS (as opposed to being installed on every mesos slave)
    # spark.executor.uri = "hdfs://namenode:8020/apps/spark/spark.tgz"

    # uris of jars to be loaded into the classpath for this context. Uris is a string list, or a string separated by commas ','
    # dependent-jar-uris = ["file:///some/path/present/in/each/mesos/slave/somepackage.jar"]

    # If you wish to pass any settings directly to the sparkConf as-is, add them here in passthrough,
    # such as hadoop connection settings that don't use the "spark." prefix
    passthrough {
      #es.nodes = "192.1.1.1"
    }
  }

  # This needs to match SPARK_HOME for cluster SparkContexts to be created successfully
  home = "/usr/local/spark"
}

akka {
  remote.netty.tcp {
    # This controls the maximum message size, including job results, that can be sent
    maximum-frame-size = 30 MiB
  }
}

spray.can.server {
  # uncomment the next line for making this an HTTPS example
  # ssl-encryption = on

  idle-timeout = 210 s
  request-timeout = 200 s
  pipelining-limit = 2  # for maximum performance (prevents StopReading / ResumeReading messages to the IOBridge)

  # Needed for HTTP/1.0 requests with missing Host headers
  default-host-header = "spray.io:8765"
  parsing.max-content-length = 400m
}

client {
  # The time period within which the TCP connecting process must be completed.
  # Set to infinite to disable.
  connecting-timeout = 10s
}

CalebFenton commented

The main error is this:

java.util.concurrent.TimeoutException: Futures timed out after [3 seconds]

Try adjusting the timeout from 3 seconds to something higher here: https://github.com/spark-jobserver/spark-jobserver/blob/master/job-server/src/spark.jobserver/JobManagerActor.scala#L217

Unfortunately, it doesn't appear to be configurable anywhere.
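For reference, the line that link points at is a blocking Await.result with a hard-coded timeout inside JobManagerActor.startJobInternal (it is the Await.result frame at JobManagerActor.scala:200 in the stack trace above). The following is only a hedged sketch of the kind of change being suggested; awaitJobPrerequisite and daoLookup are placeholder names, not the actual code, and 30 seconds is an arbitrary larger value.

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

// Placeholder for the blocking wait inside JobManagerActor.startJobInternal.
// `daoLookup` stands in for whatever Future the real code awaits while
// validating the job request; both names are made up for this sketch.
def awaitJobPrerequisite[T](daoLookup: Future[T]): T = {
  // Behaviour reported in the stack trace: a hard-coded 3-second wait that
  // throws java.util.concurrent.TimeoutException when the lookup is slow.
  //   Await.result(daoLookup, 3.seconds)

  // Suggested change: raise the timeout (the value is a judgment call;
  // it is not exposed through jobserver.conf in 0.6.2).
  Await.result(daoLookup, 30.seconds)
}

If that inner wait keeps timing out, the JobManagerActor is restarted (the "About to restart actor due to exception" line in the log), so it never answers the web API's ask, which appears to be why the client sees the 20000 ms AskTimeoutException.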
