Docs - Multiple Classes in Jar, Custom Encoder, Package Class, Resubmit Conf, Debug and Absolute name of artifact & function. #468

gowravshekar opened this issue Mar 27, 2018 · 12 comments

  • How to add multiple functions from the same jar.
    The packaged jar has multiple classes. How do I write a conf for multiple functions pointing to the respective classes?
  • How to write a custom encoder for a case class that is used in MistFn[CaseClass].
// Sample case class
case class CorrelationMatrix(headers: Array[String], values:Array[Array[Double]])
object CorrelationMatrix extends MistFn[CorrelationMatrix] {
    ...
}
  • How to add a class which has a package in its class-name.
    The class lives in a package such as io.hydrosphere. Adding class-name = "io.hydrosphere.CorrelationMatrix$" doesn't work.
  • How to re-submit the function after code changes.
    After submitting the conf there are a few more code changes, and submitting the conf again gives Error: Artifact key xxx.jar has to be unique. How do we overwrite the artifact without manually deleting data/artifacts/xxx.jar and data/functions/yyy.conf?
  • How to debug the Spark job code.
  • How to prevent the current user name from being prefixed to the artifact and function names.
@dos65 dos65 added the docs label Mar 27, 2018

dos65 commented Mar 29, 2018

Thanks for the questions; they will help us improve our documentation. For a start, I'll try to answer here:

  • Multiple functions
    If your question was about mist-cli configuration, then you just need to create a conf file that points to the class-name of each function that you want to deploy. For example, for two functions A and B there should be two files:

    • a.conf
      model = Function
      name = a
      data {
         path = my_jar_0.0.1.jar
         class-name = "A$"
         context = default
      }
      
    • b.conf:
      model = Function
      name = b
      data {
       path = my_jar_0.0.1.jar
       class-name = "B$"
       context = default
      }
      
  • Custom encoders
    We are going to add encoder derivation for case classes in future releases, so currently there is no other way except to write one manually:

    import mist.api._
    import mist.api.Encoder
    import mist.api.data._
    
    case class MyResponse(x: Int, y: String)
    object MyResponse {
        implicit val myResponseEncoder: Encoder[MyResponse] = new Encoder[MyResponse] {
             override def apply(rsp: MyResponse): JsLikeData = {
                 JsLikeMap("x" -> JsLikeNumber(rsp.x), "y" -> JsLikeString(rsp.y))
             }
        }
    }
    
    object MyFn extends MistFn[MyResponse] { 
      ..
    } 
  • Package - I can't reproduce that problem. Are you sure that the package you specified is correct and exists in the jar?

  • Updating artifact:
    You can use mist-cli apply -f conf --validate true. But keep in mind that this action can affect in-progress functions. Also, there is an issue about artifact refreshing on workers (Refresh shared worker after artifact update #437), so if you use the shared context type you need to stop the worker manually to apply changes, or use exclusive.

  • Every job has logs - you can use them for debugging. In RC14 we improved them - now mist collects logs from Spark too. There is also the withMistExtras directive to obtain a logger inside the function body (see the sketch after this list).

  • Passing an empty -u argument should work: mist-cli apply -f conf -u ''. I think we should reconsider the default behavior of building names in mist-cli. @blvp, do you have any thoughts?
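
For reference, here is a rough sketch of how withMistExtras can be used to get a logger inside a function body. It follows the MistFn / withArgs / onSparkContext pattern from the 1.0.0-RC docs; the object and argument names are made up for illustration, and the exact imports and MistExtras fields may differ between RC versions, so treat it as a sketch rather than a verified example:

    import mist.api._
    import org.apache.spark.SparkContext

    object HelloMist extends MistFn[Array[Int]] {
      override def handle = {
        withArgs(arg[Int]("samples"))
          .withMistExtras
          .onSparkContext((n: Int, extras: MistExtras, sc: SparkContext) => {
            // extras carries job metadata and a logger that writes into the job's logs
            import extras._
            logger.info(s"Hello from job $jobId, samples = $n")
            sc.parallelize(0 until n).map(_ * 2).collect()
          })
      }
    }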


dos65 commented Mar 29, 2018

Also, we have a Gitter room for questions.

@gowravshekar

@dos65, Thank you for the explanation. Really appreciate your time and consideration.

The class with a package works now. The artifact wasn't refreshed when I added the package to the class name; after restarting mist-master it worked.

I was not able to get the artifact update to work. I'm using mist-1.0.0-RC13.

If I run mist-cli apply -f conf --validate true -u '', I get the error: Artifact key xxx.jar has to be unique.

If I run mist-cli apply -f conf/correlation-matrix.conf --validate true -u '', I get the error: Error: 400 Client Error: Bad Request for url: http://localhost:2004/v2/api/functions?force=False: class java.lang.IllegalStateException: Endpoint correlation-matrix already exists

With respect to debugging, I'm looking for a way to put a breakpoint in the code and debug, similar to this.


blvp commented Mar 30, 2018

The last error with the function update was fixed in a new version of mist-cli; try to update it with the following command: pip install mist-cli --upgrade

@gowravshekar

After upgrading:

mist-cli apply -f conf/correlation-matrix.conf --validate true -u '' - works.

mist-cli apply -f conf --validate true -u '' - still gives the same error message: Artifact key xxx.jar has to be unique


dos65 commented Apr 2, 2018

@gowravshekar
About debugging - unfortunately, there is a bug with constructing the spark-submit command (#472), so currently it's impossible to pass driver-java-options correctly. If you really need it, you can implement a manual runner and add the following argument to spark-submit: --driver-java-options '-Xdebug -Xrunjdwp:transport=dt_socket,address=15000,server=y,suspend=y'


blvp commented Apr 2, 2018

mist-cli apply -f conf --validate true -u '' - still gives the same error message: Artifact key xxx.jar has to be unique

This is normal behavior, because updating that jar could break all functions that use it.
If you want to update the jar with validation enabled, you should change the version config value of the artifact and then change the path in the function. The reasoning behind this is that apply is used for both development and release, and this limitation reflects our vision of the release process.

Some additional notes.

You can use environment variables to manage the artifact version. For example:
artifact.conf

model = Artifact
name = test-artifact
version = ${ARTIFACT_VERSION}
data.file-path = "./path/to/artifact.jar"

function.conf

model = Function
data {
    ...
    path = test-artifact_${ARTIFACT_VERSION}.jar
    ...
}

and then ARTIFACT_VERSION=0.0.1 mist-cli apply -f conf/


dos65 commented Apr 2, 2018

Oh, my mistake - use --validate false instead of --validate true for an unsafe update.
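
Combining the flags mentioned earlier in the thread, the unsafe bulk update then becomes:

    mist-cli apply -f conf --validate false -u ''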

@apoorv22

@gowravshekar
About debugging - unfortunately, there is a bug with constructing the spark-submit command (#472), so currently it's impossible to pass driver-java-options correctly. If you really need it, you can implement a manual runner and add the following argument to spark-submit: --driver-java-options '-Xdebug -Xrunjdwp:transport=dt_socket,address=15000,server=y,suspend=y'

Does this bug still exist? Is there a way now to debug the spark job?


dos65 commented Dec 3, 2018

@apoorv22 this one is fixed, so you can use these options to debug a Spark job.
Also, you need to be aware of the following things:

  • your context should have precreated=true and maxParallelJobs=1 settings (a context conf sketch follows after this list). Otherwise several workers may be started, and it will be problematic to connect a debugger to the desired process.
  • breakpoints should suspend the current thread only, not the whole VM. When mist-master loses heartbeats from a worker process it marks it as failed. For example, by default IntelliJ sets breakpoints that suspend the VM fully.
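
For illustration, a context conf along those lines might look like the following. The key names mirror Mist's context settings (precreated, max-parallel-jobs, run-options) and the run-options value reuses the debug flag suggested earlier in this thread, but the context name is made up and the exact keys should be verified against the docs for your version:

    model = Context
    name = debug-ctx
    data {
      precreated = true
      max-parallel-jobs = 1
      run-options = "--driver-java-options '-Xdebug -Xrunjdwp:transport=dt_socket,address=15000,server=y,suspend=y'"
    }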

@gowravshekar

@blvp, Is there a way to use an environment variable or a config value in data.file-path in artifact.conf?

Something similar to the below:
data.file-path = "./path/to/artifact_${ARTIFACT_VERSION}.jar"


blvp commented Dec 5, 2018

Yes, you can use an environment variable here in a similar manner:
data.file-path="simple-name"${VERSION}".jar"
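
Putting that together with the Artifact model from the earlier example (the name, path, and version values here are purely illustrative), an artifact.conf and the corresponding apply call could look like:

    model = Artifact
    name = simple-name
    version = ${VERSION}
    data.file-path = "./target/simple-name-"${VERSION}".jar"

    VERSION=0.0.2 mist-cli apply -f conf/

HOCON concatenates the adjacent quoted strings and the substitution into a single path value.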
