This repository has been archived by the owner on Sep 11, 2021. It is now read-only.

Exception in thread "main" java.lang.StackOverflowError #8

Open
valera7979 opened this issue Feb 15, 2018 · 4 comments

Comments

@valera7979

I got a StackOverflowError when testing on such data, although I tested on 400K points (not 1M as in the link).

The parameters were:

// PatchWork parameters
val epsilon = Array(30.1, 30.1)
val minPts = 1
val minCellInCluster = 10
val ratio = 0.0

@Mignastor
Contributor

Hello,

I had no problem reading your data like that:
val dataRDD = sc.textFile("9_1M.csv").map(_.split(",")).map(s => Array(s(0).toDouble, s(1).toDouble)).cache

Do you have an RDD[Array[Double]]?

@valera7979
Author

valera7979 commented Feb 19, 2018

Yes. I changed almost nothing in the code.
I only added .setMaster("local[4]") to the SparkContext configuration.

Here is RDD:
// Reading and parsing Data
val dataRDD: RDD[Array[Double]] = sc.textFile("datasets/9_1M.csv")
  .map(_.split(",")).map(s => Array(s(0).toDouble, s(1).toDouble)).cache

Maybe you are using different code. Download the latest version from GitHub and try to run it.

Full error list:
Exception in thread "main" java.lang.StackOverflowError
    at scala.collection.SeqLike$class.size(SeqLike.scala:106)
    at scala.collection.AbstractSeq.size(Seq.scala:40)
    at scala.collection.mutable.Builder$class.sizeHint(Builder.scala:69)
    at scala.collection.mutable.ArrayBuffer.sizeHint(ArrayBuffer.scala:47)
    at scala.collection.TraversableLike$class.builder$1(TraversableLike.scala:240)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:243)
    at scala.collection.AbstractTraversable.map(Traversable.scala:105)
    at scala.Array$.concat(Array.scala:243)
    at ca.crim.spark.mllib.clustering.PatchWork$$anonfun$getNearCell$2.apply(PatchWork.scala:164)
    at ca.crim.spark.mllib.clustering.PatchWork$$anonfun$getNearCell$2.apply(PatchWork.scala:164)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.AbstractTraversable.map(Traversable.scala:105)
    at ca.crim.spark.mllib.clustering.PatchWork.getNearCell(PatchWork.scala:164)
    at ca.crim.spark.mllib.clustering.PatchWork.innerCells(PatchWork.scala:189)
    at ca.crim.spark.mllib.clustering.PatchWork.expandCluster(PatchWork.scala:205)
    at ca.crim.spark.mllib.clustering.PatchWork$$anonfun$expandCluster$1.apply(PatchWork.scala:220)
    at ca.crim.spark.mllib.clustering.PatchWork$$anonfun$expandCluster$1.apply(PatchWork.scala:205)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at ca.crim.spark.mllib.clustering.PatchWork.expandCluster(PatchWork.scala:205)
    at ca.crim.spark.mllib.clustering.PatchWork$$anonfun$expandCluster$1.apply(PatchWork.scala:220)
    at ca.crim.spark.mllib.clustering.PatchWork$$anonfun$expandCluster$1.apply(PatchWork.scala:205)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at ca.crim.spark.mllib.clustering.PatchWork.expandCluster(PatchWork.scala:205)
    at ca.crim.spark.mllib.clustering.PatchWork$$anonfun$expandCluster$1.apply(PatchWork.scala:220)
    at ca.crim.spark.mllib.clustering.PatchWork$$anonfun$expandCluster$1.apply(PatchWork.scala:205)

// Repeats many times:
    at scala.collection.immutable.List.foreach(List.scala:318)
    at ca.crim.spark.mllib.clustering.PatchWork.expandCluster(PatchWork.scala:205)
    at ca.crim.spark.mllib.clustering.PatchWork$$anonfun$expandCluster$1.apply(PatchWork.scala:220)
    at ca.crim.spark.mllib.clustering.PatchWork$$anonfun$expandCluster$1.apply(PatchWork.scala:205)
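
The repeating frames suggest that expandCluster recurses once per cell it expands into, so the depth grows with cluster size until the "main" thread's stack runs out. A possible stopgap, which is not something the PatchWork code provides, is to run the clustering call on a thread with a larger stack (or to raise the JVM's -Xss setting). The sketch below is only an illustration: BigStackRunner, runWithStack and depth are hypothetical names, and the stack size passed to the Thread constructor is a hint that the JVM may ignore.

```scala
object BigStackRunner {
  // Hypothetical helper: evaluates `body` on a dedicated thread that requests
  // `stackSizeBytes` of stack. The JVM treats this value as a hint only.
  def runWithStack[A](stackSizeBytes: Long)(body: => A): A = {
    var result: Either[Throwable, A] = Left(new IllegalStateException("body did not run"))
    val thread = new Thread(null, new Runnable {
      def run(): Unit = {
        // Capture either the value or the failure so it can be rethrown by the caller.
        result = try Right(body) catch { case t: Throwable => Left(t) }
      }
    }, "big-stack-runner", stackSizeBytes)
    thread.start()
    thread.join() // join() makes the write to `result` visible to this thread
    result match {
      case Right(value) => value
      case Left(error)  => throw error
    }
  }

  // Stand-in for a deeply recursive call such as the one in the trace;
  // deliberately not tail-recursive.
  def depth(n: Int): Int = if (n == 0) 0 else 1 + depth(n - 1)
}
```

Under these assumptions the clustering call would be wrapped as BigStackRunner.runWithStack(512L * 1024 * 1024) { ... }. This only raises the ceiling; a large enough cluster would still overflow.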

@valera7979
Author

image

@lccmpn

lccmpn commented Mar 29, 2018

I got the same exception in the same recursive function with a dataset of 60 million 2D points, using the following parameters:

  • eps=20
  • minPoints=20
  • minCellInCluster=0
  • ratio=0

Have you resolved it in some way? @Mignastor, in your opinion, could tail recursion resolve the problem?
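
For context on the tail recursion question: in the stack trace above, the recursive call to expandCluster is made from inside a List.foreach closure, so it is not in tail position and @tailrec alone would probably not apply. A common alternative is to replace the recursion with an explicit worklist kept on the heap. The following is a minimal sketch of that idea on a 2D grid, not the actual PatchWork implementation: Cell, neighbours and expandClusterIterative are hypothetical names, and the 8-cell neighbourhood is an assumption made for illustration.

```scala
import scala.collection.mutable

object IterativeExpansion {
  // Hypothetical cell identifier: integer grid coordinates in 2D.
  type Cell = (Int, Int)

  // Assumed neighbourhood: the 8 cells surrounding `c`.
  def neighbours(c: Cell): Seq[Cell] =
    for {
      dx <- Seq(-1, 0, 1)
      dy <- Seq(-1, 0, 1)
      if dx != 0 || dy != 0
    } yield (c._1 + dx, c._2 + dy)

  // Collects every occupied cell reachable from `seed`, using an explicit
  // worklist instead of the JVM call stack, so depth is bounded by heap size.
  def expandClusterIterative(seed: Cell, occupied: Set[Cell]): Set[Cell] = {
    val visited = mutable.Set.empty[Cell]
    val pending = mutable.ArrayBuffer[Cell](seed)
    while (pending.nonEmpty) {
      val cell = pending.remove(pending.length - 1)
      // `add` returns true only the first time a cell is seen.
      if (occupied.contains(cell) && visited.add(cell)) {
        pending ++= neighbours(cell).filterNot(visited.contains)
      }
    }
    visited.toSet
  }
}
```

With this shape, a cluster spanning hundreds of thousands of cells is traversed in a loop rather than through nested calls, which is the property the recursive version lacks.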


3 participants