Merge pull request #38 from newzly/maven
Feature/maven_central: Re-added Maven Central publishing. Improved Readm...
alexflav23 committed Mar 8, 2014
2 parents 64e2e4d + 35824e4 commit ebcd0b9
Showing 5 changed files with 291 additions and 46 deletions.
266 changes: 230 additions & 36 deletions README.md
Using phantom
=============

The current version is: ```val phantomVersion = 0.1.5-SNAPSHOT```.
Phantom is published to Maven Central and is actively developed.


Integrating phantom in your project
===================================

For most things, all you need is ```phantom-dsl```. Read through for information on other modules.

```scala
libraryDependencies ++= Seq(
  "com.newzly" %% "phantom-dsl" % phantomVersion
)
```

The full list of available modules is:

```scala
libraryDependencies ++= Seq(
  "com.newzly" %% "phantom-dsl" % phantomVersion,
  "com.newzly" %% "phantom-cassandra-unit" % phantomVersion,
  "com.newzly" %% "phantom-example" % phantomVersion,
  "com.newzly" %% "phantom-thrift" % phantomVersion,
  "com.newzly" %% "phantom-test" % phantomVersion,
  "com.newzly" %% "phantom-finagle" % phantomVersion
)
```

Basic models
======================

```scala
case class ExampleModel(
  id: Int,
  name: String,
  props: Map[String, String],
  timestamp: Int,
  test: Option[Int]
)
```

Data modeling with phantom
==========================


```scala

sealed class ExampleRecord private() extends CassandraTable[ExampleRecord, ExampleModel] {
object id extends UUIDColumn(this) with PartitionKey[UUID]
object timestamp extends DateTimeColumn(this) with ClusteringOrder with Ascending
object name extends StringColumn(this)
object props extends MapColumn[ExampleRecord, ExampleModel, String, String](this)
object test extends OptionalIntColumn(this)

override def fromRow(row: Row): ExampleModel = {
ExampleModel(id(row), name(row), props(row), timestamp(row), test(row));
}
}


```

Querying with Phantom
=====================

Phantom works with both Scala Futures and Twitter Futures. For the Twitter flavour, use the ```phantom-finagle``` module.

```scala
object ExampleRecord extends ExampleRecord {
override val tableName = "examplerecord"

// now define a session, a normal Datastax cluster connection
implicit val session = SomeCassandraClient.session

def getRecordsByName(name: String): Future[Seq[ExampleModel]] = {
ExampleRecord.select.where(_.name eqs name).fetch
}

def getOneRecordByName(name: String, someId: UUID): Future[Option[ExampleModel]] = {
ExampleRecord.select.where(_.name eqs name).and(_.id eqs someId).one()
}

// preserving order in Cassandra is not the simplest thing, but:
def getRecordPage(start: Int, limit: Int): Future[Seq[ExampleModel]] = {
ExampleRecord.select.skip(start).limit(limit).fetch
}
```

Partial selects
===============

All partial select queries will return Tuples and are therefore limited to 22 fields.
This will change in Scala 2.11 and phantom will be updated once cross version compilation is enabled.

```scala
def getNameById(id: UUID): Future[Option[String]] = {
ExampleRecord.select(_.name).where(_.id eqs id).one()
}

def getNameAndPropsById(id: UUID): Future[Option[(String, Map[String, String])]] = {
ExampleRecord.select(_.name, _.props).where(_.id eqs id).one()
}
```
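Under the hood, a partial select simply restricts the projected columns. As a plain-Scala sketch (not phantom internals; table and column names are illustrative), the generated CQL has this shape:

```scala
// Illustrative only: the shape of CQL a partial select compiles down to.
def selectColumns(table: String, cols: Seq[String]): String =
  s"SELECT ${cols.mkString(", ")} FROM $table"

// e.g. selecting name and props, as in getNameAndPropsById above:
selectColumns("examplerecord", Seq("name", "props"))
// "SELECT name, props FROM examplerecord"
```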

Collection operators
====================

phantom supports CQL 3 modify operations for CQL 3 collections: ```list, set, map```.

It works as you would expect it to:

List operators: ```prepend, prependAll, append, appendAll, remove, removeAll```

```scala

ExampleRecord.update.where(_.id eqs someId).modify(_.someList prepend someItem).future()
ExampleRecord.update.where(_.id eqs someId).modify(_.someList prependAll someItems).future()

ExampleRecord.update.where(_.id eqs someId).modify(_.someList append someItem).future()
ExampleRecord.update.where(_.id eqs someId).modify(_.someList appendAll someItems).future()

ExampleRecord.update.where(_.id eqs someId).modify(_.someList remove someItem).future()
ExampleRecord.update.where(_.id eqs someId).modify(_.someList removeAll someItems).future()

```

Set operators: ```append, appendAll, remove, removeAll```
Map operators: ```put, putAll```

For working examples, see [ListOperatorsTest.scala](https://github.com/newzly/phantom/blob/develop/phantom-test/src/test/scala/com/newzly/phantom/dsl/crud/ListOperatorsTest.scala) and [MapOperationsTest.scala](https://github.com/newzly/phantom/blob/develop/phantom-test/src/test/scala/com/newzly/phantom/dsl/crud/MapOperationsTest.scala).
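To picture what these operators do on the wire, here is a toy sketch (not phantom's implementation; the function and its names are hypothetical) of the CQL 3 update expressions that the list operators map to:

```scala
// Toy sketch of the CQL 3 expressions behind the list operators.
// In CQL 3: prepend is "col = [items] + col", append is "col = col + [items]",
// remove is "col = col - [items]".
def listUpdateCql(table: String, col: String, op: String, items: Seq[String]): String = {
  val lit = items.mkString("[", ", ", "]")
  val expr = op match {
    case "prepend" => s"$lit + $col"
    case "append"  => s"$col + $lit"
    case "remove"  => s"$col - $lit"
  }
  s"UPDATE $table SET $col = $expr WHERE id = ?"
}
```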


Automated schema generation
===========================

Replication strategies and more advanced features are not yet available in phantom, but CQL 3 Table schemas are automatically generated from the Scala code. To create a schema in Cassandra from a table definition:

```scala

import scala.concurrent.Await
import scala.concurrent.duration._

Await.result(ExampleRecord.create().future(), 5000 millis)
```

Of course, you don't have to block unless you want to.
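The non-blocking style looks like this; a stand-in future replaces ```ExampleRecord.create().future()``` (an assumption, so the snippet runs without a cluster):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

// Stand-in for ExampleRecord.create().future(); no cluster required.
def createSchema(): Future[Unit] = Future.successful(())

// Blocking, as above:
Await.result(createSchema(), 5.seconds)

// Non-blocking: chain the next step instead of waiting.
val ready: Future[String] = createSchema().map(_ => "schema ready")
```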


Partition tokens, token functions and paginated queries
======================================================

```scala

import scala.concurrent.Await
import scala.concurrent.duration._
import com.newzly.phantom.Implicits._

sealed class ExampleRecord2 private() extends CassandraTable[ExampleRecord2, ExampleModel] with LongOrderKey[ExampleRecord2, ExampleModel] {

object id extends UUIDColumn(this) with PartitionKey[UUID]
object timestamp extends DateTimeColumn(this)
object name extends StringColumn(this)
object props extends MapColumn[ExampleRecord2, ExampleModel, String, String](this)
object test extends OptionalIntColumn(this)

override def fromRow(row: Row): ExampleModel = {
ExampleModel(id(row), name(row), props(row), timestamp(row), test(row));
}
}


// "one" is a previously fetched record
val orderedResult = Await.result(ExampleRecord2.select.where(_.id gtToken one.get.id).fetch, 5000 millis)

```
For more details on how to use Cassandra partition tokens, see [SkipRecordsByToken.scala]( https://github.com/newzly/phantom/blob/develop/phantom-test/src/test/scala/com/newzly/phantom/dsl/SkipRecordsByToken.scala)
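The idea behind token paging can be modelled in plain Scala (a toy hash stands in for Cassandra's Murmur3 partitioner; this is not phantom code): partitions are ordered by ```token(id)```, so the next page means "rows whose token is greater than the last token seen".

```scala
// Toy model of token-based paging. Cassandra orders partitions by token(id),
// not by id, so paging means filtering on the token of the last row seen.
def token(id: String): Long = id.hashCode.toLong

def nextPage(ids: Seq[String], afterToken: Long, limit: Int): Seq[String] =
  ids.sortBy(token).dropWhile(id => token(id) <= afterToken).take(limit)
```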


Cassandra Time Series
=====================

phantom supports Cassandra Time Series with both ```java.util.Date``` and ```org.joda.time.DateTime```. To use them, simply mix in ```com.newzly.phantom.keys.ClusteringOrder``` and either ```Ascending``` or ```Descending```.

Restrictions are enforced at compile time.

```scala

import com.newzly.phantom.Implicits._

sealed class ExampleRecord3 private() extends CassandraTable[ExampleRecord3, ExampleModel] with LongOrderKey[ExampleRecord3, ExampleModel] {

object id extends UUIDColumn(this) with PartitionKey[UUID]
object timestamp extends DateTimeColumn(this) with ClusteringOrder with Ascending
object name extends StringColumn(this)
object props extends MapColumn[ExampleRecord3, ExampleModel, String, String](this)
object test extends OptionalIntColumn(this)

override def fromRow(row: Row): ExampleModel = {
ExampleModel(id(row), name(row), props(row), timestamp(row), test(row));
}
}
```

Automatic schema generation can do all the setup for you.


Composite keys
==============
Phantom also supports composite keys out of the box. The schema can once again be auto-generated.

A table can have only one ```PartitionKey``` but several ```PrimaryKey``` definitions. Phantom will use these keys to build a composite value. Example scenario, with the composite key ```(id, timestamp, name)```:

```scala

import org.joda.time.DateTime
import com.newzly.phantom.Implicits._

sealed class ExampleRecord3 private() extends CassandraTable[ExampleRecord3, ExampleModel] with LongOrderKey[ExampleRecord3, ExampleModel] {

object id extends UUIDColumn(this) with PartitionKey[UUID]
object timestamp extends DateTimeColumn(this) with PrimaryKey[DateTime]
object name extends StringColumn(this) with PrimaryKey[String]
object props extends MapColumn[ExampleRecord3, ExampleModel, String, String](this)
object test extends OptionalIntColumn(this)

override def fromRow(row: Row): ExampleModel = {
ExampleModel(id(row), name(row), props(row), timestamp(row), test(row));
}
}
```
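For intuition, the ```PRIMARY KEY``` clause that schema generation would emit for the ```(id, timestamp, name)``` key above can be sketched like this (illustrative; this is not phantom's generator):

```scala
// Sketch of a CQL 3 composite PRIMARY KEY clause: the partition key first,
// then the remaining primary key columns in definition order.
def primaryKeyClause(partition: String, clustering: Seq[String]): String =
  (partition +: clustering).mkString("PRIMARY KEY (", ", ", ")")
```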

CQL 3 index and non-primary index columns
=========================================

When you want to use a column in a ```where``` clause, you need an index on it. Cassandra data modelling is outside the scope of this readme, but phantom offers ```com.newzly.phantom.keys.SecondaryKey``` to enable querying.

The CQL 3 schema for secondary indexes can also be auto-generated with ```ExampleRecord4.create()```.

```scala

import org.joda.time.DateTime
import com.newzly.phantom.Implicits._

sealed class ExampleRecord4 private() extends CassandraTable[ExampleRecord4, ExampleModel] with LongOrderKey[ExampleRecord4, ExampleModel] {

object id extends UUIDColumn(this) with PartitionKey[UUID]
object timestamp extends DateTimeColumn(this) with SecondaryKey[DateTime]
object name extends StringColumn(this) with SecondaryKey[String]
object props extends MapColumn[ExampleRecord4, ExampleModel, String, String](this)
object test extends OptionalIntColumn(this)

override def fromRow(row: Row): ExampleModel = {
ExampleModel(id(row), name(row), props(row), timestamp(row), test(row));
}
}
```


Asynchronous iterators for large record sets
============================================

Phantom comes with asynchronous lazy iterators over CQL rows to help you deal with billions of records. phantom iterators are based on Play iteratees, with a very lightweight integration.

The functionality is identical with respect to asynchronous, lazy behaviour and available methods.
For more on this, see this [Play tutorial](
http://mandubian.com/2012/08/27/understanding-play2-iteratees-for-normal-humans/)


Usage is trivial:

```scala
import scala.concurrent.Await
import scala.concurrent.duration._
import com.newzly.phantom.Implicits._

// stream through every record lazily
ExampleRecord.select.fetchEnumerator.foreach {
  item => println(item.toString)
}

// enumerators also combine with token-based paging
val enumerator = Await.result(ExampleRecord.select.where(_.timestamp gtToken someTimestamp).fetchEnumerator(), 5000 millis)
```

Batch statements
================

phantom also brings in support for batch statements. To use them, see [IterateeBigTest.scala](https://github.com/newzly/phantom/blob/develop/phantom-test/src/test/scala/com/newzly/phantom/iteratee/IterateeBigTest.scala)

We have tested with 10,000 statements per batch, and 1,000 batches processed simultaneously. Before you run the test, beware that it takes ~40 minutes.

Batches use lazy iterators and daisy-chain them to offer thread-safe behaviour. They are not memory intensive and you can expect consistent processing speed even with 1,000,000 statements per batch.
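The chunking itself is simple to picture: a lazy iterator grouped into fixed-size slices, so only the current batch is materialised at a time (a sketch with illustrative sizes, not phantom's API):

```scala
// Group a lazy stream of statements into fixed-size batches; only the
// current batch is held in memory while it is being executed.
def toBatches[A](statements: Iterator[A], batchSize: Int): Iterator[Seq[A]] =
  statements.grouped(batchSize).map(_.toSeq)

val batches = toBatches(Iterator.range(0, 100000), 10000)
```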


Thrift integration
==================

We use Apache Thrift extensively for our backend services. ```phantom``` is very easy to integrate with Thrift models and uses ```Twitter Scrooge``` to compile them. Thrift integration is optional and available via ```"com.newzly" %% "phantom-thrift" % phantomVersion```.

```thrift
namespace java com.newzly.phantom.sample.ExampleModel
struct ExampleModel {
1: required i32 id,
2: required string name,
3: required Map<string, string> props,
4: required i32 timestamp,
5: optional i32 test
}
```

Maintainers
===========
Special thanks to Viktor Taranenko from WhiskLabs, who gave us the original idea.

Copyright
=========
Copyright 2013 WhiskLabs, Copyright 2013 - 2014 newzly.


Contributions
=============
Contributions are most welcome!

To contribute, simply submit a "Pull request" via GitHub.

We use GitFlow as a branching model and SemVer for versioning.

The second changed file updates the ```ExecutableQuery[T <: CassandraTable[T, _], R]``` trait, giving ```fetchEnumerator```, ```one``` and ```fetch``` explicit empty parameter lists:

```scala
/**
 * @param ctx The Execution Context.
 * @return
 */
def fetchEnumerator()(implicit session: Session, ctx: scala.concurrent.ExecutionContext): ScalaFuture[PlayEnumerator[R]] = {
  future() map {
    resultSet => {
      Enumerator.enumerator(resultSet) through Enumeratee.map(r => this.fromRow(r))
    }
  }
}

/**
 * @param ctx The Execution Context.
 * @return
 */
def one()(implicit session: Session, ctx: scala.concurrent.ExecutionContext): ScalaFuture[Option[R]]

/**
 * Returns a parsed sequence of [R]ows
 * @param ctx The Execution Context.
 * @return
 */
def fetch()(implicit session: Session, ctx: scala.concurrent.ExecutionContext): ScalaFuture[Seq[R]] = {
  fetchEnumerator flatMap(_ run Iteratee.collect())
}
```