SnappyData OSS 1.0.1 Release

@ashetkar ashetkar released this 14 Feb 18:51

The SnappyData team is pleased to announce the availability of version 1.0.1 of the platform. You can find the release artifacts of its Community Edition towards the end of this page.

You can also download the Enterprise Edition here. The table below summarizes the features available in the Community (OSS) and Enterprise editions.

| Feature | Community | Enterprise |
|---|---|---|
| Mutable Row & Column Store | X | X |
| Compatibility with Spark | X | X |
| Shared Nothing Persistence & HA | X | X |
| REST API for Spark Job Submission | X | X |
| Fault Tolerance for Driver | X | X |
| JDBC Driver | X | X |
| CLI for backup, restore & export | X | X |
| Spark console extensions | X | X |
| System Perf/Behavior statistics | X | X |
| Support for transactions in Row tables | X | X |
| Support for indexing in Row Tables | X | X |
| SQL extensions for stream processing | X | X |
| Synopsis Data Engine for Approximate Querying | | X |
| ODBC Driver with High Concurrency | | X |
| Off-heap data storage for column tables | | X |
| CDC Stream receiver for SQL Server into SnappyData | | X |
| GemFire/Apache Geode connector | | X |
| LDAP security interface | | X |

More details about the release:

New Features:

  • putInto and deleteFrom bulk operations support for column tables (SNAP-2092, SNAP-2093, SNAP-2094):
    • ability to specify "key columns" in the table DDL to use for putInto and deleteFrom APIs
    • "PUT INTO" SQL or putInto API extension to overwrite existing rows and insert non-existing ones
    • "DELETE FROM" SQL or deleteFrom API extension to delete a set of matching rows
    • UPDATE SQL now supports using expressions with column references of another table in RHS of SET
  • Improvements in cluster restart with off-line, failed nodes or with corrupt meta-data (SNAP-2096)
    • new admin command "unblock" to allow the initialization of a table even if it is waiting for offline members
    • unlike revoke, "unblock" retains data and initializes with the latest online working copy (SNAP-2143)
    • parallel recovery of data regions to break any cyclic dependencies between the nodes, and to allow reporting on all off-line nodes that may have a more recent copy of the data
    • many bug-fixes related to startup issues due to meta-data inconsistencies:
      incorrect data conflicts (SNAP-2097, SNAP-2098), metadata corruption (SNAP-2140)
  • Compression of column batches in disk storage and over the network (SNAP-1743)
    • support for LZ4, SNAPPY compression codecs in disk storage and transport of column table data
    • new SOURCEPATH and COMPRESSION columns in SYS.HIVETABLES virtual table
  • Support for temporary, global temporary and persistent VIEWs (SNAP-2072):
    • CREATE VIEW, CREATE TEMPORARY VIEW and CREATE GLOBAL TEMPORARY VIEW DDLs
  • No jar dependencies in snappydata cluster for external datasources of smart connector (SNAP-2072)
  • External tables display in dashboard and snappy command-line (SNAP-2086)
  • Auto-configuration of SPARK_PUBLIC_DNS, hostname-for-clients etc in AWS environment (SNAP-2116)
  • GRANT/REVOKE SQL support in SnappySession.sql(); earlier these were allowed only from JDBC/ODBC (SNAP-2042)
  • LATERAL VIEW support in SnappySession.sql() (SNAP-1283)
  • FETCH FIRST syntax as an alternative to LIMIT, to support some SQL tools that use the former
  • Addition of IndexStats for local row table index lookups and range scans
  • SYS.DISKSTOREIDS virtual table to show the disk-store IDs being used in the cluster by all members (SNAP-2113)
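
The PUT INTO and DELETE FROM behaviour listed above can be illustrated with a small in-memory model. This is only a sketch of the semantics stated in these notes (overwrite rows whose key columns match and insert the rest; delete the set of rows whose key columns match). It is plain Python and does not use the SnappyData API: the function names `put_into` and `delete_from`, the dict-based table, and the sample columns are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]
Key = Tuple[object, ...]


def _key(row: Row, key_columns: List[str]) -> Key:
    """Build the lookup key from the table's declared key columns."""
    return tuple(row[c] for c in key_columns)


def put_into(table: Dict[Key, Row], rows: Iterable[Row], key_columns: List[str]) -> None:
    """Overwrite rows whose key already exists and insert the others."""
    for row in rows:
        table[_key(row, key_columns)] = dict(row)


def delete_from(table: Dict[Key, Row], rows: Iterable[Row], key_columns: List[str]) -> int:
    """Delete rows whose key columns match an incoming row; return the count."""
    deleted = 0
    for row in rows:
        if table.pop(_key(row, key_columns), None) is not None:
            deleted += 1
    return deleted


# Hypothetical table keyed on "id" (standing in for key columns declared in the DDL).
orders: Dict[Key, Row] = {}
put_into(orders, [{"id": 1, "qty": 5}, {"id": 2, "qty": 7}], ["id"])
# id 2 already exists and is overwritten; id 3 is new and is inserted.
put_into(orders, [{"id": 2, "qty": 9}, {"id": 3, "qty": 1}], ["id"])
# Only id 1 matches an existing row; id 42 is ignored.
removed = delete_from(orders, [{"id": 1}, {"id": 42}], ["id"])
```

The key columns play the role of a match condition only; the model makes no claim about how the product stores or locates rows internally.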

Performance Enhancements:

  • Major performance improvements in smart connector mode (SNAP-2101, SNAP-2084)
    • minimized buffer copying, key lookups in column tables rather than full scans for filters, reduced round-trips
    • allow using SnappyUnifiedMemoryManager with smart connector (SNAP-2084)
  • New memory and disk iterator to minimize faultins and serialize disk reads (SNAP-2102):
    • reduce faultins and cross-iterator serial disk reads per diskstore to minimize random reads from disk
    • new remote iterator that substantially reduces the memory overhead and caches only current batch
  • Startup performance improvements to cut down on locator/server/lead start and restart times (SNAP-338)
  • Improve performance of reads of variable length data for some queries (SNAP-2118)
  • Use colocated joins with VIEWs when possible (SNAP-2204)
  • Separate disk store for delta buffer regions to substantially improve column table compaction (SNAP-2121)
  • Projection push-down to scan layer for non-deterministic expressions like spark_partition_id() (SNAP-2036)
  • Code-generation cache is larger by default and is configurable (SNAP-2120)
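
The idea of a remote iterator that caches only the current batch, mentioned in the list above, can be sketched with a generator. This is an illustrative model under stated assumptions, not the product's implementation: `iterate_batches`, `fetch_batch`, and the list-backed stand-in for a remote store are hypothetical names introduced here.

```python
from typing import Callable, Iterator, List, Optional

Batch = List[int]


def iterate_batches(fetch_batch: Callable[[int], Optional[Batch]]) -> Iterator[int]:
    """Yield rows while holding a reference to only the current batch.

    fetch_batch(i) is a hypothetical remote call that returns batch i,
    or None when there are no more batches.
    """
    index = 0
    while True:
        # Rebinding `current` drops the reference to the previous batch.
        current = fetch_batch(index)
        if current is None:
            return
        yield from current
        index += 1


# Stand-in for a remote store holding three small column batches.
_remote: List[Batch] = [[1, 2, 3], [4, 5], [6]]
fetch_calls: List[int] = []


def fake_fetch(i: int) -> Optional[Batch]:
    """Record each fetch and return the requested batch, or None past the end."""
    fetch_calls.append(i)
    return _remote[i] if i < len(_remote) else None


rows = list(iterate_batches(fake_fetch))
```

Because batches are fetched lazily, a consumer that stops early never requests the remaining batches, and memory use is bounded by one batch rather than by the whole result.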

Select bug fixes and performance related fixes:
A sample of the bug fixes in this release is noted below. For a more comprehensive list, see ReleaseNotes.txt.

  • Now only overflow-to-disk is allowed as eviction action for tables (SNAP-1501):
    • the eviction action is always overflow-to-disk and can no longer be specified explicitly
    • the OVERFLOW=false property can be used to disable eviction; OVERFLOW is true by default
  • Memory accounting fixes:
    • incorrect initial memory accounting causing insert failure even with memory available (SNAP-2084)
    • zero usage shown in UI on restart (SNAP-2180)
  • Disable the embedded Zeppelin interpreter in a secure cluster, since it can bypass security (SNAP-2191)
  • Fix import of JSON data (SNAP-2087)
  • SELECT queries missing results or failing during node failures (SNAP-889, SNAP-1547)
  • fixes and improvements to server and lead status in both the launcher status and SYS.MEMBERS table
    (SNAP-1960, SNAP-2060, SNAP-1645)
  • fix updates on complex types (SNAP-2141)
  • column table scan fixes related to null value reads (SNAP-2088)
  • disable tokenization for external tables, flags to disable it and plan caching (SNAP-2114, SNAP-2124)
  • deadlock in transactional operations with GII (SNAP-1950)
  • couple of fixes in UPDATE SQL: unexpected rollover (SNAP-2192), show as update count (SNAP-2156)
  • fixes ported from Apache Geode (GEODE-2109, GEODE-2240)
  • fixes to all failures in snappy-spark test suite which includes both product and test changes
  • more comprehensive python API testing (SNAP-2044)

Description of download artifacts:

| Artifact Name | Description |
|---|---|
| snappydata-1.0.1-bin.tar.gz | Full product binary (includes Hadoop 2.7) |
| snappydata-1.0.1-without-hadoop-bin.tar.gz | Product without the Hadoop dependency JARs |
| snappydata-client-1.6.1.jar | Client (JDBC) JAR |
| snappydata-zeppelin-0.7.3.jar | The Zeppelin interpreter JAR for SnappyData, compatible with Apache Zeppelin 0.7.3 |
| snappydata-ec2-0.8.1.tar.gz | Script to launch a SnappyData cluster on AWS EC2 instances |