Update for PouchDB 2.x (and soon 3.x) #11

Open
nolanlawson opened this issue Jun 12, 2014 · 18 comments

@nolanlawson

Awesome plugin! But there are a few things that have changed in PouchDB 2.x and above (e.g. window.Pouch is no longer defined, only window.PouchDB), so this plugin would need an update to work with the latest version.

I wrote a PouchDB plugin seed that makes it super easy to get up and running (and testing!) your plugins, so you may want to just merge or rebase with that, and then work from there. Else a simple s/Pouch/PouchDB/g would probably also do the trick. :)

@ssured

ssured commented Sep 16, 2014

I tried to revive this project in #12. Some progress has been made, but quite a bit of work is still needed. Once the code runs on PouchDB 3, I'm happy to help port it into your plugin seed repo.

@natevw
Owner

natevw commented Oct 10, 2014

Really sorry this has sat so long, I've still been swamped with client work and home life. I'm also a bit surprised at the attention this project ended up getting; what are people using/hoping to use this for?

Once I had the proof-of-concept going, I realized that:

  • CouchDB's natural syncing model isn't great for "collaborative editing" type stuff in general (one compelling use case for low-latency direct connection)
  • PouchDB — browser storage, that is — isn't great for huge archives of data (high bandwidth another good reason to directly connect peers)
  • it still requires a reliable, trusted, centralized signaling server (so without being great for the use cases above, it's unclear to me what good it is)

That's not to say it's not been or wouldn't continue to be a fun design to nail, or that I don't think the pieces here are useful (#6) or even that the combination is provably useless, it's just that most all the projects I was thinking would be fun to build on PeerPouch didn't seem to be a good fit after all and so it's tumbled downwards in my "priority queue" :-(

To put it [unnecessarily] rudely: pep talk or GTFO ;-)

@nolanlawson
Author

I always sat in the back of the gym during pep assemblies, but I'll try a pep talk. ;)

  • Collaborative editing can be made better in CouchDB if you do the every-doc-a-delta approach (e.g. see delta-pouch)
  • PouchDB is getting better at storing large volumes of data; recent releases have really focused on attachments. In Chrome 38+, Firefox, and IE, binary attachments take up the same space on disk that a file would, and the md5sum is calculated without blocking the DOM
  • The centralized server thing is a standard problem in p2p systems; not aware of any way you can avoid it. :/

That being said, we're also making progress on turning CouchDB replication into a streaming protocol to make it easier to replicate over arbitrary transfer mechanisms (see pouchdb-replication-stream). My goal with that is to eventually support bluetooth, same-wifi p2p, you name it. WebRTC would be an excellent addition.
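To make the every-doc-a-delta idea concrete, here is a minimal sketch in plain JavaScript. It does not use delta-pouch's actual API; the document shape (`docId`, `ts`, `changes`) and the `mergeDeltas` helper are assumptions made for illustration. Each edit is stored as its own small document, so concurrent edits from two peers do not fight over revisions of one shared document, and the current state is rebuilt by replaying the deltas in order.

```javascript
// Hypothetical delta documents: each edit is stored as its own doc.
// Field names (docId, ts, changes) are illustrative, not delta-pouch's schema.
function mergeDeltas(deltas) {
  const state = {};
  // Replay in timestamp order; ties are broken by _id so every peer agrees.
  const ordered = [...deltas].sort(
    (a, b) => a.ts - b.ts || (a._id < b._id ? -1 : a._id > b._id ? 1 : 0)
  );
  for (const delta of ordered) {
    const current = state[delta.docId] || {};
    // Later deltas overwrite only the fields they mention.
    state[delta.docId] = { ...current, ...delta.changes };
  }
  return state;
}

// Deltas may arrive in any order after replication.
const deltas = [
  { _id: "d2", docId: "note1", ts: 2, changes: { title: "Groceries" } },
  { _id: "d1", docId: "note1", ts: 1, changes: { title: "Todo", body: "milk" } },
  { _id: "d3", docId: "note1", ts: 3, changes: { body: "milk, eggs" } },
];
const merged = mergeDeltas(deltas);
console.log(merged.note1); // { title: 'Groceries', body: 'milk, eggs' }
```

Timestamps from different devices are only as trustworthy as their clocks, so a real application would likely need a more careful ordering rule than this last-write-wins sketch.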

@natevw
Owner

natevw commented Oct 10, 2014

Thanks Nolan! delta-pouch looks cool, and if that's already pretty solid I can see that being a great use case for PeerPouch.

A replication protocol would be a great way to go; I considered exchanging HTTP-ish messages like @tilgovi did for his Chrome plugin but in the end getting the RPC hack working was quicker and more interesting. Having PeerPouch simply coordinate/transport pouchdb-replication-stream over WebRTC data channels might be a more "production grade" approach — focusing on syncing two databases instead of remote controlling one.

(That reminds me of the other shortcoming I saw in PeerPouch: given its purpose, PouchDB reasonably has [had?] no concept of authentication or even document validation. So exposing a database over PeerPouch is a major trust exercise without any clear remedy.)

@ssured

ssured commented Oct 13, 2014

Hi Nate, My pep talk consists of the following two use cases:

First: it's a very easy way to cheaply scale a service. E.g., I am developing an online service for issue tracking in the construction industry. PouchDB is used for collecting photos on site using a tablet, which are synced to a CouchDB. A user can then track development on the issues on their desktop. The service is paid for, but is also offered for free (with fewer options). Still, the free plan uses server resources, as a central CouchDB is hosted, and thus costs money. I thought it would be nice if the desktop machine of a free-tier customer could be the main data store (Chrome offers 1 GB+ of HTML5 storage). A mobile phone could then sync with the customer's own desktop, without a costly server in between. Only once the customer is paying for the service is a central CouchDB created, with all the extra benefits.

Second: p2p syncing is really cool in developing countries / rural areas. Internet in for instance India is not available everywhere, yet everyone has a (simple) smartphone. Using filtered replication based on location, it's quite easy to create a decentralized local marketplace using PouchDB. The need for a centralized server is a real issue here though.
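As a rough illustration of filtered replication based on location, the sketch below builds a predicate that accepts only documents within a given radius of a point. The `location` field with `lat`/`lon` properties, the helper names, and the example coordinates are all assumptions for illustration; PouchDB itself does not define any of them.

```javascript
// Great-circle distance in kilometres between two {lat, lon} points (haversine).
function distanceKm(a, b) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const R = 6371; // mean Earth radius in km
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(h));
}

// Returns a predicate that accepts only docs located within radiusKm of center.
// doc.location is an assumed schema for this example.
function makeLocationFilter(center, radiusKm) {
  return (doc) =>
    Boolean(doc.location) && distanceKm(center, doc.location) <= radiusKm;
}

const nearby = makeLocationFilter({ lat: 18.52, lon: 73.86 }, 25);
console.log(nearby({ _id: "a", location: { lat: 18.6, lon: 73.9 } })); // true
console.log(nearby({ _id: "b", location: { lat: 28.61, lon: 77.21 } })); // false
console.log(nearby({ _id: "c" })); // false (no location)
```

PouchDB's replication options accept a `filter` function, so a predicate like this could be passed there. Note that a filter only limits what gets copied; it is not an access control mechanism.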

In the end, I think the ecosystem of PouchDB would benefit from having a p2p solution. My understanding of the subject is not sufficient to know whether WebRTC or another approach would be the way to go. I invested some time to 1) learn from your code, 2) see if I could revive it again. Goal 2 was much harder than I thought though. I think there are plenty of use-cases for p2p, so this project really can have a future IMO.

@natevw
Owner

natevw commented Oct 14, 2014

Thanks! Any ideas for the other big hole I remembered in my second comment above?

Namely, I remembered again the other big issue with using PeerPouch in production: exposing a database to a peer gives them 100% of your rights and privileges to it. They can do anything you can and any changes they make are indistinguishable from yours (especially if you're also doing an authenticated HTTP push replication of the same database).

Probably you could add some hashy-crypto-signy stuff to at least detect forged/tampered documents, but the damage is still there and I'm somewhat uncomfortable [and certainly demotivated] to spend time polishing a library that will be (for most purposes) hugely insecure by design AFAICT.
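A minimal sketch of what such signing could look like, using Node's built-in `crypto` module with an Ed25519 key pair. The `sig` field, the helper names, and the choice to exclude `_rev` from the signed payload are assumptions for illustration, not part of PeerPouch or PouchDB. As noted above, this only detects forged or altered documents; it does nothing to stop a peer from writing them.

```javascript
const crypto = require("crypto");

// Deterministic JSON: object keys are sorted so signer and verifier hash the same bytes.
function canonicalize(value) {
  if (Array.isArray(value)) return "[" + value.map(canonicalize).join(",") + "]";
  if (value && typeof value === "object") {
    const keys = Object.keys(value).sort();
    return "{" + keys.map((k) => JSON.stringify(k) + ":" + canonicalize(value[k])).join(",") + "}";
  }
  return JSON.stringify(value);
}

// Sign everything except the revision (it changes on every write) and the signature itself.
function payloadOf(doc) {
  const { _rev, sig, ...rest } = doc;
  return Buffer.from(canonicalize(rest));
}

function signDoc(doc, privateKey) {
  const sig = crypto.sign(null, payloadOf(doc), privateKey).toString("base64");
  return { ...doc, sig };
}

function verifyDoc(doc, publicKey) {
  if (typeof doc.sig !== "string") return false;
  return crypto.verify(null, payloadOf(doc), publicKey, Buffer.from(doc.sig, "base64"));
}

const { publicKey, privateKey } = crypto.generateKeyPairSync("ed25519");
const signed = signDoc({ _id: "issue-1", status: "open" }, privateKey);
console.log(verifyDoc(signed, publicKey)); // true
console.log(verifyDoc({ ...signed, status: "closed" }, publicKey)); // false
```

Distributing and trusting the public keys is the hard part and is left out here; that is the identity-management gap discussed in this thread.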

Basically my thinking is that:

  • the minimal "hook up WebRTC data channels" logic has value for its broad applicability (e.g. @quartzjer was interested in using it for @telehash at one point) although there's many other (albeit way less minimal) contenders e.g. peerjs
  • the "use any shareable PouchDB backend to set up a WebRTC data channel" has some value, although there's more to be desired as far as identity management (and privacy?) for production use
  • the "magically expose one of my pouches in someone else's browser" makes a fun demo, but it won't be useable without a shared (HTTP) server and it shouldn't be used outside the context of a trusted workgroup

Those "althoughs" and the big "but" (snickers childishly) are why this has sat waiting; it's waiting for when I or another kindly contributor have enough time to wager on this still ending up useful after all.

@natevw
Owner

natevw commented Oct 14, 2014

(The irony of course is that I keep spending time writing out essays on "big plans" and "reasons why I won't follow through" when some of the requisite refactors are probably not much more time once I would get back into this codebase ;-)

@quartzjer

I did a small webrtc-peer module to try and create a minimal reusable component distinct from larger efforts like peerjs. I had it working, though not yet in production, as a transport for telehash, and will be getting back to it shortly :)

@vallieres

Hello,

If I may chip in with a potential use of PeerPouch. I've been working on a mobile app that will be used in trains by employees. We already have much work done and the PouchDB clients sync with a CouchDB server pretty well. Over the course of a route, the connection can be lost but changes can still be recorded and when connection is back, it all syncs up nicely.

There is one other thing we would like to achieve. It would be nice if two devices could talk to each other and sync to one another when there is no cellular network connection (all devices are always on with local train Wi-Fi, but there are cell network dead zones across the country where the Wi-Fi router cannot connect via LTE/3G). The router has a service that can advertise other same-class devices, so I could have for example two devices (10.0.0.2 and 10.0.0.3) that want to sync up their data. My app already has the IP information of peers and I was wondering if peer-to-peer like that could be conceived with PeerPouch.

Now I know it's built to use a CouchDB instance as a signaling server, so what else would be required to bridge the gap in a no-reception (no CouchDB) scenario?

@nolanlawson
Author

AFAIK it's not possible to write a peer-to-peer system without some intermediary to at least allow the two peers to find each other.

Another thing you should be aware of is that this plugin doesn't work with the latest PouchDB due to breaking changes since 1.1 (when this was written). Fortunately there is now a pouchdb-replication-stream project that runs in both Node and the browser, and which can allow you to replicate two databases by just passing around streams of line-delimited JSON. I've been meaning to update PeerPouch to use it, but just haven't been able to find the time. 😛

But you could definitely use it to sync over wifi or WebRTC or whatever you want. You would still need the intermediary, though.
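To show what "passing around streams of line-delimited JSON" means in practice, here is a small transport-agnostic sketch. It does not reproduce the actual wire format of pouchdb-replication-stream; `toLines` and `createLineParser` are hypothetical helpers. The point is that the receiver must buffer partial lines, because a transport (Wi-Fi socket, WebRTC data channel, and so on) may split the data at arbitrary points.

```javascript
// Serialize documents as line-delimited JSON: one JSON object per line.
function toLines(docs) {
  return docs.map((doc) => JSON.stringify(doc)).join("\n") + "\n";
}

// Incremental parser: feed it arbitrary chunks; it returns the docs completed
// so far and keeps any partial trailing line buffered for the next chunk.
function createLineParser() {
  let buffer = "";
  return function pushChunk(chunk) {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop(); // last piece is an incomplete line (or empty)
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line));
  };
}

const wire = toLines([{ _id: "a", n: 1 }, { _id: "b", n: 2 }]);
const pushChunk = createLineParser();
// Simulate a transport that splits the data in the middle of the first line.
const first = pushChunk(wire.slice(0, 10));
const second = pushChunk(wire.slice(10));
console.log(first); // []
console.log(second); // [ { _id: 'a', n: 1 }, { _id: 'b', n: 2 } ]
```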

@natevw
Owner

natevw commented Mar 23, 2015

> The router has a service that can advertise other same-class devices, so I could have for example two devices (10.0.0.2 and 10.0.0.3) that want to sync up their data. My app already has the IP information of peers and I was wondering if peer-to-peer like that could be conceived with PeerPouch.

As @nolanlawson says, for all the fanfare around WebRTC as enabling "peer-to-peer communications" it is wrapped in many layers of protocol and security concerns and does not simply give web apps the ability a native app would have to arbitrarily open a direct socket connection to a peer.

I'm not going to be 100% accurate in all the details here without specific research, but to establish a WebRTC connection, you basically need to get both an "offer" and an "answer" between the two devices through a separate (often "centralized") channel. This involves not just the IP address but also a (probably dynamically assigned) port and — IIRC — a nonce as well. (I played a bit with pre-generating offers but something didn't go well there. I've now more experience with the SIP protocol and terminology which influences WebRTC greatly and might be able to better troubleshoot if I tried again now, but that's just my recollection.) [xref: https://www.ietf.org/rfc/rfc3264.txt may (or may not) be relevant here.]

Anyway, to put this practically, in cases like yours you probably want at least one of the devices to be running an HTTP server on a hard-coded port. This could be CouchDB on a permanently-installed Raspberry Pi, or Couchbase Lite running in a native app on one of the phones, or perhaps a custom signaling service as a Chrome extension, or whatever — but the peers will need to first establish contact with the same one of these (perhaps scanning through the known IPs in some deterministic fashion) and use that to communicate their offers/answers.
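One way to read "scanning through the known IPs in some deterministic fashion" is sketched below: every peer sorts the same list of known addresses numerically and tries them in that order, so all peers end up contacting the same signaling host first. The helper names and the port number are assumptions for illustration (5984 happens to be CouchDB's default port), and the actual HTTP probing is left out.

```javascript
// Convert a dotted IPv4 string to a number so addresses sort numerically
// (a plain string sort would put "10.0.0.12" before "10.0.0.2").
function ipToNumber(ip) {
  return ip.split(".").reduce((acc, octet) => acc * 256 + Number(octet), 0);
}

// Every peer sorts the same list the same way, so all of them try hosts in the same order.
function signalingCandidates(knownIps) {
  return [...new Set(knownIps)].sort((a, b) => ipToNumber(a) - ipToNumber(b));
}

// Hypothetical hard-coded port for the signaling HTTP server.
const SIGNALING_PORT = 5984;

function candidateUrls(knownIps) {
  return signalingCandidates(knownIps).map(
    (ip) => `http://${ip}:${SIGNALING_PORT}/`
  );
}

const urls = candidateUrls(["10.0.0.12", "10.0.0.3", "10.0.0.2", "10.0.0.3"]);
console.log(urls);
// [ 'http://10.0.0.2:5984/', 'http://10.0.0.3:5984/', 'http://10.0.0.12:5984/' ]
```

Each peer would then request these URLs in order and use the first host that responds to exchange its offer or answer.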

There may be more creative solutions, but to be honest that sounds like something I'd do as a consultant in particular cases, rather than a generic solution — certainly not in this library. As you may have read in the conversations above, this library was something of an experiment/demo that may not be very useful in real life. If you have the infrastructure needed for "traditional" setup of a WebRTC connection, you likely have what you need to just do "traditional" PouchDB or Couchbase Lite replications.

@natevw
Owner

natevw commented Mar 23, 2015

@vallieres It does strike me in your case (and maybe this is what you were showing me!) that you could get all the connections set up ahead of your signal loss and then the WebRTC connection may continue to work.

  1. At station start of route (or whenever signal present really) web apps could use a central server across WAN, to set up WebRTC connections over LAN
  2. Now regardless of WAN status, they can continue to sync over LAN

This is an interesting use case that would fit the WebRTC limitations, but I do wonder if it works in practice. For example, the first thing to test would be whether your browser keeps connections open even when the device gets put to sleep in a pocket. Otherwise, if you wake up when in a tunnel, etc. etc.

@natevw
Owner

natevw commented Mar 23, 2015

Sorry I am rambling again here, but there's another counter-thought to the above:

  • On iOS, where you sometimes need a web app because Apple does not allow your idea to be published, the Safari browser does not have WebRTC support anyway.
  • On every other platform, it is not a big deal to share a native app with your users through side loading or a less power-mad approval process.

So probably what you really want is https://github.com/couchbaselabs/Couchbase-Lite-PhoneGap-Plugin instead of this.

@vallieres

Wow! You are not rambling at all! All this is very useful information! Thank you very much!

I will test the scenarios you provided. I particularly like the Raspberry Pi solution as this is something we can do for sure.

@deefactorial

I have been doing some recent research into this issue; here are my findings:

freedom.js has an API to help with the peer routing:

  • the framework has a social API to abstract access to social networks like Gmail contacts or Facebook friends, to find peers to replicate with.
  • the framework has a transport API to abstract the routing for the WebRTC connection. (they developed a similar file transfer sample app that was developed for this project)
  • they have a storage API, but this is where I think PouchDB can do better.

I still agree that there needs to be authentication and an ACL to control who and what is allowed to replicate. Couchbase Sync Gateway is a good example of how to implement an ACL structure that allows or disallows synchronization. Another project is Thali, which has made great efforts in this direction by developing identity, ACL, and PouchDB replication over BLE.

Currently I have a Node.js API developed using Couchbase Server. I'm looking at converting that to PouchDB, running it on the client with freedom.js, and developing a replication protocol to replicate with peers. I'm open to any suggestions or things to check out.

@theavijitsarkar

I have a web app and a mobile app; both run PouchDB. I want to enable sync between them without storing any info on my server.
I read that I need to use a signaling server and Pouch over RTC.

Can anybody guide me or help me out with this?

@natevw
Owner

natevw commented Nov 27, 2020

@theavijitsarkar This prototype would have enabled that, but hasn't been maintained and so would probably take some updating to work with the latest PouchDB releases. Looks like as of about four years ago another potential starting point was https://github.com/pouchdb-community/pouchdb-replication-stream (via https://stackoverflow.com/a/40620835/179583).

I'd still be interested in getting this working and a little more polished, just haven't had an urgent need for it personally nor any client projects that would have reason to sponsor further development.

@natevw
Owner

natevw commented Feb 23, 2023

So I happened across an interesting thing today (via GH feed after @developit starred it): https://github.com/pion/offline-browser-communication

> This repo demonstrates how you can connect two WebRTC processes without signaling. No configuration is needed ahead of time, so no hardcoding of IP Addresses. The peers use mDNS to connect to each other, and have pre-set ICE Credentials and DTLS Certificates.

So maybe mDNS is doing too much of the heavy lifting for use out on the WAN web, but the "pre-set ICE Credentials" thing is something I wondered about and so it's interesting as a proof-of-concept of something that could put the "Peer" back into a Pouch-to-Pouch connection setup!


7 participants