Unable to develop a feasible server for the solution #109

Open
ghost opened this issue Nov 30, 2020 · 3 comments

@ghost

ghost commented Nov 30, 2020

Apologies for raising this as an issue; I am looking for help designing a server architecture for this project.

What I have done so far:

  • The app is running smoothly and all the exposed methods are tested.
  • I am able to send epoch keys to the server when a user consents to sharing their positive data.
  • I am able to receive those keys on all clients when clients request server data.
  • I am getting accurate contact-matching results.

What is the issue that I am facing?

  • Sending all users' positive data to all clients will incur heavy data egress charges.
  • I am unable to come up with a server design to reduce those charges.

My questions:

  • Since there is a production-ready application for this, I want to understand how this obstacle was overcome.
  • Is there a way to send some epoch keys to the server while requesting the server DB data, to reduce the amount of data sent to all the clients? That is, some type of key on which the server can filter the data.

I would appreciate any response, or a pointer to the proper channel for questions like this.

@eyalr0
Collaborator

eyalr0 commented Dec 1, 2020

Hi,

First of all, it should be possible to use a CDN or other solution to allow users to download the "positive data" in a way that shouldn't cost a lot of money to the server.

If it is still an issue, there is a trade-off between the size of the data and the privacy of the infected person. Instead of sending epoch keys, you can send only day keys, and that will decrease the size of the data by a factor of 24. This is not supported in the current version of the app, but might be possible to do.
You can read the details of the protocol design and see the reference code at https://github.com/eyalr0/HashomerCryptoRef.
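
To illustrate the factor of 24, here is a rough size comparison in Python. It is only a sketch: the 14-day window, the 16-byte key size, and the count of 1,000 positive users are illustrative assumptions, not values taken from the protocol or the app.

```python
# Rough payload-size comparison: hourly epoch keys vs. one key per day.
EPOCHS_PER_DAY = 24   # implied by the "factor of 24" mentioned above
RETENTION_DAYS = 14   # assumed reporting window (hypothetical)
KEY_BYTES = 16        # assumed key size (hypothetical)


def payload_bytes(positive_users: int, keys_per_day: int) -> int:
    """Bytes needed to publish every key of every positive user."""
    return positive_users * RETENTION_DAYS * keys_per_day * KEY_BYTES


# 1,000 positive users is an arbitrary example figure.
epoch_payload = payload_bytes(1_000, EPOCHS_PER_DAY)
day_payload = payload_bytes(1_000, 1)
print(epoch_payload, day_payload, epoch_payload // day_payload)
```

Whatever the real key size and window are, the ratio between the two payloads stays at 24 as long as there is one epoch key per hour.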

@ghost
Author

ghost commented Dec 1, 2020

@eyalr0, thanks for reaching out so promptly, and for the reference that you provided. I will surely go through it.

Below is the stress scenario on which I based my cost estimate:
Assuming there are 5 million users fetching the positive data reported by 50K users, where each record is approx. 30 KB, the total data going out per user is approx. 1.5 GB.

Now if I serve 1.5 GB to every user, I have huge egress charges as well as an app that incurs heavy data usage (if it doesn't crash).
And that is for just one API call.
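
The arithmetic behind this estimate can be checked with a short Python sketch. It assumes the figures above are in kilobytes and gigabytes (decimal units) and deliberately leaves out any price per gigabyte, since that depends on the provider.

```python
# Stress scenario from the comment above, in decimal units.
USERS = 5_000_000          # clients fetching the data
POSITIVE_REPORTS = 50_000  # users who reported positive
RECORD_KB = 30             # approximate size of one report

per_user_gb = POSITIVE_REPORTS * RECORD_KB / 1_000_000  # KB -> GB
total_pb = USERS * per_user_gb / 1_000_000              # GB -> PB
print(f"{per_user_gb:.1f} GB per user, {total_pb:.1f} PB in total")
```

So the 1.5 GB per-user figure is consistent with the inputs, and a single round of downloads in this scenario would come to roughly 7.5 PB of egress.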

My questions are:

  • Am I miscalculating the stress scenario, or overestimating the load?
  • Is there a compatible backend reference that I can check?
  • Is there a way of mapping data on the server so that we only send relevant data to each user? (More like a filter that matches some assigned identifiers, without the privacy trade-off.)

Attaching the device data sent to the server for 14 days for reference.

@ghost
Author

ghost commented Dec 9, 2020

@eyalr0, I read the reference document. I see that Section 2.3, Operational Constraints, Point 4 clearly addresses the issue.
I think even with the day keys, the cost is still high for my scenario (I am using CloudFront as my CDN).
If the update package is the same for all users for the entire day, we might as well cache it.

If I understand it correctly:

Assuming that the client makes one API request per day to fetch the positive data, and that the request is made at the end of the day:

  • A new application user will always fetch data for cases reported on that day and never the ones reported before.
  • MoH (Server) does not need to send any data reported before the current day.

Questions:

  • Are the statements above correct under that assumption?
  • Is it possible to chunk the update package?
  • What could be the consequence of chunking the data and sending the excess data over the next few days?
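
To make the chunking question concrete, here is a minimal Python sketch of one way a daily update package could be split into fixed-size chunks with a manifest, so that each chunk is a separate cacheable object. The function names, the manifest layout, and the chunk size are all hypothetical; none of them come from the app or the reference code.

```python
import hashlib
import json


def chunk_package(package: bytes, chunk_size: int) -> tuple[list[bytes], str]:
    """Split a package into fixed-size chunks and build a JSON manifest."""
    chunks = [package[i:i + chunk_size] for i in range(0, len(package), chunk_size)]
    manifest = {
        "total_bytes": len(package),
        "chunks": [
            {"index": i, "sha256": hashlib.sha256(c).hexdigest()}
            for i, c in enumerate(chunks)
        ],
    }
    return chunks, json.dumps(manifest)


def reassemble(chunks: list[bytes], manifest_json: str) -> bytes:
    """Verify every chunk against the manifest, then join them."""
    manifest = json.loads(manifest_json)
    for entry, chunk in zip(manifest["chunks"], chunks):
        if hashlib.sha256(chunk).hexdigest() != entry["sha256"]:
            raise ValueError(f"chunk {entry['index']} is corrupt")
    data = b"".join(chunks)
    if len(data) != manifest["total_bytes"]:
        raise ValueError("package is incomplete")
    return data


package = bytes(range(256)) * 10  # stand-in for a day's update package
chunks, manifest = chunk_package(package, chunk_size=1000)
assert reassemble(chunks, manifest) == package
```

Under this sketch, a client that has downloaded only some of the chunks has checked only some of the reported keys, so spreading chunks over several days would presumably delay matching against the keys in the later chunks.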

I really appreciate you acknowledging the issue and taking the time to assist with it.
