[DataCap Application] <Starling Lab> - <USC Shoah Foundation's Dimensions in Testimony> #62

Open
2 tasks
anstco opened this issue Aug 5, 2024 · 27 comments


anstco commented Aug 5, 2024

Data Owner Name

The Starling Lab

Data Owner Country/Region

United States

Data Owner Industry

Education & Training

Website

https://www.starlinglab.org/

Social Media Handle

https://www.linkedin.com/company/starlinglab/

Social Media Type

Other

What is your role related to the dataset

Dataset Owner

Total amount of DataCap being requested

5.6 PiB

Expected size of single dataset (one copy)

2.8 PiB

Number of replicas to store

2

Weekly allocation of DataCap requested

500 TiB

On-chain address for first allocation

f1p2p3e6gv6vygtouazdcb4757vh5leylcxggzkbq

Data Type of Application

Private Non-Profit / Social impact

Custom multisig

  • Use Custom Multisig

Identifier

No response

Share a brief history of your project and organization

The Stanford / USC Starling Lab for Data Integrity is the first academic lab focused on applied research on web3 and human rights.

Is this project associated with other projects/ecosystem stakeholders?

Yes

If answered yes, what are the other projects/ecosystem stakeholders

DARMA Capital will assist with the Initial Pledge

Describe the data being stored onto Filecoin

Starling Lab will host curated collections in collaboration with USC Libraries. Initial collections include the USC Shoah Foundation's Dimensions in Testimony, which includes XR and volumetric video footage.

Where was the data currently stored in this dataset sourced from

My Own Storage Infra

If you answered "Other" in the previous question, enter the details here

No response

If you are a data preparer. What is your location (Country/Region)

None

If you are a data preparer, how will the data be prepared? Please include tooling used and technical details?

No response

If you are not preparing the data, who will prepare the data? (Provide name and business)

We're working with ecosystem partners like PiKNiK to prepare the data into CAR files etc.

Has this dataset been stored on the Filecoin network before? If so, please explain and make the case why you would like to store this dataset again to the network. Provide details on preparation and/or SP distribution.

N/A

Please share a sample of the data

https://sfi.usc.edu/dit

Confirm that this is a public dataset that can be retrieved by anyone on the Network

  • I confirm

If you chose not to confirm, what was the reason

This footage is available for researchers who are approved by USC.

What is the expected retrieval frequency for this data

Sporadic

For how long do you plan to keep this dataset stored on Filecoin

More than 3 years

In which geographies do you plan on making storage deals

North America

How will you be distributing your data to storage providers

Others

How did you find your storage providers

Partners

If you answered "Others" in the previous question, what is the tool or platform you used

No response

Please list the provider IDs and location of the storage providers you will be working with.

1. Krates AI - North America

How do you plan to make deals to your storage providers

Boost client

If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline

Yes

Contributor

datacap-bot bot commented Aug 5, 2024

Application is waiting for allocator review

@anstco anstco changed the title [DataCap Application] <Organization> - <Project Name> [DataCap Application] The Starling Lab - USC Libraries // Shoah Foundation Aug 5, 2024
datacap-bot bot added a commit that referenced this issue Aug 5, 2024
@kevzak kevzak changed the title [DataCap Application] The Starling Lab - USC Libraries // Shoah Foundation [DataCap Application] <Starling Lab> - <USC Shoah Foundation's Dimensions in Testimony> Aug 5, 2024
datacap-bot bot added a commit that referenced this issue Aug 5, 2024
Contributor

kevzak commented Aug 5, 2024

Hi @anstco thank you for applying.

A few questions:

So you have a 2.8 PiB dataset. Are the four copies supposed to total 11.2 PiB? You listed 5.6, so I just want to confirm the total ask.

Can you confirm whether these previous LDN applications are related to this dataset, so we understand what has already been stored?
filecoin-project/filecoin-plus-large-datasets#53
filecoin-project/filecoin-plus-large-datasets#2085

Can you confirm SPs involved with this project? We ask for minerID, entity name, location. Two copies and two entities are required.

Author

anstco commented Aug 5, 2024

Hi @kevzak great to hear from you again.

  1. 5.6 total. I could not choose a value lower than 4.
  2. Confirmed, these are net-new DataCap applications. This one is specific to the USC Libraries and also contains some Shoah Foundation data.
  3. f03046733 - Krates AI - North America
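
The sizing figures exchanged above can be checked with a little arithmetic. This is only a minimal sketch, assuming binary units (1 PiB = 1024 TiB); the function names are hypothetical and are not part of any Fil+ tooling.

```python
TIB_PER_PIB = 1024  # assumption: binary units, 1 PiB = 1024 TiB


def total_datacap_pib(single_copy_pib: float, replicas: int) -> float:
    """Total DataCap needed to store `replicas` copies of one dataset."""
    return single_copy_pib * replicas


def weeks_to_onboard(total_pib: float, weekly_tib: float) -> float:
    """Weeks needed to use the full request at a steady weekly rate."""
    return total_pib * TIB_PER_PIB / weekly_tib


two_copies = total_datacap_pib(2.8, 2)      # the 5.6 PiB listed in the application
four_copies = total_datacap_pib(2.8, 4)     # the 11.2 PiB kevzak asked about
weeks = weeks_to_onboard(two_copies, 500)   # at the requested 500 TiB per week

print(f"{two_copies:.1f} PiB, {four_copies:.1f} PiB, about {weeks:.1f} weeks")
```

At the requested weekly rate, the two-copy total would take roughly eleven and a half weeks to onboard, provided allocations keep pace.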

datacap-bot bot added a commit that referenced this issue Aug 6, 2024
Contributor

kevzak commented Aug 6, 2024

OK, great @anstco. Is there a second entity/SP involved in storing the second copy? We need at least two.

@kevzak kevzak self-assigned this Aug 6, 2024
Author

anstco commented Aug 16, 2024

Hi @kevzak

Here's the updated list with two copies:

  1. f03046733 - USC - North America
  2. f03112580 - Krates AI - North America

Contributor

kevzak commented Aug 19, 2024

Thanks @anstco. The last step I ask for is a KYB (business check) of your client. Can you please complete this form for our records?

After that, as a trusted client you are eligible for 5% of the total request as a first allocation (300TiB). As each allocation reaches 75% usage, the deals will be reviewed and DataCap topped off in larger allocations.
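
The tranche rule described in this comment can be sketched as follows. This is an illustrative sketch only: the helper names are hypothetical, binary units (1 PiB = 1024 TiB) are assumed, and the exact 5% figure comes out slightly below the 300TiB quoted in the thread, so treating 300TiB as a rounded value is an assumption.

```python
TIB_PER_PIB = 1024  # assumption: binary units


def tranche_tib(total_request_pib: float, percent: float) -> float:
    """Size in TiB of a tranche expressed as a percentage of the total request."""
    return total_request_pib * TIB_PER_PIB * percent / 100


def needs_next_tranche(used_tib: float, allocated_tib: float,
                       threshold: float = 0.75) -> bool:
    """True once usage reaches the review threshold (75% in the thread)."""
    return used_tib >= threshold * allocated_tib


first = tranche_tib(5.6, 5)  # exact 5% of the 5.6 PiB request, in TiB
print(f"exact 5% tranche: {first:.2f} TiB (the thread grants 300TiB)")

# With a 300 TiB allocation, the review point is reached at 225 TiB used.
print(needs_next_tranche(224, 300), needs_next_tranche(225, 300))
```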

Contributor

datacap-bot bot commented Aug 19, 2024

Datacap Request Trigger

Total DataCap requested

5.6 PiB

Expected weekly DataCap usage rate

500 TiB

DataCap Amount - First Tranche

300TiB

Client address

f1p2p3e6gv6vygtouazdcb4757vh5leylcxggzkbq

Contributor

datacap-bot bot commented Aug 19, 2024

DataCap Allocation requested

Multisig Notary address

Client address

f1p2p3e6gv6vygtouazdcb4757vh5leylcxggzkbq

DataCap allocation requested

300TiB

Id

62e198f1-ae18-4ee2-b802-cd3b404298e6

Contributor

datacap-bot bot commented Aug 19, 2024

Application is ready to sign

Author

anstco commented Aug 19, 2024

Thanks @kevzak. KYB Application submitted.

Contributor

kevzak commented Aug 19, 2024

Confirmed

Contributor

datacap-bot bot commented Aug 19, 2024

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedg27krs46pdahyhg2fbx2ycrldfigc42ryoopzzp6mjkiqcaseks

Address

f1p2p3e6gv6vygtouazdcb4757vh5leylcxggzkbq

Datacap Allocated

300TiB

Signer Address

f1v24knjbqv5p6qrmfjj5xmlaoddzqnon2oxkzkyq

Id

62e198f1-ae18-4ee2-b802-cd3b404298e6

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedg27krs46pdahyhg2fbx2ycrldfigc42ryoopzzp6mjkiqcaseks

Contributor

datacap-bot bot commented Aug 19, 2024

Application is Granted

Contributor

martplo commented Nov 26, 2024

checker:manualTrigger

Contributor

datacap-bot bot commented Nov 26, 2024

DataCap and CID Checker Report Summary [1]

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 90% of total datacap - f03046733: 100.00%

⚠️ 1 storage providers have unknown IP location - f03046733

⚠️ All storage providers are located in the same region.

⚠️ 100.00% of Storage Providers have retrieval success rate equal to zero.

⚠️ 100.00% of Storage Providers have retrieval success rate less than 75%.

⚠️ The average retrieval success rate is 0.00%

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients [2]

✔️ No CID sharing has been observed.

Full report

Click here to view the CID Checker report.

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

  2. To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...
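
The storage provider warnings in the report above read like threshold checks over per-provider statistics. The following is a minimal sketch of such checks, not the actual checker's implementation: the `ProviderStats` fields and the function name are hypothetical, and only the 90% and 75% thresholds are taken from the report text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProviderStats:
    provider_id: str
    sealed_share: float      # fraction of the client's total DataCap sealed by this SP
    retrieval_rate: float    # retrieval success rate, from 0.0 to 1.0
    region: Optional[str]    # None when the IP location is unknown


def distribution_warnings(providers: List[ProviderStats],
                          max_share: float = 0.90,
                          min_retrieval: float = 0.75) -> List[str]:
    """Collect warnings similar in spirit to those in the report summary."""
    warnings = []
    for p in providers:
        if p.sealed_share > max_share:
            warnings.append(
                f"{p.provider_id} sealed more than {max_share:.0%} of total datacap")
        if p.region is None:
            warnings.append(f"{p.provider_id} has unknown IP location")
    low = [p for p in providers if p.retrieval_rate < min_retrieval]
    if low:
        warnings.append(
            f"{len(low) / len(providers):.2%} of providers have retrieval "
            f"success rate less than {min_retrieval:.0%}")
    return warnings


# Values mirroring the report: one SP holding everything, no successful retrievals.
report = distribution_warnings([ProviderStats("f03046733", 1.0, 0.0, None)])
for line in report:
    print(line)
```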

Contributor

datacap-bot bot commented Dec 19, 2024

Client used 75% of the allocated DataCap. Consider allocating next tranche.

Contributor

martplo commented Dec 30, 2024

checker:manualTrigger

Contributor

datacap-bot bot commented Dec 30, 2024

DataCap and CID Checker Report Summary [1]

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 90% of total datacap - f03046733: 100.00%

⚠️ 1 storage providers have unknown IP location - f03046733

⚠️ All storage providers are located in the same region.

⚠️ 100.00% of Storage Providers have retrieval success rate equal to zero.

⚠️ 100.00% of Storage Providers have retrieval success rate less than 75%.

⚠️ The average retrieval success rate is 0.00%

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients [2]

✔️ No CID sharing has been observed.

Full report

Click here to view the CID Checker report.

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

  2. To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Contributor

martplo commented Dec 30, 2024

Hi @anstco,
Could you please provide proof of location for the storage provider (SP) you used to store your data?
Additionally, I’d like to understand why only one SP was used for sealing, despite the declaration to use two, as mentioned in one of the previous comments.

Author

anstco commented Jan 13, 2025

Hi @martplo, happy to provide some context. We always envisioned two SPs storing this data, but the other SP, Krates AI, had issues obtaining collateral for their replica copy.

Hyunsu Jung from DARMA is stepping in to assist Krates in obtaining collateral and can comment further.

@martplo martplo assigned martplo and unassigned kevzak Jan 16, 2025
@cryptowhizzard

@anstco I'd love to discuss taking a copy for you. Can Hyunsu connect us?

Contributor

martplo commented Jan 21, 2025

@anstco
After talking with Hyunsu, we agreed to grant a smaller allocation until the issue with the second SP is resolved.

The second allocation (15%) should be 860TiB, but because of the above we will split it into two tranches: one of 430TiB now, and another of the same size after the issue is resolved.
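
The split described here can be reproduced with the same kind of arithmetic. This is a rough sketch under the assumption of binary units (1 PiB = 1024 TiB); `split_tranche` is a hypothetical helper, and the running total simply adds the 300TiB and 430TiB tranches reported in this thread.

```python
TIB_PER_PIB = 1024  # assumption: binary units


def split_tranche(total_request_pib: float, percent: float, parts: int):
    """Return (full tranche in TiB, size of each equal part in TiB)."""
    full = total_request_pib * TIB_PER_PIB * percent / 100
    return full, full / parts


full, each = split_tranche(5.6, 15, 2)
print(f"15% tranche: about {full:.0f} TiB, split into 2 x {each:.0f} TiB")

# DataCap granted so far in the thread: first tranche plus the first half.
granted_so_far_tib = 300 + 430
share = granted_so_far_tib / (5.6 * TIB_PER_PIB)
print(f"granted so far: {granted_so_far_tib} TiB ({share:.1%} of the request)")
```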

I'm sorry you had to wait so long.

Contributor

datacap-bot bot commented Jan 21, 2025

Application is in Refill

Contributor

datacap-bot bot commented Jan 21, 2025

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacecygh7ekqdrz3zae6lfuwjkkh77wqeqsjnbehsqlyxoosflbjpa6i

Address

f1p2p3e6gv6vygtouazdcb4757vh5leylcxggzkbq

Datacap Allocated

430TiB

Signer Address

f1msap4wvgzzv4xlzeq6kycmgx55ferfloxnt2rcy

Id

fd9c3b6e-8f99-46a8-89f5-9c5348bdbc0b

You can check the status here https://filfox.info/en/message/bafy2bzacecygh7ekqdrz3zae6lfuwjkkh77wqeqsjnbehsqlyxoosflbjpa6i

Contributor

datacap-bot bot commented Jan 21, 2025

Application is Granted

@datacap-bot datacap-bot bot added granted and removed Refill labels Jan 21, 2025

dampud commented Jan 28, 2025

checker:manualTrigger

Contributor

datacap-bot bot commented Jan 28, 2025

DataCap and CID Checker Report Summary [1]

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 50% of total datacap - f03046733: 100.00%

⚠️ 1 storage providers have unknown IP location - f03046733

⚠️ All storage providers are located in the same region.

⚠️ 100.00% of Storage Providers have retrieval success rate equal to zero.

⚠️ 100.00% of Storage Providers have retrieval success rate less than 75%.

⚠️ The average retrieval success rate is 0.00%

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 3 storage providers.

Deal Data Shared with other Clients [2]

✔️ No CID sharing has been observed.

Full report

Click here to view the CID Checker report.

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

  2. To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...
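
The replication warning in this last report can be illustrated with a small sketch. It assumes deals are available as hypothetical `(piece_id, provider_id)` pairs and that the percentage is weighted by deal count rather than deal size; the real checker may compute it differently.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def low_replication_share(deals: Iterable[Tuple[str, str]],
                          min_providers: int = 3) -> float:
    """Fraction of deals whose data sits on fewer than `min_providers` SPs."""
    deals = list(deals)
    if not deals:
        return 0.0
    providers_by_piece = defaultdict(set)
    for piece_id, provider_id in deals:
        providers_by_piece[piece_id].add(provider_id)
    low = sum(1 for piece_id, _ in deals
              if len(providers_by_piece[piece_id]) < min_providers)
    return low / len(deals)


# Hypothetical piece IDs; every deal goes to the single SP named in the report.
sample = [("piece-a", "f03046733"), ("piece-b", "f03046733")]
share = low_replication_share(sample)
print(f"{share:.2%} of deals are replicated across fewer than 3 storage providers")
```

With all deals sealed by one provider, every piece has a single holder, so the share is 100%, which matches the warning above.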
