[DataCap Refresh] <4th> Review of <EF> #263

Open
Lind111 opened this issue Dec 20, 2024 · 15 comments
Labels
Diligence Audit in Process: Governance team is reviewing the DataCap distributions and verifying the deals were within standards
Refresh: Applications received from existing Allocators for a refresh of DataCap allowance

@Lind111

Lind111 commented Dec 20, 2024

Basic info

  1. Type of allocator: [manual]
  2. Paste your JSON number: [1056]
  3. Allocator verification: [yes]
  4. Allocator Application
  5. Compliance Report
  6. Previous reviews

Current allocation distribution

| Client name | DC granted |
| --- | --- |
| globalnightlight | 5 PiB |
| Art-related data | 2.92 PiB |

I. globalnightlight

  • DC requested: 8 PiB
  • DC granted so far: 8 PiB

II. Dataset Completion

aws s3 ls --no-sign-request s3://globalnightlight/
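
For reference, here is a minimal sketch of how a listing like the one above could be turned into a rough size estimate. It assumes the command is run with the `--recursive` flag, so that each object appears on its own line as date, time, size in bytes, and key. The sample lines and key names below are hypothetical placeholders, not real bucket contents, and `total_bytes` and `to_tib` are illustrative helper names.

```python
import re

# One object line of `aws s3 ls --recursive` output is assumed to look like:
#   <date> <time> <size in bytes> <key>
OBJECT_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2})\s+(\d{2}:\d{2}:\d{2})\s+(\d+)\s+(.+)$"
)


def total_bytes(listing: str) -> int:
    """Sum the size column of every object line; other lines are skipped."""
    total = 0
    for line in listing.splitlines():
        match = OBJECT_LINE.match(line.strip())
        if match:
            total += int(match.group(3))
    return total


def to_tib(num_bytes: int) -> float:
    """Convert bytes to TiB (1 TiB = 2**40 bytes)."""
    return num_bytes / 2**40


# Hypothetical sample lines, NOT real contents of the globalnightlight bucket.
SAMPLE_LISTING = """\
2024-01-01 00:00:00 1099511627776 example/a.tif
2024-01-02 00:00:00 2199023255552 example/b.tif
                           PRE example-prefix/
"""

print(f"{to_tib(total_bytes(SAMPLE_LISTING)):.2f} TiB")
```

In practice the text would come from `aws s3 ls --no-sign-request --recursive s3://globalnightlight/` rather than from `SAMPLE_LISTING`.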

III. Does the list of SPs provided and updated in the issue match the list of SPs used for deals?

The client discloses new SPs in advance when adding them.

IV. How many replicas has the client declared vs. how many have been made so far?

10 vs. 12
The client has explained the difference and will keep an eye on subsequent data distribution.
image

V. Please provide a list of SPs used for deals and their retrieval rates

| SP ID | % retrieval | Meets the >75% retrieval? |
| --- | --- | --- |
| f01969779 | 62.52% | NO |
| f02201190 | 58.94% | NO |
| f03175168 | 73.40% | NO |
| f03166677 | 71.81% | NO |
| f03166688 | 66.83% | NO |
| f02639492 | 79.26% | YES |
| f02362412 | 75.03% | YES |
| f02368946 | 0.00% | NO |
| f03166666 | 60.83% | NO |
| f03166668 | 0.00% | NO |
| f03175111 | 71.30% | NO |
| f2822222 | 35.76% | NO |
| f03253497 | 25.59% | NO |
| f03253580 | 90.69% | YES |

The retrieval-rate issue has been followed up, and no client has since been found sending deals to SPs with a 0% retrieval rate.
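
The pass/fail column in the table above can be reproduced with a short script. This is only a sketch: the rates are copied from the table in this issue, the 75% bar comes from its column header, and the names `RETRIEVAL_RATES`, `THRESHOLD`, and `meets_threshold` are illustrative rather than part of any Fil+ tooling.

```python
# Retrieval rates (percent) copied from the SP table in this issue.
RETRIEVAL_RATES = {
    "f01969779": 62.52,
    "f02201190": 58.94,
    "f03175168": 73.40,
    "f03166677": 71.81,
    "f03166688": 66.83,
    "f02639492": 79.26,
    "f02362412": 75.03,
    "f02368946": 0.00,
    "f03166666": 60.83,
    "f03166668": 0.00,
    "f03175111": 71.30,
    "f2822222": 35.76,  # ID kept exactly as written in the issue
    "f03253497": 25.59,
    "f03253580": 90.69,
}

# Assumption: the bar is strictly greater than 75%, as the header says ">75%".
THRESHOLD = 75.0


def meets_threshold(rate: float, threshold: float = THRESHOLD) -> bool:
    """Return True when a retrieval rate is strictly above the threshold."""
    return rate > threshold


passing = sorted(sp for sp, rate in RETRIEVAL_RATES.items() if meets_threshold(rate))
zero_rate = sorted(sp for sp, rate in RETRIEVAL_RATES.items() if rate == 0.0)

print(f"{len(passing)} of {len(RETRIEVAL_RATES)} SPs meet the threshold: {passing}")
print(f"SPs with a 0% retrieval rate: {zero_rate}")
```

Listing the zero-rate SPs separately makes it easier to check later CID reports for new deals sent to them.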

Allocation summary

  7. Notes from the Allocator

Issues such as copies of appeals and retrieval rates are being followed up on an ongoing basis.

  8. Did the allocator report up to date any issues or discrepancies that occurred during the application processing?

yes

  9. What steps have been taken to minimize unfair or risky practices in the allocation process?

CID reports are generated regularly to follow up on data distribution.

  10. How did these distributions add value to the Filecoin ecosystem?

They are publicly available datasets that anyone can view at any time.

  11. Please confirm that you have maintained the standards set forward in your application for each disbursement issued to clients and that you understand the Fil+ guidelines set forward in your application

yes

  12. Please confirm that you understand that by submitting this GitHub request, you will receive a diligence review that will require you to return to this issue to provide updates.

yes

@Kevin-FF-USA added labels on Dec 20, 2024: Refresh (Applications received from existing Allocators for a refresh of DataCap allowance); Awaiting Community/Watchdog Comment (DataCap Refresh requests awaiting a public verification of the metrics outlined in Allocator App).

@Kevin-FF-USA
Collaborator

Hi @Lind111,

Thanks for using the new template. Wanted to check in with timelines for you on this application.

With the holiday break coming up, the teams will be out of office until January, which means you may not see a Watchdog or Governance comment until then.

Warmly,
-Kevin

@filecoin-watchdog
Collaborator

@Lind111
SmallArt Ltd.
The client did not update all the Storage Providers (SPs) used for deals.
For most SPs on the list, there was duplication, averaging about 10% per SP. Although this percentage is not high, it should still be monitored.
Out of 7 SPs:

  • 4 SPs have a retrieval success rate of 0%,
  • 1 SP has a retrieval rate of 7%,
  • The remaining 2 SPs have a retrieval rate of 41%.

WorldBankGroup
In the previous review, it was noted that the client is working with multiple allocators simultaneously. Was this issue discussed with the client after the review?
Of 14 SPs, only 2 have a retrieval success rate at a satisfactory level.
SP performance details: SPs F03166666, F02639492, F02368946, and F02362412 have duplication rates below 20%. While this rate is relatively low, it should still be monitored.
The allocator continues to run CID reports and requests explanations from the client when discrepancies are found.

AMEstadium
The data preparation information provided by the client is inaccurate. The allocator did not ask follow-up questions, even though the client mentioned that the dataset would be prepared in tar or zip files. These formats are not suitable for storing data on Filecoin.
Although the client claimed the data had not been stored before, at least two applications have been identified with identical data samples and descriptions:

The dataset is marked as "open," but its description suggests it is private and not publicly accessible. There is no index or verification to justify the reported 650 TiB of data.

Additional issues:

  • The allocator requires at least four SPs, but the client is currently using only three.
  • SPs f03238633 and f01531188 are using a VPN, which violates the application rules.

@filecoin-watchdog added the label Awaiting Response from Allocator (If there is a question that was raised in the issue that requires comment before moving forward) and removed the label Awaiting Community/Watchdog Comment (DataCap Refresh requests awaiting a public verification of the metrics outlined in Allocator App) on Jan 13, 2025.

@Lind111
Author

Lind111 commented Jan 14, 2025

> @Lind111 SmallArt Ltd. The client did not update all the Storage Providers (SPs) used for deals.

Are you referring to updates that should be made to the application form?
The client disclosed the SPs in advance in the comments:
Lind111/EF#34 (comment)
Lind111/EF#34 (comment)

>   • 4 SPs have a retrieval success rate of 0%,
>   • 1 SP has a retrieval rate of 7%,
>   • The remaining 2 SPs have a retrieval rate of 41%.

I have been following up on the retrieval rate. So far the client has not provided a reasonable explanation, so no subsequent DC allocation will be triggered.

I'm wondering about the open source retrieval tool that the client mentioned here. Is it recognized?

> WorldBankGroup In the previous review, it was noted that the client is working with multiple allocators simultaneously. Was this issue discussed with the client after the review?

I did notice this one. I found that the client was storing different data, so I didn't ask much about it.
image

> Of 14 SPs, only 2 have a retrieval success rate at a satisfactory level.

The client explained that a database crash was the cause and that it has been fixed.

> AMEstadium The data preparation information provided by the client is inaccurate. The allocator did not ask follow-up questions, even though the client mentioned that the dataset would be prepared in tar or zip files. These formats are not suitable for storing data on Filecoin.

My understanding is that the client compresses the data into a tar or zip file and then uses an appropriate tool to make a CAR file for storage.

> Although the client claimed the data had not been stored before, at least two applications have been identified with identical data samples and descriptions:
>
> The dataset is marked as "open," but its description suggests it is private and not publicly accessible. There is no index or verification to justify the reported 650 TiB of data.

This client has only just started applying and is currently in the second round. I'll follow up on those questions.

> Additional issues:
>
>   • The allocator requires at least four SPs, but the client is currently using only three.

The latest report now shows 4 SPs spread across 3 different regions.

>   • SPs f03238633 and f01531188 are using a VPN, which violates the application rules.

The client has explained this and provided supporting documentation in the comments.

@filecoin-watchdog Finally, thank you for your review. If there are any other issues, please feel free to point them out. Thanks again!

@Lind111
Author

Lind111 commented Jan 14, 2025

>   • SPs f03238633 and f01531188 are using a VPN, which violates the application rules.

The client has explained this and provided supporting documentation in the comments.

@filecoin-watchdog I would like to know how you check for VPN use, and to confirm whether the client's response proves that he is not using a VPN.

@filecoin-watchdog
Collaborator

@Lind111
One additional observation regarding the AMEstadium client:

I retrieved a random piece from their dataset. Fortunately, the data is retrievable; however, the content of piece baga6ea4seaqibmmmvxms6k5uw7qfborr7jb5zvevu7zs2heicrhl3b4q26vwsdq from SP f03238633 consists of screen recordings of someone playing Subway Surfers. Below are two example files from this piece:

image

Questions to consider:

  1. Is this really the data that was declared to be stored?
  2. Does this data hold significant value for humanity as claimed?

@filecoin-watchdog
Collaborator

> I'm wondering about the open source retrieval tool that the client mentioned here (https://github.com/Lind111/EF/issues/34#issuecomment-2563410039). Is it recognized?

I don't recognize this tool. You can always ask the client what they are using. It also looks like the client might not understand how that works. I recommend checking the official Spark documentation: https://docs.filspark.com/troubleshooting-miner-score#block-3c41a58cb03f4a8593924d0af9e8800b


> I did notice this one. I found that the client was storing different data, so I didn't ask much about it.

In the previous review, Galen said:

> Overall good diligence and compliant behavior. It would be good to see how this allocator is investigating clients that are working with multiple allocator pathways. It is reasonable for a client or data preparer to be working with multiple teams, but there should be investigation, diligence, transparency, and justification for that behavior. (...)

This is why I asked whether you did any follow-up after the third review.


> My understanding is that the client compresses the data into a tar or zip file and then uses an appropriate tool to make a CAR file for storage.

Was this confirmed with the client, or is it your assumption?


> I would like to know how you check for VPN use, and to confirm whether the client's response proves that he is not using a VPN.

For SP f03238633: https://www.ipqualityscore.com/vpn-ip-address-check/lookup/103.25.202.111
For SP f01531188: https://www.ipqualityscore.com/vpn-ip-address-check/lookup/47.242.91.210

@Lind111
Author

Lind111 commented Jan 15, 2025

@bashyang Thanks for the clarification.

@Lind111
Author

Lind111 commented Jan 15, 2025

@filecoin-watchdog I'll follow up on these omissions. I've benefited greatly from this, and thank you again for your detailed review.

@filecoin-watchdog
Collaborator

@Lind111 Would you like to respond to my last comments, or do you feel everything has been explained?

@Lind111
Author

Lind111 commented Jan 16, 2025

The client account has been banned, and its comments as well as the request forms have disappeared.
The following was clarified previously:

Image

@Lind111
Author

Lind111 commented Jan 16, 2025

@filecoin-watchdog Sorry, I was waiting for your reply.

@filecoin-watchdog
Collaborator

@Lind111 I didn't see the above comment before it disappeared, hence the confusion.
Well, I guess there is nothing more to say about AMEstadium.
If any DC needs to be revoked from this client, please notify the gov team.

@filecoin-watchdog added the label Diligence Audit in Process (Governance team is reviewing the DataCap distributions and verifying the deals were within standards) and removed the label Awaiting Response from Allocator (If there is a question that was raised in the issue that requires comment before moving forward) on Jan 16, 2025.

@Lind111
Author

Lind111 commented Jan 16, 2025

This is his last report, along with documents proving his geographic location.

Image

Image

@Lind111
Author

Lind111 commented Jan 16, 2025

> If any DC needs to be revoked from this client, please notify the gov team.

Got it. Thank you.

I checked and saw that the client has already used up the second round of DC.

Image

@Kevin-FF-USA
Collaborator

Hi @Lind111,

Thanks for submitting this application for refresh.
Wanted to send you a friendly update: as this works its way through the system, you should see a comment from Galen on behalf of the Governance team this week. If you have any questions or need support until then, please let us know.

Warmly,
-Kevin

4 participants
@filecoin-watchdog @Kevin-FF-USA @Lind111 and others