Add statistical tests to battery of tests - Parking Lot Test?

A common shenanigan that we have observed in NFT launches is that project insiders will mint unrevealed NFTs to themselves and then, prior to the reveal phase, alter the metadata of the NFTs they hold to give themselves rare NFTs. An NFT in the top 1% by rarity will normally be worth severalfold more than the average NFT in a collection, which can make this shenanigan quite lucrative.

Up until now, we’ve been detecting these using a Kolmogorov–Smirnov (K-S) test; however, I think it’s not sensitive enough in some cases. I have some ideas about how to develop a better test.
Developing a statistically robust test that is more sensitive than the K-S test is a priority.

The shenanigan-scanning workflow is roughly as follows:

  1. Download metadata for every NFT in a collection.
  2. Using the metadata, calculate rarity rank for each NFT in a collection. At this point we can make a “rarity map” as shown in the figures below. (Sometimes we can replace steps 1 and 2 by downloading the data directly from a rarity ranking site such as rarity.tools or raritysniffer.com)
  3. Obtain minting data for each NFT in a collection. Now we have an array with 3 main columns: [Token ID, Rarity Rank, Minter] (there’s more columns but these are the main ones).
  4. Get the rarities of each NFT minted by each account. Compare the rarity ranks of the NFTs minted by an account to a uniform distribution using the K-S test. This lets us reject the null hypothesis that the NFTs minted by an account were drawn uniformly at random from the collection.
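Step 4 above can be sketched roughly like this. To be clear, this is a minimal sketch and not our actual pipeline: the `token_ids`/`ranks`/`minters` arrays and the `0xminter…` addresses are hypothetical stand-ins for the columns from step 3, and `scipy.stats.kstest` is used to compare each account's normalized ranks against the uniform distribution:

```python
from collections import defaultdict
from scipy.stats import kstest

# Hypothetical stand-ins for the columns from step 3:
# token_ids[i], ranks[i], minters[i] describe one NFT.
collection_size = 100
token_ids = list(range(1, collection_size + 1))
ranks = list(range(1, collection_size + 1))            # rarity rank, 1 = rarest
minters = [f"0xminter{i % 10}" for i in range(collection_size)]

# Group rarity ranks by minter.
ranks_by_minter = defaultdict(list)
for rank, minter in zip(ranks, minters):
    ranks_by_minter[minter].append(rank)

# For each account, test its normalized ranks (rank / collection size)
# against the uniform distribution on [0, 1].
results = {}
for minter, minted_ranks in ranks_by_minter.items():
    normalized = [r / collection_size for r in minted_ranks]
    stat, pvalue = kstest(normalized, "uniform")
    results[minter] = (stat, pvalue)
```

Accounts whose p-value is very small are the ones worth a closer look.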

Example K-S test that isn’t working well:


The array at the beginning shows the rarity ranks of the NFTs minted by an account. 10 of the top 50 ranked NFTs in a collection of size 3179 were minted by the same account (52 NFTs minted by the account in total). Here, the K-S test doesn’t give a very strong result: it returns a relatively high p-value despite the pattern being clearly anomalous. This is because the account also minted 42 NFTs which were probably randomly distributed. (Below) Rarity map for the entire collection. Note the cluster in the lower left corner (token IDs 1–50).
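To make the failure mode concrete, here is a hedged reconstruction of that scenario with synthetic data (these are not the actual ranks from the screenshot): 10 ranks planted in the top 50 of a 3179-NFT collection, diluted by 42 ranks spread evenly across the whole range to stand in for the random-looking mints.

```python
from scipy.stats import kstest

collection_size = 3179

# Synthetic stand-in for the anomalous account: 10 NFTs in the top 50...
top_cluster = [3, 7, 12, 18, 22, 25, 31, 38, 44, 49]
# ...plus 42 evenly spaced ranks approximating a uniform background.
background = [round(75 + i * (collection_size - 75) / 41) for i in range(42)]

normalized = sorted(r / collection_size for r in top_cluster + background)
result = kstest(normalized, "uniform")

# The 42 near-uniform mints dilute the signal from the top-50 cluster,
# so the combined p-value is far weaker than the cluster alone warrants:
cluster_only = kstest([r / collection_size for r in top_cluster], "uniform")
```

Running the test on `top_cluster` alone gives a vanishingly small p-value, while the full 52-NFT sample does not come close to that strength, which is exactly the dilution problem described above.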

Silopete117 has made a website with hundreds of rarity maps.


Rarity map showing the “rarity rank” for each NFT as a function of token ID. Note that NFTs are almost always minted sequentially by token ID.
Source: silopete117’s NFT rarity distributions

The rough idea:

We need a better statistical test. I think some variation of the Parking Lot Test, best known for its use in verifying pseudorandom number generators, might be a good starting point. I really want to do this myself, and I’m pretty sure it would yield interesting results, but I have a very long backlog of other things to work on. Basically, we’re looking for anomalous clusters of NFTs that were all minted by the same account and have unusually low rarity ranks.
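One possible direction, sketched under the assumption that we treat each account’s mints as points in the unit square (normalized token ID vs. normalized rarity rank): count pairs of points closer than some radius and compare to the expectation under uniformity, which is roughly C(n,2)·πr² for small radii if edge effects are ignored. This is a close-pair statistic inspired by the Parking Lot Test rather than the classic test itself, and the radius choice and edge correction would need real calibration:

```python
import math

def close_pair_excess(points, radius):
    """Observed and expected number of point pairs within `radius`,
    assuming points lie in the unit square and, under the null
    hypothesis, are i.i.d. uniform (edge effects ignored)."""
    n = len(points)
    observed = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) < radius
    )
    # The probability that two uniform points fall within `radius` of
    # each other is roughly pi * radius**2 for small radii.
    expected = n * (n - 1) / 2 * math.pi * radius**2
    return observed, expected

# Hypothetical example: a tight diagonal cluster (e.g. low token IDs with
# low rarity ranks) yields far more close pairs than the uniform null.
clustered = [(0.01 * i, 0.01 * i) for i in range(10)]
observed, expected = close_pair_excess(clustered, radius=0.05)
```

An observed count far above the expected count flags the kind of same-account cluster we’re after; turning the excess into a calibrated p-value is the part that still needs work.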

If anybody has any ideas for improved statistical tests, let me know. Happy to bounce ideas around, and obviously we have a lot of data to run the tests on.

OUT these abusers. Begin by publishing the Kolmogorov–Smirnov test results. Record the addresses of those who abuse. Publish the test results. Undoubtedly new statistical calculations will occur to you after observation.

Without honesty and transparency the public market collapses. The NFT game is not a game for insiders to abuse public marketplace trust. If these shenanigans are allowed to continue, we are all in trouble.


OUT these abusers. Begin by publishing the Kolmogorov–Smirnov test results. Record the addresses of those who abuse. Publish the test results

We actually used to have a website maintained by Fun that had the K-S test results, mostly automated. It’s not actively maintained right now but if someone wants to resurrect it we’d throw big streaming bounties that way. (And we can hire a designer and potentially a front-end dev to make it look really slick).

Gotta be careful with the K-S test results though. It’s not always insiders – sometimes it’s just people using our code, or code similar to ours, on leaky metadata endpoints.

Update on the stats side: flyingfish brought up complete spatial randomness in the Discord channel – I’m pretty sure I learned about this like 14 years ago and then completely forgot about it. But this is the methodology we’ll want.

Looking into implementing it now.
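For anyone following along, one standard complete-spatial-randomness check is the Clark–Evans nearest-neighbour ratio: the observed mean nearest-neighbour distance divided by its CSR expectation 1/(2√λ), where λ is the point density. R ≈ 1 suggests randomness, R < 1 clustering, R > 1 regularity. A minimal sketch, assuming points normalized to the unit square and ignoring the edge corrections a real implementation would need:

```python
import math

def clark_evans_ratio(points, area=1.0):
    """Clark–Evans nearest-neighbour ratio for points in a region of the
    given area. R ~ 1 under complete spatial randomness, R < 1 for
    clustered patterns, R > 1 for regular ones. Edge effects ignored."""
    n = len(points)
    nn_distances = [
        min(math.dist(p, q) for q in points if q is not p) for p in points
    ]
    observed_mean = sum(nn_distances) / n
    density = n / area
    expected_mean = 1.0 / (2.0 * math.sqrt(density))  # CSR expectation
    return observed_mean / expected_mean

# Hypothetical examples: a regular 5x5 grid should read as "more regular
# than random" (R > 1)...
grid = [(0.1 + 0.2 * i, 0.1 + 0.2 * j) for i in range(5) for j in range(5)]
# ...while the same 25 points squeezed into one corner read as clustered.
corner = [(0.02 * i, 0.02 * j) for i in range(5) for j in range(5)]
```

Applied per account to (normalized token ID, normalized rarity rank) points, a ratio well below 1 would flag the same clusters the K-S test struggles with.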