A common shenanigan we have observed in NFT launches: project insiders mint unrevealed NFTs to themselves and then, before the reveal phase, alter the metadata of the NFTs they hold so that they end up with rare ones. An NFT in the top 1% by rarity is normally worth several times more than the average NFT in a collection, which can make this shenanigan quite lucrative.
Up until now, we’ve been detecting these using a Kolmogorov–Smirnov (K-S) test; however, I think it’s not sensitive enough in some cases. I have some ideas about how to develop a better test.
Developing a statistically robust test that is more sensitive than the K-S test is a priority.
The shenanigan scanning workflow is roughly as follows:
- Download metadata for every NFT in a collection.
- Using the metadata, calculate the rarity rank of each NFT in the collection. At this point we can make a “rarity map” as shown in the figures below. (Sometimes we can replace the first two steps by downloading the data directly from a rarity ranking site such as rarity.tools or raritysniffer.com.)
- Obtain minting data for each NFT in the collection. Now we have an array with 3 main columns: [Token ID, Rarity Rank, Minter] (there are more columns, but these are the main ones).
- Get the rarity ranks of the NFTs minted by each account. Compare the ranks of the NFTs minted by an account to a uniform distribution using the K-S test. A sufficiently low p-value lets us reject the null hypothesis, which is that the NFTs minted by the account were drawn at random from the collection.
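The last step of the workflow can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not our production code: the function names `ks_uniform` and `scan_accounts`, the `min_mints` cutoff, and the `(token_id, rarity_rank, minter)` row layout are all assumptions made for the example. The p-value uses the asymptotic Kolmogorov series with a small-sample correction, so it is only approximate (rarity ranks are discrete and drawn without replacement, which the K-S test does not strictly model). With SciPy installed, `scipy.stats.kstest` would be the more usual choice.

```python
import math
from collections import defaultdict


def ks_uniform(ranks, collection_size):
    """One-sample K-S test of rarity ranks against a uniform distribution.

    Returns (D, p). The p-value is an asymptotic approximation.
    """
    n = len(ranks)
    # Map ranks 1..N into (0, 1) so the reference CDF is simply F(x) = x.
    xs = sorted((r - 0.5) / collection_size for r in ranks)
    # Largest gap between the empirical CDF and the uniform CDF.
    d = max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))
    # Small-sample correction applied before the Kolmogorov series.
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        # The series converges slowly here; the true p-value is ~1.
        return d, 1.0
    p = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, 101)
    )
    return d, min(max(p, 0.0), 1.0)


def scan_accounts(rows, collection_size, min_mints=5):
    """rows: iterable of (token_id, rarity_rank, minter) tuples.

    Returns {minter: (D, p)} for accounts with at least `min_mints` mints.
    """
    by_minter = defaultdict(list)
    for _token_id, rank, minter in rows:
        by_minter[minter].append(rank)
    return {
        minter: ks_uniform(ranks, collection_size)
        for minter, ranks in by_minter.items()
        if len(ranks) >= min_mints
    }
```

Accounts with the smallest p-values are the ones worth inspecting by hand. Because every account in a collection gets tested, some correction for multiple comparisons is probably needed before reading too much into any single p-value.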
Example K-S test that isn’t working well:
The array at the beginning shows the rarity ranks of the NFTs minted by an account. The same account minted 10 of the top 50 ranked NFTs in a collection of 3179 (52 NFTs minted by the account in total). Here, I use the K-S test, but it doesn’t give a very strong result: the p-value is relatively high even though the pattern is clearly anomalous. This is because the account also minted 42 other NFTs, which were probably randomly distributed. (Below) Rarity map for the entire collection. Note the cluster in the lower left corner (token IDs 1-50).
Silopete117 has made a website with hundreds of rarity maps.
Rarity map showing the “rarity rank” for each NFT as a function of token ID. Note that NFTs are almost always minted sequentially by token ID.
Source: silopete117’s NFT rarity distributions
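To put a number on how anomalous the example account above is, one option (not the test we currently run, just an illustration) is a hypergeometric tail probability: if 52 NFTs were drawn at random without replacement from a collection of 3179, how likely is it that 10 or more of them land in the top 50? The function name `top_k_tail_probability` and its parameter names are made up for this sketch.

```python
from fractions import Fraction
from math import comb


def top_k_tail_probability(collection_size, top_k, minted, hits):
    """P(X >= hits), where X counts how many of `minted` NFTs drawn at random
    (without replacement) fall within the `top_k` rarest of the collection."""
    total = comb(collection_size, minted)
    upper = min(top_k, minted)
    tail = sum(
        comb(top_k, x) * comb(collection_size - top_k, minted - x)
        for x in range(hits, upper + 1)
    )
    # Exact integer arithmetic first, converted to float only at the end.
    return float(Fraction(tail, total))


# Numbers from the example: 10 of the top 50 among 52 mints, collection of 3179.
p = top_k_tail_probability(collection_size=3179, top_k=50, minted=52, hits=10)
expected_hits = 52 * 50 / 3179  # fewer than one top-50 NFT expected by chance
print(f"expected top-50 hits: {expected_hits:.2f}, tail probability: {p:.3g}")
```

The resulting probability is far below any conventional significance threshold, whereas the K-S test is diluted by the other 42 mints. The caveat is that the cutoff (top 50) was picked after looking at the data, so this overstates the significance; a real test would have to fix the cutoff in advance or correct for trying several.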
The rough idea:
We need a better statistical test. I think some variation of the Parking Lot Test, best known for its use in verifying pseudorandom number generators, might be a good starting point. I really want to do this myself, and I am pretty sure it will yield interesting results, but I have a very long backlog of other things to work on. Basically, we’re looking for anomalous clusters of NFTs that were all minted by the same account and have low rarity ranks (i.e. are among the rarest in the collection).
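As one possible starting point (this is not the Parking Lot Test, just a simple cluster scan to make the idea concrete), the sketch below splits the collection into runs of consecutive token IDs minted by the same account and scores each run by how far its mean rarity rank sits below what random assignment would give. It assumes the rows cover the whole collection, that ranks are 1..N without ties, and that rank 1 is the rarest. The name `run_z_scores`, the `min_run` threshold, and the row layout are hypothetical.

```python
import math
from itertools import groupby


def run_z_scores(rows, min_run=5):
    """rows: (token_id, rarity_rank, minter) tuples for the whole collection.

    Returns (minter, first_token_id, last_token_id, run_length, z) tuples,
    most negative z (most suspiciously rare run) first.
    """
    rows = sorted(rows)  # order by token ID
    n_total = len(rows)
    pop_mean = (n_total + 1) / 2  # mean of ranks 1..N
    scored = []
    # groupby on the sorted rows yields runs of consecutive token IDs
    # that share the same minter.
    for minter, group in groupby(rows, key=lambda r: r[2]):
        run = list(group)
        m = len(run)
        if m < min_run or m == n_total:
            continue
        mean_rank = sum(r[1] for r in run) / m
        # Variance of the mean of m ranks drawn without replacement from 1..N.
        var = (n_total + 1) * (n_total - m) / (12 * m)
        z = (mean_rank - pop_mean) / math.sqrt(var)
        scored.append((minter, run[0][0], run[-1][0], m, z))
    return sorted(scored, key=lambda t: t[4])
```

A strongly negative z-score marks a block of sequentially minted token IDs from one account that is much rarer than chance would suggest. Since many runs are scored per collection, the z-scores would need a multiple-comparisons correction before being treated as evidence, and an account that spreads its mints over several short runs would slip past this particular sketch.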
If anybody has any ideas for improved statistical tests, let me know. Happy to bounce ideas around, and obviously we have a lot of data to run the tests on.