ZORASCAN:data

OVERVIEW

ZORASCAN is being built to provide real-time data, historical data, analytics, search, and interaction services for Zora smart contract users.

Base Data is the first level of functionality needed to create ZORASCAN and so it is the first service to be made available for use. This page will move to the documentation section as the user interface emerges.

BASE DATA

ZORASCAN defines base data as the data that is defined by any Zora-related smart contract. This can include the Zora smart contract itself, ERC20 tokens used in Zora transactions, or any new contracts that call the ERC721+ Zora smart contract. Base data comes from the Ethereum and IPFS peer-to-peer networks.

Ethereum Data

Ethereum nodes maintain the full history of all mined Ethereum blocks so that they can verify that their view of the blockchain data is correct, and use this verified data to maintain a correctly derived view of the Ethereum virtual machine. It is the Ethereum virtual machine that maintains the smart contract information about tokens such as the Zora ERC721 non-fungible tokens (NFTs) and the ERC20 fungible tokens, such as WETH and DAI, that are used to purchase them.

Ethereum history is represented as a blockchain: a list of mined blocks that are agreed over time to be acceptable to a quorum of network peers. A verified block is simply a list of transactions that the successful miner of the block has accepted for it. These transactions are, in broad terms, either an Ethereum value transfer, a smart contract creation, or a message sent to an existing smart contract address. The Ethereum blockchain is therefore the ordered list of all transactions contained in all the blocks that a quorum of peers has accepted as validly mined.

Ethereum nodes also monitor the network in "real time" for notification of successfully mined blocks, in order to move their view of the blockchain forward and to participate with other peers in the quorum process.

The WEB3 RPC protocol allows node user programs to query all existing blocks and to subscribe to notification events that fire when the node receives new blocks from the peer-to-peer network. This is the mechanism that ZORASCAN uses to establish and maintain its view of Zora protocol data. ZORASCAN scans blockchain history to get itself up to date and reconcile its position, and then waits for new block events to initiate its own processes.

IPFS Data

IPFS is simply a peer-to-peer file system. ZORASCAN attempts to download the files that are arguments to a Zora mint call from the source defined in the URI in the smart contract call. If that fails, it queries its own IPFS node using the content id parsed from the URI. ZORASCAN archives on its own server all the data that it can find.
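The download-then-fall-back behaviour described above could be sketched roughly as follows. This is only an illustration, not ZORASCAN's actual code: the function names, the regular expression used to find a content id, and the two fetch callables (one for the original URI, one for a local IPFS node) are all assumptions made for the example.

```python
import re
from typing import Callable, Optional

# Assumed URI shapes: "ipfs://<cid>", "ipfs://ipfs/<cid>", and ".../ipfs/<cid>".
_CID_PATTERN = re.compile(r"(?:ipfs://(?:ipfs/)?|/ipfs/)([A-Za-z0-9]+)")


def extract_cid(uri: str) -> Optional[str]:
    """Return the content id embedded in a URI, or None if none is found."""
    match = _CID_PATTERN.search(uri)
    return match.group(1) if match else None


def fetch_with_fallback(
    uri: str,
    fetch_uri: Callable[[str], bytes],
    fetch_cid: Callable[[str], bytes],
) -> Optional[bytes]:
    """Try the URI from the mint call first, then an IPFS lookup by content id."""
    try:
        return fetch_uri(uri)
    except OSError:
        pass  # Source unreachable; fall through to the IPFS lookup.
    cid = extract_cid(uri)
    if cid is None:
        return None  # Nothing to look up on IPFS.
    try:
        return fetch_cid(cid)
    except OSError:
        return None
```

Passing the two fetchers in as callables keeps the sketch independent of any particular HTTP or IPFS client library. Network errors are assumed to surface as OSError, which is what the Python standard library's urllib raises.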

ZORASCAN Data

ZORASCAN maintains a historical database of the base data it creates and provides a data feed to notify users when changes are made to the database. Both of these features use the same process:

The ZORASCAN Database

The ZORASCAN Database is available via RESTful HTTP GET requests at https://zorascan.io/data/base.

The directory structure is as follows:

The ZORASCAN server is set up to allow a large number of requests per keep-alive connection (a large MaxKeepAliveRequests value) so that users can make multiple GET requests over the same connection. This is more efficient than websocket RPC requests for static data, as it removes the need for server/client data framing and concurrency management.
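As a rough illustration of reusing one connection for several GET requests, here is a minimal sketch using Python's standard http.client module. The function name fetch_many is made up for the example, and any paths passed to it are the caller's responsibility, since the real directory layout is not shown on this page.

```python
import http.client
from typing import Dict, Iterable


def fetch_many(conn: http.client.HTTPConnection, paths: Iterable[str]) -> Dict[str, bytes]:
    """Issue several GET requests over one persistent HTTP connection."""
    results: Dict[str, bytes] = {}
    for path in paths:
        conn.request("GET", path, headers={"Connection": "keep-alive"})
        response = conn.getresponse()
        # The body must be read in full before the connection can be reused.
        body = response.read()
        if response.status == 200:
            results[path] = body
    return results
```

A caller would create a single http.client.HTTPSConnection("zorascan.io"), pass it to fetch_many with a list of paths under /data/base, and close it afterwards. Paths that do not return status 200 are simply left out of the result.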

The ZORASCAN Data Feed

The ZORASCAN data feed is a notification service that is designed specifically for feeding data to ZORASCAN analytics and search functions. It is also available for users.

The ZORASCAN data feed is designed to be near real-time, lossless, and recoverable. Near real-time means that a notification is sent as soon as possible after receiving a new block event from an Ethereum node. Lossless means that even if a node drops an event or the server is interrupted, the ZORASCAN data feed will detect the error and back-fill history before moving back to real time. Recoverable means that a client that restarts a data feed session is guaranteed to receive all messages, including when the inevitable gremlin turns up and bites the wire.

The way ZORASCAN achieves this is to maintain a persistent linked list of the completion states of all data processing events that it completes. The lists are organised into a dependency tree where lower nodes are triggered to start on the completion of higher nodes. The leaf nodes in the tree are the lists that are available to be posted to users on a websocket; each of these is called a feed-line. A ZORASCAN client session reads the feed-line until it reaches the end, at which point the websocket stays in a wait state until a new link is added to the feed-line. As the feed-line lists are persistent, a client can simply start at the head of the list and read the feed-line until it is in the wait state. Once in this state the client is guaranteed to have received the total history of the feed-line, and will receive new notifications when they happen.
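The dependency tree of completion lists might look something like the sketch below. It is a simplified, in-memory stand-in written for illustration: the class name ProcessNode, its methods, and the use of a Python list in place of a persistent linked list are all assumptions, not ZORASCAN's implementation.

```python
from typing import Callable, List


class ProcessNode:
    """A processing step that keeps an append-only list of completion states."""

    def __init__(self, name: str, work: Callable[[object], object]) -> None:
        self.name = name
        self.work = work
        self.links: List[object] = []  # stand-in for the persistent list
        self.children: List["ProcessNode"] = []

    def add_child(self, child: "ProcessNode") -> "ProcessNode":
        """Attach a lower node that starts when this node completes."""
        self.children.append(child)
        return child

    def complete(self, event: object) -> None:
        """Run this node's work, record the result, then trigger the children."""
        result = self.work(event)
        self.links.append(result)
        for child in self.children:
            child.complete(result)

    def is_feed_line(self) -> bool:
        """Leaf nodes are the lists that would be exposed to clients."""
        return not self.children
```

For example, a poll node with a scan child and a base child under that gives a three-level tree: each call to complete on the poll node appends one link to every list on the way down, and only the base node counts as a feed-line.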

ZORASCAN ensures that the feed-line is lossless by driving it from a poll process, the poll node, that emits real-time node events on receipt of a block from an Ethereum node. The poll node is then gated by a child node called a scan node, which checks whether the last block received from the poll node is an increment of 1 from the last block seen by the scan node. If it is, the scan node simply records it, which triggers the next stage of the process. If it isn't, the scan node scans the blockchain history from the last block it recorded and records each block until it catches up to the latest poll node block number it sees (the poll node keeps advancing concurrently while the scan node is scanning history, so it's a race to catch up). Both the poll node and the scan node write their data to the 0/ directory in the database.
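The gap check performed by the scan node can be sketched as below. This is a single-threaded simplification under stated assumptions: the ScanNode class, the fetch_block callable (a hypothetical history lookup by block number), and the dict-with-"number" block shape are invented for the example, and the concurrent race with the poll node is not modelled.

```python
from typing import Callable, List


class ScanNode:
    """Gates poll events so that recorded block numbers never skip."""

    def __init__(self, fetch_block: Callable[[int], dict], last_seen: int) -> None:
        self.fetch_block = fetch_block  # hypothetical lookup of a historical block
        self.last_seen = last_seen      # number of the last block already recorded
        self.recorded: List[dict] = []

    def on_poll_block(self, block: dict) -> None:
        """Record a block from the poll node, back-filling any gap first."""
        number = block["number"]
        if number <= self.last_seen:
            # Assumption: repeated or stale notifications are ignored.
            return
        # If number == last_seen + 1 this range is empty and nothing is back-filled.
        for missing in range(self.last_seen + 1, number):
            self._record(self.fetch_block(missing))
        self._record(block)

    def _record(self, block: dict) -> None:
        self.recorded.append(block)
        self.last_seen = block["number"]
```

With last_seen at 100, a poll event for block 101 is recorded directly, while a following event for block 105 causes blocks 102 to 104 to be fetched from history and recorded first. Chain reorganisations are outside the scope of this sketch.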

ZORASCAN feed-line sessions are stateless: they simply read the feed-line from the link the client asks to start from when it opens a feed-line session. This can be one of "head" (all history), "tail" (just new events), or any link number in between. Link numbers are included in each feed-line message received, so the client can save its own restart state.
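The start-point rule can be illustrated with a small sketch. It assumes that link numbers are zero-based positions in the feed-line and that a session starting at link N includes link N itself; neither detail is confirmed here, and the function names are hypothetical.

```python
from typing import List, Tuple, Union


def resolve_start(start: Union[str, int], length: int) -> int:
    """Map "head", "tail", or a link number to a position in the feed-line."""
    if start == "head":
        return 0
    if start == "tail":
        return length
    link = int(start)
    if not 0 <= link <= length:
        raise ValueError(f"link {link} is outside 0..{length}")
    return link


def read_session(feed_line: List[str], start: Union[str, int]) -> List[Tuple[int, str]]:
    """Return (link number, message) pairs from the start point to the end."""
    first = resolve_start(start, len(feed_line))
    return [(link, feed_line[link]) for link in range(first, len(feed_line))]
```

Under these assumptions, a client that saved the last link number it received would restart from that number plus one, so that no message is repeated or missed.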

The ZORASCAN base data feed-line is at https://zorascan.io/feed/base/[/head|tail|link-nmbr]

Demo

after I've had a sleep