Hi everyone. I wanted to ask the community for feedback on my open web fellowship application (see below). Thank you in advance!
name: Jakub Smekal
email: [email protected]
location: Durham, United Kingdom
timezone: GMT
discordID: #3317
discourseID: smejak
current role: student, Natural Sciences, Durham University, 2nd year
mentor: Shady El Damaty
category: token-community
The goal of this project is to research value representation in a decentralized science ecosystem that incentivizes active participation in high-quality scientific research and the community that benefits from it.
In area-specific decentralized systems such as open science, one of the main problems is the representation of value for the different goods and services that are used/produced within them. For instance, in open science, storage of data and computation act as commodities that need to be properly represented in order to provide the incentive for trade. As a result, a decentralized ecosystem needs a token that appropriately represents the underlying value of all assets. Furthermore, such a system needs to efficiently record all participants’ contributions and accordingly distribute rewards, therefore it’s important to identify what contributions are desirable, how to reward them and in what time frame.
Having identified this problem, I propose a project exploring different open science ecosystems by simulating different combinations of stakeholders, the actions they can take, and the rewards they would receive for actively participating in the community.
This project will identify the key commodities in a decentralized science ecosystem, which will subsequently be used to construct the system's objective function. The objective function is essentially the goal of the system as a whole: it determines the actions that should be taken by different stakeholders and the rewards that each stakeholder receives for participating. For further details, I recommend this article from the Ocean Protocol.
Using the TokenSPICE simulator, I will design and run simulations for different token ecosystems with multiple agents to test the system's longevity and expansive potential. For this, I will need to define the goals of a decentralized science system and the corresponding KPIs that will be used to measure them. TokenSPICE already offers a wide variety of agents that can be used to model different token systems.
The data from the simulations can be used to create different charts depicting the aggregate income/expenditure for the different tasks specified. An example from a simulation on web3 sustainability loop is shown below.
The TokenSPICE simulator can be found here.
The implementation will go as follows:
- identify the objective and reward functions for an open science ecosystem (will be used as a metric to track the success of the open science token community)
- identify stakeholders and their potential contributions
- represent reward for contributing as a token/multiple tokens
- either find existing TokenSPICE agents or develop new ones that correspond to all stakeholders
- develop a netlist connecting all agents and giving them appropriate rewards at each step
- run the simulations with all netlists
- identify the optimal token community based on constraints defined in the objective function.
Steps 3-6 can be repeated several times.
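As a rough illustration of how these steps fit together, here is a minimal agent-and-netlist loop in plain Python. This is a sketch of the general pattern only, not the actual TokenSPICE API; the class names, the single `Researcher` stakeholder, and the fixed per-step reward are all assumptions for illustration.

```python
# Minimal sketch of an agent/netlist simulation loop (NOT the TokenSPICE API).
# Each agent holds a token balance; the "netlist" is just the list of agents
# plus the shared state they read their rewards from at every simulation step.

class Agent:
    def __init__(self, name, tokens=0.0):
        self.name = name
        self.tokens = tokens

    def take_step(self, state):
        """Each stakeholder type overrides this with its own behaviour."""

class Researcher(Agent):
    def take_step(self, state):
        # Hypothetical behaviour: contribute work and receive a fixed
        # per-step reward (a stand-in for a real reward function).
        self.tokens += state["reward_per_contribution"]

def run_netlist(agents, n_steps, reward_per_contribution=1.0):
    state = {"reward_per_contribution": reward_per_contribution}
    for _ in range(n_steps):
        for agent in agents:
            agent.take_step(state)
    return {a.name: a.tokens for a in agents}

print(run_netlist([Researcher("alice"), Researcher("bob")], n_steps=10))
# {'alice': 10.0, 'bob': 10.0}
```

In the real project, the reward function would be derived from the system's objective function, and each stakeholder identified in the steps above would get its own agent class.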
The minimum outcome of this research will be a report on the findings from the various token community simulations. The report should serve as not only an overview of the possible token ecosystems for optimal research incentives, but also as a guide for anyone wanting to conduct similar research, since TokenSPICE is a relatively new tool and still under development.
The first aim of this project is to concretely define the objective function of an open science system and develop different multi-agent communities centered around a token. Furthermore, this project will look into tokens as a representation of value for different commodities, such as computing or storage, to determine whether it is better to have a single token for everything, or perhaps different tokens providing different incentives.
Throughout the project, I will use the TokenSPICE simulator, written in the Python programming language. This research will encompass the development of new agent classes for the TokenSPICE simulator and new netlists, which define how the agents are connected within the ecosystem. An example of an ecosystem from the Ocean Protocol and a possible open science community is shown below.
The results of the research will be interpreted with KPIs that are to be determined at the beginning of the project. For a decentralized science community, some indicators of a successful token system might include the number of research papers published (measured through a staking mechanism), the income of researchers, time (how long does the community thrive? Does the income of researchers vary?), etc.
The next aim of the project is to extend the research of multi-agent cryptoeconomic communities further, develop new systems based on the feedback from the initial simulations and perhaps provide feedback to the TokenSPICE community where appropriate. After this project has finished, its conclusion should be a detailed report on my findings.
| # | Deliverable |
| --- | --- |
| 1 | Get familiar with TokenSPICE and begin the development of the first token ecosystem |
| 2 | Schema of first token ecosystem, architecture design, development of agents |
| 3 | Set KPIs, develop SimStrategy and reward functions, develop the first token ecosystem (netlist) |
| 4 | Finish development of first ecosystem (+ debugging), evaluate the results of the first simulation |
| 5 | Begin developing new agent classes for new simulation (increased complexity with DataTokens), schema of new token ecosystems, think about sources of value/commodities |
| 6 | Development of EVM-in-the-loop simulation, create presentation of current results, mid-term performance review / community feedback |
| 7 | Finish development of EVM-in-the-loop opsci simulation (+ debugging) and run the simulation |
| 8 | Analysis of results, points for improvement |
| 9 | Graph of all simulations run so far with their respective parameters, work on design of the final token ecosystem based on previous experiments |
| 10 | Development of the final simulation netlist |
| 11 | Finish development of final simulation, evaluate results, put all results together for the final project report |
| 12 | Final project report and presentation; submit final report + code/materials on Discourse #pulse |
Communication with the mentor will take place online, primarily via Zoom and Discord. Regular meetings would initially be held weekly and in the second half of the fellowship on a bi-weekly basis. Furthermore, I would regularly update my mentor through text and additional meetings may take place if they are required.
I am extremely interested in building a more efficient and fair science community, and I would very much like to keep contributing to it even after this fellowship.
I believe science will greatly benefit from the adoption of blockchain technology and decentralization in general. Opscientia's mission strongly coincides with my personal values, and modeling token ecosystems can provide real insight into how to develop the best possible decentralized science community.
I am an undergraduate mathematics and physics student, and I expect these fields will help me in solving the problem of modeling token communities and value representation. Furthermore, I have experience with machine learning development and data analysis in Python, which may also prove useful when working with TokenSPICE.
I plan on devoting 15-20 hours per week to this project. During the fellowship, I will be attending university where I am a full-time student, however, I see no reason why I cannot fit this commitment into my schedule. If any issues arise, my first point of contact would be my mentor.
The first objective is to develop a naive baseline model of an open science community. Below is a schema of a potential naive model that only uses OCEAN and USD for transactions between the open science market and for tracking the overall value of the system.
1. `ResearcherAgent` publishes a grant proposal (fixed price) to `ProposalStorageAgent`.
2. `ProposalStorageAgent` sends data about the proposal to `OCEANMinterAgent` to mint the corresponding amount of OCEAN.
3. `OCEANMinterAgent` sends the minted OCEAN to `ResearcherAgent`.
4. `ResearcherAgent` sends all funds to `OCEANBurnerAgent` in a fixed ratio.
5. `OpsciMarketplaceAgent` sends all funds evenly to all instances of `SellerAgent`, and a fixed amount to `OpscientiaDAOAgent` (equivalent to partial ownership of the research assets by the DAO).
6. `OpscientiaDAOAgent` sends a fixed amount of OCEAN to `OCEANBurnerAgent`.
7. `OCEANBurnerAgent` burns everything in its wallet.
8. A `SellerAgent` is created (corresponding to a researcher selling assets from research).
The diagram above shows a researcher minter; however, it will be easier if we only create a new `SellerAgent` rather than destroy the existing `ResearcherAgent`, then create a new `SellerAgent`, and then a new `ProposalStorageAgent` (which is actually not needed in the baseline simulation because it currently doesn't store anything).
The key metrics measured in this simulation are:
- number of assets in the marketplace
- number of sellers (corresponding to the number of researchers who have finished their work and continue to be compensated for it)
- amount of OCEAN
- price of OCEAN
- monthly revenue of sellers
Since this is a naive implementation of the open science token ecosystem, it is bound to be restricted in its precise representation of true open science market behavior. Here is a short list of the identified limitations:
- fixed number of researchers, grant size, project length, and asset output size (not reflecting the real-world variability of research projects, which usually have teams of multiple people actively working for long periods of time)
- fixed price for marketplace assets (clearly, different services will have different prices, datasets may vary in size, algorithms may vary in the price for their utility)
- even distribution of funds to asset sellers (since the money flowing through the marketplace is constant, this yields a gradual decrease in revenue per seller; that roughly captures more sellers increasing competition and supply, and hence lowering revenue, but it fails to represent the variability of the different assets offered)
This week, my plan is to fully define the baseline model and work on the development of the baseline model netlist in TokenSPICE. As per the feedback I received from Shady, the baseline model should represent the current status quo of the research ecosystem or at least a simplified version of it (within a model where funding is curated by a university, it doesn’t make sense to model every detail that often doesn’t even relate to scientific research). The purpose of the baseline model is to identify the weaknesses of the current status quo (such as the dependence on a centralized agency to provide funds for research).
A schema of a possible baseline model for the scientific research pipeline
The schema above can serve as a reference point for how the baseline model can work. In this model, two researchers compete for funding that comes from an external grant-giving agency, and the university acts as the agent that judges their research proposals based on predefined properties (currently: the grant requested, the proposed outcome of the research, the number of researchers working on the project, and the access to knowledge, which indicates whether the researchers can successfully complete the project). Once a researcher (or research team) wins a grant, they use the funding to publish knowledge assets to knowledge curators (e.g. scientific journals). In this model, every successful research proposal increases the knowledge access of the researchers working on the project (in simple terms, doing research increases your level of knowledge), thus increasing the likelihood of them receiving grants in the future. Knowledge access is therefore treated much like reputation, which presumably also grows with each completed research project.
Now, how is the competition modeled?
In this model, each researcher generates a proposal. The proposal is itself just a Python dictionary with the following attributes:
- `grant_requested`: random integer in the range [10 000, 50 000]
- `assets_generated`: random integer in the range [1, 10]; represents the expected outcome of the research and its value (e.g. number of new algorithms, amount of new data, etc.)
- `no_researchers`: fixed for now, but could be randomized in another experiment
- `knowledge_access`: starts at 1, then increments by 1 if a research proposal is accepted. It should serve as a representation of researchers gaining more knowledge by doing research and then being more able to do further research (in this simulation, it actually works like a very rudimentary reputation system). The main idea is to give an advantage to researchers who have been funded previously.
These parameters are then read by the university agent and evaluated according to this equation:
`index = ((grant_requested / no_researchers) / assets_generated) / knowledge_access`
It is a very basic evaluation function, but it should still make some intuitive sense. In this case, we assume that lower grants will be more likely to be funded as they don’t pose that high of a risk, and a lower grant means more money available for other grants. The next assumption is that a project with many people working on it is more likely to be funded than one with few people since the work can be distributed better (this doesn’t take into account the law of diminishing returns). Additionally, if a research project produces a lot of new things of value, it is more likely to be funded than one that doesn’t. Lastly, the researchers with more expertise about a subject (or reputation) are more likely to get their research funded.
With this setup, the proposal with the lower index gets funded. Furthermore, each research project lasts a fixed length of time, after which all researchers submit new proposals and compete for funding again.
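To make the proposal abstraction concrete, here is a short Python sketch of proposal generation and evaluation. The helper names `make_proposal`, `evaluate`, and `pick_winner` are mine; the attribute names, ranges, and the index formula follow the description above.

```python
import random

def make_proposal(knowledge_access, no_researchers=1):
    """A proposal is just a Python dictionary, as described above."""
    return {
        "grant_requested": random.randint(10_000, 50_000),
        "assets_generated": random.randint(1, 10),
        "no_researchers": no_researchers,
        "knowledge_access": knowledge_access,
    }

def evaluate(proposal):
    # index = ((grant_requested / no_researchers) / assets_generated) / knowledge_access
    return ((proposal["grant_requested"] / proposal["no_researchers"])
            / proposal["assets_generated"]) / proposal["knowledge_access"]

def pick_winner(proposals):
    # The proposal with the LOWER index gets funded.
    return min(proposals, key=evaluate)
```

A previously funded researcher submits proposals with a higher `knowledge_access`, which lowers their index and makes winning again more likely, producing the feedback loop observed in the simulation runs.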
Below are 6 plots from 2 different runs of the same simulations (hopefully I’ll be able to fix the formatting).
*Plots for each run (left to right): knowledge access index, number of proposals submitted, funding received.*
As can be seen from the first triple of plots, the researcher that won initially went on to win almost all of the other grants as well. This is due to the higher `knowledge_access` index that gives the first winner a great advantage. The second triple is more interesting: in this run of the simulation, there was no absolute winner and the two agents received funding more or less evenly. I ran the experiment 10 times and only 2 runs showed this kind of progression, whereas the other 8 always converged to a single winner.
This week’s goal is to continue running iterations of the baseline model, to re-evaluate the metrics used for the research proposal abstraction and their subsequent usage in the proposal evaluation, and lastly to begin the development of the second model, which will be the web3 profit-sharing model.
The web3 profit-sharing model will be slightly more complicated than the baseline, owing to more actions being available to the agents and the addition of a community staker agent. A schema of such a profit-sharing model is shown below, along with an overview.
We start again with two researchers competing for funding (I might also explore models with more than two researchers to see whether certain characteristics such as "winner-takes-all" are more likely to arise with more or fewer agents). After both proposals are submitted, only one of them will receive funding (initially the choice depends solely on the randomness of the proposal parameters). As in the baseline model, the winning researcher then uses all the funds to create new knowledge assets and publish them to a knowledge curator, which in this case is the knowledge marketplace. Then the `knowledge_access` parameter of the winning researcher increases. Unlike the baseline model, the profit-sharing model allows the second researcher to use their own money to buy into the marketplace and gain the same `knowledge_access` point as the other researcher, thus keeping the competition for the next grant the same as in the last step. The first researcher now has assets in the marketplace and will receive corresponding rewards in the following steps when the funds of a new research project are used to get access to compute services/data/algorithms (equivalent to getting paid when people buy your product). The number of knowledge assets in the knowledge market increases with each research project finished.
The staker agent stakes a fixed amount of a specified token in the knowledge marketplace and receives rewards from the transaction fees according to the size of the stake (there is room for experimentation regarding the behavior of the staker, but a simple approach is that the staker will add all rewards to the staked tokens to receive more rewards in the future). Lastly, the treasury takes a fixed ratio from the transaction fees (this happens anytime a transaction in the knowledge market is made).
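As a toy numerical illustration of that simple restaking strategy (the 10% per-round reward rate is an arbitrary assumption for illustration, not a parameter of the model):

```python
# Compounding staker: add all rewards back to the stake each round.
# The 10% per-round reward rate is an arbitrary illustrative number.
stake = 100.0
history = []
for _ in range(3):
    reward = 0.10 * stake   # rewards scale with the size of the stake
    stake += reward         # restake everything
    history.append(round(stake, 2))
print(history)  # [110.0, 121.0, 133.1]
```

Because the rewards are proportional to the stake, restaking everything makes the staker's position grow geometrically rather than linearly.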
This model should be more sustainable than the baseline model, meaning there should be more research projects funded. At the end of the simulation, I presume most of the value will be concentrated in the knowledge market, however, there might be variations in the amount of tokens the researchers hold (depending on whether there will be a single winner of most of the grants or whether we will see fair competition throughout).
At the heart of the profit-sharing model are DataTokens, which essentially act as wrappers for data assets. Each DataToken (DT) represents a different asset in the marketplace, and whoever holds a DT can gain access to that asset (the type of access is specified when the token is minted and stored as metadata, so for instance you could have a DT that grants one-time access to a specific dataset/algorithm). What makes DTs convenient is that you can specify quite a lot of parameters, e.g. the type of asset that the DT represents. While the profit-sharing model in TokenSPICE doesn't specify any of these parameters for a DT, it uses DTs for all transactions that happen in the `KnowledgeMarket` (see schema above). With this in mind, I'll give a brief description of two "research cycles" of the simulation:
1. Initial conditions: all agents have a fixed amount of OCEAN to begin with (the `DAO Treasury` has the most); the `KnowledgeMarket` has 1 knowledge asset.
2. `ResearcherAgent`s submit their proposals (with some parameters fixed and some randomized; see the baseline description above).
3. The `DAO Treasury` algorithmically determines the winner and sends them the requested amount of OCEAN.
4. The winner (`Researcher`) uses part of the funding (determined by a fixed ratio) to buy a DT from the `KnowledgeMarket` and immediately consumes it to gain `knowledge_access += 1`. The remainder of the OCEAN is used to publish results and new assets to the `KnowledgeMarket`. (Note: the `KnowledgeMarket` takes fees from the DT purchase and sends them to the `Staker`, with the ratio determined algorithmically.)
5. The loser (`Researcher`) uses part of their own money to buy a DT and immediately consumes it to gain `knowledge_access += 1` (fees are handled as before). The `Staker` stakes all the rewards received in this round (so that next time, the rewards will be greater).
6. New research round: proposal submission and winner selection (steps 2-3) are repeated.
7. Same as step 5, except that the OCEAN spent on the DT is now transferred to the owners of assets in the `KnowledgeMarket`, i.e. the previous winners of the grants (when both `Researcher`s have assets in the `KnowledgeMarket`, the OCEAN is split between them according to the ratio of the number of assets they published).
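The token flow in a single DT purchase described above can be sketched as follows. This is a simplified illustration: the function and field names are mine, and the fee rates are assumptions (in the actual netlist the staker ratio is determined algorithmically).

```python
# Sketch of one DT purchase: the buyer pays OCEAN, fees go to the staker
# and the DAO treasury, and the rest stays in the market pool (later paid
# out to asset owners). Fee rates here are illustrative assumptions.

def buy_dt(buyer, market, price, staker_rate=0.05, treasury_rate=0.01):
    buyer["ocean"] -= price
    fee = price * staker_rate       # transaction fee routed to the staker
    cut = price * treasury_rate     # fixed ratio taken by the treasury
    market["staker_rewards"] += fee
    market["treasury"] += cut
    market["pool"] += price - fee - cut
    buyer["knowledge_access"] += 1  # consuming the DT raises knowledge access

winner = {"ocean": 30_000.0, "knowledge_access": 1}  # funded by the grant
loser = {"ocean": 5_000.0, "knowledge_access": 1}    # spends own OCEAN
market = {"pool": 0.0, "staker_rewards": 0.0, "treasury": 0.0}

buy_dt(winner, market, price=1_000.0)
buy_dt(loser, market, price=1_000.0)
assert winner["knowledge_access"] == loser["knowledge_access"]  # aligned again
```

After both purchases the researchers' `knowledge_access` values are aligned again, which is exactly what keeps the competition for the next grant even.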
There are two main parts of the ecosystem which this model simplifies. Firstly, there should be separate DTs for each asset published (with separate pools that automatically exchange OCEAN for DTs). This is simplified into one big pool with all assets represented by the same DT (minted multiple times), which means we don't have to worry about insufficient liquidity in some of the pools. In a future iteration, we could add new `Staker`s to the model that would always add liquidity to the different pools, but right now we'd like to answer the question of whether this new model can outperform the baseline in terms of the number of papers published, the number of knowledge assets created, and the fairness of the competition. Once we have established that this model is superior to the baseline according to these metrics, we can ask additional questions about the specific functionality of the `KnowledgeMarket`, which might include: How are knowledge assets going to get enough liquidity to support long-term knowledge sharing? What happens to the model when some of the knowledge assets become public goods available to anyone?
If we compare this to the profit-sharing model described above, we can see certain similarities. Firstly, the `DAO Treasury` and the `Staker`s more or less form the Token generation and Curate $ boxes in the Web3 Sustainability Loop (WSL). The `Researcher`s are the ones who receive funding to do work (i.e. research) and create knowledge assets from it, which are then published to the `KnowledgeMarket` (the web3 ecosystem) to generate even more value than was previously present in the ecosystem. Part of this value then circles back to the community, which has an incentive to distribute it again to `Researcher`s, who create more value. This comparison with the Web3 Sustainability Loop serves as a sanity check to make sure this initial open science model has some merit.
I have created an initial version of the web3 profit-sharing model (the code still needs some work) and ran a couple of simulations to see whether the results are in line with what we would expect from the schema above.
The plots above show the amount of OCEAN that researchers acquire over time. This OCEAN belongs to them and does not come directly from the research grants, rather, it comes from the researchers buying and selling assets from the knowledge marketplace. There are many variations of this model that can be tested, including
- variable number of researchers
- variable funding periods
- variable number of stakers
- different transaction fees for the knowledge market
and much more. The results from the plots should serve as initial reassurance that this model might work, but there are still some limitations that I didn’t cover yet. For instance, the simulation seems to terminate after a fixed number of timesteps, which needs to be changed if we want to compare the number of possible research projects funded by this model versus the baseline.
This week, I focused on refactoring the profit-sharing netlist and agents to
- make the code more readable
- allow easier experimentation with different parameters (number of `ResearcherAgent`s, fees, etc.)
While doing so, I wrote two tests for the `OpscientiaDAOAgent` and the `ResearcherAgent`, which revealed some issues with the profit-sharing netlist that needed addressing. Firstly, the `knowledge_access` index should always be aligned across all `ResearcherAgent`s, provided they have sufficient funding to buy into the `KnowledgeMarket`. Furthermore, writing the unit tests forced me to choose a fixed wiring of the agents within the netlist, because the order in which the agents act determines whether or not they end up correctly aligned. In the current configuration, the agents are aligned in the following sequence:
I have also improved `KPIs.py` to be able to log and plot results for an arbitrary number of researchers. Below are some plots showing the results for simulations with 50 and 100 researchers.
In these plots, we can see that we are reaching a law of diminishing returns (of sorts): even though many researchers are able to receive funding, publish datasets, and receive rewards when other researchers buy their data, some researchers will inevitably not be supported. This might be improved, however, by testing different initial amounts of OCEAN for the researchers, funding more than one research project at a time (perhaps on a rolling basis rather than in fixed time intervals), etc.
Drawing from the results received so far, here is a list of possible iterations of/improvements to the profit sharing web3 model:
- funding of multiple proposals at a time
- rolling basis funding (once one project is finished, another can be funded immediately)
- differentiation of public & private assets, types of knowledge assets (data, algorithms, cloud services)
- enable throttling of the availability of funds (for the DAO Treasury)
- inflation model with token minter (& experimentation with the distribution of funds)
- community voting for public goods funding
These models shall be the focus for the rest of the open web fellowship. As a side note, I also need to run the baseline model for the same amount of time as the profit-sharing model to get a fair comparison of the two models. In addition, the baseline model can be compared to real-world data from grant funding agencies.
The main focus for this week was the development of the rolling-basis funding profit-sharing model, with the option to fund multiple proposals at a time. To do this, I needed to develop 6 new agents, where the first three are adapted for having multiple proposals funded at a time and the last three remove the time-dependent funding mechanism.
Below are some plots from the profit sharing model with multiple proposals funded at a time.
As expected in this model, the funding is depleted from the treasury much sooner than in the previous model since the only adjustment is that more proposals are funded at a time.
Comparing that with the rolling basis funding model: