Opscientia Open Web Fellowship: Simulations for Science Token Communities

Hi everyone. I wanted to ask the community for feedback on my open web fellowship application (see below). Thank you in advance!

name: Jakub Smekal, email: [email protected], location: Durham, United Kingdom, timezone: GMT, discordID: #3317, discourseID: smejak, current role: student, Natural Sciences, Durham University, 2nd year

mentor: Shady El Damaty, category: token-community


The goal of this project is to research value representation in a decentralized science ecosystem that incentivizes active participation in high-quality scientific research and the community that benefits from it.

In area-specific decentralized systems such as open science, one of the main problems is the representation of value for the different goods and services that are used/produced within them. For instance, in open science, storage of data and computation act as commodities that need to be properly represented in order to provide the incentive for trade. As a result, a decentralized ecosystem needs a token that appropriately represents the underlying value of all assets. Furthermore, such a system needs to efficiently record all participants’ contributions and distribute rewards accordingly, so it is important to identify which contributions are desirable, how to reward them, and over what time frame.

To address this problem, I propose a project that explores different open science ecosystems by simulating different combinations of stakeholders, the actions they can take, and the rewards they would receive for actively participating in the community.

Implementation Plan

This project will identify the key commodities in a decentralized science ecosystem, which will subsequently be used to construct the system’s objective function. The objective function is essentially the goal of the system as a whole, and it determines the actions that should be taken by different stakeholders and the rewards that each stakeholder receives for participating. For further details, I recommend this article from the Ocean Protocol.

Using the TokenSPICE simulator, I will design and run simulations for different token ecosystems with multiple agents to test the system’s longevity and expansive potential. For this, I will need to define the goals of a decentralized science system and the corresponding KPIs that will be used to measure that. TokenSPICE already offers a wide variety of agents that can be used to model different token systems.

The data from the simulations can be used to create charts depicting the aggregate income/expenditure for the different tasks specified. An example from a simulation of the Web3 Sustainability Loop is shown below.

The TokenSPICE simulator can be found here.

The implementation will go as follows:

  1. identify the objective and reward functions for an open science ecosystem (will be used as a metric to track the success of the open science token community)
  2. identify stakeholders and their potential contributions
  3. represent reward for contributing as a token/multiple tokens
  4. either find existing TokenSPICE agents or develop new ones that correspond to all stakeholders
  5. develop a netlist connecting all agents and giving them appropriate rewards at each step
  6. run the simulations with all netlists
  7. identify the optimal token community based on constraints defined in the objective function.

Steps 3-6 can be repeated several times.

Minimum Deliverables

The minimum outcome of this research will be a report on the findings from the various token community simulations. The report should serve as not only an overview of the possible token ecosystems for optimal research incentives, but also as a guide for anyone wanting to conduct similar research, since TokenSPICE is a relatively new tool and still under development.

Aim/Deliverable 1

The first aim of this project is to concretely define the objective function of an open science system and develop different multi-agent communities centered around a token. Furthermore, this project will look into tokens as a representation of value for different commodities, such as computing or storage, to determine whether it is better to have a single token for everything, or perhaps different tokens providing different incentives.

Throughout the project, I will use the TokenSPICE simulator, written in the Python programming language. This research will encompass the development of new agent classes for the TokenSPICE simulator and new netlists, which define how the agents are connected within the ecosystem. An example of an ecosystem from the Ocean Protocol and a possible open science community is shown below.

The results of the research will be interpreted with KPIs that are to be determined at the beginning of the project. For a decentralized science community, some indicators of a successful token system might include the number of research papers published (measured through a staking mechanism), the income of researchers, time (how long does the community thrive? does the income of researchers vary?), etc.

Aim/Deliverable 2

The next aim of the project is to extend the research of multi-agent cryptoeconomic communities further, develop new systems based on the feedback from the initial simulations and perhaps provide feedback to the TokenSPICE community where appropriate. After this project has finished, its conclusion should be a detailed report on my findings.


Week Activity
1 Deliverable : get familiar with TokenSPICE and begin the development of the first token ecosystem
2 Deliverable : schema of first token ecosystem, architecture design, development of agents
3 Deliverable : set KPIs, develop SimStrategy, reward functions, develop the first token ecosystem (netlist)
4 Deliverable : finish development of first ecosystem (+debugging), evaluate the results of the first simulation
5 Deliverable : begin developing new agent classes for new simulation (increased complexity with DataTokens), schema of new token ecosystems, think about sources of value/commodities
6 Deliverable : development of EVM-in-the-loop simulation, create presentation of current results, mid-term performance review / community feedback
7 Deliverable : finish development of EVM in the loop opsci simulation (+debugging) and run the simulation
8 Deliverable : analysis of results, points for improvement
9 Deliverable : a graph of all simulations run so far with their respective parameters, work on design of the final token ecosystem based on previous experiments
10 Deliverable : development of the final simulation netlist
11 Deliverable : finish development of final simulation, evaluate results, work on putting all results together for the final project report
12 Deliverable : final project report and presentation, Submit Final Report + Code/Materials on Discourse #pulse

Plan for Communication with Mentor

Communication with the mentor will take place online, primarily via Zoom and Discord. Regular meetings would initially be held weekly and in the second half of the fellowship on a bi-weekly basis. Furthermore, I would regularly update my mentor through text and additional meetings may take place if they are required.

Plans to Continue with Project

I am extremely interested in the development of a more efficient and fair science community and I would very much like to contribute to its development even after this fellowship.

Candidate Details


I believe science will greatly benefit from the adoption of blockchain technology and decentralization in general. Opscientia’s mission strongly coincides with my personal values, and modeling token ecosystems can provide real insight into how to develop the best possible decentralized science community.


I am an undergraduate mathematics and physics student, and I expect these fields will help me in solving the problem of modeling token communities and value representation. Furthermore, I have experience with machine learning development and data analysis in Python, which may also prove useful when working with TokenSPICE.

Working time and commitments

I plan on devoting 15-20 hours per week to this project. During the fellowship, I will be attending university as a full-time student; however, I see no reason why I cannot fit this commitment into my schedule. If any issues arise, my first point of contact would be my mentor.

Curricula Vitae




Weekly progress updates

Week 1

The first objective is to develop a naive baseline model of an open science community. Below is a schema of a potential naive model that only uses OCEAN and USD for transactions between the open science market and for tracking the overall value of the system.

Description of one step in the loop

  1. ResearcherAgent publishes a grant proposal (fixed price) to ProposalStorageAgent

  2. ProposalStorageAgent sends data about the proposal to OpscientiaDAOAgent

  3. OpscientiaDAOAgent tells OCEANMinterAgent to mint the corresponding amount of OCEAN

  4. OCEANMinterAgent sends the minted OCEAN to ResearcherAgent

  5. ResearcherAgent sends all funds to OpsciMarketplaceAgent and OCEANBurnerAgent in a fixed ratio

  6. OpsciMarketplaceAgent sends all funds evenly to all instances of SellerAgent and a fixed amount to OpscientiaDAOAgent (equivalent to partial ownership of the research assets by the DAO)

  7. OpscientiaDAOAgent sends a fixed amount of OCEAN to OCEANBurnerAgent

  8. OCEANBurnerAgent burns everything in its wallet

  9. New SellerAgent is created (corresponding to a researcher selling assets from research)

The diagram above shows a researcher minter; however, it is simpler to create only a new SellerAgent at each step than to destroy the existing ResearcherAgent and then create both a new SellerAgent and a new ResearcherAgent.
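The nine steps above can be sketched as a single function that moves OCEAN between wallets. This is a minimal plain-Python sketch, not the TokenSPICE netlist: the State class, the step function, the single initial seller, and every numeric constant are hypothetical placeholders chosen only to make the flow of funds concrete.

```python
from dataclasses import dataclass, field

# Hypothetical placeholder parameters (not taken from the actual simulation).
GRANT_SIZE = 1000.0   # fixed price of every grant proposal, in OCEAN
MARKET_RATIO = 0.9    # share of the grant the researcher sends to the marketplace
DAO_CUT = 50.0        # fixed amount the marketplace forwards to the DAO
DAO_BURN = 10.0       # fixed amount the DAO sends to the burner


@dataclass
class State:
    """OCEAN balances of the agents in the naive loop."""
    researcher: float = 0.0
    marketplace: float = 0.0
    dao: float = 0.0
    burner: float = 0.0
    # Assumption: the marketplace starts with one seller.
    sellers: list = field(default_factory=lambda: [0.0])
    total_minted: float = 0.0
    total_burned: float = 0.0


def step(s: State) -> None:
    """Advance the naive loop by one step (steps 1-9 of the description)."""
    # Steps 1-4: the proposal is accepted, the grant is minted and sent to the researcher.
    s.total_minted += GRANT_SIZE
    s.researcher += GRANT_SIZE

    # Step 5: the researcher splits all funds between marketplace and burner.
    to_market = s.researcher * MARKET_RATIO
    s.marketplace += to_market
    s.burner += s.researcher - to_market
    s.researcher = 0.0

    # Step 6: the marketplace pays a fixed cut to the DAO and the rest evenly to sellers.
    dao_cut = min(DAO_CUT, s.marketplace)
    s.dao += dao_cut
    per_seller = (s.marketplace - dao_cut) / len(s.sellers)
    s.sellers = [wallet + per_seller for wallet in s.sellers]
    s.marketplace = 0.0

    # Step 7: the DAO sends a fixed amount to the burner.
    dao_burn = min(DAO_BURN, s.dao)
    s.dao -= dao_burn
    s.burner += dao_burn

    # Step 8: the burner burns everything in its wallet.
    s.total_burned += s.burner
    s.burner = 0.0

    # Step 9: a new seller joins (the researcher now sells assets from the research).
    s.sellers.append(0.0)


if __name__ == "__main__":
    state = State()
    for month in range(3):
        step(state)
        print(month, len(state.sellers), state.total_minted - state.total_burned)
```

With these placeholder values the payout per seller falls from one step to the next, because the same amount is split among a growing number of sellers.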

List of Agents

  • ResearcherAgent
  • OpsciMarketplaceAgent
  • SellerAgent
  • OCEANBurnerAgent
  • OCEANMinterAgent
  • OpscientiaDAOAgent
  • ProposalStorageAgent (actually not needed in the baseline simulation because it currently doesn’t store anything)


The key metrics measured in this simulation are:

  • number of assets in the marketplace
  • number of sellers (corresponding to the number of researchers who have finished their work and continue to be compensated for it)
  • amount of OCEAN
  • price of OCEAN
  • monthly revenue of sellers

Limitations of the model

Since this is a naive implementation of the open science token ecosystem, it is bound to be restricted in its precise representation of true open science market behavior. Here is a short list of the identified limitations:

  • fixed number of researchers, grant size, project length, and asset output size (not reflecting the real-world variability of research projects, which usually have teams of multiple people actively working for long periods of time)
  • fixed price for marketplace assets (clearly, different services will have different prices, datasets may vary in size, algorithms may vary in the price for their utility)
  • even distribution of funds to asset sellers (this will result in a gradual decrease in revenue/seller (since the money flowing through the marketplace is constant), which is on one hand slightly representative of the scenario where more people are selling their assets, thus increasing the competition and supply, hence lowering revenue, but on the other, it fails to represent the variability of different assets offered)

Week 2

This week, my plan is to fully define the baseline model and work on the development of the baseline model netlist in TokenSPICE. As per the feedback I received from Shady, the baseline model should represent the current status quo of the research ecosystem or at least a simplified version of it (within a model where funding is curated by a university, it doesn’t make sense to model every detail that often doesn’t even relate to scientific research). The purpose of the baseline model is to identify the weaknesses of the current status quo (such as the dependence on a centralized agency to provide funds for research).

A schema of a possible baseline model for the scientific research pipeline

The schema above can serve as a reference point for how the baseline model can work. In this model, two researchers are competing for funding that comes from an external grant-giving agency and the university acts as the agent that judges their research proposals based on predefined properties (as of right now, this includes: grant requested, the proposed outcome of the research, number of researchers working on the project, and the access of knowledge (which is indicative of whether the researchers can successfully complete the project)). Once a researcher (or a research team) wins a grant, they use the funding to publish knowledge assets to knowledge curators (e.g. scientific journals). In this model, every successful research proposal leads to an increase in knowledge access for the researchers working on the project (in simple terms, doing research increases your level of knowledge), thus increasing the likelihood of them receiving grants in the future (I suppose in this model, the access to knowledge is treated similarly as reputation, which presumably also increases with more research projects completed).

Now, how is the competition modeled?

In this model, each researcher generates a proposal. The proposal is itself just a Python dictionary with the following attributes:

  • grant_requested: random integer in the range [10 000, 50 000]
  • assets_generated: random integer in the range [1, 10], represents the expected outcome of the research and its value (e.g. number of new algorithms, amount of new data, etc.)
  • no_researchers: fixed for now, but could be randomized in another experiment
  • knowledge_access: starts at 1, then increments by 1 if a research proposal is accepted. It should serve as a representation of researchers gaining more knowledge by doing research and then being more able to do further research (in this simulation, it actually works like a very rudimentary reputation system). The main idea is to give an advantage to researchers who have been funded previously.

These parameters are then read by the university agent and evaluated according to this equation:
index = ((grant_requested / no_researchers) / assets_generated) / knowledge_access

It is a very basic evaluation function, but it should still make some intuitive sense. In this case, we assume that lower grants will be more likely to be funded as they don’t pose that high of a risk, and a lower grant means more money available for other grants. The next assumption is that a project with many people working on it is more likely to be funded than one with few people since the work can be distributed better (this doesn’t take into account the law of diminishing returns). Additionally, if a research project produces a lot of new things of value, it is more likely to be funded than one that doesn’t. Lastly, the researchers with more expertise about a subject (or reputation) are more likely to get their research funded.

With this setup, the proposal with the lower index gets funded. Furthermore, each research project lasts a fixed length of time, after which all researchers submit new proposals and compete for funding again.
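The proposal dictionary and the evaluation rule can be sketched as follows. The attribute names and ranges come from the description above; the function names (make_proposal, evaluate, pick_winner) and the default of 5 researchers per project are assumptions made for illustration, not the actual agent code.

```python
import random


def make_proposal(knowledge_access, no_researchers=5, rng=random):
    """Build one proposal dict. The default no_researchers=5 is a hypothetical value."""
    return {
        "grant_requested": rng.randint(10_000, 50_000),
        "assets_generated": rng.randint(1, 10),
        "no_researchers": no_researchers,
        "knowledge_access": knowledge_access,
    }


def evaluate(proposal):
    """index = ((grant_requested / no_researchers) / assets_generated) / knowledge_access"""
    per_researcher = proposal["grant_requested"] / proposal["no_researchers"]
    per_asset = per_researcher / proposal["assets_generated"]
    return per_asset / proposal["knowledge_access"]


def pick_winner(proposals):
    """Return the name of the researcher whose proposal has the lowest index."""
    return min(proposals, key=lambda name: evaluate(proposals[name]))


if __name__ == "__main__":
    proposals = {
        "researcher0": make_proposal(knowledge_access=1),
        "researcher1": make_proposal(knowledge_access=1),
    }
    for name, proposal in proposals.items():
        print(name, proposal, evaluate(proposal))
    print("funded:", pick_winner(proposals))
```

As a worked example, a proposal requesting 20 000 with 5 researchers, 4 assets, and knowledge_access 1 gets index 20000 / 5 / 4 / 1 = 1000, so it beats a proposal requesting 30 000 with 5 researchers, 2 assets, and knowledge_access 2, whose index is 1500.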

Below are 6 plots from 2 different runs of the same simulations (hopefully I’ll be able to fix the formatting).

[Plots: two runs of the simulation, each with three panels on a linear scale: knowledge access index, number of proposals submitted, USD funding received]

As can be seen from the first triple of plots, the researcher that won initially went on to win almost all of the subsequent grants as well. This is due to the higher knowledge_access index, which gives the first winner a great advantage. The second triple is slightly more interesting. In this run of the simulation, there was no absolute winner and the two agents received funding more or less evenly. I ran the experiment 10 times and only 2 runs showed this kind of progression, whereas the other 8 converged to a single winner.
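The feedback between winning a grant and knowledge_access can be reproduced in a few lines outside TokenSPICE. The sketch below is an assumption-laden toy, not the baseline netlist: the run function, the 30 funding rounds, and the fixed team size of 5 are hypothetical, and the output will differ from the plots above.

```python
import random


def index(grant, assets, knowledge_access, no_researchers=5):
    """Evaluation index from the baseline description; lower is better."""
    return ((grant / no_researchers) / assets) / knowledge_access


def run(rounds=30, seed=None):
    """Run `rounds` funding rounds between two researchers and count their wins."""
    rng = random.Random(seed)
    knowledge = {"researcher0": 1, "researcher1": 1}
    wins = {name: 0 for name in knowledge}
    for _ in range(rounds):
        # Each researcher submits a fresh proposal with randomized parameters.
        scores = {
            name: index(rng.randint(10_000, 50_000), rng.randint(1, 10), ka)
            for name, ka in knowledge.items()
        }
        winner = min(scores, key=scores.get)
        # Only the funded researcher gains knowledge access.
        knowledge[winner] += 1
        wins[winner] += 1
    return wins, knowledge


if __name__ == "__main__":
    for seed in range(10):
        print(seed, run(seed=seed)[0])
```

Because the index is divided by knowledge_access, every win makes the next win more likely, which is the mechanism behind the single-winner runs.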

Week 3

This week’s goal is to continue running iterations of the baseline model, to re-evaluate the metrics used for the research proposal abstraction and their subsequent usage in the proposal evaluation, and lastly to begin the development of the second model, which will be the web3 profit-sharing model.

The web3 profit-sharing model will be slightly more complicated than the baseline, because more actions are available to the agents and a community staker agent is added. A schema of such a profit-sharing model is shown below, and I will include an overview of it as well.


We start again with two researchers competing for funding (I might also explore models with more than two researchers to see whether certain characteristics such as “winner-takes-all” are more likely to arise with more or fewer agents). After both proposals are submitted, only one of them will receive funding (initially the choice depends solely on the randomness of the proposal parameters). As in the baseline model, the researcher then uses all the funds to create new knowledge assets and publish them to a knowledge curator, which in this case is the knowledge marketplace. Then, the knowledge_access parameter of the winning researcher increases. Unlike the baseline model, the profit-sharing model allows the second researcher to use their own money to buy into the marketplace and gain the same knowledge_access point as the other researcher, thus keeping the competition for the next grant the same as in the last step. The first researcher now has assets in the marketplace and will receive corresponding rewards in the following steps when the funds of a new research project are used to get access to compute services/data/algorithms (equivalent to getting paid when people buy your product). The number of knowledge assets in the knowledge market increases with each research project finished.
The staker agent stakes a fixed amount of a specified token in the knowledge marketplace and receives rewards from the transaction fees according to the size of the stake (there is room for experimentation regarding the behavior of the staker, but a simple approach is that the staker will add all rewards to the staked tokens to receive more rewards in the future). Lastly, the treasury takes a fixed ratio from the transaction fees (this happens anytime a transaction in the knowledge market is made).
This model should be more sustainable than the baseline model, meaning there should be more research projects funded. At the end of the simulation, I presume most of the value will be concentrated in the knowledge market, however, there might be variations in the amount of tokens the researchers hold (depending on whether there will be a single winner of most of the grants or whether we will see fair competition throughout).
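The fee flow described above (a fixed share to the treasury, the rest to stakers in proportion to their stake, with rewards restaked) can be sketched like this. The function names, the 50% treasury ratio, and the two-staker example are assumptions for illustration only; the actual ratio in the model is determined algorithmically.

```python
def split_fee(fee, stakes, treasury_ratio=0.5):
    """Split one transaction fee between the treasury and the stakers.

    treasury_ratio is a hypothetical fixed ratio; stakers share the
    remainder in proportion to the size of their stake.
    """
    total_stake = sum(stakes.values())
    if total_stake <= 0:
        raise ValueError("at least one positive stake is required")
    treasury_share = fee * treasury_ratio
    pool = fee - treasury_share
    rewards = {name: pool * stake / total_stake for name, stake in stakes.items()}
    return treasury_share, rewards


def compound(stakes, fees, treasury_ratio=0.5):
    """Apply a sequence of fees, restaking every reward as it arrives."""
    stakes = dict(stakes)  # do not mutate the caller's dict
    treasury = 0.0
    for fee in fees:
        treasury_share, rewards = split_fee(fee, stakes, treasury_ratio)
        treasury += treasury_share
        for name, reward in rewards.items():
            stakes[name] += reward
    return treasury, stakes


if __name__ == "__main__":
    treasury, stakes = compound({"staker0": 300.0, "staker1": 100.0}, [10.0, 10.0])
    print(treasury, stakes)
```

With a single staker, as in the schema, the staker simply receives the whole non-treasury share of each fee; the pro-rata split only matters once more stakers are added.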

Further description of the profit-sharing model

At the heart of the profit-sharing model are DataTokens, which essentially act as wrappers for data assets. Each DataToken (DT) represents a different asset in the marketplace and whoever holds a DT can gain access to that asset (the type of access is specified at the minting of the token and stored as metadata, so for instance you could have a DT that would grant you one-time-access to look at a specific dataset/algorithm). What makes DTs quite convenient is that you can specify quite a lot of parameters, e.g. the type of asset that the DT represents. While the profit-sharing model in TokenSPICE doesn’t specify any of these parameters for a DT, it uses DTs for any transactions that happen in the KnowledgeMarket (see schema above). With this in mind, I’ll give a brief description of two “research cycles” of the simulation:

  1. Initial Conditions: All agents have a fixed amount of OCEAN to begin with (DAO Treasury has the most) | KnowledgeMarket has 1 knowledge asset
  2. ResearcherAgents submit their proposals (with some parameters fixed and some randomized, see the baseline description above)
  3. DAO Treasury algorithmically determines the winner and sends them the requested amount of OCEAN
  4. The winner (Researcher) uses part of the funding (determined by a fixed ratio) to buy a DT from the KnowledgeMarket and immediately consumes it to gain knowledge_access += 1. Then the remainder of OCEAN is used to publish results and new assets to the KnowledgeMarket. (Note: the KnowledgeMarket takes fees from buying the DT and sends them to DAO Treasury and Staker (ratio determined algorithmically))
  5. The loser (Researcher) uses part of their own money to buy a DT and immediately consume it to gain knowledge_access += 1 (same thing with fees as before)
  6. Staker stakes all the rewards received in this round (so that next time, the rewards will be greater)
  7. New Research Round Steps 1-2 repeated
  8. Same as (5) with the exception that the OCEAN spent for the DT is now transferred to the owners of assets in the KnowledgeMarket, i.e. the previous winners of the grants (when both Researchers have assets in the KnowledgeMarket, the OCEAN is split between them according to the ratio of the number of assets they published).
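The split in step 8 can be illustrated with a short helper. This is only a sketch of the proportional rule stated above; the name distribute_to_owners and the example numbers are hypothetical, and market fees are assumed to have been deducted from the payment already.

```python
def distribute_to_owners(payment, assets_published):
    """Split an OCEAN payment among asset owners.

    assets_published maps each researcher to the number of assets they have
    published in the KnowledgeMarket; each owner receives a share of the
    payment proportional to that number.
    """
    total_assets = sum(assets_published.values())
    if total_assets == 0:
        return {}  # nobody has published anything yet
    return {
        owner: payment * count / total_assets
        for owner, count in assets_published.items()
        if count > 0
    }


if __name__ == "__main__":
    # researcher0 has published 2 assets, researcher1 has published 1.
    print(distribute_to_owners(90.0, {"researcher0": 2, "researcher1": 1}))
```

In this example researcher0 receives 60 OCEAN and researcher1 receives 30 OCEAN, matching the 2:1 ratio of published assets.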

There are two main parts of the ecosystem which this model simplifies. Firstly, there should be separate DTs for each asset published (with separate pools that automatically exchange OCEAN for DTs). This is simplified into one big pool with all assets represented by the same DT (minted multiple times). This means that we don’t have to worry about there not being enough liquidity in a portion of all the pools. In a future iteration, we could add new Stakers to the model that would always add liquidity to the different pools, but right now, we’d like to answer the question whether this new model can outperform the baseline in terms of number of papers published, number of knowledge assets created, and the fairness of the competition. Once we have established that this model is superior to the baseline according to these metrics, we can ask additional questions about the specific functionality of the KnowledgeMarket, which might include: How are knowledge assets going to get enough liquidity to support long-term knowledge sharing? What happens to the model when some of the knowledge assets become public goods available to anyone?

Comparison with the Web3 Sustainability Loop

The Web3 Sustainability Loop is an excellent model to compare the profit-sharing open science model with. Below is the schema for the WSL taken from here.

If we compare this to the profit-sharing model described above, we can see certain similarities. Firstly, the DAO Treasury and the Stakers are more or less the parts that form the Token generation and Curate $ boxes in the WSL. The Researchers are the ones who receive funding to do work (i.e. research) and create knowledge assets from it, which are then published to the KnowledgeMarket (the web3 ecosystem) to generate even more value than what was previously present in the ecosystem. Part of this value then circles back to the community which has incentive to again distribute it to Researchers who create more value. This comparison with the Web3 Sustainability Model serves as a sanity check to make sure this initial open science model has some merit.

Initial results

I have created an initial version of the web3 profit-sharing model (the code still needs some work) and ran a couple of simulations to see whether the results are in line with what we would expect from the schema above.


The plots above show the amount of OCEAN that researchers acquire over time. This OCEAN belongs to them and does not come directly from the research grants, rather, it comes from the researchers buying and selling assets from the knowledge marketplace. There are many variations of this model that can be tested, including

  • variable number of researchers
  • variable funding periods
  • variable number of stakers
  • different transaction fees for the knowledge market

and much more. The results from the plots should serve as initial reassurance that this model might work, but there are still some limitations that I haven’t covered yet. For instance, the simulation seems to terminate after a fixed number of timesteps, which needs to be changed if we want to compare the number of possible research projects funded by this model versus the baseline.

Week 4

This week, I focused on refactoring the profit-sharing netlist and agents to

  1. make the code more readable
  2. allow easier experimentation with different parameters (number of ResearcherAgents, fees, etc.)

While doing so, I wrote two tests for the OpscientiaDAOAgent and the ResearcherAgent, which revealed some issues with the profit-sharing netlist that needed addressing. Firstly, the knowledge_access index should always be aligned across all ResearcherAgents, provided they have sufficient funding to buy into the KnowledgeMarket. Furthermore, writing the unit tests has forced me to choose a fixed wiring of the agents within the netlist, because the order in which the agents take their actions determines whether or not they will be correctly aligned. In the current configuration, the agents are aligned in the following sequence:
2. OpscientiaDAOAgent
3. SimpleStakerAgent
4. all ResearcherAgents

I have also improved the KPIs.py to be able to log and plot results for an arbitrary number of researchers. Below are some plots showing the results for simulations with 50 and 100 ResearcherAgents respectively.


In these plots, we can see that we are reaching a law of diminishing returns (of sorts), because even though many researchers are able to receive funding, publish datasets, and receive rewards when other researchers buy their data, some researchers will inevitably not be supported. This might be improved upon, however, by testing different initial amounts of OCEAN for the researchers, funding more than one research project at a time (and perhaps fund projects on a rolling basis rather than in fixed time intervals), etc.

Drawing from the results received so far, here is a list of possible iterations of/improvements to the profit sharing web3 model:

  • funding of multiple proposals at a time
  • rolling basis funding (once one project is finished, another can be funded immediately)
  • differentiation of public & private assets, types of knowledge assets (data, algorithms, cloud services)
  • enable throttling of the availability of funds (for the DAO Treasury)
  • inflation model with token minter (& experimentation with the distribution of funds)
  • community voting for public goods funding

These models shall be the focus for the rest of the open web fellowship. As a side note, I also need to run the baseline model for the same amount of time as the profit-sharing model to get a fair comparison of the two models. In addition, the baseline model can be compared to real-world data from grant funding agencies.

Week 5

The main focus for this week was the development of the rolling-basis funding profit-sharing model with the option to fund multiple proposals at a time. To do this, I needed to develop 6 new agents:

  • MultResearcherAgent
    where the first three agents are adapted for having multiple proposals funded at a time and the last three agents remove the time-dependent funding mechanism.

Below are some plots from the profit sharing model with multiple proposals funded at a time.


As expected in this model, the funding is depleted from the treasury much sooner than in the previous model since the only adjustment is that more proposals are funded at a time.

Comparing that with the rolling basis funding model:


Love this.

I’ve also just started thinking about how to do this in a data science eco-system with Algovera. This likely involves digging into a higher fidelity model of the data consumer agent. For example, the arrow with “consumeDT(): sendDT, lag, get $O” means that the data consumer buys the dataset and takes some time to create value on top of the dataset. Maybe this involves developing a science model or a data science algorithm. How can we model this innovation component better?

Would be great if you could join the weekly TokenSPICE hack on Mondays with Trent from Ocean and Angela from TE Academy (although there’s a clash with Opscientia DAO). We also have funding for contributors through oceanSPICE (see oceanDAO R7). We’re currently working on implementing and comparing Ocean V3 vs V4 dynamics.

Anyway, seems like there might be some synergies. Happy to unofficially co-supervise (or even better, maybe we can talk about co-sponsoring - would have to check with the rest of the team).

Hi Richard,

Thanks for chiming in! We’d love to ideate around different mechanisms and simulation designs. We can move the Opscientia DAO Hall to accommodate for this! It would be great to get community feedback on what we are building and funding to supplement Jakub’s fellowship.

We’d love to bring you on as the first community co-mentor for a research project! Jakub and I have a weekly meeting at 1:00pm CET, would you be able to make this next week? I will send you an invite.

RE: innovation component
There are so many free parameters in TokenSPICE, which makes it powerful but also potentially bewildering for open-ended problems. I can see the lag component defined either at very high time resolution (i.e. an autonomous agent that must process data, submit to sell, and wait for revenue) or at low time resolution (an actual team or human doing analog work that results in direct/indirect value accrual). In terms of modeling innovation, I like the idea of reputation that compounds over time with trust, similar to the Stack Overflow model.

Lots to discuss!! Excited to get into it soon.

Hi Richard,

Thanks for the feedback. I’ll definitely try to join the TokenSPICE hack this coming Monday; it would be great to talk about this more. In terms of modeling innovation, I think that in addition to the lag component there might be a way to use staking as an indicator of innovation. For instance, the datasets or algorithms that bring a lot of value to the community could be identified by the number of tokens staked on them. There could then be a graph tracking the relationship between the usage of a dataset in a research project and the number of datasets/algorithms that the research produced thanks to the original one (I think this partly overlaps with the reputation system Shady mentioned).


Hi Jakub,

This is an excellent project and fits well in line with the Opscientia mission to engineer token communities that are parameterized around an objective function of producing, curating, and sharing digital knowledge objects in the form of algorithms, open data, and executable notebooks. I support funding this project with the funds set aside for the Open Web Fellowship.

Below is my feedback!

Quality & Impact

The proposal is high quality, well defined, and fits within the Opsci mission. It is well formulated and should lead to interesting results and contributions that benefit the wider community.


I believe this proposal is feasible within the 12-week period but requires a bit of clarification around the deliverables. I recommend setting aside two weeks for becoming familiar with the TokenSPICE toolkit, meeting with the token engineering community (Mondays 5-7pm CET), and specifying the design space for parameter choice.

Suggestion: Baseline Model

I would set aside an additional two weeks for defining KPI’s, agents, and token architecture. Your first simulation should be a baseline toy model that represents the current status quo:

  • An exhaustible external grant funding agency, with multiple agents competing for research funds
  • Commodities for time (i.e., full time research / half time teaching), data storage, computation, knowledge (i.e., publications)
  • Value should flow from grant funding source to researchers which use funding to produce knowledge assets
  • Researchers use funds to publish assets to knowledge curators (journals)
  • Researchers pay to access assets from knowledge curators
  • Researchers need a threshold level of access to knowledge to successfully compete for funding

In this model, knowledge curators should trap the value in the system and all activity should cease when external grant agencies cease to provide funding. It would be useful to see the “carrying capacity” for research activity in this sort of model. We should see a winner-take-all dynamic emerge, where successful grant winners with prudent spending can continue winning future grants.
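To make the baseline concrete, here is a minimal Python sketch of the value flow described above. It is not TokenSPICE code: the names (`Researcher`, `run_baseline`, `grant_size`, `publish_cost`, `access_cost`) and all numbers are assumptions for illustration, and the "threshold level of access" is simplified to "the researcher with the most access wins".

```python
from dataclasses import dataclass


@dataclass
class Researcher:
    name: str
    funds: float = 0.0
    knowledge_access: int = 0


def run_baseline(agency_funds, researchers, grant_size=100.0,
                 publish_cost=40.0, access_cost=10.0, max_rounds=1000):
    """Toy status-quo loop: agency -> grant winner -> journal, until the agency is empty."""
    journal_funds = 0.0
    rounds = 0
    while agency_funds >= grant_size and rounds < max_rounds:
        # Simplification: the researcher with the most knowledge access wins
        # (on a tie, max() returns the first one in the list).
        winner = max(researchers, key=lambda r: r.knowledge_access)
        agency_funds -= grant_size
        winner.funds += grant_size
        # The winner pays the journal to publish a knowledge asset.
        winner.funds -= publish_cost
        journal_funds += publish_cost
        # Every researcher who can afford it pays the journal for access.
        for r in researchers:
            if r.funds >= access_cost:
                r.funds -= access_cost
                journal_funds += access_cost
                r.knowledge_access += 1
        rounds += 1
    return rounds, agency_funds, journal_funds


team = [Researcher("a"), Researcher("b"), Researcher("c")]
rounds, agency_left, journal_total = run_baseline(1000.0, team)
```

With these made-up numbers the first researcher wins all ten grants, the journal ends up holding half of the money, and the loop stops once the agency is empty, which is the qualitative behaviour the baseline is meant to show.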

Suggestion: Profit Sharing Model

An alternative model would be a web3 based token community with a positive feedback loop.

  • A community treasury that accumulates funds from fees on activity in a knowledge market
  • Research grants awarded pseudo-algorithmically with human oversight
  • Researchers paid from treasury to publish open access data
  • Researchers can post algorithms for other researchers to purchase
  • Researchers can post computed datasets for sale to consume by other researchers
  • Executable Notebooks are available for free to view but must consume credits to execute upon fork
  • Staked tokens accumulate portion of fees from market activity

In this model, the knowledge curators are automated away as a set of smart contracts that foster findability, accessibility, and interoperable reproducibility. Instead the community treasury accumulates value produced in the form of activity on the knowledge market. Value is redistributed to active researchers that post quality proposals, publish data or algorithms, or make their notebooks available for use.

A useful comparison to the baseline model would be long-term sustainability of the community - at what point of network participation does system equilibrium occur in which the number of grants given result in increasingly optimal output of knowledge objects? Does the carrying capacity of this model outstrip the baseline model? And what are potential attack vectors that allow winner-take-all mechanisms? Do we need “cool-down” periods to prevent domination by individual agents?
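As a rough illustration of the fee flows in the profit sharing model, the sketch below splits a single market purchase between the seller, the community treasury, and stakers. The function names and the fee rates (5% and 2%) are hypothetical placeholders, not values from TokenSPICE or Ocean Protocol.

```python
def settle_purchase(price, treasury_fee=0.05, staker_fee=0.02):
    """Split one knowledge-market purchase between seller, treasury, and stakers."""
    to_treasury = price * treasury_fee
    to_stakers = price * staker_fee
    to_seller = price - to_treasury - to_stakers
    return {"seller": to_seller, "treasury": to_treasury, "stakers": to_stakers}


def distribute_to_stakers(amount, stakes):
    """Share `amount` in proportion to each staker's staked tokens."""
    total = sum(stakes.values())
    if total == 0:
        return {name: 0.0 for name in stakes}
    return {name: amount * staked / total for name, staked in stakes.items()}


split = settle_purchase(100.0)
payouts = distribute_to_stakers(split["stakers"], {"alice": 30, "bob": 10})
```

The treasury share is what would later be awarded as grants, so the sustainability question above becomes whether accumulated fees keep pace with grant spending.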

Suggestion: Parameter Selection & Agent Definition

Please list out the parameters you will be using and which of them will be static, dynamic, or random. Also please list each of the agents with a rough description of their expected behavior. Lastly, consider adding a flowchart that demonstrates the experiments.

Community & Contribution

As part of the mid-term evaluation, please consider preparing a presentation with your first set of results to present to the community for feedback. We will expect a final github repository and report to be hosted on the Opsci notion at the end of the project.

Contingencies for Alternate Outcomes

There are no foreseeable issues regarding alternate outcomes for this pure simulation-based exploratory study. All insights are useful, whether positive or negative because we don’t know what to expect.

Conflict of Interest and Ethical Concerns

No living creatures are affected by this study. No data from humans or animals is being used. All data and findings will be open sourced with appropriate licenses for proper attribution. No COI is anticipated.

Delighted to get such a positive response from you both. And honoured to be the first community co-mentor!

Thinking about modelling reputation would be really interesting. And tracking the relationship between datasets that produce new datasets or datasets that produce algorithms is a really interesting one (this isn’t currently possible within Ocean Protocol as far as I know, but is an important piece to make sure that value flows back to data providers when an algorithm is called). Also, love the idea of incorporating public goods funding.

I’m excited to start working more closely with you and look forward to exploring more at the weekly meeting :slight_smile:

Hi Jakub, thanks so much for submitting this really interesting proposal. I learnt a lot by going over it! My main comment is about the scope: you say above that you’re modelling the marketplace component of the ecosystem to begin with, but Shady has suggested that for the baseline model you also look into data storage, publications, etc. (so value around the wider ecosystem/community). When reading your proposal, I did have in mind to ask how the marketplace component will fit with the other parts, so I’m glad Shady has suggested this, but I’m unsure if this is possible in your project or if instead it’s best to stay focused on the marketplace for now. If you do focus on the marketplace only, it’d be good to still add ideas/next steps for how this fits with the wider ecosystem. Really interesting and well written proposal - thanks so much for submitting!

Hi Jakub,

Thank you for your submission, and for the initiative to submit a proposal. It’s good to get involved early, so I first want to commend you on this step. Congratulations, well done!

I’ve had a look at your proposal and given a review for you and the Opscientia community. Overall, you’re proposing to model tokenized DeSci communities using an agent-based approach. To do this, you’ve proposed to expand the toolset of a simulator of tokenized ecosystems called ‘TokenSPICE’, to give it the ability to specifically model DeSci communities. Such modelling efforts will ultimately aid in the design of such tokenized communities. I believe it is important to do such work.

Your proposal needs a bit of work, especially on clarifying your aim and objectives. Your aim as it stands is not well aligned with the scope of your approach and needs focus and refining. There is some important context missing, and you might need to rethink your use of figures. As it stands, I have a general idea of what you might be trying to do, but there is some work to do to make it a lot clearer. If this proposal does get funded, you can use my review as constructive feedback for your report too.

I have some doubts about the workload of your second year of studies, and being able to fit in 20 hours of work per week. To be honest, I don’t think this will work, even though your CV shows a lot of extracurricular experience. It might be better to postpone the work to the summer, or else the fellowship might have to be spread out over more than 12 weeks, if this is something Opscientia is prepared to amend.

You will find my more detailed comments below:


You need to be more specific on your approach and the likely outcome and impact of your study. You can add a few lines below your closing paragraph to explain the approach (agent-based modelling?), and how your results will feed into further research or decision making. Make sure you clearly state your aim and your objectives, which you will outline in the sections below.

Implementation plan:

There are a few things you seem to jump over, and which are very important to clearly state in this section. This again ties in with a well-focused aim and objectives that lead from that aim. For example, it is not clear to me how you go from running simulations, to identifying the optimal community. I think the problem here is that your aim is too ambitious, and you need to focus your aim to fit with the scope of your project. I’m also missing some justification for your approach and use of tools, which needs some context. See more detailed comments below:

  • Could you explain your first figure? You say there is an example below, but you don’t say much about the example. Can you label it Fig. 1 and give it a caption? This will help readers understand what you’re trying to say with the figure.
  • As a big picture question, I’d like to hear your reasoning for the focus on tokens. E.g. NFTs might become an important part of the DAO ecosystem through e.g. recognition of contributions and status within the community. This could be just a few sentences, or maybe you could consider this as part of your project, acknowledging of course your limited time resource.
  • You jump straight into your approach using TokenSPICE. Could you give a quick justification of using this approach? Is this the only tool that exists to do agent-based modelling of tokenized communities? Could you give a quick intro?
  • Can you define KPI before using the acronym?
  • How will you test your model against reality? E.g. would you test it against data of known community dynamics? It is important to set out what your aim is here: to provide possible models that will then be available for testing against real data, or to actually identify the optimal token community? Before you reach that latter aim, I believe you will have to do more than run a few simulations. Thus, you might have to focus your overall aim on what you can deliver.

Minimum deliverables:

A more appropriate aim might come from what you’ve written here: e.g. establishing a methodology for simulating token ecosystems, with a particular focus on DeSci applications.

Aim/deliverable 1:

I think this should be called an objective/deliverable (semantics, but it helps to separate aims and objectives: one aim, comprising two objectives serving that aim).

“centered”: if you’re writing in British English (noting you’re based in the UK) this would be spelled as “centred”.

Again, can you give this figure a label and caption it? You also need to say something about this figure to let us know what you want the reader to understand from the figure. There’s a lot going on! There are also a lot of acronyms and I have no idea what’s going on. It might be better to make your own simplified figure here, supporting the message you want to convey.

Now, the key question here is: what is the aim of a DeSci community? Ultimately, it will be the same as science at large: produce, curate and improve our knowledge and understanding of the world. So there might be some KPIs that assess this general aim. What particularly about a DeSci community might be an important aim? Maybe the diversity of scientists working on a problem, e.g. from geographic locations? The accessibility of publications/funding? The speed of proposal to funding? It might be worth thinking a bit more about these questions. I don’t think number of papers published or researchers’ income are particularly helpful assessment tools.

What is your deliverable? New agent classes and a new netlist → so in essence your deliverable would be an expansion of the TokenSPICE software to include tools to model DeSci communities? These tools can be used by future researchers wanting to model DeSci token dynamics?

Aim/deliverable 2:

I think this needs a rephrase, since I don’t quite get what you’re trying to do here. Is it creating a methodology of multi-agent based modelling of DeSci communities? And this methodology is to be used by designers of DAOs? How does this differ from your first deliverable? Maybe it is that your first deliverable is contribution to the software (agent classes and netlist), and your second deliverable is a written report/paper about the methodology. A readme for those wanting to use the expanded toolset?

Time and commitments:

15-20 hours per week on top of full-time studies is very ambitious. I can see many reasons why that won’t fit in your schedule. Have you talked about wanting to do this with your personal mentor/programme director? I think you will need to have some conversations at uni about this project and possible support that they could give you. Second-year studies tend to be much more involved than first-year studies. I can see you’ve done a lot of extra-curricular activities, so you might be very good at multi-tasking. Still, I’d feel more comfortable supporting your proposal if you had a statement from your university. This might also work better as a summer project, so consider postponing the fellowship. Ultimately, I would prioritize those with a break in their education or career in this case, aligning with the expectations and responsibilities of the fellowship.


Hi Jesse,

Thank you very much for the detailed feedback! I really appreciate it and will update my proposal accordingly.

I definitely agree that I need to explain the objectives, tools, terminology, and diagrams used better so that there isn’t room for confusion. Regarding the aims of a DeSci community, I think it might be interesting to explore the diversity of scientists, accessibility/speed of funding, and overall access to resources as the indicators of success, although I do still believe that income of researchers must necessarily play a crucial role as well, simply because it can be viewed as 1) the utility that researchers provide and 2) an indicator of long term sustainability (meaning that researchers can keep doing research in the long run). I think this is also connected to the number of research papers published through the community, which is firstly an indicator of the engagement within the system, and also of a successful incentive mechanism to complete research projects and produce research papers (although some work needs to be done on determining the indicators of a high-quality research paper). Lastly, concerning the time commitment, it is hard for me to comment on it as of right now because my term has just started this week, therefore it is difficult for me to say with certainty whether it is possible to spend 15-20 hours per week on this project. I will communicate with my mentor to think of alternative arrangements if it isn’t.

Hi Sarah,

Thank you very much for the feedback. I think the marketplace component should in itself contain the data storage, publications, etc. since it should serve as the component that makes the whole ecosystem sustainable. I’d say the question is, how do we capture as much value from a research project within the ecosystem as possible? Since research projects can add a lot of external value (through the invention of new technologies or algorithms, for instance), I think it’s important to try to capture as much of that value within the open science ecosystem to provide a long-term incentive for all participants to stay within that ecosystem, which is why I think the marketplace component should serve not only as something researchers use to get everything they need for their research, but also as the place where researchers publish their findings after their project is over, and in doing so keeping the value of that research within the system.

Week 5 Summary Continued

I have reached the maximum number of characters for the main post, so from now on, I’ll post my weekly updates here.

The rolling basis funding model has a number of major changes from the previous models. Firstly, the individual agents are more inter-connected, since their actions are now influenced by the actions of other agents rather than by the number of ticks that have passed since the previous proposal. In addition, since there are multiple proposals being funded at a time, the knowledge_access index is no longer synchronized between researchers, since if two researchers conduct independent research projects, the researchers that have not been funded can buy into the market for both research projects to actually gain more knowledge_access points than the ones who have received funding. This means that whenever a research project finishes, the researchers who have not been funded previously have a much higher chance of being funded. The plot below shows how the knowledge_access varies in a simulation of 5 researchers with 3 proposals funded at a time.


Notice that the researcher with the highest knowledge_access did not win the initial funding. Since funding is granted on a rolling basis, each proposal now has an additional time property that is randomly generated (in the range of approximately 6-16 months) and is considered as a variable in the proposal evaluation (done by the MultTimeDAOTreasury).
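The sketch below shows one way a proposal's time property could enter the evaluation. The scoring rule (knowledge_access divided by time) is purely an assumption made for illustration; the post does not state how MultTimeDAOTreasury actually weighs the two, and `make_proposal`, `score`, and `select_funded` are hypothetical names.

```python
import random


def make_proposal(researcher, knowledge_access, rng):
    """Create a proposal with a random duration of 6-16 months."""
    return {"researcher": researcher,
            "knowledge_access": knowledge_access,
            "time": rng.randint(6, 16)}


def score(proposal):
    """Assumed rule: reward knowledge access, penalise long projects."""
    return proposal["knowledge_access"] / proposal["time"]


def select_funded(proposals, free_slots):
    """Return the best-scoring proposals for the currently free funding slots."""
    ranked = sorted(proposals, key=score, reverse=True)
    return ranked[:max(free_slots, 0)]


# Hand-written proposals so the outcome is easy to follow.
proposals = [
    {"researcher": "r0", "knowledge_access": 12, "time": 16},
    {"researcher": "r1", "knowledge_access": 8, "time": 6},
    {"researcher": "r2", "knowledge_access": 10, "time": 10},
    {"researcher": "r3", "knowledge_access": 3, "time": 12},
]
funded = select_funded(proposals, free_slots=3)
random_proposal = make_proposal("r4", 5, random.Random(0))
```

Under this assumed rule, r0 has the highest knowledge_access but ranks behind r1 and r2 because its project is the longest, which mirrors the observation that the best-informed researcher does not necessarily win.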

Week 6 Summary

This week I focused on the development of the public funding profit sharing model. More details below.

Questions that this model answers:

To what extent can we use the web3 profit sharing model to fund public science?


Is the model scalable in that private researchers will want to use the market? Are they going to be compensated enough?

Schema of the public/private open science model

This model should be thought of as simply a higher resolution profit-sharing model, because the main mechanics are quite similar to the previous profit-sharing models. The key difference is that this model differentiates public/private goods, data/algorithms/compute, and the researchers are also split into data providers, algorithm providers, and compute providers. Further, the DAO Treasury only funds public goods. I want to model the adoption of the market by private research agents as well to see what that does to the overall sustainability of the system.

The top loop is actually almost exactly the same as the profit sharing model, the only modifications to that are:

VersatileResearcherAgent (formerly MultTimeResearcherAgent) needs to have the public property and one of these two properties: data, algorithm (if there is a list of agents, these can just be assigned randomly, perhaps in some ratio so that we have a certain number of each type)

  • data: means this is a Data Provider researcher → part of the funding is used to collect new data (by running experiments, think of a neuroscientist collecting EEG data), the rest is used to publish to the public market
  • algorithm: means this is an Algorithm Provider researcher → part of the funding is spent in the knowledge market (collecting data), the rest is used to publish the algorithm to the public market (Note: should this agent only buy from the public market or from the private market as well? If from both, how is that action determined?)
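One way to assign the data/algorithm property "in some ratio" is a weighted random draw. This is only a sketch: `assign_asset_types` and the 2:1 ratio are hypothetical and not taken from the project code.

```python
import random


def assign_asset_types(n_agents, ratio, rng):
    """Draw one asset type per agent, weighted by `ratio` (type -> weight)."""
    types = list(ratio)
    weights = [ratio[t] for t in types]
    return rng.choices(types, weights=weights, k=n_agents)


rng = random.Random(0)  # fixed seed so a run can be repeated
public_types = assign_asset_types(6, {"data": 2, "algorithm": 1}, rng)
```

A weighted draw only matches the ratio on average; if an exact split is needed, building the list explicitly and shuffling it would be the alternative.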

PublicKnowledgeMarketAgent and PrivateKnowledgeMarketAgent work very similarly to MultTimeKnowledgeMarketAgent from previous models, but now they keep track of the more complex parameters and they are less dependent on the DAOTreasuryAgent for signals.

The bottom part of the model can be thought of as the second loop. It is essentially representing all the private research that is being done within the open science space. Firstly, let’s consider the incentives a private entity might have to participate in this ecosystem:

  1. greater and cheaper access to data, algorithms, and compute services
  2. the continuous rewards from other people using your data, algorithms, and/or compute services

Now let’s take a look at the agents:

VersatileResearcherAgent: as above, but has the private property and then one of 3 properties: data, algorithm, compute, signifying whether this researcher sells data, algorithms, or compute-to-data service (which means they have a dataset too large to store in a cloud storage, so they allow other researchers to run compute services on that data)

  • data: Data Provider
    • publishes datasets to private knowledge market (and receives rewards from those)
    • buys algorithms to produce new data
  • algorithm: Algorithm Provider (more active researcher)
    • buys data and compute services from either private or public market (as part of their research)
    • publishes algorithms to private market (and receives rewards from those)
  • compute: Compute Provider (rarest)
    • publishes compute service to private market (where compute service means allowing other people to run their algorithms on this agent’s data on this agent’s machine) (receives rewards from that, rewards are higher than for normal data)

The algorithm agent is a representation of a researcher who conducts private research and uses the knowledge market to access all necessary resources. The data agent represents a researcher who is regularly running experiments and they want to share their data to get more value out of it. The compute agent can be thought of as a centralized organization coming to the DeSci space to earn more income, they already have a lot of data and compute power and their research does not really make use of the knowledge market.

How are the actions of the agents determined at each step?

The two agents who are buying from the market are data and algorithm. Here is what their buyAssets method could look like:

  1. choose whether to buy from public or private market
  2. choose whether to buy data or compute service (only for algorithm)
  3. choose whether to buy algorithm or compute service (only for data)
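The three steps above could be sketched as a single decision function. The name `choose_purchase`, the `p_public` probability, and the uniform choice between asset kinds are assumptions; the actual buyAssets method may weigh prices or availability instead.

```python
import random

# What each buyer type can purchase, following the three steps above.
BUYABLE = {
    "algorithm": ("data", "compute"),
    "data": ("algorithm", "compute"),
}


def choose_purchase(asset_type, rng, p_public=0.5):
    """Return (market, asset) for a buyer, or None if this type does not buy."""
    if asset_type not in BUYABLE:
        return None  # e.g. compute providers do not buy in this sketch
    market = "public" if rng.random() < p_public else "private"
    asset = rng.choice(BUYABLE[asset_type])
    return market, asset


rng = random.Random(42)
market, asset = choose_purchase("algorithm", rng)
```

Setting `p_public` per agent would be one way to express a preference for the cheaper public market.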

How is Public & Private Market differentiated?

PublicMarket has lower prices, but a smaller number of assets


Below are the plots for a simulation with 3 private researchers and 2 public researchers.


As expected, the PrivateKnowledgeMarket has more value than the PublicKnowledgeMarket, simply because it costs more to publish there and there are more private publishers than public ones. In addition, the PublicKnowledgeMarket collects the most fees because buying from it does not represent buying from another researcher, but rather from the DAO. It is also interesting to see that even after 30 years, the DAOTreasury still has around 300k OCEAN, therefore this system seems to be quite effective at funding public goods in the long run. Note: The Researcher OCEAN plot incorrectly only displays the OCEAN of the public researchers, namely their salary from the grants. The OCEAN of the private researchers is not zero, it is just tracked by a different variable and needs to be plotted.


One major limitation of this model is the setup of the actions of the private researchers. Currently they are taking actions synchronously with the public researchers; however, this leads to their funds eventually being depleted, because even when they get rewards after somebody buys their service, they are going to spend even more the next round on publishing new assets. This will be improved in the next iteration of the model, where the private researchers would wait for some threshold ROI before publishing any new research. Furthermore, this would go hand in hand with modelling the growth of the community as a whole based on the value in the marketplace (or some other metric).
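A threshold-ROI rule of this kind might look like the sketch below. The ROI definition (net gain divided by cost), the default threshold of 0 (break-even), and the function names are assumptions for illustration only.

```python
def roi(rewards, cost):
    """Return on investment for one published asset: net gain divided by cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (rewards - cost) / cost


def can_publish(funds, publish_cost, last_rewards, last_cost, roi_threshold=0.0):
    """Publish only if funds allow it and the previous asset has paid off."""
    if funds < publish_cost:
        return False
    if last_cost is None:
        return True  # nothing published yet, so there is no ROI to wait for
    return roi(last_rewards, last_cost) >= roi_threshold
```

For example, `can_publish(500, 100, 50, 100)` is False because the last asset has only recovered half of its cost, while `can_publish(500, 100, 100, 100)` is True at break-even.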

Week 7 Summary

This week, I focused on cleaning up all my code, making sure everything was running as expected and I gave my midterm presentation at the DAO Hall. Doing so made me uncover some bugs in my code in the earlier models, for instance in the first profit_sharing model, where I was still using a set to store and connect all my agents, which led to some problems. I worked on debugging all these issues and I wrote a new unit test for the KnowledgeMarketAgent to make sure everything is running accordingly in the future.

I also started researching NIH funding data that will hopefully serve as a validation of the baseline model and have found some interesting results. I am specifically interested in R01 grants and I also looked specifically at neuroscience projects. Interestingly, the two biggest grants have been given to projects that have been funded consecutively for over a decade. This leads me to the overall R01 success rate data, which showed that in 2019:

  • Success rate for new research project grants: 18.7%
  • Success rate for continuing research project grants: 44%

I have only just started getting into this data, but even from the limited datasets I looked at, it seems to be the case that continuing research projects are more preferred than new research projects.

Week 8 Summary

This week I focused on improving the profit sharing model, cleaning up my code, and simulating community growth. Firstly, I took some time going through the models to see that everything is still working correctly. I identified some inconsistencies between the models in addition to a number of static typing errors, which are all now fixed and the models are all running as expected, with multiple unit tests for the individual agents (which are still being added on a rolling basis). I also made sure to complete all model summaries, which will be helpful when writing the documentation and guide on GitHub.

The public funding model had a number of issues that needed addressing. Firstly, I expanded the KPIs.py to correctly plot the data from private and public researchers and to include their asset types as well (each researcher is classified as either public or private and as one of compute, data, or algo). Secondly, I refactored SimState.py to support an arbitrary number of private/public researchers.



I also improved the actions of private researcher agents to always wait for ROI after they publish knowledge assets to the PrivateMarketAgent, which led to them publishing a fixed number of knowledge assets and not really making a profit in the long run. This makes sense because it is very expensive to publish to the PrivateMarketAgent and it is not as expensive to buy from it, but there isn’t enough volume of transactions so that researchers get their ROI.

This brings me to the newest addition to the public funding model: community growth. I created a new ResearcherGeneratorAgent which adds new private researchers to the simulation based on either the knowledge assets in the marketplace, the treasury updates, or time, with the option to either have linear, decreasing, or exponential growth of the community.
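The three growth shapes could be expressed as a function of the growth period. This sketch covers only the shape, not the trigger (knowledge assets, treasury updates, or time), and the formulas, the `base` value, and the name `researchers_to_add` are assumptions rather than the ResearcherGeneratorAgent implementation.

```python
def researchers_to_add(mode, period, base=4):
    """Number of new private researchers to add in a given growth period."""
    if period < 0:
        raise ValueError("period must be non-negative")
    if mode == "linear":
        return base                    # same number every period
    if mode == "decreasing":
        return base // (period + 1)    # fewer newcomers as time passes
    if mode == "exponential":
        return base * 2 ** period      # newcomers double every period
    raise ValueError(f"unknown growth mode: {mode}")


schedule = {mode: [researchers_to_add(mode, t) for t in range(4)]
            for mode in ("linear", "decreasing", "exponential")}
```

Here `schedule` maps each mode to the number of newcomers over the first four periods, which makes the three options easy to compare side by side.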

Lastly, I have been researching the data from NIH to extract information about the success rates of R01 research proposals.


Week 9 Summary

This week I took a step back to get a higher level overview of this entire project. What is token engineering for science all about? What’s more, what is an effective and fair science ecosystem optimizing for? Based on my discussions with Shady and Richard, it became clear that this needs to be addressed, so I spent some time this week thinking about the objective function of science and began writing a blog post on it. Furthermore, I wrote 3 additional blog posts on the work I’ve done so far which are to be published on the Opscientia pulse.

I took the time thinking about the objective function of science also as a means to get more information about existing grant funding agencies (since they are already optimizing for some metrics related to science) and have found interesting information about the grant decision process implemented by the NSF, which serves as a good reference point for the development of higher-resolution open science models in TokenSPICE.

Development-wise, I added new metrics to the proposals generated by VersatileResearcherAgents that are also used in the decision making process of the VersatileDAOTreasuryAgent, namely integration denoting the interdisciplinary nature of the research proposed, novelty, which is a measure of how independent a research project is from previous research, and lastly impact, which is a measure determined by the maximum of integration and novelty, multiplied by 10 (integration and novelty are always numbers between 0 and 1). All of these parameters were added to the evaluateProposal function of the VersatileDAOTreasuryAgent and we’re keeping track of them in KPIs.py. The question we’re trying to answer is: given all the parameters in the proposals, what kind of research are we funding (in terms of the new parameters)? Presumably, we want to fund research that is both novel in its own field, and will have a high degree of integration across the board.

| Integration, Novelty | <0.5 | >0.5 |
| --- | --- | --- |
| <0.5 | :frowning: | :slightly_smiling_face: |
| >0.5 | :slightly_smiling_face: | :grinning_face_with_smiling_eyes: |

The table above shows this basic idea. This relates back to the objective function of science and more work needs to be done in that regard, because there is value in research for the sake of research which might not have any impact for decades.


The plots above show the integration, novelty, and impact of the projects funded by the VersatileDAOTreasuryAgent. The top plot also has an in_index, which is just integration multiplied by novelty. We can see that the majority of research projects funded have an impact higher than 5, but there are still some projects that are in the lower impact category.
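The proposal metrics from this week can be written down compactly. The formulas for impact and in_index follow the description above; the helper names and the `quadrant` labels are my own shorthand for the table and are not part of KPIs.py.

```python
def _check_unit(x, name):
    if not 0.0 <= x <= 1.0:
        raise ValueError(f"{name} must be between 0 and 1")


def impact(integration, novelty):
    """Impact as described above: the larger of the two scores, times 10."""
    _check_unit(integration, "integration")
    _check_unit(novelty, "novelty")
    return 10 * max(integration, novelty)


def in_index(integration, novelty):
    """Product of integration and novelty; high only when both are high."""
    _check_unit(integration, "integration")
    _check_unit(novelty, "novelty")
    return integration * novelty


def quadrant(integration, novelty, cut=0.5):
    """Which cell of the table a proposal falls into (labels are shorthand)."""
    high = int(integration > cut) + int(novelty > cut)
    return ("neither", "one", "both")[high]
```

Because impact takes the maximum, a proposal that is strong on only one axis still gets a high impact, whereas in_index stays low unless both are high. That difference is worth keeping in mind when reading the two curves.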

Week 10 Summary

This week, I spent some time fixing a bug where the growth public funding model was incorrectly logging data into a csv file, which affected the subsequent plotting of that data. After that was fixed, I started working on the final model that builds upon the consideration of the objective function of science and the work from the previous week on the improved metrics. Firstly, I expanded the KPIs.py to make use of the metrics we have tracked so far on the level of individual agents to show some characteristics of the entire ecosystem. To recap what data we are gathering:

Available parameters:

  • number of research projects funded

  • number of research projects funded per researcher

  • OCEAN per researcher

  • OCEAN in DAO Treasury

  • OCEAN in market (representing knowledge assets)

  • number of knowledge assets in market

  • number of knowledge assets in market per asset type

  • number of knowledge assets in market per asset type per researcher

  • average integration, novelty, and impact of currently funded research projects

  • total fees collected through market

These parameters are then used to calculate these additional metrics:

  1. The value of the knowledge markets - currently the best indicator of the objective function of science (as value in the knowledge market literally means the value of the knowledge that is contained within them)

  2. The value that ends up in the hands of researchers - an indicator of the fairness of the system, i.e. are the people doing the research getting rewarded?
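Both metrics can be read off a relative value distribution, i.e. the share of all OCEAN held by each part of the system. The sketch below is an assumption about how such shares might be computed, and the amounts are made up rather than taken from a simulation run.

```python
def relative_value_distribution(treasury, markets, researchers):
    """Share of all OCEAN held by the treasury, the markets, and researchers."""
    holdings = {"treasury": treasury, "markets": markets, "researchers": researchers}
    total = sum(holdings.values())
    if total <= 0:
        return {name: 0.0 for name in holdings}
    return {name: amount / total for name, amount in holdings.items()}


shares = relative_value_distribution(treasury=300_000, markets=500_000,
                                     researchers=200_000)
```

Tracking these shares at every tick shows whether treasury value is being converted into knowledge assets and researcher compensation over time.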

We are also keeping track of the value flow. This yielded the following plots for the growth public funding model:


Note, this model has a large number of private researchers with considerable amounts of OCEAN in their wallets, hence their relative value dominance in the system. Nevertheless, the knowledge markets are increasing in value over time (denoting the new knowledge that is encapsulated within them). I added these new KPIs to the older models as well. For instance, in the mult-time-profit-sharing model:

[Plot: relative value distribution in the system (LINEAR)]

This value distribution is more representative of how an open science community should function, where the value from the treasury is transformed into knowledge (the value in markets) and researchers (their compensation for producing this new knowledge).

Now to the final model. Based on my conversation with Shady, I would like to introduce a new agent to the public research funding model that will fill in the gaps of the previous iterations. Going back to what the role of science is, a major limitation of our approach was that we only focused on the researchers, i.e. the stakeholders who actively participate to maximize the science objective function. Why is this approach limited? Because the good that comes from maximizing the objective function of science doesn’t take shape in the research, but rather in how that research impacts the lives of everyone, how it affects the technologies we use, how it affects education, etc. In essence, an open science community, on top of enabling scientists to conduct research, should maximize scientific engagement. Scientific engagement leads to faster adoption of current research (or integration of existing knowledge assets), and it leads to an expansion of the scientific community itself (think of the role of popular science writers in onboarding people to science). With this in mind, I propose an iteration of the public funding model that adds a CommunityAgent and I will design the netlist so as to maximize community engagement. A CommunityAgent should be rewarded for certain actions by the DAOTreasuryAgent, but should also contribute to the DAOTreasuryAgent once certain parameters reach some kind of threshold (think about playing an open source game that gives you rewards that you contribute to support future development).

Week 11 Summary

This week I focused on improving the final model with the CommunityAgent. As the models become more complex, it is important that we effectively track all the relevant data and that the simulations are as modular as possible. For this reason, I expanded the latest model by introducing a new ResearchProject class. ResearchProject is not an agent, as it does not inherit from AgentBase and does not take any steps. However, it is stored in the simulation state and tracked by KPIs.py. Each ResearchProject is created by a VVersatileResearcherAgent which has been funded by the VVersatileDAOTreasury. Currently, a ResearchProject has the following attributes:

  • name
  • creator (the ResearcherAgent)
  • value (corresponds to the funding the researcher received for it)
  • impact (initialized to a value from 0-10)
  • integration (value from 0-1)
  • novelty (value from 0-1)
  • engagement (unbounded).
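
The attribute list above could be sketched as a plain data class. This is an illustration under assumptions, not the code in the repository: storing the creator as a name string, starting engagement at 0, drawing the initial metrics uniformly at random, and the helper `create_project` are all my own placeholders.

```python
import random
from dataclasses import dataclass


@dataclass
class ResearchProject:
    """Plain record kept in the simulation state; not an agent, takes no steps."""
    name: str
    creator: str             # name of the ResearcherAgent that created the project
    value: float             # funding the researcher received for the project
    impact: float            # initialized to a value in [0, 10]
    integration: float       # value in [0, 1]
    novelty: float           # value in [0, 1]
    engagement: float = 0.0  # unbounded; starting at 0 is an assumption

    def __post_init__(self) -> None:
        # Validate only the initial ranges stated in the attribute list.
        if not 0.0 <= self.impact <= 10.0:
            raise ValueError("impact must be initialized in [0, 10]")
        if not 0.0 <= self.integration <= 1.0:
            raise ValueError("integration must be in [0, 1]")
        if not 0.0 <= self.novelty <= 1.0:
            raise ValueError("novelty must be in [0, 1]")


def create_project(name: str, creator: str, funding: float,
                   rng: random.Random) -> ResearchProject:
    """Hypothetical helper: draw the initial metrics uniformly at random."""
    return ResearchProject(
        name=name,
        creator=creator,
        value=funding,
        impact=rng.uniform(0.0, 10.0),
        integration=rng.random(),
        novelty=rng.random(),
    )


project = create_project("project_0", "researcher_0", 10_000.0, random.Random(0))
```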

The ResearchProject allows for tracking new metrics such as academic lineage, how different research projects are related to each other, and more (although most of these metrics are currently work in progress). In the current simulation, the CommunityAgent randomly interacts with a ResearchProject from the available pool, which increases that particular project’s engagement and impact. Further interactions might be possible in the future. Below are some plots showing the metrics of all ResearchProject instances in a simulation. Note that this expansion of the simulation required adding functionality to the tsp commands, and this change currently breaks all simulations of the previous models that do not use ResearchProject. If anyone wants to run the previous simulations, just check out an earlier commit.
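
The random interaction described above can be sketched roughly as follows. Plain dictionaries stand in for ResearchProject instances, and the function name `community_interaction` as well as the increment sizes (`engagement_step`, `impact_step`) are assumptions for illustration, not the values used in the simulation.

```python
import random


def community_interaction(projects, rng, engagement_step=1.0, impact_step=0.1):
    """Pick one project uniformly at random and bump its engagement and impact."""
    if not projects:
        return None  # nothing to interact with yet
    project = rng.choice(projects)
    project["engagement"] += engagement_step
    project["impact"] += impact_step
    return project


# Three stand-in projects, ten interactions by a single community agent.
projects = [
    {"name": f"project_{i}", "engagement": 0.0, "impact": 5.0} for i in range(3)
]
rng = random.Random(42)
for _ in range(10):
    community_interaction(projects, rng)
```

Because each interaction touches exactly one project, total engagement across the pool grows by one step per interaction, while its distribution across projects depends on the random draws.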


Another change to this newest public funding model is the addition of salaries for public researchers. Since public researchers are not getting any passive income from the knowledge assets they publish to the knowledge market, it makes sense for them to be compensated via a salary to cover their living costs, which, up until now, wasn’t reflected in the simulations.
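
As a rough illustration of the salary mechanism, here is a small sketch. I am assuming here that salaries are paid out of the treasury in fixed amounts per pay period, which may not match the actual implementation. `pay_salaries` and all numbers are hypothetical.

```python
def pay_salaries(treasury_ocean, researcher_wallets, salary):
    """Pay `salary` OCEAN to each researcher while funds last.

    Returns the remaining treasury balance. Assumes (hypothetically) that
    the treasury is the source of all salaries.
    """
    for name in researcher_wallets:
        payment = min(salary, treasury_ocean)  # never overdraw the treasury
        researcher_wallets[name] += payment
        treasury_ocean -= payment
    return treasury_ocean


wallets = {"public_researcher_0": 0.0, "public_researcher_1": 0.0}
treasury = 100.0
for _ in range(3):  # three pay periods
    treasury = pay_salaries(treasury, wallets, salary=20.0)
```

With these placeholder numbers the treasury runs dry during the third pay period, so the second researcher misses the last payment. This is the kind of edge case the real model would need to handle explicitly.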


Now, public researchers’ OCEAN is increasing over time, as in the previous profit-sharing models.
