BD-4245 working notes
development branches
fio
feature/BD-4245-fio-develop-12132022
fio.test
feature/BD-4245-fiotest-develop-12142022
fio.devtools
feature/BD-4245-fiodevtools-develop-12142022
PRs
fio.devtools
fio
fio.test
instructions.
pull all branches.
build and install fio repo.
start the chain.
run the fio.test branch; this will load the chain with 21k domains. The public and private keys and the account name of the account receiving the domains are printed at the top of the tests; use the pub key in get_fio_domains testing once the chain is set up.
desired tests --
run a full set of regression tests, verify these run.
load the chain with the 21k domains.
then run a script that constantly calls get_fio_domains; while it is running, run a regression.
this will show whether the overhead of the 21k domains causes failures in the regression tests.
Integrated testing on 3 node private test net
Ed will set up on the following AWS server
ssh -i "DapixWestPair2.pem" ubuntu@ec2-18-237-87-177.us-west-2.compute.amazonaws.com
Ed will complete the above testing and also will explore various mods to the timeout for secondary index queries on the API node.
results will be captured here as they occur.
environment setup and establishment of regression tests is ongoing.
tactics
I started by re-establishing tests to load the domains with “toooo many domains per fio address”.
I then removed the time constraint on get table rows in the chain plugin.
I then ran tests to see how this version performs.
1) first I loaded an account with 20k domains.
2) I made a script that fetches the domains on this account in the background, sleeping 0.1 seconds between calls to get_fio_domains.
3) I loaded the chain, then kicked off the query while creating the 20k domains; this all ran as expected.
4) then I ran the testnet smoke tests while querying get_fio_domains. these ran as expected.
5) then I ran multiple runs of the testnet smoke tests concurrently while running the query to get_fio_domains. these ran as expected.
6) then I ran two instances of the get_fio_domains query concurrently while running the smoke tests. these all ran as expected.
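A minimal sketch of the kind of background query script described in step 2, not the actual script used. Assumptions: the node URL and port below are placeholders, the request sends only fio_public_key (no paging parameters), and the response is assumed to carry its results in a fio_domains array. It records counts and timings rather than printing each full response.

```python
import json
import time
import urllib.request

# Assumption: placeholder URL for a local API node; adjust host and port.
API_URL = "http://localhost:8889/v1/chain/get_fio_domains"


def fetch_domains(pub_key, url=API_URL, timeout=30.0):
    """POST one get_fio_domains request and return the decoded JSON body."""
    body = json.dumps({"fio_public_key": pub_key}).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def poll(pub_key, calls, sleep_s=0.1, fetch=fetch_domains, sleep=time.sleep):
    """Call get_fio_domains `calls` times, sleeping `sleep_s` between calls.

    Returns summary stats instead of echoing the (potentially huge) responses.
    """
    stats = {"ok": 0, "failed": 0, "slowest_s": 0.0, "last_count": 0}
    for _ in range(calls):
        start = time.monotonic()
        try:
            result = fetch(pub_key)
            stats["ok"] += 1
            # Assumption: results are returned under the "fio_domains" key.
            stats["last_count"] = len(result.get("fio_domains", []))
        except Exception:
            # Timeouts, refused connections and HTTP errors all count as failures.
            stats["failed"] += 1
        stats["slowest_s"] = max(stats["slowest_s"], time.monotonic() - start)
        sleep(sleep_s)
    return stats
```

To approximate steps 5 and 6, run poll(pub_key, calls=1000) from two or more shells at once, using the pub key printed at the top of the fio.test run, while the smoke tests execute.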
next steps:
I'll take a step back, regroup, and try to find a way to demonstrably crush this server.
I met with Eric and we agreed to do this same kind of test, but with the number of domains on the account increased to 100k.
I'm setting up to run these tests next.
testing round 2
I increased the number of domains on an account to 104k.
the pub key for this account is FIO6jPe1dqPzyT15HuvBe8kpGdNA471tR6B2UixFLbR4sACQAc2tK
I then repeated my testing. My dev box became very sluggish, I think due to the amount of text being managed by terminal windows from the LARGE results from get_fio_domains.
from what I observed it seems that the node was still responsive.
I would like to have Eric run the smoke tests while I run the get_fio_domains queries, to see if the node stays responsive for Eric running the tests.
once we do this test we will regroup and meet up to discuss.
communications to the community
The FIO core team is looking into the feasibility of removing the processing time limit on state getters in the chain_plugin (in the present version used by FIO the time limit for a getter is set to 100ms in the chain plugin).
For background, please refer to EOSIO issue #3965, "Non-deterministic output from get table with large limits" (EOSIO/eos).
One question the FIO Core team would like to answer is this: what is the intended purpose of the time limit called WALKVALUE within the chain plugin for get_table_rows_ex and get_table_rows_by_seckey? This value is defined in chain_plugin.hpp.
Can this timeout be removed, or made configurable by an API node hosted in the FIO protocol? If so, it could provide tunable and more consistent getter performance on API nodes when there are many tens of thousands of rows per secondary key within a state table.
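To illustrate the question, here is a small sketch of a time-bounded table walk with a configurable limit. This is not the chain_plugin code; the function name, the parameter walk_limit_s and the returned more flag are hypothetical and exist only to show how a fixed time budget can truncate a large result set, and how a configurable (or disabled) limit changes that.

```python
import time


def walk_rows(rows, walk_limit_s=0.1, clock=time.monotonic):
    """Collect rows until they run out or the time budget is spent.

    walk_limit_s=None disables the limit (hypothetical "no timeout" setting).
    Returns (collected_rows, more), where more=True means the walk was cut short.
    """
    deadline = None if walk_limit_s is None else clock() + walk_limit_s
    collected = []
    more = False
    for row in rows:
        # Check the budget before taking each row; stop once it is exhausted.
        if deadline is not None and clock() >= deadline:
            more = True
            break
        collected.append(row)
    return collected, more
```

With a fixed budget, how many rows come back depends on how fast the node happens to be at that moment, so identical queries can return different row counts. With the limit raised or disabled, the walk always runs to completion at the cost of longer requests.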
The next step will be to make the timeout configurable on the API node; this will be done under another story.