Blockchain structure  


jerry
(@jerry)
Active Member
Joined:3 months  ago
Posts: 18
10/06/2018 5:32 pm  

I am trying to understand this video -- how blocks are chained in RChain

According to this slide, a block consists of the following fields:

  • Header
    • parents hash ordered list
    • post state hash
    • new code hash
    • reductions/receipt hash
  • Body
    • post state which is a state trie
    • new code
    • reduction/receipts
  • signature
  • justifications
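
As a rough sketch of the field list above, here is how such a block could be modelled in Python. The class names, field names, and the use of SHA-256 over JSON are assumptions made for illustration only; they are not RChain's actual data format.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Dict, List


def digest(obj) -> str:
    """Hash any JSON-serialisable value (illustrative; not RChain's real hashing)."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()


@dataclass
class Body:
    post_state: Dict[str, list]   # stand-in for the state trie: name -> data
    new_code: List[str]           # Rholang terms deployed in this block
    reductions: List[str]         # receipts of the reductions performed


@dataclass
class Header:
    parent_hashes: List[str]      # ordered list of parent block hashes
    post_state_hash: str
    new_code_hash: str
    reductions_hash: str


@dataclass
class Block:
    header: Header
    body: Body
    signature: str = ""
    justifications: List[str] = field(default_factory=list)


def make_block(parents: List[str], body: Body) -> Block:
    """Build a header whose hashes commit to each part of the body."""
    header = Header(
        parent_hashes=list(parents),
        post_state_hash=digest(body.post_state),
        new_code_hash=digest(body.new_code),
        reductions_hash=digest(body.reductions),
    )
    return Block(header=header, body=body)


def body_matches_header(block: Block) -> bool:
    """Check that a received body is the one the header commits to."""
    h, b = block.header, block.body
    return (
        h.post_state_hash == digest(b.post_state)
        and h.new_code_hash == digest(b.new_code)
        and h.reductions_hash == digest(b.reductions)
    )
```

The point of the split is that the header only carries hashes, so it stays small while still pinning down every part of the body.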

Suppose there are three regions {A, B, C} and they form namespaces = Powerset({A, B, C}).

From the perspective of validators from region A, they only execute Rholang terms in namespaces {A}, {A,B}, {A,C}, {A,B,C}.

First question: such a validator from region A will not pack executions/reductions from different namespaces into a single block. Right?
In other words, an individual block only contains executions/reductions from the same namespace. Right?

Second question: how exactly does the "state trie" represent the "post state"?
Is it represented as the changes applied to the tuplespace, or as the state of each changed name in the tuplespace after the change?

Third question: what is the logic behind linking to a parent block?
I have tons of questions here, so I will try to explain my thinking, but it is quite possibly wrong. Please correct me.

To make things simple, I want to exclude the impact from Casper's fork choice and suppose each region only has one trusted validator.

Step 1, Suppose there are three names.

  • name a from namespace A created in some block
  • name b from namespace B created in some block
  • name c from namespace C created in some block

Now I suppose each block only contains one step of Rholang execution/reduction.

Step 2, the following code runs.

for( x <- a ) {
    b!(*x + 1)
}

According to the rule, NS(@{for(x <- y) P}) = NS(y) U NS(@P), I think this term at least has to run in namespace AUB, right?

So both validators from region A and region B construct a block for namespace AUB. And the parents of this block are the latest block from namespace A and namespace B because the code refers to name a & b.

Step 3, suppose the Rholang term a!(1) executes in namespace A. A new block is created in namespace A, and its parent block is the latest one from namespace A.

Step 4, b!(*x + 1) is executed in namespace AUB, and a new block is generated whose parents are the latest ones from namespace A and namespace AUB.

I am not sure if my assumption is correct, please find the attached diagram.

Looking forward to any response 🙂


MichaelBirch
(@michaelbirch)
Member Moderator
Joined:4 months  ago
Posts: 33
11/06/2018 1:59 pm  

Hi Jerry,

Question 1: You can think of each namespace as an independently replicated blockchain, so yes, a single block will exist in a single namespace and only contain executions from that namespace.

Question 2: The state trie is a structure similar to Ethereum's Merkle Patricia Trie. The data stored in the trie is the entire tuplespace after the execution of the code in that block (which is why we call it the post-state). Of course, the block itself contains only the root hash of the trie, which is enough to validate that you have the right tuplespace, whether you got it from someone else or computed it yourself from genesis.
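
To illustrate why a root hash alone is enough to validate a tuplespace, here is a minimal sketch. It uses a plain binary Merkle tree over sorted entries, which is a deliberate simplification and not a Merkle Patricia Trie; the function name `state_root` and the byte encoding are assumptions for this example.

```python
import hashlib
from typing import Dict, List


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def state_root(tuplespace: Dict[str, List[str]]) -> str:
    """Merkle root over the sorted (name, data) entries of a tuplespace."""
    level = [
        h(name.encode() + b"\x00" + "\x01".join(data).encode())
        for name, data in sorted(tuplespace.items())
    ]
    if not level:
        return h(b"").hex()
    while len(level) > 1:
        if len(level) % 2 == 1:          # duplicate the last node on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()


# A validator compares the root it computed locally with the one in the header.
computed_locally = {"a": [], "b": ["2"]}
received_from_peer = {"b": ["2"], "a": []}
tampered = {"a": [], "b": ["3"]}

header_root = state_root(computed_locally)
print(state_root(received_from_peer) == header_root)
print(state_root(tampered) == header_root)
```

Any change to any entry changes the root, so two parties holding the same root can be confident they hold the same post-state.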

Question 3: In the region model, each namespace is an independent blockchain, as I mentioned earlier, so no blocks exist which have parents from multiple namespaces. However, the information in the blockchains is coupled through a process in which code can move from one namespace to another. The details of how this is done are still under development, but the idea is as follows.

Each namespace has a set of names which it is responsible for. These sets are all disjoint from one another, so the namespace (A ^ B) has a validator set given by the union of the validators in A and the validators in B, but the names it is responsible for are totally separate from the names that A and B are responsible for. Therefore, any `for` or bang (`!`) can be associated with a namespace, based on the names it uses (note: cross-namespace joins are not allowed in this model), and each term is moved to the namespace it belongs to for execution.

In the example you gave, the code would begin in namespace A because `a` is the responsibility of A. Once the term `a!(1)` appears in A the reduction will happen, leaving the term `b!(2)`, which the validators in A will recognize as the responsibility of B, so they will send that term to that namespace for execution.
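
The walk-through above can be sketched as a toy simulation. Everything here is hypothetical: the `OWNER` table, the tuple encoding of terms, and the `route` and `step` helpers are invented for illustration, and the actual hand-off between namespaces is, as noted, not settled.

```python
from typing import Dict, List, Tuple

# Hypothetical responsibility table: each simple name belongs to exactly one namespace.
OWNER: Dict[str, str] = {"a": "A", "b": "B"}

# A term is ("send", channel, value) or ("for", channel, continuation), where the
# continuation maps the received value to the term left over after reduction.
Term = Tuple


def route(term: Term) -> str:
    """Namespace responsible for a term, judged by the channel it uses."""
    return OWNER[term[1]]


def step(tuplespaces: Dict[str, List[Term]], ns: str) -> List[Term]:
    """Reduce one matching send/for pair inside `ns`; return terms to hand off."""
    space = tuplespaces[ns]
    for send in [t for t in space if t[0] == "send"]:
        for recv in [t for t in space if t[0] == "for" and t[1] == send[1]]:
            space.remove(send)
            space.remove(recv)
            result = recv[2](send[2])
            if route(result) == ns:
                space.append(result)
                return []
            return [result]              # belongs to another namespace
    return []


spaces: Dict[str, List[Term]] = {"A": [], "B": []}

# for(x <- a) { b!(*x + 1) } starts in A because `a` is the responsibility of A.
listener = ("for", "a", lambda x: ("send", "b", x + 1))
spaces[route(listener)].append(listener)

# a!(1) appears in A, the reduction happens, and b!(2) is handed to B.
spaces["A"].append(("send", "a", 1))
for term in step(spaces, "A"):
    spaces[route(term)].append(term)

print(spaces["A"])   # []
print(spaces["B"])   # [('send', 'b', 2)]
```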

I'm glossing over some details about how exactly the code is "sent" between namespaces because we haven't totally settled on what it will look like. But a simple way you can imagine it working is that if all validators are watching all namespaces, then they can see when a term which belongs to their namespace is "safely" included in another namespace, then propose it be moved. Validators in the original namespace will then be able to see when the term is "safely" included in the correct namespace, and propose to drop it from their tuplespace state. Of course, the problem with this is that it is not a sharding solution, as all validators need to watch all other namespaces, so the work is not really split, but that's why we're still discussing the details.


jerry
(@jerry)
Active Member
Joined:3 months  ago
Posts: 18
12/06/2018 2:24 am  

Thank you @michaelbirch

"no blocks exist which have parents from multiple namespaces."

Does this imply that the parents of a single block must all be from the same namespace, but the parents' namespace could differ from the current block's?

I got that impression from this diagram. As you can see, the block in AUB has two parents: one from A and the other from B.
[Diagram 1]

Perhaps things changed.
So the following diagram reflects the reality, right?
[Diagram 2]

"each namespace is an independent blockchain"

The header field parents-hash-ordered-list may link to more than one parent block. But if all parents must be from the same namespace and each namespace is a single chain, I can't imagine a case where there would be more than one parent. Do you have an example?

 

"so the namespace (A ^ B) has a validator set given by the union of the validators in A and validators in B, but the names it is responsible for are totally separate from the names that A and B are responsible for. Therefore, any `for` or bang (`!`) can be associated with a namespace, based on the names it uses (note: cross-namespace joins are not allowed in this model) and each term is moved to the namespace it belongs to for execution."

@stay presented a slide in the Developer Conf, and that slide disclosed the rules below

  • NS(@{“stdout”}) = ⊥
  • NS(@{x!(Q)}) = NS(x)
  • NS(@{for(x <-y) P}) = NS(y) U NS(@P)
  • NS(@{P|Q}) = NS(@P) U NS(@Q)
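
Read literally, the four rules define a recursive function on quoted processes. A small sketch of that reading follows. The tuple encoding of processes, the `BASE` table assigning simple names to regions, and the treatment of the bottom symbol as the empty set are all assumptions for this example.

```python
from typing import Dict, FrozenSet

Namespace = FrozenSet[str]
BOTTOM: Namespace = frozenset()   # assumed meaning of the slide's bottom symbol

# Hypothetical base assignment: which region each simple name belongs to.
BASE: Dict[str, Namespace] = {
    "a": frozenset({"A"}),
    "b": frozenset({"B"}),
    "c": frozenset({"C"}),
}


def ns_name(name) -> Namespace:
    """NS of a name: a simple name (str) or a quoted process ("quote", proc)."""
    if isinstance(name, str):
        # Assumption: names without a base entry (e.g. bound variables) add nothing.
        return BASE.get(name, BOTTOM)
    tag, proc = name
    assert tag == "quote"
    return ns_quoted(proc)


def ns_quoted(proc) -> Namespace:
    """NS(@P), following the four rules on the slide."""
    kind = proc[0]
    if kind == "ground":            # NS(@{"stdout"}) = bottom
        return BOTTOM
    if kind == "send":              # NS(@{x!(Q)}) = NS(x)
        _, chan, _payload = proc
        return ns_name(chan)
    if kind == "for":               # NS(@{for(x <- y) P}) = NS(y) U NS(@P)
        _, _var, source, cont = proc
        return ns_name(source) | ns_quoted(cont)
    if kind == "par":               # NS(@{P|Q}) = NS(@P) U NS(@Q)
        _, left, right = proc
        return ns_quoted(left) | ns_quoted(right)
    raise ValueError(f"unknown process form: {kind}")


# for(x <- a) { b!(*x + 1) }
listener = ("for", "x", "a", ("send", "b", "*x + 1"))
print(sorted(ns_quoted(listener)))   # ['A', 'B']
```

Note that the function computes the namespace of the name obtained by quoting a process, which is not necessarily the same thing as where the process itself runs.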

For this particular rule NS(@{for(x <-y) P}) = NS(y) U NS(@P) , isn't it saying `for(x <-y) P` must be executed in the namespace NS(y) union NS(@P) ?   

But I get another impression now after reading your reply --

  • `for(x <- y)` is first executed in NS(y)
  • if a message arrives on y, then after `for(x <- y)` reaches a finalized state (eventual consistency), the next term P is executed in NS(@P)

This seems to conflict with Mike's slide? 😥

 

Edited: 2 weeks  ago

MichaelBirch
(@michaelbirch)
Member Moderator
Joined:4 months  ago
Posts: 33
12/06/2018 1:49 pm  

Your first picture is an old model where there is a notion of a "merge block". We are no longer considering merge blocks. The second picture could conceivably be accurate, as we have not discussed the details of how a new region could be created and it might be the case that its genesis block points to the block in which the new region was approved for creation. That's just one possibility though, as I said, we have not worked out those details yet.

As for multiple parents, the use case is as follows. Suppose that you have a fork in the blockchain, so that normally your fork-choice would need to pick one side of the fork, orphaning the other. However, in a situation where the two sides are independent (they execute totally different sets of transactions), this seems wasteful, as they are not actually in conflict. So instead we might want to join the fork back together and continue the chain. This is when multiple parents are helpful; we can make a block which has both sides of the fork as parents. In general, the resulting structure is a directed acyclic graph (DAG) instead of a chain. This is important for scalability because it means that validators can do independent work simultaneously and have it all make it into the canonical blockDAG.
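
A minimal sketch of joining a fork with a multi-parent block is shown below. The `Block` fields and the `join` and `ancestors` helpers are hypothetical, and "independent" is reduced here to the two fork tips having no deploy in common, which is a simplification.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class Block:
    block_id: str
    parents: tuple            # ids of parent blocks; more than one means a join
    deploys: FrozenSet[str]   # identifiers of the transactions in this block


def ancestors(dag: Dict[str, Block], block_id: str) -> Set[str]:
    """All blocks reachable by following parent links (excluding the block itself)."""
    seen: Set[str] = set()
    stack: List[str] = list(dag[block_id].parents)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(dag[current].parents)
    return seen


def join(dag: Dict[str, Block], new_id: str, left: str, right: str) -> Block:
    """Create a block with both fork tips as parents if their deploys are independent."""
    if dag[left].deploys & dag[right].deploys:
        raise ValueError("fork sides share deploys; fork choice must pick one")
    block = Block(new_id, (left, right), frozenset())
    dag[new_id] = block
    return block


# Two validators extend genesis independently, then the fork is joined.
dag: Dict[str, Block] = {"g": Block("g", (), frozenset())}
dag["b1"] = Block("b1", ("g",), frozenset({"tx1"}))
dag["b2"] = Block("b2", ("g",), frozenset({"tx2"}))
join(dag, "m", "b1", "b2")
print(sorted(ancestors(dag, "m")))   # ['b1', 'b2', 'g']
```

Neither side of the fork is orphaned: both `b1` and `b2` end up in the history of the join block.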

You need to be careful about how you interpret the rules there. They are telling you how to assign namespaces to arbitrary names. Remember that in Rholang all names are quoted processes. Often, we simply use so-called "public" names, which are names created by quoting some "forgeable" process, i.e. one that anyone can write with Rholang code, e.g. @"Hello, World"; or we use "unforgeable" names, which are those generated by `new`. The first rule above says that public names do not have a specific namespace -- they can be run anywhere. The remaining rules tell you how to compute the namespace for "compound" names created by quoting a more complicated process. In the example I gave I was only using simple unforgeable names, each belonging to a particular namespace. I did not have anything like `@{ for(x <- a){ b!(*x + 1) } }!("data")`, for example, which would use the third rule above in order to figure out what namespace that send should execute in.


jerry
(@jerry)
Active Member
Joined:3 months  ago
Posts: 18
13/06/2018 1:44 am  

Hi @michaelbirch 

We, the Chinese community, are concerned about RChain's scalability. Hence we would like to learn how exactly it works under the hood. Thanks again for your patience.  😀 

"This is important for scalability because it means that validators can do independent work simultaneously and have it all make it into the canonical blockDAG."

Consider an individual namespace as a logical partition of the tuplespace, containing the set of names that belong to that namespace. The execution/reduction of Rholang terms eventually results in state changes for a set of names in the tuplespace.

For the moment, let's ignore Casper and its fork choice, and focus only on a particular namespace.

Suppose there is a single namespace containing four names (a/b/c/d). After the execution/reduction of some Rholang terms, their states change.

  • name a state changes  a1->a2
  • name b state changes  b1->b2
  • name c state changes  c1->c2
  • name d state does not change

Now suppose there are two validators, and the state changes are distributed between them. Each of them packs its state changes into a block that is disjoint from the other's. So we have the following diagram.

[Diagram: multiparents]

 

Question 1: How do these two validators ensure that the sets of changed names in their blocks are disjoint from each other?

Suppose block#2.2 packed b1->b2 and c1->c2. That would not work, because it may conflict with block#2.1, which also packs b1->b2. Right? Rholang allows non-deterministic code, so the same code may produce different results on different validators. Hence, from what I see, if the changed sets of names overlap, the blocks may conflict. When there is a large number of names in the tuplespace of a single namespace, the probability of conflict would increase a lot, and that is not scalable.

So, what is the mechanism in RChain to ensure that the sets of changed names in blocks are disjoint?

Question 2: Do the validators in the same namespace shard the tuplespace?

Continuing with the above example, the sets of changed names are disjoint, but what about the tuplespace? Will the validators in the same namespace shard the tuplespace so that each hosts only part of the names? E.g. Validator1 (which generates block#2.1) hosts only names a & b in its RSpace, and Validator2 (which generates block#2.2) hosts only names c & d in its RSpace.
Or do both validators host the same names (a/b/c/d) in their RSpace and only pack them in a disjoint way?


MichaelBirch
(@michaelbirch)
Member Moderator
Joined:4 months  ago
Posts: 33
13/06/2018 1:44 pm  

1. Once a block is created, the output of the code inside that block is fixed. Even if the code was non-deterministic to start with, a validator receiving that block will "replay" the execution exactly as it was done when the block was created. So checking that two blocks are disjoint is simply a matter of looking at the content of both blocks. You do raise a valid point, though, about how likely it is for blocks to be disjoint from one another, and the short answer is that right now we don't know. It is something we will need to investigate as the system gets closer to test net and production.
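
Since the outcome of each block is fixed, a disjointness check can be as simple as comparing the names each block touches. The sketch below uses the a/b/c/d example from the question; the `Changes` encoding (name mapped to before and after state) and the `conflicts` helper are assumptions for illustration, not the actual block content format.

```python
from typing import Dict, Set, Tuple

# Hypothetical record of a block's fixed outcome: name -> (state before, state after).
Changes = Dict[str, Tuple[str, str]]


def touched(changes: Changes) -> Set[str]:
    """Names whose state a block changes."""
    return set(changes)


def conflicts(block_a: Changes, block_b: Changes) -> Set[str]:
    """Names changed by both blocks; an empty set means the blocks are disjoint."""
    return touched(block_a) & touched(block_b)


block_2_1: Changes = {"a": ("a1", "a2"), "b": ("b1", "b2")}
block_2_2: Changes = {"c": ("c1", "c2")}
overlapping: Changes = {"b": ("b1", "b2"), "c": ("c1", "c2")}

print(conflicts(block_2_1, block_2_2))     # set()
print(conflicts(block_2_1, overlapping))   # {'b'}
```

The check needs no re-execution of non-deterministic code; it only reads what each block already recorded.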

2. All validators in a single namespace must keep full copies of the tuplespace. This is necessary in order to have the security guarantees we want.

