Posted  by  admin

Insert Tab A Into Slot B Is Something You Might Read In The Assembly



Solidity defines an assembly language that can also be used without Solidity. This assembly language can also be used as “inline assembly” inside Solidity source code. We start with describing how to use inline assembly and how it differs from standalone assembly and then specify assembly itself.

Inline Assembly

For more fine-grained control, especially in order to enhance the language by writing libraries, it is possible to interleave Solidity statements with inline assembly in a language close to the one of the virtual machine. Because the EVM is a stack machine, it is often hard to address the correct stack slot and provide arguments to opcodes at the correct point on the stack. Solidity’s inline assembly tries to facilitate that and other issues arising when writing manual assembly by the following features:

  • functional-style opcodes: mul(1, add(2, 3)) instead of push1 3 push1 2 add push1 1 mul
  • assembly-local variables: let x := add(2, 3)  let y := mload(0x40)  x := add(x, y)
  • access to external variables: function f(uint x) public { assembly { x := sub(x, 1) } }
  • labels: let x := 10  repeat: x := sub(x, 1) jumpi(repeat, eq(x, 0))
  • loops: for { let i := 0 } lt(i, x) { i := add(i, 1) } { y := mul(2, y) }
  • if statements: if slt(x, 0) { x := sub(0, x) }
  • switch statements: switch x case 0 { y := mul(x, 2) } default { y := 0 }
  • function calls: function f(x) -> y { switch x case 0 { y := 1 } default { y := mul(x, f(sub(x, 1))) } }

We now want to describe the inline assembly language in detail.

Warning

Inline assembly is a way to access the Ethereum Virtual Machine at a low level. This discards several important safety features of Solidity.

Note

TODO: Write about how scoping rules of inline assembly are a bit different and the complications that arise when for example using internal functions of libraries. Furthermore, write about the symbols defined by the compiler.

Example

The following example provides library code to access the code of another contract and load it into a bytes variable. This is not possible at all with “plain Solidity” and the idea is that assembly libraries will be used to enhance the language in such ways.
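A minimal sketch of such a library follows. The pragma, the library name GetCode and the function name at are assumptions chosen for illustration, and the code targets the 0.4.x compiler series that this page describes.

```solidity
pragma solidity ^0.4.22;

// Hypothetical library name; not prescribed by the text above.
library GetCode {
    function at(address _addr) public view returns (bytes memory o_code) {
        assembly {
            // retrieve the size of the code; this needs assembly
            let size := extcodesize(_addr)
            // allocate the output byte array at the free memory pointer
            o_code := mload(0x40)
            // new "memory end": 32 bytes for the length field plus the data,
            // rounded up to a multiple of 32 bytes
            mstore(0x40, add(o_code, and(add(add(size, 0x20), 0x1f), not(0x1f))))
            // store the length in the first slot
            mstore(o_code, size)
            // copy the code right behind the length field
            extcodecopy(_addr, add(o_code, 0x20), 0, size)
        }
    }
}
```

Only extcodesize and extcodecopy strictly need assembly here. The allocation could also be written as o_code = new bytes(size) in plain Solidity.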

Inline assembly could also be beneficial in cases where the optimizer fails to produce efficient code. Please be aware that assembly is much more difficult to write because the compiler does not perform checks, so you should use it for complex things only if you really know what you are doing.
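As an illustration of the optimizer case, the sketch below sums a memory array twice: once in plain Solidity and once reading the elements directly with mload. The library and function names are hypothetical, and whether the assembly variant is actually cheaper depends on the compiler version and optimizer settings.

```solidity
pragma solidity ^0.4.22;

// Hypothetical library name, used only for this sketch.
library VectorSum {
    // Plain Solidity: every _data[i] access carries a bounds check.
    function sumSolidity(uint[] memory _data) public pure returns (uint o_sum) {
        for (uint i = 0; i < _data.length; ++i)
            o_sum += _data[i];
    }

    // The loop condition already keeps i in bounds, so the element is read
    // directly from memory. 0x20 is added because the first slot of a
    // dynamic memory array holds its length.
    function sumAsm(uint[] memory _data) public pure returns (uint o_sum) {
        for (uint i = 0; i < _data.length; ++i) {
            assembly {
                o_sum := add(o_sum, mload(add(add(_data, 0x20), mul(i, 0x20))))
            }
        }
    }
}
```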

Syntax

Assembly parses comments, literals and identifiers exactly as Solidity, so you can use the usual // and /* */ comments. Inline assembly is marked by assembly { ... } and inside these curly braces, the following can be used (see the later sections for more details):

  • literals, i.e. 0x123, 42 or 'abc' (strings up to 32 characters)
  • opcodes (in “instruction style”), e.g. mload sload dup1 sstore, for a list see below
  • opcodes in functional style, e.g. add(1, mload(0))
  • labels, e.g. name:
  • variable declarations, e.g. let x := 7, let x := add(y, 3) or let x (initial value of empty (0) is assigned)
  • identifiers (labels or assembly-local variables and externals if used as inline assembly), e.g. jump(name), 3 x add
  • assignments (in “instruction style”), e.g. 3 =: x
  • assignments in functional style, e.g. x := add(y, 3)
  • blocks where local variables are scoped inside, e.g. { let x := 3 { let y := add(x, 1) } }

Opcodes

This document does not want to be a full description of the Ethereum virtual machine, but the following list can be used as a reference of its opcodes.

If an opcode takes arguments (always from the top of the stack), they are given in parentheses. Note that the order of arguments can be seen to be reversed in non-functional style (explained below). Opcodes marked with - do not push an item onto the stack, those marked with * are special and all others push exactly one item onto the stack. Opcodes marked with F, H, B or C are present since Frontier, Homestead, Byzantium or Constantinople, respectively. Constantinople is still in planning and all instructions marked as such will result in an invalid instruction exception.

In the following, mem[a...b) signifies the bytes of memory starting at position a up to (excluding) position b and storage[p] signifies the storage contents at position p.

The opcodes pushi and jumpdest cannot be used directly.

In the grammar, opcodes are represented as pre-defined identifiers.

Instruction | Stack | Since | Explanation
------------|-------|-------|------------
stop | - | F | stop execution, identical to return(0,0)
add(x, y) |  | F | x + y
sub(x, y) |  | F | x - y
mul(x, y) |  | F | x * y
div(x, y) |  | F | x / y
sdiv(x, y) |  | F | x / y, for signed numbers in two’s complement
mod(x, y) |  | F | x % y
smod(x, y) |  | F | x % y, for signed numbers in two’s complement
exp(x, y) |  | F | x to the power of y
not(x) |  | F | ~x, every bit of x is negated
lt(x, y) |  | F | 1 if x < y, 0 otherwise
gt(x, y) |  | F | 1 if x > y, 0 otherwise
slt(x, y) |  | F | 1 if x < y, 0 otherwise, for signed numbers in two’s complement
sgt(x, y) |  | F | 1 if x > y, 0 otherwise, for signed numbers in two’s complement
eq(x, y) |  | F | 1 if x == y, 0 otherwise
iszero(x) |  | F | 1 if x == 0, 0 otherwise
and(x, y) |  | F | bitwise and of x and y
or(x, y) |  | F | bitwise or of x and y
xor(x, y) |  | F | bitwise xor of x and y
byte(n, x) |  | F | nth byte of x, where the most significant byte is the 0th byte
shl(x, y) |  | C | logical shift left y by x bits
shr(x, y) |  | C | logical shift right y by x bits
sar(x, y) |  | C | arithmetic shift right y by x bits
addmod(x, y, m) |  | F | (x + y) % m with arbitrary precision arithmetics
mulmod(x, y, m) |  | F | (x * y) % m with arbitrary precision arithmetics
signextend(i, x) |  | F | sign extend from (i*8+7)th bit counting from least significant
keccak256(p, n) |  | F | keccak(mem[p…(p+n)))
sha3(p, n) |  | F | keccak(mem[p…(p+n)))
jump(label) | - | F | jump to label / code position
jumpi(label, cond) | - | F | jump to label if cond is nonzero
pc |  | F | current position in code
pop(x) | - | F | remove the element pushed by x
dup1 … dup16 |  | F | copy ith stack slot to the top (counting from top)
swap1 … swap16 | * | F | swap topmost and ith stack slot below it
mload(p) |  | F | mem[p..(p+32))
mstore(p, v) | - | F | mem[p..(p+32)) := v
mstore8(p, v) | - | F | mem[p] := v & 0xff (only modifies a single byte)
sload(p) |  | F | storage[p]
sstore(p, v) | - | F | storage[p] := v
msize |  | F | size of memory, i.e. largest accessed memory index
gas |  | F | gas still available to execution
address |  | F | address of the current contract / execution context
balance(a) |  | F | wei balance at address a
caller |  | F | call sender (excluding delegatecall)
callvalue |  | F | wei sent together with the current call
calldataload(p) |  | F | call data starting from position p (32 bytes)
calldatasize |  | F | size of call data in bytes
calldatacopy(t, f, s) | - | F | copy s bytes from calldata at position f to mem at position t
codesize |  | F | size of the code of the current contract / execution context
codecopy(t, f, s) | - | F | copy s bytes from code at position f to mem at position t
extcodesize(a) |  | F | size of the code at address a
extcodecopy(a, t, f, s) | - | F | like codecopy(t, f, s) but take code at address a
returndatasize |  | B | size of the last returndata
returndatacopy(t, f, s) | - | B | copy s bytes from returndata at position f to mem at position t
create(v, p, s) |  | F | create new contract with code mem[p..(p+s)) and send v wei and return the new address
create2(v, n, p, s) |  | C | create new contract with code mem[p..(p+s)) at address keccak256(<address> . n . keccak256(mem[p..(p+s))) and send v wei and return the new address
call(g, a, v, in, insize, out, outsize) |  | F | call contract at address a with input mem[in..(in+insize)) providing g gas and v wei and output area mem[out..(out+outsize)) returning 0 on error (e.g. out of gas) and 1 on success
callcode(g, a, v, in, insize, out, outsize) |  | F | identical to call but only use the code from a and stay in the context of the current contract otherwise
delegatecall(g, a, in, insize, out, outsize) |  | H | identical to callcode but also keep caller and callvalue
staticcall(g, a, in, insize, out, outsize) |  | B | identical to call(g, a, 0, in, insize, out, outsize) but do not allow state modifications
return(p, s) | - | F | end execution, return data mem[p..(p+s))
revert(p, s) | - | B | end execution, revert state changes, return data mem[p..(p+s))
selfdestruct(a) | - | F | end execution, destroy current contract and send funds to a
invalid | - | F | end execution with invalid instruction
log0(p, s) | - | F | log without topics and data mem[p..(p+s))
log1(p, s, t1) | - | F | log with topic t1 and data mem[p..(p+s))
log2(p, s, t1, t2) | - | F | log with topics t1, t2 and data mem[p..(p+s))
log3(p, s, t1, t2, t3) | - | F | log with topics t1, t2, t3 and data mem[p..(p+s))
log4(p, s, t1, t2, t3, t4) | - | F | log with topics t1, t2, t3, t4 and data mem[p..(p+s))
origin |  | F | transaction sender
gasprice |  | F | gas price of the transaction
blockhash(b) |  | F | hash of block nr b - only for last 256 blocks excluding current
coinbase |  | F | current mining beneficiary
timestamp |  | F | timestamp of the current block in seconds since the epoch
number |  | F | current block number
difficulty |  | F | difficulty of the current block
gaslimit |  | F | block gas limit of the current block

Literals

You can use integer constants by typing them in decimal or hexadecimal notation and an appropriate PUSHi instruction will automatically be generated. The following creates code to add 2 and 3 resulting in 5 and then computes the bitwise and with the string “abc”. Strings are stored left-aligned and cannot be longer than 32 bytes.
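A sketch of that computation, written in functional style so that the result can be assigned to a return variable. The contract and function names are hypothetical, and the 0.4.x pragma is an assumption.

```solidity
pragma solidity ^0.4.22;

// Hypothetical contract, for illustration only.
contract LiteralSketch {
    function f() public pure returns (uint r) {
        assembly {
            // 2 and 3 are turned into PUSH instructions. "abc" is pushed as a
            // left-aligned 32-byte word: 0x616263 followed by 29 zero bytes.
            r := and(add(2, 3), "abc")
        }
    }
}
```

Because the string occupies the high-order bytes of the word and 5 only the lowest byte, the bitwise and in this particular example evaluates to 0.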

Functional Style

You can type opcode after opcode in the same way they will end up in bytecode. For example, adding 3 to the contents in memory at position 0x80 would be 3 0x80 mload add 0x80 mstore.

As it is often hard to see what the actual arguments for certain opcodes are, Solidity inline assembly also provides a “functional style” notation where the same code would be written as mstore(0x80, add(mload(0x80), 3)).


Functional style expressions cannot use instructional style internally, i.e. 1 2 mstore(0x80, add) is not valid assembly, it has to be written as mstore(0x80, add(2, 1)). For opcodes that do not take arguments, the parentheses can be omitted.

Note that the order of arguments is reversed in functional-style as opposed to the instruction-style way. If you use functional-style, the first argument will end up on the stack top.
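The argument order can be made concrete with a non-commutative opcode. The sketch below is an illustration under assumptions: the contract and function names are hypothetical and the pragma targets the 0.4.x series.

```solidity
pragma solidity ^0.4.22;

// Hypothetical contract, for illustration only.
contract FunctionalStyleSketch {
    function demo() public pure returns (uint difference, uint stored) {
        assembly {
            // Functional style: sub(x, y) is x - y, so this is 10 - 3.
            // In instruction style the operands appear in reverse order,
            // 3 10 sub, so that the first argument (10) ends up on the stack top.
            difference := sub(10, 3)

            // Add 3 to the contents of memory at position 0x80.
            // Instruction style: 3 0x80 mload add 0x80 mstore
            mstore(0x80, add(mload(0x80), 3))
            stored := mload(0x80)
        }
    }
}
```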

Access to External Variables and Functions

Solidity variables and other identifiers can be accessed by simply using their name. For memory variables, this will push the address and not the value onto the stack. Storage variables are different: Values in storage might not occupy a full storage slot, so their “address” is composed of a slot and a byte-offset inside that slot. To retrieve the slot pointed to by the variable x, you use x_slot and to retrieve the byte-offset you use x_offset.

In assignments (see below), we can even use local Solidity variables to assign to.
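A small sketch of both points: reading a storage variable through its slot and assigning to a local Solidity variable. The contract name C and the variable names are hypothetical, and the _slot suffix applies to the 0.4.x series described here.

```solidity
pragma solidity ^0.4.22;

// Hypothetical contract, for illustration only.
contract C {
    uint b;

    function f(uint x) public view returns (uint r) {
        assembly {
            // b is a full-width uint, so it fills its slot and its byte offset
            // (b_offset) is zero and can be ignored here.
            // r is a local Solidity variable that is assigned from assembly.
            r := mul(x, sload(b_slot))
        }
    }
}
```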

Functions external to inline assembly can also be accessed: The assembly will push their entry label (with virtual function resolution applied). The calling semantics in Solidity are:

  • the caller pushes return label, arg1, arg2, …, argn
  • the call returns with ret1, ret2, …, retm

This feature is still a bit cumbersome to use, because the stack offset essentially changes during the call, and thus references to local variables will be wrong.

Note

If you access variables of a type that spans less than 256 bits (for example uint64, address, bytes16 or byte), you cannot make any assumptions about bits not part of the encoding of the type. Especially, do not assume them to be zero. To be safe, always clear the data properly before you use it in a context where this is important: uint32 x = f(); assembly { x := and(x, 0xffffffff) /* now use x */ } To clean signed types, you can use the signextend opcode.

Labels

Note

Labels are deprecated. Please use functions, loops, if or switch statements instead.

Another problem in EVM assembly is that jump and jumpi use absolute addresses which can change easily. Solidity inline assembly provides labels to make the use of jumps easier. Note that labels are a low-level feature and it is possible to write efficient assembly without labels, just using assembly functions, loops, if and switch instructions (see below). The following code computes an element in the Fibonacci series.
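A sketch of such a label-based computation. The wrapping contract and function are hypothetical; the code relies on labels, jumps and instruction-style opcodes, so it is only meant for the 0.4.x series, where these deprecated features are expected to produce warnings, and it should not be taken as a recommended pattern.

```solidity
pragma solidity ^0.4.22;

// Hypothetical contract, for illustration only.
contract FibSketch {
    function fib(uint) public pure returns (uint) {
        assembly {
            let n := calldataload(4)   // first argument, after the 4-byte selector
            let a := 1
            let b := a
        loop:
            jumpi(loopend, eq(n, 0))
            a add swap1                // instruction style: (a, b) becomes (a + b, a)
            n := sub(n, 1)
            jump(loop)
        loopend:
            mstore(0, a)
            return(0, 0x20)            // ends execution and returns a
        }
    }
}
```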

Please note that automatically accessing stack variables can only work if the assembler knows the current stack height. This fails to work if the jump source and target have different stack heights. It is still fine to use such jumps, but you should just not access any stack variables (even assembly variables) in that case.

Furthermore, the stack height analyser goes through the code opcode by opcode (and not according to control flow), so in the following case, the assembler will have a wrong impression about the stack height at label two:

Declaring Assembly-Local Variables

You can use the let keyword to declare variables that are only visible in inline assembly and actually only in the current {...}-block. What happens is that the let instruction will create a new stack slot that is reserved for the variable and automatically removed again when the end of the block is reached. You need to provide an initial value for the variable which can be just 0, but it can also be a complex functional-style expression.
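The block scoping can be sketched as follows. Contract and variable names are hypothetical and the pragma is an assumption.

```solidity
pragma solidity ^0.4.22;

// Hypothetical contract, for illustration only.
contract C {
    function f(uint x) public view returns (uint b) {
        assembly {
            let v := add(x, 1)
            mstore(0x80, v)
            {
                let y := add(sload(v), 1)
                b := y
            } // the stack slot of y is removed here
            b := add(b, v)
        } // the stack slot of v is removed here
    }
}
```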

Assignments

Assignments are possible to assembly-local variables and to function-local variables. Take care that when you assign to variables that point to memory or storage, you will only change the pointer and not the data.

There are two kinds of assignments: functional-style and instruction-style. For functional-style assignments (variable := value), you need to provide a value in a functional-style expression that results in exactly one stack value and for instruction-style (=: variable), the value is just taken from the stack top. For both ways, the colon points to the name of the variable. The assignment is performed by replacing the variable’s value on the stack by the new value.

Note

Instruction-style assignment is deprecated.
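A sketch using functional-style assignments only, with the deprecated instruction-style form shown in a comment. Names are hypothetical and the pragma is an assumption.

```solidity
pragma solidity ^0.4.22;

// Hypothetical contract, for illustration only.
contract AssignSketch {
    function f() public view returns (uint r) {
        assembly {
            let v := 0          // functional-style assignment as part of a declaration
            let g := add(v, 2)
            // Functional style. The deprecated instruction-style equivalent
            // would be: sload(10) =: v
            v := sload(10)
            r := add(v, g)      // assignment to a function-local Solidity variable
        }
    }
}
```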

If

The if statement can be used for conditionally executing code. There is no “else” part, consider using “switch” (see below) if you need multiple alternatives.

The curly braces for the body are required.
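For instance, a guard that reverts on a zero argument could be sketched like this (hypothetical contract and function names, 0.4.x pragma assumed):

```solidity
pragma solidity ^0.4.22;

// Hypothetical contract, for illustration only.
contract IfSketch {
    function requireNonZero(uint value) public pure returns (uint) {
        assembly {
            // The body must be enclosed in curly braces, even for one statement.
            if eq(value, 0) { revert(0, 0) }
        }
        return value;
    }
}
```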

Switch


You can use a switch statement as a very basic version of “if/else”. It takes the value of an expression and compares it to several constants. The branch corresponding to the matching constant is taken. Contrary to the error-prone behaviour of some programming languages, control flow does not continue from one case to the next. There can be a fallback or default case called default.

The list of cases does not require curly braces, but the body of a case does require them.
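A sketch of a switch with two cases and a default. The contract, function and parameter names are hypothetical, and the pragma is an assumption.

```solidity
pragma solidity ^0.4.22;

// Hypothetical contract, for illustration only.
contract SwitchSketch {
    function pick(uint selector, uint a, uint b) public pure returns (uint x) {
        assembly {
            switch selector
            case 0 { x := a }
            case 1 { x := b }
            default { x := 0 }
            // Exactly one of the branches above has run; there is no fall-through.
            x := div(x, 2)
        }
    }
}
```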

Loops

Assembly supports a simple for-style loop. For-style loops have a header containing an initializing part, a condition and a post-iteration part. The condition has to be a functional-style expression, while the other two are blocks. If the initializing part declares any variables, the scope of these variables is extended into the body (including the condition and the post-iteration part).

The following example computes the sum of an area in memory.
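A sketch of such a loop, summing the eight 32-byte words in mem[0..0x100). The memory range, contract name and function name are arbitrary choices for illustration, and the pragma is an assumption.

```solidity
pragma solidity ^0.4.22;

// Hypothetical contract, for illustration only.
contract LoopSketch {
    function sumMemory() public pure returns (uint x) {
        assembly {
            // i is declared in the initializing block and stays visible in
            // the condition, the post-iteration block and the body.
            for { let i := 0 } lt(i, 0x100) { i := add(i, 0x20) } {
                x := add(x, mload(i))
            }
        }
    }
}
```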

For loops can also be written so that they behave like while loops: Simply leave the initialization and post-iteration parts empty.
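The same summation over mem[0..0x100) in while-loop form, again as a sketch with hypothetical names and an assumed 0.4.x pragma:

```solidity
pragma solidity ^0.4.22;

// Hypothetical contract, for illustration only.
contract WhileSketch {
    function sumMemory() public pure returns (uint x) {
        assembly {
            let i := 0
            for { } lt(i, 0x100) { } {     // behaves like: while (i < 0x100)
                x := add(x, mload(i))
                i := add(i, 0x20)
            }
        }
    }
}
```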

Functions

Assembly allows the definition of low-level functions. These take their arguments (and a return PC) from the stack and also put the results onto the stack. Calling a function looks the same way as executing a functional-style opcode.

Functions can be defined anywhere and are visible in the block they are declared in. Inside a function, you cannot access local variables defined outside of that function. There is no explicit return statement.

If you call a function that returns multiple values, you have to assign them to a tuple using a, b := f(x) or let a, b := f(x).

The following example implements the power function by square-and-multiply.
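A sketch of square-and-multiply as a recursive assembly function. The wrapping contract and the Solidity-level function pow are hypothetical, and the pragma is an assumption.

```solidity
pragma solidity ^0.4.22;

// Hypothetical contract, for illustration only.
contract PowerSketch {
    function pow(uint b, uint e) public pure returns (uint r) {
        assembly {
            function power(base, exponent) -> result {
                switch exponent
                case 0 { result := 1 }
                case 1 { result := base }
                default {
                    // square the base and halve the exponent ...
                    result := power(mul(base, base), div(exponent, 2))
                    // ... and multiply once more if the exponent was odd
                    switch mod(exponent, 2)
                    case 1 { result := mul(base, result) }
                }
            }
            // b and e are passed as arguments because the assembly function
            // cannot access variables defined outside of it.
            r := power(b, e)
        }
    }
}
```

There is no return statement: the value of result at the end of the function body is what the call evaluates to.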


Things to Avoid

Inline assembly might have a quite high-level look, but it actually is extremely low-level. Function calls, loops, ifs and switches are converted by simple rewriting rules and after that, the only thing the assembler does for you is re-arranging functional-style opcodes, managing jump labels, counting stack height for variable access and removing stack slots for assembly-local variables when the end of their block is reached. Especially for those two last cases, it is important to know that the assembler only counts stack height from top to bottom, not necessarily following control flow. Furthermore, operations like swap will only swap the contents of the stack but not the location of variables.

Conventions in Solidity

In contrast to EVM assembly, Solidity knows types which are narrower than 256 bits, e.g. uint24. In order to make them more efficient, most arithmetic operations just treat them as 256-bit numbers and the higher-order bits are only cleaned at the point where it is necessary, i.e. just shortly before they are written to memory or before comparisons are performed. This means that if you access such a variable from within inline assembly, you might have to manually clean the higher order bits first.

Solidity manages memory in a very simple way: There is a “free memory pointer” at position 0x40 in memory. If you want to allocate memory, just use the memory from that point on and update the pointer accordingly.


The first 64 bytes of memory can be used as “scratch space” for short-term allocation. The 32 bytes after the free memory pointer (i.e. starting at 0x60) are meant to be zero permanently and are used as the initial value for empty dynamic memory arrays.

Elements in memory arrays in Solidity always occupy multiples of 32 bytes (yes, this is even true for byte[], but not for bytes and string). Multi-dimensional memory arrays are pointers to memory arrays. The length of a dynamic array is stored at the first slot of the array and then only the array elements follow.
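These conventions can be sketched by building a dynamic memory array by hand: take the free memory pointer, write the length into the first slot, write the elements behind it and move the pointer past the data. The contract name, function name and element values are hypothetical, and the pragma is an assumption.

```solidity
pragma solidity ^0.4.22;

// Hypothetical contract, for illustration only.
contract MemoryLayoutSketch {
    // Builds the memory array [7, 8, 9] without using "new uint[](3)".
    function build() public pure returns (uint[] memory arr) {
        assembly {
            arr := mload(0x40)             // start at the free memory pointer
            mstore(arr, 3)                 // first slot: the length
            mstore(add(arr, 0x20), 7)      // each element occupies 32 bytes
            mstore(add(arr, 0x40), 8)
            mstore(add(arr, 0x60), 9)
            mstore(0x40, add(arr, 0x80))   // move the pointer past length + 3 elements
        }
    }
}
```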

Warning

Statically-sized memory arrays do not have a length field, but it will be added soon to allow better convertibility between statically- and dynamically-sized arrays, so please do not rely on that.

Standalone Assembly

The assembly language described as inline assembly above can also be used standalone and in fact, the plan is to use it as an intermediate language for the Solidity compiler. In this form, it tries to achieve several goals:


  1. Programs written in it should be readable, even if the code is generated by a compiler from Solidity.
  2. The translation from assembly to bytecode should contain as few “surprises” as possible.
  3. Control flow should be easy to detect to help in formal verification and optimization.

In order to achieve the first and last goal, assembly provides high-level constructs like for loops, if and switch statements and function calls. It should be possible to write assembly programs that do not make use of explicit SWAP, DUP, JUMP and JUMPI statements, because the first two obfuscate the data flow and the last two obfuscate control flow. Furthermore, functional statements of the form mul(add(x, y), 7) are preferred over pure opcode statements like 7 y x add mul because in the first form, it is much easier to see which operand is used for which opcode.

The second goal is achieved by compiling the higher level constructs to bytecode in a very regular way. The only non-local operation performed by the assembler is name lookup of user-defined identifiers (functions, variables, …), which follow very simple and regular scoping rules and cleanup of local variables from the stack.

Scoping: An identifier that is declared (label, variable, function, assembly) is only visible in the block where it was declared (including nested blocks inside the current block). It is not legal to access local variables across function borders, even if they would be in scope. Shadowing is not allowed. Local variables cannot be accessed before they were declared, but labels, functions and assemblies can. Assemblies are special blocks that are used for e.g. returning runtime code or creating contracts. No identifier from an outer assembly is visible in a sub-assembly.

If control flow passes over the end of a block, pop instructions are inserted that match the number of local variables declared in that block. Whenever a local variable is referenced, the code generator needs to know its current relative position in the stack and thus it needs to keep track of the current so-called stack height. Since all local variables are removed at the end of a block, the stack height before and after the block should be the same. If this is not the case, a warning is issued.

Using switch, for and functions, it should be possible to write complex code without using jump or jumpi manually. This makes it much easier to analyze the control flow, which allows for improved formal verification and optimization.

Furthermore, if manual jumps are allowed, computing the stack height is rather complicated. The position of all local variables on the stack needs to be known, otherwise neither references to local variables nor removing local variables automatically from the stack at the end of a block will work properly.

Example:

We will follow an example compilation from Solidity to assembly. We consider the runtime bytecode of the following Solidity program:

The following assembly will be generated:

Assembly Grammar


The tasks of the parser are the following:

  • Turn the byte stream into a token stream, discarding C++-style comments (a special comment exists for source references, but we will not explain it here).
  • Turn the token stream into an AST according to the grammar below.
  • Register identifiers with the block they are defined in (annotation to the AST node) and note from which point on, variables can be accessed.

The assembly lexer follows the one defined by Solidity itself.

Whitespace is used to delimit tokens and it consists of the characters Space, Tab and Linefeed. Comments are regular JavaScript/C++ comments and are interpreted in the same way as Whitespace.

Grammar: