An Introduction to Bitcoin’s Scripting Language
In the following introduction, BTCManager will investigate the simple, yet powerful, coding language used in the Bitcoin Network. The Bitcoin Scripting language, or Bitcoin Script, was designed with only a few functions in mind; it is compact, Turing incomplete, and stack-based. In this way, the language serves these ends efficiently and securely.
Despite its minimal functionality, in comparison to networks like Ethereum, it has nonetheless proven itself throughout a decade to be powerful enough to support transactions in value adequately.
Bitcoin Script and “Programmable Money”
The programming language behind the pioneer cryptocurrency is, in the eyes of many, a perfect example of Occam’s Razor.
It is elementary, even compared to pre-cryptocurrency coding languages. More importantly, Satoshi Nakamoto designed this simplicity intentionally. A language that has multiple capabilities and allows for complex transactions of data also allows for a greater number of attack vectors. Critics have explained that a language such as Solidity, while impressive in its scope, falls short as far as security goes.
Moving along this point, Bitcoin Script is Turing incomplete. By comparison, Solidity is Turing complete, meaning it can replicate any Turing machine or an abstract machine capable of autonomously following a particular algorithm. Grasping this concept, one can begin to understand how a smart contract operates.
Returning to the primary focus, Bitcoin Script does not offer this feature, or not in the same way, a deeper dive into smart contracts using the Bitcoin Blockchain will be the subject of later articles.
Bitcoin’s main use-case has always been cryptocurrencies and the transfer of value. The added characteristics of Turing complete languages were, thus, not necessary. That, however, does not mean Script is limited.
Furthermore, the limitations in Bitcoin Script prevents a “logic bomb,” or an infinite loop from being included in any single transaction. This restriction eliminates the possibility of a denial-of-service (DoS) attack on the network. The extent of these constraints, such as transactions that extend beyond merely sending a value to X and Y, will be covered in upcoming installments.
Characteristics of the Bitcoin Scripting Language
Bitcoin’s coding language uses “reverse polish” as a system of notation, meaning lines such as “3 + 4” will appear as “3 4+” with growing complexity. Another feature harks back to Bitcoin Script’s roots in “Forth-like.” This feature is relevant simply in that these two languages are both “stack based.”
Reverse Polish Language Image.
Stacks are a very common data structure which, in the words of Andreas Antonopolous, allows information on “top of the stack” to either “push” or “pop.” The former operation explains the process of adding information to the stack, while the latter describes removing information from a stack. Furthermore, the order in which information is popped or pushed follows the “LIFO” principal, or Last-In, First-Out.
An operation like “3 4+,” would behave in the following:
- Push “3” on to the stack.
- Push “4” on to the stack.
- The “+” operator, takes these two parameters, pops them both off the stack, adds them together, then pushes the result back on the stack. (i.e., pop, pop, add, push)
- The resulting operation leads, in this case, to a “7” on the stack and the program terminates.
In Bitcoin Script, this operation would follow the same steps, but would also include the prefix “OP” before each variable. Let’s next look into how all this new vocabulary comes together in a real Bitcoin transaction.
A Bitcoin Script in Action
The majority of operations are signature transactions. This includes payments, exchanges, and most workings involving public and private keys. For the sake of this article, let’s take apart an exchange between the author and his colleague, Eddie Mitchell. Here the author (sender) will specify the public key of Mitchell (recipient), who will redeem the bitcoin sent by specifying a signature using the same public key.
Following this, the first two instructions of such a transaction are the signature and the public key used to generate that signature. This information is identified as “<sig>” and “<pubKey>” and pushed onto the stack. Mitchell determines these values as he is the recipient. This first half of the transaction is often called “scriptSig” or the “Unlocking Script.” In this section of the operation, there is also reference to a previously existing Unspent Transaction Output (UTXO).
The inclusion of the UTXO ensures that the author indeed owns the amount of bitcoin he is looking to send to Mitchell. The Bitcoin network completes this validation via miners and Bitcoin full nodes. In Mastering Bitcoin, author Andreas Antonopoulos explains it thusly:
“Each input contains an unlocking script and refers to a previously existing UTXO. The validation software will copy the unlocking script, retrieve the UTXO referenced by the input, and copy the locking script from that UTXO.”
The second portion of the transaction, the “Locking Script” or “scriptPubkey,” is then executed by the author. Based on the above image, the next instruction “OP_DUP” pops off the <pubKey> from the stack, duplicates it, then returns it to the stack.
This top value, or the duplicate of the <pubKey>, is then cryptographically hashed by the “OP_HASH160” instruction and becomes “<pubKeyHash>.”
The specific hashing function used for Bitcoin transactions is called SHA-256 (Secure Hash Algorithm) and is part of a larger group of functions known as SHA-2, which comes from a National Security Agency development in 1993. Other members of the SHA-2 family include SHA-224, SHA-256, SHA-384, and SHA-512 with each number representing the bit length of the message they produce.
The applications are vast within the field of information security, with the most relevant being Bitcoin and Haschash’s Proof-of-Work (PoW) consensus mechanism. The most notable feature of SHA-256 is its ability to prevent DoS attacks as mentioned above.
Returning to the transaction between the author and his colleague, users still need to add another piece of data to the stack. This next bit of information is the public key that the author specified at the beginning of the transaction. It is needed to generate the signature to redeem the bitcoin requested.
At this point, there are two critical pieces of hashed data on top of the stack: The hash of the public key as specified by the author and the hash of the public key used by Mitchell. From there the “OP_EQUALVERIFY” command is engaged which ensures that the author has indeed used the correct public keys. Following a handful of failed bitcoin transactions in his early days, the author has triple-checked that the public keys are those of Mitchell. As the public keys do match, the OP_EQUALVERIFY command expends these data points. Users are now only left with a signature and a public key. The final step is to verify that the signature of this transaction is indeed correct.
Signature and Public Key Stack.
The Bitcoin scripting language is advantageous here as it needn’t draw from an extensive library to confirm the validity of the signature. All of this is built into the language.
The “OP_CHECKSIG” instruction at the end, then pops the remaining two items off of the stack, and if the <sig> matches the <pubKey>, then the operation will be rendered valid.
Compare, Contrast, and Adding Complexity
Although the following introduction was brief, it should give a basic idea of how a Bitcoin transaction is executed. Building on this, developers and enthusiasts can begin experimenting with more advanced operations, which will be the subject of later briefs.
Upcoming articles that build on this will dive deeper into digital signatures (ECDSA), multi-signature operations, Pay-to-Script-Hash (P2SH), and Timelocks.