Skip to content

Instantly share code, notes, and snippets.

@EthanHeilman
Last active August 24, 2024 20:19
Show Gist options
  • Save EthanHeilman/b12b9c834c61109aaccb6cb9ffc675e8 to your computer and use it in GitHub Desktop.
Save EthanHeilman/b12b9c834c61109aaccb6cb9ffc675e8 to your computer and use it in GitHub Desktop.
OP_FFS was an idea written up by Jeremy Rubin in 2017, during an email conversation with Ethan Heilman about a streaming hash function bitcoin opcode.

I, Ethan Heilman, am writing this in 2024.

OP_FFS was an idea written up by Jeremy Rubin in 2017, during an email conversation with Ethan Heilman about a streaming hash function bitcoin opcode. I am sharing it as it is sometimes referenced in public discussions but was not previously public and it felt like it should be public. For instance there was some discussion referring to OP_FFS on [the bitcoin-dev mailinglist in 2019] (https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2019-October/017355.html) and more recently on twitter in 2024.

This should not be read as a BIP proposal, BIP draft or endorsement, there are just notes written up during private correspondence. This is being published with Jeremy's express permission.

From Jeremy's email to me:

--------BEGIN BIP----------------

  BIP: XXX
  Layer: Consensus (soft fork)
  Title: FOLDFUNCTIONSTREAM
  Author: Jeremy Rubin 
          Ethan Heilman 
  Comments-Summary: No comments yet.
  Status: DRAFT
  Type: Standards Track
  Created: 2017-26-05
  License: PD

==Abstract==

This BIP describes a new opcode (FOLDFUNCTIONSTREAM) for the Bitcoin scripting system that adds a number of functional folds across data that are efficient to compute.

==Summary==

FOLDFUNCTIONSTREAM redefines the existing NOP4 opcode. When executed, if any of the following conditions are true, the script interpreter will terminate with an error:

  • the stack is not at least 2 elements
  • The top item is not a number (N) from -128 to 127. Let M = 0 if N == -1 else abs(N) - (1 if N < 0 else 0)
  • the 2nd to the top item on the stack is not a valid {fold type || fold midstate}
  • the stack is not at least M+2 elements
  • any items in between 2nd to top and M +2 to the top could not be applied under the fold function number given the midstate

Otherwise, script execution will proceed by popping the top number, stream processing and popping the top M elements in reverse order, and updating the midstate on the stack. If the top item on the stack is negative, then the stream is terminated and midstate is finalized.

==Motivation==

There are many computationally expensive constructs that enable useful types of scripts, but are difficult to implement in Bitcoin safely.

For instance, CAT would enable concatenation of input arguments followed by hashing. However CAT and DUP can be used to cause exponential memory growth.

FOLDFUNCTIONSTREAM with a SHA256 argument could be used to hash an arbitrary length concatenated string efficiently without worry of unbounded memory growth.

<a> <b>  <c> <FOLDFUNCTION_SHA256> 3 FOLDFUNCTIONSTREAM -1 FOLDFUNCTIONSTREAM

Which can be shortened to (because of the negative number optimization I made...)

<a> <b>  <c> <FOLDFUNCTION_SHA256> -4 FOLDFUNCTIONSTREAM

is equivalent to

<a> <b> CAT <c> CAT SHA256

==Fold Functions==

{| class="wikitable sortable" style="width: auto; text-align: center; font-size: smaller; table-layout: fixed;"
!Name
!Tag
!Purpose
|Midstate
|Argument
|- SHA256
| 1
| Computes SHA256(s_0 || s_1 || ... || s_n)
| Serialized SHA256 midstate
| Arbitrary Byte Vector
|- FACTORS
| 2
| Computes i_1 | i_0 /\ i_2 | i_0 /\ ... /\ i_n | i_0
| Arbitrary Byte Vector interpreted as unsigned integer
|- MAST
| 3
| Computes SHA256(SORT_CAT(SHA256(SORT_CAT(SHA256(s_0), s_1)), ...), s_n)
| s_0 is an arbitrary length vector, s_1...s_n are 32 byte vectors
|- SORT
| 4
| Computes SORT([i_0, i_1, ..., i_n])
| Arbitrary precision integers
|- MODMULT
| 5
| Computes prod(i_1...i_n) mod i_0
| i_* is an arbitrary integer
|- MODADD
| 6
| Computes sum(i_1...i_n) mod i_0
| i_* is an arbitrary integer
|- SETCONTAINS
| 7
| Computes i_2...i_n in i_0 given i_1
| i_0 is an accumulator, i_1 is a witness, i_2...i_n are candidates
|- UNDEFINED_FAIL
| 0
| Immediately returns FALSE and ends script
| n/a
|- UNDEFINED_SUCCEED
| 8-255
| Immediately returns FALSE and ends script
| n/a

==Examples==

==Specification==

Refer to the reference implementation, reproduced below, for the precise semantics and detailed rationale for those semantics.

int main() {
// whatever c++ code
}

==Reference Implementation==

A reference implementation will be provided by the following pull request:

https://github.com/bitcoin/bitcoin/pull/XXX

==Deployment==

This BIP is to be deployed by "versionbits" BIP9 using bit 6 and incrementing the SegWit script version. Signalling for bit 6 can happen safely before SegWit activates because FOLDFUNCTIONSTREAM is only usable from SegWit scripts.

For Bitcoin '''mainnet''', the BIP9 '''starttime''' will be midnight xxxxxxxxxxxx UTC (Epoch timestamp XXXXXXXXXX) and BIP9 '''timeout''' will be midnight xxxxxxxxxxxx UTC (Epoch timestamp XXXXXXXXXX).

For Bitcoin '''testnet''', the BIP9 '''starttime''' will be midnight xxxxxxxxxxxxxxx UTC (Epoch timestamp XXXXXXXXXX) and BIP9 '''timeout''' will be midnight xxxxxxxxxxxx UTC (Epoch timestamp XXXXXXXXXX).

==Credits==

The orginal idea of stream hash opcodes to solve concatenation exponential memory came from Ethan Heilman.

The generalization of this idea to a folding function with unbounded stream operations came from Jeremy Rubin.

==References==

[https://github.com/bitcoin/bips/blob/master/bip-0009.mediawiki BIP 9] Versionbits

==Copyright==

This document is placed in the public domain.

------------END BIP------------------

One thing I'm debating is if another order of the arguments makes more sense. I initially had the midstate/type go first, but I realized that made problems with evaluating with data passed in from a scriptsig/witness, so I switched it. Unclear if this way has some other problem I missed.

Also the UNDEFINED behavior is to make it easy to extend, but it also makes it less safe to pass in untrusted stream types... tradeoffs? (OP_WITHIN helps with bound checking the arg efficiently... arg * OP_WITHIN(min_expect, max_expect, arg) outputs either arg or 0).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment