fields

Much of the core of gramfuzz lies in the field definitions.

Unlike other data-parsing libraries (such as pfp), gramfuzz only defines the most basic data types to be used for grammar-driven data generation.

Classes

Int/UInt/Float/UFloat

The gramfuzz.fields.Int, gramfuzz.fields.UInt, gramfuzz.fields.Float, and gramfuzz.fields.UFloat classes are the only numeric classes defined in gramfuzz.

Most numeric generation needs can be met using these classes together with their min, max, and odds settings.

String

The gramfuzz.fields.String class can be used to generate arbitrary-length strings. All of the settings from the Int class still apply to the String class (min, max, odds), except that they influence the length of the string, not the characters it contains.

The String class has one additional parameter: charset. This specifies which characters should make up the random strings. Several pre-defined charsets exist within the String class:

  • charset_alpha_lower

  • charset_alpha_upper

  • charset_alpha

  • charset_spaces

  • charset_num

  • charset_alphanum

  • charset_all

For example:

s = String(charset="abcdefg", min=2, max=5)
print(s.build())
# 'aca'
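The split between the length settings and the charset can be illustrated without gramfuzz itself. The sketch below is a plain-Python stand-in, not the library's implementation: build_string is a hypothetical helper, and treating both bounds as inclusive is an assumption made only for this example.

```python
import random

def build_string(charset, min_len, max_len, rng=random):
    """Hypothetical stand-in for String.build(): min/max pick the length,
    charset picks the characters."""
    # Assumption: both bounds are inclusive in this sketch.
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(charset) for _ in range(length))

sample = build_string("abcdefg", 2, 5)
print(sample)
```

Changing min and max only changes how long the result can be, while changing charset only changes which characters can appear.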

And

The gramfuzz.fields.And class can be used to concatenate different values together:

a = And(Int, "hello")
print(a.build())
# '98hello'

And does not take any special parameters.

Or

The gramfuzz.fields.Or class can be used to choose randomly between several different options:

o = Or(Int, "Hello")
for x in range(10):
    print(o.build())
# Hello
# Hello
# -91
# -60
# 68
# Hello
# 13
# Hello
# Hello
# Hello

A gramfuzz.fields.WeightedOr class (aliased to gramfuzz.fields.WOr) also exists to allow weighted probabilities in Or choices:

WeightedOr(
   ("hello", 0.1), # 10% chance
   (UInt,    0.7), # 70% chance
   (3.14,    0.3), # 30% chance
)

# or

WOr(
   ("hello", 0.1), # 10% chance
   (UInt,    0.7), # 70% chance
   (3.14,    0.3), # 30% chance
)
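Weighted selection in general can be sketched with the standard library. This is not gramfuzz's WeightedOr implementation: weighted_pick is a hypothetical helper, and the string "uint" stands in for the UInt field. Note that the weights in the example above add up to 1.1; in this sketch weights are relative, so they do not need to sum to 1.0.

```python
import random

def weighted_pick(pairs, rng=random):
    """Pick one value from (value, weight) pairs.

    Hypothetical helper; not gramfuzz's WeightedOr implementation.
    """
    values = [value for value, _ in pairs]
    weights = [weight for _, weight in pairs]
    # random.choices treats weights as relative, so they need not sum to 1.0.
    return rng.choices(values, weights=weights, k=1)[0]

pairs = [("hello", 0.1), ("uint", 0.7), (3.14, 0.3)]
counts = {}
for _ in range(10000):
    picked = weighted_pick(pairs)
    counts[picked] = counts.get(picked, 0) + 1
print(counts)
```

Over many draws, the value with the largest weight is picked most often and the value with the smallest weight least often.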

Join

The gramfuzz.fields.Join class can be used to join values together using a separator. It also has a max value that can be used to indicate how many times the first value should be repeated (any values other than the first one will be ignored):

j = And(
    "some_function(",
    Join(
        Or(Int, Q(String)),
        sep=", ", max=5
    ),
    ")"
)

for x in range(10):
    print(j.build())
# some_function(-4294967294, "sEaKWSOGabHf", "ZkLXWYAUyEuW", 95, "FHnVYTvB")
# some_function("koBklVcoJbDC", -60)
# some_function(-65537)
# some_function(96)
# some_function(-87, -82, "x", "LKvYXEJHegjMGh")
# some_function("TSMQeGZbXNH")
# some_function(-254, -55, -91, "N")
# some_function(-44, 84, 59, "FBPHBf", "NBZxlVq")
# some_function("BvASDsxrTnycyLBChsM", "p")
# some_function(-85, "X", "HiGdE", "XgJoNBk", 254)
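The repeat-and-join behavior can be approximated in plain Python. The sketch below is only an illustration under stated assumptions: join_repeated and build_arg are hypothetical helpers, and always building at least one item is an assumption based on the sample output above.

```python
import random

def join_repeated(build_item, sep=", ", max_count=5, rng=random):
    """Hypothetical stand-in for Join(value, sep=..., max=...)."""
    # Assumption: at least one item is always built.
    count = rng.randint(1, max_count)
    return sep.join(build_item() for _ in range(count))

def build_arg():
    # Stand-in for Or(Int, Q(String)): an integer or a quoted string.
    if random.random() < 0.5:
        return str(random.randint(-100, 100))
    return '"' + "".join(random.choice("abcXYZ") for _ in range(3)) + '"'

print("some_function(" + join_repeated(build_arg) + ")")
```

Each call builds the single item between one and max_count times and places the separator only between items, never at the ends.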

Q

The gramfuzz.fields.Q class can be used to surround values in quotes, optionally specifying one of two string-escaping methods:

print(Q(String).build())
# "znFHLTkwgniAXtNhI"
print(Q("'he\"llo'", escape=True).build())
# '\'he"llo\''
print(Q("<h1>'hello'</h1>", html_js_escape=True).build())
# '\x3ch1\x3e\'hello\'\x3c/h1\x3e'

The two escaping options are:

  • escape - use Python’s repr to escape the string

  • html_js_escape - use Python’s "string_escape" (or "unicode_escape" in python3) encoding, as well as replacing < and > with their escaped hex formats

The Q class also accepts a quote keyword argument. This only applies if none of the escaping methods are specified, and is merely prepended and appended to the string.
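The difference between repr-based escaping and the plain quote argument can be shown with a small stand-in. The quote function below is hypothetical and covers only the escape and quote options; html_js_escape is left out because it depends on codec behavior not described here.

```python
def quote(value, escape=False, quote_char='"'):
    """Hypothetical stand-in for Q(value, escape=..., quote=...)."""
    if escape:
        # repr() adds its own quotes and escapes embedded ones.
        return repr(value)
    # Without escaping, the quote string is simply prepended and appended.
    return quote_char + value + quote_char

print(quote("hello"))                   # "hello"
print(quote("'he\"llo'", escape=True))  # '\'he"llo\''
print(quote("hello", quote_char="'"))   # 'hello'
```

With escape enabled the quote character is chosen by repr, which is why the quote argument has no effect in that case.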

Opt

The gramfuzz.fields.Opt class inherits from the And class and can be used to wrap optional values. When built, it either builds the wrapped values or raises an OptGram exception.

If an OptGram exception is raised, the current value being built will be ignored.

j = And(
    "some_function(",
    Join(
        Int,
        Q(String),
        Opt(Float), # optional argument
    sep=", ", max=5),
    ")"
)
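The skip-on-exception mechanism can be sketched in plain Python. Everything below is a stand-in written for illustration: the local OptGram class, opt, and join_all are hypothetical and do not reproduce gramfuzz's code.

```python
import random

class OptGram(Exception):
    """Stand-in for gramfuzz's OptGram exception."""

def opt(build_value, prob=0.5, rng=random):
    """Return a builder that sometimes raises OptGram instead of building."""
    def build():
        if rng.random() < prob:
            raise OptGram()
        return build_value()
    return build

def join_all(builders, sep=", "):
    """Join every built value, skipping builders that raise OptGram."""
    parts = []
    for build in builders:
        try:
            parts.append(build())
        except OptGram:
            continue  # the optional value is ignored
    return sep.join(parts)

args = [lambda: "1", lambda: '"abc"', opt(lambda: "2.5")]
print("some_function(" + join_all(args) + ")")
```

Because the container catches the exception, a skipped optional value leaves no stray separator behind.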

Details

Field Base Class

All fields inherit from the gramfuzz.fields.Field class.

Both Field classes and instances of Field classes may be used in grammar rule definitions (thanks to the gramfuzz.fields.MetaField class and the gramfuzz.utils.val function).

For example, defining a rule that uses the Int field with the default settings may look something like this:

Def("integer_rule", Int)

However, if more specific settings for the Int class were desired, it may look something like this instead:

Def("integer_rule", Int(min=10, max=20))

Note that this can also be abstracted to something like this:

SPECIAL_INT = Int(min=10, max=20)

Def("integer_rule", SPECIAL_INT)
Def("integer_rule2", SPECIAL_INT, ",", SPECIAL_INT)

This pattern is highly recommended and prevents one from constantly hard-coding specific settings throughout a grammar.
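One way to accept both a class and an instance in the same position is to instantiate bare classes on demand. The sketch below shows that idea only; IntField and this val function are hypothetical and are not the gramfuzz.utils.val implementation.

```python
import random

class IntField:
    """Hypothetical field with class-level defaults for min and max."""
    min = 0
    max = 100

    def __init__(self, min=None, max=None):
        if min is not None:
            self.min = min
        if max is not None:
            self.max = max

    def build(self):
        return str(random.randint(self.min, self.max))

def val(item):
    """Build a value from a field class, a field instance, or a plain value."""
    if isinstance(item, type):
        item = item()  # a bare class is instantiated with default settings
    if hasattr(item, "build"):
        return item.build()
    return str(item)

SPECIAL_INT = IntField(min=10, max=20)
print(val(IntField), val(SPECIAL_INT), val(","))
```

The shared SPECIAL_INT instance carries its settings wherever it is reused, which is the benefit of the pattern recommended above.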

Operator Overloading

gramfuzz defines two main classes for concatenating or randomly choosing between values: gramfuzz.fields.And and gramfuzz.fields.Or.

The And and Or classes can be explicitly used:

And(Int, ", ", Int)
Or(Int, Float)

Or they can be created with the overloaded & and | operators:

(Int & ", " & Int)
(Int | Float)

There are a few drawbacks, however, mostly having to do with the fact that gramfuzz cannot tell where the parentheses are in more complex scenarios, like the one below:

(Int & (Int & (Int & Int)))

Ideally, the statement above would generate something like the statement below, which uses explicit Ands:

And(Int, And(Int, And(Int, Int)))

When in reality, gramfuzz ends up doing this instead:

And(Int, Int, Int, Int)
# from the python console:
# >>> a = (Int & (Int & (Int & Int)))
# >>> a
# <And[<Int>,<Int>,<Int>,<Int>]>

For this reason, I tend to use the overloaded operators only in simple situations. For complex logic or scenarios, I tend to use only explicit And and Or.
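The flattening can be reproduced with a small stand-in. The classes below are hypothetical and use instances (Int()) rather than a metaclass to keep the sketch short; the merging rule in __and__ is an assumption chosen to match the console output above.

```python
class Field:
    def __and__(self, other):
        # The operator only ever sees two operands, never the parentheses,
        # so this sketch merges existing And values instead of nesting them.
        left = self.values if isinstance(self, And) else [self]
        right = other.values if isinstance(other, And) else [other]
        return And(*left, *right)

class Int(Field):
    def __repr__(self):
        return "<Int>"

class And(Field):
    def __init__(self, *values):
        self.values = list(values)

    def __repr__(self):
        return "<And[" + ",".join(repr(v) for v in self.values) + "]>"

flat = Int() & (Int() & (Int() & Int()))
print(flat)      # <And[<Int>,<Int>,<Int>,<Int>]>

explicit = And(Int(), And(Int(), And(Int(), Int())))
print(explicit)  # <And[<Int>,<And[<Int>,<And[<Int>,<Int>]>]>]>
```

Only the explicit constructor calls preserve the nesting, which is why explicit And and Or are the safer choice for complex grammars.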

Gotcha

One important gotcha is shown below:

5 | Int | Float

The above example does not work because 5 is the first operand in the | sequence. This is due to the way Python handles operator overloading.

However, this will work:

Int | Float | 5

Native types can only be used with the Field overloaded operators if they are not the first operand.
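The failure can be reproduced with a minimal stand-in. The classes below are hypothetical; the sketch assumes no reflected __ror__ method is defined, which matches the behavior described above.

```python
class Field:
    def __or__(self, other):
        return ("Or", self, other)  # stand-in for building an Or field

class Int(Field):
    pass

# Field on the left: Field.__or__ handles the plain value on the right.
works = Int() | 5

# Plain value on the left: int.__or__ does not know about Field, and this
# sketch defines no reflected __ror__, so Python raises TypeError.
try:
    5 | Int()
    failed = False
except TypeError:
    failed = True

print(works[0], failed)  # Or True
```

Python asks the left operand first, so the Field machinery is only reached when a Field appears on the left of the first operator.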

Odds

The gramfuzz.fields.Int, gramfuzz.fields.UInt, gramfuzz.fields.Float, gramfuzz.fields.UFloat, and gramfuzz.fields.String classes each make use of the gramfuzz.fields.Field.odds member when generating data, as well as (optionally) min and max members.

An example of using the odds member can be seen in the default values for the Int, Float, and String classes:

class Float(Int):
    # ...
    odds = [
        (0.75,    [0.0,100.0]),        # 75% chance of being in the range [0.0, 100.0)
        (0.05,    0),                  # 5% chance of having the value 0
        (0.10,    [100.0, 1000.0]),    # 10% chance of being in the range [100.0, 1000.0)
        (0.10,    [1000.0, 100000.0]), # 10% chance of being in the range [1000.0, 100000.0)
    ]
    # ...

It should be noted that the probability percents in each entry in the odds list should add up to 1.0.
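How an odds list could drive value selection is sketched below with the standard library. pick_from_odds is a hypothetical helper, not gramfuzz's internal logic; it checks that the probabilities add up to 1.0, picks one entry by weight, and then draws from the range if the entry is a range.

```python
import random

def pick_from_odds(odds, rng=random):
    """Hypothetical helper: choose a value from an odds list of
    (probability, value_or_range) tuples."""
    total = sum(prob for prob, _ in odds)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("odds probabilities must add up to 1.0")
    probs = [prob for prob, _ in odds]
    targets = [target for _, target in odds]
    target = rng.choices(targets, weights=probs, k=1)[0]
    if isinstance(target, (list, tuple)):
        low, high = target
        return rng.uniform(low, high)  # a range entry: pick a value inside it
    return target  # a single-value entry is returned as-is

float_odds = [
    (0.75, [0.0, 100.0]),
    (0.05, 0),
    (0.10, [100.0, 1000.0]),
    (0.10, [1000.0, 100000.0]),
]
print(pick_from_odds(float_odds))
```

Skewing the probabilities toward small ranges and boundary values keeps most generated data realistic while still occasionally producing extremes.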

See the documentation below for more details:

gramfuzz.fields Reference Documentation

This module defines all of the core gramfuzz fields:

  • Float

  • UFloat

  • Int

  • UInt

  • Join

  • Opt

  • Or

  • Q

  • Ref

  • String

Each field has a build() method, which accepts a pre argument that can be used to assign prerequisites of the build result, and a shortest flag that requests the most minimal version of the field.

class gramfuzz.fields.And(*values, **kwargs)[source]

A Field subclass that concatenates its values together. This class works nicely with Opt values.

__init__(*values, **kwargs)[source]

Create a new And field instance.

Parameters

values (list) – The list of values to be concatenated

build(pre=None, shortest=False)[source]

Build the And instance

Parameters
  • pre (list) – The prerequisites list

  • shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.

class gramfuzz.fields.Def(name, *values, **options)[source]

The Def class is used to define grammar rules. A defined rule has three parts:

1. Name - A rule name can be declared multiple times. When a rule name with multiple definitions is generated, one of the rule definitions will be chosen at random.

2. Values - The values of the rule. These will be concatenated (acts the same as an And).

3. Category - Which category to define the rule in. This is an important step and guides the fuzzer into choosing the correct rule definitions when randomly choosing rules to generate.

For example, suppose we defined a grammar for various types of postal addresses. We could have a grammar for US addresses, UK addresses, and Australian addresses. When we want the fuzzer to generate a random address, we would want it to choose one of our US, UK, or Australian address rules and not choose to generate only a zipcode rule.

I often have a main X category, as well as an X_def category. The X category is what I tell the fuzzer to choose from when randomly generating top-level rules. The X_def category is only used to help build the top-level rules.
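The role of categories can be sketched with a small registry. The code below is not how gramfuzz stores rules: define, generate, the rules dictionary, and the sample addresses are all hypothetical and exist only to show why helper rules are kept in a separate category.

```python
import random
from collections import defaultdict

# category name -> rule name -> list of alternative definitions
rules = defaultdict(lambda: defaultdict(list))

def define(name, *values, cat="default"):
    """Hypothetical stand-in for Def(name, *values, cat=...)."""
    rules[cat][name].append("".join(values))

def generate(cat, rng=random):
    """Pick a random rule from one category, then one of its definitions."""
    name = rng.choice(sorted(rules[cat]))
    return rng.choice(rules[cat][name])

define("us_address", "1 Main St, Springfield, ", "12345", cat="address")
define("uk_address", "2 High St, London, ", "N1 9GU", cat="address")
define("zipcode", "12345", cat="address_def")  # helper rule only

print(generate("address"))
```

Because generation only draws from the "address" category, the bare zipcode helper rule is never produced as a top-level result.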

__init__(name, *values, **options)[source]

Create a new rule definition. Simply instantiating a new rule definition will add it to the current GramFuzzer instance.

Parameters
  • name (str) – The name of the rule being defined

  • values (list) – The list of values that define the value of the rule (will be concatenated when built)

  • cat (str) – The category to create the rule in (default="default").

  • no_prune (bool) – If this rule should not be pruned EVEN IF it is found to be unreachable (default=False)

build(pre=None, shortest=False)[source]

Build this rule definition

Parameters
  • pre (list) – The prerequisites list

  • shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.

cat = 'default'

The default category of this Def class (default="default")

no_prune = False

Whether or not this rule should be pruned if the fuzzer cannot find a way to reach this rule. (default=False)

sep = b''

The separator of values for this rule definition (default="")

class gramfuzz.fields.Field[source]

The core class that all field classes are based on. Contains utility methods to determine probabilities/choices/min-max/etc.

odds = []

odds is a list of tuples that define probability values.

Each item in the list must be a tuple of the form:

(X, Y)

Where X is the probability percent, and where Y is one of the following:

  • A single value

  • A list/tuple containing two values, the min and max of a range of numbers.

Note that the sum of each probability percent in the list must equal 1.0.

shortest_is_nothing = False

This is used during gramfuzz.GramFuzzer.find_shortest_paths. Sometimes the fuzzer cannot know based on the values in a field what that field’s minimal behavior will be.

Setting this to True will explicitly let the GramFuzzer instance know what the minimal outcome will be.

NOTE: When implementing a custom Field subclass and setting shortest_is_nothing to True, be sure to handle the case when build(shortest=True) is called so that a gramfuzz.errors.OptGram error is raised (which skips generation of the current field).

class gramfuzz.fields.Float(value=None, **kwargs)[source]

Defines a float Field with odds that define float values

class gramfuzz.fields.Int(value=None, **kwargs)[source]

Represents all Integers, with predefined odds that target boundary conditions.

__init__(value=None, **kwargs)[source]

Create a new Int object, optionally specifying a hard-coded value

Parameters
  • value (int) – The value of the new int object

  • min (int) – The minimum value (if value is not specified)

  • max (int) – The maximum value (if value is not specified)

  • odds (list) – The probability list. See Field.odds for more information.

build(pre=None, shortest=False)[source]

Build the integer, optionally providing a pre list that may be used to define prerequisites for a Field being built.

Parameters
  • pre (list) – A list of prerequisites to be collected during the building of a Field.

  • shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.

class gramfuzz.fields.Join(*values, **kwargs)[source]

A Field subclass that joins other values with a separator. This class works nicely with Opt values.

__init__(*values, **kwargs)[source]

Create a new instance of the Join class.

Parameters
  • values (list) – The values to join

  • sep (str) – The string with which to separate each of the values (default=",")

  • max (int) –

    The maximum number of times (inclusive) to build the first item in values. This can be useful when a variable number of items in a list is needed. E.g.:

    Join(Int, max=5, sep=",")
    

build(pre=None, shortest=False)[source]

Build the Join field instance.

Parameters
  • pre (list) – The prerequisites list

  • shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.

class gramfuzz.fields.MetaField[source]

Used as the metaclass of the core gramfuzz.fields.Field class. MetaField defines the __and__, __or__, and __repr__ methods. The overloaded & and | operators allow classes themselves to be wrapped in a gramfuzz.fields.And or gramfuzz.fields.Or without having to instantiate them first.

E.g. the two lines below are equivalent:

And(Int(), Float())
(Int & Float)

Or(Int(), Float())
(Int | Float)

Do note however that this can only be done if the first (farthest to the left) operand is a Field class or instance.

E.g. the first line below will work, but the second line will not:

Or(5, Int)
5 | Int

It is also recommended to use the overloaded & and | operators only in very simple cases, since it is impossible for the code to know the difference between the two statements below:

(Int | Float) | UInt
Int | Float | UInt

class gramfuzz.fields.Opt(*values, **kwargs)[source]

A Field subclass that randomly chooses to either build the provided values (acts as an And in that case), or raise an errors.OptGram exception.

When an errors.OptGram exception is raised, the current value being built is then skipped

__init__(*values, **kwargs)[source]

Create a new Opt instance

Parameters
  • values (list) – The list of values to build (or not)

  • prob (float) – A float value between 0 and 1 that defines the probability of cancelling the current build.

build(pre=None, shortest=False)[source]

Build the current Opt instance

Parameters
  • pre (list) – The prerequisites list

  • shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.

prob = 0.5

The probability of an Opt instance raising an errors.OptGram exception

class gramfuzz.fields.Or(*values, **kwargs)[source]

A Field subclass that chooses one of the provided values at random as the result of a call to the build() method.

__init__(*values, **kwargs)[source]

Create a new Or instance with the provided values

Parameters

values (list) – The list of values to choose randomly from

build(pre=None, shortest=False)[source]

Build the Or instance

Parameters
  • pre (list) – The prerequisites list

  • shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.

class gramfuzz.fields.PLUS(*values, **kwargs)[source]

Acts like the + in a regex - one or more of the values. The values are Anded together one or more times, up to max times.

__init__(*values, **kwargs)[source]

Create a new instance of the Join class.

Parameters
  • values (list) – The values to join

  • sep (str) – The string with which to separate each of the values (default=",")

  • max (int) –

    The maximum number of times (inclusive) to build the first item in values. This can be useful when a variable number of items in a list is needed. E.g.:

    Join(Int, max=5, sep=",")
    

class gramfuzz.fields.Q(*values, **kwargs)[source]

A Field subclass that quotes whatever value is provided.

__init__(*values, **kwargs)[source]

Create the new Quote instance

Parameters
  • escape (bool) – Whether or not quoted data should be escaped (default=False)

  • html_js_escape (bool) – Whether or not quoted data should be html-javascript escaped (default=False)

  • quote (str) – The quote character to be used if escape and html_js_escape are False

build(pre=None, shortest=False)[source]

Build the Quote instance

Parameters
  • pre (list) – The prerequisites list

  • shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.

escape = False

Whether or not the quoted data should be escaped (default=False). Uses repr(X)

html_js_escape = False

Whether or not the quoted data should be html-javascript escaped (default=False)

quote = b'"'

Which quote character to use if escape and html_js_escape are False (default='"')

class gramfuzz.fields.Ref(refname, **kwargs)[source]

The Ref class is used to reference defined rules by their name. If a rule name is defined multiple times, one will be chosen at random.

For example, suppose we have a rule that returns an integer:

Def("integer", UInt)

We could define another rule that creates a Float by referencing the integer rule twice, and placing a period between them:

Def("float", Ref("integer"), ".", Ref("integer"))

__init__(refname, **kwargs)[source]

Create a new Ref instance

Parameters
  • refname (str) – The name of the rule to reference

  • cat (str) – The name of the category the rule is defined in

build(pre=None, shortest=False)[source]

Build the Ref instance by fetching the rule from the GramFuzzer instance and building it

Parameters
  • pre (list) – The prerequisites list

  • shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.

cat = 'default'

The default category where the referenced rule definition will be looked for

class gramfuzz.fields.STAR(*values, **kwargs)[source]

Acts like the * in a regex - zero or more of the values. The values are Anded together zero or more times, up to max times.

build(pre=None, shortest=False)[source]

Build the STAR field.

Parameters
  • pre (list) – The prerequisites list

  • shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.

class gramfuzz.fields.String(value=None, **kwargs)[source]

Defines a string field

__init__(value=None, **kwargs)[source]

Create a new instance of the String field.

Parameters
  • value – The hard-coded value of the String field

  • min (int) – The minimum size of the String when built

  • max (int) – The maximum size of the String when built

  • charset (str) – The character-set to be used when building the string

build(pre=None, shortest=False)[source]

Build the String instance

Parameters
  • pre (list) – The prerequisites list (optional, default=None)

  • shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.

charset = b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'

The default character set of the String field class (default=charset_alpha)

charset_all = b'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\xc2\x80\xc2\x81\xc2\x82\xc2\x83\xc2\x84\xc2\x85\xc2\x86\xc2\x87\xc2\x88\xc2\x89\xc2\x8a\xc2\x8b\xc2\x8c\xc2\x8d\xc2\x8e\xc2\x8f\xc2\x90\xc2\x91\xc2\x92\xc2\x93\xc2\x94\xc2\x95\xc2\x96\xc2\x97\xc2\x98\xc2\x99\xc2\x9a\xc2\x9b\xc2\x9c\xc2\x9d\xc2\x9e\xc2\x9f\xc2\xa0\xc2\xa1\xc2\xa2\xc2\xa3\xc2\xa4\xc2\xa5\xc2\xa6\xc2\xa7\xc2\xa8\xc2\xa9\xc2\xaa\xc2\xab\xc2\xac\xc2\xad\xc2\xae\xc2\xaf\xc2\xb0\xc2\xb1\xc2\xb2\xc2\xb3\xc2\xb4\xc2\xb5\xc2\xb6\xc2\xb7\xc2\xb8\xc2\xb9\xc2\xba\xc2\xbb\xc2\xbc\xc2\xbd\xc2\xbe\xc2\xbf\xc3\x80\xc3\x81\xc3\x82\xc3\x83\xc3\x84\xc3\x85\xc3\x86\xc3\x87\xc3\x88\xc3\x89\xc3\x8a\xc3\x8b\xc3\x8c\xc3\x8d\xc3\x8e\xc3\x8f\xc3\x90\xc3\x91\xc3\x92\xc3\x93\xc3\x94\xc3\x95\xc3\x96\xc3\x97\xc3\x98\xc3\x99\xc3\x9a\xc3\x9b\xc3\x9c\xc3\x9d\xc3\x9e\xc3\x9f\xc3\xa0\xc3\xa1\xc3\xa2\xc3\xa3\xc3\xa4\xc3\xa5\xc3\xa6\xc3\xa7\xc3\xa8\xc3\xa9\xc3\xaa\xc3\xab\xc3\xac\xc3\xad\xc3\xae\xc3\xaf\xc3\xb0\xc3\xb1\xc3\xb2\xc3\xb3\xc3\xb4\xc3\xb5\xc3\xb6\xc3\xb7\xc3\xb8\xc3\xb9\xc3\xba\xc3\xbb\xc3\xbc\xc3\xbd\xc3\xbe\xc3\xbf'

All possible binary characters (0x0-0xff)

charset_alpha = b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'

Upper- and lower-case alphabet

charset_alpha_lower = b'abcdefghijklmnopqrstuvwxyz'

A lower-case alphabet character set

charset_alpha_upper = b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

An upper-case alphabet character set

charset_alphanum = b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890'

Alpha-numeric character set (upper- and lower-case alphabet + numbers)

charset_num = b'1234567890'

Numeric character set

charset_spaces = b'\n\r\t '

Whitespace character set

odds = [(0.85, [0, 20]), (0.1, 1), (0.025, 0), (0.025, [20, 100])]

Unlike numeric Field types, the odds value for the String field defines the length of the field, not characters used in the string.

See the gramfuzz.fields.Field.odds member for details on the format of the odds probability list.

class gramfuzz.fields.UFloat(value=None, **kwargs)[source]

Defines an unsigned float field.

class gramfuzz.fields.UInt(value=None, **kwargs)[source]

Defines an unsigned integer Field.

gramfuzz.fields.WOr

alias of gramfuzz.fields.WeightedOr

class gramfuzz.fields.WeightedOr(*values, **kwargs)[source]

A Field subclass that chooses one of the provided values at random as the result of a call to the build() method. Takes an odds array rather than just direct values. Also aliased to WOr.

E.g.

WeightedOr(
    ("hello", 0.1), # 10% chance
    (UInt,    0.7), # 70% chance
    (3.14,    0.3), # 30% chance
)

# or

WOr(
    ("hello", 0.1), # 10% chance
    (UInt,    0.7), # 70% chance
    (3.14,    0.3), # 30% chance
)

__init__(*values, **kwargs)[source]

Create a new WeightedOr instance with the provided values.

Parameters

values (list) – A list of tuples of the form [(value, probability), ...]

build(pre=None, shortest=False)[source]

Parameters
  • pre (list) – The prerequisites list

  • shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.